Some notes about reading the 《K&C The C Programming language》
Preface
C is not a big language, and it is not well served by a big book.
C is not a very high level language, nor a big one, and is not specialized to any particular area of application.
Besides showing how to make effective use of the language, we have also tried where possible to illustrate useful algorithms and principles of good style and sound design.
CH1 A Tutorial Introduction
The only way to learn a new programming language is by writing programs in it.
1 | /* integer division truncates: any fractional part is discarded. */ |

Magic Number
they convey little information to someone who might have to read the program later, and they are hard to change in a systematic way.
One way to deal with magic numbers is to give them meaningful names. A #define line defines a symbolic name or symbolic constant to be a particular string of characters
Text
A text stream is a sequence of characters divided into lines; each line consists of zero or more characters followed by a newline character. It is the responsibility of the library to make each input or output stream confirm this model
1 |
|
When you need precise control over the character flow, use getchar instead of scanf !
Step to write a program :
Define the problem
questions like “What is a ‘word’”,”What is a ‘shortest path’”,etcCompose test cases
Try to include all boundary conditions, special cases and large/tiny scale of the problem(
len(input) == 0 or len(input) >= 2 ** 32). And you can make the definition more conciseCome up with the ‘right’ algorithm and data structure, solve the problem on paper/iPad first.
You show always spend most of your time planning and solving problem on paper. After you get the right idea, writing code will not cost much time!!!
Write the code
Variables
local variables are also called automatic variables, as they born and recollected automatically, in contrary to static and global/extern variables
Before a function can use an external variable, the name of the variable must be made known to the function; the declaration is the same as before
except for the added keywordextern.
1 | int returnglobal(void){ |
In certain circumstances, the extern declaration can be omitted. If the definition of the external variable occurs in the source file before its use in a particular function, then there is no need for an extern declaration in the function.
CH2 Types, Operators and Objects
Variable name
Names are made up of letters and digits; the first character must be a letter.
library routines often use name begin with a
_Upper and lower case letters are distinct
lower for variable name
upper for symbolic constants(the macro)
Variable size
1 |
Constants
1 | 1234 // int |
1 | 0123 // oct number |
A constant expression is an expression that involves only constants. Such expressions may be evaluated at during compilation rather than run-time, and accordingly may be used in any place that a constant can occur, for example, array definition
The standard library function strlen(s) returns the length of its character string argument s, excluding the terminal '\0'.
1 | 'a' // integer value 97 |
enumeration constant
An enumeration is a list of constant integer values.
Enumerations provide a convenient way to associate constant values with names, an alternative to #define with the advantage that the values can be generated for you.
1 | enum name { var1 , var2 , var3 }; //definition |
1 | // Make a Python-like boolean type when <stdbool.h> is missing |
Arithmetic
The % operator cannot be applied to a float or double.
1 | if (!valid) // read nicely by "If not valid" |
Characters
1 |
|
For portability, specify signed or unsigned if non-
character data is to be stored in char variables.
Operators
1 | ++c ; // increments c before its value is used |
1 | c += 1 + 2 // c += (1+2) |
1 | condition ? res1 : res2 ; // this is an expression |
1 | //write succinct code |
Precedence
Operators on the same line have the same precedence;
rows are in order of decreasing precedence

CH3 Control Flow
Loops
1 | for ( expr1 ; expr2 ; expr3){ |
Comma
A pair of expressions separated by a comma is evaluated left to right, and the type and value of the result are the type and value of the right operand.
The commas that separate function arguments, variables in declarations, etc., are not comma operators, and do not guarantee left to right evaluation.
Continue
expr3 will be executed after continue
1 | for (int i = 0 ; i < N ; i++){ |
Goto
Use it rarely, or try to avoid using it at all.
1 | for ( ... ){ |
CH4 Functions and Program Structure
C does not allow functions to be defined inside other functions(Not like Python), but declaration is OK.
1 |
|
Static
staticserve asprivatein C, limit access to the source file contains it.staticvariable is permanent and private in a function
Register Variables
A register declaration advises the compiler that the variable in question will be heavily used. The idea is that register variables are to be placed in machine registers, which may result in smaller and faster programs. But compilers are free to ignore the advice.
The register declaration can only be applied to automatic variables and to the formal parameters of a function.
it is not possible to take the address of a register variable
Block
1 | { |
An automatic variable declared and initialized in a block is initialized each time the block is entered
hide external variables and functions of the same name.
As a matter of style, it’s best to avoid variable names that conceal names in an outer scope; the potential for confusion and error is too great.
Recursion
A way to print an integer digit by digit other than using an array
1 | void printn(int n){ |
Macro substitution
Substitutions are made only for tokens, and do not take place within quoted strings. For example, if YES is a defined name, there would be no substitution in printf("YES") or in YESMAN.
1 |
|
Names may be undefined with #undef, usually to ensure that a routine is really a function, not a macro
1 |
|
A parameter name is preceded by a # in the replacement text, the combination will be expanded into a quoted string with the parameter replaced by the actual argument.
1 |
|
You can see the substitution result by preprocess the source file
1 | $ gcc -E source.c -o afterpre.c |
The preprocessor operator ## provides a way to concatenate names during macro expansion.
1 |
|
Conditional Inclusion
The #if line evaluates a constant integer expression. If the expression is non-zero, subsequent lines until an #endif or #elif or #else are included.
The expression defined(name) in a #if is 1 if the name has been defined, and 0 otherwise.
1 | //check if header has already been included |
1 | //tests the name SYSTEM to decide which version of a header to include: |
CH5 Array and Pointer
Any pointer can be cast to void * and back again without loss of information
1 | void * p ; //a generic pointer |
When an array name is passed to a function, what is passed is the address of the initial element. Within the called function, this argument is a local variable, and so an array name parameter is a pointer, that is, a variable containing an address, which can be changed.
1 | int array [N] ; |
String Pointers
C does not provide any operators for processing an entire string of characters as a unit. It is always an array
1 | char amessage[] = "now is the time"; /* an array */ |
1 | //strcpy, implementation friendly version |
1 | *p++ = val; /* push val onto stack */ |
Want to return a string in a function ?
We can only return pointer, so let’s have some private and permanent space
1 | char* duplicate(const char * string){ |
Array pointer
1 | char stringarray [10][100] ; |
The important advantage of the pointer array(int* pointers [10]) over multi-dimension array(int array[10][20])is that the rows of the array may be of different lengths(not 20).

Pointers to Functions
In C, a function itself is not a variable, but it is possible to define pointers to functions, which
can be assigned, placed in arrays, passed to functions, returned by functions, and so on.(Since it is just the entry address of a section of code!)
1 |
|
You can use it just like an array name
1 | //Define |
1 | void (*fp)(void); // fp is a pointer, pointing to a function |
1 | //used in sorting function |
How to read a C declaration ?
1 | //recall operator precedence |
1 | char (*(*x())[])() ; |
The book also provide a ‘recursion descend’ program to understand the declaration.
CH6 Structure
A structure is a collection of one or more variables, possibly of different types, grouped together under a single name for convenient handling.
1 | //struct name is optional |
Structures can be copied as a whole entity, passing to or returning by a function.
In
x86_64architecture, according to my observation, bigstructis passing and returning via using stack space,rax or rdxserve as the pointer pointing to returned space
1 |
|
We all know that struct will be padding if they are not aligned. How can we tell the exact size ? We use sizeof
1 | struct { |
Structures are useful when implementing hash table, since pair structure is natural for it.
1 | /* simple hash table */ |
Mimic class via combining struct and function pointer
1 |
|
Union
In effect, a union is a structure in which all members have offset zero from the base, the structure is big enough to hold the ``widest’’ member, and the alignment is appropriate for all
of the types in the union.
As a result, the syntax is almost the same
A union may only be initialized with a value of the type of its first member
a union can be used to force a variable to be
aligned on a particular kind of storage boundary
1 |
|
Bit-fields
Alternative to bitwise operation
1 | struct { |
This defines a variable table called flags that contains three 1-bit fields
The number following the colon represents the field width in bits.
The fields are declared unsigned int to ensure that they are unsigned quantities.
Almost everything about fields is implementation-dependent.
They are not arrays and they do not have addresses, so the
&operator cannot be applied on them.Fields may be declared only as ints; for portability, specify signed or unsigned explicitly
CH 7 Input and Output
printf
1 | :%s: :hello, world: |
Variable-length Argument Lists
the declaration ... means that the number and types of these arguments may vary. The declaration ... can only appear at the end of an argument list.
1 | void myprintf(char * fmt , ...); |
- The standard header
<stdarg.h>contains a set of macro definitions that define how to step through an argument list.
The type
va_listis used to declare a variable that will refer to each argument in turn;The macro
va_startinitializesva_listto point to the first unnamed argument. It must be called once beforeva_listis used. There must be at least one named argument (fmt)Each call of
va_argreturns one argument and stepsva_listto the next;va_arguses a type name
to determine what type to return and how big a step to take.Finally,
va_enddoes whatever cleanup is necessary. It must be called before the program returns
1 |
|
printfinstdiohas a similar logic here, and the printing loop is not controlled bynbut by the format stringfmt
So we can have the format string exploit
1 |
|
1 | target = 0x555555558044 |
1 | // printf.c |
Then in __vfprintf_internal there are thousands lines of code, and there is a function called __find_specwc to find the first “argument” to print, so just don’t dig in too deep.
By the way, if you want to know How is “__builtin_va_list” implemented?, you may need to study the source code of your compiler rather than C stdlibs.
File
When a C program is started, the operating system environment is responsible for opening three files(stdin,stdout,stderr) and providing pointers for them.
1 | //getchar and putchar can be defined in terms of getc, putc, stdin, and stdout as follows: |
CH8 The UNIX System Interface
A directory is a file that contains a list of filenames and some indication of where they are located. The “location’’ is an index into another table called the inode list. The inode for a file is where all information about the file except its name is kept. A directory entry generally consists of only two items, the filename and an inode number.
The whole point with “everything is a file” is not that you have some random filename (indeed, sockets and pipes show that “file” and “filename” have nothing to do with each other), but the fact that you can use common tools to operate on different things.
Is directory and regular file the same?
The answer is : Yes, but also NO
How files are organized is highly depend on the file system, and there are tons of file system
Let’s take EXT2 as an example here
How they can be the same?
They are both byte stream.

We have a directory organization like this in ext2 file system
1 | # left side |
1 | # right side |
We can see that:
1 | getcontent(file4) |
Content of regular file can be anything, characters, video data, binary instructions,…
Content of directory have fixed meaning, since they are created by file system, rather than arbitrary data generated by user. In ext2, they are
1 | struct{ |

1 | # left |
1 | # right |
How they can be different ?
It is nonsense to read the content of a directory as raw bytes and people always want the files in the directory.
In modern times there are many kinds of filesystems that can be mounted by Linux and other unix-like systems (ext4, ZFS, NTFS!), some of which have complex directory formats. You can’t do anything sensible with the raw bytes of an arbitrary directory. So the kernel has taken on the responsibility of providing a generic interface to directories as abstract objects.
readdiris the central piece of that interface.
1 |
|
1 | $ ./a.out |
However, we can still operate on the uniform interface provided by the standard library
1 |
|
1 | $ ./a.out |
More about file system on Linux (ext2 as example)
According to Linux documentation project, file system
We can hide files in a directory by mounting a file system at the directory, the files will be recovered after umounting.
The real file systems were separated from the operating system and system services by an interface layer known as the Virtual File system, or VFS.
All of the details of the Linux file systems are translated by software so that all file systems appear identical to the rest of the Linux kernel and to programs running in the system. Linux’s Virtual File system layer allows you to transparently mount the many different file systems at the same time.
Physical Layout of the EXT2 File system
In the EXT2 file system, the
inodeis the basic building block; every file and directory in the file system is described by one and only oneinode.Inodestructure1
2
3
4
5
6
7class Inode():
def __init__(self):
self.mode
self.ownerInfo
self.size
self.timestamps
self.datablocks
Super block
The Super block contains a description of the basic size and shape of this file system. The information within it allows the file system manager to use and maintain the file system. Usually only the Super block in Block Group 0 is read when the file system is mounted but each Block Group contains a duplicate copy in case of file system corruption.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53$ apt install sleuthkit
$ fsstat file_s.img
FILE SYSTEM INFORMATION
--------------------------------------------
File System Type: Ext2
Volume Name:
Volume ID: bf800aaad8ff14851b4bf9289b25ea9d
Last Written at: 2022-06-29 02:28:03 (EDT)
Last Checked at: 2022-06-29 00:43:35 (EDT)
Last Mounted at: 2022-06-29 02:28:03 (EDT)
Unmounted Improperly
Last mounted on: /root/.cache/test/mountpoint
Source OS: Linux
Dynamic Structure
Compat Features: Ext Attributes, Resize Inode, Dir Index
InCompat Features: Filetype,
Read Only Compat Features: Sparse Super, Large File,
METADATA INFORMATION
--------------------------------------------
Inode Range: 1 - 17
Root Directory: 2
Free Inodes: 0
CONTENT INFORMATION
--------------------------------------------
Block Range: 0 - 99
Block Size: 1024
Reserved Blocks Before Block Groups: 1
Free Blocks: 75
BLOCK GROUP INFORMATION
--------------------------------------------
Number of Block Groups: 1
Inodes per group: 16
Blocks per group: 8192
Group: 0:
Inode Range: 1 - 16
Block Range: 1 - 99
Layout:
Super Block: 1 - 1
Group Descriptor Table: 2 - 2
Data bitmap: 3 - 3
Inode bitmap: 4 - 4
Inode Table: 5 - 6
Data Blocks: 7 - 99
Free Inodes: 0 (0%)
Free Blocks: 75 (75%)
Total Directories: 4Finding a file in
ext2The first
inodewe need is theinodefor the root of the file system and we find its number in the file system’s super block. (To read anEXT2 inodewe must look for it in theinodetable of the appropriate Block Group)use this
inodeto find the data block which hold subdirectories’ names andinodenumbersuse subdirectory’s
inodenumber to find the data block where the file’s name andinodeis holden.use
inodeof the file to find the actual data block
the VFS describes the system’s files in terms of super blocks and
inodesin much the same way as the EXT2 file system uses super blocks andinodes.How it related to
file descriptor
As a conclusion, I think file system is just like device driver
Read more
use debugfs to recover deleted file
copy inode metadata with cp -a
Do experiment yourself
1 | mkdir mountpoint |