Ch10

Interaction and Communication between Programs

System-Level I/O

Input/output (I/O) is the process of copying data between main memory and external devices such as disk drives, terminals, and networks.

  • Input copies data from devices to main memory
  • Output copies data from main memory to devices

Most of the time, the higher-level I/O functions work quite well and there is no need to use Unix I/O directly. So why bother learning about Unix I/O?

  • Understanding Unix I/O will help you understand other systems concepts
  • Sometimes you have no choice but to use Unix I/O

10.1 Unix I/O

A Linux file is a sequence of m bytes: {B0, B1, B2, …, Bm-1}

All I/O devices, such as networks, disks, and terminals, are modeled as files, and all input and output is performed by reading and writing the appropriate files.

This elegant mapping of devices to files allows the Linux kernel to export a simple, low-level application interface, known as Unix I/O, that enables all input and output to be performed in a uniform and consistent way:

  1. Opening files

    • An application announces its intention to access an I/O device by asking the kernel to open the corresponding file.

    • The kernel returns a small nonnegative integer, called a descriptor, that identifies the file in all subsequent operations on the file.

    • The kernel keeps track of all information about the open file. The application only keeps track of the descriptor.

    • Each process created by a Linux shell begins life with three open files: standard input (descriptor 0), standard output (descriptor 1), and standard error (descriptor 2).

  2. Changing the current file position

    • The kernel maintains a file position k, initially 0, for each open file.
    • The file position is a byte offset from the beginning of a file. An application can set the current file position k explicitly by performing a seek operation.
  3. Reading and Writing files

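    • A read operation copies n > 0 bytes from a file to memory, starting at the current file position k and then incrementing k by n.
    • Similarly, a write operation copies n > 0 bytes from memory to a file, starting at the current file position k and then updating k.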
    • Given a file with a size of m bytes, performing a read operation when k ≥ m triggers a condition known as end-of-file (EOF), which can be detected by the application. There is no explicit “EOF character” at the end of a file.
  4. Closing files

    • When an application has finished accessing a file, it informs the kernel by asking it to close the file.
    • The kernel responds by freeing the data structures it created when the file was opened and restoring the descriptor to a pool of available descriptors
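
The four operations above can be sketched in a few lines of C. This is a minimal illustration, not code from the book: the helper name read_at is hypothetical, and it uses lseek for the seek step.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: read up to n bytes of `path`, starting at byte
 * offset `pos`. Returns the number of bytes read, 0 at EOF, -1 on error. */
ssize_t read_at(const char *path, off_t pos, char *buf, size_t n)
{
    int fd = open(path, O_RDONLY);        /* 1. open: the kernel returns a descriptor */
    if (fd < 0)
        return -1;

    if (lseek(fd, pos, SEEK_SET) < 0) {   /* 2. seek: set the file position k to pos */
        close(fd);
        return -1;
    }

    ssize_t nread = read(fd, buf, n);     /* 3. read: advances k; 0 means k >= m (EOF) */

    if (close(fd) < 0)                    /* 4. close: the descriptor returns to the pool */
        return -1;
    return nread;
}
```

Note that EOF shows up only as a return value of 0 from read; nothing in the file itself marks the end.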

10.2 Files

Each Linux file has a type that indicates its role in the system

  • A regular file contains arbitrary data.
  • A directory is a file consisting of an array of links, where each link maps a filename to a file, which may be another directory
  • A socket is a file that is used to communicate with another process across a network

Other file types include named pipes, symbolic links, and character and block devices.

  • The Linux kernel organizes all files in a single directory hierarchy anchored by the root directory named / (slash).

directory hierarchy

  • As part of its context, each process has a current working directory that identifies its current location in the directory hierarchy.
  • Pathnames can be either absolute or relative.

10.3 Opening and Closing Files

A process opens an existing file or creates a new file by calling the open function

#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
int open(char *filename, int flags, mode_t mode);
//Returns: new file descriptor if OK, −1 on error
  • The open function converts a filename to a file descriptor and returns the descriptor number.

    (The descriptor returned is always the smallest descriptor that is not currently open in the process.)

  • The flags argument indicates how the process intends to access the file

    (O_RDONLY: read-only, O_WRONLY: write-only, O_RDWR: read and write)

    The flags argument can also be ORed with one or more bit masks that provide additional instructions for writing (O_CREAT: create, O_TRUNC: truncate, O_APPEND: append).

  • The mode argument specifies the access permission bits of new files.

access bits
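
As a rough illustration of how the flags and mode arguments combine, here is a small sketch that appends a line to a file, creating it if needed. The helper name append_line is hypothetical. The permission bits actually applied to a new file are the requested mode with the process's umask bits cleared.

```c
#include <fcntl.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: append `len` bytes of `line` to `path`.
 * Returns 0 on success, -1 on error. */
int append_line(const char *path, const char *line, size_t len)
{
    /* O_WRONLY: write-only access
     * O_CREAT:  create the file if it does not exist
     * O_APPEND: move the file position to the end before each write
     * mode:     rw-r--r-- for a newly created file (before umask) */
    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND,
                  S_IRUSR | S_IWUSR | S_IRGRP | S_IROTH);
    if (fd < 0)
        return -1;

    ssize_t n = write(fd, line, len);
    if (close(fd) < 0)
        return -1;
    return n == (ssize_t)len ? 0 : -1;
}
```

The mode argument is only consulted when O_CREAT actually creates the file; for an existing file the permissions are left unchanged.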

Finally, a process closes an open file by calling the close function.

#include <unistd.h>
int close(int fd);
//Returns: 0 if OK, −1 on error

Closing a descriptor that is already closed is an error.
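
The following sketch shows what that error looks like in practice. It assumes a Linux system where /dev/null exists, and the helper name double_close_errno is made up for this example.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: close the same descriptor twice and report the
 * errno from the second close. Returns 0 if the second close
 * unexpectedly succeeds and -1 if the setup fails. */
int double_close_errno(void)
{
    int fd = open("/dev/null", O_RDONLY);
    if (fd < 0)
        return -1;
    if (close(fd) < 0)          /* first close: valid descriptor */
        return -1;

    errno = 0;
    if (close(fd) == 0)         /* second close: descriptor is no longer open */
        return 0;
    return errno;               /* EBADF: bad file descriptor */
}
```

In a single-threaded program the second call fails with EBADF. Checking the return value of close is what makes this kind of mistake visible.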

Practice Problem 10.1

What is the output of the following code?

#include <sys/types.h>
#include <fcntl.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>


int main() {
    int fd1, fd2;
    fd1 = open("foo.txt", O_RDONLY, 0);
    close(fd1);
    fd2 = open("baz.txt", O_RDONLY, 0);
    printf("fd2 = %d\n", fd2);
    exit(0);
}

Verification:

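The descriptor returned is always the smallest descriptor that is not currently open in the process.

Since descriptors 0, 1, and 2 are already in use, the first open returns 3. Closing fd1 frees descriptor 3, so the second open returns 3 again and the output is fd2 = 3. (If foo.txt does not exist, the first open fails, but descriptor 3 is still the smallest free descriptor for the second open.)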

$ touch baz.txt
$ ./a.out
fd2 = 3

10.4 Reading and Writing Files

#include <unistd.h>
ssize_t read(int fd, void *buf, size_t n);
//Returns: number of bytes read if OK, 0 on EOF, −1 on error
ssize_t write(int fd, const void *buf, size_t n);
//Returns: number of bytes written if OK, −1 on error
  • The read function copies at most n bytes from the current file position of descriptor fd to memory location buf

  • The write function copies at most n bytes from memory location buf to the current file position of descriptor fd

  • Applications can explicitly modify the current file position by calling the lseek function

  • In some situations, read and write transfer fewer bytes than the application requests

    • Encountering EOF on reads.
    • Reading text lines from a terminal.
    • Reading and writing network sockets

So if you want to build robust (reliable) network applications such as Web servers, then you must deal with short counts by repeatedly calling read and write until all requested bytes have been transferred.

What’s the difference between ssize_t and size_t?

On x86-64 systems, a size_t is defined as an unsigned long, and an ssize_t (signed size) is defined as a long. The read function returns a signed size rather than an unsigned size because it must return a −1 on error.
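
A tiny sketch makes the consequence concrete: if the result of read is stored in a size_t, the error value −1 wraps around to the largest unsigned value and a test for a negative result can never succeed. The helper names here are hypothetical.

```c
#include <stdint.h>
#include <sys/types.h>

/* What an error return of -1 turns into when stored in a size_t. */
size_t as_unsigned(ssize_t result)
{
    return (size_t)result;      /* -1 wraps to SIZE_MAX */
}

/* Keeping the signed type lets the caller detect the error. */
int is_error(ssize_t result)
{
    return result < 0;
}
```

This is why the variable that receives the return value of read or write should be declared as ssize_t.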

10.5 Robust Reading and Writing with the Rio Package

In this section, we will develop an I/O package, called the Rio (Robust I/O) package, that handles these short counts for you automatically.

Rio provides two different kinds of functions

  • Unbuffered input and output functions.
    • These functions transfer data directly between memory and a file, with no application-level buffering
  • Buffered input functions

10.5.1 Rio Unbuffered Input and Output Functions

#include <unistd.h>
#include <sys/types.h>
#include <fcntl.h>
#include <errno.h>

//Returns: number of bytes transferred if OK, 0 on EOF (rio_readn only), −1 on error

ssize_t rio_readn(int fd, void *usrbuf, size_t n)
{
    size_t nleft = n;
    ssize_t nread;      /* must be signed: read() returns -1 on error */
    char *bufp = usrbuf;

    while (nleft > 0) {
        if ((nread = read(fd, bufp, nleft)) < 0) {
            if (errno == EINTR)     /* Interrupted by sig handler return */
                nread = 0;          /* and call read() again */
            else
                return -1;          /* errno set by read() */
        }
        else if (nread == 0)        /* Encountered EOF */
            break;

        nleft -= nread;
        bufp += nread;
    }

    /*
     * On error, -1 was returned above.
     * On success, this is n.
     * On EOF, this is the short count of bytes read before EOF (possibly 0).
     */
    return (n - nleft);
}


ssize_t rio_writen(int fd, void *usrbuf, size_t n)
{
    size_t nleft = n;
    ssize_t nwritten;
    char *bufp = usrbuf;

    while (nleft > 0) {
        if ((nwritten = write(fd, bufp, nleft)) <= 0) {
            if (errno == EINTR)     /* Interrupted by sig handler return */
                nwritten = 0;       /* and call write() again */
            else
                return -1;          /* errno set by write() */
        }
        nleft -= nwritten;
        bufp += nwritten;
    }
    return n;                       /* never returns a short count */
}

10.5.2 Rio Buffered Input Functions

System-level I/O operations are expensive, so we reduce the number of system calls by placing a buffer between the user code and the underlying I/O operations.

buffer io

printf

For example

#include <stdio.h>
#include <stdlib.h>

int main(){
    printf("H");
    printf("e");
    printf("l");
    printf("l");
    printf("o");
    printf("\n");
    fflush(stdout);

    return 0;
}

Tracing the program with strace shows that the six printf calls result in a single write system call:

$ strace -e trace=write ./p
write(1, "Hello\n", 6Hello
) = 6
+++ exited with 0 +++

Implementation

#define BSIZE 1000
struct rio {
    int rio_fd;         //file descriptor associated with this buffer
    int rio_size;       //number of unread bytes in the buffer
    char *ptr;          //start of the unread bytes in the buffer
    char buf[BSIZE];    //the buffer itself
};
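
To show how such a structure might be used, here is a minimal sketch of a buffered line reader built on the fields above. It is modeled loosely on the book's rio_readinitb and rio_readlineb but is not the book's exact code; the internal helper rio_bread is a hypothetical name.

```c
#include <errno.h>
#include <string.h>
#include <unistd.h>

#define BSIZE 1000

struct rio {
    int rio_fd;         /* file descriptor associated with this buffer */
    int rio_size;       /* number of unread bytes in the buffer */
    char *ptr;          /* start of the unread bytes in the buffer */
    char buf[BSIZE];    /* the buffer itself */
};

/* Associate an open descriptor with an empty buffer. */
void rio_readinitb(struct rio *rp, int fd)
{
    rp->rio_fd = fd;
    rp->rio_size = 0;
    rp->ptr = rp->buf;
}

/* Buffered stand-in for read(): calls read() only when the buffer is empty. */
static ssize_t rio_bread(struct rio *rp, char *usrbuf, size_t n)
{
    while (rp->rio_size <= 0) {             /* refill the empty buffer */
        ssize_t got = read(rp->rio_fd, rp->buf, sizeof(rp->buf));
        if (got < 0) {
            if (errno != EINTR)
                return -1;                  /* real error, errno set by read() */
        } else if (got == 0) {
            return 0;                       /* EOF */
        } else {
            rp->rio_size = (int)got;
            rp->ptr = rp->buf;
        }
    }

    /* Copy min(n, unread bytes) from the buffer to the caller. */
    size_t cnt = n < (size_t)rp->rio_size ? n : (size_t)rp->rio_size;
    memcpy(usrbuf, rp->ptr, cnt);
    rp->ptr += cnt;
    rp->rio_size -= (int)cnt;
    return (ssize_t)cnt;
}

/* Read one text line of at most maxlen-1 bytes and NUL-terminate it.
 * Returns the number of bytes stored, 0 on EOF with no data, -1 on error. */
ssize_t rio_readlineb(struct rio *rp, char *usrbuf, size_t maxlen)
{
    if (maxlen == 0)
        return -1;

    size_t n = 0;
    while (n + 1 < maxlen) {
        char c;
        ssize_t rc = rio_bread(rp, &c, 1);
        if (rc < 0)
            return -1;
        if (rc == 0)
            break;                          /* EOF */
        usrbuf[n++] = c;
        if (c == '\n')
            break;                          /* end of line */
    }
    usrbuf[n] = '\0';
    return (ssize_t)n;
}
```

Reading a line one byte at a time would normally cost one system call per byte. With the buffer in between, most calls to rio_bread are served from memory.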

10.6 Reading File Metadata

An application can retrieve information about a file (sometimes called the file’s metadata) by calling the stat and fstat functions.

#include <unistd.h>
#include <sys/stat.h>
int stat(const char *filename, struct stat *buf);
int fstat(int fd, struct stat *buf);
//Returns: 0 if OK, −1 on error

stat structure

  • The stat function takes as input a filename and fills in the members of a stat structure shown in Figure 10.9.
  • The fstat function is similar, but it takes a file descriptor instead of a filename
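
As a small sketch of how the stat structure is typically consumed, the following function classifies a path using the st_mode member and the S_ISREG and S_ISDIR macros. The enum and the function name file_kind are hypothetical names chosen for this example.

```c
#include <sys/stat.h>
#include <sys/types.h>

enum kind { KIND_ERROR, KIND_REGULAR, KIND_DIRECTORY, KIND_OTHER };

/* Hypothetical helper: report the type of the file named by `path`. */
enum kind file_kind(const char *path)
{
    struct stat st;

    if (stat(path, &st) < 0)        /* fills in st, or fails and sets errno */
        return KIND_ERROR;
    if (S_ISREG(st.st_mode))        /* regular file? */
        return KIND_REGULAR;
    if (S_ISDIR(st.st_mode))        /* directory? */
        return KIND_DIRECTORY;
    return KIND_OTHER;              /* socket, pipe, device, ... */
}
```

The same structure also carries the file size in st_size, so a size query follows the same pattern.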

10.7 Reading Directory Contents

Applications can read the contents of a directory with the readdir family of functions.

#include <sys/types.h>
#include <dirent.h>
DIR * opendir(const char *name);
//Returns: pointer to handle if OK, NULL on error

The opendir function takes a pathname and returns a pointer to a directory stream. A stream is an abstraction for an ordered list of items, in this case a list of directory entries

#include <dirent.h>
struct dirent *readdir(DIR *dirp);
//Returns: pointer to next directory entry if OK, NULL if no more entries or error

struct dirent {
ino_t d_ino; /* inode number */
char d_name[256]; /* Filename */
};
//The d_name member is the filename, and d_ino is the file's inode number

On error, readdir returns NULL and sets errno. Unfortunately, the only way to distinguish an error from the end-of-stream condition is to check if errno has been modified since the call to readdir.

#include <dirent.h>
int closedir(DIR *dirp);
//Returns: 0 on success, −1 on error
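
Putting the three calls together, a directory scan might look like the sketch below. The function name count_entries is hypothetical. The errno handling follows the rule described above: clear errno before each readdir call and inspect it once readdir returns NULL.

```c
#include <dirent.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical helper: count the entries in a directory.
 * Returns the count, or -1 on error. */
long count_entries(const char *path)
{
    DIR *dirp = opendir(path);
    if (dirp == NULL)
        return -1;

    long count = 0;
    errno = 0;                          /* so a NULL return can be classified */
    while (readdir(dirp) != NULL) {
        count++;
        errno = 0;
    }
    int saved = errno;                  /* nonzero means readdir failed */

    closedir(dirp);
    return saved != 0 ? -1 : count;
}
```

On typical Linux filesystems the count includes the "." and ".." entries.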

10.8 Sharing Files

Linux files can be shared in a number of different ways. The kernel represents open files using three related data structures:

  • Descriptor table.
    • Each process has its own separate descriptor table whose entries are indexed by the process’s open file descriptors.
    • Each open descriptor entry points to an entry in the file table.
  • File table
    • The set of open files is represented by a file table that is shared by all processes
    • Each file table entry consists of (for our purposes) the current file position, a reference count of the number of descriptor entries that currently point to it, and a pointer to an entry in the v-node table
    • Closing a descriptor decrements the reference count in the associated file table entry.
    • The kernel will not delete the file table entry until its reference count is zero.
  • v-node table
    • Like the file table, the v-node table is shared by all processes
    • Each entry contains most of the information in the stat structure

file table

Multiple descriptors can also reference the same file through different file table entries, as shown in Figure 10.13

file sharing

This might happen, for example, if you were to call the open function twice with the same filename.

fork sharing

Practice Problem 10.2

Suppose the disk file foobar.txt consists of the six ASCII characters “foobar”. Then what is the output of the following program?

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>

int main(){
    int fd1, fd2;
    char c;

    fd1 = open("foobar.txt", O_RDONLY, 0);
    fd2 = open("foobar.txt", O_RDONLY, 0);

    read(fd1, &c, 1);
    read(fd2, &c, 1);
    printf("c = %c\n", c);

    return 0;
}

Verification:

Each call to open creates a separate file table entry with its own file position, so fd1 and fd2 each read from the beginning of the file.

c = f

Practice Problem 10.3

As before, suppose the disk file foobar.txt consists of the six ASCII characters foobar. Then what is the output of the following program?

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/wait.h>

int main(){
    int fd;
    char c;
    fd = open("foobar.txt", O_RDONLY, 0);

    if (fork() == 0){
        read(fd, &c, 1);
        exit(0);
    }

    wait(NULL);
    read(fd, &c, 1);

    printf("c = %c\n", c);
    exit(0);
}

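My solution (verified):

c = o

The child inherits a copy of the parent's descriptor table, so the descriptor in both processes points to the same file table entry and they share one file position. The child's read consumes 'f', and the parent's read then returns 'o'.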

10.9 I/O Redirection

Linux shells provide I/O redirection operators that allow users to associate standard input and output with disk files.

So how does I/O redirection work? One way is to use the dup2 function

#include <unistd.h>
int dup2(int oldfd, int newfd);
//Returns: nonnegative descriptor if OK, −1 on error

//duplicate old to new

The dup2 function copies descriptor table entry oldfd to descriptor table entry newfd, overwriting the previous contents of descriptor table entry newfd. If newfd was already open, then dup2 closes newfd before it copies oldfd.

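For example, suppose descriptor 1 (standard output) refers to file A and descriptor 4 refers to file B, and we call dup2(4, 1):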

dup

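After the call, both descriptors refer to file B. File A has been closed, and its file table and v-node table entries have been deleted.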
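
The following sketch shows roughly how a shell-style redirection of standard output can be built from dup2. The helper name write_redirected is hypothetical, and the use of dup to save and restore the original standard output is an addition for this example.

```c
#include <fcntl.h>
#include <unistd.h>

/* Hypothetical helper: write `msg` to standard output while descriptor 1
 * is temporarily redirected to `path`. Returns 0 on success, -1 on error. */
int write_redirected(const char *path, const char *msg, size_t len)
{
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    int saved = dup(STDOUT_FILENO);         /* remember the original stdout */
    if (saved < 0) {
        close(fd);
        return -1;
    }

    if (dup2(fd, STDOUT_FILENO) < 0) {      /* descriptor 1 now refers to the file */
        close(fd);
        close(saved);
        return -1;
    }
    close(fd);                              /* descriptor 1 keeps the file open */

    ssize_t n = write(STDOUT_FILENO, msg, len);

    dup2(saved, STDOUT_FILENO);             /* restore the original stdout */
    close(saved);
    return n == (ssize_t)len ? 0 : -1;
}
```

Closing fd right after dup2 is safe because the file table entry's reference count is still nonzero while descriptor 1 points to it.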

Practice Problem 10.5

Assuming that the disk file foobar.txt consists of the six ASCII characters foobar, what is the output of the following program?

#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/wait.h>

int main(){
    int fd1, fd2;
    char c;

    fd1 = open("foobar.txt", O_RDONLY, 0);
    fd2 = open("foobar.txt", O_RDONLY, 0);
    read(fd2, &c, 1);
    dup2(fd2, fd1);
    read(fd1, &c, 1);
    printf("c = %c\n", c);

    exit(0);
}

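My solution (verified):

c = o

After dup2(fd2, fd1), fd1 refers to the same file table entry as fd2. Its file position is already 1 after the first read, so the second read returns 'o'.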

10.10 Standard I/O

A stream of type FILE is an abstraction for a file descriptor and a stream buffer. The purpose of the stream buffer is the same as the Rio read buffer: to minimize the number of expensive Linux I/O system calls
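
A brief sketch of the stream buffer at work: the function below reads a file one character at a time with getc, yet most of those calls are served from the stream buffer rather than by a system call. The function name count_lines is hypothetical.

```c
#include <stdio.h>

/* Hypothetical helper: count the newline characters in a text file
 * using buffered standard I/O. Returns the count, or -1 on error. */
long count_lines(const char *path)
{
    FILE *fp = fopen(path, "r");        /* opens a descriptor and attaches a buffer */
    if (fp == NULL)
        return -1;

    long lines = 0;
    int c;
    while ((c = getc(fp)) != EOF) {     /* usually satisfied from the stream buffer */
        if (c == '\n')
            lines++;
    }

    int failed = ferror(fp);            /* distinguish a read error from EOF */
    fclose(fp);
    return failed ? -1 : lines;
}
```

Running a program like this under strace, as in section 10.5.2, is one way to see how few read system calls are made compared with the number of getc calls.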

10.11 Putting It Together: Which I/O Functions Should I Use?

IO functions

  • G1: Use the standard I/O functions whenever possible

  • G2: Don’t use scanf or rio_readlineb to read binary files

    • Functions like scanf and rio_readlineb are designed specifically for reading text files

    • Binary files might be littered with many 0xa bytes (the newline character) that have nothing to do with terminating text lines.

  • G3: Use the Rio functions for I/O on network sockets.
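
To illustrate G2, the sketch below stores the integer 10 in binary form, which puts a 0x0a byte in the file that has nothing to do with line endings. A line-oriented reader would split the data there, while fread simply returns the requested number of items. The function names are hypothetical, and using fread for the binary read is one reasonable choice rather than the only one.

```c
#include <stdint.h>
#include <stdio.h>

/* Count the bytes equal to 0x0a in a raw buffer. */
size_t count_0x0a(const void *data, size_t len)
{
    const unsigned char *p = data;
    size_t hits = 0;
    for (size_t i = 0; i < len; i++) {
        if (p[i] == 0x0a)
            hits++;
    }
    return hits;
}

/* Hypothetical helper: read up to `count` 32-bit integers from a binary
 * file. Returns the number of complete items read. */
size_t read_u32s(const char *path, uint32_t *out, size_t count)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return 0;
    size_t got = fread(out, sizeof out[0], count, fp);  /* no line semantics */
    fclose(fp);
    return got;
}
```

An array holding three copies of the value 10 contains three 0x0a bytes, so a text-line reader would see three "lines" in what is really 12 bytes of integer data.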

10.12 Summary