Ch8

Exceptional Control Flow

Systems must also be able to react to changes in system state that are not captured by internal program variables and are not necessarily related to the execution of the program.

  • A hardware timer goes off at regular intervals and must be dealt with
  • Packets arrive at the network adapter and must be stored in memory.
  • Programs request data from a disk and then sleep until they are notified that the data are ready.
  • Parent processes that create child processes must be notified when their children terminate
  • ….

Modern systems react to these situations by making abrupt changes in the control flow. In general, we refer to these abrupt changes as exceptional control flow (ECF).

  • ECF occurs at all levels of a computer system.

    • At the hardware level, events detected by the hardware trigger abrupt control transfers to exception handlers(press the power off button)
    • At the operating systems level, the kernel transfers control from one user process to another via context switches.
    • At the application level, a process can send a signal to another process that abruptly transfers control to a signal handler in the recipient
  • Understanding ECF will help you understand important systems concepts.

    ECF is the basic mechanism that operating systems use to implement I/O, processes, and virtual memory

  • Understanding ECF will help you understand how applications interact with the operating system.

    Applications request services from the operating system by using a form of ECF known as a trap or system call

  • Understanding ECF will help you write interesting new application programs.

    The operating system provides application programs with powerful ECF mechanisms for creating new processes, waiting for processes to terminate, notifying other processes of exceptional events in the system, and detecting and responding to these events.

    If you understand these ECF mechanisms, then you can use them to write interesting programs such as Unix shells and Web servers.

  • Understanding ECF will help you understand concurrency

  • Understanding ECF will help you understand how software exceptions work

    Software exceptions allow the program to make nonlocal jumps (i.e., jumps that violate the usual call/return stack discipline) in response to error conditions(C++ try,catch,throw)

    (Nonlocal jumps are a form of application-level ECF and are provided in C via the setjmp and longjmp functions.)

8.1 Exceptions

Exceptions are a form of exceptional control flow that are implemented partly by the hardware and partly by the operating system.

An exception is an abrupt change in the control flow in response to some change in the processor’s state

exceptions

  • The processor is executing some current instruction Icurr when a significant change in the processor’s state occurs. The state is encoded in various bits and signals inside the processor

  • The event might be directly related to the execution of the current instruction

    • a virtual memory page fault occurs
    • an arithmetic overflow occurs
    • an instruction attempts a divide by zero
    • ….
  • The event might be unrelated to the execution of the current instruction

    • a system timer goes off
    • an I/O request completes
    • ….
  • In any case, when the processor detects that the event has occurred, it makes an indirect procedure call (the exception), through a jump table called an exception table, to an operating system subroutine (the exception handler) that is specifically designed to process this particular kind of event.

    1
    2
    3
    4
    5
    6
    7
    8
    9
    10
    11
    12
    13
    def IOhandler:
    ...
    def D0handler:
    ...
    ...

    exception_table = {
    IO : IOhandler,
    divide_zero : D0handler,
    ....
    }
    processor:
    call(exception_table[event])
  • When the exception handler finishes processing, one of three things happens, depending on the type of event that caused the exception:

    • The handler returns control to the current instruction Icurr, the instruction that was executing when the event occurred.
    • The handler returns control to Inext, the instruction that would have executed next had the exception not occurred.
    • The handler aborts the interrupted program.

8.1.1 Exception Handling

Since exceptions are implement by the cooperation of hardware and software(the OS), the key to understand it is understand which component performs which task.

Each type of possible exception in a system is assigned a unique nonnegative integer exception number.

(Some of these numbers are assigned by the designers of the processor and other are assigned by the designers of the operating system kernel)

  • At system boot time (when the computer is reset or powered on), the operating system allocates and initializes a jump table called an exception table

  • At run time (when the system is executing some program), the processor detects that an event has occurred and determines the corresponding exception number k.

    The exception number is an index into the exception table, whose starting address is contained in a special CPU register called the exception table base register.

exception handler address generating

An exception is akin to a procedure call, but with some important differences:

  • As with a procedure call, the processor pushes a return address on the stack before branching to the handler. However, depending on the class of exception, the return address is either the current instruction or the next instruction

  • The processor also pushes some additional processor state onto the stack that will be necessary to restart the interrupted program when the handler returns.

    For example, an x86-64 system pushes the EFLAGS register containing the current condition codes, among other things, onto the stack.

  • When control is being transferred from a user program to the kernel, all of these items are pushed onto the kernel’s stack rather than onto the user’s stack.

  • Exception handlers run in kernel mode, which means they have complete access to all system resources.

Once the hardware triggers the exception, the rest of the work is done in software by the exception handler.

8.1.2 Classes of Exceptions

Exceptions can be divided into four classes: interrupts, traps, faults, and aborts.

exception classes

Interrupts

Interrupts occur asynchronously as a result of signals from I/O devices that are external to the processor.

Asynchronous means that these exceptions are not caused by execution any instructions, they are completely hardware stuff. Hence exception handlers for hardware interrupts are often called interrupt handlers

Process

interrupt handling

  1. I/O devices such as network adapters, disk controllers, and timer chips trigger interrupts by signaling a pin on the processor chip and placing onto the system bus the exception number that identifies the device that caused the interrupt

  2. After the current instruction finishes executing, the processor notices that the interrupt pin has gone high, reads the exception number from the system bus, and then calls the appropriate interrupt handler

  3. When the handler returns, it returns control to the next instruction. The effect is that the program continues executing as though the interrupt had never happened.

Traps and System Calls

Traps are intentional exceptions that occur as a result of executing an instruction

The most important use of traps is to provide a procedure-like interface between user programs and the kernel, known as a system call.

  • To allow controlled access to such kernel services,(read,fork,execve,exit,..) processors provide a special syscall n instruction that user programs can execute when they want to request service n.
  • Executing the syscall instruction causes a trap to an exception handler that decodes the argument and calls the appropriate kernel routine.

From a programmer’s perspective, a system call is identical to a regular function call. However, their implementations are quite different, regular function call run in user mode while system call run in kernal mode

trap handling

Faults

Faults result from error conditions that a handler might be able to correct

  1. When a fault occurs, the processor transfers control to the fault handler.
  2. If the handler is able to correct the error condition, it returns control to the faulting instruction, thereby re-executing it
  3. Otherwise, the handler returns to an abort routine in the kernel that terminates the application program that caused the fault

fault handling

Abort

Aborts result from unrecoverable fatal errors , typically hardware errors such as parity errors that occur when DRAM or SRAM bits are corrupted.

the handler returns control to an abort routine that terminates the application program.

abort handling

8.1.3 Exceptions in Linux/x86-64 Systems

In x86-64, here are up to 256 different exception types . Numbers in the range from 0 to 31 correspond to exceptions that are defined by the Intel architects and thus are identical for any x86-64 system. Numbers in the range from 32 to 255 correspond to interrupts and traps that are defined by the operating system.

examples

Linux/x86-64 Faults and Aborts

  • Divide error

    A divide error (exception 0) occurs when an application attempts to divide by zero or when the result of a divide instruction is too big for the destination operand

    Linux shells typically report divide errors as Floating exceptions.

  • General protection fault

    The infamous general protection fault (exception 13) occurs for many reasons, usually because a program references an undefined area of virtual memory or because the program attempts to write to a read-only text segment

    Linux shells typically report general protection faults as Segmentation faults.

  • Machine check

    A machine check (exception 18) occurs as a result of a fatal hardware error that is detected during the execution of the faulting instruction.

Linux/x86-64 System Calls

Each system call has a unique integer number that corresponds to an offset in a jump table in the kernel. (Notice that this jump table is not the same as the exception table.)

system calls

  • C programs can invoke any system call directly by using the syscall function.
  • The C standard library provides a set of convenient wrapper functions for most system calls. The wrapper functions package up the arguments, trap to the kernel with the appropriate system call instruction, and then pass the return status of the system call back to the calling program.

Let’s do a syscall manually

  • All arguments to Linux system calls are passed through general-purpose registers rather than the stack.
  • By convention, register %rax contains the syscall number, with up to six arguments in %rdi, %rsi, %rdx, %r10, %r8, and %r9.
  • On return from the system call, registers %rcx and %r11 are destroyed, and %rax contains the return value. A negative return value between −4,095 and −1 indicates an error corresponding to negative errno.
1
2
3
4
5
6
#include <unistd.h>
int main()
{
write(1, "hello, world\n", 13);
_exit(0);
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
# hand crafted assembly code
.section .data
string:
.ascii "hello, world\n"
string_end:
.equ len, string_end - string
.section .text
.globl main
main:
# First, call write(1, "hello, world\n", 13)
movq $1, %rax # write is system call 1
movq $1, %rdi # Arg1:stdout has descriptor 1
movq $string, %rsi # Arg2: hello world string
movq $len, %rdx # Arg3: string length
syscall # Make the system call

# Next, call _exit(0)
movq $60, %rax # _exit is system call 60
movq $0, %rdi # Arg1: exit status is 0
syscall # Make the system call
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
$ as handhello.s -o handhello.o
$ ld handhello.o -o handhello
ld: warning: cannot find entry symbol _start; defaulting to 00000000004000b0
$ ./handhello
hello, world
$ objdump -d handhello

handhello: file format elf64-x86-64


Disassembly of section .text:

00000000004000b0 <main>:
4000b0: 48 c7 c0 01 00 00 00 mov $0x1,%rax
4000b7: 48 c7 c7 01 00 00 00 mov $0x1,%rdi
4000be: 48 c7 c6 de 00 60 00 mov $0x6000de,%rsi
4000c5: 48 c7 c2 0d 00 00 00 mov $0xd,%rdx
4000cc: 0f 05 syscall
4000ce: 48 c7 c0 3c 00 00 00 mov $0x3c,%rax
4000d5: 48 c7 c7 00 00 00 00 mov $0x0,%rdi
4000dc: 0f 05 syscall

8.2 Processes

Exceptions are the basic building blocks that allow the operating system kernel to provide the notion of a process

When we run a program on a modern system, we are presented with the illusion that

  • our program is the only one currently running in the system
  • Our program appears to have exclusive use of both the processor and the memory.
  • The processor appears to execute the instructions in our program, one after the other, without interruption.

Each program in the system runs in the context of some process.

  • The context consists of the state that the program needs to run correctly
  • This state includes the program’s code and data stored in memory, its stack, the contents of its generalpurpose registers, its program counter, environment variables, and the set of open file descriptors.

8.2.1 Logical Control Flow

logical control flow

  • The single physical control flow of the processor is partitioned into three logical flows
  • Process A runs for a while, followed by B, which runs to completion. Process C then runs for a while, followed by A, which runs to completion. Finally, C is able to run to completion.

processes take turns using the processor

  • Each process executes a portion of its flow and then is preempted (temporarily suspended) while other processes take their turns
  • Each time the processor stalls, it subsequently resumes execution of our program without any change to the contents of the program’s memory locations or registers.

8.2.2 Concurrent Flows

A logical flow whose execution overlaps in time with another flow is called a concurrent flow, and the two flows are said to run concurrently

More precisely, flows X and Y are concurrent with respect to each other if and only if X begins after Y begins and before Y finishes, or Y begins after X begins and before X finishes.

For example, in Figure 8.12, processes A and B run concurrently, as do A and C. On the other hand, B and C do not run concurrently, because the last instruction of B executes before the first instruction of C

The general phenomenon of multiple flows executing concurrently is known as concurrency

Notice that the idea of concurrent flows is independent of the number of processor cores or computers that the flows are running on.

There is a proper subset of concurrent flows known as parallel flows

If two flows are running concurrently on different processor cores or computers, then we say that they are parallel flows, that they are running in parallel, and have parallel execution.

Practice Problem 8.1

Consider three processes with the following starting and ending times:

Process Start time End time
A 1 3
B 2 5
C 4 6

For each pair of processes, indicate whether they run concurrently (Y) or not (N):

My solution : :white_check_mark:

1
2
3
4
time 1 2 3 4 5 6
A r r r
B r r r r
C r r r
Process pair Concurrent ?
AB Y
AC N
BC Y

8.2.3 Private Address Space

A process provides each program with its own private address space. This space is private in the sense that a byte of memory associated with a particular address in the space cannot in general be read or written by any other process.

illustion

  • The code segment always begins at address 0x400000
  • The kernal part of the address space contains the code, data, and stack that the kernel uses when it executes instructions on behalf of the process (e.g. system call)

8.2.4 User and Kernel Modes

In order for the operating system kernel to provide an airtight process abstraction, the processor must provide a mechanism that restricts the instructions that an application can execute, as well as the portions of the address space that it can access

  • Processors typically provide this capability with a mode bit in some control register that characterizes the privileges that the process currently enjoys

  • The only way for the process to change from user mode to kernel mode is via an exception such as an interrupt, a fault, or a trapping system call.

  • When the exception occurs, and control passes to the exception handler, the processor changes the mode from user mode to kernel mode

    The handler runs in kernel mode. When it returns to the application code, the processor changes the mode from kernel mode back to user mode

Linux provides a clever mechanism, called the /proc filesystem, that allows user mode processes to access the contents of kernel data structures(READ ONLY)

  • The /proc filesystem exports the contents of many kernel data structures as a hierarchy of text files that can be read by user programs
  • The 2.6 version of the Linux kernel introduced a /sys filesystem, which exports additional low-level information about system buses and devices.
1
2
$ ll /proc/cpuinfo 
-r--r--r-- 1 root root 0 Sep 25 09:27 /proc/cpuinfo

8.2.5 Context Switches

The operating system kernel implements multitasking using a higher-level form of exceptional control flow known as a context switch. The context switch mechanism is built on top of the lower-level exception mechanism

The kernel maintains a context for each process. The context is the state that the kernel needs to restart a preempted process.

It consists of

  • the values of objects such as the general-purpose registers, the floating-point registers, the program counter, user’s stack, status registers, kernel’s stack
  • various kernel data structures such as a page table that characterizes the address space, a process table that contains information about the current process, and a file table that contains information about the files that the process has opened.

After the kernel has scheduled(start) a new process to run, it preempts the current process and transfers control to the new process using a mechanism called a context switch

  1. saves the context of the current process
  2. restores the saved context of some previously preempted process
  3. passes control to this newly restored process.

There are many scenarios context switch can happen

  • while the kernel is executing a system call

    If the system call blocks because it is waiting for some event to occur, then the kernel can put the current process to sleep and switch to another process.

    (e.g. Reading data from disk)

  • A context switch can also occur as a result of an interrupt.

context switch example

  • Initially process A is running in user mode until it traps to the kernel by executing a read system call.
  • The trap handler in the kernel requests a DMA transfer from the disk controller, the transition will cost huge amount of time
  • Instead of waiting and doing nothing in the interim, the kernel performs a context switch from process A to B.
  • Process B then runs for a while in user mode until the disk sends an interrupt to signal that data have been transferred from disk to memory.
  • The processor return control in process A to the instruction immediately following the read system call.

During the first part of the switch, the kernel is executing instructions in kernel mode on behalf of process A. Then at some point it begins executing instructions (still in kernel mode) on behalf of process B. And after the switch, the kernel is executing instructions in user mode on behalf of process B.

Since context switch is a kernal job, it has nothing to do with the syscall of A

8.3 System Call Error Handling

When Unix system-level functions encounter an error, they typically return −1 and set the global integer variable errno to indicate what went wrong.

1
2
3
4
5
6
7
8
9
10
#include <stdlib.h> //for exit
#include <stdio.h> //for fprintf
#include <errno.h> //for errno
#include <string.h> //for strerror
#include <unistd.h> //for fork

if ((pid = fork()) < 0) {
fprintf(stderr, "fork error: %s\n", strerror(errno));
exit(0);
}
  • The strerror function returns a text string that describes the error associated with a particular value of errno
  • We can create error-handing wrapper to make our code consice along with error checking
1
2
3
4
5
6
void unix_error(char *msg) /* Unix-style error */
{
fprintf(stderr, "%s: %s\n", msg, strerror(errno));
exit(0);
}
if ((pid = fork()) < 0) unix_error("fork error");

8.4 Process Control

Unix provides a number of system calls for manipulating processes from C programs.

8.4.1 Obtaining Process IDs

  • Each process has a unique positive (nonzero) process ID (PID)
  • The getpid function returns the PID of the calling process
  • The getppid function returns the PID of its parent (i.e., the process that created the calling process)
  • Return an integer value of type pid_t, which on Linux systems is defined in types.h as an int.
1
2
3
4
5
6
7
int main(){
pid_t current = getpid();
pid_t parent = getppid();
printf("%d %d\n" , current , parent);

return 0;
}

8.4.2 Creating and Terminating Processes

From a programmer’s perspective, we can think of a process as being in one of three states: Running, Stopped and Terminated.

Terminal a process

1
2
#include <stdlib.h>
void exit(int status);

Create a child process

1
2
3
#include <sys/types.h>
#include <unistd.h>
pid_t fork(void);
  • A parent process creates a new running child process by calling the fork function.
  • The newly created child process is almost, but not quite, identical to the parent.
    • The child gets an identical (but separate) copy of the parent’s user-level virtual address space, including the code and data segments, heap, shared libraries, and user stack.
    • The child also gets identical copies of any of the parent’s open file descriptors, which means the child can read and write any files that were open in the parent when it called fork
    • The most significant difference between the parent and the newly created child is that they have different PIDs.
  • The fork function is interesting because it is called once but it returns twice:
    • In the parent, fork returns the PID of the child. In the child, fork returns a value of 0.
    • Since the PID of the child is always nonzero, the return value provides an unambiguous way to tell whether the program is executing in the parent or the child.

For example

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
int main()
{
pid_t pid;
int x = 1;

pid = fork();
if (pid == 0) { /* Child */
printf("child : x=%d\n", ++x);
exit(0);
}

/* Parent */
printf("parent: x=%d\n", --x);
exit(0);
}
  • Call once, return twice: programs with multiple instances of fork can be confusing and need to be reasoned about carefully.

  • Concurrent execution:

    • The parent and the child are separate processes that run concurrently

    • The instructions in their logical control flows can be interleaved by the kernel in an arbitrary way

      (The parent process completes its printf statement first, followed by the child, but the reverse can also be true)

    • As programmers we can never make assumptions about the interleaving of the instructions in different processes.

  • Duplicate but separate address spaces

  • Shared files: When we run the example program, we notice that both parent and child print their output on the screen. The reason is that the child inherits all of the parent’s open files(stdout)

    It is often helpful to sketch the process graph, which is a simple kind of precedence graph that captures the partial ordering of program statements

fork example

Practice Problem 8.2

Consider the following program:

1
2
3
4
5
6
7
8
9
10
int main()
{
int a = 9;

if (fork() == 0)
printf("p1: a=%d\n", a--);

printf("p2: a=%d\n", a++);
exit(0);
}

A. What is the output of the child process?

B. What is the output of the parent process?

My solution : :white_check_mark:

1
2
3
child output : p1: a=9
p2: a=8
parent output : p2: a=9

Verification :

1
2
3
4
5
$ ./fork > out
$ cat out
p2: a=9
p1: a=9
p2: a=8

8.4.3 Reaping Child Processes

When a process terminates for any reason, the kernel does not remove it from the system immediately.

  • The process is kept around in a terminated state until it is reaped by its parent
  • When the parent reaps the terminated child, the kernel passes the child’s exit status to the parent and then discards the terminated process, at which point it ceases to exist
  • A terminated process that has not yet been reaped is called a zombie.

When a parent process terminates, the kernel arranges for the init process to become the adopted parent of any orphaned children

  • The init process, which has a PID of 1, is created by the kernel during system start-up, never terminates, and is the ancestor of every process.
  • If a parent process terminates without reaping its zombie children, then the kernel arranges for the init process to reap them.
  • However, long-running programs such as shells or servers should always reap their zombie children
1
2
3
4
5
#include <sys/types.h>
#include <sys/wait.h>
pid_t waitpid(pid_t pid, int *statusp, int options);
//Returns: PID of child if OK, 0 (if WNOHANG), or −1 on error
//if the first argument is −1, the call to waitpid blocks until an arbitrary child has terminated
  • By default (when options = 0), waitpid suspends execution of the calling process until a child process in its wait set terminates.
  • If a process in the wait set has already terminated at the time of the call, then waitpid returns immediately.
  • In either case, waitpid returns the PID of the terminated child that caused waitpid to return. At this point, the terminated child has been reaped and the kernel removes all traces of it from the system

Determining the Members of the Wait Set

The members of the wait set are determined by the pid argument:

1
2
3
4
5
def number_of_wait_set(pid):
if pid > 0:
return 1
elif pid == -1:
return len(all_children_process)

Modifying the Default Behavior

The default behavior can be modified by setting options to various combinations of the WNOHANG, WUNTRACED, and WCONTINUED constants

  • WNOHANG
    • Return immediately (with a return value of 0) if none of the child processes in the wait set has terminated yet
    • This option is useful in those cases where you want to continue doing useful work while waiting for a child to terminate.
  • WUNTRACED
    • Suspend execution of the calling process until a process in the wait set becomes either terminated or stopped.
    • This option is useful when you want to check for both terminated and stopped children.
  • WCONTINUED
    • Suspend execution of the calling process until a running process in the wait set is terminated or until a stopped process in the wait set has been resumed by the receipt of a SIGCONT signal.

Checking the Exit Status of a Reaped Child

If the statusp argument is non-NULL, then waitpid encodes status information about the child that caused the return in status, which is the value pointed to by statusp

The wait.h include file defines several macros for interpreting the status argument

  • WIFEXITED(status). Returns true if the child terminated normally, via a call to exit or a return.
  • WEXITSTATUS(status). Returns the exit status of a normally terminated child. This status is only defined if WIFEXITED() returned true.
  • ……

Error Conditions

If the calling process has no children, then waitpid returns −1 and sets errno to ECHILD. If the waitpid function was interrupted by a signal, then it returns −1 and sets errno to EINTR


Practice Problem 8.3

List all of the possible output sequences for the following program

1
2
3
4
5
6
7
8
9
10
11
12
13
int main()
{
if (fork() == 0) {
printf("9"); fflush(stdout);
}
else {
printf("0"); fflush(stdout);
waitpid(-1, NULL, 0);
}

printf("3"); fflush(stdout);
printf("6"); exit(0);
}

My solution: :white_check_mark:

1
2
3
4
5
6
7
8
main -> fork -+-> print(0) -> waitpid -> print(3) -> print(6)
|
+-> print(9) -> print(3) -> print(6)

093636
903636
930636
936036

The wait Function

The wait function is a simpler version of waitpid.

1
2
3
4
#include <sys/types.h>
#include <sys/wait.h>
pid_t wait(int *statusp);
//Returns: PID of child if OK or −1 on error

Calling wait(&status) is equivalent to calling waitpid(-1, &status, 0).

Examples of Using waitpid

A program that uses waitpid to wait, in no particular order, for all of its N children to terminate:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
#include "csapp.h"
#define N 2

int main()
{
int status, i;
pid_t pid;

/* Parent creates N children */
for (i = 0; i < N; i++)
if ((pid = Fork()) == 0) /* child */
exit(100+i);

/* Parent reaps N children in no particular order */
while ((pid = waitpid(-1, &status, 0)) > 0) {
if (WIFEXITED(status))
printf("child %d terminated normally with exit status=%d\n",
pid, WEXITSTATUS(status));
else
printf("child %d terminated abnormally\n", pid);
}

/* The only normal termination is if there are no more children */
if (errno != ECHILD)
unix_error("waitpid error");

exit(0);
}

The order that they were reaped is a property of this specific computer system. On another system, or even another execution on the same system, the two children might have been reaped in the opposite order.

This is an example of the nondeterministic behavior that can make reasoning about concurrency so difficult

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
//Using waitpid to reap zombie children in the order they were created.
/* $begin waitpid2 */
#include "csapp.h"
#define N 2

int main()
{
int status, i;
pid_t pid[N], retpid;

/* Parent creates N children */
for (i = 0; i < N; i++)
if ((pid[i] = Fork()) == 0) /* Child */
exit(100+i);

/* Parent reaps N children in order */
i = 0;
while ((retpid = waitpid(pid[i++], &status, 0)) > 0) {
if (WIFEXITED(status))
printf("child %d terminated normally with exit status=%d\n",
retpid, WEXITSTATUS(status));
else
printf("child %d terminated abnormally\n", retpid);
}

/* The only normal termination is if there are no more children */
if (errno != ECHILD)
unix_error("waitpid error");

exit(0);
}
/* $end waitpid2 */

Practice Problem 8.4

Consider the following program:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
#include <stdlib.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
int main()
{
int status;
pid_t pid;

printf("Start\n");
pid = fork();
printf("%d\n", !pid);

if (pid == 0) {
printf("Child\n");
}
else if ((waitpid(-1, &status, 0) > 0) && (WIFEXITED(status) != 0)) {
printf("%d\n", WEXITSTATUS(status));
}
printf("Stop\n");
exit(2);
}

A. How many output lines does this program generate?

B. What is one possible ordering of these output lines?

My solution:

1
2
3
main -> print("start") -> fork() +-> print(pid)->print(WEXITSTATUS(status)) -> print("Stop")
|
+-> print(pid)->print("Child") -> print("Stop")

7 lines

Verification:

1
2
3
4
5
6
7
Start
0
1
Child
Stop
2
Stop

8.4.4 Putting Processes to Sleep

The sleep function suspends a process for a specified period of time

1
2
3
#include <unistd.h>
unsigned int sleep(unsigned int secs);
//Returns: seconds left to sleep

Sleep returns zero if the requested amount of time has elapsed, and the number of seconds still left to sleep otherwise(This is possible when the process is interrupted by a SIGNAL).

1
2
#include <unistd.h>
int pause(void);

pause function, which puts the calling function to sleep until a signal is received by the process.

Practice Problem 8.5

Write a wrapper function for sleep, called wakeup, with the following interface: unsigned int wakeup(unsigned int secs);

The wakeup function behaves exactly as the sleep function, except that it prints a message describing when the process actually woke up: Woke up at 4 secs.

My solution : :white_check_mark:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>
void unixerror(char * msg){
fprintf(stderr , "%s :%s\n" , msg , strerror(errno));
exit(0);
}
unsigned int wakeup(unsigned int secs){
unsigned int elapsed = secs - sleep(secs);
printf("Woke up at %u secs\n" , elapsed );
}

int main(){
wakeup(100);
return 0;
}

8.4.5 Loading and Running Programs

The execve function loads and runs a new program in the context of the current process.(Override code,data,stack,…., completely corrupt the current program! It is the same process, but running a different program!)

1
2
3
4
#include <unistd.h>
int execve(const char *filename, const char *argv[],
const char *envp[]);
//Does not return if OK; returns −1 on error
  • argument list argv and the environment variable list envp
  • execve returns to the calling program only if there is an error, such as not being able to find filename.

arguments to execve

(Must terminate with NULL in the array)

After execve loads filename, it calls the start-up code. The start-up code sets up the stack and passes control to the main routine of the new program

new stack organization


Manipulate environment variables

1
2
3
4
5
6
7
#include <stdlib.h>
char *getenv(const char *name);
//Returns: pointer to name if it exists, NULL if no match
int setenv(const char *name, const char *newvalue, int overwrite);
//Returns: 0 on success, −1 on error
void unsetenv(const char *name);
//Returns: nothing

Practice Problem 8.6

Write a program called myecho that prints its command-line arguments and environment variables. For example:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
$ ./myecho arg1 arg2
Command-ine arguments:
argv[ 0]: myecho
argv[ 1]: arg1
argv[ 2]: arg2
Environment variables:
envp[ 0]: PWD=/usr0/droh/ics/code/ecf
envp[ 1]: TERM=emacs
.
.
.
envp[25]: USER=droh
envp[26]: SHELL=/usr/local/bin/tcsh
envp[27]: HOME=/usr0/droh

My solution : :white_check_mark:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
#include <stdio.h>
#include <stdlib.h>

void printarg(int argc , char* argv [] ){
for (int i = 0; i < argc ; i++){
printf("argv [%2d]: %s\n" , i , argv[i]);
}
}

void printenv(char * envp []){
char** walk = envp;
int cot = 0;
while(*walk != NULL){
printf("envp [%2d]: %s\n" , cot , *walk);
cot++;
walk++;
}
}

int main(int argc , char* argv [] , char* envp []){
printf("Command-ine arguments:\n");
printarg(argc , argv);
printf("Environment variables:\n");
printenv(envp);
}

8.4.6 Using fork and execve to Run Programs

Programs such as Unix shells and Web servers make heavy use of the fork and execve functions

  • A shell is an interactive application-level program that runs other programs on behalf of the user
  • A shell performs a sequence of read/evaluate steps and then terminates.
    • The read step reads a command line from the user.
    • The evaluate step parses the command line and runs programs on behalf of the user.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
/* function prototypes */
void eval(char *cmdline);
int parseline(char *buf, char **argv);
int builtin_command(char **argv);

int main()
{
char cmdline[MAXLINE]; /* Command line */

while (1) {
/* Read */
printf("> ");
Fgets(cmdline, MAXLINE, stdin);
if (feof(stdin))
exit(0);

/* Evaluate */
eval(cmdline);
}
}
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
void eval(char *cmdline) 
{
char *argv[MAXARGS]; /* Argument list execve() */
char buf[MAXLINE]; /* Holds modified command line */
int bg; /* Should the job run in bg or fg? */
pid_t pid; /* Process id */

strcpy(buf, cmdline);
bg = parseline(buf, argv);
if (argv[0] == NULL)
return; /* Ignore empty lines */

if (!builtin_command(argv)) {
if ((pid = Fork()) == 0) { /* Child runs user job */
if (execve(argv[0], argv, environ) < 0) {
printf("%s: Command not found.\n", argv[0]);
exit(0);
}
}

/* Parent waits for foreground job to terminate */
if (!bg) {
int status;
if (waitpid(pid, &status, 0) < 0)
unix_error("waitfg: waitpid error");
}
else
printf("%d %s", pid, cmdline);
}
return;
}
  • If the last argument is an ‘&’ character, then parseline returns 1, indicating that the program should be executed in the background (the shell does not wait for it to complete)( the shell returns to the top of the loop and waits for the next command line.).
  • Otherwise, it returns 0, indicating that the program should be run in the foreground(the shell waits for it to complete)(uses the waitpid function to wait for the job to terminate. )
  • After parsing the command line, the eval function calls the builtin_command function, which checks whether the first command-line argument is a built-in shell command. If so, it interprets the command immediately and returns 1. Otherwise, it returns 0
  • If builtin_command returns 0, then the shell creates a child process and executes the requested program inside the child
  • Notice that this simple shell is flawed because it does not reap any of its background children. Correcting this flaw requires the use of signals, which we describe in the next section.

Complete source code

8.5 Signals

In this section, we will study a higher-level software form of exceptional control flow, known as a Linux signal, that allows processes and the kernel to interrupt other processes.

Linux signals

A signal is a small message that notifies a process that an event of some type has occurred in the system

  • Low-level hardware exceptions are processed by the kernel’s exception handlers and would not normally be visible to user processes
  • Signals provide a mechanism for exposing the occurrence of such exceptions to user processes
  • For example, if a process attempts to divide by zero, then the kernel sends it a SIGFPE signal (number 8)

8.5.1 Signal Terminology

The transfer of a signal to a destination process occurs in two distinct steps:

  • Sending a signal
    • The kernelsends (delivers) a signal to a destination process by updating some state in the context of the destination process
  • Receiving a signal
    • A destination process receives a signal when it is forced by the kernel to react in some way to the delivery of the signal.
    • The process can either ignore the signal, terminate, or catch the signal by executing a user-level function called a signal handler

signal handler


A signal that has been sent but not yet received is called a pending signal.

  • At any point in time, there can be at most one pending signal of a particular type.
    • If a process has a pending signal of type k, then any subsequent signals of type k sent to that process are not queued; they are simply discarded
  • A process can selectively block the receipt of certain signals.
    • When a signal is blocked, it can be delivered, but the resulting pending signal will not be received until the process unblocks the signal.

8.5.2 Sending Signals

Unix systems provide a number of mechanisms for sending signals to processes. All of the mechanisms rely on the notion of a process group.

Process Groups

  • Every process belongs to exactly one process group, which is identified by a positive integer process group ID
1
2
3
#include <unistd.h>
pid_t getpgrp(void);
//Returns: process group ID of calling process
  • By default, a child process belongs to the same process group as its parent
  • A process can change the process group of itself or another process by using the setpgid function:
1
2
3
#include <unistd.h>
int setpgid(pid_t pid, pid_t pgid);
//Returns: 0 on success, −1 on error

Sending Signals with the /bin/kill Program

  • The /bin/kill program sends an arbitrary signal to another process.
  • some Unix shells have their own built-in kill command

Sending Signals from the Keyboard

Unix shells use the abstraction of a job to represent the processes that are created as a result of evaluating a single command line

  • At any point in time, there is at most one foreground job and zero or more background jobs.
  • For example, typing linux> ls | sort creates a foreground job consisting of two processes connected by a Unix pipe: one running the ls program, the other running the sort program.

process groups

Typing Ctrl+C at the keyboard causes the kernel to send a SIGINT signal to every process in the foreground process group

Sending Signals with the kill Function

Processes send signals to other processes (including themselves) by calling the kill function

1
2
3
4
#include <sys/types.h>
#include <signal.h>
int kill(pid_t pid, int sig);
//Returns: 0 if OK, −1 on error

Sending Signals with the alarm Function

1
2
3
#include <unistd.h>
unsigned int alarm(unsigned int secs);
//Returns: remaining seconds of previous alarm, or 0 if no previous alarm

8.5.3 Receiving Signals

When the kernel switches a process p from kernel mode to user mode (e.g., returning from a system call or completing a context switch):

1
2
3
4
5
6
7
def receive_signal(signal):
action()

if len(p.unblocked_pending_signals) != 0:
p.receive_signal(p.unblocked_pending_signals[one_of_the_signals])

p.continue()

Each signal type has a predefined default action, which is one of the following:

  • The process terminates.
  • The process terminates and dumps core.
  • The process stops (suspends) until restarted by a SIGCONT signal.
  • The process ignores the signal.

A process can modify the default action associated with a signal by using the signal function. The only exceptions are SIGSTOP and SIGKILL, whose default actions cannot be changed.

1
2
3
4
#include <signal.h>
typedef void (*sighandler_t)(int);
sighandler_t signal(int signum, sighandler_t handler);
//Returns: pointer to previous handler if OK, SIG_ERR on error (does not set errno)

The signal function can change the action associated with a signal signum in one of three ways:

  • If handler is SIG_IGN, then signals of type signum are ignored.
  • If handler is SIG_DFL, then the action for signals of type signum reverts to the default action
  • Otherwise, handler is the address of a user-defined function, called a signal handler, that will be called whenever the process receives a signal of type signum.
    • Changing the default action by passing the address of a handler to the signal function is known as installing the handler
    • The invocation of the handler is called catching the signal.
    • The execution of the handler is referred to as handling the signal.
  • When a process catches a signal of type k, the handler installed for signal k is invoked with a single integer argument set to k. This argument allows the same handler function to catch different types of signals.
  • When the handler executes its return statement, control (usually) passes back to the instruction in the control flow where the process was interrupted by the receipt of the signal.

For example

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
//catch ctrl-c signal
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>
void handler(int signum){
printf("Ctrl-c detected!\n");
return;
}
int main(){
//install the handler
if (signal(SIGINT, &handler) == SIG_ERR){
fprintf(stderr, "error\n");
exit(0);
}
else{
while(1){
;
}
}
}
1
2
3
4
5
$ ./noctrl 
^CCtrl-c detected!
^CCtrl-c detected!
^CCtrl-c detected!
^CCtrl-c detected!

Signal handlers can be interrupted by other handlers

multiple handlers

  • the main program catches signal s, which interrupts the main program and transfers control to handler S
  • While S is running, the program catches signal t = s, which interrupts S and transfers control to handler T
  • When T returns, S resumes where it was interrupted. Eventually, S returns, transferring control back to the main program, which resumes where it left off.

Practice Problem 8.7

Write a program called snooze that takes a single command-line argument, calls the snooze function from Problem 8.5 with this argument, and then terminates.

Write your program so that the user can interrupt the snooze function by typing Ctrl+C at the keyboard. For example:

1
2
3
4
$ ./snooze 5
^C # (User hits Crtl+C after 3 seconds)
Slept for 3 of 5 secs.
$

My solution : :white_check_mark:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <signal.h>

void handler(int signum){
return;
}

int main(int argc , char* argv []){

//install handler
if (signal(SIGINT, handler) == SIG_ERR){
fprintf(stderr, "error\n");
exit(0);
}

int sec = atoi(argv[1]);
unsigned int elapsed = sec - sleep(sec);
printf("Slept for %d of %d secs.\n" , elapsed , sec );
}

Verification:

1
2
$ ./snooze 10
^CSlept for 5 of 10 secs.