Ch2PartII

Representing and Manipulating Information (Part II)

2.3 Integer Arithmetic

2.3.1 Unsigned Addition

Consider two nonnegative integers x and y, such that $0 ≤ x, y < 2^w$. Each of these values can be represented by a w-bit unsigned number. If we compute their sum, however, we have a possible range $0 ≤ x + y ≤ 2^{w+1} − 2$. Representing this sum could require w + 1 bits.

If we were to maintain the sum as a $(w + 1)$-bit number and add it to another value, we may require $w + 2$ bits, and so on. This continued “word size inflation” means we cannot place any bound on the word size required to fully represent the results of arithmetic operations.

Some programming languages, such as Lisp, actually support arbitrary size arithmetic to allow integers of any size .More commonly, programming languages support fixed-size arithmetic.

**Let us define the operation $+^u_w$ for arguments x and y, where 0 ≤ x, y < $2^w$, as the result of truncating the integer sum x + y to be w bits long and then viewing the result as an unsigned number. **

This can be characterized as a form of modular arithmetic, computing the sum modulo $2^w$ by simply discarding any bits with weight greater than $2^{w−1}$ in the bit-level representation of x + y.

principle: Unsigned addition

For x and y such that $0 ≤ x,y < 2^w$:
$$
x + y = \begin{cases}
x+y , & \text{normal}\
x+y-2^w , &\text{overflow} , 2^w \le x+y < 2 ^{w+1}
\end{cases}
$$

principle: Detecting overflow of unsigned addition

For x and y in the range $0 ≤ x,y ≤ UMax_w$, let $s = x +^u_w y$. Then the computation of s overflowed if and only if :
$$
s < x
$$
or equivalently:
$$
s < y
$$
derivation:

If the computation is normal
$$
s = x+y \ge x
$$
Also, for w bits data
$$
UMax_w = 2^w-1 < 2^w
$$
If overflow happens, s must smaller than x(y).


Practice Problem 2.27

Write a function with the following prototype:

1
/* Determine whether arguments can be added without overflow */ int uadd_ok(unsigned x, unsigned y);

This function should return 1 if arguments x and y can be added without causing overflow.

My solution: :white_check_mark:

1
2
3
4
int uadd_ok(unsigned x, unsigned y){    
unsigned sum = x + y ;
return !(sum < x);
}

principle: Unsigned negation

For any number x such that $0 ≤ x < 2^w$, its w-bit unsigned negation $-^u_w$ x is given by the following:
$$
-^u_wx = \begin{cases}x,&x = 0\
2^w-x,& \text{otherwise}
\end{cases}
$$
derivation:

Since unsigned addition use a modulo strategy, we have an abelian group. Therefore, the inverse of x has the property that:
$$
(-x +x) \mod 2^w = 0
$$

Practice Problem 2.28

We can represent a bit pattern of length w = 4 with a single hex digit. For an unsigned interpretation of these digits, use Equation 2.12 to fill in the following table giving the values and the bit representations (in hex) of the unsigned additive inverses of the digits shown.

x:

Hex Decimal
1
4
7
A
E

-x:

Hex Decimal

My solution: :white_check_mark:

x:

Hex Decimal
1 1
4 4
7 7
A 10
E 14

-x:

Hex Decimal
F 15
C 12
9 9
6 6
2 2

2.3.2 Two’s-Complement Addition

Given integer values x and y in the range $−2^{w−1} ≤ x, y ≤ 2^{w−1} − 1$, their sum is in the range $−2^w ≤ x + y ≤ 2^{w} − 2$, potentially requiring w + 1 bits to represent exactly.

We truncate the result by dropping the leading bit, then view it as 2’s complement

principle: Two’s-complement addition

For integer values x and y in the range $−2^{w−1} ≤ x,y ≤ 2^{w−1} − 1$:
$$
x +^t_w y = \begin{cases}
x+y-2^w , & 2^{w-1} \le x+y , &\text{positive overflow}\
x+y , &−2^{w−1} ≤ x+y ≤ 2^{w−1} − 1 , &\text{normal}\
x+y+2^w , &x+y < -2^{w-1} , &\text{negative overflow}
\end{cases}
$$
The w-bit two’s-complement sum of two numbers has the exact SAME bit-level representation as the unsigned sum.

Derivation:

Since two’s-complement addition has the exact same bit-level representation as unsigned addition, we can characterize the operation $+^t_w$ as one of converting its arguments to unsigned, performing unsigned addition, and then converting back to two’s complement:
$$
x +^t_wy= U2T_w(T2U_w(x)+^u_wT2U_w(y))\
=U2T_w([(x_{w−1}2^w + x + y_{w−1}2^w + y) \mod 2^w])\
=U2T_w(x+y \mod 2^w)
$$
example

principle: Detecting overflow in two’s-complement addition

For x and y in the range $TMin_w ≤ x,y ≤ TMax_w$, let $s = x +^t_w y$. Then the computation of s

  • has had positive overflow if and only if x > 0 and y > 0 but s ≤ 0.
  • has had negative overflow if and only if x < 0 and y < 0 but s ≥ 0.

Practice Problem 2.29

Fill in the following table in the style of Figure 2.25. Give the integer values of the 5-bit arguments, the values of both their integer and two’s-complement sums, the bit-level representation of the two’s-complement sum, and the case from the derivation of Equation 2.13.

x y x+y $x+^t_w y $ Case
10100 10001
11000 11000
10111 01000
00010 00101
01100 00100

My solution: :white_check_mark:

x y x+y $x+^t_w y $ Case
10100 10001 100101 00101
-12 -15 -27 5 1
11000 11000 110000 10000
-8 -8 -16 -16 2
10111 01000 11111 11111
-9 8 -1 -1 2
00010 00101 00111 00111
2 5 7 7 3
01100 00100 10000 10000
12 4 16 -16 4

Practice Problem 2.30:

Write a function with the following prototype:

1
/* Determine whether arguments can be added without overflow */ int tadd_ok(int x, int y); 

This function should return 1 if arguments x and y can be added without causing overflow.

My solution: :warning:

1
2
3
4
5
6
int tadd_ok(int x , int y){
int sum = x + y ;
if ( (x > 0 && y > 0 && sum <= 0 ) ||
(x < 0 && y < 0 && sum >= 0) ) return 0;
return 1;
}

Solution on the book:

This function is a direct implementation of the rules given to determine whether or not a two’s-complement addition overflows.

1
2
3
4
5
6
7
/* Determine whether arguments can be added without overflow */
int tadd_ok(int x, int y) {
int sum = x+y;
int neg_over = x < 0 && y < 0 && sum >= 0;
int pos_over = x >= 0 && y >= 0 && sum < 0;
return !neg_over && !pos_over;
}

Practice Problem 2.31

Your coworker gets impatient with your analysis of the overflow conditions for two’s-complement addition and presents you with the following implementation of tadd_ok:

1
2
3
4
5
6
/* Determine whether arguments can be added without overflow */
/* WARNING: This code is buggy. */
int tadd_ok(int x, int y) {
int sum = x+y;
return (sum-x == y) && (sum-y == x);
}

You look at the code and laugh. Explain why.

My solution: :x:

assume both x and y are positive numbers, and the sum of them gets a positive overflow so the result is now negative. When doing sum-x, it is equivalent to sum + (-x), negative overflow might(must) happen and the result will be positive, so there possible some value can breach the function.

all value breach the function

Solution on the book:

Your coworker could have learned, by studying Section 2.3.2, that two’scomplement addition forms an abelian group, and so the expression (x+y)-x will evaluate to y regardless of whether or not the addition overflows, and that (x+y)-y will always evaluate to x.

More details

signed addition has no difference with unsigned addition in bit level!

Practice Problem 2.32

You are assigned the task of writing code for a function tsub_ok, with arguments x and y, that will return 1 if computing x-y does not cause overflow. Having just written the code for Problem 2.30, you write the following:

1
2
3
4
5
/* Determine whether arguments can be subtracted without overflow */
/* WARNING: This code is buggy. */
int tsub_ok(int x, int y) {
return tadd_ok(x, -y);
}

For what values of x and y will this function give incorrect results?

My solution : :warning:

For the reason that TMin’s inverse is itself, we have the following expressions.

if y=INT32_MIN, -y=INT32_MIN

the tadd_ok actually test x+INT32_MIN rather than x-INT32_MIN

Solution on the book:

This function will give correct values, except when y is TMin. In this case, we will have -y also equal to TMin, and so the call to function tadd_ok will indicate overflow when x is negative and no overflow when x is nonnegative.

In fact, the opposite is true: tsub_ok(x, TMin) should yield 0 when x is negative and 1 when it is nonnegative.

One lesson to be learned from this exercise is that TMin should be included as one of the cases in any test procedure for a function.


2.3.3 Two’s-Complement Negation

We can see that every number x in the range $TMin_w ≤ x ≤ TMax_w$ has an additive inverse under $+^t_w$

principle: Two’s-complement negation
$$
-^t_wx= \begin{cases}
TMin_w , & x= TMin_w\
-x , & \text{otherwise}
\end{cases}
$$

Practice Problem 2.33

We can represent a bit pattern of length w = 4 with a single hex digit. For a two’scomplement interpretation of these digits, fill in the following table to determine the additive inverses of the digits shown:

Hex of x Decimal of x Decimal of -x Hex of -x
2
3
9
b
c

What do you observe about the bit patterns generated by two’s-complement and unsigned negation?

My solution: :white_check_mark:

Hex of x Decimal of x Decimal of -x Hex of -x
2 2 -2 e
3 3 -3 d
9 -7 7 7
b -5 5 5
c -4 4 4

At bit representation perspective, unsigned negation and signed negation are the same

2.3.4 Unsigned Multiplication

Integers x and y in the range $0 ≤ x, y ≤ 2^{w} − 1$ can be represented as w-bit unsigned numbers, but their product $x \cdot y$ can range between 0 and $(2^w − 1)^2 = 2^{2w} − 2^{w+1} + 1$. This could require as many as 2w bits to represent.

Instead, unsigned multiplication in C is defined to yield the w-bit value given by the low-order w bits of the 2w-bit integer product.

We just truncate the high-order w bits.

principle: Unsigned multiplication

For x and y such that $0 ≤ x,y ≤ UMax_w$:
$$
x *^u_w y = (x\cdot y) \mod 2^w
$$

Example:
1
2
3
4
5
6
7
8
#include <stdio.h>
int main(){
unsigned int a = 300 , b = 400 , c = 500 , d = 600 ;
unsigned int product = a * b * c *d ;
printf(("%u\n") , product);
return 0;
}
//1640261632
1
2
3
4
5
6
7
8
9
10
from bitstring import BitArray as ba
product = 300 * 400 * 500 * 600
print(product)
p = ba(uint = product , length=64)
print(p.bin)
truncated = ba(bin=p.bin[32:])
print(truncated.uint)
#36000000000
#0000000000000000000000000000100001100001110001000110100000000000
#1640261632

2.3.5 Two’s-Complement Multiplication

Integers x and y in the range $−2^{w−1} ≤ x,y ≤ 2^{w−1} − 1$ can be represented as w-bit two’s-complement numbers, but their product x . y can range between $−2^{w−1} \cdot (2^{w−1} − 1) = −2^{2w−2} + 2^{w−1}$ and $−2^{w−1} \cdot −2^{w−1} = 2^{2w−2}$. This could require as many as 2w bits to represent in two’s-complement form.

signed multiplication in C generally is performed by truncating the 2w-bit product to w bits.

We just truncate the high-order w bits.

This is same with perform unsigned multiplication first and then convert it to 2’ complement.

principle: Two’s-complement multiplication

For x and y such that $TMin_w ≤ x,y ≤ TMax_w$:
$$
x *^t_w y = U2T_w((x \cdot y) \mod 2^w)
$$

We claim that the bit-level representation of the product operation is identical for both unsigned and two’s-complement multiplication.

principle: Bit-level equivalence of unsigned and two’s-complement multiplication
$$
T2B_w(x *^t_w y ) = U2B_w(x *^u_w y)
$$
The bitlevel representations of both truncated products are identical for both unsigned and two’s-complement multiplication, even though the full 6-bit representations differ.

Example

Derivation:

assume $x’ ,y’$ are unsigned number , $x,y$ are unsigned conterpart of them.
$$
(x’ \cdot y’) \mod 2^w = [(x + x_{w−1}2^w) . (y + y_{w−1}2^w)] \mod 2^w\
=(x . y) \mod 2^w
$$

Practice Problem 2.34

Fill in the following table showing the results of multiplying different 3-bit numbers, in the style of Figure 2.27:

Mode x x binary y y binary x*y x*y binray truncated x*y truncated x*y binary
Unsigned [100] [101]
2’s complement [100] [101]
Unsigned [010] [111]
2’s complement [010] [111]
Unsigned [110] [110]
2’s complement [110] [110]

My solution : :white_check_mark:

Mode x x binary y y binary x*y x*y binray truncated x*y truncated x*y binary
Unsigned 4 [100] 5 [101] 20 [010100] 4 [100]
2’s complement -4 [100] -3 [101] 12 [001100] -4 [100]
Unsigned 2 [010] 7 [111] 14 [001110] 6 [110]
2’s complement 2 [010] -1 [111] -2 [111110] -2 [110]
Unsigned 6 [110] 6 [110] 36 [100100] 4 [100]
2’s complement -2 [110] -2 [110] 4 [000100] -4 [100]

Practice Problem 2.35

You are given the assignment to develop code for a function tmult_ok that will determine whether two arguments can be multiplied without causing overflow. Here is your solution:

1
2
3
4
5
6
/* Determine whether arguments can be multiplied without overflow */
int tmult_ok(int x, int y) {
int p = x*y;
/* Either x is zero, or dividing p by x gives y */
return !x || p/x == y;
}

You test this code for a number of values of x and y, and it seems to work properly. Your coworker challenges you, saying, “If I can’t use subtraction to test whether addition has overflowed (see Problem 2.31), then how can you use division to test whether multiplication has overflowed?”

Devise a mathematical justification of your approach, along the following lines. First, argue that the case x = 0 is handled correctly.

Otherwise, consider w-bit numbers x (x $\ne$ 0), y, p, and q, where p is the result of performing two’scomplement multiplication on x and y, and q is the result of dividing p by x.

  1. Show that $x \cdot y$, the integer product of x and y, can be written in the form $x \cdot y = p + t2^w$, where t $\ne$ 0 if and only if the computation of p overflows.

  2. Show that p can be written in the form $p = x \cdot q + r$, where |r| < |x|.

  3. Show that q = y if and only if r = t = 0.

My solution: :warning:

if x is 0 , !x is true, the function works fine.

Otherwise:

if overflow happens
$$
\because p = x\cdot y \mod 2 ^w
$$

$$
\therefore x\cdot y = p + t\cdot 2^w , t\ne 0
$$

in C integer division gives integer , assume p/x =q
$$
\therefore p = x \cdot q + r
$$

$$
\therefore x \cdot y = x \cdot q + r + t\cdot 2^w
$$

$$
\therefore y = q + \frac{r}{x} + \frac{t\cdot 2^w}{x} \text{(here is why we need to check whether x = 0)}
$$

if and only if r=t=0 gives y = q.

Solution on the book:

solution

Practice Problem 2.36

For the case where data type int has 32 bits, devise a version of tmult_ok (Problem 2.35) that uses the 64-bit precision of data type int64_t, without using division.

My solution: :warning:

1
2
3
4
5
6
7
/* Determine whether arguments can be multiplied without overflow */    
int tmult_ok(int x, int y){
int64_t p = (int64_t)x * (int64_t)y;
int product = x * y ;
int64_t prod = (int64_t)product;
return !(prod ^ p);
}

Solution on the book :

With 64 bits, we can perform the multiplication without overflowing. We then test whether casting the product to 32 bits changes the value:

1
2
3
4
5
6
7
8
/* Determine whether the arguments can be multiplied
without overflow */
int tmult_ok(int x, int y) {
/* Compute product without overflow */
int64_t pll = (int64_t) x*y;
/* See if casting to int preserves value */
return pll == (int) pll;
}

Note that the casting on the right-hand side of line 5 is critical. If we instead wrote the line as int64_t pll = x*y;

the product would be computed as a 32-bit value (possibly overflowing) and then sign extended to 64 bits.

Practice Problem 2.37

You are given the task of patching the vulnerability in the XDR code shown in the aside on page 136 for the case where both data types int and size_t are 32 bits. You decide to eliminate the possibility of the multiplication overflowing by computing the number of bytes to allocate using data type uint64_t. You replace the original call to malloc (line 9) as follows:

1
2
uint64_t asize = ele_cnt * (uint64_t) ele_size;
void *result = malloc(asize);

Recall that the argument to malloc has type size_t.

A. Does your code provide any improvement over the original?

B. How would you change the code to eliminate the vulnerability?

My solution: :white_check_mark:

A : It doesn’t. Since the malloc has argument type size_t which is 32 bits, convert from uint64_t to size_t still truncate the high-order 32 bits.

B : If overflow happens, there is no way to handle the situation correctly.

1
2
3
uint64_t asize = ele_cnt * (uint64_t) ele_size;
if (asize != (size_t)asize) return NULL;
else void * result = malloc(asize);

2.3.6 Multiplying by Constants

Historically, the integer multiply instruction on many machines was fairly slow, requiring 10 or more clock cycles, whereas other integer operations—such as addition, subtraction, bit-level operations, and shifting—required only 1 clock cycle.

As a consequence, one important optimization used by compilers is to attempt to replace multiplications by constant factors with combinations of shift and addition operations.

principle: Multiplication by a power of 2

Let x be the unsigned integer represented by bit pattern $[x_{w−1}, x_{w−2},…,x_0]$. Then for any k ≥ 0, the w + k-bit unsigned representation of $x2^k$ is given by $[x_{w−1}, x_{w−2},…,x_0, 0,…, 0]$, where k zeros have been added to the right.

principle: Unsigned multiplication by a power of 2

For C variables x and k with unsigned values x and k, such that 0 ≤ k < w, the C expression x << k yields the value $x *^u_w 2^k$

Since the bit-level operation of fixed-size two’s-complement arithmetic is equivalent to that for unsigned arithmetic, we can make a similar statement about the relationship between left shifts and multiplication by a power of 2 for two’scomplement arithmetic:

principle: Two’s-complement multiplication by a power of 2

For C variables x and k with two’s-complement value x and unsigned value k, such that 0 ≤ k < w , the C expression x << k yields the value $x *^t_w 2 ^ k $.

For example, suppose a program contains the expression x*14. Recognizing that $14 = 2^3 + 2^2 + 2^1$, the compiler can rewrite the multiplication as (x<<3) + (x<<2) + (x<<1), replacing one multiplication with three shifts and two additions

Even better, the compiler can also use the property $14 = 2^4 − 2^1$ to rewrite the multiplication as (x<<4) - (x<<1), requiring only two shifts and a subtraction.

Practice Problem 2.38

As we will see in Chapter 3, the lea instruction can perform computations of the form (a << k) + b, where k s either 0, 1, 2, or 3, and b is either 0 or some program value. The compiler often uses this instruction to perform multiplications by constant factors. For example, we can compute 3*a as (a<<1) + a.

Considering cases where b is either 0 or equal to a, and all possible values of k, what multiples of a can be computed with a single lea instruction?

My solution: :white_check_mark:

when b is 0:

k expression result
0 a << 0 a
1 a << 1 2a
2 a << 2 4a
3 a << 3 8a

when b equals to a:

k expression result
0 a << 0 + a 2a
1 a << 1 + a 3a
2 a << 2 + a 5a
3 a << 3 + a 9a

Generalizing from our example, consider the task of generating code for the expression x * K, for some constant K

  1. The compiler can express the binary representation of K as an alternating sequence of zeros and ones : [(0 … 0) (1 … 1) (0 … 0) … (1 … 1)]

  2. Consider a run of ones from bit position n down to bit position m (n ≥ m).

    For example, 14 can be written as [(0 … 0)(111)(0)],we have (n = 3 and m = 1.)

  3. We can compute the effect of these bits on the product using either of two different forms:

    Form A: (x<<n) + (x << (n-1)) + ... + (x << m)

    Form B:(x<<(n+1)) - (x<<m)

Of course, the trade-off between using combinations of shifting, adding, and subtracting versus a single multiplication instruction depends on the relative speeds of these instructions, and these can be highly machine dependent.

Most compilers only perform this optimization when a small number of shifts, adds, and subtractions suffice.

Practice Problem 2.39

How could we modify the expression for form B for the case where bit position n is the most significant bit?

My solution : :x:

Solution on the book :

The expression simply becomes -(x<<m). To see this, let the word size be w so that n = w − 1. Form B states that we should compute (x<<w)-(x<<m), but shifting x to the left by w will yield the value 0.

Practice Problem 2.40

For each of the following values of K, find ways to express x * K using only the specified number of operations, where we consider both additions and subtractions to have comparable cost. You may need to use some tricks beyond the simple form A and B rules we have considered so far.

K Shifts Add/Subs Expression
7 1 1
30 4 3
28 2 1
55 2 2

My solution : :white_check_mark:

K Shifts Add/Subs Expression
7 1 1 (x<<3)-x
30 4 3 (x<<4)+(x<<3)+(x<<2)+(x<<1)
28 2 1 (x<<5)-(x<<2)
55 2 2 (x<<6)-(x<<3)-x

Practice Problem 2.41

For a run of ones starting at bit position n down to bit position m (n ≥ m), we saw that we can generate two forms of code, A and B. How should the compiler decide which form to use?

My solution: :x:

when K is odd number, generate form A, otherwise generate form B.

Solution on the book:

Assuming that addition and subtraction have the same performance, the rule is to choose form A when n = m, either form when n = m + 1, and form B when n>m + 1.

The justification for this rule is as follows.

Assume first that m > 0. When n = m, form A requires only a single shift, while form B requires two shifts and a subtraction.

When n = m + 1, both forms require two shifts and either an addition or a subtraction.

When n>m + 1, form B requires only two shifts and one subtraction, while form A requires n − m + 1 > 2 shifts and n − m > 1 additions.

For the case of m = 0, we get one fewer shift for both forms A and B, and so the same rules apply for choosing between the two.


2.3.7 Dividing by Powers of 2

Integer division on most machines is even slower than integer multiplication— requiring 30 or more clock cycles.

Integer division always rounds toward zero

  • For x ≥ 0 and y > 0, integer division should yield $\lfloor x/y \rfloor$
  • while for x < 0 and y > 0, it should yield $\lceil x/y \rceil $.
  • That is, it should round down a positive result but round up a negative one.

principle: Unsigned division by a power of 2

For C variables x and k with unsigned values x and k, such that 0 ≤ k<w, the C expression x >> k yields the value $\lfloor x/2^k \rfloor$.

principle: Two’s-complement division by a power of 2, rounding down

Let C variables x and k have two’s-complement value x and unsigned value k, respectively, such that 0 ≤ k < w, The C expression x >> k, when the shift is performed arithmetically, yields the value $\lfloor x/2^k \rfloor $ .

positive signed number works fine, but negative number should be rounding up ranther than rounding down. We will need to adjust our strategy to handle division for negative values of x.

We can correct for the improper rounding that occurs when a negative number is shifted right by “biasing” the value before shifting.

principle: Two’s-complement division by a power of 2, rounding up

Let C variables x and k have two’s-complement value x and unsigned value k, respectively, such that 0 ≤ k < w , The C expression (x + (1 << k) - 1) >> k , when the shift is performed arithmetically, yields the value x/2k.

The biasing technique exploits the property that $\lceil x/y \rceil =\lfloor (x + y − 1)/y \rfloor$ for integers x and y such that y > 0.

Returning to the case where $y = 2^k$, the C expression x + (1 << k) - 1 yields the value $x + 2^k − 1$. Shifting this right arithmetically by k therefore yields $\lceil x/2^k \rceil $We can use the following expression for signed division by 2’s power

1
(x<0 ? x+(1<<k)-1 : x) >> k

Practice Problem 2.42

Write a function div16 that returns the value x/16 for integer argument x.

Your function should not use division, modulus, multiplication, any conditionals (if or ?:), any comparison operators (e.g., <, >, or ==), or any loops.

You may assume that data type int is 32 bits long and uses a two’s-complement representation, and that right shifts are performed arithmetically

My solution : :x:

Solution on the book :

The only challenge here is to compute the bias without any testing or conditional operations.

We use the trick that the expression x >> 31 generates a word with all ones if x is negative, and all zeros otherwise. By masking off the appropriate bits, we get the desired bias value.

1
2
3
4
5
int div16(int x) {
/* Compute bias to be either 0 (x >= 0) or 15 (x < 0) */
int bias = (x >> 31) & 0xF;
return (x + bias) >> 4;
}

Unfortunately, this approach does not generalize to division by arbitrary constants. Unlike multiplication, we CANNOT express division by arbitrary constants K in terms of division by powers of 2.

Practice Problem 2.43

In the following code, we have omitted the definitions of constants M and N:

1
2
3
4
5
6
7
#define M /* Mystery number 1 */
#define N /* Mystery number 2 */
int arith(int x, int y) {
int result = 0;
result = x*M + y/N; /* M and N are mystery numbers. */
return result;
}

We compiled this code for particular values of M and N. The compiler optimized the multiplication and division using the methods we have discussed. The following is a translation of the generated machine code back into C:

1
2
3
4
5
6
7
8
9
/* Translation of assembly code for arith */
int optarith(int x, int y) {
int t = x;
x <<= 5;
x -= t;
if (y < 0) y += 7;
y >>= 3; /* Arithmetic shift */
return x+y;
}

What are the values of M and N?

My solution : :white_check_mark:

1
2
3
4
5
6
7
8
9
10
11
/* Translation of assembly code for arith */
int optarith(int x, int y) {
int t = x;
x <<= 5; // x = 32 * x
x -= t; // x = 31*x M = 31
if (y < 0) y += 7; // 7 = 2**3 - 1
y >>= 3; // y = y / 8 N = 8
return x+y;
}
// M = 31
// N = 8

2.3.8 Final Thoughts on Integer Arithmetic

As we have seen, the “integer” arithmetic performed by computers is really a form of modular arithmetic. The finite word size used to represent numbers limits the range of possible values, and the resulting operations can overflow.

Practice Problem 2.44

Assume data type int is 32 bits long and uses a two’s-complement representation for signed values. Right shifts are performed arithmetically for signed values and logically for unsigned values. The variables are declared and initialized as follows:

1
2
3
4
int x = foo(); /* Arbitrary value */
int y = bar(); /* Arbitrary value */
unsigned ux = x;
unsigned uy = y;

For each of the following C expressions, either (1) argue that it is true (evaluates to 1) for all values of x and y, or (2) give values of x and y for which it is false (evaluates to 0):

1
2
3
4
5
6
7
A. (x > 0) || (x-1 < 0)
B. (x & 7) != 7 || (x<<29 < 0)
C. (x * x) >= 0
D. x < 0 || -x <= 0
E. x > 0 || -x >= 0
F. x+y == uy+ux
G. x*~y + uy*ux == -x

My solution : :white_check_mark:

1
2
3
4
5
A
if x > 0 : the expression is True
if x == 0 : x-1 = 1111...1111 < 0 is True, the expression is True
if INT32_MIN < x < 0: the expression is True
only when x == INT32_MIN the expression is False
1
2
3
4
B
for any x != 7, expression (x & 7) != 7 will be True
if x == 7 , (x<<29) = 11100...00 < 0
the expression is always True
1
2
3
4
5
C
for some x*x leads to overflow the expression will give out False
for example, x = 46341
printf("%d\n" , 46341 * 46341 );
//-2147479015
1
2
3
4
5
D
if x < 0, the expression is true
if x == 0 , -x is 0 == 0, the expression is true
if x > 0, every signed postive number has its own additive inverse, so -x < 0 is true
the expression is always true
1
2
3
4
5
6
7
8
9
10
E
if x > 0, the expression is true
if x == 0 , -x is 0 == 0, the expression is true
if x < 0, every signed negative number has its own additive inverse except INT32_MIN
-INT32_MIN == INT32_MIN < 0
when x = INT32_MIN the expression is false, otherwise true
----------------------------------------------------------------
int x = INT32_MIN;
printf("%d\n" ,x > 0 || -x >= 0 )
//0
1
2
3
4
5
6
7
8
9
10
11
F
in bit level representation, signed and unsigned addition have no difference. When compare signed and unsigned, signed is converted to unsigned, so the expression is always true.
-----------------------------------------------------------------
for (int i = 0 ; i < 10 ; i ++){
int x = rand(); /* Arbitrary value */
int y = rand(); /* Arbitrary value */
unsigned ux = x;
unsigned uy = y;
printf("%d%c", x+y == uy+ux , i==9?'\n':' ');
}
//1 1 1 1 1 1 1 1 1 1
1
2
3
4
5
6
7
G
I am too lazy to write the comprehensive derivation, just an example to illustrate the expression is always true.
assume y = 0b11110000000000000000000000000111
then ~y = 0b00001111111111111111111111111000
x * ~y = (x<<28)-(x<<3)
ux * uy = (x<<32)-(x<<28) + (x<<3) - (x<<0)
x * ~y + ux * uy = (x<<32) - (x<<0) = 0 - x = -x

Solution on the book :

G:

G. x*~y + uy*ux == -x True. ~y equals -y-1. uy*ux equals x*y. Thus, the left-hand side is equivalent to x*-y-x+x*y.