
Exercise

Give the IEEE 754 single precision floating-point representation of $(120.875)_{10}$


Conversion decimal to binary

Integer part

We divide successively by $2$ until the quotient is zero; the remainders are the digits in base $2$.

$$ \begin{array}{ccccc} \hline \mathrm{Dividend} & \mathrm{Divisor} & \mathrm{Quotient} & \mathrm{Remainder} & \\ \hline 120 & 2 & 60 & 0 & \uparrow \\ 60 & 2 & 30 & 0 & \uparrow \\ 30 & 2 & 15 & 0 & \uparrow \\ 15 & 2 & 7 & 1 & \uparrow \\ 7 & 2 & 3 & 1 & \uparrow \\ 3 & 2 & 1 & 1 & \uparrow \\ 1 & 2 & 0 & 1 & \uparrow \\ \hline \end{array} $$

Reading the remainders from the last one to the first, the binary number is $\mathtt{1111000}.$
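The successive division can be sketched in a few lines of Python. This is only an illustration of the table above; the function name `int_to_binary` is our own choice and not part of any library.

```python
def int_to_binary(n: int) -> str:
    """Convert a non-negative integer to binary by repeated division by 2."""
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, r = divmod(n, 2)        # the quotient becomes the next dividend
        remainders.append(str(r))  # the remainder is one binary digit
    # The last remainder is the most significant digit, so we read bottom-up.
    return "".join(reversed(remainders))


print(int_to_binary(120))  # 1111000
```

The `reversed` call plays the role of the upward arrows in the table.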

Fractional part

We multiply the fractional part by $2$, take the integer part of the result as the next binary digit, subtract it, and repeat until we obtain zero.

$$\begin{array}{lccccccc} 0.875 &\times& 2 &= &1.750 &\rightarrow & 1 & \downarrow\\ 0.75&\times& 2& = &1.5 &\rightarrow & 1& \downarrow\\ 0.5&\times& 2& = &1.0&\rightarrow & 1& \downarrow \end{array}$$

Reading the integer parts from the first one to the last, the fractional part in binary is $\mathtt{0.111}$.
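The repeated multiplication can also be sketched in Python. We assume the fractional part is given as a decimal string and use `fractions.Fraction` so that the arithmetic is exact. The name `frac_to_binary` and the `max_bits` cut-off are our own additions; the cut-off is needed because many decimal fractions, such as $0.1$, never reach zero.

```python
from fractions import Fraction


def frac_to_binary(frac: str, max_bits: int = 32) -> str:
    """Convert a decimal fraction in [0, 1), given as a string, to binary."""
    x = Fraction(frac)
    bits = []
    while x != 0 and len(bits) < max_bits:
        x *= 2
        bit = int(x)          # the integer part is the next binary digit
        bits.append(str(bit))
        x -= bit              # keep only the fractional part
    return "0." + ("".join(bits) or "0")


print(frac_to_binary("0.875"))  # 0.111
```

Here the digits are appended in the order in which they are produced, which corresponds to the downward arrows.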

And the full number is

$$(120.875)_{10} = (\mathtt{1111000.111})_2$$

Normalization

  1. We move the point so that a single non-zero digit appears to its left.
  2. Then, as we are working in base $2$, we multiply by $2^n$, where $n$ is the number of positions that we have moved the point to the left, or by $2^{-n}$, where $n$ is the number of positions that we have moved the point to the right.
  3. We add the sign.

This number, normalized, is

$$+\mathtt{1.1110\,0011\,1}\times2^6$$

with

  • Sign: $+$
  • Mantissa: $\mathtt{1.1110\,0011\,1}$
  • Exponent: $6$
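As a rough sketch, the normalization of a positive binary string can be written as follows. The function `normalize` is a hypothetical helper of ours; it assumes that the input contains at least one digit $\mathtt{1}$ and it does not handle the sign.

```python
def normalize(binary: str) -> tuple[str, int]:
    """Return (significand, exponent) with a single 1 to the left of the point."""
    int_part, _, frac_part = binary.partition(".")
    digits = int_part + frac_part
    first_one = digits.index("1")             # position of the leading 1
    # Positive if the point moved to the left, negative if it moved to the right.
    exponent = len(int_part) - 1 - first_one
    tail = digits[first_one + 1:].rstrip("0") or "0"
    return "1." + tail, exponent


print(normalize("1111000.111"))  # ('1.111000111', 6)
```

For a number smaller than one, for example `normalize("0.00011")`, the point moves to the right and the exponent is negative, `('1.1', -4)`.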

Sign

As the sign is positive $\longrightarrow$ sign $\mathtt{0}$

Exponent

We have $m=8$ bits for the exponent. Therefore there are $2^m=2^8=256$ different combinations, whose face values go from $0$ to $255$. The first one, $\mathtt{0000\,0000}$, and the last one, $\mathtt{1111\,1111}$, are reserved (we will see later for what). And since the representation is biased, we subtract the bias from the face value

$$bias=2^{m-1}-1=2^{8-1}-1=2^7-1=128-1=127$$

to get the represented value.

The exponent value is $6$. To get its face value we must add the bias: $6+127=133$, which in binary is

$$ \begin{array}{ccccc} \hline \mathrm{Dividend} & \mathrm{Divisor} & \mathrm{Quotient} & \mathrm{Remainder} & \\ \hline 133 & 2 & 66 & 1 & \uparrow \\ 66 & 2 & 33 & 0 & \uparrow \\ 33 & 2 & 16 & 1 & \uparrow \\ 16 & 2 & 8 & 0 & \uparrow \\ 8 & 2 & 4 & 0 & \uparrow \\ 4 & 2 & 2 & 0 & \uparrow \\ 2 & 2 & 1 & 0 & \uparrow \\ 1 & 2 & 0 & 1 & \uparrow \\ \hline \end{array} $$

So

$$(133)_{10} = (\mathtt{1000\,0101})_2$$
$$ \begin{array}{cccc} \hline \mathrm{Binary}\; \mathrm{number} & \mathrm{Face}\; \mathrm{value} & & \mathrm{Represented}\; \mathrm{value}\\ \hline \mathtt{0000\,0000}& 0& & R\\ \mathtt{0000\,0001}& 1& & -126\\ \mathtt{0000\,0010}& 2& & -125\\ \mathtt{0000\,0011}& 3& & -124\\ \cdots & \cdots & -127 & \cdots\\ \cdots & \cdots & \longrightarrow & \cdots\\ \mathtt{{\color{red}{1000\,0101}}}& {\color{red}{133}}& & {\color{red}{6}}\\ \cdots & \cdots & +127 & \cdots\\ \cdots & \cdots & \longleftarrow & \cdots\\ \mathtt{1111\,1100}& 252 & & 125 \\ \mathtt{1111\,1101}& 253 & & 126 \\ \mathtt{1111\,1110}& 254 & & 127 \\ \mathtt{1111\,1111}& 255 & & R\\ \hline \end{array} $$
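The two directions of the table (adding the bias to get the face value and subtracting it to get the represented value) can be sketched in Python. The names `EXP_BITS`, `BIAS`, `encode_exponent` and `decode_exponent` are our own, and the sketch covers only normalized numbers, rejecting the two reserved patterns.

```python
EXP_BITS = 8
BIAS = 2 ** (EXP_BITS - 1) - 1   # 127 for single precision


def encode_exponent(e: int) -> str:
    """Represented value -> 8-bit face value (normalized numbers only)."""
    face = e + BIAS
    if not 1 <= face <= 2 ** EXP_BITS - 2:   # 0 and 255 are reserved
        raise ValueError("exponent outside the normalized range")
    return format(face, f"0{EXP_BITS}b")


def decode_exponent(bits: str) -> int:
    """8-bit face value -> represented value."""
    return int(bits, 2) - BIAS


print(encode_exponent(6))           # 10000101
print(decode_exponent("11111100"))  # 125
```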

Mantissa

$$(120.875)_{10}$$

The mantissa is $$\mathtt{1.{\color{ForestGreen}{1110\,0011\,1}}}.$$ The leading $\mathtt{1}$ is the hidden bit, which we do not store. We store only the digits to the right of the point and fill with zeros from the right until we have 23 bits.

Number

The number $120.875$, in single precision, is stored as

$$ \begin{array}{|c|c|c|} \hline \mathtt{sign}&\mathtt{exponent}&\mathtt{mantissa}\\ \hline \mathtt{0}&\mathtt{\color{red}{1000\,0101}}&\mathtt{\color{ForestGreen}{1110\,0011\,1}000\,0000\,0000\,000}\\ \hline \end{array} $$
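We can check the result with Python's standard `struct` module, which packs a number as an IEEE 754 single precision value. The helper `float32_fields` below is our own and simply splits the 32 bits into the three fields of the table.

```python
import struct


def float32_fields(x: float) -> tuple[str, str, str]:
    """Return the (sign, exponent, mantissa) bit strings of x in single precision."""
    # Pack as a big-endian float32 and reinterpret the 4 bytes as an unsigned int.
    (raw,) = struct.unpack(">I", struct.pack(">f", x))
    bits = format(raw, "032b")
    return bits[0], bits[1:9], bits[9:]


sign, exponent, mantissa = float32_fields(120.875)
print(sign, exponent, mantissa)
# 0 10000101 11100011100000000000000
```

The three fields agree with the ones obtained by hand; in hexadecimal the stored word is `42F1C000`.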