In this course, we present the basic numerical methods for solving a set of classical mathematical problems. Computers are essential for using numerical methods efficiently, so we first look at how numbers, which may have infinitely many digits, are stored in a computer, which is a finite device.

Numbers are stored on computers as:

Integer numbers
Floating-point numbers

Integers are stored exactly, so the only possible error is overflow. For that reason we will not study them in detail. The one exception is the biased binary representation of signed integers, which we cover because it is how the exponents of floating-point numbers are stored.

We will study storing numbers in binary. Therefore, the first step we will take is to review how to convert a number from decimal to binary and from binary to decimal.

Introduction¶

Conversion from binary to decimal¶

In the decimal system, the number $107.625$ means:

$$ 107.625=1\cdot10^{2}+0\cdot10^{1}+7\cdot10^{0}+6\cdot10^{-1}+2\cdot10^{-2}+5\cdot10^{-3}. $$

In general, computers use the binary system: only zeros and ones are stored. In the binary system, each digit multiplies a power of $2$ determined by its position, so the conversion from binary to decimal is direct. Consider, for example, the number

$$(1101011.101)_2$$

We need to know the position of each digit with respect to the point

$$ \begin{array}{ccccccccccc} \tiny{(6)}&\tiny{(5)}&\tiny{(4)}&\tiny{(3)}&\tiny{(2)}&\tiny{(1)}&\tiny{(0)}& & \tiny{(-1)}&\tiny{(-2)}&\tiny{(-3)}\\ 1&1&0&1&0&1&1&.&1&0&1 \end{array} $$

which expresses the number

$$ 1\cdot2^{6}+1\cdot2^{5}+0\cdot2^{4}+1\cdot2^{3}+0\cdot2^{2}+1\cdot2^{1}+1\cdot2^{0}+1\cdot2^{-1}+0\cdot2^{-2}+1\cdot2^{-3}$$

that is

$$ (107.625)_{10} $$
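The positional sum above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `binary_to_decimal` and the choice to pass the number as a string are assumptions made for this example, not part of the course material.

```python
def binary_to_decimal(bits: str) -> float:
    """Convert a binary string such as '1101011.101' to its decimal value.

    Hypothetical helper for illustration; assumes `bits` contains only
    '0', '1' and at most one point.
    """
    integer_digits, _, fraction_digits = bits.partition(".")
    value = 0.0
    # Integer digits: the rightmost digit has position 0, the next 1, ...
    for position, digit in enumerate(reversed(integer_digits)):
        value += int(digit) * 2.0 ** position
    # Fractional digits: the first digit after the point has position -1.
    for position, digit in enumerate(fraction_digits, start=1):
        value += int(digit) * 2.0 ** (-position)
    return value


print(binary_to_decimal("1101011.101"))  # 107.625
```

Every term in the sum is a power of $2$, so for short inputs like this one the floating-point result is exact.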

Conversion from decimal to binary system¶

Integer part¶

We divide successively by $2$ until the quotient is zero and then the remainders are the digits in base $2$.

$$ \begin{array}{ccccc} \hline \mathrm{Dividend} & \mathrm{Divisor} & \mathrm{Quotient} & \mathrm{Remainder} & \\ \hline 107 & 2 & 53 & 1 & \uparrow \\ 53 & 2 & 26 & 1 & \uparrow \\ 26 & 2 & 13 & 0 & \uparrow \\ 13 & 2 & 6 & 1 & \uparrow \\ 6 & 2 & 3 & 0 & \uparrow \\ 3 & 2 & 1 & 1 & \uparrow \\ 1 & 2 & 0 & 1 & \uparrow \\ \hline \end{array} $$

Reading the remainders from the last to the first (bottom to top), the binary number is $1101011$.
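The successive-division procedure can be sketched as follows. The function name `integer_to_binary` is hypothetical, and the sketch assumes a non-negative integer input.

```python
def integer_to_binary(n: int) -> str:
    """Convert a non-negative integer to a string of binary digits.

    Hypothetical helper for illustration: divide by 2 until the
    quotient is zero, collecting the remainders.
    """
    if n == 0:
        return "0"
    remainders = []
    while n > 0:
        n, remainder = divmod(n, 2)  # quotient and remainder in one step
        remainders.append(str(remainder))
    # The last remainder is the most significant digit, so reverse the list.
    return "".join(reversed(remainders))


print(integer_to_binary(107))  # 1101011
```

Python's built-in `bin(107)` returns `'0b1101011'`, which gives an independent check of the result.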

Fractional part¶

We multiply the fractional part by $2$, record the integer part of the result, subtract it, and repeat with the remaining fractional part until we obtain zero.

$$\begin{array}{lccccccc} 0.625 &\times& 2 &= &1.25 &\rightarrow & 1 & \downarrow\\ 0.25&\times& 2& = &0.5 &\rightarrow & 0& \downarrow\\ 0.5&\times& 2& = &1.0&\rightarrow & 1& \downarrow \end{array}$$

Reading the integer parts from the first to the last (top to bottom), the number in binary is $0.101$.
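A minimal sketch of the repeated-multiplication procedure is shown below. The function name `fraction_to_binary` and the `max_digits` cap are assumptions added for this example: some decimal fractions, such as $0.1$, never reach zero in base $2$, so the loop needs a stopping rule.

```python
def fraction_to_binary(x: float, max_digits: int = 32) -> str:
    """Convert a fraction 0 <= x < 1 to a string of binary digits.

    Hypothetical helper for illustration. `max_digits` is an assumed
    safeguard that truncates expansions that do not terminate.
    """
    digits = []
    while x != 0 and len(digits) < max_digits:
        x *= 2                   # shift the next binary digit before the point
        integer_part = int(x)    # that digit is 0 or 1
        digits.append(str(integer_part))
        x -= integer_part        # keep only the fractional part
    return "0." + ("".join(digits) or "0")


print(fraction_to_binary(0.625))               # 0.101
print(fraction_to_binary(0.1, max_digits=10))  # 0.0001100110
```

The second call shows a truncated expansion: the digits of $0.1$ repeat indefinitely, so only the first ten are returned.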

Exercise¶

Change of basis. Compute:

$(101100.001)_2$ in decimal base.
$(51)_{10}$ in binary base.
$(51.65625)_{10}$ in binary base.

Compute $(101100.001)_2$ in decimal base.¶

We need to take into account the position of each digit with respect to the point

$$ \begin{array}{cccccccccc} \tiny{(5)}&\tiny{(4)}&\tiny{(3)}&\tiny{(2)}&\tiny{(1)}&\tiny{(0)}& & \tiny{(-1)}&\tiny{(-2)}&\tiny{(-3)}\\ 1&0&1&1&0&0&.&0&0&1 \end{array} $$

And then, the value of this number in base $10$ is

$$\begin{array}{l} 1\cdot 2^{5}+0\cdot 2^{4}+1\cdot 2^{3}+1\cdot 2^{2}+0\cdot 2^{1}+0\cdot 2^{0} +0\cdot 2^{-1}+0\cdot 2^{-2}+1\cdot 2^{-3}=\\=2^{5}+2^{3}+2^{2}+2^{-3}=44.125 \end{array} $$

Compute $(51)_{10}$ in binary base.¶

We divide successively by $2$ until the quotient is zero. Then, the remainders are the digits in base $2$.

$$ \begin{array}{ccccc} \hline \mathrm{Dividend} & \mathrm{Divisor} & \mathrm{Quotient} & \mathrm{Remainder}&\\ \hline 51 & 2 & 25 & 1 & \uparrow \\ 25 & 2 & 12 & 1 & \uparrow \\ 12 & 2 & 6 & 0 & \uparrow \\ 6 & 2 & 3 & 0 & \uparrow \\ 3 & 2 & 1 & 1 & \uparrow \\ 1 & 2 & 0 & 1 & \uparrow \\ \hline \end{array} $$

Reading the remainders from the last to the first (bottom to top), the binary number is $110011$.

Compute $(51.65625)_{10}$ in binary base.¶

The integer part was already converted in the previous exercise. Let's now convert the fractional part.

We multiply by $2$, subtract the integer part, and repeat until the fractional part is zero.

$$\begin{array}{lccclccc} 0.65625 &\times& 2 &= &1.3125 &\rightarrow & 1 & \downarrow\\ 0.3125&\times& 2& = &0.625 &\rightarrow & 0& \downarrow\\ 0.625&\times& 2& = &1.25&\rightarrow & 1& \downarrow\\ 0.25&\times& 2& = &0.5&\rightarrow & 0& \downarrow\\ 0.5&\times& 2& = &1.0&\rightarrow & 1& \downarrow \end{array}$$

Reading the integer parts from the first to the last (top to bottom), the fractional part in binary is $0.10101$.

Combining this with the integer part, we obtain

$$(51.65625)_{10} = (110011.10101)_2$$
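The three exercise results can be cross-checked with Python built-ins. This is a quick sanity check rather than part of the method itself, and the variable names are chosen for this example only. Since `int(..., 2)` accepts only integer strings, the sketch drops the binary point and divides by $2^k$, where $k$ is the number of fractional digits.

```python
# (101100.001)_2 has 3 fractional digits: drop the point, divide by 2**3.
first = int("101100001", 2) / 2**3
print(first)   # 44.125

# (51)_10 in binary, using the built-in bin(); the '0b' prefix is stripped.
second = bin(51)[2:]
print(second)  # 110011

# (110011.10101)_2 has 5 fractional digits: drop the point, divide by 2**5.
third = int("11001110101", 2) / 2**5
print(third)   # 51.65625
```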
