Floating Point Example - Storing a Non-Integer
What is the floating-point representation of -483.137_{10}? First we must convert this number to base-2, handling the integer part and the fractional part separately.
Remember that any non-negative integer, N, in base-10 can be represented as the sum
{... + a_{3}×2^{3} + a_{2}×2^{2} + a_{1}×2^{1} + a_{0}×2^{0}}, where each a_{i} is either 0 or 1. If dividing N by 2 leaves a remainder of 1, then a_{0}=1; otherwise, a_{0}=0. Dividing the integer part of N/2 by 2 in turn gives the value of a_{1}, and so on until the quotient reaches 0. The base-2 representation of N is thus {... a_{3}a_{2}a_{1}a_{0}}. As shown below, 483_{10} = 111100011_{2}.
483/2 | = | 241 | Remainder: | 1 | a_{0}
241/2 | = | 120 | Remainder: | 1 | a_{1}
120/2 | = | 60 | Remainder: | 0 | a_{2}
60/2 | = | 30 | Remainder: | 0 | a_{3}
30/2 | = | 15 | Remainder: | 0 | a_{4}
15/2 | = | 7 | Remainder: | 1 | a_{5}
7/2 | = | 3 | Remainder: | 1 | a_{6}
3/2 | = | 1 | Remainder: | 1 | a_{7}
1/2 | = | 0 | Remainder: | 1 | a_{8}
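The repeated-division procedure can be sketched in a few lines of Python. This is only an illustration of the steps in the table; the function name `int_to_binary` is an assumption made for this example and does not come from the original text.

```python
def int_to_binary(n: int) -> str:
    """Return the base-2 digits of a non-negative integer by repeated division by 2."""
    if n < 0:
        raise ValueError("this sketch handles non-negative integers only")
    if n == 0:
        return "0"
    digits = []
    while n > 0:
        # The quotient carries on to the next step; the remainder is the next a_i.
        n, remainder = divmod(n, 2)
        digits.append(str(remainder))  # collected in the order a_0, a_1, a_2, ...
    # The digits were produced least significant first, so reverse them.
    return "".join(reversed(digits))


print(int_to_binary(483))  # 111100011
```

The remainders come out least significant digit first, which is why the table is read from the bottom row up.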
A fraction between 0 and 1, .F, in base-10 can be represented as the sum
{b_{1}×2^{-1} + b_{2}×2^{-2} + b_{3}×2^{-3} + ...}, where each b_{i} is either 0 or 1. If we multiply .F by 2 and the result is greater than or equal to one, then b_{1}=1; otherwise, b_{1}=0. Multiplying the fractional part of .F×2 by 2 in turn gives the value of b_{2}, and so on. The base-2 representation of .F is thus .{b_{1}b_{2}b_{3}...}. As shown below, 0.137_{10} = .00100011_{2}, truncated to 8 places.
0.137×2 | = | 0.274 | + | 0 | b_{1}
0.274×2 | = | 0.548 | + | 0 | b_{2}
0.548×2 | = | 0.096 | + | 1 | b_{3}
0.096×2 | = | 0.192 | + | 0 | b_{4}
0.192×2 | = | 0.384 | + | 0 | b_{5}
0.384×2 | = | 0.768 | + | 0 | b_{6}
0.768×2 | = | 0.536 | + | 1 | b_{7}
0.536×2 | = | 0.072 | + | 1 | b_{8}
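The repeated-multiplication procedure can be sketched the same way. The function name `fraction_to_binary` is an assumption made for this example. The sketch uses Python's `fractions.Fraction` so that 0.137 is held exactly as 137/1000, rather than as a binary floating-point approximation of it.

```python
from fractions import Fraction


def fraction_to_binary(text: str, places: int) -> str:
    """Return the first `places` base-2 digits of a decimal fraction in [0, 1)."""
    f = Fraction(text)  # exact rational value, e.g. "0.137" -> 137/1000
    if not 0 <= f < 1:
        raise ValueError("this sketch handles values in [0, 1) only")
    bits = []
    for _ in range(places):
        f *= 2
        if f >= 1:
            bits.append("1")  # integer part is 1: this b_i is 1
            f -= 1            # keep only the fractional part for the next step
        else:
            bits.append("0")
    return "".join(bits)


print(fraction_to_binary("0.137", 8))  # 00100011
```

Unlike the integer case, this loop may never reach zero, so the number of places has to be chosen in advance and the result is truncated.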
Thus we have that 483.137_{10} ≈ 111100011.00100011_{2}. In base-2 scientific notation, the approximate value is 1.1110001100100011×2^{8} (written entirely in binary, 1.1110001100100011×10^{1000}). Adding the bias of 1023_{10} to the exponent yields 1031_{10} = 10000000111_{2}. Taking into account the negative sign and dropping the leading 1 from the mantissa, the 3-byte floating-point representation of -483.137_{10} is
1 1 0 0 0 0 0 0 | 0 1 1 1 1 1 1 0 | 0 0 1 1 0 0 1 0
byte 1          | byte 2          | byte 3
Note that not all of the mantissa can be stored. After the sign bit and the 11 exponent bits, only 12 of the 24 bits remain for the mantissa, and the remainder is truncated. If four or more bytes were available, more of the mantissa could be kept. In hexadecimal form, this number is C07E32_{16}. The actual base-10 number represented by this floating-point number is -483.125_{10}.
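The packing step can be checked with a short Python sketch. It assumes the layout implied by the example: 1 sign bit, an 11-bit exponent with a bias of 1023, and the 12 mantissa bits that fit in the remaining space. The names `pack_3_bytes` and `unpack_3_bytes` are hypothetical helpers written for this illustration. Because this layout matches the leading bits of an IEEE 754 double, the sketch also compares the result with the first three bytes of Python's own 8-byte encoding.

```python
import struct

BIAS = 1023         # assumption: 11-bit exponent biased as in an IEEE 754 double
MANTISSA_BITS = 12  # 24 bits total - 1 sign bit - 11 exponent bits


def pack_3_bytes(sign: int, exponent: int, mantissa_bits: str) -> bytes:
    """Pack a sign bit, an unbiased exponent, and mantissa bits (leading 1 dropped)."""
    # Keep only the bits that fit; the rest is truncated, not rounded.
    stored = mantissa_bits[:MANTISSA_BITS].ljust(MANTISSA_BITS, "0")
    word = (sign << 23) | ((exponent + BIAS) << MANTISSA_BITS) | int(stored, 2)
    return word.to_bytes(3, "big")


def unpack_3_bytes(data: bytes) -> float:
    """Decode the 3-byte form back to a number (normalized values only)."""
    word = int.from_bytes(data, "big")
    sign = -1.0 if word >> 23 else 1.0
    exponent = ((word >> MANTISSA_BITS) & 0x7FF) - BIAS
    fraction = (word & 0xFFF) / 2**MANTISSA_BITS
    return sign * (1 + fraction) * 2.0**exponent  # restore the implicit leading 1


packed = pack_3_bytes(1, 8, "1110001100100011")
print(packed.hex().upper())    # C07E32
print(unpack_3_bytes(packed))  # -483.125

# Cross-check against the leading bytes of the 8-byte double for -483.137.
print(struct.pack(">d", -483.137)[:3] == packed)  # True
```

The decoded value differs from -483.137 by 0.012, which is the cost of discarding every mantissa bit after the twelfth.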