When a float does not change when incremented (C/C++ programming)

What would you expect if you incremented 20 million stored as a float? Would you expect it not to change at all?

Check the output of the simple program below. You may already know the explanation, but did you expect it to happen at such relatively small values?

user@ubuntu$ gcc float_inc.c ; ./a.out
float has 32 bits
a=20000000.000000 a+1=20000000.000000
a=16777216.000000 a+1=16777216.000000
a=16777218.000000 a+1=16777220.000000
a=16777220.000000 a+1=16777220.000000
a=16777220.000000 a+1=16777220.000000
a=16777220.000000 a+1=16777220.000000
a=16777222.000000 a+1=16777224.000000

#include <stdio.h>

int main(){
    float a;

    printf("float has %d bits\n", (int)sizeof(float)*8 );

    a = 20000000;
    printf("a=%f a+1=%f\n", a, a+1);

    a = 16777216;
    printf("a=%f a+1=%f\n", a, a+1);

    a = 16777218;
    printf("a=%f a+1=%f\n", a, a+1);

    a = a + 1;
    printf("a=%f a+1=%f\n", a, a+1);

    a = a + 1;
    printf("a=%f a+1=%f\n", a, a+1);

    a = a + 1;
    printf("a=%f a+1=%f\n", a, a+1);

    a = 16777222;
    printf("a=%f a+1=%f\n", a, a+1);

    return 0;
}
What do we observe?

We see that 20 million can be represented as a float, but adding 1 to it does not change the value. In fact, this also happens for smaller numbers. I wrote a small loop and determined that the first natural number that no longer changes when incremented as a 32-bit float is 16777216 (which is 2^24).

We see that 16777217 and 16777219 cannot be represented at all as 32-bit floats: 16777217 rounds down to 16777216 (which is why adding 1 to 16777216 produces no change), while 16777219 rounds up to 16777220.

However, numbers like 16777218 and 16777222 can be represented as floats and do change when incremented, but adding 1 actually yields the number plus 2, because the true sum falls between two representable values and is rounded.

What is happening?

There is no trick under the table; it is all about the float's internal representation.

We know that floats cannot represent all possible real numbers. We might guess that for some really big natural numbers, floats will miss some values too. But that seems like a purely theoretical problem, right?

Well, the number of distinct values that a 32-bit float can represent is the same as the number of 32-bit integers: a bit over 4 billion. Of these 4 billion distinct bit patterns, many are used to represent fractional numbers such as 0.5, 1.25, or 0.15625. This leaves fewer patterns available to represent natural numbers.

The farther we go from zero, the sparser the representable floats become: there simply are not enough distinct bit patterns to cover all the numbers in between. We gain the ability to approximate numbers bigger than 5 billion (impossible for a 32-bit unsigned integer), but the approximation error grows larger and larger.

Bottom line:

A float variable approximates a real number with an error several orders of magnitude smaller than the value itself. For "big" values (around 17 million), that error can exceed 1. The same problem exists with double, but it only shows up at much larger values (natural numbers above 2^53).

Be careful where you use float variables. For example, don't use them to count things or to add small values to big ones. You may also see large errors when implementing algorithms with floats, even if the algorithms are mathematically correct. And don't use floats for money!
