Strange behaviour with floats and string conversion

I typed this into the Python shell:

>>> 0.1*0.1
0.010000000000000002

I expected that 0.1*0.1 is not 0.01, because I know that 0.1 in base 10 is periodic in base 2.

>>> len(str(0.1*0.1))
4

I expected to get 20 as I’ve seen 20 characters above. Why do I get 4?

>>> str(0.1*0.1)
'0.01'

OK, this explains why len gives me 4, but why does str return '0.01'?

>>> repr(0.1*0.1)
'0.010000000000000002'

Why does str round but repr not? (I have read this answer, but I would like to know how they have decided when str rounds a float and when it doesn’t)

>>> str(0.01) == str(0.0100000000001)
False
>>> str(0.01) == str(0.01000000000001)
True

So it seems to be a problem with the accuracy of floats. I thought Python would use IEEE 754 single-precision floats, so I checked it like this:

#include <stdint.h>
#include <stdio.h> // printf

union myUnion {
    uint32_t i; // unsigned integer 32-bit type (on every machine)
    float f;    // a type you want to play with
};

int main() {
    union myUnion testVar;
    testVar.f = 0.01000000000001f;
    printf("%f\n", testVar.f);

    testVar.f = 0.01000000000000002f;
    printf("%f\n", testVar.f);

    testVar.f = 0.01f*0.01f;
    printf("%f\n", testVar.f);
}

I got:

0.010000
0.010000
0.000100

Python gives me:

>>> 0.01000000000001
0.010000000000009999
>>> 0.01000000000000002
0.010000000000000019
>>> 0.01*0.01
0.0001

Why does Python give me these results?

(I use Python 2.6.5. If you know of differences in the Python versions, I would also be interested in them.)

Answers:


Method 1

The crucial requirement on repr is that it should round-trip; that is, eval(repr(f)) == f should give True in all cases.
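The round-trip property can be checked on a current interpreter with a minimal sketch like the one below. It uses float(repr(f)) instead of eval, which behaves the same for finite values; the helper name round_trips and the sample values are only illustrative.

```python
import random

def round_trips(f):
    """Return True if converting f to text with repr and back yields f again."""
    return float(repr(f)) == f

# A few hand-picked values plus a batch of pseudo-random ones.
samples = [0.1, 0.01, 0.1 * 0.1, 1e22, 5e-324]
random.seed(0)
samples += [random.uniform(-1e6, 1e6) for _ in range(1000)]

all_ok = all(round_trips(f) for f in samples)
print(all_ok)
```

A passing check on a sample is of course not a proof; it only illustrates what the requirement means.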

In Python 2.x (before 2.7) repr works by doing a printf with format %.17g and discarding trailing zeroes. This is guaranteed correct (for 64-bit floats) by IEEE-754. Since 2.7 and 3.1, Python uses a more intelligent algorithm that can find shorter representations in some cases where %.17g gives unnecessary non-zero terminal digits or terminal nines. See What’s new in 3.1? and issue 1580.
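To see the two strategies side by side on a modern Python, one can approximate the old behaviour with format(x, '.17g') and compare it with the built-in repr. This is only a sketch: it assumes a Python 2.7/3.1 or later interpreter and ignores the ".0" suffix the old code appended to integer-looking results.

```python
values = [0.1, 0.01, 0.1 * 0.1, 1.1]

rows = []
for x in values:
    legacy = format(x, ".17g")   # roughly what repr produced before 2.7/3.1
    modern = repr(x)             # shortest round-tripping string on 2.7/3.1+
    # Both strings identify exactly the same double.
    assert float(legacy) == x and float(modern) == x
    rows.append((legacy, modern))

for legacy, modern in rows:
    print(legacy, modern)
```

For 0.1 the old form is '0.10000000000000001' while the new one is '0.1'; for 0.1 * 0.1 both forms need all 17 digits.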

Even under Python 2.7, repr(0.1 * 0.1) gives "0.010000000000000002". This is because 0.1 * 0.1 == 0.01 is False under IEEE-754 parsing and arithmetic; that is, the nearest 64-bit floating-point value to 0.1, when multiplied by itself, yields a 64-bit floating-point value that is not the nearest 64-bit floating-point value to 0.01:

>>> 0.1.hex()
'0x1.999999999999ap-4'
>>> (0.1 * 0.1).hex()
'0x1.47ae147ae147cp-7'
>>> 0.01.hex()
'0x1.47ae147ae147bp-7'
                 ^ 1 ulp difference
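The same 1 ulp gap can be read off the raw bit patterns. The sketch below assumes that Python floats are IEEE-754 binary64 values (true on all common platforms); the helper name bits is made up for this example.

```python
import struct

def bits(x):
    """Return the IEEE-754 binary64 bit pattern of x as an unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]

product = 0.1 * 0.1
nearest = 0.01

print(product == nearest)              # the two doubles differ
print(bits(product) - bits(nearest))   # adjacent doubles differ by one in the last place
```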

The difference between repr and str (pre-2.7/3.1) is that str formats with 12 significant digits as opposed to 17, which is non-round-trippable but produces more readable results in many cases.
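The comparisons from the question can be reproduced on any Python version by emulating both functions with explicit precisions. The names old_str and old_repr are hypothetical helpers for this sketch, not functions that exist in Python.

```python
def old_str(x):
    """Approximate pre-2.7 str(float): 12 significant digits."""
    return format(x, ".12g")

def old_repr(x):
    """Approximate pre-2.7 repr(float): 17 significant digits."""
    return format(x, ".17g")

x = 0.1 * 0.1
print(old_str(x))               # '0.01'
print(old_repr(x))              # '0.010000000000000002'
print(float(old_str(x)) == x)   # False: 12 digits do not round-trip here

# The two comparisons from the question:
print(old_str(0.01) == old_str(0.0100000000001))    # False
print(old_str(0.01) == old_str(0.01000000000001))   # True
```

The last two lines show where the cut-off lies: a deviation in the 12th significant digit survives the rounding, one in the 13th does not.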

Method 2

I can confirm your behaviour:

ActivePython 2.6.4.10 (ActiveState Software Inc.) based on
Python 2.6.4 (r264:75706, Jan 22 2010, 17:24:21) [MSC v.1500 64 bit (AMD64)] on win32
Type "help", "copyright", "credits" or "license" for more information.
>>> repr(0.1)
'0.10000000000000001'
>>> repr(0.01)
'0.01'

Now, the docs claim that in Python <2.7

the value of repr(1.1) was computed as format(1.1, '.17g')

This is a slight simplification.

Note that this is all to do with the string formatting code: in memory, all Python floats are just stored as C doubles, so there is never going to be a difference between them.
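This also accounts for the C test in the question, which used single-precision float and printf's %f (six decimal places by default). A small sketch, assuming a platform where the C double is IEEE-754 binary64; as_single is a hypothetical helper that rounds a value to single precision via the struct module.

```python
import struct
import sys

print(sys.float_info.mant_dig)   # mantissa bits of a Python float
print(struct.calcsize("d"))      # size in bytes of a C double

def as_single(x):
    """Round x to the nearest C float (binary32) and widen it back to a double."""
    return struct.unpack("f", struct.pack("f", x))[0]

# In single precision the two literals from the question collapse to one value.
print(as_single(0.01000000000001) == as_single(0.01))

# %f keeps six decimal places, as in the C output.
print("%f" % 0.01000000000001)
print("%f" % (0.01 * 0.01))
```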

Also, it’s kind of unpleasant to work with the full-length string for a float even if you know that there’s a better one. Indeed, in modern Pythons a new algorithm is used for float formatting, that picks the shortest representation in a smart way.

I spent a while looking this up in the source code, so I’ll include the details here in case you’re interested. You can skip this section.

In floatobject.c, we see

static PyObject *
float_repr(PyFloatObject *v)
{
    char buf[100];
    format_float(buf, sizeof(buf), v, PREC_REPR);

    return PyString_FromString(buf);
}

which leads us to look at format_float. Omitting the NaN/inf special cases, it is:

static void
format_float(char *buf, size_t buflen, PyFloatObject *v, int precision)
{
    register char *cp;
    char format[32];
    int i;

    /* Subroutine for float_repr and float_print.
       We want float numbers to be recognizable as such,
       i.e., they should contain a decimal point or an exponent.
       However, %g may print the number as an integer;
       in such cases, we append ".0" to the string. */

    assert(PyFloat_Check(v));
    PyOS_snprintf(format, 32, "%%.%ig", precision);
    PyOS_ascii_formatd(buf, buflen, format, v->ob_fval);
    cp = buf;
    if (*cp == '-')
        cp++;
    for (; *cp != '\0'; cp++) {
        /* Any non-digit means it's not an integer;
           this takes care of NAN and INF as well. */
        if (!isdigit(Py_CHARMASK(*cp)))
            break;
    }
    if (*cp == '\0') {
        *cp++ = '.';
        *cp++ = '0';
        *cp++ = '\0';
        return;
    }

    <some NaN/inf stuff>
}

We can see that this first initialises some variables and checks that v is a well-formed float. It then prepares a format string:

PyOS_snprintf(format, 32, "%%.%ig", precision);

Now PREC_REPR is defined elsewhere in floatobject.c as 17, so this computes to "%.17g". Now we call

PyOS_ascii_formatd(buf, buflen, format, v->ob_fval);

With the end of the tunnel in sight, we look up PyOS_ascii_formatd and discover that it uses snprintf internally.
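The steps traced above can be mirrored in a few lines of Python. This is a rough sketch of the logic, not the CPython implementation: it skips the NaN/inf handling, and PREC_STR = 12 is an assumption taken from the 12 digits attributed to str elsewhere in this thread.

```python
PREC_REPR = 17  # the value quoted above from floatobject.c
PREC_STR = 12   # assumed for this sketch: the precision attributed to str

def format_float(x, precision):
    """Mimic the C helper: %.<precision>g, then ".0" if the result looks like an integer."""
    fmt = "%%.%ig" % precision   # builds e.g. "%.17g"
    text = fmt % x
    digits = text[1:] if text.startswith("-") else text
    if digits.isdigit():         # no '.', no exponent: looks like an integer
        text += ".0"
    return text

print(format_float(0.1, PREC_REPR))       # '0.10000000000000001'
print(format_float(0.1 * 0.1, PREC_STR))  # '0.01'
print(format_float(1.0, PREC_REPR))       # '1.0'
```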

Method 3

From the Python tutorial:

In versions prior to Python 2.7 and Python 3.1, Python rounded this value to 17 significant digits, giving ‘0.10000000000000001’. In current versions, Python displays a value based on the shortest decimal fraction that rounds correctly back to the true binary value, resulting simply in ‘0.1’.
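The idea of "the shortest decimal fraction that rounds correctly back" can be illustrated with a brute-force search that tries increasing precisions until the string converts back to the same float. This is only an illustration of the idea, not the algorithm CPython uses, and the function name shortest_round_trip is made up for the example.

```python
def shortest_round_trip(x):
    """Return the %g-style string with the fewest significant digits that converts back to x."""
    for ndigits in range(1, 18):
        text = "%.*g" % (ndigits, x)
        if float(text) == x:
            return text
    return "%.17g" % x   # fallback, e.g. for NaN, which never compares equal

print(shortest_round_trip(0.1))        # '0.1'
print(shortest_round_trip(0.1 * 0.1))  # '0.010000000000000002'
```

Its output can differ from repr in layout (for example, where exponent notation kicks in), so it should be read as a demonstration of the digit count only.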

All methods were sourced from stackoverflow.com or stackexchange.com and are licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
