# Heads Up: Microchip Compiler Behavior with 'float', 'double' and 'long double' Data Types

To anyone who deals with numbers in their software where 15 digits of accuracy (as opposed to 7) is important:

Just a heads-up – something I discovered late yesterday.

Given:

Anything a programmer DOESN'T understand about the compiler he is using IS going to be a source of bugs.

I am about to arm you with something I didn't know until late yesterday (and which, as I read them, the Microchip compiler User's Guides do not state clearly).

'float', 'double' and 'long double' are not strictly specified in the ANSI C specification. What they REALLY are is left up to the implementer of the compiler.

All the compilers I've been exposed to that target the 80x86 family of microprocessors follow a pattern, largely because Intel offered the 8087 math co-processor chip as an option for the IBM PC, and it was so popular that, starting with the 80486DX, they built the floating point unit INTO the chip (the IBM PC AT, built on the 80286, still took an external 80287 chip as an option). This gave huge momentum to the IEEE 754 floating point standard, which it implemented.

The IEEE 754 standard specified:

"single precision" floating point had 32 bits with a 23-bit mantissa, 8-bit biased exponent, 1-bit sign and assigned meanings to those fields.

and

"double precision" floating point had 64 bits with a 52-bit mantissa, … (etc.)

So the long line of C compilers that I have worked with that generate output for the 80x86 chip family has adopted that same pattern in the naming of its type specifiers:

'float' and 'double' to mean SINGLE- and DOUBLE-precision IEEE 754 floating point respectively.

and later, 'long double' – also in NAME COORDINATION with the IEEE 754 standard with the "extra accurate" (80-bit) values that the 80x87 math co-processor chips dealt with internally to minimize rounding errors while computing results. This became implemented in a number of the C compilers as optional numerical values that programmers could work with in their software.
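The single-precision layout described above (1-bit sign, 8-bit biased exponent, 23-bit mantissa) can be made concrete with a small sketch. The names `single_fields` and `split_single` are hypothetical, introduced only for illustration, and the code assumes 'float' is a 32-bit IEEE 754 object on the compiler in use:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* The three fields of an IEEE 754 single-precision value. */
typedef struct {
    unsigned sign;      /* 1 bit */
    unsigned exponent;  /* 8 bits, biased by 127 */
    uint32_t mantissa;  /* 23 bits; the implicit leading 1 is not stored */
} single_fields;

/* Split a 32-bit float into its sign, biased exponent and mantissa. */
single_fields split_single(float value)
{
    uint32_t bits;
    single_fields out;

    /* Assumption: 'float' is a 32-bit object on this compiler. */
    assert(sizeof(value) == sizeof(bits));
    memcpy(&bits, &value, sizeof(bits));  /* well-defined way to read the raw bits */

    out.sign     = (unsigned)(bits >> 31);
    out.exponent = (unsigned)((bits >> 23) & 0xFFu);
    out.mantissa = bits & 0x7FFFFFu;
    return out;
}
```

For example, `split_single(1.0f)` yields sign 0, biased exponent 127 and mantissa 0.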

-------------- Where I made an erroneous assumption -------------

I didn't realize how much of this was actually weighted on the fact that the C compiler was generating output for an 80x86/87 family of chips.

Back to the ANSI C standard:

'float', 'double' and 'long double' are not strictly specified in the ANSI C specification. What they REALLY are is left up to the implementer of the compiler.

And I'm going to give them the benefit of the doubt and say this is CORRECT and fits possible alternate hardware implementations and reasons for doing other things with them.

Enter the Microchip C compiler family.

You already know:

float 32 bits IEEE 754 single precision

double 32 bits IEEE 754 single precision (or 64 bits IEEE 754 double precision with -fno-short-double)

long double 64 bits IEEE 754 double precision

(Note about Microchip documents: document DS51686B [MPLAB C Compiler for PIC32 MCUs User's Guide], page 8, section 1.5.4, which is current as I write this on 26-Oct-2012, lists a 'double' as 64 bits with no mention of the -fno-short-double switch. My tests with this compiler show that the above is actually correct, and that document DS51686E [MPLAB XC32 C/C++ Compiler User's Guide], page 96, section 6.5 gets it right and lists 'double' as a 32-bit object. Both documents do describe the -fno-short-double switch correctly, but neither of them mentions it near the table that shows how many bits 'double' has. Note: all Microchip PIC32 compilers as well as the PIC24 compilers treat 'float', 'double' and 'long double' in exactly the same way, per the above, verified by my own tests.)
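Rather than trusting a table, the sizes can be checked on whatever compiler and switches are actually in use. The following is a minimal sketch; the function names are hypothetical, and it assumes the compiler's <float.h> reflects the selected 'double' size and that printf output is routed somewhere visible on the target:

```c
#include <assert.h>
#include <float.h>
#include <stdio.h>

/* Nonzero when 'double' is a 64-bit IEEE 754 double-precision object. */
int double_is_64_bit(void)
{
    return sizeof(double) == 8 && DBL_MANT_DIG == 53;
}

/* Abort early if 'long double' cannot carry 15 significant decimal digits. */
void require_15_digit_long_double(void)
{
    assert(LDBL_DIG >= 15);
}

/* Print the size in bytes and the decimal digits of precision of each type. */
void report_fp_types(void)
{
    printf("float:       %u bytes, %d digits\n",
           (unsigned)sizeof(float), FLT_DIG);
    printf("double:      %u bytes, %d digits\n",
           (unsigned)sizeof(double), DBL_DIG);
    printf("long double: %u bytes, %d digits\n",
           (unsigned)sizeof(long double), LDBL_DIG);
}
```

Going by the sizes listed above, `double_is_64_bit()` would be expected to return 0 on these compilers unless -fno-short-double is given.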

Here is what I missed:

The long-standing norm (which thus became an assumption of mine) is that a numerical constant, for example 3.14159265358979, is ITSELF interpreted by the compiler as a DOUBLE-PRECISION (64-bit) value and thus retains 15 significant digits.

WRONG!

What REALLY happens is this:

WITH -fno-short-double

long double dd;

dd = 3.14159265358979;

// Here 'dd' contains all 15 significant digits.

WITHOUT -fno-short-double

long double dd;

dd = 3.14159265358979;

// Here 'dd' only contains an estimate close to 3.141593! 8 significant digits lost!

// The constant itself is EVALUATED as a SINGLE-PRECISION floating point value FIRST before being assigned.

To illustrate: what happened is more or less equivalent to this, but without the 'f' variable:

long double dd;

float f;

f = 3.14159265358979;

// Yields 3.141593, rounded to single precision as expected.

dd = f;

// !!! Ouch! 8 significant digits lost.
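The same loss can be reproduced on any compiler by forcing a value through a 'float' explicitly. This is only a sketch of the effect, not of what the Microchip compilers do internally; `through_single`, `single_roundtrip_error` and `check_single_roundtrip` are hypothetical names, and 'long double' with 'L'-suffixed constants is used throughout so the result does not depend on the size of 'double':

```c
#include <assert.h>

/* Round a value through single precision and widen it again. */
long double through_single(long double x)
{
    volatile float f = (float)x;  /* volatile forces a real 32-bit store */
    return (long double)f;
}

/* Absolute error introduced by the trip through single precision. */
long double single_roundtrip_error(long double x)
{
    long double diff = x - through_single(x);
    return diff < 0 ? -diff : diff;
}

/* Self-check: the 15-digit constant loses digits, while 0.5 (exactly
   representable in single precision) loses nothing. */
void check_single_roundtrip(void)
{
    assert(single_roundtrip_error(3.14159265358979L) > 1e-8L);
    assert(single_roundtrip_error(0.5L) == 0.0L);
}
```

The error for the 15-digit constant is on the order of 1e-7, which is consistent with only about 7 significant digits surviving.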

What to do to correct it:

Floating point CONSTANTS have an optional suffix 'L', which gives the constant the type 'long double'. On these compilers that tells the compiler you want the value to be interpreted as a DOUBLE-PRECISION (64-bit) value.

Thus:

WITHOUT -fno-short-double

long double dd;

dd = 3.14159265358979L;

// Here 'dd' retains all 15 digits as expected!

// Now the constant itself is EVALUATED as a DOUBLE-PRECISION floating point value FIRST before being assigned.
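One way to catch a constant that has quietly lost digits is a small self-test that compares it against a reference built from exact integers, so the reference does not depend on how floating point literals are evaluated. This is a sketch under assumptions: `constant_keeps_15_digits` and `check_pi_constant` are hypothetical names, the compiler is assumed to support 'long long', and 'long double' is assumed to be at least 64-bit double precision:

```c
#include <assert.h>

/* Nonzero when 'c' matches 3.14159265358979 to better than 1e-14. */
int constant_keeps_15_digits(long double c)
{
    /* Both integers are below 2^53, so each converts exactly. */
    long double ref = (long double)314159265358979LL
                    / (long double)100000000000000LL;
    long double diff = c - ref;

    if (diff < 0) {
        diff = -diff;
    }
    return diff < 1e-14L;
}

/* Self-check for the 'L'-suffixed constant. */
void check_pi_constant(void)
{
    assert(constant_keeps_15_digits(3.14159265358979L));
}
```

A value that has passed through single precision, such as `(long double)(float)3.14159265358979L`, fails this check because its error is on the order of 1e-7.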

I thought you should be armed with this since you work with these compilers and it might be a mystery as to why you're not getting results you expected.

Note: this is NOT WRONG. It was simply UNEXPECTED and, as I read them, not covered clearly in any of the compiler User's Guides that I have seen.

I have already sent this directly to the Microchip documentation team, with explicit suggestions for correcting their compiler user's guides for the C30, C32, XC16 and XC32 compilers.

Kind regards,

Victor Wheeler