XC8 2.00 floating point performance vs CCS and Mikrobasic

Author
alexconway
New Member
  • Total Posts : 25
  • Reward points : 0
  • Joined: 2009/07/14 08:40:24
  • Location: 0
  • Status: offline
2018/10/11 08:53:01 (permalink)
0

XC8 2.00 floating point performance vs CCS and Mikrobasic

I am concerned about the floating point performance of XC8 2.00 compared with some other compilers.
This comparison is done with the same code compiled for a PIC18F65J94 using 32bit floats.
XC8 PRO was tested at O3 with 32bit float/double; results were taken for both C90 and C99 using the MPLABX simulator.
CCS V5.080 was tested with optimisation level 9 using the MPLABX simulator.
Mikrobasic was tested with optimisation level 5 using the Mikrobasic simulator.
Timings are instruction counts.
 

volatile float float1, float2, float3;
volatile uint16_t word1, word2;

word1 = 65535;
word2 = 613;

                // instruction count    XC8 C90      XC8 C99        CCS        Mikrobasic
float1 = (float) word1;                   189          189          55           73
float1 *= (1.0 / 1024.0);                 296          296         117          140
float2 = (float) (word2);                 296          296          82          127
float2 /= 1024.0;                        1048         1048        1059          935
float3 = float1 * float2;                 302          302         118          132
float3 = float1 / float2;                1115         1115         995          867
float3 = float1 + float2;                 266          266         165          154
float3 = float2 - float1;                 293          293         176          186
float3 = float3 * 1024.0;                 296          296         117          140
word1 = (uint16_t) (float3);              370          370          27           57

volatile float gaindB = 9.5;
volatile float gain = log10(gaindB);     6477        10236        4317         3652
gain = log2(gaindB);                      n/a         9379         n/a          n/a
gaindB = pow(10, gain);                 14169        31444        7010         7655     
gaindB = pow(2.0, gain);                10053        21902        6718         6096


                          //rom used     5570        19922        6140         7578
 

XC8 C99 gives the worst performance, especially for log and pow. It has also eaten almost 20k of ROM!
XC8 C90 is identical to C99 for basic operations, but both are still much slower than the non-Microchip compilers on most of them. It eats the least ROM.
Am I missing compiler switches that will turn off the extra "features" in return for speed and code size?
(without resorting to 24bit floats)
 
Thanks
Alex
#1


    Aussie Susan
    Super Member
    • Total Posts : 3333
    • Reward points : 0
    • Joined: 2008/08/18 22:20:40
    • Location: Melbourne, Australia
    • Status: offline
    Re: XC8 2.00 floating point performance vs CCS and Mikrobasic 2018/10/11 18:18:54 (permalink)
    +2 (2)
    For those devices it all comes down to the way the library functions are written as there is no hardware support for any arithmetic operation.
    While it doesn't answer your question, I would try to avoid floating point operations on small MCUs.
    There are many techniques that might get around the issue, such as fixed point values (which typically reduce to integer operations, and those are still not wonderful on these devices). Also stay well clear of transcendental functions as they just chew time and space (as you have found) - use lookup tables instead, as you generally don't need the full range or precision (e.g. angles to better than 1 degree???).
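A minimal sketch of the fixed point idea, assuming values are stored as integers scaled by 1024 (the same scale factor the original test code uses). The names `Q10_SHIFT`, `q10_mul` and `q10_div` are made up for illustration, and results are truncated rather than rounded:

```c
#include <assert.h>
#include <stdint.h>

/* Q10 fixed point: the stored integer is the real value times 1024. */
#define Q10_SHIFT 10u

/* Multiply two Q10 values. Two 16-bit inputs cannot overflow the 32-bit
   intermediate product, so no 64-bit arithmetic is needed. */
uint32_t q10_mul(uint16_t a, uint16_t b)
{
    return ((uint32_t)a * (uint32_t)b) >> Q10_SHIFT;
}

/* Divide two Q10 values. The dividend is pre-scaled by 1024 so that the
   quotient is again a Q10 value. */
uint32_t q10_div(uint16_t a, uint16_t b)
{
    assert(b != 0u); /* dividing by zero is the caller's bug */
    return ((uint32_t)a << Q10_SHIFT) / (uint32_t)b;
}
```

With the raw values 65535 and 613 from the test code (about 63.999 and 0.5986 once divided by 1024), `q10_mul` returns 39231 (about 38.31) and `q10_div` returns 109474 (about 106.91). Whether this is actually faster than the float library on a given PIC18 compiler would need to be measured the same way as the table above.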
    Susan
    #2
    NKurzman
    A Guy on the Net
    • Total Posts : 16442
    • Reward points : 0
    • Joined: 2008/01/16 19:33:48
    • Location: 0
    • Status: offline
    Re: XC8 2.00 floating point performance vs CCS and Mikrobasic 2018/10/11 20:29:45 (permalink)
    0
    The other question is performance. You posted size. What about speed? Sometimes bigger is faster.
    #3
    qhb
    Superb Member
    • Total Posts : 7151
    • Reward points : 0
    • Joined: 2016/06/05 14:55:32
    • Location: One step ahead...
    • Status: online
    Re: XC8 2.00 floating point performance vs CCS and Mikrobasic 2018/10/11 21:12:38 (permalink)
    0
    NKurzman
    The other question is performance. You posted size. What about speed? Sometimes bigger is faster.

    No, I think he did post performance, not code size in that table.
     

    Worst forum problems are now fixed, but the damn firewall is still there.
    #4
    crosland
    Super Member
    • Total Posts : 1256
    • Reward points : 0
    • Joined: 2005/05/10 10:55:05
    • Location: Bucks, UK
    • Status: offline
    Re: XC8 2.00 floating point performance vs CCS and Mikrobasic 2018/10/12 01:23:52 (permalink)
    +1 (1)
    The table is instruction counts (proxy for performance). The verbiage refers to code size.
     
     
    @OP Have you double and triple checked that all the compilers really are building for 32 bit floats? Do you really need that precision?
     
    Have you double and triple checked the optimisation settings? Have you compared against unoptimised code?
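One way to make the float width check mechanical is to have the build itself verify it. This is only a sketch: the typedef names and `check_float_format` are hypothetical, the negative-array-size trick is used so it also compiles in C90 mode, and it assumes each compiler's `<float.h>` reports `FLT_MANT_DIG` accurately for the selected float format, which I have not verified for XC8, CCS or Mikrobasic.

```c
#include <assert.h>
#include <float.h>
#include <limits.h>

/* Compile-time checks that work in C90: an array type with a negative
   size is an error, so the build fails if the condition is false. */
typedef char float_is_4_bytes[(sizeof(float) == 4) ? 1 : -1];
typedef char float_has_24_bit_significand[(FLT_MANT_DIG == 24) ? 1 : -1];

/* Runtime variant, for toolchains where a failed build is inconvenient
   and a simulator breakpoint on the assert is easier to work with. */
void check_float_format(void)
{
    assert(sizeof(float) * CHAR_BIT == 32);
    assert(FLT_MANT_DIG == 24);
}
```

If a compiler were silently building with a 24-bit float, the first typedef would be expected to fail, since `sizeof(float)` would then be 3.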
    #5
    alexconway
    New Member
    • Total Posts : 25
    • Reward points : 0
    • Joined: 2009/07/14 08:40:24
    • Location: 0
    • Status: offline
    Re: XC8 2.00 floating point performance vs CCS and Mikrobasic 2018/10/12 02:14:46 (permalink)
    0
    I put the code size at the bottom of the table, scroll down to see //rom used.
    The instruction counts were taken from the simulator stopwatch, so they are directly proportional to time taken.
    All are doing 32bit. AFAIK XC8 is the only one that offers 24bit.
    And, for my sins, I do need 32bit float.
    Am I the only one that is stunned by the extremely poor results from XC8?
    #6
    andersm
    Super Member
    • Total Posts : 2466
    • Reward points : 0
    • Joined: 2012/10/07 14:57:44
    • Location: 0
    • Status: offline
    Re: XC8 2.00 floating point performance vs CCS and Mikrobasic 2018/10/12 02:28:46 (permalink)
    +2 (2)
    It would be interesting to see how the different libraries fare in conformance and accuracy tests.
    #7
    NKurzman
    A Guy on the Net
    • Total Posts : 16442
    • Reward points : 0
    • Joined: 2008/01/16 19:33:48
    • Location: 0
    • Status: offline
    Re: XC8 2.00 floating point performance vs CCS and Mikrobasic 2018/10/12 05:38:38 (permalink)
    0
    Oh, never mind then, I was reading it wrong.
    It would be interesting to see the comparison to XC8 Version 1.xx compilers
    post edited by NKurzman - 2018/10/12 05:40:43
    #8
    alexconway
    New Member
    • Total Posts : 25
    • Reward points : 0
    • Joined: 2009/07/14 08:40:24
    • Location: 0
    • Status: offline
    Re: XC8 2.00 floating point performance vs CCS and Mikrobasic 2018/10/17 08:19:19 (permalink)
    0
    I've now added XC8 in 24bit mode. Still very disappointing.

    volatile float float1, float2, float3;
    volatile uint16_t word1, word2;
    word1 = 65535;
    word2 = 613;
                   // instruction count    XC8 C90   XC8 C99   CCS   Mikrobasic  XC8 24bit
    float1 = (float) word1;                   189      189      55      73         73
    float1 *= (1.0 / 1024.0);                 296      296     117     140        251
    float2 = (float) (word2);                 296      296      82     127        139
    float2 /= 1024.0;                        1048     1048    1059     935        734       
    float3 = float1 * float2;                 302      302     118     132        255
    float3 = float1 / float2;                1115     1115     995     867        782
    float3 = float1 + float2;                 266      266     165     154        352
    float3 = float2 - float1;                 293      293     176     186        375
    float3 = float3 * 1024.0;                 296      296     117     140        251
    word1 = (uint16_t) (float3);              370      370      27      57        236

    volatile float gaindB = 9.5;
    volatile float gain = log10(gaindB);     6477    10236    4317    3652       5842
    gain = log2(gaindB);                      n/a     9379     n/a     n/a        n/a
    gaindB = pow(10, gain);                 14169    31444    7010    7655      12971  
    gaindB = pow(2.0, gain);                10053    21902    6718    6096       7862


                              //rom used     5570    19922    6140    7578       3814

     
    post edited by alexconway - 2018/10/18 02:05:42
    #9
    Howard Long
    Super Member
    • Total Posts : 410
    • Reward points : 0
    • Joined: 2005/04/04 08:50:32
    • Status: offline
    Re: XC8 2.00 floating point performance vs CCS and Mikrobasic 2018/10/17 09:17:30 (permalink)
    +2 (2)
    andersm
    It would be interesting to see how the different libraries fare in conformance and accuracy tests.



    XC8 follows true IEEE754 whereas (ISTBC) CCS & Mikrobasic I believe follow Microchip's old "modified IEEE 754 32-bit format" also known as "Microchip 32-bit floating point format".
     
    AN575 describes the old "Microchip 32-bit floating point format", which has the same precision as IEEE754 float, but the exponent and mantissa sign bit field locations differ to IEEE754 as a means to benefit performance.
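A sketch of how the two layouts relate, based on my reading of AN575: the fields are the same size and the exponent bias is the same, but the Microchip format puts the 8-bit exponent in the top byte with the sign bit directly below it. The function names are made up for illustration, the host `float` is assumed to be IEEE754 binary32, and special values (NaN, infinity, denormals) are not treated specially here.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Reinterpret a float as its raw 32-bit pattern. */
uint32_t float_to_bits(float f)
{
    uint32_t bits;
    assert(sizeof f == sizeof bits); /* expects a 32-bit float */
    memcpy(&bits, &f, sizeof bits);
    return bits;
}

/* IEEE754:   s eeeeeeee fffffffffffffffffffffff  (sign in bit 31)
   Microchip: eeeeeeee s fffffffffffffffffffffff  (sign in bit 23) */
uint32_t ieee754_to_mchp(uint32_t ieee)
{
    uint32_t sign     = (ieee >> 31) & 0x1u;
    uint32_t exponent = (ieee >> 23) & 0xFFu;
    uint32_t fraction = ieee & 0x007FFFFFu;
    return (exponent << 24) | (sign << 23) | fraction;
}

/* Inverse mapping: move the sign back to bit 31. */
uint32_t mchp_to_ieee754(uint32_t mchp)
{
    uint32_t exponent = (mchp >> 24) & 0xFFu;
    uint32_t sign     = (mchp >> 23) & 0x1u;
    uint32_t fraction = mchp & 0x007FFFFFu;
    return (sign << 31) | (exponent << 23) | fraction;
}
```

For example, 1.0f is 0x3F800000 in IEEE754 and maps to 0x7F000000, so in the Microchip layout the whole biased exponent can be read or written as a single byte, which is presumably where the performance benefit on an 8-bit core comes from.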
     
    #10
    alexconway
    New Member
    • Total Posts : 25
    • Reward points : 0
    • Joined: 2009/07/14 08:40:24
    • Location: 0
    • Status: offline
    Re: XC8 2.00 floating point performance vs CCS and Mikrobasic 2018/10/18 02:11:02 (permalink)
    0
    Howard Long
    XC8 follows true IEEE754 whereas (ISTBC) CCS & Mikrobasic I believe follow Microchip's old "modified IEEE 754 32-bit format" also known as "Microchip 32-bit floating point format".
     
    AN575 describes the old "Microchip 32-bit floating point format", which has the same precision as IEEE754 float, but the exponent and mantissa sign bit field locations differ to IEEE754 as a means to benefit performance.



    Well, I can confirm that Mikrobasic uses Microchip's modified format, but (according to AN575) that only seems to involve 3 extra instructions, not hundreds.
    Alex
    #11
    Aussie Susan
    Super Member
    • Total Posts : 3333
    • Reward points : 0
    • Joined: 2008/08/18 22:20:40
    • Location: Melbourne, Australia
    • Status: offline
    Re: XC8 2.00 floating point performance vs CCS and Mikrobasic 2018/10/18 18:22:28 (permalink)
    0
    Without knowing anything about this, 3 extra instructions in a loop can be executed hundreds of times.
    Also, 3 extra instructions each time something happens only requires that thing to happen 33 times to add about 100 to the instruction count. It all comes down to the algorithm used, not just the data format.
    Susan
    #12
    alexconway
    New Member
    • Total Posts : 25
    • Reward points : 0
    • Joined: 2009/07/14 08:40:24
    • Location: 0
    • Status: offline
    Re: XC8 2.00 floating point performance vs CCS and Mikrobasic 2018/10/19 10:35:12 (permalink)
    0
    I'm pretty sure the 3 instructions (or whatever) are outside the loop - that is to say, the algorithms used for the calculations work on the float "unpacked" into separate exponent and fraction (put that Knuth book down!).
    So the overhead of IEEE versus Microchip format is only incurred once per elementary operation (i.e. +, -, *, /).
    Of course Log10 or Pow may be using plenty of elementary operations.
    My measurements show excessive slowness for everything.
    I don't think the difference in formats is the reason for the slowness here.
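To illustrate what "unpacked" means here, a rough sketch of the unpack and repack steps that a software float routine would be expected to run once on entry and once on exit. This is not taken from any of the compilers' libraries: `unpacked_t`, `unpack_ieee754` and `pack_ieee754` are hypothetical names, the host `float` is assumed to be IEEE754 binary32, and only normal numbers are handled.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

typedef struct {
    uint8_t  sign;        /* 0 = positive, 1 = negative */
    int16_t  exponent;    /* unbiased power of two */
    uint32_t significand; /* 24 bits, implicit leading 1 made explicit */
} unpacked_t;

/* Split a float into sign, exponent and significand. */
unpacked_t unpack_ieee754(float f)
{
    unpacked_t u;
    uint32_t bits;
    uint32_t biased;

    memcpy(&bits, &f, sizeof bits);
    biased = (bits >> 23) & 0xFFu;
    assert(biased != 0u && biased != 0xFFu); /* normal numbers only */

    u.sign        = (uint8_t)(bits >> 31);
    u.exponent    = (int16_t)((int16_t)biased - 127);
    u.significand = (bits & 0x007FFFFFu) | 0x00800000u;
    return u;
}

/* Reassemble the float; the implicit leading 1 is dropped again. */
float pack_ieee754(unpacked_t u)
{
    float f;
    uint32_t bits = ((uint32_t)u.sign << 31)
                  | ((uint32_t)(u.exponent + 127) << 23)
                  | (u.significand & 0x007FFFFFu);
    memcpy(&f, &bits, sizeof f);
    return f;
}
```

For 9.5 this gives sign 0, exponent 3 and significand 0x980000 (1.1875 x 2^3). The loop-heavy part of a multiply or divide then works only on the significand, which is the same in both formats, so the format difference would show up in these two helper steps and not inside the loop.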
    Alex
    #13
    andersm
    Super Member
    • Total Posts : 2466
    • Reward points : 0
    • Joined: 2012/10/07 14:57:44
    • Location: 0
    • Status: offline
    Re: XC8 2.00 floating point performance vs CCS and Mikrobasic 2018/10/19 11:33:57 (permalink)
    0
    How do the other formats handle things like denormals or intermediate rounding, which have specified behaviour in IEEE754 and inhibit some optimizations? GCC, for example, has a bunch of options that improve performance at the expense of IEEE754 conformance.
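A small probe along these lines could be compiled with each toolchain to see whether denormals (gradual underflow) are supported or flushed to zero. It is a sketch only: the function names are hypothetical, it assumes `<float.h>` is accurate for the compiler in question, and it says nothing about rounding behaviour. On GCC, `-ffast-math` is one of the options of the kind mentioned above.

```c
#include <assert.h>
#include <float.h>

/* True if x is nonzero but smaller in magnitude than the smallest
   normal float, i.e. it is stored as a denormal. */
int is_subnormal(float x)
{
    float magnitude = (x < 0.0f) ? -x : x;
    return magnitude != 0.0f && magnitude < FLT_MIN;
}

/* Halve the smallest normal float at run time. With gradual underflow
   the result is a denormal; a library that flushes to zero returns 0. */
int gradual_underflow_supported(void)
{
    volatile float tiny = FLT_MIN; /* volatile: keep the divide at run time */
    float halved;

    assert(FLT_RADIX == 2); /* halving is only exact in radix 2 */
    halved = tiny / 2.0f;
    return is_subnormal(halved);
}
```

A library that returns 0 from `gradual_underflow_supported` is skipping work that IEEE754 requires, which would be one legitimate source of a speed difference in the elementary operations.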
    #14