• AVR Freaks

Hot!Bug in XC8 with -O2 or -O3

Author
hjf
Starting Member
  • Total Posts : 59
  • Reward points : 0
  • Joined: 2006/12/31 13:08:22
  • Location: 0
  • Status: offline
2019/03/08 15:32:44 (permalink)
0

Bug in XC8 with -O2 or -O3

This code:
 
#include <stdint.h>
#include <string.h>
//#include "speck.h"

#define ROR(x, r) ((x >> r) | (x << (32 - r)))
#define ROL(x, r) ((x << r) | (x >> (32 - r)))
#define R(x, y, k) (x = ROR(x, 8), x += y, x ^= k, y = ROL(y, 3), y ^= x)
#define ROUNDS 27

void encrypt_block(uint32_t ct[2], uint32_t const pt[2], uint32_t const K[4]) {
  uint32_t x = pt[0], y = pt[1];
  uint32_t a = K[0], b = K[1], c = K[2], d = K[3];

  R(y, x, a);
  // for (int i = 0; i < ROUNDS - 3; i += 3) {
  for (uint32_t i = 0; i < ROUNDS - 3; i += 3) {
    R(b, a, i);
    R(y, x, a);
    R(c, a, i + 1);
    R(y, x, a);
    R(d, a, i + 2);
    R(y, x, a);
  }
  R(b, a, ROUNDS - 3);
  R(y, x, a);
  R(c, a, ROUNDS - 2);
  R(y, x, a);

  ct[0] = x;
  ct[1] = y;
}

int main(void) {
  volatile uint32_t out[2];
  uint32_t in[] = {0x7475432d, 0x3b726574};
  uint32_t key[] = {0x03020100, 0x0b0a0908, 0x13121110, 0x1b1a1918};
  encrypt_block(out, in, key);
//you need to watch the variable "out" here, the expected result should be 8b 02 4e 45 48 a5 6f 8c


}
 
returns the wrong result with  -O3 or -Os. With -O1 it out[0] is 0x454E028B (correct), with -Os, out[0] is 0x8FA3FED7
If i compile it with GCC or CLANG (targeting windows), the result is correct at any optimization level: out[0] is 0x454E028B
 
This was discussed at https://stackoverflow.com...2-or-higher-is-enabled
post edited by hjf - 2019/03/08 17:31:45
#1

15 Replies Related Threads

    qhb
    Superb Member
    • Total Posts : 9999
    • Reward points : 0
    • Joined: 2016/06/05 14:55:32
    • Location: One step ahead...
    • Status: offline
    Re: Bug in XC8 with -O2 or -O3 2019/03/08 17:24:43 (permalink)
    +1 (1)
    hjf
    This was discussed at https://www.microchip.com...=1089410&edit=true

    That points back to THIS very topic...
     

    Nearly there...
    #2
    hjf
    Starting Member
    • Total Posts : 59
    • Reward points : 0
    • Joined: 2006/12/31 13:08:22
    • Location: 0
    • Status: offline
    Re: Bug in XC8 with -O2 or -O3 2019/03/08 17:32:22 (permalink)
    0
    qhb
    hjf
    This was discussed at https://www.microchip.com...=1089410&edit=true

    That points back to THIS very topic...
     


    oops!. fixed
    #3
    davekw7x
    Entropy++
    • Total Posts : 1879
    • Reward points : 0
    • Joined: 2012/01/16 12:01:07
    • Location: Second star on the right, straight on till morning
    • Status: offline
    Re: Bug in XC8 with -O2 or -O3 2019/03/08 22:56:05 (permalink)
    0
    You forgot to tell us which chip you are using.

    I only have the free version of XC8 so I can't test -Os or -O3, but XC8 version 2.05 does allow optimization levels 0, 1, and 2, and you did report a problem with level 2.

    So I pasted your code into a PIC18F26K42 project that I happened to have lying around.

    I don't particularly like the superfluous numbers inside the [] brackets in the function parameter definitions, but they don't do any harm (other than to mildly offend my aesthetic sensibilities), but I made no changes whatsoever to your code, and I added a print statement.  (My test plans for evaluating chips or compilers simply never depend on simulators or debuggers with IDE output.)

    From my main.c:



    // Put config parameters here or in their own separate .c file

    #include <xc.h>
    #include <stdio.h>
    #include <stdint.h>
    #include <string.h>
    //#include "speck.h"

    // Your definitions of ROR, ROL, R, ROUNDS and the encrypt_block function
    // are here

    // My intialization function
    void init_system(void);

    // XC8 has historically liked void main
    void main(void)
    {
        // Init I/O, clocks, timers, uart, etc...
        init_system();
        __delay_ms(100); // Give things a little time to settle down
        printf("Compiled on %s at %s by XC8 version %u\r\n",
                __DATE__, __TIME__, __XC8_VERSION);

        volatile uint32_t out[2];
        uint32_t in[] = { 0x7475432d, 0x3b726574 };
        uint32_t key[] = { 0x03020100, 0x0b0a0908, 0x13121110, 0x1b1a1918 };
        encrypt_block(out, in, key);
        //you need to watch the variable "out" here, the expected result should be 8b 02 4e 45 48 a5 6f 8c
        // From davekw7x: Heck, I'll just print it
        int i;
        for (i = 0; i < 2; i++) {
            printf("out[%d] = 0x%08lX\r\n", i, out[i]);
        }
        while (1) {
            LED1_Toggle();
            __delay_ms(1000);
        } // End of main loop
    } // End of main()


    Output using XC8 version 2.05 in C90 and C99 mode with optimization level 0, 1, 2 all give the same values for out[0] and out[1]
    Here's a run with C99 mode with optimization level 2

    Compiled on Mar  8 2019 at 21:29:30 by XC8 version 2050
    out[0] = 0x454E028B
    out[1] = 0x8C6FA548

    Bottom line: Same output as I got from your code with gcc on my Linux workstation.  If you would use the package function of your MPLABX project and post the zip file, I would like to see it. (See Footnote.)
     
    As I mentioned, I can't test with Optimization Level 3 or s, but I would still like to see all of the code.  If you can't show us the entire project, at least tell us what PIC you are using for this test!
     
     
    Regards,
     
    Dave
     
    Footnote:
    I am very interested in compiler misbehavior.  An optimization level 1 bug that I discovered in XC16 was fixed by its most recent release, so I know these things can happen, and they may occur only for certain very specific code arrangement, so the entire project may be relevant, not just main().
    post edited by davekw7x - 2019/03/08 23:22:10

    Sometimes I just can't help myself...
    #4
    hjf
    Starting Member
    • Total Posts : 59
    • Reward points : 0
    • Joined: 2006/12/31 13:08:22
    • Location: 0
    • Status: offline
    Re: Bug in XC8 with -O2 or -O3 2019/03/08 23:19:11 (permalink)
    0
    Interesting. I changed the PIC to yours and it works. When I put the simulator back to mine, the error appears.
    The PIC is a 16LF18313.
    #5
    NKurzman
    A Guy on the Net
    • Total Posts : 18779
    • Reward points : 0
    • Joined: 2008/01/16 19:33:48
    • Location: 0
    • Status: online
    Re: Bug in XC8 with -O2 or -O3 2019/03/09 00:21:39 (permalink)
    +1 (1)
    The Simulator is not a PIC. It is software with its own bugs. You are target locked on the compiler.
    You can cross post this The simulator forum.
    Or MPLabX, if it works on a real chip.
    #6
    davekw7x
    Entropy++
    • Total Posts : 1879
    • Reward points : 0
    • Joined: 2012/01/16 12:01:07
    • Location: Second star on the right, straight on till morning
    • Status: offline
    Re: Bug in XC8 with -O2 or -O3 2019/03/09 00:48:46 (permalink)
    +2 (2)
    hjf
    The PIC is a 16LF18313.

    Well, that's different.  A lot different.
     
    I don't have that chip, but I do have a PIC16F18345 (same family), and I resurrected a project that uses that on a Curiosity board.
     
    I put your code into the PIC16F18345 project and got the same errored output with Optimization Level 2 that you observed.
     
    With XC8 version 2.05 Optimization Level 1 (Level 0 gives same out[] values)
    Compiled on Mar  9 2019 at 07:36:11 UTC by XC8 version 2050                     
    out[0] = 0x454E028B                                                             
    out[1] = 0x8C6FA548                                                             
                                                                                    
    With XC8 version 2.05 Optimization Level 2
    Compiled on Mar  9 2019 at 07:36:59 UTC by XC8 version 2050                     
    out[0] = 0x8FA3FED7                                                             
    out[1] = 0x53D8CEA8


    Oops!
     
    Looks like time for a bug report to Microchip.  They will ask for a complete project.
     
    Real problem with real chip, so that eliminates the simulator as the culprit.  Maybe they can sort it out.
     
    [Edit]
    I hve submitted a support ticket.  If any Microchip employees are monitoring this and want check progress of the report, it is case number 00391433.
    Project is attached.
    [/Edit]
     
     
    Regards,
     
    Dave
    Footnote: One of the reasons I hate function-like macros is that they are very hard to debug.  I know they are necessary for performance reasons in applications like this, and I don't see anything fundamentally wrong with the way they are written here.  The fact that the program works with workstation gcc -Os sort of makes me believe they are OK.
    After observing the error with XC8 I cleaned up a few things (not related to the macros)  that made me feel better, but that didn't change the errored output.
    post edited by davekw7x - 2019/03/09 07:14:56

    Sometimes I just can't help myself...
    #7
    davekw7x
    Entropy++
    • Total Posts : 1879
    • Reward points : 0
    • Joined: 2012/01/16 12:01:07
    • Location: Second star on the right, straight on till morning
    • Status: offline
    Re: Bug in XC8 with -O2 or -O3 2019/03/10 08:57:04 (permalink)
    +5 (5)
    I did a little detective work and found some bad code generated by the compiler.  Note that this has nothing to do with way the macros are written or used or the way that the compiler inserts the defined macro stuff into the code.

    It's a bug in the way the optimizer handles the expression in the definition of ROL.

    There may or may not be other compiler "issues," but I'll stick to the definite bug that I found for the specific chip.  (PIC16F18345)

    I created a separate test function for this expression and inserted it in my project.  It shows proper output for Optimization levels 0 and 1, but wrong output for Optimization level 2;

    Here's the test code:

    void test_ROL()
    {
        printf("In test_ROL()\n");

        //uint32_t xinp = 0x743B7265; // This is the input value for all tests here

        // Calculated off-line for this xinp rotated 3 bits left
        //uint32_t expected = 0xA1DB932B;
        
        // Easier to calculate correct answer in my poor little peanut head:
        uint32_t xinp = 0x7000000E;
        uint32_t expected = 0x80000073;

        uint32_t x = xinp;
        uint8_t r = 3;              // All tests here rotate 3 bits

        printf("  x = 0x%08lX\n", x);
        printf("  r = %u\n", r);
        printf("  0x%08lX rotated left %u is equal to 0x%08lX\n\n", x, r, expected);
        printf("    Before rol assignment : x = 0x%08lX\n", x);
        // Same action as the macro, with same results:
        // Optimization level 0 or 1 works OK, 2 screws the pooch
        x = (x << r) | (x >> (32-r));
        printf("    After  rol assignment : x = 0x%08lX <=== ", x);
        if (x == expected) {
            printf("OK!");
        }
        else {
            printf("Ding-Ding-Ding: Wrong answer!");
        }
        printf("\n\n");
    }


    Here are results of a couple of test runs:
    Output: XC8 version 2.05 C99 mode, Optimization Level 1
    Test version compiled on Mar 10 2019 at 07:53:36 by XC8 version 2050
    In test_ROL()
      x = 0x7000000E
      r = 3
      0x7000000E rotated left 3 is equal to 0x80000073

        Before rol assignment : x = 0x7000000E
        After  rol assignment : x = 0x80000073 <=== OK!

    and

    Output: XC8 version 2.05 C99 mode, Optimization Level 2
    Test version compiled on Mar 10 2019 at 07:54:55 by XC8 version 2050
    In test_ROL()
      x = 0x7000000E
      r = 3
      0x7000000E rotated left 3 is equal to 0x80000073

        Before rol assignment : x = 0x7000000E
        After  rol assignment : x = 0x80000071 <=== Ding-Ding-Ding: Wrong answer!


    I have attached a file that shows an excerpt from the Level 2 listing output and shows where the optimizer goes wrong for this example.
     
    I have given all of this information to the support team in my bug report.  Regardless of whether (or not) my analysis is correct, I think it is a serious bug, but who knows when they will get a round tuit?
     
    Note that this post concentrates on the ROL code, but I discovered the expression in ROR can create the same kind of incorrect functionality.  Depending on what else is going on in the particular function that uses one of these rotate sequences, optimization may or may not occur, so testing a particular example can't prove that it will never create a problem.

    Bottom line:  Even if you can figure out how to change the definitions for ROL and ROR to suppress the optimization for these particular expressions, I would probably just stick to Optimization Level 1 until this bug gets fixed.  Optimization Level 2 generates significantly smaller code for other things as will, and it would be really nice to be able to use it.

    Regards,

    Dave


    post edited by davekw7x - 2019/03/10 10:02:27

    Sometimes I just can't help myself...
    #8
    hjf
    Starting Member
    • Total Posts : 59
    • Reward points : 0
    • Joined: 2006/12/31 13:08:22
    • Location: 0
    • Status: offline
    Re: Bug in XC8 with -O2 or -O3 2019/03/10 11:32:32 (permalink)
    0
    As a workaround I created an actual function and split the rotation in two operations:
     
    uint32_t ROL(uint32_t x, uint8_t r) {
        uint32_t intermedio;
        intermedio = x << r;
        intermedio |= x >> (32 - r);
        return intermedio;
    }
    This gives the correct result.

    #9
    davekw7x
    Entropy++
    • Total Posts : 1879
    • Reward points : 0
    • Joined: 2012/01/16 12:01:07
    • Location: Second star on the right, straight on till morning
    • Status: offline
    Re: Bug in XC8 with -O2 or -O3 2019/03/10 12:08:24 (permalink)
    +1 (1)
    hjf
    This gives the correct result.

    davekw7x
    Note that this post concentrates on the ROL code, but I discovered the expression in ROR can create the same kind of incorrect functionality.

    Even though your specific example gives the correct answer with the ROL workaround, I have verified that, under certain conditions, the ROR code can also screw the pooch.
     
    Bottom line: For sanity sake, I strongly suggest that you implement an analogous workaround for ROR
     
    Regards,

    Dave
    post edited by davekw7x - 2019/03/10 12:11:15

    Sometimes I just can't help myself...
    #10
    hjf
    Starting Member
    • Total Posts : 59
    • Reward points : 0
    • Joined: 2006/12/31 13:08:22
    • Location: 0
    • Status: offline
    Re: Bug in XC8 with -O2 or -O3 2019/03/10 12:09:59 (permalink)
    +1 (1)
    Sorry, I posted that as an example. I did the same for both ROR and ROL.
    #11
    TedNukem
    New Member
    • Total Posts : 1
    • Reward points : 0
    • Joined: 2019/03/08 02:25:05
    • Location: 0
    • Status: offline
    Re: Bug in XC8 with -O2 or -O3 2019/03/11 04:46:42 (permalink)
    0
    Hi !
     
    I have maybe the same problem !!!
     
    Here's what I do: I get the address of the character from the table into FLASH ROM:
    const uint8_t* Tst1( uint8_t ch ) {
        return &font_8x8_32_128[ (ch - 32) << 3 ];
    }

     
    PIC12F1840, XC8 v2.05, C99
     
    If the optimization is -O3 or -s, the function returns the wrong result, if optimization is -O2 returns the correct result:
    optimization -O2, offset 0x23        |                optimization -O3 / -s, offset 0x1d1
    Tst1( '0' ) -> 0x80A3 ( Ok )               |                Tst1( '0' ) -> 0x8251 ( Ok )
    Tst1( '1' ) -> 0x80AB ( Ok )               |                Tst1( '1' ) -> 0x8A59 ( WRONG )
    Tst1( '2' ) -> 0x80B3 ( Ok )               |                Tst1( '2' ) -> 0x9261 ( WRONG )
    Tst1( '3' ) -> 0x80BB ( Ok )               |                Tst1( '3' ) -> 0x9A69 ( WRONG )
    Tst1( '4' ) -> 0x80C3 ( Ok )               |                Tst1( '4' ) -> 0x‭A271‬ ( WRONG )
     
    This is the assembler's listing:
     

     
    If the function changes this way, the addresses are returned correct regardless of the optimization:
    const uint8_t* Tst2( uint8_t ch ) {
        uint16_t chs;
        ch -= 32;
        chs = ch << 3;
        return &font_8x8_32_128[ chs ];
    }

     
    I tried instruction after instruction to reproduce the wrong result, but no success.
    Yet the wrong result is fact!
    #12
    pcbbc
    Super Member
    • Total Posts : 1698
    • Reward points : 0
    • Joined: 2014/03/27 07:04:41
    • Location: 0
    • Status: offline
    Re: Bug in XC8 with -O2 or -O3 2019/03/12 05:21:11 (permalink)
    +1 (1)
    It's attempting to eliminate the ??_Tst1+2/??_Tst1+3 temporary registers for the adjusted base address of the array.  Which it could do, but for some reason it tries to double duty the ??_Tst1/??_Tst1+1 registers, which it then overwrites with the shifted character index.
     
    Presumably what it was aiming for was something like...
    movf    Tst@ch,w
    movwf    ??_Tst1
    clrf    ??_Tst1+1
    lslf    ??_Tst1,f
    rlf    ??__Tst1+1,f
    lslf    ??_Tst1,f
    rlf    ??__Tst1+1,f
    lslf    ??_Tst1,f
    rlf    ??__Tst1+1,f
    movlw    low ((_font_8x8_32_128 | (0+32768)+65280))
    addwf    ??_Tst1,f
    movlw    high ((_font_8x8_32_128 | (0+32768)+65280))
    addwfc    ??_Tst1+1,f

     
    I wonder if it has anything to do with the adjusted offset of 32 x 8 = 256, and so the low byte is zero?  There's probably an optimisation that says in that case you only need to add to the high byte.  But you still have the low byte of &_font_8x8_32_128 to consider, so that's probably not it.  I know next to nothing of the dark art of compiler code generation, so I'm totally guessing.
    #13
    davekw7x
    Entropy++
    • Total Posts : 1879
    • Reward points : 0
    • Joined: 2012/01/16 12:01:07
    • Location: Second star on the right, straight on till morning
    • Status: offline
    Re: Bug in XC8 with -O2 or -O3 2019/03/13 07:52:26 (permalink)
    +2 (2)
    davekw7x
    I have given all of this information to the support team in my bug report.  Regardless of whether (or not) my analysis is correct, I think it is a serious bug, but who knows when they will get a round tuit?

     
    My wisecrack was based on previous experience with Microchip support when it took over two months and, finally, intervention by a member of the XC16 compiler team, whom I had contacted by external means, to get it looked at seriously.
     
    However, only a couple days after submitting this bug report. I received the following:
    Microchip support team
    **** Microchip Comments Begin ****
    Hi Dave,

    I am able to replicate the issue in the code that involves rotate operations.
    The issue only affects the expressions where the compiler has detected that a rotate operation is being performed and a level 2 or a higher optimizations is being used.

    I have created an internal report for this issue.
    I'll provide you with the issue report number, so that you can have a reference to this issue.
    It is XC8-186.
     **** Microchip Comments End ******

    Bottom line: If you submit not only a project that shows the problem, but an analysis that includes the exact spot in compiled code that creates the problem (and a precise explanation of where the compiled code went wrong, and how it might be fixed) you have a chance of getting their immediate attention.  (However, see Footnote.)
     
    In other words the ticket is to be able to zero in on the problem and grab it by the throat and wrestle it to the ground.  Not always possible for the end user, but this one alarmed me enough to make me want to expend the effort.
     
    Regards,
     
    Dave
    Footnote:
    It also depends on the luck of the draw---who in the Support team gets assigned your case.  I had similar detailed analysis of the previous bug and received several inane (ridiculously irrelevant and incorrect) responses from that Support person, which I rebutted to no avail.
     
    post edited by davekw7x - 2019/03/13 22:16:15

    Sometimes I just can't help myself...
    #14
    davekw7x
    Entropy++
    • Total Posts : 1879
    • Reward points : 0
    • Joined: 2012/01/16 12:01:07
    • Location: Second star on the right, straight on till morning
    • Status: offline
    Re: Bug in XC8 with -O2 or -O3 2020/05/14 09:58:02 (permalink)
    +4 (4)
    A followup in case someone stumbles across this post now (a year later) and is still interested:
     
    Received a notification that the bug has been fixed in XC8 release 2.20.  I verified that my test gives satisfactory results with optimization levels 0, 1, 2  (XC8 version 2.20, C99 mode)
     
    Timeline:
    1. March 11, 2019:  I submitted bug report with a complete project including an assembly language routine that gave satisfactory results and a C function that exhibited the bug and an explanation of exactly what the problem was with assembly code generated from the C source with Optimization level 2.
    2. March 13, 2019: Reply from Microchip Support that they verified the bug, and appreciated my analysis.
    3. May 13, 2020: Report from Microchip Support that the bug has been corrected in XC8 release 2.20
     
     
    Regards,

    Dave
    post edited by davekw7x - 2020/05/14 14:19:56

    Sometimes I just can't help myself...
    #15
    du00000001
    Just Some Member
    • Total Posts : 3778
    • Reward points : 0
    • Joined: 2016/05/03 13:52:42
    • Location: Germany
    • Status: offline
    Re: Bug in XC8 with -O2 or -O3 2020/05/14 10:36:47 (permalink)
    +1 (1)
    Hardly more than 14 months.

    PEBKAC / EBKAC / POBCAK / PICNIC (eventually see en.wikipedia.org)
    #16
    Jump to:
    © 2020 APG vNext Commercial Version 4.5