
XC8 the good, the bad, and the Ugly

Author
davea
Super Member
  • Total Posts : 585
  • Reward points : 0
  • Joined: 2016/01/28 13:12:13
  • Location: Tampa Bay FL USA
  • Status: offline
2020/11/28 13:42:25 (permalink)
5 (2)

XC8 the good, the bad, and the Ugly

I started this project as a learning exercise to see whether I could hand-optimize C code
for faster execution (using PRO, opt = 3), and I found some tricks that make the code faster.
The first one, direct byte addressing (from 1and0), makes a big difference, even more so when expanded out to 4 bytes:
// ADC = ((ADRESH << 8) | ADRESL); // BAD: used in Microchip's canned functions
#define ADCL (*((uint8_t *)&ADC + 0)) // Low byte
#define ADCH (*((uint8_t *)&ADC + 1)) // Hi byte
    ADCH = ADRESH;
    ADCL = ADRESL;
    NEARbits.DO_1ms = 1;
    PIR1bits.ADIF = 0;
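
A minimal host-side sketch of the byte-addressing trick above, for anyone who wants to check that both forms produce the same value. The names `adc_result`, `ADC_LO`, `ADC_HI`, `store_bytewise` and `combine_shift_or` are made up for illustration, plain parameters stand in for ADRESH/ADRESL, and the macros assume a little-endian layout (which is what XC8 uses for PIC, as far as I know).

```c
#include <stdint.h>

/* Hypothetical 16-bit destination, standing in for the ADC variable. */
uint16_t adc_result;

/* Byte views of the 16-bit variable. Assumes little-endian storage:
 * offset 0 is the low byte, offset 1 is the high byte. */
#define ADC_LO (*((uint8_t *)&adc_result + 0))
#define ADC_HI (*((uint8_t *)&adc_result + 1))

/* Byte-wise store: two plain 8-bit moves, no shift and no OR. */
void store_bytewise(uint8_t hi, uint8_t lo)
{
    ADC_HI = hi;
    ADC_LO = lo;
}

/* Reference form: the shift-and-OR expression from the commented-out line. */
uint16_t combine_shift_or(uint8_t hi, uint8_t lo)
{
    return (uint16_t)(((uint16_t)hi << 8) | lo);
}
```

Whether the byte-wise form is actually faster depends on the compiler version and optimization level, so it is worth confirming in the disassembly rather than taking on trust.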
 
The bad:
96: MUL_tmp >>= 2; // 7 extra cycles versus doing 2 separate shifts!
0098 3002 MOVLW 0x2
009A 00F7 MOVWF 0x77
009B 37E8 ASRF 0x68, F
009C 0CE7 RRF 0x67, F
009D 0CE6 RRF 0x66, F
009E 0CE5 RRF MUL_tmp, F
009F 0BF7 DECFSZ 0x77, F
00A0 289B GOTO 0x9B
97: MUL_tmp >>= 1; // ^^^^^^^^^^^^
00A1 37E8 ASRF 0x68, F
00A2 0CE7 RRF 0x67, F
00A3 0CE6 RRF 0x66, F
00A4 0CE5 RRF MUL_tmp, F

ACC += MUL_tmp >> 2; // <<<< adds 20 us, this is BAD
Don't overstuff your lines just because you can; it may cost you.
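
The shift observations above can be sketched as a small function: do the multi-bit shift as single-bit shifts in separate statements, then accumulate in its own statement. `accumulate_quarter` is a hypothetical name, and the signed 32-bit type is inferred from the 4-byte ASRF/RRF sequence in the listing. The cycle counts quoted above come from one compiler version and optimization level, so treat this as something to verify in your own disassembly.

```c
#include <stdint.h>

/* Adds mul_tmp / 4 (as a right shift by 2) to acc.
 * The shift is written as two single-bit shifts, and the add is kept on
 * its own line, which is the form reported above to avoid the shift loop. */
int32_t accumulate_quarter(int32_t acc, int32_t mul_tmp)
{
    mul_tmp >>= 1;   /* first single-bit shift  */
    mul_tmp >>= 1;   /* second single-bit shift */
    acc += mul_tmp;  /* accumulate separately   */
    return acc;
}
```

The result is the same as `acc + (mul_tmp >> 2)`; only the way the source is split across statements changes.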
The ugly (XC8 free, opt = 2):
1: every byte you move is saved to a temp location first and then moved to its final location
2: any shift uses a loop, even if it's >>1, and #1 applies to it as well
3: CLRF is never used; it's always MOVLW 0 and MOVWF for every byte
And out of the blue, this showed up in one of the tabs.
This is Umul8.c from MPLAB X 5.35, XC8 2.31 (Pack 1.3.177 is broken, so I'm using 1.2.99):
// 8 x 8 bit multiplication with 8 bit result
unsigned char
__bmul(unsigned char multiplier, unsigned char multiplicand)
{
 unsigned char product = 0;
#if defined(__OPTIMIZE_SPEED__)  /* <<<<<<<<<<<< */
 if(multiplier & 0x01)
  product = (product + multiplicand) & 0xff;
 multiplicand <<= 1;
 if(multiplier & 0x02)
  product = (product + multiplicand) & 0xff;
 multiplicand <<= 1;
 if(multiplier & 0x04)
  product = (product + multiplicand) & 0xff;
 multiplicand <<= 1;
 if(multiplier & 0x08)
  product = (product + multiplicand) & 0xff;
 multiplicand <<= 1;
 if(multiplier & 0x10)
  product = (product + multiplicand) & 0xff;
 multiplicand <<= 1;
 if(multiplier & 0x20)
  product = (product + multiplicand) & 0xff;
 multiplicand <<= 1;
 if(multiplier & 0x40)
  product = (product + multiplicand) & 0xff;
 multiplicand <<= 1;
 if(multiplier & 0x80)
  product = (product + multiplicand) & 0xff;
#else
 do {
  if(multiplier & 1)
   product += multiplicand;
  multiplicand <<= 1;
  multiplier >>= 1;
 } while(multiplier != 0);
#endif
 return product;
}
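
For reference, a host-side sketch that checks the two branches of the routine above against each other. It is not part of the XC8 sources: `bmul_loop`, `bmul_unrolled`, `BMUL_STEP` and `count_mismatches` are hypothetical names, and the unrolled steps are wrapped in a macro only to keep the sketch short.

```c
#include <stdint.h>

/* Loop form, equivalent to the #else branch. */
uint8_t bmul_loop(uint8_t multiplier, uint8_t multiplicand)
{
    uint8_t product = 0;
    do {
        if (multiplier & 1)
            product = (uint8_t)(product + multiplicand);
        multiplicand = (uint8_t)(multiplicand << 1);
        multiplier >>= 1;
    } while (multiplier != 0);
    return product;
}

/* One unrolled step: test a fixed bit, add, shift the multiplicand. */
#define BMUL_STEP(mask)                                   \
    do {                                                  \
        if (multiplier & (mask))                          \
            product = (uint8_t)(product + multiplicand);  \
        multiplicand = (uint8_t)(multiplicand << 1);      \
    } while (0)

/* Unrolled form, equivalent to the __OPTIMIZE_SPEED__ branch. */
uint8_t bmul_unrolled(uint8_t multiplier, uint8_t multiplicand)
{
    uint8_t product = 0;
    BMUL_STEP(0x01); BMUL_STEP(0x02); BMUL_STEP(0x04); BMUL_STEP(0x08);
    BMUL_STEP(0x10); BMUL_STEP(0x20); BMUL_STEP(0x40); BMUL_STEP(0x80);
    return product;
}

/* Counts input pairs where the two forms, or the plain C product
 * truncated to 8 bits, disagree. */
unsigned count_mismatches(void)
{
    unsigned mismatches = 0;
    for (unsigned a = 0; a < 256; a++) {
        for (unsigned b = 0; b < 256; b++) {
            uint8_t expected = (uint8_t)(a * b);
            if (bmul_loop((uint8_t)a, (uint8_t)b) != expected ||
                bmul_unrolled((uint8_t)a, (uint8_t)b) != expected)
                mismatches++;
        }
    }
    return mismatches;
}
```

Both forms compute the same low 8 bits; the difference is only in how many loop-control instructions the target ends up executing.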
I call BS on this...
I have included the disassembly listing and the HEX file in the zip file.
Compare the PRO (opt = 3) disassembly with the opt = 2 one; you will be shocked...
Let me know what you think, and any other ways to speed up C code.
Thanks, Ric, for all the help over the years.
davea
 
 
#1


    trossin
    Super Member
    • Total Posts : 64
    • Reward points : 0
    • Joined: 2006/06/02 11:31:50
    • Location: 0
    • Status: offline
    Re: XC8 the good, the bad, and the Ugly 2020/11/29 18:40:22 (permalink)
    0
    I complained about the multiply issue here which has some work-arounds.

    https://www.microchip.com..ums/m/tm.aspx?m=1130061
    #2
    ric
    Super Member
    • Total Posts : 29435
    • Reward points : 0
    • Joined: 2003/11/07 12:41:26
    • Location: Australia, Melbourne
    • Status: online
    Re: XC8 the good, the bad, and the Ugly 2020/11/29 18:57:01 (permalink)
    +1 (3)
    trossin
    I complained about the multiply issue here which has some work-arounds.

    https://www.microchip.com..ums/m/tm.aspx?m=1130061

    It does appear the easiest workaround is to pay for the PRO licence. ;)
     

    I also post at: PicForum
    Links to useful PIC information: http://picforum.ric323.co...opic.php?f=59&t=15
    NEW USERS: Posting images, links and code - workaround for restrictions.
    To get a useful answer, always state which PIC you are using!
    #3
    davea
    Super Member
    • Total Posts : 585
    • Reward points : 0
    • Joined: 2016/01/28 13:12:13
    • Location: Tampa Bay FL USA
    • Status: offline
    Re: XC8 the good, the bad, and the Ugly 2020/11/29 22:40:16 (permalink)
    0
    Thanks for the link, it got me thinking.

    /*
    a 32-bit multiply can be decomposed into the sum of ten 8-bit multiplies
                  a  b  c  d
               *  e  f  g  h
      ------------------------
               |          dh
               |       ch  0
               |    bh  0  0
               | ah  0  0  0
               |       dg  0
               |    cg  0  0
               | bg  0  0  0
             ag|  0  0  0  0  (we ignore this intermediate product
                               because it does not affect the low 32 bits of the result)
               |    df  0  0
               | cf  0  0  0
             bf|  0  0  0  0  (ignore)
          af  0|  0  0  0  0  (ignore)
               | de  0  0  0
             ce|  0  0  0  0  (ignore)
          be  0|  0  0  0  0  (ignore)
    +  ae  0  0|  0  0  0  0  (ignore)
      ========================
    */
    product = (unsigned int)LOWBYTE(multiplier) * LOWBYTE(multiplicand);
    #if defined(USE_MASKS)
    product += ((unsigned long)
                ((unsigned int)LOWBYTE(multiplier) * LMIDBYTE(multiplicand))
                +
                ((unsigned int)LMIDBYTE(multiplier) * LOWBYTE(multiplicand)))
               << 8;
    product += ((unsigned long)
                ((unsigned int)LOWBYTE(multiplier) * HMIDBYTE(multiplicand))
                +
                ((unsigned int)LMIDBYTE(multiplier) * LMIDBYTE(multiplicand))
                +
                ((unsigned int)HMIDBYTE(multiplier) * LOWBYTE(multiplicand)))
               << 16;
    /* cast to smaller type to avoid adding high bits just to discard */
    product += ((unsigned long)
                (unsigned char)
                ((unsigned int)LOWBYTE(multiplier) * HIGHBYTE(multiplicand))
                +
                (unsigned char)
                ((unsigned int)LMIDBYTE(multiplier) * HMIDBYTE(multiplicand))
                +
                (unsigned char)
                ((unsigned int)HMIDBYTE(multiplier) * LMIDBYTE(multiplicand))
                +
                (unsigned char)
                ((unsigned int)HIGHBYTE(multiplier) * LOWBYTE(multiplicand)))
               << 24;
    Seeing that the multiply I need is only 16 bits x 24 bits,
    I could do this and use 8-bit x 8-bit into 16-bit multiplies:

                        c  d
               *     f  g  h
      ------------------------
               |          dh
               |       ch  0
               |    bh  0  0  <<< not needed
               | ah  0  0  0  <<< not needed
               |       dg  0
               |    cg  0  0
               | bg  0  0  0  <<< not needed
             ag|  0  0  0  0  (we ignore this intermediate product
                               because it does not affect the low 32 bits of the result)
               |    df  0  0
               | cf  0  0  0
             bf|  0  0  0  0  (ignore)
          af  0|  0  0  0  0  (ignore)
               | de  0  0  0  <<< not needed
             ce|  0  0  0  0  (ignore)
          be  0|  0  0  0  0  (ignore)
    +  ae  0  0|  0  0  0  0  (ignore)
      ========================

    That leaves only 6 multiplies. Do you think it would be any faster than this?
    product = 0;
    do {
        if(multiplier & 1)
            product += multiplicand;
        multiplicand <<= 1;
        multiplier >>= 1;
    } while(multiplier != 0);

    thanks
    davea
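
A sketch of the six-multiply 16 x 24 decomposition described above, written so it can be checked on a host. `mul8x8` and `mul16x24` are hypothetical names, not XC8 library routines. Whether this beats the shift-and-add loop depends on how cheap an 8 x 8 multiply is on the target: as far as I know PIC18 has a hardware 8 x 8 multiplier, while the enhanced mid-range parts (the ones with ASRF) do not, so there each `mul8x8` would itself be a software routine.

```c
#include <stdint.h>

/* 8 x 8 -> 16 bit multiply, the only multiply primitive used below. */
static uint16_t mul8x8(uint8_t a, uint8_t b)
{
    return (uint16_t)((uint16_t)a * b);
}

/* Low 32 bits of a 16-bit x 24-bit product.
 * a = c:d (high:low byte), b = f:g:h (only the low 24 bits of b are used).
 * Partial products: dh, ch, dg, cg, df, cf (six multiplies). */
uint32_t mul16x24(uint16_t a, uint32_t b)
{
    uint8_t d = (uint8_t)a;
    uint8_t c = (uint8_t)(a >> 8);
    uint8_t h = (uint8_t)b;
    uint8_t g = (uint8_t)(b >> 8);
    uint8_t f = (uint8_t)(b >> 16);

    uint32_t product = mul8x8(d, h);
    product += ((uint32_t)mul8x8(c, h) + mul8x8(d, g)) << 8;
    product += ((uint32_t)mul8x8(c, g) + mul8x8(d, f)) << 16;
    /* Only the low byte of cf lands inside the 32-bit result. */
    product += (uint32_t)(uint8_t)mul8x8(c, f) << 24;
    return product;
}
```

Unsigned arithmetic wraps modulo 2^32, so the bits shifted out of the top in the last two lines are discarded exactly as the "ignore" rows in the diagram suggest.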
    #4
    NKurzman
    A Guy on the Net
    • Total Posts : 19106
    • Reward points : 0
    • Joined: 2008/01/16 19:33:48
    • Location: 0
    • Status: online
    Re: XC8 the good, the bad, and the Ugly 2020/11/29 23:34:16 (permalink)
    +2 (2)
    The problem with trying to trick the compiler into making faster code is that there’s no guarantee that it will work on future versions of the compiler. If you have a single project that you want to trick into running faster without paying for the PRO license (which can be rented for about $30 a month), you will have to check that all of your tricks still work if you move to a newer version.
    #5
    crosland
    Super Member
    • Total Posts : 2142
    • Reward points : 0
    • Joined: 2005/05/10 10:55:05
    • Location: Warks, UK
    • Status: offline
    Re: XC8 the good, the bad, and the Ugly 2020/11/30 08:20:49 (permalink)
    -1 (1)
    NKurzman
    pro license. ( which can be rented for about $30 a month) 



    I would call $40/mo considerably more than "about" :)
     
    [You missed the recent price increase]
     
    #6
    crosland
    Super Member
    • Total Posts : 2142
    • Reward points : 0
    • Joined: 2005/05/10 10:55:05
    • Location: Warks, UK
    • Status: offline
    Re: XC8 the good, the bad, and the Ugly 2020/11/30 08:22:40 (permalink)
    +1 (3)
    trossin
    I complained about the multiply issue here which has some work-arounds.

    https://www.microchip.com..ums/m/tm.aspx?m=1130061



    Does anyone else see a completely different forum interface when following that link? What am I missing?
    #7
    Jan Audio
    Super Member
    • Total Posts : 187
    • Reward points : 0
    • Joined: 2018/09/24 08:12:24
    • Location: 0
    • Status: offline
    Re: XC8 the good, the bad, and the Ugly 2020/11/30 08:29:14 (permalink)
    +2 (2)
    NKurzman
    The problem with trying to trick the compiler into making faster code is that there’s no guarantee that it will work on future versions of the compiler.



    Indeed, write down the used XC version in the code comments.
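
One way to act on this, as a sketch: besides the comment, let the build itself flag a different compiler. This assumes the `__XC8_VERSION` predefined macro described in the XC8 user's guide, encoded as version times 1000 (so 2310 would be v2.31, which is an assumption worth checking against your guide). `EXPECTED_XC8_VERSION` and `built_with_expected_compiler` are hypothetical names, and `#warning` is a common extension rather than standard C before C23.

```c
/* Built and timed with XC8 v2.31 (PRO, optimization level 3). */
#define EXPECTED_XC8_VERSION 2310   /* hypothetical name; assumes version * 1000 */

#if defined(__XC8_VERSION) && (__XC8_VERSION != EXPECTED_XC8_VERSION)
#warning "Compiler version differs from the one the hand-tuned code was checked with"
#endif

/* Returns 1 when built with the recorded compiler version,
 * 0 otherwise (including builds with a compiler other than XC8). */
int built_with_expected_compiler(void)
{
#if defined(__XC8_VERSION)
    return __XC8_VERSION == EXPECTED_XC8_VERSION;
#else
    return 0;
#endif
}
```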
    #8
    domble
    Super Member
    • Total Posts : 183
    • Reward points : 0
    • Joined: 2007/01/25 04:11:53
    • Location: UK
    • Status: offline
    Re: XC8 the good, the bad, and the Ugly 2020/11/30 08:38:09 (permalink)
    +1 (1)
    crosland
    trossin
    I complained about the multiply issue here which has some work-arounds.

    https://www.microchip.com..ums/m/tm.aspx?m=1130061



    Does anyone else see a completely different forum interface when following that link? What am I missing?


    The /m/ (mobile) part of the link.
     
    dom.
     
    #9
    oliverb
    Super Member
    • Total Posts : 369
    • Reward points : 0
    • Joined: 2009/02/16 13:12:38
    • Location: 0
    • Status: offline
    Re: XC8 the good, the bad, and the Ugly 2020/11/30 08:40:06 (permalink)
    0
    Is there an efficient way to code a "promoting" multiply, e.g. one that takes 8 bit arguments and returns a 16 bit result, or 16x16 to 32?
     
    I suppose ideally there'd be a syntax that allowed you to cast the multiply operator directly but as I understand it you can only cast the arguments (too early) or the result (too late).
    #10
    NKurzman
    A Guy on the Net
    • Total Posts : 19106
    • Reward points : 0
    • Joined: 2008/01/16 19:33:48
    • Location: 0
    • Status: online
    Re: XC8 the good, the bad, and the Ugly 2020/11/30 09:50:32 (permalink)
    +1 (1)
    crosland
    I would call $40/mo considerably more than "about" :) 
    [You missed the recent price increase]



    Apparently I did.
    #11
    NKurzman
    A Guy on the Net
    • Total Posts : 19106
    • Reward points : 0
    • Joined: 2008/01/16 19:33:48
    • Location: 0
    • Status: online
    Re: XC8 the good, the bad, and the Ugly 2020/11/30 09:53:30 (permalink)
    0
    Jan Audio
    NKurzman
    The problem with trying to trick the compiler into making faster code is that there’s no guarantee that it will work on future versions of the compiler.



    Indeed, write down the used XC version in the code comments.


    That is great advice for all code. Compilers change over time. The person who inherits the project may need the compiler that built the working code. And sometimes it is the original programmer who needs to make a minor change.
    #12
    NKurzman
    A Guy on the Net
    • Total Posts : 19106
    • Reward points : 0
    • Joined: 2008/01/16 19:33:48
    • Location: 0
    • Status: online
    Re: XC8 the good, the bad, and the Ugly 2020/11/30 09:57:09 (permalink)
    0 (2)
    oliverb
    Is there an efficient way to code a "promoting" multiply, e.g. one that takes 8 bit arguments and returns a 16 bit result, or 16x16 to 32?
     
    I suppose ideally there'd be a syntax that allowed you to cast the multiply operator directly but as I understand it you can only cast the arguments (too early) or the result (too late).


    C does not support 16x16=32, only 16x16=16. You can use any multiplication algorithm in C or ASM. For PIC18 you may be able to use the built-in hardware multiply.
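
For what it's worth, the usual standard C idiom for a "promoting" multiply is to cast one operand to the wider type before multiplying, so the multiplication is carried out at that width. A minimal sketch, with hypothetical function names; whether XC8 then recognises that the operands are narrow and emits a cheaper multiply is not something verified here, so check the generated code.

```c
#include <stdint.h>

/* 8 x 8 -> 16: cast one operand so the multiply happens at 16 bits or wider. */
uint16_t mul8x8_16(uint8_t a, uint8_t b)
{
    return (uint16_t)((uint16_t)a * b);
}

/* 16 x 16 -> 32: cast one operand to 32 bits; the other is converted to match.
 * Without the cast, a target with 16-bit int would truncate the product to 16 bits. */
uint32_t mul16x16_32(uint16_t a, uint16_t b)
{
    return (uint32_t)a * b;
}
```

Casting the result instead, as in `(uint32_t)(a * b)`, is the "too late" case from the question: the multiply has already been done at the narrower width.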
    #13
    davea
    Super Member
    • Total Posts : 585
    • Reward points : 0
    • Joined: 2016/01/28 13:12:13
    • Location: Tampa Bay FL USA
    • Status: offline
    Re: XC8 the good, the bad, and the Ugly 2020/11/30 14:44:33 (permalink)
    +1 (1)
    Thanks for the replies.
    Currently the company pays for PRO,
    but I don't know for how long, now
    that COVID-19 has killed our sales.
    So my objective is to create functions
    in PRO mode and save the disassembly file as a library
    to reference as I get better at PIC RISC asm,
    then move to mixed C and asm in the free mode.
    Can anyone answer #4 with a yes or no?
    davea
     
     
    #14
    NKurzman
    A Guy on the Net
    • Total Posts : 19106
    • Reward points : 0
    • Joined: 2008/01/16 19:33:48
    • Location: 0
    • Status: online
    Re: XC8 the good, the bad, and the Ugly 2020/11/30 15:11:30 (permalink)
    0
    Does the company pay for PRO by the month, or did it buy it? If it bought it, you keep it, but you cannot update it to new versions after support runs out. As far as monthly goes, you can rent it for 1 month if that is what you need.
    #15
    Murton Pike Systems
    Super Member
    • Total Posts : 139
    • Reward points : 0
    • Joined: 2020/09/10 02:13:01
    • Location: 0
    • Status: offline
    Re: XC8 the good, the bad, and the Ugly 2020/11/30 19:03:27 (permalink)
    0
    Just got caught out with XC8.
    It kept telling me I had a stack overrun.
    I could only see my software using 2 levels of the stack.
    Then, going through the code slowly, I spotted a multiply sign in a calculation.
    I then twigged that XC8 is calling a multiply routine and so using one of my stack levels.
     
    #16
    ric
    Super Member
    • Total Posts : 29435
    • Reward points : 0
    • Joined: 2003/11/07 12:41:26
    • Location: Australia, Melbourne
    • Status: online
    Re: XC8 the good, the bad, and the Ugly 2020/11/30 19:08:29 (permalink)
    +1 (3)
    That should be expected. Of course the underlying machine language requires some stack levels to be able to call functions.
    If you have interrupts enabled, that's an extra layer of stack usage that has to be allowed for too.
     

    #17
    NorthGuy
    Super Member
    • Total Posts : 6464
    • Reward points : 0
    • Joined: 2014/02/23 14:23:23
    • Location: Northern Canada
    • Status: offline
    Re: XC8 the good, the bad, and the Ugly 2020/11/30 20:23:01 (permalink)
    0
    ric
    If you have interrupts enabled, that's an extra layer of stack usage that has to be allowed for too.



    Also all the functions called from within an ISR add levels. And this is on top of the deepest level of the "main" code.
    #18
    NKurzman
    A Guy on the Net
    • Total Posts : 19106
    • Reward points : 0
    • Joined: 2008/01/16 19:33:48
    • Location: 0
    • Status: online
    Re: XC8 the good, the bad, and the Ugly 2020/11/30 20:24:58 (permalink)
    0
    On older PICs the debugger may also use a stack level.
    #19
    BroadwellConsultingInc
    Super Member
    • Total Posts : 97
    • Reward points : 0
    • Joined: 2020/06/09 06:07:55
    • Location: 0
    • Status: offline
    Re: XC8 the good, the bad, and the Ugly 2020/12/01 06:52:11 (permalink)
    +1 (1)
    And if you're compiling with procedural abstraction to save space that can add invisible calls as well.
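
Pulling the last few posts together, the stack budget is simple arithmetic, sketched here in C. Everything in it is hypothetical naming (`stack_budget_t`, `worst_case_levels`, `fits_hardware_stack`); the depth of the hardware stack is passed in because it depends on the device, and the per-item counts are things you would read off your own map file or call graph.

```c
#include <stdint.h>

/* Worst-case hardware stack usage, in return-address levels. */
typedef struct {
    uint8_t main_depth;      /* deepest call chain from main(), including hidden
                                compiler helpers such as a multiply routine or
                                procedural-abstraction calls */
    uint8_t isr_depth;       /* 0 with interrupts off; otherwise 1 for the
                                interrupt entry plus the deepest call chain
                                inside the ISR */
    uint8_t debugger_levels; /* 1 if the debugger reserves a level, else 0 */
} stack_budget_t;

/* The ISR can fire at the deepest point of main code, so the parts add up. */
uint8_t worst_case_levels(const stack_budget_t *b)
{
    return (uint8_t)(b->main_depth + b->isr_depth + b->debugger_levels);
}

/* Returns 1 if the worst case fits in a hardware stack of hw_levels entries. */
int fits_hardware_stack(const stack_budget_t *b, uint8_t hw_levels)
{
    return worst_case_levels(b) <= hw_levels;
}
```

For example, 2 visible levels plus a hidden multiply call, an ISR that calls one function, and a debugger level already add up to 6.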
    #20