• AVR Freaks

PIC32MZ running very slow for no good reason

Author
vexorg
New Member
  • Total Posts : 16
  • Reward points : 0
  • Joined: 2019/09/27 10:59:40
  • Location: 0
  • Status: offline
2019/12/10 10:22:01 (permalink)

PIC32MZ running very slow for no good reason

I'm using a PIC32MZ core running at 45MHz, PB at 22.5MHz on a board to manage a number of bits of hardware and be a central firmware hub.
 
One of the devices connected to it is an SD card on the SPI port. Reading and writing data to the card is extremely slow. I know it's a serial interface and not the fastest. The SPI clock is running at 11MHz. It's a FAT32 card, using the MLA code condensed into one file tailored to our board.

Functionally it works fine: it will read and write files as expected, no compiler warnings, all good, other than being extremely slow. By slow I mean that one 256-byte file and one 2k file take a total of 0.7 seconds to write.

I put a scope on to see what was happening, as the base transfer speed should be over 1Mb/sec. And it is, for one byte. Then it hangs around for a significant amount of time before sending the next byte. It's a very basic loop in the code, condensed down to this for reading blocks of data:
 
    while (count--)
    {
        SPI2BUF = 0xFF;
        while (!(SPI2STAT&0x01));   // 0x01= buffer full (1 byte)
        *data++ = SPI2BUF;
    }


I've optimised the original code to get the fewest lines for the multiple reads. On the disassembly listing, there are about 20 assembly commands, and this takes around 12us. I've played around with the code and find that each assembly instruction takes between 0.5 and 1us to execute. Even a simple i/o line toggle using LATxbits.LATx takes 4 assembly commands per line, and the fastest it will toggle is 3us!!
 
    LATEbits.LATE2=1;
    LATEbits.LATE2=0;
 
I am new to the PIC32 and mplab C compilers, but I have had many years of C programming on the PC, and using quite a number of PICs, such as 16C, 17C and 18F families all on the older mplab assembler versions. I could get these older PICs to perform faster and they were on much (much!) slower clocks. The PIC32 system clock must be right as the SPI is derived from that and it measures correct on the scope.
 
What am I missing here? It seems to be about a factor of 25 to 50 slower than it should be.
Or is it a case of having to buy MPLAB Pro for the full speed?
I noticed there are quite a few random NOPs in the assembly listings.
 
If I bump the clock up to 180MHz then all these times are divided by 4, so it is purely the execution speed, and not affected by any hardware or peripherals.
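
As a rough sanity check on the figures in this post, the sketch below converts a measured duration into CPU clock cycles per instruction. The function name is made up for illustration; the inputs are the numbers quoted above (about 20 instructions in about 12us at a nominal 45MHz).

```c
#include <assert.h>

/* Hypothetical helper: average CPU clock cycles spent per instruction,
 * given a measured duration and the nominal core clock. */
double cycles_per_instruction(double elapsed_us,
                              unsigned instruction_count,
                              double cpu_mhz)
{
    assert(instruction_count > 0);
    /* microseconds * MHz = clock cycles */
    return (elapsed_us * cpu_mhz) / instruction_count;
}
```

With the values from the post, `cycles_per_instruction(12.0, 20, 45.0)` comes out at 27 cycles per instruction, which is in line with the "factor of 25 to 50" estimate.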
#1

16 Replies

    NorthGuy
    Super Member
    • Total Posts : 5876
    • Reward points : 0
    • Joined: 2014/02/23 14:23:23
    • Location: Northern Canada
    • Status: offline
    Re: PIC32MZ running very slow for no good reason 2019/12/10 11:16:37 (permalink)
    The code you posted waits for SPI2STAT. At 45 MHz you should have about 40 instruction cycles per SPI transfer (assuming it's all cached), which should be plenty.
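
The cycle budget mentioned here can be estimated with a short calculation. This is only a sketch: the function name is hypothetical, and the clocks are the ones quoted in the thread (45 MHz core, SPI clock at half of a 22.5 MHz peripheral bus).

```c
#include <assert.h>

/* CPU cycles that elapse while the SPI shifts one frame in or out. */
unsigned cpu_cycles_per_spi_frame(unsigned long cpu_hz,
                                  unsigned long sck_hz,
                                  unsigned bits_per_frame)
{
    assert(sck_hz > 0);
    /* frame time = bits / sck_hz; cycles = frame time * cpu_hz */
    return (unsigned)(((unsigned long long)cpu_hz * bits_per_frame) / sck_hz);
}
```

For `cpu_cycles_per_spi_frame(45000000UL, 11250000UL, 8)` the result is 32 cycles per byte, the same ballpark as the "about 40" figure.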
     
    Perhaps you have an interrupt handler which executes frequently, slowing down your code.
     
    The toggle rate for pins will depend on the PB clock for the pins. If you run the corresponding PB clock faster, it'll toggle faster. Although 300kHz is too slow by any count.
     
    Why don't you use DMA?
     
     
    #2
    vexorg
    New Member
    Re: PIC32MZ running very slow for no good reason 2019/12/10 11:52:26 (permalink)
    The wait for SPI2STAT is not the issue; the times I posted are outside of the SPI transmission, in that I'm measuring the time in between the SPI clock bursts. I was expecting the SPI transmission to be the slow part, not the loop outside it.
     
    Even removing the while (!(SPI2STAT&0x01)) and the loop, the delay between transmissions was 7us. If I take away the "++" it is 5us, and if I replace the pointer with a local variable it is 3us (back to the old near and far days!). These correspond to 7, 5 and 3 instructions respectively:
     
    7us between clock bursts, code would have been:
     
    SPI2BUF = 0xFF;
    *data++ = SPI2BUF;
    SPI2BUF = 0xFF;
    *data++ = SPI2BUF;
    SPI2BUF = 0xFF;
    *data++ = SPI2BUF;
     
    and 5us between clock bursts:
     
    SPI2BUF = 0xFF;
    *data = SPI2BUF;
    SPI2BUF = 0xFF;
    *data = SPI2BUF;
    SPI2BUF = 0xFF;
    *data = SPI2BUF;

    and 3us between clock bursts:
     
    SPI2BUF = 0xFF;
    local_data = SPI2BUF;
    SPI2BUF = 0xFF;
    local_data = SPI2BUF;
    SPI2BUF = 0xFF;
    local_data = SPI2BUF; 
     
    I did it 3 times to see the bursts on the scope without the loop. This was just a test; I know it should check that the SPI is clear before loading the next value.
    #3
    NorthGuy
    Super Member
    Re: PIC32MZ running very slow for no good reason 2019/12/10 12:07:39 (permalink)
    So, you're saying that an instruction takes roughly 500 ns to 1 us to execute?
     
    You probably have your PRECON setting wrong. For 45 MHz you should have 0 wait states, and pre-fetcher must be enabled. But even then, the speed you're observing is way too slow. How do you know that the instruction clock is 45 MHz?
    #4
    vexorg
    New Member
    Re: PIC32MZ running very slow for no good reason 2019/12/10 12:10:13 (permalink)
    Here's the SCK2 line for this short loop:
     
    while (count--)
    {
        SPI2BUF = 0xFF;
        while (!(SPI2STAT&0x01));   // 0x01= buffer full (1 byte)
        *data++ = SPI2BUF;
    }
     
    About 11us to loop around again, still better than the original code that had another function in there to get a byte with checks on the SPI (DRV_SPI_Get was the original function call within DRV_SPIGetbuffer), that took over 50us to loop!
     
     

    [Attachment: scope capture of the SCK2 line (not available)]
    #5
    vexorg
    New Member
    Re: PIC32MZ running very slow for no good reason 2019/12/10 12:14:05 (permalink)
    PB2 is set to divide the sys clock by 2, and the SPI baud rate is half of that (the maximum), verified by the scope showing about 700ns for the burst of 8 clocks on the SPI. That is approximately an 11 MHz SPI clock.
    #6
    vexorg
    New Member
    Re: PIC32MZ running very slow for no good reason 2019/12/10 12:20:23 (permalink)
    In the example I had taken this from, the precon settings are:
     
    PRECONbits.PREFEN =0x3;
    PRECONbits.PFMWS =0x7;
     
    I'll have a dig around and see what they should be; it sounds like they are wrong.
    #7
    NorthGuy
    Super Member
    Re: PIC32MZ running very slow for no good reason 2019/12/10 12:32:20 (permalink)
    +1 (1)
    Set PFMWS to 0.
     
    I would set PREFEN to 1, but 3 is just as good.
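
A small helper can make the choice of wait states explicit rather than inheriting a value such as 7 from example code. This is a minimal sketch: the function name is hypothetical, and the frequency thresholds are assumed example values for a PIC32MZ EF with flash ECC disabled, so they should be confirmed against the electrical characteristics in the datasheet for the actual part.

```c
#include <assert.h>

/* Flash wait states for a given SYSCLK.
 * The thresholds are ASSUMED example values (PIC32MZ EF, ECC disabled);
 * confirm them in the datasheet for the actual device. */
unsigned flash_wait_states(unsigned long sysclk_hz)
{
    assert(sysclk_hz <= 200000000UL);   /* assumed maximum SYSCLK */
    if (sysclk_hz <= 74000000UL)  return 0;
    if (sysclk_hz <= 140000000UL) return 1;
    return 2;
}

/* On the target (not compiled here) the result would be applied as:
 *     PRECONbits.PFMWS  = flash_wait_states(sysclk_hz);
 *     PRECONbits.PREFEN = 1;
 */
```

Under these assumptions a 45 MHz system clock needs 0 wait states, matching the advice above, while a 180 MHz clock would need 2.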
    #8
    vexorg
    New Member
    Re: PIC32MZ running very slow for no good reason 2019/12/10 14:07:38 (permalink)
    Ahh, thanks for the pointer, that old value won't have helped. I'm adding this SD card stuff into someone else's program, so had assumed it was already all set up. I'll give it a test tomorrow.
     
    This PIC32 stuff is somewhat more complex than the 8 and 16 bit PICs of old. I had assumed the PIC could read the program or data at sys clock speed at all times.
     
    What impact do the prefetch settings of 1, 2 and 3 have on performance and power usage?
    The datasheets are a bit light on real-world effects; they only say what the settings do functionally.
    #9
    Mysil
    Super Member
    • Total Posts : 3559
    • Reward points : 0
    • Joined: 2012/07/01 04:19:50
    • Location: Norway
    • Status: offline
    Re: PIC32MZ running very slow for no good reason 2019/12/10 14:53:52 (permalink)
    +1 (1)
    Hi,
    When CPU clock frequency is so low,
    there is no reason to slow down the PB dividers.
    You may set the PB2 divider to divide 1/1, at least for the peripheral bus that connects to the SPI.
     
    PIC32 is quite different from PIC16 and PIC18F families that you know.
    PIC16 and PIC18 microcontrollers are able to Read a SFR register, or a memory location, do something with it, and store the result back in the same register, all in a single instruction. But that instruction takes 4 clock cycles to complete.
    PIC32 microcontrollers are the opposite way around:
    The CPU is able to perform 1 instruction every clock cycle, but the CPU is pipelined, so the instruction is not completed until 5 clock cycles have passed. Also, the result from one instruction is Not available as input for the next instruction. At least one more clock cycle must pass before the result is usable.
    This will not cause a problem, but it will slow down the processor.
     
    Also, the MIPS CPU in a PIC32 is a Load/Store  architecture,
    where all operations accessing Memory or SFR registers are done by separate instructions,
    So to read something, do something with it, and store it in SFR register, there have to be at least 3 separate instructions, and these cannot follow immediately after each other.
    It will not cause a problem, and compiler will try to stagger instructions and interleave with other instructions if it can. But it may contribute to make I/O operations seem slower.
     
    About the SPI peripheral in PIC32:
    In the SPI hardware, there are FIFO buffers, 16 bytes each for Input and Output,
    so you can safely load up to  16 bytes as fast as you can, before checking progress of the actual transfer.
    There are counters telling how many bytes are ready to read, and number of bytes still waiting to transmit.
    You may then start Reading bytes, and fill up the empties with additional bytes to Transmit.
    You will have to use ENHBUF = 1; and use the FIFO buffers, to be able to keep the SPI hardware busy.
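
The fill-and-drain pattern described above can be sketched on a host machine. Everything below is illustrative: `spi_model_t`, `spi_write`, `spi_read` and `spi_rx_count` are hypothetical stand-ins for the peripheral, and the model delivers a reply byte instantly, whereas real hardware takes time to shift each byte. On the target, the counts would come from the buffer element counters in the SPI status register (check the datasheet for the exact field names).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define FIFO_DEPTH 16u   /* bytes per direction, as described above */

/* Host-side stand-in for the SPI peripheral (hypothetical).
 * Each byte written immediately places one reply byte in the RX FIFO. */
typedef struct {
    uint8_t  rx[FIFO_DEPTH];
    unsigned rx_head, rx_count;
    uint8_t  next_reply;             /* simulated card data: 0, 1, 2, ... */
} spi_model_t;

static unsigned spi_rx_count(const spi_model_t *s) { return s->rx_count; }

static void spi_write(spi_model_t *s, uint8_t tx)
{
    (void)tx;                                    /* dummy byte, value unused */
    assert(s->rx_count < FIFO_DEPTH);            /* would be an RX overrun */
    s->rx[(s->rx_head + s->rx_count) % FIFO_DEPTH] = s->next_reply++;
    s->rx_count++;
}

static uint8_t spi_read(spi_model_t *s)
{
    assert(s->rx_count > 0);
    uint8_t b = s->rx[s->rx_head];
    s->rx_head = (s->rx_head + 1) % FIFO_DEPTH;
    s->rx_count--;
    return b;
}

/* Read `count` bytes while keeping up to FIFO_DEPTH transfers in flight. */
void spi_read_block(spi_model_t *s, uint8_t *data, size_t count)
{
    size_t sent = 0, received = 0;
    while (received < count) {
        /* Queue dummy bytes, never allowing more than FIFO_DEPTH unread. */
        while (sent < count && (sent - received) < FIFO_DEPTH) {
            spi_write(s, 0xFF);
            sent++;
        }
        /* Drain whatever has arrived so far. */
        while (spi_rx_count(s) > 0) {
            data[received++] = spi_read(s);
        }
    }
}
```

The key design point is the `(sent - received) < FIFO_DEPTH` guard: it caps the number of outstanding bytes so the receive FIFO cannot overrun, while still letting the transmitter run ahead of the reader.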
     
    PIC32MZ has a complicated Bus interconnect with bridges that buffer and transfer data between the main Bus and
    Peripheral Buses. 
    There are some experiences that indicate that transfer to and from SFR registers in PIC32MZ devices,
    are slower than similar transfer in PIC32MX devices. Even when the MZ CPU is running twice as fast. 
     
        Mysil
    #10
    vexorg
    New Member
    Re: PIC32MZ running very slow for no good reason 2019/12/11 02:23:09 (permalink)
    Thanks, I'm finding out very quickly that the PIC32 is nothing like the classic RISC processors, more like a Pentium. A bit of a shame really, as with the "PIC" name I would have thought it would be the same. You knew exactly where you were with the older RISC type from clock cycle to clock cycle; now the PIC32 has a random factor in when you get the result (maybe not random, but a variable factor that is difficult to predict).

    I had disabled the enhanced buffer, partly assuming the PIC would be so fast that it was almost redundant. I also had some issues with the SD card not behaving, so wanted total control over the bytes in and out while debugging it. I should re-enable it now that I've got over the issues and know how the PIC32 runs at a lower level.
    #11
    vexorg
    New Member
    Re: PIC32MZ running very slow for no good reason 2019/12/11 04:22:48 (permalink)
    Found it, someone had set PB7 to divide by 10!!
     
    That's a sneaky one; it's not very clear how critical PB7 is to the running of the PIC32. PB7 makes it sound unimportant, and it's not even shown on the overview diagram.
    #12
    NKurzman
    A Guy on the Net
    • Total Posts : 18162
    • Reward points : 0
    • Joined: 2008/01/16 19:33:48
    • Location: 0
    • Status: online
    Re: PIC32MZ running very slow for no good reason 2019/12/11 08:34:34 (permalink)
    Speed comes at a Price. If you want cycle count accuracy then a PIC32 (or even an ARM) may not be what you need.
    #13
    vexorg
    New Member
    Re: PIC32MZ running very slow for no good reason 2019/12/11 09:18:22 (permalink)
    NKurzman wrote:
        Speed comes at a Price.


    Bad pun, they are pretty cheap, though we do have an FPGA on a daughterboard for the really critical stuff :D
    #14
    Mysil
    Super Member
    Re: PIC32MZ running very slow for no good reason 2019/12/11 11:07:43 (permalink)
    Well,
    RISC, Reduced Instruction Set Computer (CPU)
    is a term that was invented and used when MIPS32 and similar CPU architectures were designed,
    and the MIPS32 that is used in PIC32, is one of the most typical RISC processors that exist.
     
    Forerunners of the PIC16 were designed even before the RISC term was invented,
    and although there may be some similarities, there are much more differences from RISC principles.
     
    A CPU that need 4 clock cycles to perform one instruction at a time,
    and is designed to read a word from memory, perform a logic or arithmetic operation,
    and store the result back in memory or SFR register, in a single instruction,
    is Not a RISC processor in my understanding.
     
    Microchip is trying to twist the RISC term into meaning a computer with a small number of instructions,
    that is reasonably easy to understand and use in assembly programming,
    and because it is a popular term to use in marketing.
     
    The original priorities of RISC Instruction set and Processor design were quite different,
    with emphasis on creating hardware that could run with high clock frequency,
    even if it would require an optimizing compiler for the intended use.
     
    In some ways, a PIC10, PIC12, PIC16 or PIC18 may be regarded as just one large CPU,
    with all RAM memory and SFR control registers being registers in the CPU.
    This is different from a CPU with interface to external memory.
     
    The ability to increment a SFR register or a memory location in a single instruction that cannot be interrupted,
    is a very convenient feature in a PIC16 or PIC18 microcontroller,
    and single instruction access to SFR registers is valuable in dealing with peripherals.
    But this doesn't scale too well into larger memory and higher clock frequency.
    In the latest PIC18FxxK42  there is extra trickery used to reach beyond 64 kByte program memory
    and beyond 4 kByte RAM and SFR memory. But this is more a limitation of short instruction word size.
     
    PIC24 and dsPIC devices use very much the same design principles as 8 bit PIC microcontrollers.
    There is wider data bus at 16 bits, and 24 bit instruction word size, 16 accumulator registers,
    and more computing hardware in the CPU,
    with hardware to perform multiplication and division, plus signal processing hardware in the ds models.
    Also there are more Interrupt priority levels than in PIC18, and a lot of fancy peripherals in some models.
    There are still 4 steps in execution of each instruction, just that there is a DDR clock system in use,
    using both Low and High half period of each clock cycle for doing something different.
    It is in my opinion impressive what PIC24 and dsPIC33 devices can do at 70 Million instructions/second to
    90 and 100 Million Instructions/second, with clock frequencies 140 MHz up to 200 MHz.
     
    There is then extra trickery needed when reaching beyond 32 kByte or 64 kByte memory.
     
    AVR microcontrollers, which are now Microchip products, were designed as RISC processors.
     
    Regards,
        Mysil
     
     
    post edited by Mysil - 2019/12/11 11:30:13
    #15
    jdeguire
    Super Member
    • Total Posts : 488
    • Reward points : 0
    • Joined: 2012/01/13 07:48:44
    • Location: United States
    • Status: offline
    Re: PIC32MZ running very slow for no good reason 2019/12/11 11:15:13 (permalink)
    vexorg wrote:
        Found it, someone had set PB7 to divide by 10!!

        That's a sneaky one; it's not very clear how critical PB7 is to the running of the PIC32. PB7 makes it sound unimportant, and it's not even shown on the overview diagram.


    Peripheral clock 7 runs the CPU (and deadman timer) according to Table 8-1 in the PIC32MZ EF datasheet, so that probably was a big issue for you.
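
To put a number on the effect of that divider, the sketch below computes the clock that results from a peripheral bus divider setting. The function name is hypothetical, and the field encoding (value = divisor minus one, 7 bits wide) is an assumption that should be verified against the PBxDIV register description in the datasheet.

```c
#include <assert.h>

/* Clock after a peripheral-bus divider.
 * ASSUMPTION: the divider field encodes (divisor - 1), so 0 = divide by 1
 * and 9 = divide by 10; verify against the PBxDIV register description. */
unsigned long pbclk_hz(unsigned long sysclk_hz, unsigned pbdiv_field)
{
    assert(pbdiv_field <= 127u);   /* assumed 7-bit field */
    return sysclk_hz / (pbdiv_field + 1u);
}
```

With a 45 MHz system clock and a divide by 10, `pbclk_hz(45000000UL, 9)` gives 4.5 MHz, or roughly 222 ns per CPU cycle. A few cycles per instruction at that rate would go a long way toward explaining the 0.5 to 1 us per instruction reported earlier in the thread.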
    #16
    NKurzman
    A Guy on the Net
    Re: PIC32MZ running very slow for no good reason 2019/12/11 11:26:18 (permalink)
    vexorg wrote:
        NKurzman wrote:
            Speed comes at a Price.

        Bad pun, they are pretty cheap, though we do have an FPGA on a daughterboard for the really critical stuff :D

    I wasn't talking about money. The price is predictability. Note that DMA steals bus cycles, so it also affects timing.
    #17