Approaching the end of my rope with DMA+SPI

Author
Aiden.Morrison
Super Member
  • Total Posts : 729
  • Reward points : 0
  • Joined: 2005/02/25 11:18:31
  • Location: Canada
  • Status: offline
2011/04/15 15:46:58 (permalink)

Hi there,

I've tried to use the Microchip-provided SPI DMA example in the following situation:

I am streaming data to disk using FatFs.  Performance is great when that is all the chip is doing, but because the driver sends one byte at a time, any interrupt that fires stalls the transfer and degrades write throughput.  To minimize this I'm switching the block write operations to use DMA.

Here's the pertinent code:

volatile int DmaTxIntFlag; // flag used in interrupts, signal that DMA transfer ended
volatile int DmaRxIntFlag; // flag used in interrupts, signal that DMA transfer ended

/*-----------------------------------------------------------------------*/
/* Start a DMA Sector Transfer between host to card     */
/*-----------------------------------------------------------------------*/
BYTE sectorsendDMA(const BYTE *buff)
{
unsigned char resp;
int evFlags;
int done=0; 
//clear SPI TX empty int
INTClearFlag(INT_SOURCE_SPI(_SPI3A_TX_IRQ)); 
DmaChnOpen(DMA_CHANNEL1, DMA_CHN_PRI3, DMA_OPEN_DEFAULT);    // Tx 
DmaChnSetEventControl(DMA_CHANNEL1, DMA_EV_START_IRQ_EN|DMA_EV_START_IRQ(_SPI3A_TX_IRQ));
DmaChnSetTxfer(DMA_CHANNEL1,(unsigned char*)buff, (void*)&SPI3ABUF, 512, 1, 1); 
DmaChnSetEvEnableFlags(DMA_CHANNEL1, DMA_EV_BLOCK_DONE);

INTSetVectorPriority(INT_VECTOR_DMA(DMA_CHANNEL1), INT_PRIORITY_LEVEL_5); // set INT controller priority
INTSetVectorSubPriority(INT_VECTOR_DMA(DMA_CHANNEL1), INT_SUB_PRIORITY_LEVEL_3); // set INT controller sub-priority

INTEnable(INT_SOURCE_DMA(DMA_CHANNEL1), INT_ENABLED); // enable the chn interrupt in the INT controller
DmaTxIntFlag=0; // clear the interrupt flag we're  waiting on
//clear SPI TX empty int
DmaChnEnable(DMA_CHANNEL1);
DmaChnStartTxfer(DMA_CHANNEL1, DMA_WAIT_NOT, 0); 

while(!DmaTxIntFlag) {}
//DmaChnDisable(DMA_CHANNEL1);
//disable spi interrupt
INTEnable(INT_SOURCE_DMA(DMA_CHANNEL1), INT_DISABLED);
volatile unsigned char discard=SPI3ABUF;



}




// handler for the DMA channel 1 interrupt
void __ISR(_DMA1_VECTOR, ipl5) DmaHandler1(void)
{
int evFlags; // event flags when getting the interrupt
SPI3ASTATbits.SPIROV=0;
INTClearFlag(INT_SOURCE_DMA(DMA_CHANNEL1)); // acknowledge the INT controller, we're servicing int
evFlags=DmaChnGetEvFlags(DMA_CHANNEL1); // get the event flags
    if(evFlags&DMA_EV_BLOCK_DONE)
    { // just a sanity check. we enabled just the DMA_EV_BLOCK_DONE transfer done interrupt
     DmaTxIntFlag=1;
  DmaChnClrEvFlags(DMA_CHANNEL1, DMA_EV_BLOCK_DONE);
    }
}



If I don't clear the SPIROV bit in the DMA ISR, the SPI module simply locks up!  This is interesting, since the Microchip example code doesn't seem to concern itself with handling this.

Before and after this code there are other SPI transactions happening, but nothing interrupt driven.  Just simple put/take byte operations that either send a byte and discard the received one, or send 0xFF to receive a byte.

It seems to me that the DMA module isn't actually sending 512 bytes.

#1


    frostmeister
    Super Member
    • Total Posts : 769
    • Reward points : 0
    • Joined: 2006/12/03 10:20:52
    • Location: UK
    • Status: offline
    Re:Approaching the end of my rope with DMA+SPI 2011/04/15 15:56:49 (permalink)
    0

    This line:
     DmaChnSetTxfer(DMA_CHANNEL1,(unsigned char*)buff, (void*)&SPI3ABUF, 512, 1, 1); 



    Did you try 512, 1, 512? I posted my findings in your other post on this, but the last <1> is the cell size - that is, the number of bytes per DMA event, unless the source size is bigger, in which case it xfers <source size> no. of bytes. 


    'Scuse me if you've already tried this, but it's worth a shot... :)


    Edit: your other post /Edit
    post edited by frostmeister - 2011/04/15 15:59:17
    #2
    Aiden.Morrison
    Super Member
    • Total Posts : 729
    • Reward points : 0
    • Joined: 2005/02/25 11:18:31
    • Location: Canada
    • Status: offline
    Re:Approaching the end of my rope with DMA+SPI 2011/04/15 16:14:39 (permalink)
    0
    Hey Frost,

    I tried that for the heck of it, but sure enough it made things far worse ;)

    Cell size can be thought of as 'how many of these bytes do I send before I wait for the trigger event to poke me again?'.

    Since the trigger is 'the SPI3A transmitter is empty' and the buffer size of the SPI module is 1 byte (not in enhanced buffer mode), I have to stop after every byte to wait for the SPI transmitter to catch up.
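
To make the cell-size reasoning above concrete, here is a small host-side sketch that counts how many start events a block transfer needs for a given cell size. `dma_start_events` is a hypothetical helper written for illustration; it is not part of the Microchip peripheral library and it does not touch any hardware.

```c
#include <assert.h>
#include <stddef.h>

/* Number of start events (for example "SPI TX buffer empty" interrupts)
 * needed to move block_size bytes when each event moves cell_size bytes.
 * Hypothetical helper for illustration only. */
size_t dma_start_events(size_t block_size, size_t cell_size)
{
    assert(cell_size > 0);                            /* a zero cell would never finish */
    return (block_size + cell_size - 1) / cell_size;  /* ceiling division */
}
```

With a block of 512 and a cell of 1 this gives 512 events, one per byte, which matches a one-byte transmit buffer. With a cell of 512 it gives a single event, meaning all 512 bytes would be pushed at the SPI buffer on one trigger, which is consistent with that setting making things worse.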
    #3
    frostmeister
    Super Member
    • Total Posts : 769
    • Reward points : 0
    • Joined: 2006/12/03 10:20:52
    • Location: UK
    • Status: offline
    Re:Approaching the end of my rope with DMA+SPI 2011/04/15 16:30:53 (permalink)
    0
    ahhhh.... My bad. I didn't think of it like that - I've done CRC, but not SPI transfer yet. 

    Hmm - you're using the TX interrupt to trigger the DMA - What about using the RX interrupt instead? As far as I recall, the RX buffer has to be read every byte received, or it overflows, errors and stops the module. Worth a try maybe...

    Or if you're enabling and setting the SPI TX interrupt, is it serviced elsewhere if only to clear the flag? Does it need to be? Just another couple of thoughts - hope they're a bit more useful than my last one!
    #4
    Aiden.Morrison
    Super Member
    • Total Posts : 729
    • Reward points : 0
    • Joined: 2005/02/25 11:18:31
    • Location: Canada
    • Status: offline
    Re:Approaching the end of my rope with DMA+SPI 2011/04/15 17:14:49 (permalink)
    0
    Edit - I've figured out how to get this to work:

    volatile int DmaTxIntFlag; // flag used in interrupts, signal that DMA transfer ended
    volatile int DmaRxIntFlag; // flag used in interrupts, signal that DMA transfer ended
    volatile int DMAINTCOUNT=0;
    /*-----------------------------------------------------------------------*/
    /* Start a DMA Sector Transfer between host to card     */
    /*-----------------------------------------------------------------------*/
    BYTE sectorsendDMA(const BYTE *buff)
    {
    unsigned char resp;
    int evFlags;
    int done=0; 
    //clear SPI TX empty int
    INTClearFlag(INT_SOURCE_SPI(_SPI3A_TX_IRQ)); 
    DmaChnOpen(DMA_CHANNEL1, DMA_CHN_PRI3, DMA_OPEN_DEFAULT);    // Tx 
    DmaChnSetEventControl(DMA_CHANNEL1, DMA_EV_START_IRQ_EN|DMA_EV_START_IRQ(_SPI3A_TX_IRQ));
    DmaChnSetTxfer(DMA_CHANNEL1,(unsigned char*)buff, (void*)&SPI3ABUF, 512, 1, 1); 
    DmaChnSetEvEnableFlags(DMA_CHANNEL1, DMA_EV_BLOCK_DONE);

    INTSetVectorPriority(INT_VECTOR_DMA(DMA_CHANNEL1), INT_PRIORITY_LEVEL_5); // set INT controller priority
    INTSetVectorSubPriority(INT_VECTOR_DMA(DMA_CHANNEL1), INT_SUB_PRIORITY_LEVEL_3); // set INT controller sub-priority

    INTEnable(INT_SOURCE_DMA(DMA_CHANNEL1), INT_ENABLED); // enable the chn interrupt in the INT controller
    DmaTxIntFlag=0; // clear the interrupt flag we're  waiting on
    //clear SPI TX empty int
    //DmaChnEnable(DMA_CHANNEL1);
    DmaChnStartTxfer(DMA_CHANNEL1, DMA_WAIT_NOT, 0); 

    while(!DmaTxIntFlag) {}
    //DmaChnDisable(DMA_CHANNEL1);
    //disable spi interrupt
    INTEnable(INT_SOURCE_DMA(DMA_CHANNEL1), INT_DISABLED);
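    while(SPI3ASTATbits.SPIBUSY==1){} // wait for the last byte to finish shifting out
    volatile unsigned char discard=SPI3ABUF; // read and discard the last received byte
    (void)discard;
    return 0; // function is declared BYTE, so return a value (the caller shown in this thread ignores it)
    }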




    // handler for the DMA channel 1 interrupt
    void __ISR(_DMA1_VECTOR, ipl5) DmaHandler1(void)
    {
    int evFlags; // event flags when getting the interrupt
    SPI3ASTATbits.SPIROV=0;
    volatile unsigned char discard=SPI3ABUF;
    INTClearFlag(INT_SOURCE_DMA(DMA_CHANNEL1)); // acknowledge the INT controller, we're servicing int
    evFlags=DmaChnGetEvFlags(DMA_CHANNEL1); // get the event flags
        if(evFlags&DMA_EV_BLOCK_DONE)
        { // just a sanity check. we enabled just the DMA_EV_BLOCK_DONE transfer done interrupt
         DmaTxIntFlag=1;
      DmaChnClrEvFlags(DMA_CHANNEL1, DMA_EV_BLOCK_DONE);
        }
    DMAINTCOUNT++;
    }


    The DMAINTCOUNT variable was just me making sure that the interrupt was only called once per 512-byte block transfer.  The reason it wasn't working was that I had forgotten that the interrupt fires when the DMA finishes handing its last byte to the SPI module, not when the SPI module finishes shifting that byte out.  We have to wait for the SPI module to finish sending before reading the last byte out of it.  Interestingly, the overflow of the SPI receive buffer doesn't appear to bother the module when it's working in DMA mode like this (reception is unnecessary), though I do clear the overflow bit once at the end of the 511th transaction/start of the 512th.
    post edited by Aiden.Morrison - 2011/04/15 23:40:11
    #5
    bsder
    Senior Member
    • Total Posts : 137
    • Reward points : 0
    • Joined: 2008/10/23 21:17:18
    • Location: Southern California
    • Status: offline
    Re:Approaching the end of my rope with DMA+SPI 2011/04/17 22:37:18 (permalink)
    0
    You apparently figured it out, but one other thing to watch out for is that the 2/3/4 series has a 256 byte limit on DMA (I think) while the 5/6/7 series can go higher.
    #6
    Aiden.Morrison
    Super Member
    • Total Posts : 729
    • Reward points : 0
    • Joined: 2005/02/25 11:18:31
    • Location: Canada
    • Status: offline
    Re:Approaching the end of my rope with DMA+SPI 2011/04/18 09:15:06 (permalink)
    0
    I noticed the 256 byte limit on the 3/4 series early on - similar to the UART and SPI peripherals having less RX/TX buffer on the 3/4 series than on the 6/7.
     
    When I noticed this I found it amusing how much effort has gone into making helper functions that are compatible over the whole family, while these small but critical differences in peripherals remain that could cause some issues - though I suppose the problems would only be seen if one was trying to downsize the chip in use.
     
    I've attached an image of the file system benchmarks I've obtained so far across the 8 SD/SDHC cards I have available. Each of these cards has been benchmarked using a FAT32 format with default sector sizes (4kB-32kB) and with a block transfer size in my code of 4kB.  Note that higher performance can be obtained by writing larger blocks, but this gets expensive in terms of RAM usage pretty fast.
     
    1) Microchip Disk Drive library running on a pic32
    2) Fat-FS running on the same pic32 with an SPI clock of 6.67 MHz
    3) Fat-FS running on the same pic32 with an SPI clock of 20.0 MHz
    4) Fat-FS running on a '695 with an SPI of 22.5 MHz and DMA transfers of the 512 byte blocks 
     
    Vertical axis is in kB/sec of write throughput.
    post edited by Aiden.Morrison - 2011/04/18 09:16:18

    Attached Image(s)

    #7
    jnadelman
    Starting Member
    • Total Posts : 54
    • Reward points : 0
    • Joined: 2010/11/04 08:46:25
    • Location: 0
    • Status: offline
    Re:Approaching the end of my rope with DMA+SPI 2011/04/27 04:44:41 (permalink)
    0
    Isn't it necessary to send a dummy CRC even if CRC checking is OFF?
    #8
    Aiden.Morrison
    Super Member
    • Total Posts : 729
    • Reward points : 0
    • Joined: 2005/02/25 11:18:31
    • Location: Canada
    • Status: offline
    Re:Approaching the end of my rope with DMA+SPI 2011/04/27 06:21:55 (permalink)
    0
    Yes
    #9
    jnadelman
    Starting Member
    • Total Posts : 54
    • Reward points : 0
    • Joined: 2010/11/04 08:46:25
    • Location: 0
    • Status: offline
    Re:Approaching the end of my rope with DMA+SPI 2011/04/27 07:08:49 (permalink)
    0
    I'm having trouble seeing how to integrate FatFs with the working DMA solution you posted below. Where is the microchip provided SPI DMA example you started with?
    Aiden.Morrison... I've tried to use the microchip provided SPI DMA example in the following situation ...


    #10
    Aiden.Morrison
    Super Member
    • Total Posts : 729
    • Reward points : 0
    • Joined: 2005/02/25 11:18:31
    • Location: Canada
    • Status: offline
    Re:Approaching the end of my rope with DMA+SPI 2011/04/27 08:46:55 (permalink)
    0
    It should be hiding somewhere under your C32 folder, I think in the plib examples area.
    #11
    jnadelman
    Starting Member
    • Total Posts : 54
    • Reward points : 0
    • Joined: 2010/11/04 08:46:25
    • Location: 0
    • Status: offline
    Re:Approaching the end of my rope with DMA+SPI 2011/04/27 11:11:28 (permalink)
    0
    Works like a champ. I'm getting 1.961 MB/s using a Patriot 8 GB Class 10 MicroSDHC. Thanks a million!
    post edited by jnadelman - 2011/04/27 11:22:10
    #12
    rce
    Junior Member
    • Total Posts : 111
    • Reward points : 0
    • Joined: 2006/08/08 13:54:28
    • Location: 0
    • Status: offline
    Re:Approaching the end of my rope with DMA+SPI 2011/04/27 15:11:52 (permalink)
    0
    Hi jnadelman,
    Can you post your test project for the FATFS modification with DMA.


    Regards,
    Gerrit
    #13
    yuantuh
    Super Member
    • Total Posts : 204
    • Reward points : 0
    • Joined: 2009/01/30 04:39:57
    • Location: 0
    • Status: offline
    Re:Approaching the end of my rope with DMA+SPI 2011/04/27 17:08:37 (permalink)
    0
    I doubt that a speed of >15Mb/s is possible for any Class 10 MicroSDHC over SPI at 22.5MHz using DMA transfers of 512-byte blocks. Maybe the measurement leaves out the time spent sending the write commands, or the stopwatch may be running slow.
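
For reference, the raw ceiling of the bus can be worked out directly. The following is a minimal sketch, assuming one data bit per SCK cycle and 1 MB = 10^6 bytes (the thread does not say which convention the benchmark uses). Both function names are hypothetical.

```c
#include <assert.h>

/* Upper bound on the SPI payload rate: one bit per clock, 8 bits per byte.
 * Ignores command, token, CRC and card-busy overhead. */
unsigned long spi_max_bytes_per_sec(unsigned long sck_hz)
{
    return sck_hz / 8UL;
}

/* Bus utilisation, in whole percent, implied by a measured byte rate.
 * Assumes measured_bytes_per_sec * 100 fits in an unsigned long. */
unsigned long spi_utilisation_percent(unsigned long measured_bytes_per_sec,
                                      unsigned long sck_hz)
{
    unsigned long max = spi_max_bytes_per_sec(sck_hz);
    assert(max > 0);
    return (measured_bytes_per_sec * 100UL) / max;
}
```

With `sck_hz` = 22500000 the ceiling is 2812500 bytes/s (22.5 Mbit/s), so the 1.961 MB/s figure reported earlier in the thread would correspond to roughly 69% bus utilisation under these assumptions, leaving the remainder for commands, tokens, CRC bytes and card busy time.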
    #14
    Aiden.Morrison
    Super Member
    • Total Posts : 729
    • Reward points : 0
    • Joined: 2005/02/25 11:18:31
    • Location: Canada
    • Status: offline
    Re:Approaching the end of my rope with DMA+SPI 2011/04/28 01:46:20 (permalink)
    0
    Hi Yuantuh, I should clarify what I mean by block size.

    In the code I posted it's using a 4096 byte write size, but the individual write commands (multi-block write) are issued on 512 byte blocks that are convenient to use with DMA.

    I think the biggest limitation on speed now is the SPI limit of ~20 MHz.  Since SD cards support up to 50 MHz SPI now, I hope the next iteration of the pic32 will support a ~40 MHz SPI bus within spec.

    To get much faster we would need the 4 bit SD interface peripheral which seems too specialized for anything but multimedia processors.
    #15
    yuantuh
    Super Member
    • Total Posts : 204
    • Reward points : 0
    • Joined: 2009/01/30 04:39:57
    • Location: 0
    • Status: offline
    Re:Approaching the end of my rope with DMA+SPI 2011/04/28 03:39:58 (permalink)
    0
    Hi Aiden,

    Besides each 512-byte block DMA write, the extra polled traffic (at least 3 bytes read and 3 bytes written) takes considerable time. As you said, there is a write command for each 4096-byte block transaction, and that also takes considerable time. You may want to double-check your stopwatch implementation and confirm the timing for writing 4MB of data, i.e. the same 4KB of data written in a loop 1000 times. You then need to account for the time of 1000 write commands plus 8000 DMA 4KB block transactions plus the program instructions executed.

    By the way, the following code could be called just once during initialisation somewhere else.
    INTEnable(INT_SOURCE_DMA(DMA_CHANNEL1), INT_DISABLED);
    DmaChnOpen(DMA_CHANNEL1, DMA_CHN_PRI3, DMA_OPEN_DEFAULT);    // Tx
    DmaChnSetEventControl(DMA_CHANNEL1, DMA_EV_START_IRQ_EN|DMA_EV_START_IRQ(_SPI3A_TX_IRQ));
    DmaChnSetEvEnableFlags(DMA_CHANNEL1, DMA_EV_BLOCK_DONE);
    INTSetVectorPriority(INT_VECTOR_DMA(DMA_CHANNEL1), INT_PRIORITY_LEVEL_5); // set INT controller priority
    INTSetVectorSubPriority(INT_VECTOR_DMA(DMA_CHANNEL1), INT_SUB_PRIORITY_LEVEL_3); // set INT controller sub-priority

    #16
    yuantuh
    Super Member
    • Total Posts : 204
    • Reward points : 0
    • Joined: 2009/01/30 04:39:57
    • Location: 0
    • Status: offline
    Re:Approaching the end of my rope with DMA+SPI 2011/04/28 03:45:18 (permalink)
    0
    Sorry, I made a mistake: "8000 DMA 4KB block transactions" should be "8000 DMA 512-byte block transactions".
    #17
    jnadelman
    Starting Member
    • Total Posts : 54
    • Reward points : 0
    • Joined: 2010/11/04 08:46:25
    • Location: 0
    • Status: offline
    Re:Approaching the end of my rope with DMA+SPI 2011/04/28 05:01:57 (permalink)
    0
    rce Can you post your test project for the FATFS modification with DMA.


    Aiden.Morrison already posted a FatFS implementation for PIC32 project (without DMA) here and a working sectorsendDMA here. Basically all I did was modify xmit_datablock to work with or without DMA like this:
    static
    int xmit_datablock (    /* 1:OK, 0:Failed */
        const BYTE *buff,   /* _MAX_SS byte data block to be transmitted */
        BYTE token          /* Data token */
    )
    {
        BYTE resp;
        UINT bc = _MAX_SS;
        if (wait_ready() != 0xFF) return 0;
        xmit_spi(token);        /* Xmit a token */
        if (token != 0xFD) {    /* Not StopTran token */
            extern BOOL useDma;
            if (useDma) sectorsendDMA(buff);
            else
            {
                do { /* Xmit the _MAX_SS byte data block to the MMC */
                    xmit_spi(*buff++);
                    xmit_spi(*buff++);
                } while (bc -= 2);
            }
            xmit_spi(0xFF);             /* CRC (Dummy) */
            xmit_spi(0xFF);
            resp = rcvr_spi();          /* Receive a data response */
            if ((resp & 0x1F) != 0x05)  /* If not accepted, return with error */
                return 0;
        }
        return 1;
    }
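
The `(resp & 0x1F) != 0x05` check above treats every rejected block the same way. As a sketch only, the data response token that an SD card returns in SPI mode can be decoded into its status codes (0x05 accepted, 0x0B CRC error, 0x0D write error, after masking with 0x1F). The enum and function names here are hypothetical and are not part of FatFs.

```c
/* Status carried in the SD SPI-mode data response token (low 5 bits). */
typedef enum {
    SD_DATA_ACCEPTED,     /* 0x05: data accepted */
    SD_DATA_CRC_ERROR,    /* 0x0B: data rejected due to a CRC error */
    SD_DATA_WRITE_ERROR,  /* 0x0D: data rejected due to a write error */
    SD_DATA_UNKNOWN       /* anything else, e.g. 0xFF when no token arrived */
} sd_data_resp_t;

/* Decode the byte returned by the card after a data block and its CRC.
 * The upper three bits are undefined, so they are masked off first. */
sd_data_resp_t sd_decode_data_response(unsigned char resp)
{
    switch (resp & 0x1F) {
    case 0x05: return SD_DATA_ACCEPTED;
    case 0x0B: return SD_DATA_CRC_ERROR;
    case 0x0D: return SD_DATA_WRITE_ERROR;
    default:   return SD_DATA_UNKNOWN;
    }
}
```

Logging the decoded status when a write fails may help tell a card-side write error apart from a transfer that was cut short on the host side, such as a DMA block that did not send all 512 bytes.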

    #18
    Aiden.Morrison
    Super Member
    • Total Posts : 729
    • Reward points : 0
    • Joined: 2005/02/25 11:18:31
    • Location: Canada
    • Status: offline
    Re:Approaching the end of my rope with DMA+SPI 2011/04/28 07:59:25 (permalink)
    0
     You may double check your stopwatch implementation and confirm time statistics for writing 4MB data, same 4KB data loop 1000 times. 



    Oh I use a very simple stopwatch implementation - I write down the time when I start the program running, and then write down the time it turns on the 'done' LED :)


    Not the most refined or precise method, but pretty close.
    #19
    jonm
    Senior Member
    • Total Posts : 119
    • Reward points : 0
    • Joined: 2010/06/23 15:51:53
    • Location: 0
    • Status: offline
    Re:Approaching the end of my rope with DMA+SPI 2011/06/21 15:29:16 (permalink)
    0
    This is an awesome thread, I do not know how I missed it.

    jnadelman - I have also thought about doing this the way you have posted in your code, but how would you know that the DMA transfer is complete before transmitting the CRC?  The same question goes for receiving the response as well.

    *edit* I see that Aiden.Morrison's DMA Interrupts block while waiting for transfer completion.  This would explain my above question, but I wonder if this could be achieved without blocking.

    I may try ping-ponging the tx/rx with DMA ISR's to see how that works.
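
One way the non-blocking idea might be structured is as a small state machine that the main loop polls instead of spinning on the flag. This is a hardware-free sketch, not tested on a PIC32. All names are hypothetical, and the caller is assumed to start the DMA itself, call `tx_on_block_done` from the DMA block-done ISR, and pass in the current SPI busy status (the SPIBUSY bit used in the code earlier in the thread).

```c
#include <assert.h>
#include <stdbool.h>

/* Phases of one sector transmission. */
typedef enum { TX_IDLE, TX_DMA_ACTIVE, TX_DRAINING, TX_DONE } tx_state_t;

typedef struct {
    volatile bool block_done;  /* set from the DMA block-done ISR */
    tx_state_t state;
} tx_ctx_t;

/* Mark a transfer as started; the caller starts the DMA right after this. */
void tx_begin(tx_ctx_t *c)
{
    assert(c->state == TX_IDLE || c->state == TX_DONE);
    c->block_done = false;
    c->state = TX_DMA_ACTIVE;
}

/* Call from the DMA ISR when the block-done event is seen. */
void tx_on_block_done(tx_ctx_t *c)
{
    c->block_done = true;
}

/* Call from the main loop. Returns true once the DMA has finished AND the
 * SPI shifter has gone idle, i.e. when it is safe to discard the last
 * received byte and send the CRC bytes. */
bool tx_poll(tx_ctx_t *c, bool spi_busy)
{
    if (c->state == TX_DMA_ACTIVE && c->block_done)
        c->state = TX_DRAINING;   /* DMA done, last byte may still be shifting */
    if (c->state == TX_DRAINING && !spi_busy)
        c->state = TX_DONE;
    return c->state == TX_DONE;
}
```

The separate draining state reflects the point made earlier in the thread: the block-done interrupt arrives when the last byte has been handed to the SPI module, so the CRC and response handling should wait until the module is no longer busy.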
    post edited by jonm - 2011/06/21 15:36:00
    #20