Can DMA to PMP be jitter free and fast?

Author
bodyjarrocks
New Member
  • Total Posts : 12
  • Reward points : 0
  • Joined: 2014/12/07 08:22:39
  • Location: 0
  • Status: offline
2014/12/07 08:38:39 (permalink)
0

Hi

I'm using a UBW32 and have built a small board with an AD724 PAL encoder and an R2R DAC. I'm trying to get 416x234 video out of the PIC32 and through this DAC. I have it working by bit-banging the video out through PORTE in assembler. It is not perfect: cycle-accurate work on the PIC32 is very difficult because of the cache misses that occur. Pinning code in the cache is possibly a solution, but the cache is tiny. My video is 8-bit RRGGBBII in a framebuffer in RAM.

OK, can I use DMA instead? So far, with a DMA cell size of one, I get a nice regular (low-jitter) stream at up to 3.7 MHz. That is not fast enough for a line of 416 pixels in 52 us (PAL). Someone else found this limit too:

(I can't insert an HTTP link, so just google "hackaday pic32 performance")

Alternatively, using one DMA cell of 416 bytes is much faster, and I can use PMP wait states to hit the 10 cycles per pixel I need, but the jitter is all over the place. There are regular pauses in the DMA transfer, which translate into loss of video sync. I assume the CPU is grabbing the bus.
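
As a rough check of the numbers in this post, the sketch below works out the pixel rate implied by 416 pixels in 52 us, the cycles available per pixel at an 80 MHz system clock (the clock figure used later in the thread), and how long a 416-byte line would take at the 3.7 MHz DMA rate. The function names are made up for illustration, and the code is meant to run on a PC, not on the PIC32.

```c
#include <assert.h>
#include <stdint.h>

/* Pixel rate in Hz needed to emit `pixels` pixels in `active_us` microseconds. */
static uint32_t required_pixel_rate_hz(uint32_t pixels, uint32_t active_us)
{
    assert(active_us > 0);
    return (uint32_t)(((uint64_t)pixels * 1000000u) / active_us);
}

/* Whole system-clock cycles available per pixel. */
static uint32_t cycles_per_pixel(uint32_t sysclk_hz, uint32_t pixel_rate_hz)
{
    assert(pixel_rate_hz > 0);
    return sysclk_hz / pixel_rate_hz;
}

/* Time in nanoseconds to send `pixels` bytes at `rate_hz` bytes per second. */
static uint64_t line_time_ns(uint32_t pixels, uint32_t rate_hz)
{
    assert(rate_hz > 0);
    return ((uint64_t)pixels * 1000000000u) / rate_hz;
}
```

With these inputs the required rate comes out at 8 MHz, or 10 cycles per pixel, while a 416-byte line at 3.7 MHz would take roughly 112 us, more than twice the 52 us available.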

Can the DMA operate without interruption? I'm happy for the CPU to block.

The Maximite (google it) uses SPI, but I can't as my video is 8-bit. They achieve jitter-free DMA out through SPI.

Any ideas?

Mike
post edited by bodyjarrocks - 2014/12/08 03:03:18
#1


    Mysil
    Super Member
    • Total Posts : 2906
    • Reward points : 0
    • Joined: 2012/07/01 04:19:50
    • Location: Norway
    • Status: online
    Re: Can DMA to PMP be jitter free and fast? 2014/12/08 05:39:39 (permalink)
    0
    Hi,
    There is a setting somewhere, maybe in the bus matrix, that controls who has the highest priority for access to the bus for data transfers.
    You could look up BMXCON and BMXARB, maybe in Family Reference Manual Section 3, Memory Organization.
     
    The PMP uses at least 3 clock cycles for each write transfer, and since you use wait states to make a transfer every 10 clock cycles, DMA should be given higher priority than CPU data and CPU instruction access,
    but I cannot see a setting to achieve that.
     
    You could try BMXCONbits.BMXARB = 2;
    It may help some, but I think it will not be enough to eliminate jitter.
     
    This thread is in the General PIC32 forum, but you do not say whether this is MX or MZ.
    I have assumed PIC32MX so far, but you write about cache, and on MX the cache is only for Flash memory?
     
    About PIC32MX and instruction memory access: it could help to pin the first instruction of a loop to a location in cache,
    maybe aligned to a 16-byte boundary. After that, the instruction prefetcher should be able to keep the pipeline filled, if you can write code that avoids tests and branches, until it goes back to the same starting point.
     
    Mysil
    post edited by Mysil - 2014/12/08 10:01:03
    #2
    bodyjarrocks
    New Member
    • Total Posts : 12
    • Reward points : 0
    • Joined: 2014/12/07 08:22:39
    • Location: 0
    • Status: offline
    Re: Can DMA to PMP be jitter free and fast? 2014/12/09 07:25:13 (permalink)
    0
    Hi Mysil,

    Thanks very much for your response. That bus mode you mentioned is actually the default according to the datasheet, so that's what I'm using (assuming the bootloader didn't modify it).
     
    I wasn't too clear on a couple of things so let me clear those up.

    I'm using a UBW32 board (again just google it) and it has a PIC32MX795F512L on it.  I have essentially written two video tests for this board:
    • A manually bit-banged version
    • A DMA-powered version.
    Both versions use a 4 bits/pixel framebuffer at 416x234 resolution, which consumes 48,672 bytes of RAM.
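
    The 48,672-byte figure can be reproduced with a small calculation. This is only a sketch: the function name is hypothetical, and it assumes each row is padded up to a whole number of bytes (which makes no difference at 416 pixels per row).

```c
#include <assert.h>
#include <stddef.h>

/* Bytes needed for a packed framebuffer, with each row rounded up to whole bytes. */
static size_t framebuffer_bytes(size_t width, size_t height, size_t bits_per_pixel)
{
    size_t row_bits;
    size_t row_bytes;

    assert(bits_per_pixel > 0);
    row_bits = width * bits_per_pixel;
    row_bytes = (row_bits + 7) / 8;   /* assumption: rows are byte-aligned */
    return row_bytes * height;
}
```

    Calling framebuffer_bytes(416, 234, 4) gives 48,672 bytes; the same resolution at 8 bits/pixel would need 97,344 bytes.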

    The bit-banged version works at 416x234.  I have attached a photo of this working on my 42-inch plasma via composite input.  This version is very fragile (it can easily lose sync) due to cache effects and bus contention (I guess).  Basically, the scan line rendering loop runs much faster on the software simulator than on hardware, so I can't use the simulator and have to test on hardware, which is a real pain when dealing with video.


    Here is a screen shot from my TV of the bit-banged version:

     
    The second version uses DMA and I'm still fighting with it, so I wrote a simpler DMA test. That test either runs smoothly at 3.7 MB/s max, or at the required speed (8 MB/s) with large gaps of inactivity.  It transfers 64 bytes and repeats itself:
    #include <plib.h>

    #define PMP_CONTROL (PMP_ON | PMP_MUX_OFF)
    #define PMP_MODE (PMP_DATA_BUS_8 | PMP_MODE_MASTER2 |\
                         PMP_WAIT_BEG_4 | PMP_WAIT_MID_15 | PMP_WAIT_END_4 )

    const unsigned char portOut[] = {
        0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff,
        0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff,
        0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff,
        0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff
    };
    static volatile int dmaTxferSz;
    static int dmaChn;

    int main(void) {
        AD1PCFG = 0xFFFF;
        mJTAGPortEnable(0);

        SYSTEMConfigPerformance(80000000);
        mPORTEClearBits(BIT_7 | BIT_6 | BIT_5 | BIT_4 | BIT_3 | BIT_2 | BIT_1 | BIT_0);
        mPORTESetPinsDigitalOut(BIT_7 | BIT_6 | BIT_5 | BIT_4 | BIT_3 | BIT_2 | BIT_1 | BIT_0);

        mPORTCClearBits(BIT_2 | BIT_1);
        mPORTCSetPinsDigitalOut(BIT_2 | BIT_1);
        mPORTCSetBits(BIT_2);

        mPMPOpen(PMP_CONTROL, PMP_MODE, PMP_PEN_OFF, PMP_INT_OFF);

        dmaChn = 0;
        DmaChnOpen(dmaChn, 0, DMA_OPEN_DEFAULT);
        DmaChnSetEventControl(dmaChn, DMA_EV_START_IRQ(_PMP_IRQ));
        dmaTxferSz = 64;

        INTConfigureSystem(INT_SYSTEM_CONFIG_MULT_VECTOR);
        INTEnableInterrupts();

        while (1) {
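            /* srcSize = 64, dstSize = 1 (PMDIN), cellSize = 64: one event moves the whole block */
            DmaChnSetTxfer(dmaChn, portOut, (void *) &PMDIN, dmaTxferSz, 1, dmaTxferSz);
            DmaChnEnable(dmaChn);
            /* DMA_WAIT_BLOCK: spin here until the block transfer has completed */
            DmaChnStartTxfer(dmaChn, DMA_WAIT_BLOCK, 0);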
        }
        mPMPClose();    /* not reached: the loop above never exits */
        return 0;
    }

     
    This code exhibits the issue.  There are bursts of 8 or so bytes followed by long pauses.

    Once I get the DMA to achieve sync on the same code-base as the bit-banged version, I'll post an image of that.

    What I can't understand is that the PIC32MX795 is capable of DMA bursts of up to 20 MB/s through the PMP peripheral but can't sustain more than 3.7 MB/s continuously.
     
    Does anyone think it is possible to stream 8-bit data out of the PMP at more than 3.7 MB/s?

    Thanks,
    Mike
     
     

    Attached Image(s)

    #3
    Mysil
    Super Member
    • Total Posts : 2906
    • Reward points : 0
    • Joined: 2012/07/01 04:19:50
    • Location: Norway
    • Status: online
    Re: Can DMA to PMP be jitter free and fast? 2014/12/09 18:44:36 (permalink)
    0
    Hi again,
    As far as I could see from the datasheet and FRM for the PIC32MX795,
    BMXARB mode 1 is the default; I suggested mode 2,
    which is the rotating priority method,
    but there is still no documented setting to give DMA a consistent priority above CPU data.
     
    There are also the LCC display driver projects, which do similar things:
    clocking out either display data or external RAM addresses, using DMA to PMP.
     
       Mysil
    #4
    bodyjarrocks
    New Member
    • Total Posts : 12
    • Reward points : 0
    • Joined: 2014/12/07 08:22:39
    • Location: 0
    • Status: offline
    Re: Can DMA to PMP be jitter free and fast? 2014/12/10 01:54:05 (permalink)
    0
    You are certainly correct. My apologies. I had a datasheet read fail. I'll try that mode. I can't see how fast it rotates but it might be for every bus access which could work.

    I have been through the LCC example and I have seen it on my oscilloscope. As far as I can tell the stream from the PMP port is not regular. It relies on a pixel clock from the PMP port. Since data is clocked to the display it can be irregular.

    Thanks again
    #5
    bodyjarrocks
    New Member
    • Total Posts : 12
    • Reward points : 0
    • Joined: 2014/12/07 08:22:39
    • Location: 0
    • Status: offline
    Re: Can DMA to PMP be jitter free and fast? 2014/12/10 07:05:56 (permalink)
    0
    I have tried BMXARB = 0x2 but that has not helped.  I captured a trace from one of the PMP port pins; it is attached to this post below.  It looks like I get about 13 bytes out reasonably regularly and then there is a strange pause.  I'm not sure exactly what my oscilloscope is telling me between the bursts, as the sampled voltage is around 1.5 V.  This is strange for the PIC32, whose pins should be at 3.3 V or 0 V in digital I/O mode.
     
    I may have to give up on DMA. Sigh.  Code is attached for interest.
     

     
    #include <stdlib.h>
    #include <plib.h>

    #define PMP_CONTROL (PMP_ON | PMP_MUX_OFF)
    #define PMP_MODE (PMP_DATA_BUS_8 | PMP_MODE_MASTER2 |\
                             PMP_WAIT_BEG_4 | PMP_WAIT_MID_15 | PMP_WAIT_END_4 )

    const unsigned char portOut[] = {
        0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff,
        0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff,
        0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff,
        0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff, 0x00, 0xff
    };

    static volatile int dmaTxferSz;
    static int dmaChn;

    int main(void) {
        AD1PCFG = 0xFFFF;
        mJTAGPortEnable(0);

        SYSTEMConfigPerformance(80000000);
        BMXCONbits.BMXARB = 0x02;   /* arbitration mode 2 (rotating priority), as suggested above */

        mPORTEClearBits(BIT_7 | BIT_6 | BIT_5 | BIT_4 | BIT_3 | BIT_2 | BIT_1 | BIT_0);
        mPORTESetPinsDigitalOut(BIT_7 | BIT_6 | BIT_5 | BIT_4 | BIT_3 | BIT_2 | BIT_1 | BIT_0);

        mPORTCClearBits(BIT_2 | BIT_1);
        mPORTCSetPinsDigitalOut(BIT_2 | BIT_1);
        mPORTCSetBits(BIT_2);

        mPMPOpen(PMP_CONTROL, PMP_MODE, PMP_PEN_OFF, PMP_INT_OFF);

        dmaChn = 0;
        DmaChnOpen(dmaChn, 0, DMA_OPEN_DEFAULT);
        DmaChnSetEventControl(dmaChn, DMA_EV_START_IRQ(_PMP_IRQ));

        dmaTxferSz = 64;
        INTConfigureSystem(INT_SYSTEM_CONFIG_MULT_VECTOR);
        INTEnableInterrupts();

        while (1) {
            DmaChnSetTxfer(dmaChn, portOut, (void *) &PMDIN, dmaTxferSz, 1, dmaTxferSz);
            DmaChnEnable(dmaChn);
            DmaChnStartTxfer(dmaChn, DMA_WAIT_BLOCK, 0);
        }
        return 0;
    }

     
     

    Attached Image(s)

    #6
    bodyjarrocks
    New Member
    • Total Posts : 12
    • Reward points : 0
    • Joined: 2014/12/07 08:22:39
    • Location: 0
    • Status: offline
    Re: Can DMA to PMP be jitter free and fast? 2014/12/10 08:05:31 (permalink)
    0
    Upon further investigation, the delay is simply the time between DMA transfers due to the while(1) loop.  To prove this I simply added a delay in the while loop and the delay in the screenshot above increased.
     
    So perhaps it is working?  However, I can't explain why the transfer is only 13 bytes long instead of the 64 bytes I requested.
     
    More investigation is required.
    #7
    vini_i
    Super Member
    • Total Posts : 398
    • Reward points : 0
    • Joined: 2014/01/16 17:51:55
    • Location: Ohio, United States
    • Status: offline
    Re: Can DMA to PMP be jitter free and fast? 2014/12/10 08:23:16 (permalink)
    0
    I'm not super familiar with using macros to set up DMA; I normally set it up directly. That said...
    if I understand your code correctly, I don't see anywhere that you wait for the DMA to complete its transfer.
    You set it, enable it, start the transfer and repeat. That is probably overwriting the DMA setup over and over again.
    Try using an interrupt to service the DMA. That way the DMA interrupt will trigger only after the transfer has completed.
    #8
    bodyjarrocks
    New Member
    • Total Posts : 12
    • Reward points : 0
    • Joined: 2014/12/07 08:22:39
    • Location: 0
    • Status: offline
    Re: Can DMA to PMP be jitter free and fast? 2014/12/10 15:34:36 (permalink)
    0
    The call that starts the DMA transfer in the while loop takes a parameter that makes the function spin while the DMA is in progress and then return. I suspect that is working, because if I insert my own delay after starting the DMA I still only get 13 bytes.

    You'll probably end up being right, and I'll need to rename this thread to: Can Mike read the datasheet? :-)

    Thanks mike
    #9
    bodyjarrocks
    New Member
    • Total Posts : 12
    • Reward points : 0
    • Joined: 2014/12/07 08:22:39
    • Location: 0
    • Status: offline
    Re: Can DMA to PMP be jitter free and fast? 2014/12/10 15:46:50 (permalink)
    0
    I had a thought about the strange level I saw on my scope between transfers. There is an LED and resistor connected to that pin. Perhaps the PMP port is switched to high impedance at the end of the write operation?

    If so, is it possible to keep the port as an output always? If not, I'll have to find another way to set the black level at the end of the visible part of the scanline.
    #10
    Mysil
    Super Member
    • Total Posts : 2906
    • Reward points : 0
    • Joined: 2012/07/01 04:19:50
    • Location: Norway
    • Status: online
    Re: Can DMA to PMP be jitter free and fast? 2014/12/10 17:59:47 (permalink)
    3 (1)
    Hi,
    Yes, PMP is a bidirectional bus interface, so when there is no write data and the WAITE states have completed, the I/O lines will be tri-state (high-Z).
    If you want a specific state when the lines are not driven by the PMP, you could pull the PMD lines up or down using resistors.
     
       Mysil
    #11
    bodyjarrocks
    New Member
    • Total Posts : 12
    • Reward points : 0
    • Joined: 2014/12/07 08:22:39
    • Location: 0
    • Status: offline
    Re: Can DMA to PMP be jitter free and fast? 2014/12/11 05:20:14 (permalink)
    3 (1)
    Thanks. That's a pity. The PIC24 had a feature called BUSKEEP that achieved this. I guess I'll need pull-down resistors.
    #12
    Nigle
    Super Member
    • Total Posts : 297
    • Reward points : 0
    • Joined: 2008/10/14 04:09:08
    • Location: London
    • Status: offline
    Re: Can DMA to PMP be jitter free and fast? 2014/12/11 07:26:22 (permalink)
    3 (1)
    Why not just use a normal port instead of PMP? That way the outputs will be active all the time.
     
    #13
    bodyjarrocks
    New Member
    • Total Posts : 12
    • Reward points : 0
    • Joined: 2014/12/07 08:22:39
    • Location: 0
    • Status: offline
    Re: Can DMA to PMP be jitter free and fast? 2014/12/11 08:20:36 (permalink)
    3 (1)
    Yeah, I have tried that.  Unfortunately, using a normal port (with DMA, which is the goal) I'm limited to 3.7 MB/s.  That is too low for 416x234.  I believe the PMP is capable of higher speeds than normal ports because it has a 4-byte buffer, which helps it absorb blockages due to bus contention.
     
     
    post edited by bodyjarrocks - 2014/12/12 02:28:16
    #14
    Mysil
    Super Member
    • Total Posts : 2906
    • Reward points : 0
    • Joined: 2012/07/01 04:19:50
    • Location: Norway
    • Status: online
    Re: Can DMA to PMP be jitter free and fast? 2014/12/12 06:02:32 (permalink)
    0
    Hi again,
     
    I have experimented with your code and made some modifications.
    I have not reached a frequency much above 5 MHz with PMP and cell size 1.
    It seems DMA has a latency of about 12 cycles from when the interrupt signal is raised by the PMP.
    The red trace in the second image is the delay function waiting while the DMA transfer is active.
    The scope used does not sample fast enough to give a good trace of the data signal
    when sampling 2 channels; the data signal is the same in both images.
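
    As a back-of-envelope illustration only, the figures above can be combined into a crude rate estimate. The sketch assumes that the roughly 12-cycle DMA latency and the 3-cycle minimum PMP write mentioned earlier in the thread simply add up for every byte, with no overlap; that model and the function name are assumptions, not measured behaviour.

```c
#include <assert.h>
#include <stdint.h>

/* Estimated bytes per second when each byte costs latency + write cycles. */
static uint32_t est_rate_hz(uint32_t sysclk_hz, uint32_t latency_cycles,
                            uint32_t write_cycles)
{
    uint32_t period = latency_cycles + write_cycles;

    assert(period > 0);
    return sysclk_hz / period;
}
```

    At 80 MHz, est_rate_hz(80000000, 12, 3) gives about 5.3 MHz, in line with the observed rate of a little over 5 MHz. Under the same model, reaching 8 MHz would need the total cost per byte to be 10 cycles or fewer.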
     
    I think it should be possible to write C code to transfer data faster than 5 MHz;
    I will try that later.
     
    The delay function in the modified code uses the Core Timer in the MIPS system coprocessor CP0,
    and does not access the data bus. The delay function is set for 14 microseconds, plus there is some overhead.
     
    I think the problem in the original sample code may be that there is no synchronization,
    so the PMP gets overrun by too much data.
    Maybe, if you reduce the number of wait states in the PMP, it may be able to process data as transferred, but without cell synchronization, wait states cannot be used to adjust the video frequency.
     
    Regards,
       Mysil
     
    Edit: One more issue with the test program:
    In the thread, it has been explained that you want to transfer display data from a frame buffer in RAM,
    but in the test program, the test data "portOut" is declared const unsigned char.
    On the PIC, a "const" declaration has the effect that the data are stored in Flash program memory.
    This will cause DMA to suffer 2 wait-state cycles while reading data from Flash,
    maybe more, since DMA may also have to compete with CPU instruction access to Flash.
     
    Enabling data caching may help some, but will also increase jitter.
    To make the test program more realistic, the const declaration should be removed from portOut.
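
    One way to set up such a test, sketched here under assumptions: declare the buffer without const and fill it at run time with the same alternating 0x00/0xFF pattern. The names portOutRam, TEST_LEN and fill_test_pattern are hypothetical and not taken from the modified code mentioned above.

```c
#include <assert.h>
#include <stddef.h>

#define TEST_LEN 64

/* No const qualifier, so the buffer is a plain RAM object like the real frame buffer. */
static unsigned char portOutRam[TEST_LEN];

/* Fill buf with the alternating 0x00, 0xFF pattern used by the original test data. */
static void fill_test_pattern(unsigned char *buf, size_t len)
{
    size_t i;

    assert(buf != NULL);
    for (i = 0; i < len; i++) {
        buf[i] = (i & 1u) ? 0xFF : 0x00;
    }
}
```

    Calling fill_test_pattern(portOutRam, TEST_LEN) once before the transfer loop would then give the DMA a RAM source in place of the const array.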
     
       Mysil
    post edited by Mysil - 2014/12/13 08:55:07

    Attached Image(s)

    #15
    bodyjarrocks
    New Member
    • Total Posts : 12
    • Reward points : 0
    • Joined: 2014/12/07 08:22:39
    • Location: 0
    • Status: offline
    Re: Can DMA to PMP be jitter free and fast? 2014/12/13 17:12:19 (permalink)
    0
    Well done, Mysil. I'll give this a try myself when I get a moment. I see what you mean about the framing of transfers. I thought the PMP would drive the framing, but perhaps that can only occur inter-cell. 5 MHz is better than 3.7 MHz. I wonder how using RAM for portOut would change that...
    #16
    bodyjarrocks
    New Member
    • Total Posts : 12
    • Reward points : 0
    • Joined: 2014/12/07 08:22:39
    • Location: 0
    • Status: offline
    Re: Can DMA to PMP be jitter free and fast? 2014/12/14 00:17:25 (permalink)
    0
    Hi Mysil,
     
    Very interesting.  I tried your code and it works well.  I removed the "const" to move the buffer to RAM and this boosts the speed to 6.25 MHz:
     
     
     
    On top of that I bumped the cell size from 1 to 2, and that boosts the speed to 19.2 MHz:
     

     
    This, however, makes the waveform asymmetric.  I have no idea why.  Any ideas? I assume using a cell size of 2 is OK given that the PMP has a 4-byte buffer.
     
    The problem now, I guess, is that this kind of DMA method is never going to work for video, because the DMA is pushing as fast as it can and is not throttled by anything.  The PMP is just eating the bytes as fast as it can but is not in control of the feed-in rate from the DMA engine.
     
    I'm starting to wonder if there is any way I can achieve control with DMA.  The timer-triggered method seemed to max out at 3.7 MHz.
     
    Thanks a million Mysil.
     
    More thinking required I think...
     
     
     

    Attached Image(s)

    #17
    bodyjarrocks
    New Member
    • Total Posts : 12
    • Reward points : 0
    • Joined: 2014/12/07 08:22:39
    • Location: 0
    • Status: offline
    Re: Can DMA to PMP be jitter free and fast? 2014/12/15 07:45:49 (permalink)
    3 (1)
    I read the PMP data sheet again, and the 4-byte buffer in the PMP peripheral is only for slave modes.  So the second screenshot in my previous post can be explained: I'm simply pushing the PMP peripheral too hard.
     
    My requirement:  I want to clock pixels (8bpp) at 8 MHz out of my 80 MHz PIC32 micro, so I need to push a pixel out every 10 cycles.
     
    Using DMA appears to be out of the question.  Driven by a timer, the DMA maxes out at 3.7 MHz.  Driven by PMP interrupts, the DMA maxes out at 6.25 MHz.
     
    I thought of another idea.  What if I set up the PMP in Buffered Parallel Slave Port mode?  This mode actually has a 4-byte buffer.  I could trigger each single-byte slave read (the PIC32 puts a pixel on the bus) by driving the PMRD signal via a loopback wire from an Output Compare peripheral that is generating an 8 MHz square wave.  That should have perfect timing.  I could use the PMP to interrupt the CPU after transferring 4 bytes, and it could load the next 4 bytes into the PMDOUT buffer.
     
    The problem will be interrupt latency.  Sending 4 bytes will take 40 cycles.  I think I'd get an interrupt for every byte sent out.  The very first interrupt will signal 1 byte sent and 3 bytes in the buffer.  By the time I have control in the interrupt, it is likely that 4 bytes have been sent, assuming I can get interrupt latency under 40 cycles.  I'd then check the buffer empty flag and spin while it is not empty.  I'd then transfer in 4 fresh bytes from the active scan line.
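
    The timing budget for this idea can be written down as a quick calculation. This is a sketch with hypothetical function names; it only restates the arithmetic above (4-byte buffer, 10 cycles per pixel, 416 pixels per line) and says nothing about whether the real interrupt latency fits.

```c
#include <assert.h>
#include <stdint.h>

/* Cycles it takes the external clock to drain a full buffer. */
static uint32_t refill_budget_cycles(uint32_t buffer_bytes, uint32_t cycles_per_byte)
{
    return buffer_bytes * cycles_per_byte;
}

/* Number of buffer refills needed per scan line, rounding up. */
static uint32_t refills_per_line(uint32_t pixels, uint32_t buffer_bytes)
{
    assert(buffer_bytes > 0);
    return (pixels + buffer_bytes - 1) / buffer_bytes;
}
```

    With a 4-byte buffer and 10 cycles per pixel, a full buffer drains in 40 cycles, and a 416-pixel line needs 104 refills.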
     
    It's far-fetched, but if I understand the data sheet it might work and should meet my jitter requirements.
     
    Any thoughts?  I'll try it out soon.
     
    Mike.
    #18