
[LIBERO] Interfacing Arm to custom logic

Author
blipton
2020/03/21 15:04:20


I'm following App Note AC335 to instantiate an APB3 wrapper/slave around a reg16x8 RAM block. Synthesis and routing produce no warnings that hint at a problem, but once the design is programmed, any CPU access to that memory results in a hard fault. I'm using Digikey's Microsemi SF2 Starter Kit.
 
I had to slightly modify 'reg_apb_wrp.v' from the 2012 app note to get it to connect in Libero 12.3, but I'm not sure whether that is the problem or it's something else.
 
 
#1


    I52897
    Re: [LIBERO] Interfacing Arm to custom logic 2020/03/24 04:53:19
    Hello,
     
    The SmartFusion2 FPGA comes with a 32-bit Arm® Cortex™-M3 based MSS.
    To avoid the hard fault, you might find it useful to try 32-bit data accesses to the reg16x8 module instead of byte addressing.
    Also, the address will be 16 bits.
    #2
    Phoenix136
    Re: [LIBERO] Interfacing Arm to custom logic 2020/03/25 21:25:36
    I just figured this out and I assume you're making the same naive mistake as me.
     
    You may notice that all official peripherals (MSS and fabric) have register addresses that increment by 4, e.g. 0x00, 0x04, 0x08, etc. Intermediate addresses are valid for things like CoreABC, but they cause a hard fault in the Arm core.
     
    Why? My understanding at the moment is that the register addresses DO represent byte locations. However, the Arm core operates on words, which are 32 bits or 4 bytes. Each 4-byte group is accessed as a single block using the address of its first byte. Attempting to access a word at an intermediate byte address, e.g. 0x01, would access the data in byte locations 0x01, 0x02, 0x03, and 0x04. If you then accessed 0x02, you would get the data in byte locations 0x02, 0x03, 0x04, and 0x05. To avoid all the potential problems from this, the Arm core hard faults.
     
    This means that custom cores in the FPGA fabric must abide by these rules when connecting to the MSS through the FIC even if they don't need to for their own function.
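A minimal sketch of the arithmetic behind that explanation, not a statement about when the Cortex-M3 actually faults (that depends on the core configuration and the access type). The helper names `is_word_aligned` and `word_access_span` are hypothetical and only illustrate which byte locations a 32-bit access would cover.

```c
#include <stdbool.h>
#include <stdint.h>

/* True when addr sits on a 4-byte boundary (0x00, 0x04, 0x08, ...). */
static bool is_word_aligned(uint32_t addr)
{
    return (addr & 0x3u) == 0u;
}

/* Fill out[0..3] with the byte locations a 32-bit access starting
 * at addr would cover. Hypothetical helper, for illustration only. */
static void word_access_span(uint32_t addr, uint32_t out[4])
{
    for (uint32_t i = 0; i < 4u; i++) {
        out[i] = addr + i;
    }
}
```

With these helpers, a word access at 0x01 spans 0x01 to 0x04 and one at 0x02 spans 0x02 to 0x05, matching the overlap described in the post, while 0x00 and 0x04 are the aligned starting points.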
    post edited by Phoenix136 - 2020/03/25 21:47:47
    #3
    blipton
    Re: [LIBERO] Interfacing Arm to custom logic 2020/03/26 16:38:22
    If I understand correctly, you're saying the 8-bit slave should be instantiated like:
    reg16x8 reg16x8_0 (.clk(PCLK), .nreset(PRESETN), .wr_en(wr_enable),  .rd_en(rd_enable), .addr(PADDR[5:2]), .data_in(PWDATA[7:0]), .data_out(PRDATA[7:0]));
     
    and the CPU would access the data at 4-byte offsets:
    /* volatile: these are memory-mapped fabric registers, so every access must reach the bus */
    volatile uint8_t *pUserLogic8 = (volatile uint8_t *)(0x50000000UL);
    *(pUserLogic8+(0*4)) = 0xDE;
    *(pUserLogic8+(1*4)) = 0xAD;
    *(pUserLogic8+(2*4)) = 0xBE;
    *(pUserLogic8+(3*4)) = 0xEF;
    test8 = *(pUserLogic8+(0*4));
    test8 = *(pUserLogic8+(1*4));
    test8 = *(pUserLogic8+(2*4));
    test8 = *(pUserLogic8+(3*4));
     
    and for a 16-bit slave :
    reg16x16 reg16x16_0 (.clk(PCLK), .nreset(PRESETN), .wr_en(wr_enable),  .rd_en(rd_enable), .addr(PADDR[4:1]), .data_in(PWDATA[15:0]), .data_out(PRDATA[15:0]));
     
    The CPU would still address via 4-byte boundaries:
    volatile uint16_t *pUserLogic16 = (volatile uint16_t *)(0x50000000UL);
    pUserLogic16 [0] = 0xDEAD;
    pUserLogic16 [1] = 0xBEEF;
    test16 = pUserLogic16 [0];
    test16 = pUserLogic16 [1];
     
    post edited by blipton - 2020/03/26 16:54:51
    #4
    Phoenix136
    Re: [LIBERO] Interfacing Arm to custom logic 2020/03/26 17:57:58
    I think your C code will work. All the values look valid. (Edit, because of your edit: I have no idea how that array version will resolve.)
    I'm super rusty on pointers, though, so as long as your addresses are 32-bit and you're casting things to the right places, I guess it would work. If you were to instantiate a CoreGPIO or some other official core in the fabric via the System Builder and export the firmware, you would see that the included drivers don't use pointers and instead have a chain of functions that work their way down to ARM assembly.
     
    Your 8 bit peripheral looks to have addressing that will work.
    Your 16 bit peripheral doesn't. It should use the same PADDR[5:2] as the 8 bit. If you were to have 32x8 and 32x16 memories it would be PADDR[6:2] for both.
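For the firmware side of PADDR[5:2] decoding, here is a rough sketch of accessors that place each register at a 4-byte stride, whatever the register width. The names `USER_LOGIC_BASE`, `USER_REG_STRIDE`, `user_reg_addr`, `user_reg_write` and `user_reg_read` are made up for illustration, and the sketch assumes the wrapper ignores the upper PWDATA bits and returns the register in the low 16 bits of PRDATA.

```c
#include <stdint.h>

#define USER_LOGIC_BASE 0x50000000UL /* base address used earlier in the thread */
#define USER_REG_STRIDE 4u           /* one register per 32-bit word (PADDR[5:2]) */

/* Byte address of register `index` (hypothetical helper). */
static inline uintptr_t user_reg_addr(unsigned index)
{
    return (uintptr_t)USER_LOGIC_BASE + (uintptr_t)USER_REG_STRIDE * index;
}

/* Write a 16-bit register with a single aligned 32-bit bus access. */
static inline void user_reg_write(unsigned index, uint16_t value)
{
    *(volatile uint32_t *)user_reg_addr(index) = value;
}

/* Read a 16-bit register; assumes it is returned in PRDATA[15:0]. */
static inline uint16_t user_reg_read(unsigned index)
{
    return (uint16_t)(*(volatile uint32_t *)user_reg_addr(index) & 0xFFFFu);
}
```

On the target, `user_reg_write(1, 0xBEEF)` would then hit 0x50000004, which PADDR[5:2] decodes as register 1. The read and write helpers only make sense on the actual hardware.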
    post edited by Phoenix136 - 2020/03/26 18:08:03
    #5
    blipton
    Re: [LIBERO] Interfacing Arm to custom logic 2020/03/26 21:02:17
    Interestingly, the logic I'm trying to test has a 16-bit data bus with registers 0 - 0x20, and the only way I got the CPU to read known values from certain registers was to feed it PADDR [ :1] and reference it in software through a 16-bit pointer (pUserLogic16 [4])!
     
    Just curious, do you have any thoughts on running custom logic on a clock that's different from the CPU core's? For example, the above register interface works when using FIC_0_CLK (100 MHz). However, the block is ultimately required to run at 20 MHz, yet if I use FAB_CCC_GL1 CLK (20 MHz), the data read back is bad. It doesn't seem like the CoreAHBTOAPB3/CoreAPB3 blocks handle any clock domain crossing. Have you run into anything like this?
    #6
    Phoenix136
    Re: [LIBERO] Interfacing Arm to custom logic 2020/03/27 01:15:02
    Accessing memory with PADDR [#:1] is certainly valid; it's just that you skip every other RAM location.
     
    Is pUserLogic16 [4] the same as 0x50000000L + 4?
     
    Given that PADDR [1:0] is 2'b00.
    PADDR [5:1] gives possible addresses 0x00, 0x02, 0x04, 0x06, ..., 0x1E
    0x50000000L should return data in memory location 0x00
    0x50000004L should return data in memory location 0x02
    0x50000008L should return data in memory location 0x04
    etc.
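The mapping in that list can be checked with a small host-side model of the address slice. This is only a sketch: `paddr_slice` is a hypothetical helper that mimics taking PADDR[hi:lo] in the fabric, not part of any Microsemi driver.

```c
#include <stdint.h>

/* Return the bit field paddr[hi:lo], the way a fabric address
 * decode such as PADDR[5:1] would see it. Assumes hi >= lo. */
static uint32_t paddr_slice(uint32_t paddr, unsigned hi, unsigned lo)
{
    uint32_t width = (uint32_t)(hi - lo) + 1u;
    uint32_t mask = (width >= 32u) ? 0xFFFFFFFFu : ((1u << width) - 1u);
    return (paddr >> lo) & mask;
}
```

For word-aligned CPU addresses, `paddr_slice(addr, 5, 1)` only ever returns even memory locations, which is why every other RAM location is skipped, whereas `paddr_slice(addr, 5, 2)` counts 0, 1, 2, and so on.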
     
    If this isn't happening then there's something about your fabric memory implementation I'm missing. The ARM core is complex, so it has the 4-byte boundary alignment thing going on. Fabric memory should be straightforward: addresses incremented by 1, with data widths of whatever length you want.
     
     
    The ARM core runs off the M3_CLK.
    MSS peripherals run off APB_0_CLK and APB_1_CLK depending on which MSS APB bus they're on.
    Fabric peripherals run off FIC_0_CLK.
     
    From what I can tell these all have CDC implemented (honestly they had better, since it's all within the MSS and we're given the option to set the clock dividers). Part of my debugging was actually to set the FIC_0_CLK divider so I could run my fabric stuff at less than 20 MHz and then connect some signals to IO, which I could then view on my logic analyzer. Everything up to my bad address accesses seemed to work perfectly.
     
    If you're leaving FIC_0_CLK at 100 MHz and running your peripheral at 20 MHz, then you've created a situation where the bus master and bus slave are operating on two different clocks. With PREADY set to a constant '1', the APB master will start the transaction, then read PREADY on the next clock cycle (100 MHz) and use whatever is on PRDATA before the peripheral has even seen a rising edge.
    post edited by Phoenix136 - 2020/03/27 01:19:31
    #7
    blipton
    Re: [LIBERO] Interfacing Arm to custom logic 2020/03/28 14:05:45
    Normally, or at least with Altera's Nios (Avalon bus), the compiler calculates the address based on the size of the element that the pointer points to, and the logic has Byte Enable signals that can be used for decoding. In the software, an 8-bit pointer with an offset of 4, pUserLogic8 [4] or *(pUserLogic8 + 4), would translate to an address of 0x50000000L + (4*1).
    a 16 bit pointer with an offset of 4, pUserLogic16 [4] or *(pUserLogic16+4), is 0x50000000L + (4*2).   
    and offset 4 of a 32-bit pointer, pUserLogic32 [4] or *(pUserLogic32+4), is 0x50000000L + (4*4).  
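The scaling described above is standard C pointer arithmetic, so it can be checked on a host without touching the hardware. A minimal sketch, where `USER_LOGIC_BASE` and `element_addr` are illustrative names rather than anything from the Libero firmware:

```c
#include <stddef.h>
#include <stdint.h>

#define USER_LOGIC_BASE 0x50000000UL /* base address used in the thread */

/* Byte address of element `index` for a pointer whose element size is
 * elem_size bytes: base + index * sizeof(element), as the compiler computes it. */
static uint32_t element_addr(size_t elem_size, size_t index)
{
    return (uint32_t)(USER_LOGIC_BASE + elem_size * index);
}
```

So `element_addr(sizeof(uint8_t), 4)` is 0x50000004, `element_addr(sizeof(uint16_t), 4)` is 0x50000008, and `element_addr(sizeof(uint32_t), 4)` is 0x50000010. How the fabric slave then decodes that byte address depends on which PADDR bits it uses.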
     
    Of course, with Libero/Arm I am flying blind on how all this works, so I'm not certain how memory/bytes are getting translated. Not having SignalTap or some other internal logic analyzer makes it even more of a challenge.
     
    Regarding the clocking, the tool takes care of CDC between the M3 and FIC0, so you tested that your logic works at 20 MHz by also running FIC_0_CLK at 20 MHz. I guess I don't have any reason to run FIC_0_CLK faster at the moment, unless I add more external/custom logic that doesn't need the slow speed. In that case, the slow logic wrapper would need to use PREADY to stall the bus. Do I understand that correctly?
    #8
    Phoenix136
    Re: [LIBERO] Interfacing Arm to custom logic 2020/03/28 14:53:27
    Yes on the clocks. Once you leave the FIC you're in fabric land, where you have to do your own CDC. CoreAPB3 does not have CDC built in; I suppose the thinking is "why would you have differently clocked peripherals on the same bus?"
     
    PREADY to stall the bus should sort of work, maybe? I'm imagining the situation where the slave responds by setting PREADY high to finish a transaction, and the master on the fast clock begins the next one and checks PREADY in the time it takes the slave to lower PREADY again. Not to mention all the other CDC problems on every other wire in the bus if the clocks aren't synchronized.
     
    I like to run peripherals at the bus speed and if they need to do things slower (e.g. I2C, SPI busses) use a counter/clock divider that triggers the logic to update at the higher speed. I think the only time I can foresee using PREADY like that is when responding takes multiple cycles at the bus speed. e.g. read from SRAM, perform calculation, put result on the bus.
     
    something like:
     
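    -- inside a process clocked by the bus clock
    if slow_clk /= cnt_limit then
         slow_clk <= slow_clk + 1;
         updt_logic <= '0';   -- no update this cycle
    else
         slow_clk <= 0;
         updt_logic <= '1';   -- one-cycle enable for the slow logic
    end if;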
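To pick cnt_limit for a given ratio, a quick behavioral model of that counter can help. This is a host-side C sketch of the same logic, not HDL, and the names `clk_enable_t`, `clk_enable_step` and `count_ticks` are made up for illustration. It treats each call as one bus clock edge and returns the value updt_logic would be assigned on that edge.

```c
#include <stdbool.h>
#include <stdint.h>

/* State of the divider: mirrors slow_clk and cnt_limit in the snippet above. */
typedef struct {
    uint32_t count;
    uint32_t limit;
} clk_enable_t;

/* Advance one bus clock edge; returns true on the edge where the
 * counter wraps, i.e. where updt_logic would be assigned '1'. */
static bool clk_enable_step(clk_enable_t *ce)
{
    if (ce->count != ce->limit) {
        ce->count++;
        return false;
    }
    ce->count = 0;
    return true;
}

/* Count how many enable pulses occur over `cycles` bus clock edges. */
static uint32_t count_ticks(uint32_t limit, uint32_t cycles)
{
    clk_enable_t ce = { 0u, limit };
    uint32_t ticks = 0;
    for (uint32_t i = 0; i < cycles; i++) {
        if (clk_enable_step(&ce)) {
            ticks++;
        }
    }
    return ticks;
}
```

The enable fires once every cnt_limit + 1 bus cycles, so a limit of 4 on a 100 MHz bus clock gives a 20 MHz update rate: `count_ticks(4, 100)` returns 20.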
    #9
    blipton
    Re: [LIBERO] Interfacing Arm to custom logic 2020/04/01 20:51:06
    Thanks, I'd also like to run the fabric logic using a 100 MHz FIC clock and divide it down inside the custom logic. Unfortunately, the main block I'm trying to connect is an obfuscated RTL 1553 IP. It's synthesized for 20 MHz, so while I could feed it a fast FIC clock and the registers would work, the 1553 timing would be off. However, since I don't have any other custom logic that requires higher speed, I think setting the FIC to 20 MHz and using that for the IP and any other logic should work! Thanks for the help!
    #10