Troubleshooting malloc

Author
jrmymllr
Super Member
  • Total Posts : 196
  • Reward points : 0
  • Joined: 2011/06/11 06:05:25
  • Location: MPLS
  • Status: offline
2020/05/25 14:18:59 (permalink)
5 (1)

Troubleshooting malloc

I've spent the last two days trying to track down a heap corruption problem and I still haven't figured it out. Calling malloc in some circumstances causes a TLB exception. It's a somewhat complex system, and because it uses Ethernet I haven't been able to reproduce the issue in the MPLAB simulator, but it is repeatable on the hardware. I'm using XC32 v2.40.
 
Any tips on tracking down malloc crashes? I've already checked the obvious stuff. Malloc is a black box and that's what's frustrating. I can set a breakpoint on the precise malloc call that throws the exception, but then what? (I should add that the malloc call that causes the exception is in a simple function that's called multiple times for different sizes, so it's not the syntax.)
 
It's nearly impossible to avoid dynamic memory allocation. I have two audio decoder libraries that run one at a time, and one in particular contains many malloc calls.
 
 
post edited by jrmymllr - 2020/05/25 14:28:33
#1

    andersm
    Super Member
    • Total Posts : 2834
    • Reward points : 0
    • Joined: 2012/10/07 14:57:44
    • Location: 0
    • Status: offline
    Re: Troubleshooting malloc 2020/05/25 16:38:52 (permalink)
    5 (1)
    Malloc isn't re-entrant, so make sure you're not calling memory allocation functions from interrupts, or multiple threads without locking. Buffer over/underflows can corrupt malloc's internal data structures, as can double frees. The stack can grow and collide with the heap.
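One low-effort way to test the re-entrancy theory is a thin wrapper that notices when an allocation starts while another one is still in progress. The sketch below is only an illustration, assuming a single-core target with no RTOS where an interrupt handler runs to completion; `traced_malloc`, `alloc_busy` and `alloc_reentry_count` are hypothetical names, not part of XC32.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical bookkeeping, not part of any toolchain library. */
static volatile int      alloc_busy;          /* 1 while malloc() is running    */
static volatile unsigned alloc_reentry_count; /* how often re-entry was caught  */

/* Call this instead of malloc() from both main-line code and ISRs. */
void *traced_malloc(size_t size)
{
    if (alloc_busy) {
        /* Someone (typically an ISR) entered while malloc() was active.
         * Put a breakpoint here and inspect the call stack. */
        alloc_reentry_count++;
        return NULL;
    }

    alloc_busy = 1;
    void *p = malloc(size);
    alloc_busy = 0;
    return p;
}
```

If `alloc_reentry_count` ever becomes non-zero, some interrupt path is still allocating. The same flag could be shared with a matching wrapper around free().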
    #2
    jrmymllr
    Super Member
    • Total Posts : 196
    • Reward points : 0
    • Joined: 2011/06/11 06:05:25
    • Location: MPLS
    • Status: offline
    Re: Troubleshooting malloc 2020/05/25 17:31:45 (permalink)
    0
    andersm
    Malloc isn't re-entrant, so make sure you're not calling memory allocation functions from interrupts, or multiple threads without locking. Buffer over/underflows can corrupt malloc's internal data structures, as can double frees. The stack can grow and collide with the heap.


    No RTOS used here. You got me thinking and I found one instance of malloc and free that could occur in an interrupt so I got rid of that. No other instances of malloc or free that could occur in an interrupt and still the exact same problem :(
     
    This is going to be a tough one. Some type of debugging malloc for small embedded systems would be ideal but I don't know of any. Found some for operating systems.
    #3
    ric
    Super Member
    • Total Posts : 28024
    • Reward points : 0
    • Joined: 2003/11/07 12:41:26
    • Location: Australia, Melbourne
    • Status: online
    Re: Troubleshooting malloc 2020/05/25 17:35:14 (permalink)
    0
    Are you always checking if malloc returned NULL?
     

    #4
    jrmymllr
    Super Member
    • Total Posts : 196
    • Reward points : 0
    • Joined: 2011/06/11 06:05:25
    • Location: MPLS
    • Status: offline
    Re: Troubleshooting malloc 2020/05/25 18:33:26 (permalink)
    5 (1)
    ric
    Are you always checking if malloc returned NULL?
     

    Yes. When this occurs it displays a message on my LCD and goes into while(1). Or I just set a breakpoint. 
     
    I recently (30 minutes ago) came up with a low effort idea to track this down. It involves a custom memory allocator that might at least let me rule out the audio decoder.
     
    In the meantime, I'm still open to ideas!
    #5
    NorthGuy
    Super Member
    • Total Posts : 6230
    • Reward points : 0
    • Joined: 2014/02/23 14:23:23
    • Location: Northern Canada
    • Status: offline
    Re: Troubleshooting malloc 2020/05/25 20:10:06 (permalink)
    5 (1)
    There are lots of ways to corrupt malloc's heap. The most common are freeing something that was never allocated, freeing something twice, or writing outside the allocated memory.
    #6
    al_bin
    Super Member
    • Total Posts : 214
    • Reward points : 0
    • Joined: 2011/02/11 06:28:47
    • Location: 0
    • Status: offline
    Re: Troubleshooting malloc 2020/05/25 22:16:24 (permalink)
    5 (1)
    Maybe you could wrap malloc() and free() in some logging function?
    #7
    jrmymllr
    Super Member
    • Total Posts : 196
    • Reward points : 0
    • Joined: 2011/06/11 06:05:25
    • Location: MPLS
    • Status: offline
    Re: Troubleshooting malloc 2020/05/26 05:27:12 (permalink)
    4 (1)
    al_bin
    Maybe you could wrap malloc() and free() in some logging function?



    Hmm, that's something I've thought about but maybe not seriously enough. This library makes up to 77 malloc calls (yeah it's crazy) for some audio streams, and it crashes before calling free() even once, so I don't have to worry about free(). What I might try is asking malloc for more than what the library is asking for, putting a string at the start and end of the space (like 0xDEADBEEF) and returning a pointer just past that.
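The guard-word idea described above could look roughly like the following. This is a sketch under stated assumptions, not a drop-in replacement: `guarded_malloc`, `guarded_check` and `guarded_free` are hypothetical names, the header layout is invented for illustration, and the returned pointer is only 8-byte aligned relative to what malloc() returned.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define GUARD_WORD 0xDEADBEEFu

/* Block layout: [size][front guard][user bytes ...][tail guard]
 * The front guard sits directly before the user bytes. */
typedef struct {
    uint32_t size;        /* bytes requested by the caller */
    uint32_t front_guard; /* must stay GUARD_WORD          */
} guard_header_t;

void *guarded_malloc(size_t size)
{
    if (size > UINT32_MAX - sizeof(guard_header_t) - sizeof(uint32_t)) {
        return NULL; /* would not fit the 32-bit size field */
    }

    uint8_t *raw = malloc(sizeof(guard_header_t) + size + sizeof(uint32_t));
    if (raw == NULL) {
        return NULL;
    }

    guard_header_t hdr = { (uint32_t)size, GUARD_WORD };
    uint32_t tail = GUARD_WORD;
    memcpy(raw, &hdr, sizeof hdr);
    /* The tail guard may be unaligned, so copy it bytewise. */
    memcpy(raw + sizeof hdr + size, &tail, sizeof tail);
    return raw + sizeof hdr;
}

/* Returns 1 if both guards are intact, 0 if either was overwritten. */
int guarded_check(const void *user)
{
    const uint8_t *raw = (const uint8_t *)user - sizeof(guard_header_t);
    guard_header_t hdr;
    uint32_t tail;

    memcpy(&hdr, raw, sizeof hdr);
    if (hdr.front_guard != GUARD_WORD) {
        return 0;
    }
    memcpy(&tail, raw + sizeof hdr + hdr.size, sizeof tail);
    return tail == GUARD_WORD;
}

void guarded_free(void *user)
{
    if (user != NULL) {
        free((uint8_t *)user - sizeof(guard_header_t));
    }
}
```

Calling `guarded_check()` on every live block at a few points in the decoder narrows down when a guard gets trampled. It only catches writes that land on a guard word, and the extra bytes do change the heap layout.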
    #8
    NorthGuy
    Super Member
    • Total Posts : 6230
    • Reward points : 0
    • Joined: 2014/02/23 14:23:23
    • Location: Northern Canada
    • Status: offline
    Re: Troubleshooting malloc 2020/05/26 05:37:58 (permalink)
    0
    jrmymllr
    What I might try is asking malloc for more than what the library is asking for, putting a string at the start and end of the space (like 0xDEADBEEF) and returning a pointer just past that.



    The corrupted memory will not necessarily be adjacent to the allocated piece.
     
    If you can reproduce the problem, you do not want the memory layout to change, as this may (or may not) make the problem unreproducible. This will feel like the bug magically disappeared, but in fact things got worse.
     
    Instead try to comment out various memory writes to single out the write which causes the problem. This is probably somewhere in the code you have added lately.
    #9
    Jim Nickerson
    User 452
    • Total Posts : 6725
    • Reward points : 0
    • Joined: 2003/11/07 12:35:10
    • Location: San Diego, CA
    • Status: offline
    Re: Troubleshooting malloc 2020/05/26 06:00:49 (permalink)
    5 (4)
    Maybe you could log the calls to malloc with wrap
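Assuming "wrap" here means the GNU linker's `--wrap=symbol` option (passed through the compiler driver as `-Wl,--wrap=malloc`), a logging wrapper might look like the sketch below. With that option the linker resolves calls to malloc to `__wrap_malloc`, and `__real_malloc` to the original function. The `USE_LD_WRAP` macro, the `alloc_log` ring buffer and its size are hypothetical additions so the snippet also builds on a host without the linker flag.

```c
#include <stddef.h>
#include <stdlib.h>

#ifdef USE_LD_WRAP
/* Resolved by the linker to the library malloc under --wrap=malloc. */
void *__real_malloc(size_t size);
#else
/* Host fallback (assumption for illustration): forward to malloc directly. */
static void *__real_malloc(size_t size) { return malloc(size); }
#endif

#define ALLOC_LOG_LEN 128u

typedef struct {
    size_t size; /* bytes requested                       */
    void  *ptr;  /* result, NULL until malloc has returned */
} alloc_record_t;

static alloc_record_t alloc_log[ALLOC_LOG_LEN];
static unsigned       alloc_log_count;

void *__wrap_malloc(size_t size)
{
    /* Record the request first, so the entry exists even if the
     * real malloc faults before returning. */
    alloc_record_t *rec = &alloc_log[alloc_log_count % ALLOC_LOG_LEN];
    rec->size = size;
    rec->ptr  = NULL;
    alloc_log_count++;

    rec->ptr = __real_malloc(size);
    return rec->ptr;
}
```

After a crash, `alloc_log` can be inspected in the debugger's watch window. The last entry with a NULL `ptr` is the request that never returned.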

     


    #10
    jrmymllr
    Super Member
    • Total Posts : 196
    • Reward points : 0
    • Joined: 2011/06/11 06:05:25
    • Location: MPLS
    • Status: offline
    Re: Troubleshooting malloc 2020/05/26 06:33:47 (permalink)
    0
    NorthGuy
     
    The corrupted memory will not necessarily be adjacent to the allocated piece.
     
    If you can reproduce the problem, you do not want the memory layout to change, as this may (or may not) make the problem unreproducible. This will feel like the bug magically disappeared, but in fact things got worse.
     
    Instead try to comment out various memory writes to single out the write which causes the problem. This is probably somewhere in the code you have added lately.




    I have seen differing behavior depending on optimization level and added code, but it always crashes somewhere, sometimes malloc, sometimes elsewhere. 
     
    Commenting out memory writes to track this down is nearly impossible. The codec library is full of complex math routines and loops I don't understand, and taking something out might cause an unrelated problem.
    #11
    jrmymllr
    Super Member
    • Total Posts : 196
    • Reward points : 0
    • Joined: 2011/06/11 06:05:25
    • Location: MPLS
    • Status: offline
    Re: Troubleshooting malloc 2020/05/26 06:35:53 (permalink)
    5 (2)
    JANickerson
    Maybe you could log the calls to malloc with wrap

     
    Oh that's intriguing. I didn't know XC32 had this.  
    #12
    friesen
    Super Member
    • Total Posts : 2149
    • Reward points : 0
    • Joined: 2008/05/08 05:23:35
    • Location: Indiana, USA
    • Status: offline
    Re: Troubleshooting malloc 2020/05/27 05:37:10 (permalink)
    5 (3)
    Keep in mind that some string manipulators will malloc.  Also make sure you wrap calloc.
     
    If you are operating in uncached malloc'ed space, beware, this won't work unless you align to a cache line and do proper cache manipulations.  Malloc typically uses the area before or after to mark the location as a linked list, so any overrun or cache writebacks on spanned areas will trash memory and you'll get undefined behavior like tlb faults.
     
    umm_malloc has a much better allocation viewer and info dump.  I wrapped it and use it instead of xc32's malloc. It does have to be reworked for greater than 512K allocation though. It does have defines where you can disable interrupts or rtos core for thread safety.
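For the cache-line point, one common pattern is to over-allocate and hand back a pointer that starts on a line boundary and spans whole lines, so no other allocation shares a line with the buffer. The following is a sketch only: `cache_aligned_malloc` and `cache_aligned_free` are hypothetical names, the 16-byte `CACHE_LINE_SIZE` is an assumption to be checked against the device data sheet, and the actual cache flush or invalidate calls are not shown.

```c
#include <stdint.h>
#include <stdlib.h>

/* Assumed line size; verify against the data sheet for your part. */
#define CACHE_LINE_SIZE 16u

void *cache_aligned_malloc(size_t size)
{
    /* Round the request up to a whole number of cache lines. */
    size_t padded = (size + CACHE_LINE_SIZE - 1u) & ~(size_t)(CACHE_LINE_SIZE - 1u);

    /* Extra room: one pointer for bookkeeping plus worst-case alignment slack. */
    void *raw = malloc(padded + sizeof(void *) + CACHE_LINE_SIZE - 1u);
    if (raw == NULL) {
        return NULL;
    }

    uintptr_t start   = (uintptr_t)raw + sizeof(void *);
    uintptr_t aligned = (start + CACHE_LINE_SIZE - 1u) & ~(uintptr_t)(CACHE_LINE_SIZE - 1u);

    /* Remember the original pointer just below the aligned block. */
    ((void **)aligned)[-1] = raw;
    return (void *)aligned;
}

void cache_aligned_free(void *p)
{
    if (p != NULL) {
        free(((void **)p)[-1]);
    }
}
```

Blocks from `cache_aligned_malloc()` must be released with `cache_aligned_free()`, never with plain free(), because the pointer handed out is not the one malloc() returned.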

    Erik Friesen
    #13
    jrmymllr
    Super Member
    • Total Posts : 196
    • Reward points : 0
    • Joined: 2011/06/11 06:05:25
    • Location: MPLS
    • Status: offline
    Re: Troubleshooting malloc 2020/05/27 10:04:18 (permalink)
    0
    friesen
    Keep in mind that some string manipulators will malloc.  Also make sure you wrap calloc.

    Is it listed somewhere which ones use malloc? I do use a decent number of string functions that could be running in an interrupt, so that's something to look at. What's the reason for wrapping calloc? I don't use it directly, unless a library function does.
    friesen 
    If you are operating in uncached malloc'ed space, beware, this won't work unless you align to a cache line and do proper cache manipulations.  Malloc typically uses the area before or after to mark the location as a linked list, so any overrun or cache writebacks on spanned areas will trash memory and you'll get undefined behavior like tlb faults.

    I believe I'm ok there but I'm investigating to make sure.
    friesen  
    umm_malloc has a much better allocation viewer and info dump.  I wrapped it and use it instead of xc32's malloc. It does have to be reworked for greater than 512K allocation though. It does have defines where you can disable interrupts or rtos core for thread safety.



    Looks straightforward, thank you for bringing this up. What was the reason for wrapping it? Do you do a heap sanity check first, check for and disregard null pointers before free()ing, etc., or something entirely different?
    #14
    friesen
    Super Member
    • Total Posts : 2149
    • Reward points : 0
    • Joined: 2008/05/08 05:23:35
    • Location: Indiana, USA
    • Status: offline
    Re: Troubleshooting malloc 2020/05/27 11:47:15 (permalink)
    5 (2)
    calloc calls malloc, and isn't guaranteed to call wrapped malloc.  Inter-library calls especially.
     
    I wrapped malloc for some different reasons, 
    1. XC32 malloc/calloc/free is rather an opaque blob, who knows how heap fragmentation gets handled.
    2. Viewing memory info is nearly an undocumented feature of xc32, not very usable, it basically dumps into a serial port.
    3. I wanted more than one heap, especially one for the mz ddr.
    4. I wrote custom allocators that can auto add/shift to cache lines.
    5. Heap sanity checks are helpful in development.
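Since calloc is not guaranteed to go through a wrapped malloc, one option is to give calloc its own wrapper that explicitly calls the instrumented allocator. A minimal sketch, assuming a hypothetical `logged_malloc` that stands in for whatever wrapper is already in place:

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for an existing instrumented allocator (hypothetical). */
static unsigned long logged_calls;

static void *logged_malloc(size_t size)
{
    logged_calls++;
    return malloc(size);
}

/* calloc replacement that is guaranteed to use logged_malloc(). */
void *logged_calloc(size_t count, size_t size)
{
    /* Reject requests whose byte count would overflow size_t. */
    if (size != 0 && count > SIZE_MAX / size) {
        return NULL;
    }

    size_t total = count * size;
    void *p = logged_malloc(total);
    if (p != NULL) {
        memset(p, 0, total); /* calloc semantics: zero-filled memory */
    }
    return p;
}
```

This way every allocation shows up in the same log, whichever entry point the library used.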

    Erik Friesen
    #15
    jrmymllr
    Super Member
    • Total Posts : 196
    • Reward points : 0
    • Joined: 2011/06/11 06:05:25
    • Location: MPLS
    • Status: offline
    Re: Troubleshooting malloc 2020/05/29 14:57:33 (permalink)
    0
    Just an update after staying away from this for a couple of days. It was the stack... Who would have known an audio decoder library needed over 90K of stack (and 150K heap)!!?? This is why walking away from a problem for a day or two is helpful. I should have checked the stack days ago!
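A cheap way to check stack usage early is to paint the stack region with a known pattern at startup and later count how much of the pattern is still intact. The sketch below works on any word array, assuming a stack that grows downward toward lower addresses (as on MIPS). `stack_paint`, `stack_unused_words` and the fill value are hypothetical, and on a real target the region bounds would come from the linker script, painting only the part below the current stack pointer.

```c
#include <stddef.h>
#include <stdint.h>

#define STACK_FILL 0xA5A5A5A5u

/* Fill the (not yet used) stack region with a recognisable pattern. */
void stack_paint(volatile uint32_t *lowest, size_t words)
{
    for (size_t i = 0; i < words; i++) {
        lowest[i] = STACK_FILL;
    }
}

/* Count untouched words starting at the lowest address. With a
 * downward-growing stack this is the remaining headroom. */
size_t stack_unused_words(const volatile uint32_t *lowest, size_t words)
{
    size_t n = 0;
    while (n < words && lowest[n] == STACK_FILL) {
        n++;
    }
    return n;
}
```

If `stack_unused_words()` returns 0 after the decoder has run, the stack has reached the bottom of its region and has most likely run into whatever lies below it, which on many layouts is the heap.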
    #16
    friesen
    Super Member
    • Total Posts : 2149
    • Reward points : 0
    • Joined: 2008/05/08 05:23:35
    • Location: Indiana, USA
    • Status: offline
    Re: Troubleshooting malloc 2020/05/29 15:03:31 (permalink)
    4 (1)
    Which audio decoder?
     
    That is somewhat astounding.  Neither the Fraunhofer C++ AAC decoder nor the Microchip MP3 decoder uses anything close to that amount of stack.

    Erik Friesen
    #17
    jrmymllr
    Super Member
    • Total Posts : 196
    • Reward points : 0
    • Joined: 2011/06/11 06:05:25
    • Location: MPLS
    • Status: offline
    Re: Troubleshooting malloc 2020/05/29 17:11:09 (permalink)
    0
    friesen
    Which audio decoder?
     
    That is somewhat astounding.  Neither the Fraunhofer C++ AAC decoder nor the Microchip MP3 decoder uses anything close to that amount of stack.


    It's FAAD2 2.8.8, when it's decoding HE-AAC v2 (SBR and PS). When it's doing a stream without SBR or PS it's far less stack, heap and CPU cycles.
     
    I'm also using the Helix MP3 decoder and that's around 25K heap and hardly any stack. Previously I was using the Helix AAC decoder and that was 20-something KB heap, and again hardly any stack. But, when SBR was enabled heap usage went up by another 50-something KB.
     
    FAAD2 has large arrays as local variables in multiple functions. If it weren't for my desire for PS (parametric stereo) I would have stuck with Helix.
     
    #18