Hot!Harmony v2.05.01 Notes: File System

Author
jdeguire
Super Member
  • Total Posts : 437
  • Reward points : 0
  • Joined: 2012/01/13 07:48:44
  • Location: United States
  • Status: online
2018/03/09 13:58:54 (permalink)
5 (2)

Harmony v2.05.01 Notes: File System

I plan on making a few threads about some of the issues I continue to see in Harmony.  Many of these I have been able to correct on my end, but I decided to share these here because they may be useful to others (and because I'm just plain tired of having to fix certain things on every new version of Harmony we use).  I am making a few threads simply because one thread would be absolutely massive.
 
This thread pertains to Harmony's file system support.
 
  • The stock flash device drivers appear to be very naive in how they handle writes, performing a read-erase-modify-write cycle on every single sector that is written.  It also appears that the FAT FS code writes one 512-byte sector at a time (at least, this was true in older Harmony.  Version 2.05 got new FAT FS code.).  This means that writing sequentially to a single 4kB block of flash causes it to get erased eight times.  This will unnecessarily accelerate the rate a which the flash degrades have a significant impact on performance.

    We have custom flash drivers that use a 4kB cache to prevent this.  We actually perform the read-erase-modify-write cycle only when the cache contents are no longer valid (we're writing to a different block now, for example) or after one second has passed since the last write.

    Having native support for a scheme like this in the file system would be a great addition.
  • The FAT FS code requires the user of it to provide the proper low-level disk access routines.  These are found in "framework/system/fs/fat_fs/src/hardware_access/diskio.c".  The disk_ioctl() function provided by Harmony, however, is very minimal and supports only the GET_SECTOR_COUNT command.  It would appear that the FAT FS code would also need the GET_BLOCK_SIZE command.  I implemented my own function long, long ago back when Harmony's disk_ioctl() function was completely empty, which we've been using instead of what comes with Harmony.

    #include "system/fs/src/sys_fs_media_manager_local.h"
    extern SYS_FS_MEDIA_MANAGER_OBJ gSYSFSMediaManagerObj;

    DRESULT disk_ioctl (
            uint8_t pdrv, /* Physical drive nmuber (0..) */
            uint8_t cmd, /* Control code */
            void *buff /* Buffer to send/receive control data */
            )
    {
        SYS_FS_MEDIA *media = NULL;
        DRESULT dresult = RES_OK;

        media = &gSYSFSMediaManagerObj.mediaObj[pdrv];

        if(media->driverFunctions->mediaStatusGet(media->driverHandle))
        {
         switch(cmd)
         {
          case CTRL_SYNC:
          {
           /* Do nothing because disk_write() blocks until the write is complete. */

           break;
          }
          case GET_SECTOR_COUNT:
          {
           /* Get the number of FAT sectors the device can hold. For now, we'll assume
              a fixed sector size. */

           uint32_t *count = (uint32_t *)buff;

           // Use erasable region.
           *count = (media->mediaGeometry->geometryTable[SYS_FS_MEDIA_GEOMETRY_ERASE].blockSize *
                     media->mediaGeometry->geometryTable[SYS_FS_MEDIA_GEOMETRY_ERASE].numBlocks /
                     FAT_FS_MAX_SS);

           break;
          }
          case GET_SECTOR_SIZE:
          {
           /* Get the sector size for variable sector size configs. This is not used in
              Harmony since it uses a fixed sector size. */

           uint16_t *size = (uint16_t *)buff;
           *size = FAT_FS_MAX_SS;

           break;
          }
          case GET_BLOCK_SIZE:
          {
           /* Get the number of FAT sectors that must be erased at once. For now, we'll assume
              a fixed sector size. */

        uint32_t *size = (uint32_t *)buff;

           *size = (media->mediaGeometry->geometryTable[SYS_FS_MEDIA_GEOMETRY_ERASE].blockSize /
                    FAT_FS_MAX_SS);

           break;
          }
          case CTRL_TRIM:
          {
           /* This would normally be used to erase sectors, but Harmony does not make use of
              this (_USE_ERASE is not defined). Not only that, but the USB and SD Card drivers
              don't even implement erase functions. */

           dresult = RES_PARERR;
           break;
          }
          default:
          {
           dresult = RES_PARERR; // unknown command code
           break;
          }
         }
        }
        else
        {
         // Media not attached, so return error.
         dresult = RES_ERROR;
        }


        return dresult;
    }
  • The context pointer that is given to SYS_FS_EventHandlerSet() is completely ignored and NULL is passed to the event handler instead.  We implement the context ourselves, but this really should be a part of Harmony proper.  Also, if I remember correctly, you would need to then remove the const qualifier from context in the function signature because you'll get a compiler warning if you assign it to the (presumably non-const) global context variable.

    I should mention that the documentation states that that the context IS used and so is misleading.  This will lead to users accidentally dereferencing NULL pointers until they realize what is happening.
  • The SYS_FS_MEDIA_MANAGER_Tasks() function checks if the current media object was disconnected.  However, if its mediaState was SYS_FS_MEDIA_ANALYZE_FS, then it needs to clear the bufferInUse flag or else no other media processing can be done and essentially the whole service will stop working.

    // In SYS_FS_MEDIA_MANAGER_Tasks near line 1476

        if (mediaObj->isMediaDisconnected == true)
        {
            /* If the media driver was de-registered in this state, then the media
             * had use of the media buffer and no longer needs it. */
            if(SYS_FS_MEDIA_ANALYZE_FS == mediaObj->mediaState)
                gSYSFSMediaManagerObj.bufferInUse = false;

            /* If media driver has de-registered then switch to the DEREGISTERED
             * state and handle the media detach. */
            mediaObj->mediaState = SYS_FS_MEDIA_STATE_DEREGISTERED;
        }
  • The SYS_FS_FileOpen() function incorrectly treats the SYS_FS_FILE_OPEN_WRITE and the SYS_FS_FILE_OPEN_APPEND attributes the same.  The SYS_FS_FILE_OPEN_WRITE attribute also needs the FA_CREATE_ALWAYS flag.

    // Approx. line 881

        /* Convert the SYS_FS file open attributes to FAT FS attributes */
        switch(attributes)
        {
            case SYS_FS_FILE_OPEN_READ:
                mode = FA_READ;
                break;
            case SYS_FS_FILE_OPEN_WRITE:
                 mode = FA_WRITE | FA_OPEN_ALWAYS | FA_CREATE_ALWAYS;
                break;
            case SYS_FS_FILE_OPEN_APPEND:
                mode = FA_WRITE | FA_OPEN_ALWAYS;
                break;
            case SYS_FS_FILE_OPEN_READ_PLUS:
                mode = FA_READ | FA_WRITE;
                break;
            case SYS_FS_FILE_OPEN_WRITE_PLUS:
                mode = FA_READ | FA_WRITE | FA_OPEN_ALWAYS;
                break;
            case SYS_FS_FILE_OPEN_APPEND_PLUS:
                mode = FA_READ | FA_WRITE | FA_OPEN_ALWAYS;
                break;
            default:
                /** TODO */
                //mode = FA__ERROR;
                break;
        }
  • I don't really like that the file system layer blocks on transactions, though I guess it'd be pretty difficult to change that.  However, we use the Watchdog and long transactions can cause it to trip in the middle of a transaction.  This bit us with really slow USB flash drives (our watchdog is set to about a half-second), so I ended up clearing the watchdog in the USB stack.  However, if the file system service is going to be blocking, then it would be helpful to allow us to either just clear the watchdog periodically or to give us a callback to do some processing while we wait for the file transaction to complete.
#1

9 Replies Related Threads

    Tez
    Moderator
    • Total Posts : 477
    • Reward points : 0
    • Joined: 2006/10/04 11:09:05
    • Location: 0
    • Status: offline
    Re: Harmony v2.05.01 Notes: File System 2018/03/11 07:03:36 (permalink)
    5 (1)
    Thank you for providing your inputs. These will be reviewed and if accepted, these will be incorporated in a future release of Harmony.
     
    The blocking behavior of the file system and its side effects can be mitigated by using an RTOS. Providing a hook to call other tasks while the file system blocks can cause issues. Users typically forget the context from which these callback functions are called. In that, the file system as it now periodically calls this callback,  will behave like a system within a system. The alternative is to make the file read and write function non-blocking. This may increase the application and the file system complexity but appears to be a viable and safe approach.
    #2
    muellernick
    Super Member
    • Total Posts : 435
    • Reward points : 0
    • Joined: 2015/01/06 23:58:23
    • Location: Germany
    • Status: offline
    Re: Harmony v2.05.01 Notes: File System 2018/03/12 05:23:04 (permalink)
    0
    The blocking behavior of the file system and its side effects can be mitigated by using an RTOS.

     
    Or Microchip finally fixes that issue that I and a lot of others are complaining about since years.
    I can't find nice words to comment the answer "Use a RTOS", so I leave that part out.
    MCP designed Harmony to work as a big non-blocking state machine, so they should practice what they preach.
     
    I already had to give up a certain concept, after I found out that the file system can block up to 20 seconds writing to a micro-SD card.
     
    Nick
    #3
    jdeguire
    Super Member
    • Total Posts : 437
    • Reward points : 0
    • Joined: 2012/01/13 07:48:44
    • Location: United States
    • Status: online
    Re: Harmony v2.05.01 Notes: File System 2018/03/12 06:56:23 (permalink)
    0
    Tez
    The blocking behavior of the file system and its side effects can be mitigated by using an RTOS. Providing a hook to call other tasks while the file system blocks can cause issues. Users typically forget the context from which these callback functions are called. In that, the file system as it now periodically calls this callback,  will behave like a system within a system. The alternative is to make the file read and write function non-blocking. This may increase the application and the file system complexity but appears to be a viable and safe approach.

    That's a good point about the "system within a system", but it does reinforce the need for a non-blocking API.  I don't think the current functions have to change (ie. they can still block), but having a non-blocking set we can choose to use would be helpful.  A non-blocking API would presumably have a function we'd poll to see if a file IO operation is in progress, but another function to abort a transfer (perhaps because of a timeout) might be handy.  It'd be up to the user to deal with the ramifications of aborting the transfer, of course.
     
    EDIT:  Thinking about it a bit more, the polling function could probably double as the "tasks" function by just calling SYS_FS_MEDIA_MANAGER_TransferTasks(), which is what the non-blocking versions do.
    post edited by jdeguire - 2018/03/12 07:22:24
    #4
    moser
    Super Member
    • Total Posts : 392
    • Reward points : 0
    • Joined: 2015/06/16 02:53:47
    • Location: Germany
    • Status: offline
    Re: Harmony v2.05.01 Notes: File System 2018/03/12 08:24:37 (permalink)
    5 (1)
    You have a lot of very good points here, jdeguire!
    As far as I can tell, I agree with all of them.
    And I think I need to check some of these things for my project in detail, because this could explain a few things.
     
    jdeguire
    • [...] Also, if I remember correctly, you would need to then remove the const qualifier from context in the function signature because you'll get a compiler warning if you assign it to the (presumably non-const) global context variable.
    The const does not mean that the content at the pointed location is const. Only the pointer itself is const, so it cannot be changed to point to some other location. Furthermore, this is just a promise of SYS_FS_EventHandlerSet() that it wont change the pointer and this promise has only meaning to the SYS_FS_EventHandlerSet() function itself, because the context pointer is given by value to SYS_FS_EventHandlerSet(). So there is no problem, neither when you pass the context to the SYS_FS_EventHandlerSet() function, nor when a correct implementation of _SYS_FS_MEDIA_MANAGER_HandleMediaDetach() or _SYS_FS_MountVolume() would give it back to your handler.
     
    The only reason which I can imagine for the compiler to yell at you, is if you assign the returned context directly without any cast to your pointer variable. uintptr_t is defined as unsigned int, and the compiler might warn you, because you use an int value as a pointer. If you do an explicit cast to your pointer type, it should be silent.
     
    jdeguire
    • The SYS_FS_MEDIA_MANAGER_Tasks() function checks if the current media object was disconnected.  However, if its mediaState was SYS_FS_MEDIA_ANALYZE_FS, then it needs to clear the bufferInUse flag or else no other media processing can be done and essentially the whole service will stop working.
    I had some very rare cases where my microSD card slot stopped working completely, after a detaching and attaching the card. More detaching and attaching did not fix it, only device reset helped. However, I never was able to reproduce these rare cases. I wonder, if this was due to the problem described by you.
     
     
    jdeguire
    • I don't really like that the file system layer blocks on transactions, though I guess it'd be pretty difficult to change that.  However, we use the Watchdog and long transactions can cause it to trip in the middle of a transaction.  This bit us with really slow USB flash drives (our watchdog is set to about a half-second), so I ended up clearing the watchdog in the USB stack.  However, if the file system service is going to be blocking, then it would be helpful to allow us to either just clear the watchdog periodically or to give us a callback to do some processing while we wait for the file transaction to complete.

    This was also a huge problem for us. We were writing to a microSD card. And although writing less than 512 bytes per main loop, we had the file system blocking the main loop for over 2 seconds. Some other forum users have reported even longer times. I think somebody said, it is mostly happening when the file increases and a new block needs to get added, at an already fragmented storage device.
     
    After this we decided to create one huge regular interrupt for all the time critical stuff, monitored by a very sharp windowed deadman timer, and of course sitting at the right interrupt priority level. The main loop was reduced to include only stuff which is not time-critical, including slow blocking peripheral accesses and is monitored by a very lazy watchdog which gives several seconds time. And communication between the main loop and the interrupt was changed to be completely asynchronous, which requires of course lots of attention for interrupt safety.
     
     
     
     
     
     
     
    #5
    jdeguire
    Super Member
    • Total Posts : 437
    • Reward points : 0
    • Joined: 2012/01/13 07:48:44
    • Location: United States
    • Status: online
    Re: Harmony v2.05.01 Notes: File System 2018/03/12 08:46:46 (permalink)
    0
    moser
    The const does not mean that the content at the pointed location is const. Only the pointer itself is const, so it cannot be changed to point to some other location. Furthermore, this is just a promise of SYS_FS_EventHandlerSet() that it wont change the pointer and this promise has only meaning to the SYS_FS_EventHandlerSet() function itself, because the context pointer is given by value to SYS_FS_EventHandlerSet(). So there is no problem, neither when you pass the context to the SYS_FS_EventHandlerSet() function, nor when a correct implementation of _SYS_FS_MEDIA_MANAGER_HandleMediaDetach() or _SYS_FS_MountVolume() would give it back to your handler.
     
    The only reason which I can imagine for the compiler to yell at you, is if you assign the returned context directly without any cast to your pointer variable. uintptr_t is defined as unsigned int, and the compiler might warn you, because you use an int value as a pointer. If you do an explicit cast to your pointer type, it should be silent.

     
    Aahhh, I think that makes sense.  This is one of those things that I had to think about for a while, though I really should just try it to see what the compiler outputs for my own educational purposes.
     
    moser
    This was also a huge problem for us. We were writing to a microSD card. And although writing less than 512 bytes per main loop, we had the file system blocking the main loop for over 2 seconds. Some other forum users have reported even longer times. I think somebody said, it is mostly happening when the file increases and a new block needs to get added, at an already fragmented storage device.
     
    After this we decided to create one huge regular interrupt for all the time critical stuff, monitored by a very sharp windowed deadman timer, and of course sitting at the right interrupt priority level. The main loop was reduced to include only stuff which is not time-critical, including slow blocking peripheral accesses and is monitored by a very lazy watchdog which gives several seconds time. And communication between the main loop and the interrupt was changed to be completely asynchronous, which requires of course lots of attention for interrupt safety.

     
    Unfortunately, I suspect that making a non-blocking API will be difficult because it appears that the problem is in the FAT FS code that Microchip is using for Harmony.  That code seems to expect that transactions will block and will even issue a "CTRL_SYNC" command to the underlying layer to make sure that happens.  Microchip would either have to work around the issue somehow with the current code or modify it enough (or make their own FAT FS module) to allow for a non-blocking mode.
    #6
    NKurzman
    A Guy on the Net
    • Total Posts : 16430
    • Reward points : 0
    • Joined: 2008/01/16 19:33:48
    • Location: 0
    • Status: offline
    Re: Harmony v2.05.01 Notes: File System 2018/03/12 14:43:15 (permalink)
    0
    They can not fix the Current Code, it would break every current application.  ( They tried this on the I2C API once).
    The would need a New set of really non blocking functions and are you done calls.
    #7
    moser
    Super Member
    • Total Posts : 392
    • Reward points : 0
    • Joined: 2015/06/16 02:53:47
    • Location: Germany
    • Status: offline
    Re: Harmony v2.05.01 Notes: File System 2018/03/13 02:21:10 (permalink)
    0
    You can do it in the same way as with the HTTP module: 
     
    There was the HTTP module, and they added a new HTTP_NET module, which replaced it. Interface as similar as possible, but also with some significant differences, where needed. So moving from one interface to the other needs some work but it is not too difficult.
     
    Same would work with the file system. In addition to the blocking file system interface SYS_FS_ you could develop a new non-blocking SYS_FSNB_ , and each developer has to choose which of the two interfaces he uses. And if the interface functions have huge similarities (given the available functions, and their parameters), then this would help to switch to the non-blocking system. Of course you will definitely have some work, as all calls need now need more code and need to store states for waiting for the result.
     
    However, keep in mind, that it is not only about the non-blocking interface, but also about the underlying driver. If Microchip removes the 2 seconds delay from the interface, but then the SYS_FS_Tasks() / SYS_FS_MEDIA_MANAGER_Tasks() is blocking for 2 seconds in one single main loop, we would have no improvement at all. In other words: The two second task is acceptable, if it is splitted into 200 mini-tasks times 10ms, for example.
    #8
    muellernick
    Super Member
    • Total Posts : 435
    • Reward points : 0
    • Joined: 2015/01/06 23:58:23
    • Location: Germany
    • Status: offline
    Re: Harmony v2.05.01 Notes: File System 2018/03/13 04:24:55 (permalink)
    5 (1)
    The would need a New set of really non blocking functions and are you done calls.

     
    I'd be more than happy to use a new interface as long as it works. For me, there's only a single unit I'd have to change.
     
    The file size doesn't matter that much. You just have to pay attention that you don't seek from start to finish of the file.
    I do have a log file (20 MB) that is size limited (circular buffer in a file). Initially, I had additional information (head and tail position) in the same file. It was a disaster when writing (appending new records) and at the same time reading data to be sent and deleted). After moving that small portion into a separate file, things went much quicker.
     
    You can get horror blocking times when having a lot of files in one directory (it get bad at about 100 files). Open will take forever.
     
    Nick
    #9
    moser
    Super Member
    • Total Posts : 392
    • Reward points : 0
    • Joined: 2015/06/16 02:53:47
    • Location: Germany
    • Status: offline
    Re: Harmony v2.05.01 Notes: File System 2018/03/13 06:28:17 (permalink)
    0
    muellernickI'd be more than happy to use a new interface as long as it works. For me, there's only a single unit I'd have to change.

    I totally agree here. 
     
    muellernickThe file size doesn't matter that much. You just have to pay attention that you don't seek from start to finish of the file.
    I do have a log file (20 MB) that is size limited (circular buffer in a file). Initially, I had additional information (head and tail position) in the same file. It was a disaster when writing (appending new records) and at the same time reading data to be sent and deleted). After moving that small portion into a separate file, things went much quicker.
     
    It's funny how so many of us stumble over similar problems. I have an event log file. Append always at end, but this happens seldom. And deliver whole file by HTTP on request. But since the device literally runs non-stop for years, the file might get quite big (several MB).
     
    I implemented the HTTP access using the same file access (because you can not open the same file twice), and it worked with quite acceptable speed. But then I started the download of the file two times in parallel, and the performance completely broke down, with factor more than 100 or something like that. The read process, which is a bit behind, is always causing the file pointer to start completely new from the beginning. You can see where it happens, if you follow the SYS_FS_FileSeek() call to f_lseek() in ff.c. You can read it there in the comments:

                     (ofs - 1) / bcs >= (ifptr - 1) / bcs) { /* When seek to same or following cluster, */
                     fp->fptr = (ifptr - 1) & ~(bcs - 1);    /* start from the current cluster */
                     ofs -= fp->fptr;
                     clst = fp->clust;
                } else {                                   /* When seek to back cluster, */
                     clst = fp->sclust;                     /* start from the first cluster */

     
    Actually, has anybody checked what this _USE_FASTSEEK is doing? It looks a bit like it is exactly what we need here. And if it works the next question is, if it is reliable and which disadvantages does it have?
     
    PS: I found this link: http://irtos.sourceforge....oc/en/lseek.html 
     
    muellernickYou can get horror blocking times when having a lot of files in one directory (it get bad at about 100 files). Open will take forever.
    Thanks for the warning. I think I will have to change something ... 
    post edited by moser - 2018/03/13 06:44:12
    #10
    Jump to:
    © 2018 APG vNext Commercial Version 4.5