We postponed implementing the CRCs for the microSD cards, as we currently have other, more important issues.
We fixed the problems by selecting the right microSD cards, but I hope to come back to this at a later point.
I experimented a bit with my stress test, which uses SPI mode.
Reducing the SPI speed from 20 MHz to 2 MHz made no difference for the microSD cards that show errors. The errors just take longer to appear.
We also verified that at some point a random sector gets erased, and the SD card driver does not see any error, not even at the lowest levels. As far as we can tell, the implementation follows the SD specification and performs all necessary checks. Only the optional CRCs are skipped. Not one of the checks fails, yet we get empty sectors.
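For reference, here is a minimal sketch of the two CRCs that SD cards use and that are optional in SPI mode: CRC7 (polynomial x^7 + x^3 + 1) over the first five bytes of a command frame, and CRC16-CCITT (polynomial 0x1021, initial value 0) over a data block. The function names are my own and are not part of FatFS or of our driver. The card also has to be told to check CRCs (CMD59, CRC_ON_OFF), which is not shown here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* CRC7 over a byte buffer, polynomial x^7 + x^3 + 1 (0x09), initial value 0.
 * Returns the 7-bit CRC in bits 6..0. */
uint8_t sd_crc7(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    uint8_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        uint8_t d = data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc <<= 1;                 /* old CRC bit 6 moves into bit 7 */
            if ((d ^ crc) & 0x80) {    /* feedback = data bit XOR CRC top bit */
                crc ^= 0x09;
            }
            d <<= 1;
        }
    }
    return crc & 0x7F;
}

/* Sixth byte of a command frame: 7-bit CRC followed by the end bit '1'. */
uint8_t sd_cmd_crc_byte(const uint8_t frame5[5])
{
    return (uint8_t)((sd_crc7(frame5, 5) << 1) | 1u);
}

/* CRC16-CCITT (XMODEM variant) as used for SD data blocks:
 * polynomial 0x1021, initial value 0, MSB first, no final XOR. */
uint16_t sd_crc16(const uint8_t *data, size_t len)
{
    assert(data != NULL || len == 0);
    uint16_t crc = 0;
    for (size_t i = 0; i < len; i++) {
        crc ^= (uint16_t)((uint16_t)data[i] << 8);
        for (int bit = 0; bit < 8; bit++) {
            if (crc & 0x8000) {
                crc = (uint16_t)((crc << 1) ^ 0x1021);
            } else {
                crc = (uint16_t)(crc << 1);
            }
        }
    }
    return crc;
}
```

As a sanity check, the well-known CMD0 frame 40 00 00 00 00 yields the CRC byte 0x95 and the CMD8 frame 48 00 00 01 AA yields 0x87. A bitwise loop like this is slow for 512-byte blocks, so a table-driven variant would probably be preferable on a small MCU.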
If this happens within the data sectors, then only your data is damaged. But if it happens elsewhere, for example in the directory entries of the FAT, the consequences are much worse. The FatFS driver keeps only one sector in its cache. However, after opening files, it remembers where file entries and other structures are located. When FatFS needs to change one of them, it just reads in the corresponding sector without checking what was read, overwrites only the bytes that need to change, and writes the whole sector back. We did not verify this, but we believe that this combination of silently erased sectors and blind read-modify-write could in theory explain almost all of the different phenomena we have seen.
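To illustrate the kind of check that is missing in that read-modify-write path, here is a minimal sketch of a guard that could be run on a sector right after reading it and before patching bytes in it. This is not FatFS code. The helper name is hypothetical, and it assumes that an "empty" sector reads back as all 0x00 or all 0xFF, which we have not stated above and which may differ between cards.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns true if the buffer consists entirely of
 * 0x00 bytes or entirely of 0xFF bytes (assumed "erased" patterns). */
bool sector_looks_erased(const uint8_t *buf, size_t len)
{
    assert(buf != NULL || len == 0);
    if (len == 0) {
        return false;
    }
    const uint8_t fill = buf[0];
    if (fill != 0x00 && fill != 0xFF) {
        return false;
    }
    for (size_t i = 1; i < len; i++) {
        if (buf[i] != fill) {
            return false;
        }
    }
    return true;
}
```

Such a check only makes sense for sectors that are known to contain metadata, for example a directory sector holding the entry of a currently open file. A blank data sector or an unused directory sector can legitimately be all zeros, so the check cannot be applied to every read.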
I also tested several microSD card brands, and the results varied. We found a few other brands that also produced errors within minutes. We even had one brand that almost never made it through the SD initialization phase, and when it did, it failed immediately afterwards. But we also found some brands that worked perfectly, and these came from manufacturers that we believe are probably more trustworthy anyway. Since the microSD cards in our products are never replaced, this solves our problem completely.
When I started with the SD I was surprised by the power draw spike on start up.
I know I have a power draw spike, but it is not coming from the microSD card. Power is really not an issue on my board. My external power supply delivers 10 A at 48 V. With the oscilloscope I don't see any spike or dip in my 3.3 V rail on startup, nor any difference in behavior depending on whether a microSD card is present.