Using Asserts in Embedded Systems | Interrupt

I can definitely explain a little more

Don’t assert on operations that depend on the hardware behaving appropriately. If a sensor says it will return a value between 0-100, it’s probably best not to assert when it’s above 100, because you can never trust today’s cheap hardware.

Asserts are used primarily to validate that the code you or others on your team write is behaving correctly. Asserts should generally not be used when you are trying to validate code from other developers. You should check return values, validate them, and if they are invalid, raise errors or error codes.

This also applies to hardware. I’ve worked with vendor chips where the documentation says “THIS WILL NEVER HAPPEN”, and of course it happens every now and then. Just because a chip is misbehaving or returns invalid values doesn’t mean one should bring down the whole system, especially if the system isn’t reliant on it behaving 100% correctly (and the bug is likely in software anyway, not hardware).

If something comes back from hardware that is invalid, a soft reset of the chip or the vendor stack is generally enough, and shouldn’t require asserting.
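To make that concrete, here is a minimal sketch of "validate, reset, retry" instead of asserting. Everything named here is made up for illustration: `sensor_ops_t`, `sensor_read_checked`, the 0-100 range, and the retry limit are assumptions, and the `fake_*` functions only stand in for a flaky chip so the snippet can run on a host.

```c
#include <stdbool.h>
#include <stdint.h>

#define SENSOR_MAX_VALUE   100u /* documented upper bound (assumed) */
#define SENSOR_MAX_RETRIES 3u   /* arbitrary retry limit (assumed) */

/* Hypothetical driver hooks; a real project would plug in vendor calls. */
typedef struct {
    uint32_t (*read_raw)(void);
    void (*soft_reset)(void);
} sensor_ops_t;

typedef enum {
    SENSOR_OK = 0,
    SENSOR_ERR_INVALID,
} sensor_status_t;

/* Read the sensor, and on an out-of-range value soft reset the chip and
 * retry. Returns an error code to the caller instead of asserting. */
sensor_status_t sensor_read_checked(const sensor_ops_t *ops, uint32_t *out)
{
    for (uint32_t attempt = 0; attempt < SENSOR_MAX_RETRIES; attempt++) {
        uint32_t raw = ops->read_raw();
        if (raw <= SENSOR_MAX_VALUE) {
            *out = raw;
            return SENSOR_OK;
        }
        ops->soft_reset(); /* invalid reading: reset the chip, try again */
    }
    return SENSOR_ERR_INVALID; /* caller decides how to degrade */
}

/* Stand-in for a misbehaving chip: out of range until it is reset. */
static bool fake_sensor_wedged = true;
static uint32_t fake_sensor_resets = 0;

static uint32_t fake_read_raw(void)
{
    return fake_sensor_wedged ? 250u : 42u;
}

static void fake_soft_reset(void)
{
    fake_sensor_wedged = false;
    fake_sensor_resets++;
}
```

If the reading is still invalid after the retries, the caller gets `SENSOR_ERR_INVALID` and can log it, disable the feature, or fall back to a default, rather than rebooting the whole device.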

Don’t assert on the contents of data read from persistent storage, unless it’s guaranteed to be valid. The data read from flash or a filesystem could be corrupted.

This is a fun one. It’s again mostly related to vendor code and hardware. When you read data from a flash chip, there is always the chance that the read subtly failed, whether it’s due to previous corruption of the flash chip, a bad filesystem, incorrect timings, a previously failed erase, etc. You should always be able to stomach and recover from an invalid flash read or invalid flash contents.

Most systems handle this by validating the contents (not asserting!), and if invalid, erasing the flash chip or bad sectors and starting anew.
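A rough sketch of that validate-then-recover flow, assuming a small settings record protected by a magic number and a CRC-32. The `settings_t` layout, the magic value, and the function names are all hypothetical, and the record is passed in as a plain struct so the snippet runs without a real flash driver.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SETTINGS_MAGIC 0x53455431u /* "SET1", made up for this sketch */

typedef struct {
    uint32_t magic;
    uint8_t  brightness;  /* valid range 0-100 (assumed) */
    uint8_t  reserved[3];
    uint32_t crc;         /* CRC-32 over all preceding bytes */
} settings_t;

/* Plain bitwise CRC-32 (reflected, polynomial 0xEDB88320). */
static uint32_t crc32_calc(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

static uint32_t settings_crc(const settings_t *s)
{
    return crc32_calc((const uint8_t *)s, offsetof(settings_t, crc));
}

void settings_set_defaults(settings_t *s)
{
    memset(s, 0, sizeof(*s));
    s->magic = SETTINGS_MAGIC;
    s->brightness = 50;
    s->crc = settings_crc(s);
}

/* Validate with ordinary checks, not asserts. */
bool settings_is_valid(const settings_t *s)
{
    return s->magic == SETTINGS_MAGIC &&
           s->brightness <= 100 &&
           s->crc == settings_crc(s);
}

/* Returns true if the stored record was usable. On false, `out` holds
 * defaults and the caller should erase the sector and write them back. */
bool settings_load(const settings_t *stored, settings_t *out)
{
    if (settings_is_valid(stored)) {
        *out = *stored;
        return true;
    }
    settings_set_defaults(out);
    return false;
}
```

The key point is that a corrupted or erased (all `0xFF`) record takes the same path as a fresh device: defaults are loaded and the device keeps booting.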

A common boot loop issue occurs when a developer chooses to assert on the contents of the flash chip at boot (maybe provisioning data). When the device boots, let’s say it reads the device serial number to print to a screen. It reads that value from flash, asserts the length, the assert fails due to corruption, and the device reboots. When the device wakes up again, it will do the exact same thing.

A common way to mitigate this is to add a boot counter to detect boot loops, as mentioned above by @cyril, but the point still stands.
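For reference, a boot counter can be as small as the sketch below. It assumes the `boot_counter_t` struct lives somewhere that survives a reset (a noinit RAM section or a retained register, for example); the names, magic value, and threshold of 3 are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define BOOT_COUNT_MAGIC    0xB007C0DEu /* marks the counter as initialized */
#define BOOT_LOOP_THRESHOLD 3u          /* arbitrary limit (assumed) */

/* Assumed to be placed in memory that is preserved across resets. */
typedef struct {
    uint32_t magic;
    uint32_t count;
} boot_counter_t;

/* Call early in boot. Returns true when the device has rebooted more than
 * BOOT_LOOP_THRESHOLD times without being marked stable, which is the cue
 * to enter a safe mode instead of running the normal boot path. */
bool boot_counter_on_boot(boot_counter_t *bc)
{
    if (bc->magic != BOOT_COUNT_MAGIC) {
        /* Cold boot or garbage contents: start over rather than assert. */
        bc->magic = BOOT_COUNT_MAGIC;
        bc->count = 0;
    }
    bc->count++;
    return bc->count > BOOT_LOOP_THRESHOLD;
}

/* Call once the system has been up and healthy for a while. */
void boot_counter_mark_stable(boot_counter_t *bc)
{
    bc->count = 0;
}
```

Note that the counter itself is validated with a magic check and not an assert, for the same reason as everything above: its contents come from memory that may hold garbage.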

Don’t assert things that aren’t 100% reliable and consistent, and if that doesn’t hold true, make sure you really want to reboot the system if the assert fails.