What kinds of HardFaults are traceable, and how do I trace them?

Hello Memfault community!

I’m trying to implement tracing on my custom nRF91-based board. When running the sample code on the nRF9160 DK, I found that some hardfaults are not traceable (the trace returns 0xaaaaaaaa).

When I try the same thing on my own board, I can’t trace any hardfaults at all (for example, when I reproduce the NULL pointer dereference error).

Please tell me whether this behavior is expected and how I can trace the hardfaults correctly.

Thank you all in advance!

Hi @duyanh-y4n, thanks for the report and welcome to the community!

The memfault-firmware-sdk collects the exact registers at the time a fault takes place, so we should be able to recover full stack unwinds.

The pattern you are seeing, 0xaaaaaaaa, is interesting because that’s the fill pattern Zephyr writes when initializing stacks. If the stack pointer gets corrupted in a task context, this could suggest the registers are being loaded back from an incorrect region when a context switch takes place.
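As a rough illustration of that reasoning, here is a minimal sketch in C that checks whether a captured register dump is mostly the 0xaaaaaaaa fill pattern, which would hint that the values were loaded from stack memory that was never written. The helper names (count_fill_words, frame_looks_unwritten) and the "at least half the words" threshold are made up for this example and are not part of the memfault-firmware-sdk or Zephyr. It also assumes the stacks were pre-filled with 0xaa bytes, as Zephyr does when CONFIG_INIT_STACKS is enabled.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: stacks are pre-filled with 0xaa bytes
 * (Zephyr does this when CONFIG_INIT_STACKS is enabled). */
#define STACK_FILL_WORD 0xaaaaaaaaU

/* Hypothetical helper: count how many words in a captured
 * register dump are exactly the stack fill pattern. */
static size_t count_fill_words(const uint32_t *regs, size_t n)
{
    size_t hits = 0;

    for (size_t i = 0; i < n; i++) {
        if (regs[i] == STACK_FILL_WORD) {
            hits++;
        }
    }
    return hits;
}

/* Hypothetical heuristic: returns 1 when at least half of the
 * captured words are the fill pattern, 0 otherwise. The threshold
 * is arbitrary and only meant for illustration. */
static int frame_looks_unwritten(const uint32_t *regs, size_t n)
{
    return n > 0 && count_fill_words(regs, n) * 2 >= n;
}
```

A single 0xaaaaaaaa in a dump can be a coincidence (for example, a register that simply was never used), which is why the sketch looks at the share of matching words instead of a single hit.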

A few quick questions to help us investigate further:

  • Is the fault you are looking at manually triggered? If so, what’s the code you are executing to trigger it?
  • What version of the nRF Connect SDK are you using?

Thanks,
-Chris

Hi @chrisc

Thank you for the helpful source code reference. The problem happened during system initialization, before the main function, so the context was possibly corrupted. It was a bus hard fault.

It’s OK for now, since this problem should never happen in production.

Thank you for your help.