Glad to hear you found it useful! Great questions. I’ve added some thoughts inline below.
About the second patch: I tried to replace the ROOT_DIR := $(abspath .) assignment with ROOT_DIR := . and not to apply patch 03. I would have expected it to work instead it doesn’t but I don’t understand why.
By default, the GNU DWARF writer will emit the DW_AT_comp_dir attribute (the directory the file was compiled in) as an absolute path. You need to provide the -fdebug-prefix-map compiler option (patch 03) to change this.
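As a rough illustration of what that option does, here is a minimal sketch that builds the flag by hand. It assumes the build is run from the project root; the variable names and the commented-out compile step (main.c, main.o) are hypothetical and not taken from the patches.

```shell
# Assumption: the current directory is the project root.
ROOT_DIR="$(pwd)"

# -fdebug-prefix-map=OLD=NEW rewrites the OLD path prefix to NEW in the debug info,
# so DW_AT_comp_dir no longer contains the absolute checkout location.
CFLAGS="-g -fdebug-prefix-map=${ROOT_DIR}=."
echo "$CFLAGS"

# Hypothetical compile step using the flags above:
# arm-none-eabi-gcc $CFLAGS -c main.c -o main.o
```

With this in place, two checkouts in different directories should both record "." as the compilation directory.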
You can examine particular attributes by using the --debug-dump option of arm-none-eabi-readelf:
arm-none-eabi-readelf --debug-dump build/nrf52.elf | grep DW_AT_COMP -i
<15> DW_AT_comp_dir : (indirect string, offset: 0x456f): /private/tmp/dev/interrupt/example/reproducible-build
If from two different directories you compile different elf, but the binaries are the same, then what’s the problem?
So even changing just one gcc parameter to drive debugging (like -g3 or -g2) would change the SHA1 without changing the binary functionally.
Nothing would be the matter per se. You would need to set up your own tooling to compute an MD5 over the final binary and use that to verify that the same build has been generated, but that wouldn’t be hard to do.
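A minimal sketch of that kind of tooling might look like the following. The helper names are hypothetical, it swaps in SHA-256 for MD5, and it assumes GNU coreutils' sha256sum is available (on macOS, shasum -a 256 plays the same role).

```shell
# Hypothetical helper: print the SHA-256 digest of a file.
# Assumption: GNU coreutils `sha256sum` is on the PATH.
file_digest() {
    sha256sum "$1" | cut -d ' ' -f 1
}

# Hypothetical helper: succeed only if the two files have identical digests.
same_build() {
    [ "$(file_digest "$1")" = "$(file_digest "$2")" ]
}

# Example usage (paths are placeholders):
# arm-none-eabi-objcopy -O binary build/nrf52.elf build/nrf52.bin
# same_build build/nrf52.bin /path/to/other/checkout/build/nrf52.bin && echo "binaries match"
```

Hashing the raw image produced by objcopy, rather than the ELF, keeps the debug sections out of the comparison.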
While it’s true just changing a compiler flag related to debugging won’t change the binary, I do think it’s nice to know that the exact same debug information is being generated for a few reasons:
- If the debug (DWARF) information emitted changes, different developers may get different results when debugging locally. For example, depending on what settings were changed, one build may be able to correctly display backtraces and another may not.
- If you are collecting cores / automating analysis of crashes (e.g. with gdb-python scripts), you’ll want a way to get the same debug info so you get the same analysis results. For example, changing from -g3 to -g2 removes macro definitions from the ELF, so if an analysis script was looking up the value of a #define, it will no longer work.
Finally, I ask you how risky it is to embed the GNU build ID into the binary, since it also depends on the debug sections.
I think it’s quite safe to store this information. A lot of effort has been put into the GNU project to make sure the debug information emitted is reproducible. The build ID itself was originally added to the GNU project to aid in being able to reliably look up the symbols for a Linux core dump.
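If you want to compare build IDs between two builds, one possible approach is to pull the ID out of the notes that readelf prints. This is only a sketch: the helper name is hypothetical, and it assumes the "Build ID:" line format used by GNU readelf -n.

```shell
# Hypothetical helper: extract the hex build ID from `readelf -n` output on stdin.
# Assumption: the note is printed on a line of the form "    Build ID: <hex>".
extract_build_id() {
    sed -n 's/^ *Build ID: *//p'
}

# Example usage (requires an ELF that was linked with a build ID note):
# arm-none-eabi-readelf -n build/nrf52.elf | extract_build_id
```

Running this against the ELF from each checkout and comparing the two strings would show whether the builds carry the same ID.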
Also, just in case you didn’t see it, we talk about using the build ID in embedded applications in a bit more detail here.