Tracking Firmware Code Size | Interrupt

discobot · March 18, 2020, 4:04am

This is the terminal output of my nightmares. It frequently means a halt in productivity and sends engineers scrambling to save 100 bytes wherever they can, all while trying to meet the deadlines and requirements set forth by the Product team.

This is a companion discussion topic for the original entry at https://interrupt.memfault.com/blog/code-size-deltas

geky · March 19, 2020, 1:24am

Hi Tyler, great article!

One thing we did for littlefs was piggy-back the code-size measurements onto GitHub’s CI commit statuses. GitHub can store multiple statuses for each commit, which makes it easy to store various CI measurements in one place that can be diffed at CI time. As an added plus the status is also an easy way to consume the information as a human.

The only downside is that you may be rate-limited by GitHub if you try to download all of the statuses for a large repository at once.

Here’s the script we use to update statuses:

https://github.com/ARMmbed/littlefs/blob/ce2c01f098f4d2b9479de5a796c3bb531f1fe14c/.travis.yml#L305-L312

curl -u "$GEKY_BOT_STATUSES" -X POST \
    https://api.github.com/repos/$TRAVIS_REPO_SLUG/statuses/${TRAVIS_PULL_REQUEST_SHA:-$TRAVIS_COMMIT} \
    -d "{
        \"context\": \"$STAGE/$NAME\",
        \"state\": \"success\",
        \"description\": \"${STATUS:-Passed}\",
        \"target_url\": \"https://travis-ci.org/$TRAVIS_REPO_SLUG/jobs/$TRAVIS_JOB_ID\"
    }"
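
For the diffing side, a CI job would read the parent commit's statuses back and compare. The sketch below is not the littlefs script. It assumes the size is stored in the status description as a string such as "8192 B", and the helper names and the "build/thumb" context are hypothetical. It works on the JSON returned by GitHub's combined-status endpoint (GET /repos/{owner}/{repo}/commits/{ref}/status), so it can be tried without network access.

```python
import json
import re

# Assumption: the description embeds the size as "<digits> B", e.g. "8192 B".
SIZE_RE = re.compile(r"(\d+)\s*B\b")


def size_from_statuses(combined_status, context):
    """Return the size in bytes stored under `context`, or None if absent."""
    for status in combined_status.get("statuses", []):
        if status.get("context") != context:
            continue
        match = SIZE_RE.search(status.get("description") or "")
        if match:
            return int(match.group(1))
    return None


def build_status(context, size, parent_size=None):
    """Build the JSON body for POST /repos/{owner}/{repo}/statuses/{sha}."""
    description = f"{size} B"
    if parent_size is not None:
        # Append the signed delta against the parent commit.
        description += f" ({size - parent_size:+d} B)"
    return {"context": context, "state": "success", "description": description}


# Hypothetical response for the parent commit, trimmed to the fields used here.
parent = json.loads(
    '{"state": "success", "statuses": ['
    '{"context": "build/thumb", "state": "success", "description": "8192 B"}]}'
)
parent_size = size_from_statuses(parent, "build/thumb")
payload = build_status("build/thumb", 8292, parent_size)
print(json.dumps(payload))
```

Keeping the parsing separate from the HTTP call makes it easy to cache the parent's statuses and stay clear of the rate limit mentioned above.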

francois · March 19, 2020, 2:09am

Wow @geky this is very cool!

mastupristi · March 28, 2020, 3:57pm

Thank you for that very interesting article.

We have been using the NXP i.MX RT105x controllers in many of our products for a long time now.
The memory layout of these processors is much more complex than that of a classic microcontroller (which usually has only flash + SRAM).
They have only built-in SRAM; the flash is external (we use 2 MB QSPI chips).
The SRAM is in turn divided into ITCM, DTCM and OnChipRAM (OCRAM). Although the sizes of these banks can be configured, this is fixed statically for a given firmware.
For us, the output of the size command is not enough, because .text, .data and .bss are each split between flash and the three RAM memories. For example, .text is partly in flash and partly in ITCM (and hopefully in no other memory), while .data is in both DTCM and OCRAM.
What we’re interested in is knowing how full the memory regions are:

Memory region         Used Size  Region Size  %age Used
     BOARD_FLASH:      280964 B         2 MB     13.40%
        SRAM_DTC:      122580 B       128 KB     93.52%
        SRAM_ITC:      106112 B       128 KB     80.96%
         SRAM_OC:       61444 B       256 KB     23.44%

We would also like to know, for each memory region, how the allocated part is divided between .text, .data and .bss.
What strategy would you follow?

Our long-term goal would be to have a map of occupation of the memory regions that informs us how much the different compile units fill.

best regards
Max

chrisc · March 28, 2020, 6:02pm

What strategy would you follow?

Our long-term goal would be to have a map of occupation of the memory regions that informs us how much the different compile units fill.

Under the hood, arm-none-eabi-size just walks through the ELF sections (the output you would see from arm-none-eabi-readelf -S <your_file.elf>) and inspects the sh_flags in each section header. The rules it uses are:

If the section is not allocated (SHF_ALLOC is clear), don’t count it.
If the section is executable (SHF_EXECINSTR is set) or not writable (SHF_WRITE is clear), add it to the text count.
Otherwise, if the section occupies no space in the file (sh_type is SHT_NOBITS), add it to the bss count; else add it to the data count.

You could use pyelftools to load the elf and compute text/data/bss sums for each memory region you want using those rules.
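
A minimal sketch of that idea follows. The memory map is hypothetical (a generic flash + RAM part) and would need to be replaced with the MEMORY entries from your own linker script; the function names are made up for illustration. The classification is kept separate from the file reading so it can be tried without an ELF file at hand.

```python
# Section flag bits from the ELF specification.
SHF_WRITE = 0x1
SHF_ALLOC = 0x2
SHF_EXECINSTR = 0x4

# Hypothetical memory map (name, start address, length in bytes).
REGIONS = [
    ("FLASH", 0x08000000, 128 * 1024),
    ("RAM", 0x20000000, 32 * 1024),
]


def classify(flags, is_nobits):
    """Apply the rules above; return 'text', 'data', 'bss' or None."""
    if not (flags & SHF_ALLOC):
        return None
    if (flags & SHF_EXECINSTR) or not (flags & SHF_WRITE):
        return "text"
    return "bss" if is_nobits else "data"


def region_of(addr):
    """Return the name of the region containing `addr`, or None."""
    for name, start, length in REGIONS:
        if start <= addr < start + length:
            return name
    return None


def sum_sections(sections):
    """sections: iterable of (sh_addr, sh_size, sh_flags, is_nobits)."""
    totals = {name: {"text": 0, "data": 0, "bss": 0} for name, _, _ in REGIONS}
    for addr, size, flags, is_nobits in sections:
        kind = classify(flags, is_nobits)
        region = region_of(addr)
        if kind and region and size:
            totals[region][kind] += size
    return totals


def sections_from_elf(path):
    """Read the section headers with pyelftools (imported lazily)."""
    from elftools.elf.elffile import ELFFile

    with open(path, "rb") as f:
        return [
            (s["sh_addr"], s["sh_size"], s["sh_flags"], s["sh_type"] == "SHT_NOBITS")
            for s in ELFFile(f).iter_sections()
        ]
```

Typical use would be `sum_sections(sections_from_elf("firmware.elf"))`. Note that sh_addr is the run-time address, so an initialized .data section is attributed to RAM even though its initial values also occupy flash.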

mastupristi · March 29, 2020, 9:24pm

You could use pyelftools to load the elf and compute text/data/bss sums for each memory region you want using those rules.

I wrote two Python scripts to compute the sums: elf_test_rt105x.py and elf_test_stm32.py.
One is for STM32 and the other is for RT105x. The only difference between the two is the map of the memory regions and their addresses.
I tried it on an ELF file for STM32.
The linker output is:

Memory region         Used Size  Region Size  %age Used
           FLASH:       90124 B       128 KB     68.76%
             RAM:       31948 B        32 KB     97.50%

Python script output is:

Memory region             .text        .data         .bss        Total
           FLASH:       88916 B        448 B          0 B      89364 B
             RAM:           0 B        760 B      31188 B      31948 B

As you can see, the RAM total is identical. For flash, the size of the RAM-resident .data has to be added, because its initial values are stored in flash: 89364 + 760 = 90124, which matches too.
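
The same check written out, using only the STM32 figures quoted above. The one assumption is that the RAM-resident .data is the only thing copied from flash at startup.

```python
# Figures copied from the linker output and the script output above.
linker_used = {"FLASH": 90124, "RAM": 31948}
script = {
    "FLASH": {"text": 88916, "data": 448, "bss": 0},
    "RAM": {"text": 0, "data": 760, "bss": 31188},
}

# RAM holds .data and .bss at run time.
ram_total = sum(script["RAM"].values())

# Flash holds its own sections plus the load image of the RAM-resident .data
# (assumption: nothing else is copied from flash to RAM at startup).
flash_total = sum(script["FLASH"].values()) + script["RAM"]["data"]

print(ram_total == linker_used["RAM"], flash_total == linker_used["FLASH"])
```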

I also have another example for RT105x:
linker output:

Memory region         Used Size  Region Size  %age Used
     BOARD_FLASH:      280964 B         2 MB     13.40%
        SRAM_DTC:      122580 B       128 KB     93.52%
        SRAM_ITC:      106112 B       128 KB     80.96%
         SRAM_OC:       61444 B       256 KB     23.44%

python script output:

Memory region             .text        .data         .bss        Total
           FLASH:      171876 B        480 B          0 B     172356 B
            ITCM:      106112 B          0 B          0 B     106112 B
            DTCM:           0 B       1796 B     120784 B     122580 B
           OCRAM:           0 B          0 B      61444 B      61444 B

The RAM region totals all match here as well.

I have more questions, though:
Does the ELF file embed (or could it embed) information about the memory regions? That way I could recover them from the file itself instead of hard-coding them in the Python scripts each time. Is this information stored in the program headers (arm-none-eabi-readelf -l <file.elf>)?

Is there a reliable way to calculate the amount of flash actually occupied? I mean considering also initialized data and the portion of .text that is in RAM.

Could you review my Python code? I’m not that skilled, so I’d love the opinion of someone who knows more about it than I do.
I try to adjust for section alignment, but this only works if the sections are sorted by address within the ELF file. Is that always the case?

Finally, for our needs it is often useful to know how big .rodata is, so I added the --rodata parameter to the script. Did I correctly identify the sections containing read-only data (both the SHF_EXECINSTR and SHF_WRITE flags cleared)?

best regards
Max

mixandmatch · May 10, 2020, 3:32pm

Hi, thank you very much for your post; it was a great suggestion.
I’m trying to implement a similar approach in my project.
I was wondering what SQL query you used (in Redash) to get the diff between a revision’s sizes and its parent’s sizes?

tyler · May 11, 2020, 1:19am

Great question! I’ve pasted it here:


  SELECT
    codesizes.committed_at,
    - (row_number() OVER (ORDER BY codesizes.created_at DESC)) AS i,
    codesizes.revision,
    codesizes.message,
    codesizes.text,
    codesizes.data,
    codesizes.bss,
    codesizes.parent_revision,
    parents.text AS parent_text,
    parents.data AS parent_data,
    parents.bss AS parent_bss,
    codesizes.text - parents.text AS text_delta,
    codesizes.data - parents.data AS data_delta,
    codesizes.bss - parents.bss AS bss_delta
  FROM
    codesizes
    INNER JOIN codesizes parents ON codesizes.parent_revision = parents.revision
  ORDER BY i DESC
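
To try the self-join without a database server, here is a small sketch that uses Python's built-in sqlite3 module as a stand-in for PostgreSQL. The table is cut down to the columns the deltas need, the two sample rows are invented, and the window function from the query above is left out to keep it short.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE codesizes ("
    "revision TEXT PRIMARY KEY, parent_revision TEXT, "
    "text INTEGER, data INTEGER, bss INTEGER)"
)

# Invented sample data: a root commit and one child commit.
conn.executemany(
    "INSERT INTO codesizes VALUES (?, ?, ?, ?, ?)",
    [
        ("aaa", None, 1000, 100, 200),
        ("bbb", "aaa", 1040, 100, 180),
    ],
)

# Join each revision to its parent row and subtract the sizes.
deltas = conn.execute(
    "SELECT c.revision, "
    "       c.text - p.text AS text_delta, "
    "       c.data - p.data AS data_delta, "
    "       c.bss - p.bss AS bss_delta "
    "FROM codesizes AS c "
    "INNER JOIN codesizes AS p ON c.parent_revision = p.revision"
).fetchall()

print(deltas)
```

Because it is an inner join, a revision whose parent has no row in the table (here the root commit) does not appear in the result.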

agausmann · November 4, 2024, 6:20pm

Readers should be aware, your SQL code examples are vulnerable to SQL injection! See the documentation for psycopg about properly passing parameters to SQL queries. Putting aside the security implications, even during legitimate use cases, this service will break the moment you use apostrophes in a commit message.

If you use Python’s built-in string formatting, strings are not escaped before being substituted into the query. Instead, pass the parameters separately to the psycopg execute method and use its placeholder syntax in the query. It will automatically convert the parameters to valid, escaped SQL.
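
A small sketch of the difference, using Python's built-in sqlite3 module as a stand-in so that it runs without a PostgreSQL server. The table and values are hypothetical. sqlite3 uses ? as its placeholder where psycopg uses %s, but the principle is the same: the SQL text contains only placeholders, and the values are passed as a separate argument.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE codesizes (revision TEXT, message TEXT)")

revision = "abc123"
message = "Don't inline crc32(), saves 100 bytes"

# Unsafe: building the SQL text ourselves leaves the apostrophe unescaped,
# so the statement no longer parses.
unsafe_sql = f"INSERT INTO codesizes VALUES ('{revision}', '{message}')"
try:
    conn.execute(unsafe_sql)
    unsafe_ok = True
except sqlite3.Error:
    unsafe_ok = False

# Safe: placeholders in the SQL, values passed separately.
# With psycopg the equivalent call would be:
#   cur.execute("INSERT INTO codesizes VALUES (%s, %s)", (revision, message))
conn.execute("INSERT INTO codesizes VALUES (?, ?)", (revision, message))

stored = conn.execute("SELECT message FROM codesizes").fetchone()[0]
print(unsafe_ok, stored == message)
```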

These rules apply for any other place where you’re generating code from code. When you’re generating string literals, make sure they’re escaped correctly, or you will have a headache later.
