Zero to main(): How to Write a Bootloader from Scratch | Interrupt

This is the third post in our zero to main() series, where we bootstrap a working firmware from zero code on a Cortex-M series microcontroller.


This is a companion discussion topic for the original entry at https://interrupt.memfault.com/blog/how-to-write-a-bootloader-from-scratch

Great post François. I used to dread writing bootloaders until I made the effort to really understand the ARM early boot process, and now they’re pretty straight-forward.

If you’re using CMSIS you (technically) don’t need assembler to branch to application code. Here’s a snippet of some code showing how my bootloaders launch application code, modified to use your linker variables.

// Reset stack and vector table
uint32_t *stack_ptr = (uint32_t *)__approm_start__;
__disable_irq();  // Good idea when messing with MSP/VTOR
__set_CONTROL(0);
__set_MSP(*stack_ptr);
SCB->VTOR = __approm_start__;
// Call the reset handler (the construct below is 'pointer to a pointer to a function that takes no arguments and returns void')
void (**reset_handler)(void) = (void(**)(void))(__approm_start__ + 4);
(*reset_handler)();  // Call it as if it were a C function

Although your ASM version may be smaller, this is an alternative that boils down to pretty much the same instructions.

Some advice for others from working with bootloaders in the field: always have a last resort way to ‘break into’ the bootloader by, for example, holding a button down, or checking for a voltage on a pin. My bootloaders typically checksum the application and if it validates, immediately execute the code. If you don’t have a way to bypass this, and force the bootloader to remain active for new code to be loaded, you may need to attach a debugger to recover from buggy firmware.
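To make the validate-then-launch idea concrete, here is a minimal sketch in C. The header layout (`image_header_t`), the function names, and the choice of CRC-32 are all assumptions for illustration; the post above only says that the application is checksummed and that a button or pin can force the bootloader to stay active.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical image header; field names and layout are illustrative only. */
typedef struct {
    uint32_t length; /* number of payload bytes that follow the header */
    uint32_t crc32;  /* CRC-32 of those payload bytes */
} image_header_t;

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320): slow but table-free. */
uint32_t crc32_compute(const uint8_t *data, size_t len) {
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            /* Mask is all ones when the low bit is set, zero otherwise. */
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* An image is accepted only if its length is sane and its checksum matches. */
bool image_is_valid(const image_header_t *hdr, const uint8_t *payload,
                    size_t max_len) {
    if (hdr->length == 0u || hdr->length > max_len) {
        return false;
    }
    return crc32_compute(payload, hdr->length) == hdr->crc32;
}

/* The manual override wins: a held button keeps us in the bootloader
 * even when the image would otherwise validate. */
bool should_launch_app(bool override_pin_asserted, const image_header_t *hdr,
                       const uint8_t *payload, size_t max_len) {
    if (override_pin_asserted) {
        return false;
    }
    return image_is_valid(hdr, payload, max_len);
}
```

On target, `override_pin_asserted` would come from reading the button or pin, and `payload` would point into the application's flash region.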

As well as a manual override to remain in the bootloader for loading new code, you also want the bootloader to automatically detect that it should not launch any existing code but instead wait for new code to be loaded. For a while I used a technique similar to your ‘Message passing to catch reboot loops’ code, putting magic numbers into RAM that is not cleared during a restart. These days most ARM vendors provide a ‘reset control’ register that reports the reason for the last reset. One of the MCUs I use, for example, reports whether the system was reset because of low voltage, watchdog, lockup, reset pin (debugger attached), or software (request via the NVIC). If the bootloader detects a software reset, it will not launch any existing app and instead wait for new code. This means no more messing around with application linker scripts to reserve memory for communicating with the bootloader. It could also perform some specific actions based on continued watchdog resets, as suggested in your article.
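The reset-reason decision described above might be sketched like this. The bit names and positions are hypothetical, since every vendor lays out its reset-cause register differently; the logic is kept as a pure function so it can be exercised off-target.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical reset-cause bits; real names and positions are vendor-specific. */
#define RESET_CAUSE_LOW_VOLTAGE (1u << 0)
#define RESET_CAUSE_WATCHDOG    (1u << 1)
#define RESET_CAUSE_LOCKUP      (1u << 2)
#define RESET_CAUSE_PIN         (1u << 3)
#define RESET_CAUSE_SOFTWARE    (1u << 4)

typedef enum {
    BOOT_LAUNCH_APP,
    BOOT_WAIT_FOR_IMAGE
} boot_action_t;

/* A software reset is treated as "the app asked for an update", so the
 * bootloader stays active. Otherwise the app runs only if it validated. */
boot_action_t choose_boot_action(uint32_t reset_cause, bool app_valid) {
    if (reset_cause & RESET_CAUSE_SOFTWARE) {
        return BOOT_WAIT_FOR_IMAGE;
    }
    if (!app_valid) {
        return BOOT_WAIT_FOR_IMAGE;
    }
    return BOOT_LAUNCH_APP;
}
```

On real hardware `reset_cause` would be read from the vendor's reset-cause register early in boot. Whether and how that register needs to be cleared afterwards depends on the part.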

Lastly, my favourite user interface to a bootloader has to be an emulated USB mass-storage device. When the bootloader is waiting for new code, it brings up a USB stack and emulates a hard drive that you simply drag and drop the new application image onto, and the bootloader validates the image and programs it into flash. This is a serious undertaking! USB stacks are usually pretty large and MSD emulation is complex, so only recommended for fearless developers targeting chips with lots of flash. But when it works, it is pure magic!

Thanks for the comment, @simonhaines. It adds a ton to the conversation.

You make a great point that the bootloader may want to do something different based on the reset type. Some implementations also provide some RTC scratch registers (e.g. STM32) which can be used to communicate between the app & the bootloader without the linker complexity.
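As a rough illustration of the scratch-register handshake, here is a small sketch. The magic value and function names are made up, and the register is passed in as a pointer because the actual register name and address are vendor-specific.

```c
#include <stdbool.h>
#include <stdint.h>

/* Arbitrary illustrative value; anything unlikely to appear by accident works. */
#define BOOT_REQUEST_MAGIC 0xB007C0DEu

/* Application side: leave a request in the scratch register.
 * The application would then trigger a reset (not shown). */
void request_bootloader(volatile uint32_t *scratch) {
    *scratch = BOOT_REQUEST_MAGIC;
}

/* Bootloader side: report whether a request is pending and clear it,
 * so the next ordinary reset boots the application again. */
bool consume_bootloader_request(volatile uint32_t *scratch) {
    bool requested = (*scratch == BOOT_REQUEST_MAGIC);
    *scratch = 0u;
    return requested;
}
```

On target, `scratch` would point at one of the RTC scratch registers instead of an ordinary variable.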

On the topic of setting the MSP, Keil makes this even nicer using named register variables. I’ll reproduce their example below:

register unsigned int _control __asm("control");
register unsigned int _msp     __asm("msp");
register unsigned int _psp     __asm("psp");

void init(void)
{
  _msp = 0x30000000;        // set up Main Stack Pointer
  _control = _control | 3;  // switch to User Mode with Process Stack
  _psp = 0x40000000;        // set up Process Stack Pointer
}

source: http://www.keil.com/support/man/docs/armcc/armcc_chr1359125006491.htm

Unfortunately this is not supported by GCC.

On the USB mass storage device: the nRF52 development kit does just this, and you are right to say it is a magical experience :)

Thank you, François, for your very good work! I am just starting out with programming and the STM32, and I think this will really help me build my own bootloader. Could you also help me build a secure bootloader for the stm32f103x8 from the repository linked below? I am getting many errors. Thank you: https://github.com/dmitrystu/sboot_stm32

Hi @sorokolo,

Thanks for getting in touch on Interrupt.

I am happy to answer any specific questions you have. Reading through your code is a bit more than I’d like to take on.

Hello! Have you done anything on an STM32 secure bootloader or ST’s X-CUBE-SBSFU? Thank you.

I’m really enjoying the posts, please do keep them coming.
There was one thing I was curious about in the assembly code snippet:

__attribute__((naked)) static void start_app(uint32_t pc, uint32_t sp) {
    __asm("msr msp, r1  /* load r1 into MSP */\n"
          "bx r0        /* branch to the address at r0 */\n");
}

Is this deterministic? Will the pc variable always end up in r0, and sp in r1?
Is there a way of doing this by using the symbolic names?


Hi @Duske,

Good question! Yes, it is deterministic. The way arguments are mapped to registers is part of the ARM Procedure Call Standard that a compiler needs to follow for ARM. You can find all the details in Section 5.5 “Parameter Passing”, but the gist is:

The base standard provides for passing arguments in core registers (r0-r3) and on the stack. For subroutines that take a small number of parameters, only registers are used, greatly reducing the overhead of a call

GCC does support an assembly variant, referred to as extended-asm, which lets you mix C variables with assembly. While it’s not necessary to use in this case, it can be helpful when you want to reference something that isn’t in a known register from some assembly code (such as a local variable).
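For the "symbolic names" part of the question, a sketch using extended asm with named operands could look like the following. It is an illustration, not the article's code: the operand names `entry` and `stack` are arbitrary, the function is not `naked` (GCC only documents basic asm as safe inside naked functions), and the `#else` branch is a made-up stand-in so the file still compiles on a PC.

```c
#include <stdint.h>

#if defined(__ARM_ARCH_PROFILE) && (__ARM_ARCH_PROFILE == 'M')

/* Cortex-M build: the compiler picks the registers, and the asm refers to
 * them by operand name instead of relying on r0/r1 from the calling
 * convention. `entry` must have bit 0 set (Thumb), as a reset vector does. */
__attribute__((noreturn)) void start_app(uint32_t entry, uint32_t stack) {
    __asm volatile(
        "msr msp, %[stack]\n" /* load the new main stack pointer */
        "bx  %[entry]\n"      /* branch to the application */
        :
        : [stack] "r"(stack), [entry] "r"(entry)
        : "memory");
    __builtin_unreachable();
}

#else

/* Off-target stand-in (hypothetical): record the request so the calling
 * code can still be compiled and exercised on a host machine. */
uint32_t requested_entry;
uint32_t requested_stack;

void start_app(uint32_t entry, uint32_t stack) {
    requested_entry = entry;
    requested_stack = stack;
}

#endif
```

The trade-off is that this version is an ordinary function, so the compiler may emit a prologue before the asm runs. Since the function never returns and the stack pointer is replaced anyway, that should be harmless here.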

One thing worth pointing out is that the Cortex-M0 does not have the VTOR register (but the M0+ does), which makes routing IRQs through multiple programs more difficult. Here’s a good blog post on the topic: http://kevincuzner.com/2018/11/13/bootloader-for-arm-cortex-m0-no-vtor/.

Hey, I’ve really enjoyed the Zero to main() series thus far. Unfortunately I think they are a bit too unspecific at times, so it can kind of be hard to follow along. This is especially true when trying to implement the examples locally. So my question is, is there any external reading material or lectures that kind of explains everything in this series more in depth?

Hi

Not a boot-strapper but a boot-loader that updates application code via UART, USB, SD card, memory stick, Ethernet web server, Modbus, I2C (and so on). It supports optional encryption (including AES256) and a two-step boot loader that allows both the serial loader and the application to be updated. It is available for various processors (including NXP Kinetis, LPC, STM32, and others), also in open source form on GitHub (search for uTasker there).
Docs:

The project has custom tools to do typical things like combining loader and application, encrypting the application for loading, etc.- *********************

Uses about all the techniques discussed in the blog and works with many IDEs so can be freely used by anyone with interest in loader capability. So enjoy.

Also supported for professional users, with a history of 15 years use in many real products (millions of units with no reported issues).

Regards

Mark (developer of uTasker project since 2004)

P.S. **************** are links that I removed since I can’t use more than two in my first post (I can add them in other posts on request)

Hello, I have read your post on bootloaders; it was really a great post. I am new to firmware development and bootloaders. As part of my project I have to develop a bootloader for firmware updates on an Atmel samd21 19a. The update has to happen over USB, so first I have to check whether there is data in the USB buffer. If so, I store it in a buffer, which is then given to an XML parser and then encrypted to base64.
But I am stuck on the first part: I am not getting data from my USB buffer.

if(usb_rx_buf != 0) //checking if data is available
			memcpy(xmlBuf,usb_rx_buf,usb_rx_buf+1);   //copying Usb data to xml Buffer      
			if(usb_rv_index >= sizeof(usb_rx_buf) )
			usb_rv_index ++;
			xmlReq = XMLparser_parse_str((uint8_t*)xmlBuf, &XMLParameter);
			clearXmlBuffer = 1;

I am not sure whether the problem is with the code. Please help me find the fault so I can proceed.

Hi Francois,

great post : )

Every time I read it again, I learn something, and the topic is quite interesting for a typical firmware engineer.

thanks.

Sorry @ryan1, there’s not quite enough info in your post to help. Also, note that base64 is not encryption but encoding, please don’t use it to hide sensitive information!

First, thanks for the great post. I have the questions below:

  1. In the section “Relocating our app from flash to RAM”, I believe this is not a mandatory step. We can execute code from approm with something like the below. Correct?

/* app.ld */
SECTIONS {
    .text :
    {
        KEEP(*(.vectors .vectors.*))
        *(.text*)
        *(.rodata*)
    } > approm
    …
}

  2. Regarding the “shared.h” text code, I believe we will have 2 copies of the same code (.text), in bootrom and approm. Is there any way to keep only 1 copy?

  3. In the “start_app” definition you have added the keyword “__attribute__((naked))”. Can you please explain why? If we don’t add it, can it cause an issue?

I have one suggestion: please write a new blog post on updating firmware with a bootloader (maybe a dual-bank application). As a hobby project I built my own bootloader to update the application, so I would like to know the best practices. Thanks.

I am also facing one strange issue: I am getting completely wrong values for linker script variables. For example, for the variable _shared_data_start I get the value “0x4b04d002”, which is not the correct address. Any suggestions? I am using the STM32 workbench IDE.

Hi Chandan,

Welcome to Interrupt!

You are correct. I wanted to cover this technique, but it is not required or even appropriate for every project. Reasons you might want to relocate to RAM are:

  1. It might be faster than ROM on some chips
  2. Not every MCU supports executing from flash

You are correct, you end up with two copies of that code. It is technically possible to use a single copy of the code shared between the app and the bootloader, but it is non-trivial. In the past I have done this to avoid having two copies of libc (in bootloader & app). The trick is to link your app against your bootloader. Perhaps we will cover this in a future post.

You can find documentation for the naked attribute here. This attribute is used to tell the compiler it should not add a function prologue / epilogue. It is recommended when a function contains only inline assembly. In this case, omitting it likely would not cause problems.

I am not familiar with the STM32 workbench, but remember that you should be taking a reference to the variable. I would expect &_shared_data_start to be 0x20000300, but if you remove the & you won’t get the expected value.
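The difference can be shown off-target with a small sketch. The variable below is only a host stand-in: on the real target `_shared_data_start` would be declared `extern` and its address would come from the linker script. The initializer just mimics whatever bytes happen to be stored at that location, and the helper function names are made up.

```c
#include <stdint.h>

/* Host stand-in for the linker-defined symbol (assumption for illustration). */
uint32_t _shared_data_start = 0x4b04d002u;

/* Without &: this reads the contents stored at the symbol, not its location. */
uintptr_t shared_region_without_ampersand(void) {
    return (uintptr_t)_shared_data_start;
}

/* With &: this yields the symbol's address, which on target is the value
 * assigned to it in the linker script. */
uintptr_t shared_region_with_ampersand(void) {
    return (uintptr_t)&_shared_data_start;
}
```

An unexpected value like the one reported is what one would typically see from the first form, although the exact cause cannot be confirmed without seeing the code.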

Hope that helps!

Thanks for the reply. I have been looking for a firmware blog like this for quite a while; it is helping me upgrade my firmware knowledge.

Thank you for posting really good content.

I have a small question about the data sharing between the bootloader and the application.
As I understand it, in a typical implementation the linker script defines the memory region, and the code places a variable in it using the __attribute__((section(".NAME"))) keyword (e.g. https://www.reddit.com/r/embedded/comments/f2yqr0/sharing_mcu_memory_region_between_bootloader_and/).

I think your way is to express these memory offsets in your code, is that correct?

@Hyunyong-Park Thank you for the encouragement :-).

That’s correct, you could do this just as well with a section attribute (in fact, this is what I do in my latest post). Here, I chose to use a linker script variable instead:

// memory_map.ld
_shared_data_start = ORIGIN(shared);
...
// shared.c
struct shared_data *sd = (struct shared_data *)&_shared_data_start;
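For comparison, the section-attribute variant mentioned above might look roughly like this. The struct fields, the variable name `shared`, and the section name `.shared_data` are assumptions. The sketch also assumes the linker script maps that input section into the shared memory region, which is not shown here.

```c
#include <stdint.h>

/* Hypothetical layout; the real struct lives in the article's shared code. */
struct shared_data {
    uint32_t boot_count;
    uint32_t flags;
};

/* Place the variable in a named section on ELF targets (which includes
 * typical arm-none-eabi builds). Elsewhere fall back to a plain variable. */
#if defined(__ELF__)
#define SHARED_SECTION __attribute__((section(".shared_data")))
#else
#define SHARED_SECTION
#endif

SHARED_SECTION struct shared_data shared;

/* Code then uses the variable directly; no pointer cast is needed. */
void shared_data_increment_boot_count(void) {
    shared.boot_count++;
}
```

With this approach the compiler sees an ordinary variable, so there is no cast from a linker symbol in the C code. Its placement is decided entirely by the linker script.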