Zero to main(): How to Write a Bootloader from Scratch | Interrupt

This is the third post in our Zero to main() series, where we bootstrap a working firmware from zero code on a Cortex-M series microcontroller.

This is a companion discussion topic for the original entry at

Great post François. I used to dread writing bootloaders until I made the effort to really understand the ARM early boot process, and now they’re pretty straightforward.

If you’re using CMSIS you (technically) don’t need assembler to branch to application code. Here’s a snippet of some code showing how my bootloaders launch application code, modified to use your linker variables.

// Reset stack and vector table
uint32_t *stack_ptr = (uint32_t *)__approm_start__;
__disable_irq();  // Good idea when messing with MSP/VTOR
__set_MSP(*stack_ptr);  // First vector-table word is the app's initial stack pointer
SCB->VTOR = __approm_start__;
// Call the reset handler (the construct below is 'pointer to a pointer to a function that takes no arguments and returns void')
void (**reset_handler)(void) = (void(**)(void))(__approm_start__ + 4);
(*reset_handler)();  // Call it as if it were a C function

Although your ASM version may be smaller, this is an alternative that boils down to pretty much the same instructions.

Some advice for others from working with bootloaders in the field: always have a last resort way to ‘break into’ the bootloader by, for example, holding a button down, or checking for a voltage on a pin. My bootloaders typically checksum the application and if it validates, immediately execute the code. If you don’t have a way to bypass this and force the bootloader to remain active so that new code can be loaded, you may need to attach a debugger to recover from buggy firmware.
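The boot decision described above can be sketched roughly as follows. This is only an illustration, not my actual bootloader: the `image_header_t` layout, the choice of CRC-32, and the `button_pressed` flag are all assumptions made for the example. A real bootloader would read the button from a GPIO and the header from a fixed flash address.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical image header stored in front of the application (assumption). */
typedef struct {
    uint32_t length; /* number of payload bytes covered by the CRC */
    uint32_t crc32;  /* expected CRC-32 of the payload */
} image_header_t;

/* Bitwise CRC-32 (reflected polynomial 0xEDB88320, as used by zlib/Ethernet). */
static uint32_t crc32_compute(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;
    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

/* True if the payload matches the length and checksum recorded in the header. */
static bool image_is_valid(const image_header_t *hdr, const uint8_t *payload,
                           size_t max_len)
{
    if (hdr->length == 0 || hdr->length > max_len) {
        return false; /* erased flash or corrupt header */
    }
    return crc32_compute(payload, hdr->length) == hdr->crc32;
}

/* Launch only if nobody asked to stay in the bootloader AND the image validates. */
static bool should_launch_app(bool button_pressed, const image_header_t *hdr,
                              const uint8_t *payload, size_t max_len)
{
    if (button_pressed) {
        return false; /* last-resort manual override */
    }
    return image_is_valid(hdr, payload, max_len);
}
```

Checking the override before the checksum matters: a buggy image can still carry a perfectly valid checksum, so the button has to win.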

As well as a manual override to remain in the bootloader for loading new code, you also want the bootloader to automatically detect that it should not launch any existing code but instead wait for new code to be loaded. For a while I used a technique similar to your ‘Message passing to catch reboot loops’ code, putting magic numbers into RAM that is not cleared during a restart. These days most ARM vendors provide a ‘reset control’ register that reports the reason for the last reset. One of the MCUs I use, for example, reports whether the system was reset because of low voltage, watchdog, lockup, reset pin (debugger attached), or software (request via the NVIC). If the bootloader detects a software reset, it will not launch any existing app and instead wait for new code. This means no more messing around with application linker scripts to reserve memory for communicating with the bootloader. It could also perform some specific actions based on continued watchdog resets, as suggested in your article.
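As a rough sketch of the reset-reason check, assuming the register has already been read into a plain integer: the `RESET_CAUSE_*` bit names and positions below are made up for illustration, since every vendor lays this register out differently.

```c
#include <stdint.h>

/* Hypothetical reset-cause bits; real names and positions are vendor-specific. */
#define RESET_CAUSE_LOW_VOLTAGE (1u << 0)
#define RESET_CAUSE_WATCHDOG    (1u << 1)
#define RESET_CAUSE_LOCKUP      (1u << 2)
#define RESET_CAUSE_PIN         (1u << 3)
#define RESET_CAUSE_SOFTWARE    (1u << 4)

typedef enum {
    BOOT_LAUNCH_APP,
    BOOT_WAIT_FOR_UPDATE,
} boot_action_t;

/* Decide what to do from the value read out of the reset-cause register. */
static boot_action_t boot_action_from_reset(uint32_t reset_cause)
{
    if (reset_cause & RESET_CAUSE_SOFTWARE) {
        /* The application asked for a reset: stay here and wait for new code. */
        return BOOT_WAIT_FOR_UPDATE;
    }
    return BOOT_LAUNCH_APP;
}
```

On some parts the cause flags are sticky across resets, so the bootloader may also need to clear them after reading; check the reference manual for the specific MCU.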

Lastly, my favourite user interface to a bootloader has to be an emulated USB mass-storage device. When the bootloader is waiting for new code, it brings up a USB stack and emulates a hard drive that you simply drag and drop the new application image onto, and the bootloader validates the image and programs it into flash. This is a serious undertaking! USB stacks are usually pretty large and MSD emulation is complex, so only recommended for fearless developers targeting chips with lots of flash. But when it works, it is pure magic!

Thanks for the comment, @simonhaines. It adds a ton to the conversation.

You make a great point that the bootloader may want to do something different based on the reset type. Some implementations also provide some RTC scratch registers (e.g. STM32) which can be used to communicate between the app & the bootloader without the linker complexity.
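A minimal sketch of the scratch-register idea, with the register passed in as a pointer so the logic stays independent of any particular chip. The magic value and both function names are hypothetical; on a real part the pointer would be the address of one of the vendor's backup registers, and write access to that domain may need to be enabled first.

```c
#include <stdbool.h>
#include <stdint.h>

#define BOOT_REQUEST_MAGIC 0xB007106Du /* arbitrary value chosen for this sketch */

/* Application side: leave a request for the bootloader, then trigger a reset. */
static void request_bootloader(volatile uint32_t *scratch_reg)
{
    *scratch_reg = BOOT_REQUEST_MAGIC;
}

/* Bootloader side: consume the request so it only takes effect once. */
static bool bootloader_requested(volatile uint32_t *scratch_reg)
{
    bool requested = (*scratch_reg == BOOT_REQUEST_MAGIC);
    *scratch_reg = 0;
    return requested;
}
```

Clearing the register on read keeps a single request from trapping the device in the bootloader on every subsequent boot.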

On the topic of setting the MSP, Keil makes this even nicer using named register variables. I’ll reproduce their example below:

register unsigned int _control __asm("control");
register unsigned int _msp     __asm("msp");
register unsigned int _psp     __asm("psp");

void init(void)
{
  _msp = 0x30000000;        // set up Main Stack Pointer
  _control = _control | 3;  // switch to User Mode with Process Stack
  _psp = 0x40000000;        // set up Process Stack Pointer
}

Unfortunately this is not supported by GCC.

On the USB mass storage device: the nRF52 development kit does just this, and you are right to say it is a magical experience :)

Thank you, François, for the great work! I am just getting started with programming and the STM32, and I think it will really help me build my own bootloader. Could you help me build a secure bootloader for the stm32f103x8 from the linked repository (I am getting many errors)? Thank you:

Hi @sorokolo,

Thanks for getting in touch on Interrupt.

I am happy to answer any specific questions you have. Reading through your code is a bit more than I’d like to take on.

Hello! Have you written anything about the STM32 secure bootloader or ST’s X-CUBE-SBSFU? Thank you.

I’m really enjoying the posts, please do keep them coming.
There was one thing I was curious about in the assembly code snippet:

__attribute__((naked)) static void start_app(uint32_t pc, uint32_t sp) {
    __asm("           \n\
          msr msp, r1 /* load r1 into MSP */\n\
          bx r0       /* branch to the address at r0 */\n\
    ");
}

Is this deterministic? Will the pc variable always end up in r0, and sp in r1?
Is there a way of doing this by using the symbolic names?


Hi @Duske,

Good question! Yes, it is deterministic. The way arguments are mapped to registers is part of the ARM Procedure Call Standard that a compiler needs to follow for ARM. You can find all the details in Section 5.5 “Parameter Passing”, but the gist is:

“The base standard provides for passing arguments in core registers (r0-r3) and on the stack. For subroutines that take a small number of parameters, only registers are used, greatly reducing the overhead of a call.”

GCC does support an assembly variant, referred to as extended asm, which lets you mix C variables with assembly. While it’s not necessary in this case, it can be helpful when you want to reference something that isn’t in a known register (such as a local variable) from assembly code.
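For illustration, here is roughly what `start_app` could look like with extended asm and named operands. Treat it as a sketch under a few assumptions: `app_entry_t` and `read_app_entry` are hypothetical helpers added for the example, and the function is no longer `naked`, because GCC only supports basic asm inside naked functions. The ARM-specific part is guarded so the rest still compiles on a host machine.

```c
#include <stdint.h>

/* Initial SP and reset-handler address, as stored in the first two words
 * of a Cortex-M vector table. */
typedef struct {
    uint32_t sp;
    uint32_t pc;
} app_entry_t;

/* Hypothetical helper: pull both values out of the application's vector table. */
static app_entry_t read_app_entry(const uint32_t *vector_table)
{
    app_entry_t entry = { vector_table[0], vector_table[1] };
    return entry;
}

#if defined(__GNUC__) && defined(__arm__) && defined(__thumb__)
/* Extended asm: the compiler picks registers for the named operands, so the
 * code no longer relies on pc arriving in r0 and sp in r1. */
__attribute__((noreturn)) static void start_app(uint32_t pc, uint32_t sp)
{
    __asm volatile(
        "msr msp, %[sp] \n" /* load the new Main Stack Pointer */
        "bx  %[pc]      \n" /* branch to the application's reset handler */
        :
        : [sp] "r"(sp), [pc] "r"(pc)
        : "memory");
    __builtin_unreachable();
}
#endif
```

On the target, usage would be along the lines of `app_entry_t e = read_app_entry(app_vectors); start_app(e.pc, e.sp);`, where `app_vectors` points at the start of the application image.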

One thing worth pointing out is that the Cortex-M0 does not have the VTOR register (but the M0+ does), which makes routing IRQs through multiple programs more difficult. Here’s a good blog post on the topic:

Hey, I’ve really enjoyed the Zero to main() series thus far. Unfortunately, I think the posts are a bit too unspecific at times, so they can be hard to follow, especially when trying to implement the examples locally. So my question is: is there any external reading material, or are there any lectures, that explain everything in this series in more depth?


Not a boot-strapper but a boot-loader to update application code via UART, USB, SD card, memory stick, Ethernet web server, Modbus, I2C (and so on). Supports optional encryption (including AES256) and a two-step boot loader that allows both the serial loader and the application to be updated. Available for various processors (including NXP Kinetis, LPC, STM32, and others), and also in open source form on GitHub (search for uTasker there).

The project has custom tools to do typical things like combining loader and application, encrypting the application for loading, etc.- *********************

Uses about all the techniques discussed in the blog and works with many IDEs so can be freely used by anyone with interest in loader capability. So enjoy.

Also supported for professional users, with a history of 15 years of use in many real products (millions of units with no reported issues).


Mark (developer of uTasker project since 2004)

P.S. **************** are links that I removed since I can’t use more than two in my first post (I can add them in other posts on request)

Hello, I have read your post on bootloaders and it was really a great post. I am new to firmware development and bootloaders. As part of my project I have to develop a bootloader for firmware updates on an Atmel SAMD21 19a. The update is done over USB, so I first have to check whether there is data in the USB buffer; if so, I should store it in a buffer, which is then given to an XML parser and then encoded to base 64.
But I am stuck on the first part: I am not getting any data from my USB buffer.
if(usb_rx_buf != 0) //checking if data is available
memcpy(xmlBuf,usb_rx_buf,usb_rx_buf+1); //copying Usb data to xml Buffer
if(usb_rv_index >= sizeof(usb_rx_buf) )
usb_rv_index ++;
xmlReq = XMLparser_parse_str((uint8_t*)xmlBuf, &XMLParameter);
clearXmlBuffer = 1;
I am not sure whether the problem is with the code. Please help me find the fault and proceed.