This is the second post in our “Zero to main()” series.
Last time, we talked about
bootstrapping a C environment on an MCU before invoking our main function. One
thing we took for granted was the fact that functions and data end up in the
right place in our binary. Today, we’re going to dig into how that happens by
learning about memory regions and linker scripts.
Very well written, but my question relates to the demystification. The “COMMON” section still just magically appears.
You’ll note that the .bss section also includes *(COMMON). This is a special input section where the compiler puts global uninitialized variables that go beyond file scope.
Where is this documented? I am still puzzled by some linker script content. I now understand 99% of a script when reading one, but coming up with one is still a huge challenge. I guess it is like makefiles: you never create one from scratch.
Hello, in this example will .sbss point to the start of RAM, i.e. 0x20000000, since that is the first item defined in the RAM section? Is it correct that _end will be used as the start of the stack, as it grows “downward” on ARM?
I was reading about stack overflow protection, where it was suggested that the stack be placed at the “top” of RAM, to prevent corruption of the bss and data sections and also to generate a hardfault. I am actually not sure what the implication of top or bottom is (in terms of the physical address), but does that mean that the stack section should be defined “first”, i.e. before bss? Could you help me understand?
You could indeed put the stack right at the start of RAM to generate a fault when you overflow it. In that case you would add the stack prior to the bss section in your linker script. An alternative is to use your MPU to memory protect the end of your stack, or to write a pattern at the end of your stack and verify it has not been overwritten when you context switch.
Thank you for the quick response! Yes I will also look at enabling the MPU (your MPU blogpost is on my reading list!). In my readings in blogs and posts, it seems that the method of writing a pattern (like gcc stack canary?) would add additional overhead, and an alternative (“zero overhead”) was to change the position of stack. I will explore all options! Thanks again.
Right, adding a stack canary has additional overhead. Depending on when you check the canary, you trade overhead against how quickly the overflow is detected. For example:
Check before every function returns (i.e. the GCC approach): quickly detects overflows, but high overhead (you can read more about it at https://lwn.net/Articles/584225/).
Check at every scheduler tick (i.e. the FreeRTOS approach): lower overhead, but you won’t know exactly which function led to the overflow.
Changing the position of the stack is a good approach!
I am trying to use custom linker script on my beaglebone black rev C, Ubuntu 18.04.2 LTS, armv7l. Following are the 3 files, namely, the C code, the linker script and the script to compile and execute:
1 of 3) main.c:
const char msg[] = "Hello World !\n";
unsigned int msg_size = sizeof(msg);
const char func_msg[] = "func(): I was called!\n";
unsigned int func_msg_size = sizeof(func_msg);
void func(void)
{
    // write syscall arg1 is the file descriptor. 1 is STDOUT.
    asm volatile (
        "mov %r0, $1"
    );
    // write syscall arg2 is the buf.
    asm volatile (
        "ldr %r1, =func_msg"
    );
    // write syscall arg3 is the size_t count.
    asm volatile (
        "ldr %r2, =func_msg_size"
    );
    // Specify the type of system call (4 is the write syscall) in r7.
    asm volatile (
        "mov %r7, $4"
    );
    // Invoke the syscall.
    asm volatile (
        "swi $0"
    );
}
void nri_main(void)
{
    // write syscall arg1 is the file descriptor. 1 is STDOUT.
    asm volatile (
        "mov %r0, $1"
    );
    // write syscall arg2 is the buf.
    asm volatile (
        "ldr %r1, =msg"
    );
    // write syscall arg3 is the size_t count.
    asm volatile (
        "ldr %r2, =msg_size"
    );
    // Specify the type of system call (4 is the write syscall) in r7.
    asm volatile (
        "mov %r7, $4"
    );
    // Invoke the syscall.
    asm volatile (
        "swi $0"
    );
    // Specify the return value. Let's say 32.
    asm volatile (
        "mov %r0, $32"
    );
    // Specify the type of system call (1 is the exit syscall) in r7.
    // Source: By randomly searching online.
    asm volatile (
        "mov %r7, $1"
    );
    // Invoke the syscall.
    asm volatile (
        "swi $0"
    );
}
3 of 3) pcal.sh:
#!/bin/bash
# Exit on failure of any command.
set -e
rm -f main.elf main.map
# Disable ASLR
if [[ $(cat /proc/sys/kernel/randomize_va_space) != 0 ]]; then
    echo "pcal.sh: ASLR is enabled. Disabling it."
    sudo -u root bash -c "echo 0 > /proc/sys/kernel/randomize_va_space"
else
    echo "pcal.sh: ASLR is disabled. No action required."
fi
cpp -nostdlib -nostartfiles -nodefaultlibs -static -fno-exceptions -fstack-usage -Wstack-usage=8192 main.c -o main.i
gcc -nostdlib -nostartfiles -nodefaultlibs -static -fno-exceptions -fstack-usage -Wstack-usage=8192 -S main.i -o main.s
as main.s -o main.o
ld -static --gc-section -T nri.ld -Map=main.map main.o -o main.elf
echo "Now you can ./main.elf and then echo \$? to see if 32 return value gets printed."
rm -f main.i main.s main.o
I end up with a segmentation fault, as shown below:
$ ./pcal.sh
pcal.sh: ASLR is disabled. No action required.
Now you can ./main.elf and then echo $? to see if 32 return value gets printed.
$ ./main.elf
func(): I was called!
Segmentation fault
I believe it has something to do with incorrect placement of the stack. I wanted to learn how to correctly set up a stack in a Linux environment where, if I’m not mistaken, it is not as flexible as bare metal. Please help!
Terrific explanation, I hope you could help me to solve another issue.
I need to create a new section in RAM to hold all the global variables of type uint16 (I have already done that). How can I tell the compiler and/or linker to place the uint16 global variables in that specific memory section?
Your post is very good, simple, easy to understand.
Can you also add a section on how to map the reset vector to the appropriate handler function? I have seen, for example, in some projects, it is defined as:
reset_handler = _start
ENTRY(_start)
@francois You demonstrated with a linker script that only defines MEMORY, calling arm-none-eabi-objdump -t <elf-file> will produce ‘SYMBOL TABLE: no symbols’
When I tried with my own ‘MEMORY only’ linker script I still got .text, .bss, .data and .comment sections.
I compile with: arm-none-eabi-gcc -nostdlib -mcpu=cortex-m0plus test.c -c -o test.o
I link with: arm-none-eabi-ld -T test.ld test.o -o test.elf
If you want to reserve a section of memory, how could you carve it out so that a new data section is created from the original block?
For example, the FPGA/board designers suddenly say, OK, here is another 4k memory block at this address. How could you 1) add it to the existing memory pool so that the linker uses it as it pleases, or 2) add it as a separate section that is not available for the linker’s general use but rather is used by the programmer via:
[the NOLOAD property] is the only section property used in modern linker scripts.
This is not true. As of GCC 11 (used by the current STM32 toolchain), the toolchain will emit the warning
“warning: file.elf has a LOAD segment with RWX permissions” if the READONLY property isn’t specified for certain sections. This caused a lot of churn when users upgraded to a new IDE and toolchain release, since perfectly working projects suddenly showed these new warnings.
Thanks for calling that out @foolong. IMO this is a bug in the ST toolchain; they should disable this warning. The READONLY property does nothing on these chips. You could imagine a future where a header is added to the firmware (or ELF files are supported directly by bootloaders) and the bootloader sets up write protection on those sections of flash, but AFAIK no MCU toolchain does this today.