Tuesday, April 26, 2011

Linker scripts

Linkers are pretty smart, but they need some help to tell them what to do. That is where the linker script comes in. Mine originally descended from the Sparkfun Logomatic firmware (like my whole program) where it is called main_memory_block.ld , but it turned out to have a flaw.

I remember back in the 90s when I was first learning a real language, Turbo Pascal. Don't laugh. One of the features of its compiler/linker was what it called "smart linking" where it would only include code and/or data in the final executable if it had a chance to be used. It eliminated dead code.

GNU ld by default doesn't do this. The linker is a lot simpler and dumber than we think. For instance, the objects that it links are composed of sections and labels. It uses the labels to do the linking, but the sections are trackless expanses of bytes. They might be code, they might be data, they might be unlabeled static functions or read-only data. The linker doesn't know, it can't know.

In order to do a semblance of smart linking, GCC has a feature called -ffunction-sections and -fdata-sections. This tells GCC that when it is building an ELF object, it should put each function into its own section. The code goes into .text.function_name, while zeroed data goes into .bss.variable_name and initialized data goes into .data.variable_name.

The complementary feature in ld is called --gc-sections. The linker script tells the linker where to start, where the true program entry point is. All labels used in the section where the entry point is, are live. All labels used in the sections where those live labels are are live also, and so on and so on. Any section which has no live labels is dead, and the linker doesn't put the dead sections into the final executable.

With the much smaller sections provided by -ffunction-sections, the .text section is no longer a trackless waste of bytes. It's probably empty, in fact. All the functions live in their own sections, so the linker can know what is what, and can remove dead code. These complementary features are the GCC implementation of smart linking.

However, when the linker is done garbage collecting the dead code, the linker script might tell it to bundle together all the sections whose names match a pattern, together into one section in the output. No one is going to try to link the output into some other program, so this is ok. The Sparkfun linker script bundled together all the .text.whatever sections into a single .text section in the output, and all the .data.whatever, but not the .bss.whatever. This is important, because the linker creates a label at the beginning and end of the .bss section, and a block of code in Startup.S, the true entry point, fills memory between the labels with zeros. With all these unbundled .bss sections, the final .bss was very small and did not include all the variables, so some of the variables I expected to be zeroed, were not. This is a Bad Thing. Among other things, it meant that my circular buffer of 1024 bytes had its head pointer at byte 1734787890. As the wise man Strong Bad once said, "That is not a small number. That is a big number!"

It turns out this does not have anything to do with C++. I turned on -ffunction-sections to try and reduce the bloat from the system libraries needed in C++, but if I had turned it on in C, the same thing would have happened.

The fix:

Open up main_memory_block.ld . Inside is a section like this:


/* .bss section which is used for uninitialized (zeroed) data */
.bss (NOLOAD) :
{
__bss_start = . ;
__bss_start__ = . ;
*(.bss)
*(.gnu.linkonce.b*)
*(COMMON)
. = ALIGN(4);
}
> RAM

. = ALIGN(4);
__bss_end__ = . ;
PROVIDE (__bss_end = .);


This says among other things, that in the output file, the .bss section (which has a new label __bss_start at its beginning) includes all .bss sections from all the input object files. It also creates a new label at the end of the section called naturally enough, __bss_end . The startup code zeroes out all the RAM between these two labels.

The problem is that *(.bss) only includes the .bss sections from each object, not the .bss.variable stuff. So, change the


*(.bss)


line to:


*(.bss .bss.*)


This says to bundle in not only all the .bss sections from all the inputs, but also all the sections which start with .bss. in their names. Now with them all bundled, the bss zero-er will do the right thing, and the program will work.

No comments:

Post a Comment