How To Strip An ELF Object Without Fully Understanding It

Ali Bahrami — Friday April 22, 2016

Surfing with the Linker-Aliens

The generic ELF ABI (gABI) provides a namespace in which ELF features unique to a given platform can be defined without affecting other platforms. Solaris has such a namespace (ELFOSABI_SOLARIS), and we've used it over the years to define a number of Solaris-specific features. The GNU community has its own namespace (ELFOSABI_GNU), and their tools support it, but those tools also run on Solaris and other non-GNU systems. Sometimes that leads to cross-ELFOSABI confusion.

We recently reached out to the GNU community to ask for help in solving a problem we were having with their strip and objcopy utilities, which are part of binutils. GNU strip and objcopy, which share common code, were zeroing the sh_link and sh_info fields of ELFOSABI_SOLARIS specific sections. That issue is now well on its way to being fixed, thanks to Nick Clifton's rapid response, and the resulting discussion. The discussion revolved around the ELF rules that allow utilities to add or remove sections from an object, and update their sh_link and sh_info fields, without necessarily understanding the contents of those sections. I don't believe that the rules for this are well known, so this might be a good time to describe them.

When you add or remove sections from an object, the indexes of the other sections can move up or down. The sh_link and sh_info fields of a section header can contain section indexes for other sections, and therefore need to be updated to track such movement. On the surface, this would seem to be a thorny problem, because a tool of this sort would have to know the details of all the various platform ELFOSABIs, and in particular, the meanings of all the different section types, and the meanings of their sh_link and sh_info fields. That could be quite a burden. However, ELF is defined in a way that allows tools like this to be written without knowing any of those things.
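To make the index movement concrete, here is a minimal sketch of the bookkeeping a strip-like tool might do. The function name build_index_map and the choice of None to mark a removed section are illustrative assumptions, not anything taken from binutils or the Solaris tools.

```python
def build_index_map(section_count, removed):
    """Map each old section index to its new index.

    Removed sections map to None. Index 0 (the reserved null section
    header) is assumed never to appear in `removed`.
    """
    index_map = {}
    next_index = 0
    for old_index in range(section_count):
        if old_index in removed:
            index_map[old_index] = None
        else:
            index_map[old_index] = next_index
            next_index += 1
    return index_map


# Hypothetical object with 6 section headers; sections 2 and 3 are stripped.
index_map = build_index_map(6, {2, 3})
print(index_map)
```

Every section after a removed one slides down, so any sh_link or sh_info value that names one of those sections has to be rewritten using a map like this.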

When ELF was originally conceived at Bell Labs in the 1980s, its designers were thinking only of their own needs. There was no concept of ELFOSABI or vendor extensions. When section headers were designed, it was evident that two kinds of "extra" information would be generally useful to have in each header:

sh_link
It is very common for one section to reference ("link to") another. For instance, a symbol table has an associated string table. sh_link was added to reference other sections by index.

sh_info
It is also common for a section to need to carry some other section-specific information. sh_info was added to hold that sort of information. The meaning of sh_info was expected to be defined independently for each section type. Initially, relocation sections used it to hold a second section index, while symbol tables used it to hold the count of local symbols found in the table. It was anticipated that as new section types were added, they would each define their own meaning for sh_info.

For many section types, one or both of these fields are always set to 0.
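As an illustration of where these two fields live, the following sketch unpacks a 64-bit, little-endian section header (Elf64_Shdr in the gABI) with Python's struct module. The header bytes are made up for the example, and parse_shdr is a hypothetical helper, not part of any ELF library.

```python
import struct

# Elf64_Shdr, little-endian: sh_name, sh_type, sh_flags, sh_addr, sh_offset,
# sh_size, sh_link, sh_info, sh_addralign, sh_entsize.
ELF64_SHDR = struct.Struct("<IIQQQQIIQQ")
FIELDS = ("sh_name", "sh_type", "sh_flags", "sh_addr", "sh_offset",
          "sh_size", "sh_link", "sh_info", "sh_addralign", "sh_entsize")

SHT_SYMTAB = 2


def parse_shdr(raw):
    """Return the fields of one raw 64-byte section header as a dict."""
    return dict(zip(FIELDS, ELF64_SHDR.unpack(raw)))


# Made-up symbol table header: its string table is section 5 (sh_link),
# and sh_info holds the local symbol count, 12.
raw = ELF64_SHDR.pack(1, SHT_SYMTAB, 0, 0, 0x400, 0x300, 5, 12, 8, 24)
hdr = parse_shdr(raw)
print(hdr["sh_link"], hdr["sh_info"])
```

Both fields are plain 32-bit words; nothing in the header itself says whether sh_info is a section index or some other value.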

During the 1990s, System V Unix, and therefore ELF, became the basis for a growing collection of operating systems. Rather than a single ELF implementation, there was now a family of them, each developed by a different organization. To preserve a common core while letting different platforms pursue their own requirements, the notion of a generic ELF ABI was formulated, and the various numeric ranges in the ELF standard were partitioned into generic, operating system specific, and processor specific ranges.

Central to the idea of a generic shared core is the idea that ELF utilities from one OSABI should be able to perform basic operations on objects from another. As part of that process, thought was given to how a tool like strip can do the right thing for an object from a different ELFOSABI without having to encode knowledge of that ELFOSABI. At that time, sh_link and sh_info usage in existing objects was limited to the small set shown in the following table.


sh_type       sh_link         sh_info

SHT_DYNAMIC   section-index   0
SHT_HASH      section-index   0
SHT_REL       section-index   section-index or 0
SHT_RELA      section-index   section-index or 0
SHT_SYMTAB    section-index   symbol-index
SHT_DYNSYM    section-index   symbol-index

Starting from that point, and considering a future in which new section types will be added, a few things were evident:

And so, the SHF_INFO_LINK section header flag was added, and all sections other than relocation sections are required to set it when sh_info contains a section index. With this, it is possible to establish rules that are backward compatible to the very oldest ELF objects, while allowing for future expansion:

sh_link

  • A value of 0 is preserved as is.

  • A non-zero sh_link is a section index, and should be translated.

sh_info

  • For relocation sections (SHT_REL or SHT_RELA), or if the SHF_INFO_LINK flag is set, sh_info is a section index, and follows the translation rules given above for sh_link.

  • If the previous rule does not apply, the value of sh_info is preserved as is.

These rules are sufficient to safely move sections around without any understanding of their meaning or contents. That is generally all that a program like strip needs to do for ELFOSABI sections.
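The rules above can be sketched in a few lines. This is an illustration under stated assumptions, not the code used by any real strip: SectionHeader, translate_index and fix_links are hypothetical names, the constants mirror the gABI values, and raising an error when a referenced section has been removed is a policy chosen for the example, since the rules themselves do not say what to do in that case.

```python
from collections import namedtuple

SHT_RELA = 4          # gABI section type values
SHT_REL = 9
SHF_INFO_LINK = 0x40  # gABI flag: sh_info holds a section index

SectionHeader = namedtuple("SectionHeader", "sh_type sh_flags sh_link sh_info")


def translate_index(index, index_map):
    """Translate a section index; 0 is preserved as is."""
    if index == 0:
        return 0
    new_index = index_map.get(index)
    if new_index is None:
        # Assumed policy for this sketch only.
        raise ValueError("section %d was removed but is still referenced" % index)
    return new_index


def fix_links(shdr, index_map):
    """Apply the sh_link and sh_info rules without knowing the section type."""
    sh_link = translate_index(shdr.sh_link, index_map)
    sh_info = shdr.sh_info
    if shdr.sh_type in (SHT_REL, SHT_RELA) or shdr.sh_flags & SHF_INFO_LINK:
        sh_info = translate_index(shdr.sh_info, index_map)
    return shdr._replace(sh_link=sh_link, sh_info=sh_info)


# Old index -> new index after removing section 2 (None marks removal).
index_map = {0: 0, 1: 1, 2: None, 3: 2, 4: 3, 5: 4}

rela = SectionHeader(SHT_RELA, 0, 4, 1)                    # both fields are indexes
symtab = SectionHeader(2, 0, 5, 12)                        # sh_info is a count
unknown = SectionHeader(0x6ffffff0, SHF_INFO_LINK, 3, 4)   # OS-specific type

for shdr in (rela, symtab, unknown):
    print(fix_links(shdr, index_map))
```

Note that the third header has a type the code knows nothing about, yet both of its fields are translated correctly because the SHF_INFO_LINK flag says how to treat sh_info.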

On the other hand, code that alters the contents of a section does need to understand the meaning of sh_info for that section type. Presumably any code with knowledge of the section format sufficient to alter it would also know what to do with sh_info. That goes beyond what a program like strip needs.


Published Elsewhere

https://blogs.oracle.com/ali/entry/how_to_strip_an_elf/
https://blogs.oracle.com/ali/how-to-strip-an-elf-object-without-fully-understanding-it/
