elffile: ELF Specific File Identification Utility
Ali Bahrami Friday November 11, 2011
Solaris 11 has a new standard user level command, /usr/bin/elffile. elffile is a variant of the file utility that is focused exclusively on linker related files: ELF objects, archives, and runtime linker configuration files. All other files are simply identified as "non-ELF". The primary advantage of elffile over the existing file utility is in the area of archives elffile examines the archive members and can produce a summary of the contents, or per-member details.
The impetus to add elffile to Solaris came from the effort to extend the format of Solaris archives so that they could grow beyond their previous 32-bit file limits. That work introduced a new archive symbol table format. Now that there was more than one possible format, I thought it would be useful if the file utility could identify which format a given archive is using, leading me to extend the file utility:
In turn, this caused me to think about all the things that I would like the file utility to be able to tell me about an archive. In particular, I'd like to be able to know what's inside without having to unpack it. The end result of that train of thought was elffile.% cc -c ~/hello.c % ar r foo.a hello.o % file foo.a foo.a: current ar archive, 32-bit symbol table % ar r -S foo.a hello.o % file foo.a foo.a: current ar archive, 64-bit symbol table
Much of the discussion in this article is adapted from the PSARC case I filed for elffile in December 2010:
PSARC 2010/432 elffile
The file utility provides a quick answer to question (1), as it identifies all archives as "current ar archive". It does nothing to answer the more interesting question (2). To answer that question, requires a multi-step process:
It should be easier and more efficient to answer such an obvious question.
It would be reasonable to extend the file utility to examine archive contents in place and produce a description. However, there are several reasons why I decided not to do so:
I decided that extending file in this way is overkill, and that an investment in the file utility for better archive support would not be worth the cost. A solution that is more narrowly focused on ELF and other linker related files is really all that we need. The necessary code for doing this already exists within libelf. All that is missing is a small user-level wrapper to make that functionality available at the command line.
In that vein, I considered adding an option for this to the elfdump utility. I examined elfdump carefully, and even wrote a prototype implementation. The added code is small and simple, but the conceptual fit with the rest of elfdump is poor. The result complicates elfdump syntax and documentation, definite signs that this functionality does not belong there.
And so, I added this functionality as a new user level command.
elffile [-s basic | detail | summary] filename...
Please see the elffile(1) manpage for additional details.
To demonstrate how output from elffile looks, I will use the following files:
The file utility identifies these files as follows:
File Description config A runtime linker configuration file produced with crle dwarf.o An ELF object /etc/passwd A text file mixed.a Archive containing a mixture of ELF and non-ELF members mixed_elf.a Archive containing ELF objects for different machines not_elf.a Archive containing no ELF objects same_elf.a Archive containing a collection of ELF objects for the same machine. This is the most common type of archive.
By default, elffile uses its "summary" output style. This output differs from the output from the file utility in 2 significant ways:% file config dwarf.o /etc/passwd mixed.a mixed_elf.a not_elf.a same_elf.a config: Runtime Linking Configuration 64-bit MSB SPARCV9 dwarf.o: ELF 64-bit LSB relocatable AMD64 Version 1 /etc/passwd: ascii text mixed.a: current ar archive, 32-bit symbol table mixed_elf.a: current ar archive, 32-bit symbol table not_elf.a: current ar archive same_elf.a: current ar archive, 32-bit symbol table
The output for same_elf.a is of particular interest: The vast majority of archives contain only ELF objects for a single platform, and in this case, the default output from elffile answers both of the questions about archives posed at the beginning of this discussion, in a single efficient step. This makes elffile considerably more useful than file, within the realm of linker-related files.% elffile config dwarf.o /etc/passwd mixed.a mixed_elf.a not_elf.a same_elf.a config: Runtime Linking Configuration 64-bit MSB SPARCV9 dwarf.o: ELF 64-bit LSB relocatable AMD64 Version 1 /etc/passwd: non-ELF mixed.a: current ar archive, 32-bit symbol table, mixed ELF and non-ELF content mixed_elf.a: current ar archive, 32-bit symbol table, mixed ELF content not_elf.a: current ar archive, non-ELF content same_elf.a: current ar archive, 32-bit symbol table, ELF 64-bit LSB relocatable AMD64 Version 1
elffile can produce output in two other styles, "basic", and "detail". The basic style produces output that is the same as that from 'file', for linker-related files. The detail style produces per-member identification of archive contents. This can be useful when the archive contents are not homogeneous ELF object, and more information is desired than the summary output provides:
% elffile -s detail mixed.a mixed.a: current ar archive, 32-bit symbol table mixed.a(dwarf.o): ELF 32-bit LSB relocatable 80386 Version 1 mixed.a(main.c): non-ELF content mixed.a(main.o): ELF 64-bit LSB relocatable AMD64 Version 1 [SSE]
| Stub Proto: Not Just For Stub Objects|| Ancillary Objects|