New CRT Objects. (Or: What Are CRT Objects?)

Ali Bahrami — Tuesday December 22, 2015

Surfing with the Linker-Aliens

Solaris 11 Update 4 introduces a new set of CRT objects for use by compilers, and by others creating specialized low level system objects. This area has always been poorly documented, and it took far longer to unravel the history and understand how things are supposed to work than it took to actually write and test the new objects. Unlike the CRT objects they replace, this set is publicly documented and committed, and the same set of objects is delivered uniformly for 32 and 64-bit, on sparc and x86.

That's big news for a small number of people who deal with compilers and linking, and if you're one of them, then it's likely we've met. Everyone else is probably asking, "What are CRT objects?".

What Are CRT Objects?

CRT objects are objects quietly added to the link line by the compiler driver as glue between the user supplied code and the system. The user is generally not aware of their presence, but they are essential for correct program operation. This is why we always recommend linking via the compiler rather than calling ld directly. As a user, you have no idea what CRT objects to supply, and that list can change as compilers and the OS evolve over time.

I believe that CRT originally stood for "C RunTime", but that's a vestigial historical artifact now. The base CRT objects apply to all executables and shared objects, regardless of the language they were written in.

Consider the simple hello world program, which is typically built as follows:

% cc hello.c
% ./a.out
hello

We know that the C compiler creates a relocatable object, and then invokes the link-editor (ld) to create the final executable object a.out, without showing the ld command. You are probably imagining that it looks something like this:

% ld /tmp/hello.o -lc

We can ask the compiler to show the link line. The details depend on the specific compiler. Using the Studio compilers, the real command looks more like the following. Note that I've omitted a large number of arguments that have nothing to do with the CRT objects we're discussing, and that /compilers is not the real path to my local compilers:

ld /compilers/crti.o \
    /compilers/crt1x.o \
    /compilers/values-xa.o \
    hello.o -o a.out -lc \
    /compilers/crtn.o

crti.o, crt1x.o, values-xa.o, and crtn.o are all CRT objects, here provided by the compiler, largely as implementation details that the programmer should not need to be aware of. If you've ever been told that you should not link executables and shared objects directly, and should instead use the C compiler, you may have found that advice odd. After all, isn't the linker the right tool for linking? It is, of course, but without the CRT objects, you can't get it right:

% cc -c hello.c
% ld hello.o -lc
% ./a.out
hello
Segmentation Fault (core dumped)

You could of course do as I've shown above and have the compiler show you the link line, then make copies of the CRT objects for yourself, and then call ld directly. I've seen it done, but that is a really terrible idea, because the compilers are free to change their CRTs without notice. The copies you've made are not guaranteed to continue working, and there is no support for using them in any way other than having the compiler supply them. It is useful to understand what CRT objects do, but please take this advice, and just use the compiler driver as intended to link your objects.

CRT Details And History

In general, there are two categories of CRT:

crt1.o / crti.o / crtn.o
The base low level CRT objects required to let basic C programs start and run.

You've probably been told that execution of a process starts with your main() function. That's an effective abstraction, but in fact, execution starts within the runtime linker, not at main(). The runtime linker maps and relocates the objects that make up the process, and then jumps to a symbol named _start to pass control to the program executable. crt1.o provides the _start symbol that the runtime linker jumps to in order to pass control to the executable, and is responsible for providing some ABI mandated symbols, language specific runtime setup, for calling main(), and ultimately, exit(). crti.o and crtn.o provide prologue and epilogue .init/.fini sections to encapsulate any user provided init/fini code.

values-*.o
These higher level objects are provided to control things like compilation mode (ANSI C vs K&R C) or standards compliance (xpg4, xpg6, etc).

As noted previously, there can be other CRTs, depending on the compiler. This discussion is mainly about the three core CRTs: crt1.o, crti.o, and crtn.o.

Note that you may encounter some variants of these:

gcrt1.o
gprof-enabled version of crt1.o

crt0.o
I believe this was the original name for crt1.o, and served the same purpose. It got renamed at some transitional point, to allow both objects to exist. You may hear compiler folks use either name interchangeably.

crt*x.o
The Studio compilers have 'x' variants of these CRT objects, to provide different implementations.

...
It is common for compilers to add additional CRT objects of their own devising to support features, or to provide specific language dependent support.

The three crt?.o objects have been historically delivered by the compilers, on a schedule unrelated to that of the OS. They contain some things required by the ABI for basic operation, and some other things required specifically by the compilers. This has largely worked, but is not ideal. To summarize the situation as it existed from Solaris 10 through 11.3: there is a lot of history here, much of which hasn't changed in decades, and a fair amount of cruft.

New Supported CRT Objects For Solaris 11 Update 4

Given the discussion above, the need for a uniform and documented set of CRT objects should be evident. Solaris 11 Update 4 provides such a set: publicly documented and committed, and delivered uniformly for 32 and 64-bit, on sparc and x86. We hope these changes will lower the bar for compilers targeting Solaris, but doing it was a benefit to us internally as well.

Typically in a blog like this, I might continue on to describe the details of these CRTs, but as these are public interfaces, you can read the manpage on a Solaris 11 Update 4 system:

% man crt1.o

If you don't have Solaris 11 Update 4 running locally, that manpage is available online at docs.oracle.com. I'll tack on an ASCII copy at the end of this as well.

crt1.o(5) Manpage

Standards, Environments, and Macros                                  crt1.o(5)



NAME
       crt1.o,  crti.o,  crtn.o,  __start_crt_compiler  -  Core  OS  C Runtime
       Objects (CRT)

SYNOPSIS
   Executable
       ld /usr/lib/crt1.o [compiler-crt-objects]...
          /usr/lib/crti.o ld-arguments... /usr/lib/crtn.o


       ld /usr/lib/64/crt1.o [compiler-crt-objects]...
          /usr/lib/64/crti.o ld-arguments... /usr/lib/64/crtn.o


   Shared Object
       ld -G [compiler-crt-objects]...
          /usr/lib/crti.o ld-arguments... /usr/lib/crtn.o


       ld -G [compiler-crt-objects]...
          /usr/lib/64/crti.o ld-arguments... /usr/lib/64/crtn.o


   Optional crt1.o Extension Function
       int __start_crt_compiler(int argc, char *argv[])


DESCRIPTION
       The crt1.o, crti.o, and crtn.o objects comprise the core  CRT  (C  Run-
       Time) objects required to enable basic C programs to start and run. CRT
       objects are typically added by compiler drivers when building  executa-
       bles and shared objects in a manner described by the above SYNOPSIS.


       crt1.o  provides  the  _start  symbol that the runtime linker, ld.so.1,
       jumps to in order to pass control to the executable, and is responsible
       for  providing  ABI  mandated symbols and other process initialization,
       for calling main(), and ultimately, exit(). crti.o and  crtn.o  provide
       prologue  and epilogue .init and .fini sections to encapsulate ELF init
       and fini code.


       crt1.o is only used when building executables.  crti.o and  crtn.o  are
       used by executables and shared objects.


       These  CRT  objects are compatible with position independent (PIC), and
       position dependent (non-PIC) code,  including both normal and  position
       independent executables (PIE).

   Compiler-Specific CRT Objects
       Compilers may supply additional CRT objects to provide compiler or lan-
       guage   specific   initialization.   If   the   compiler   provides   a
       __start_crt_compiler()  function,  then  crt1.o  calls __start_crt_com-
       piler() immediately before calling main(), with the same arguments that
       main()  receives. If  __start_crt_compiler() returns a value of 0, then
       execution  continues  on  to  call  main().  If  __start_crt_compiler()
       returns  a  non-zero  value,  then  that value is passed to exit(), and
       main() is not called. The __start_crt_compiler() is optional,  and  may
       be omitted if not needed.

       Note -

         The __start_crt_compiler() function is reserved for the exclusive use
         of the compiler. Any other use is unsupported. Such use can result in
         undefined  and  non-portable behavior. Applications requiring code to
         execute at startup have a variety of supported options.  The  startup
         code can be called early in the main() function.  Many compilers sup-
         port the #pragma init directive to create init functions that run  as
         part  of  program  startup.  Alternatively, some languages expose the
         concept of init functions in terms  of  portable  language  features,
         such as C++ static constructors.

EXAMPLES
       Example 1 Simple Executable.


       The  following  example  builds  a simple executable that contains both
       init and fini functions. The program prints the number of arguments  on
       the command line, and the number of environment variables.


         #include <stdio.h>
         extern char **environ;

         #pragma init(main_init)
         static void
         main_init(void)
         {
              (void) printf("main_init\n");
         }

         #pragma fini(main_fini)
         static void
         main_fini(void)
         {
              (void) printf("main_fini\n");
         }

         int
         main(int argc, char **argv)
         {
              char **envp = environ;
              int  envcnt = 0;

              for (; *envp; envp++)
                   envcnt++;

              (void) printf("main: argc=%d, envcnt=%d\n", argc, envcnt);
         }



       Normally,  a compiler is used to compile and link a program in a single
       step. To illustrate CRT use, this example uses the link-editor directly
       to build the program from compiled objects.


         example% cc -c main.c
         example% ld /usr/lib/crt1.o /usr/lib/crti.o main.o -lc /usr/lib/crtn.o
         example% ./a.out
         main_init
         main: argc=1, envcnt=49
         main_fini



ATTRIBUTES
       See attributes(5) for descriptions of the following attributes:




       +-----------------------------+-----------------------------+
       |      ATTRIBUTE TYPE         |      ATTRIBUTE VALUE        |
       +-----------------------------+-----------------------------+
       |Availability                 |system/linker                |
       +-----------------------------+-----------------------------+
       |Interface Stability          |Committed                    |
       +-----------------------------+-----------------------------+
       |MT-Level                     |Safe                         |
       +-----------------------------+-----------------------------+

SEE ALSO
       ld(1), ld.so.1(1), exec(2), exit(3C)


       Oracle Solaris 11 Update 4 Linkers and Libraries Guide

NOTES
       The reference to the C programming language in the term CRT is histori-
       cal. The CRT  objects  described  here  are  required  by  all  dynamic
       objects.



SunOS 5.11 Update 4          28 July 2015                        crt1.o(5)

Published Elsewhere

https://blogs.oracle.com/ali/new_crt_objects/
