Direct Binding - the -zdirect/-Bdirect options, and probing

Rod Evans — Monday July 21, 2008

Surfing with the Linker-Aliens

In a previous posting I introduced the use of direct bindings within the OSNet consolidation. A comment to this posting questioned the difference between the two options -z direct and -B direct, and pointed out that runtime errors can occur during process execution if a lazy dependency (typically enabled with -B direct) can not be found. In this entry, I'll discuss the difference between the -z direct and -B direct options, and offer a useful technique for handling the case where the lazy dependency is not present at runtime.

First, the difference between -z direct and -B direct. A full discussion of these options can be found in the Direct Binding chapter of the Linker and Libraries Guide. Aside from lazy loading being enabled by -B direct, the essential difference between these options is a trade-off between ease of use and degree of control. -B direct can be specified anywhere on the command line, and results in all symbol bindings, both external and internal, being established as direct. This means that if libxy.so both defines and references xy(), then a direct binding will be established within the same object.

    % cc -G -o libxy.so xy.c -Bdirect -Kpic
    % elfdump -y libxy.so | fgrep xy
          [7]  DB          <self>             xy
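
For illustration, a minimal xy.c that both defines and references xy() might look like the following. This is only a sketch: the function bodies and the helper name xy_twice() are assumptions made for this example, and only the name xy() comes from the build shown above.

```c
/*
 * xy.c - hypothetical source for libxy.so.
 * xy() is a global function that this same object also references.
 */
int
xy(void)
{
    return (21);
}

/*
 * xy_twice() is an invented helper. Its calls to xy() are references
 * from libxy.so to a symbol that libxy.so itself defines, which is
 * the kind of reference that -B direct records as a <self> binding.
 */
int
xy_twice(void)
{
    return (xy() + xy());
}
```

Built with -B direct, the calls inside xy_twice() are recorded as bound to the xy() within libxy.so itself, rather than to whichever definition of xy() a default symbol search would find first.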

Hence, -B direct is a blunt club that hits everything. In contrast, -z direct is sensitive to its position in the command line, and can therefore be used in a more precise manner. Only external references that are resolved to dependencies that follow -z direct are established as direct. In the following example, only the references to libX.so and libY.so will have direct bindings established.

    % cc -G -o libxy.so xy.c -Kpic -lA -lB -z direct -lX -lY

But the real question is why would you use one option over the other? -B direct is recommended where possible, due to its simplicity and ease of use. However, there are cases where finer grained control is needed, and -z direct is more appropriate. One example is libproc. This library contains many routines that users (typically debugging tools) wish to interpose upon. We want libproc to have direct bindings to any of the dependencies it requires (libc, libelf, etc.), but we do not wish libproc to directly bind to itself. Therefore, by using -z direct we can build libproc to bind directly to its own dependencies while still binding freely to any interposers, for any of the interfaces libproc defines. This interposition is provided regardless of whether the interposers are explicitly identified as such (a requirement, as we do not have control over all the consumers of libproc). Note, we even went a little bit further, and defined all the libproc interfaces as NODIRECT, which prevents any direct binding to libproc. This was to prevent any dependencies binding to libproc instead of to an interposer.

The comment to my previous blog entry also raised the issue of how lazy loading can be compromised if a lazy dependency can not be found. Typically, lazy loading is used to locate dependencies that are expected to exist. Historically, interfaces like dlopen(3c) have been used to test for the presence of dependencies that might not exist. However, a useful technique is to use lazy loading and test for the existence of a dependency with dlsym(3c). By testing for the existence of a known interface within a lazy dependency you can verify the dependency exists and then feel free to call any other interface within that dependency.
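
For contrast, the historical dlopen() model might be sketched as follows. The function name call_with_dlopen() and the int (*)(void) signature are assumptions made for this example. Note that the caller has to name the object explicitly, and that every interface must be reached through a pointer obtained from dlsym().

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Historical model: name the object explicitly, then look up the
 * interface by string. The int (*)(void) signature is an assumption
 * for this sketch.
 *
 * Returns 0 and stores the result of the call in *result on success,
 * or -1 if the object or the symbol can not be found.
 */
int
call_with_dlopen(const char *object, const char *symbol, int *result)
{
    void *handle = dlopen(object, RTLD_LAZY);
    if (handle == NULL)
        return (-1);

    /* The cast hides the real prototype from the compiler and lint. */
    int (*func)(void) = (int (*)(void))dlsym(handle, symbol);
    if (func == NULL) {
        (void) dlclose(handle);
        return (-1);
    }

    *result = func();
    (void) dlclose(handle);
    return (0);
}
```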

When a dependency is bound to, the SONAME of that dependency is recorded in the caller.

    % cc -G -o libxy.so -hlibxy.so xy.c -Kpic
    % elfdump -d libxy.so | fgrep SONAME
	 [2]  SONAME		0x1		    libxy.so

    % cc -o main main.c -z lazyload -L. -lxy
    % elfdump -d main | egrep "NEEDED|POSFLAG"
	 [0]  POSFLAG_1		0x1		    [ LAZY ]
	 [1]  NEEDED		0x163		    libxy.so

With this dependency established, you can protect yourself from calling the interfaces within the dependency unless the interface family you are interested in is known to exist.

    if (dlsym(RTLD_PROBE, "symbol_in_libxy_1") != NULL) {
        /*
         * libxy was found and loaded; feel free to call
         * any-and-all interfaces in libxy
         */
        symbol_in_libxy_1();
        symbol_in_libxy_2();
        ....
    }

With this model you don't need to know the name of the object that provides the interfaces, as the name was recorded at link-time. In addition, the dlsym() call triggers an attempt to load the dependency associated with the symbol. All other references can be made directly through function calls rather than through dlsym(). This allows the compiler, or verification tools like lint, to ensure that you are calling the function with the proper argument and return types, and will therefore lead to safer and more robust code.
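
The probe can be wrapped in a small helper, sketched below. The names probe_handle() and interface_available() are invented for this example. RTLD_PROBE is specific to Solaris, so the dlopen(NULL, ...) fallback is an assumption added only so that the sketch compiles on other systems. The fallback searches objects that are already loaded and does not reproduce the lazy loading behavior described above.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Return the handle to probe with. RTLD_PROBE is Solaris-specific;
 * the dlopen(NULL, ...) fallback is an assumption for portability
 * and only covers objects that are already part of the process.
 */
static void *
probe_handle(void)
{
#ifdef RTLD_PROBE
    return (RTLD_PROBE);
#else
    return (dlopen(NULL, RTLD_LAZY));
#endif
}

/* Return non-zero if the named interface can be found. */
int
interface_available(const char *symbol)
{
    void *handle = probe_handle();

    return (handle != NULL && dlsym(handle, symbol) != NULL);
}
```

A caller would test one known interface from the family with interface_available(), and then call the remaining interfaces directly, using the prototypes from the library's header.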

The use of dlopen() is still appropriate for selecting between differing objects, or when the caller is not knowledgeable of the dependency, such as the case with plugins. In other cases, the use of lazy loading together with dlsym(), as outlined above, is recommended, as the implementation is usually easier to write, debug and deploy.
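
As a rough sketch of the case where dlopen() remains the right tool, the following helper selects between differing objects at runtime. The function name open_first_available() is hypothetical, and the candidate names are whatever the caller supplies, since nothing about them is recorded at link-time.

```c
#include <dlfcn.h>
#include <stddef.h>

/*
 * Try each candidate object in order and return a handle to the first
 * one that loads, or NULL if none can be found.
 */
void *
open_first_available(const char *const candidates[], size_t count)
{
    for (size_t i = 0; i < count; i++) {
        void *handle = dlopen(candidates[i], RTLD_LAZY);
        if (handle != NULL)
            return (handle);
    }
    return (NULL);
}
```

The returned handle would then be passed to dlsym() to locate the plugin's entry points, and to dlclose() when the plugin is no longer needed.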

Published Elsewhere

https://blogs.sun.com/rie/entry/direct_binding_the_zdirect_bdirect/
https://blogs.oracle.com/rie/entry/direct_binding_the_zdirect_bdirect/
https://blogs.oracle.com/rie/direct-binding-the-zdirect-bdirect-options%2c-and-probing/
