By Yumu
During software development, problems with dynamic library linking are common. They can lead to symbol conflicts that cause abnormal behavior or crashes. To understand the dynamic linking mechanism and how it works, the author reviewed Self-cultivation of Programmers and studied the dynamic linking process through hands-on demonstration and disassembly analysis.
This article examines the dynamic-link library mechanism in Linux, including global symbol interposition, lazy binding, and position-independent code (PIC). By working through these concepts and their technical details, it aims to provide a clearer mental model of the root causes behind symbol conflicts. With that understanding, developers who run into similar problems in practice can prevent or solve them more easily, keeping the program stable while improving overall quality and user experience.
For the convenience of readers, basic concepts mentioned in this article, such as ELF, PIC, GOT, PLT, and commonly used sections, are summarized in the appendix.
Through a simple C language program, we will explore the operation mechanism of dynamic-link libraries within and between modules, which involves the interaction between variables and functions. Moreover, we will use the -fPIC option to ensure that position-independent code is generated.
#include <stdio.h>
// The static variable a is only visible in this module.
static int a;
// Declare the external global variable b with extern.
extern int b;
// The global variable c accessed in this module.
int c = 3;
// Declare the external function ext().
extern void ext();
// The scope of the static function inner() is limited to this module.
static void inner() {}
// The bar() function modifies the static variable a and the external global variable b.
void bar() {
a = 1; // Modify the value of the static variable a.
b = 2; // Modify the value of the external global variable b.
c = 4; // Modify the value of the global variable c in the module.
}
// The inner, bar, and ext are called in the foo() function, and the variable values are printed.
void foo() {
inner(); // Call the static function inner().
bar(); // Call the function bar().
ext(); // Call the external function ext().
printf("a = %d, b = %d, c = %d\n", a, b, c); // Output the value of the variable.
}
// Define the external global variable b.
int b = 1;
// The external function ext() modifies the value of the external global variable b.
void ext() {
b = 3; // Modify the value of the external global variable b.
}
// main.c
int main() {
foo(); // Call the foo() function to demonstrate the interaction between modules.
return 0; // The program ends normally.
}
gcc -shared -fPIC -o libpic.so pic.c -g
gcc -o main main.c -L. -lpic
In this code example, the -fPIC compilation option can generate position-independent code that is suitable for creating shared libraries. The code contains multiple scenarios:
• Intra-module function calls: The inner and bar functions are called in the foo function. Since inner is a static function, its scope is limited to this module. The bar function operates on the static variable a and the global variable c in the module.
• Inter-module function calls: The foo function calls the external function ext, which is defined in another module. The ext function is responsible for modifying the external global variable b.
Different types of variables:
• The static variable a is only visible in this module. Its value is not changed in other modules of the program, nor is it lost due to function calls.
• The external global variable b can be shared among multiple modules. Its value is unique and changeable throughout the program.
• The global variable c is defined in this module. Because it is not static, it is exported from the shared library and can also be referenced by other modules.
As we know, the code of a dynamic-link library is shared by multiple processes. To make this possible, the code must be position-independent so that it can be mapped at a different address in each process when loaded. The -fPIC compilation option generates such code. How are these functions and variables located at runtime? The following sections analyze the dynamic linking process step by step.
In the example, there are two function calls in the foo function implementation: the static function inner() and the non-static function bar(). The result after disassembly is as follows.
Disassembly of section .plt:
0000000000000670 <bar@plt-0x10>:
670: ff 35 92 09 20 00 push QWORD PTR [rip+0x200992] # 201008 <_GLOBAL_OFFSET_TABLE_+0x8>
676: ff 25 94 09 20 00 jmp QWORD PTR [rip+0x200994] # 201010 <_GLOBAL_OFFSET_TABLE_+0x10>
67c: 0f 1f 40 00 nop DWORD PTR [rax+0x0]
0000000000000680 <bar@plt>:
680: ff 25 92 09 20 00 jmp QWORD PTR [rip+0x200992] # 201018 <_GLOBAL_OFFSET_TABLE_+0x18>
686: 68 00 00 00 00 push 0x0
68b: e9 e0 ff ff ff jmp 670 <_init+0x20>
...
00000000000007e8 <foo>:
foo():
00000000000007e2 <inner>:
inner():
/mnt/share/demo1/pic.c:12
static void inner() {}
7e2: 55 push rbp
7e3: 48 89 e5 mov rbp,rsp
7e6: 5d pop rbp
7e7: c3 ret
...
/mnt/share/demo1/pic.c:15
inner();
7ec: b8 00 00 00 00 mov eax,0x0
7f1: e8 ec ff ff ff call 7e2 <inner>
/mnt/share/demo1/pic.c:16
bar();
7f6: b8 00 00 00 00 mov eax,0x0
7fb: e8 80 fe ff ff call 680 <bar@plt>
The call to inner() resembles relocation in static linking, but is simpler:
7f1: e8 ec ff ff ff call 7e2 <inner>
• e8: the opcode of the call instruction that takes a relative offset.
• ec ff ff ff: read as little-endian, this is 0xFFFFFFEC, the two's complement representation of -20. It is the offset of the destination relative to the instruction that follows the call. That is, the address of inner is 0x7f6 (address of the next instruction) - 0x14 = 0x7e2.
Conclusion: Calling a static function is simple. The call jumps by a relative address offset.
7fb: e8 80 fe ff ff call 680 <bar@plt>
• The decoding rule is the same as above, but the jump address is 0x680.
• The first instruction of bar@plt is jmp QWORD PTR [rip+0x200992], an indirect jump (jmp) to the address stored at 0x201018. What is stored there?
objdump -s libpic.so
Contents of section .got:
200fc8 00000000 00000000 00000000 00000000 ................
200fd8 00000000 00000000 00000000 00000000 ................
200fe8 00000000 00000000 00000000 00000000 ................
200ff8 00000000 00000000 ........
Contents of section .got.plt:
201000 080e2000 00000000 00000000 00000000 .. .............
201010 00000000 00000000 86060000 00000000 ................
201020 96060000 00000000 a6060000 00000000 ................
201030 b6060000 00000000 c6060000 00000000 ................
• The address 0x201018 lies in the .got.plt section and holds 0x00000686, which is the address of the second instruction of bar@plt:
0000000000000680 <bar@plt>:
680: ff 25 92 09 20 00 jmp QWORD PTR [rip+0x200992] # 201018 <_GLOBAL_OFFSET_TABLE_+0x18>
686: 68 00 00 00 00 push 0x0
68b: e9 e0 ff ff ff jmp 670 <_init+0x20>
What is the above series of address jumps doing? We use a schematic diagram to show the first address relocation process of bar (orange is the call entry, blue indicates the running instruction, and purple represents the corrected address).
The _dl_runtime_resolve() function is not covered in detail here. Its arguments are passed on the stack: the index of the symbol's relocation entry (pushed by push 0x0 in bar@plt) and the module ID (pushed by the first PLT entry from _GLOBAL_OFFSET_TABLE_+0x8). Resolution relies on section information such as .dynamic and .rela.plt. After resolution, the real address of bar is written into the slot at 0x201018. You can check the contents of the .rela.plt section.
[root@docker-desktop demo1]# readelf -r libpic.so
Relocation section '.rela.dyn' at offset 0x4e8 contains 10 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000200de8 000000000008 R_X86_64_RELATIVE 780
000000200df0 000000000008 R_X86_64_RELATIVE 740
000000200e00 000000000008 R_X86_64_RELATIVE 200e00
000000200fc8 000200000006 R_X86_64_GLOB_DAT 0000000000000000 _ITM_deregisterTMClone + 0
000000200fd0 000300000006 R_X86_64_GLOB_DAT 0000000000000000 b + 0
000000200fd8 000500000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
000000200fe0 000e00000006 R_X86_64_GLOB_DAT 0000000000201040 c + 0
000000200fe8 000700000006 R_X86_64_GLOB_DAT 0000000000000000 _Jv_RegisterClasses + 0
000000200ff0 000800000006 R_X86_64_GLOB_DAT 0000000000000000 _ITM_registerTMCloneTa + 0
000000200ff8 000900000006 R_X86_64_GLOB_DAT 0000000000000000 __cxa_finalize + 0
Relocation section '.rela.plt' at offset 0x5d8 contains 5 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000201018 000b00000007 R_X86_64_JUMP_SLO 00000000000007b8 bar + 0
000000201020 000400000007 R_X86_64_JUMP_SLO 0000000000000000 printf + 0
000000201028 000500000007 R_X86_64_JUMP_SLO 0000000000000000 __gmon_start__ + 0
000000201030 000600000007 R_X86_64_JUMP_SLO 0000000000000000 ext + 0
000000201038 000900000007 R_X86_64_JUMP_SLO 0000000000000000 __cxa_finalize + 0
The .rela.plt section in the ELF file contains the function slot relocation information. Specific meanings:
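Type - Describes the type of relocation. In this case, the type is R_X86_64_JUMP_SLOT (truncated to R_X86_64_JUMP_SLO in the output), which is used to resolve the PLT entry of a symbol through lazy binding. Other types that appear in the output above are R_X86_64_RELATIVE and R_X86_64_GLOB_DAT.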
At runtime, the dynamic linker performs address resolution based on these relocation items. For example, when the program calls printf for the first time, the control flow first jumps to the corresponding item of printf in PLT. There will be a stub code in PLT to trigger the dynamic linker, which will resolve the real address of printf and update the corresponding address in GOT.
Once the address has been relocated at runtime, the second call is much simpler, as shown in the following figure:
Single-stepping in GDB shows how the contents of the .got.plt section change when the address is relocated (load base address: 0x7F7A97F75000).
201000 080e2000 00000000 00000000 00000000 .. .............
(gdb) x/16a 0x7f7a98176000
0x7f7a98176000: 0x200e08 0x7f7a983976a8
0x7f7a98176010: 0x7f7a9818d890 <_dl_runtime_resolve_xsave> 0x7f7a97f75686 <bar@plt+6>
0x7f7a98176020: 0x7f7a97f75696 <printf@plt+6> 0x7f7a97f756a6 <__gmon_start__@plt+6>
0x7f7a98176030: 0x7f7a97f756b6 <ext@plt+6> 0x7f7a97f756c6 <__cxa_finalize@plt+6>
0x7f7a98176040 <c>: 0x3 0x0
0x7f7a98176050: 0x31303220352e382e 0x5228203332363035
0x7f7a98176060: 0x3420746148206465 0x2936332d352e382e
0x7f7a98176070: 0x20000002c00 0x8000000
The bar address in the .got.plt section is 0x201018 + 0x7F7A97F75000 (base address) = 0x7F7A98176018, and the content of 0x7F7A98176018 is 0x7f7a97f75686 <bar@plt+6>, which is the same as the relative address offset in the preceding figure. The result after redirection is as follows:
(gdb) x/16a 0x7f7a98176000
0x7f7a98176000: 0x200e08 0x7f7a983976a8
0x7f7a98176010: 0x7f7a9818d890 <_dl_runtime_resolve_xsave> 0x7f7a97f757b8 <bar>
0x7f7a98176020: 0x7f7a97f75696 <printf@plt+6> 0x7f7a97f756a6 <__gmon_start__@plt+6>
0x7f7a98176030: 0x7f7a97f756b6 <ext@plt+6> 0x7f7a97f756c6 <__cxa_finalize@plt+6>
0x7f7a98176040 <c>: 0x3 0x0
0x7f7a98176050: 0x31303220352e382e 0x5228203332363035
0x7f7a98176060: 0x3420746148206465 0x2936332d352e382e
0x7f7a98176070: 0x20000002c00 0x8000000
0x7f7a97f757b8 lies in the code segment, and 0x7f7a97f757b8 - 0x7F7A97F75000 (base address) = 0x7B8. This offset matches the entry address of bar in .text.
Let's abstract it as follows:
As the figure shows, the instruction call bar@plt jumps into .plt, which in turn jumps through a slot in the writable .got.plt section. During program execution, the function pointers in .got.plt are corrected to point to the actual addresses in the .text section (which is not writable). Because only .got.plt is patched, the code itself remains position-independent.
This process also involves an important concept: lazy binding. Dynamic linking is completed at runtime. If every symbol were resolved when the program starts, startup would be noticeably slower. Therefore, a function is not bound until it is called for the first time, which can greatly speed up program startup. In this example, bar is relocated only when it is first called. If it is never called, no binding takes place.
Does the external function redirection have to be in .rela.plt?
No, if it is compiled with PIC, it will be performed in .rela.plt; if not, it will be performed in .rela.dyn.
Reason: With PIC, the call instruction points to an entry in the PLT, and the .rela.plt section is needed to implement lazy binding for that entry. The .rela.dyn section holds the relocation entries that the dynamic linker processes at load time to bind symbols to their runtime addresses, that is, dynamic relocations that are not specific to PLT entries. In short, .rela.plt is mainly used for PLT relocation to resolve function addresses during dynamic linking and implement lazy binding, while .rela.dyn covers broader dynamic relocation requirements.
Open questions:
• Question 1: What are the differences between global function calls within a module and global function calls between modules?
• Question 2: Why is there such a significant difference in the jump behavior between static function calls and global function calls, even though both involve function calls?
Put these two questions aside for a moment. Let's move on to inter-module function calls.
In the example, foo() calls ext(). Looking at the assembly, it is found that the method of inter-module function calls is exactly the same as that of intra-module function calls. The assembly instructions are as follows:
/mnt/share/demo1/pic.c:17
ext();
800: b8 00 00 00 00 mov eax,0x0
805: e8 a6 fe ff ff call 6b0 <ext@plt>
Now let's answer the first question in the previous section. There is no difference between global function calls within a module and between modules. Why?
Let's first recall the loading process. After the dynamic linker completes bootstrapping, it merges the symbol tables of the executable file and of the linker itself into one table called the global symbol table, and then adds the symbols of each shared object as it is loaded. When a symbol is about to be added and a symbol with the same name already exists, the symbol added later is ignored. This rule is called global symbol interposition (referred to as global symbol intervention in the rest of this article).
Because of this rule, if foo() in the previous section called bar() directly through a relative address, the call would always reach the bar() in the same module, even when a function of the same name in a module loaded earlier has taken precedence in the global symbol table. The relative address would then fail to find the correct function. Therefore, both intra-module and inter-module calls to global functions go indirectly through the .got.plt relocation method.
The answer to the second question in the previous section is now also clear. Static functions are not subject to global symbol intervention, so they can be called through a relative address within the module. Such calls are also faster than calls to global functions.
To have a deeper understanding of global symbol intervention, let's look at another example.
/* a1.c*/
#include <stdio.h>
void a() {
printf("a1.c\n");
}
/* a2.c */
#include <stdio.h>
void a() {
printf("a2.c\n");
}
/* b1.c */
void a();
void b1() {
a();
}
/* b2.c */
void a();
void b2() {
a();
}
/* main.c */
#include <stdio.h>
void b1();
void b2();
int main() {
b1();
b2();
return 0;
}
[root@docker-desktop priority]# g++ -fPIC -shared a1.c -o a1.so
[root@docker-desktop priority]# g++ -fPIC -shared a2.c -o a2.so
[root@docker-desktop priority]# g++ -fPIC -shared b1.c a1.so -o b1.so
[root@docker-desktop priority]# g++ -fPIC -shared b2.c a2.so -o b2.so
[root@docker-desktop priority]# ldd b1.so
a1.so (0x0000004001c2a000)
libstdc++.so.6 => /usr/local/gcc-5.4.0/lib64/libstdc++.so.6 (0x0000004001e2c000)
libm.so.6 => /lib64/libm.so.6 (0x00000040021ad000)
libgcc_s.so.1 => /usr/local/gcc-5.4.0/lib64/libgcc_s.so.1 (0x00000040024b0000)
libc.so.6 => /lib64/libc.so.6 (0x00000040026c7000)
/lib64/ld-linux-x86-64.so.2 (0x0000004000000000)
[root@docker-desktop priority]# ldd b2.so
a2.so (0x0000004001c2a000)
libstdc++.so.6 => /usr/local/gcc-5.4.0/lib64/libstdc++.so.6 (0x0000004001e2c000)
libm.so.6 => /lib64/libm.so.6 (0x00000040021ad000)
libgcc_s.so.1 => /usr/local/gcc-5.4.0/lib64/libgcc_s.so.1 (0x00000040024b0000)
libc.so.6 => /lib64/libc.so.6 (0x00000040026c7000)
/lib64/ld-linux-x86-64.so.2 (0x0000004000000000)
[root@docker-desktop priority]# g++ main.c b1.so b2.so -o main
[root@docker-desktop priority]# ./main
a1.c
a1.c
In the example above, both b1.so and b2.so call a(), but the main program links b1.so first, so a1.so is loaded before a2.so and its definition of a() enters the global symbol table first. As a result, the call from b2.so also resolves to the implementation in a1.so, no matter what a2.so provides. This shows how the symbol resolution order of dynamic-link libraries affects the final execution result. When designing interfaces, developers need to consider symbol naming and library loading order carefully to avoid symbol conflicts and unexpected behavior.
The example shows the static variable a, the external global variable b, and the internal global variable c. The results after disassembly are as follows:
void bar() {
7b8: 55 push rbp
7b9: 48 89 e5 mov rbp,rsp
/mnt/share/demo1/pic.c:7
a = 1;
7bc: c7 05 82 08 20 00 01 mov DWORD PTR [rip+0x200882],0x1 # 201048 <__TMC_END__>
7c3: 00 00 00
/mnt/share/demo1/pic.c:8
b = 2;
7c6: 48 8b 05 03 08 20 00 mov rax,QWORD PTR [rip+0x200803] # 200fd0 <_DYNAMIC+0x1c8>
7cd: c7 00 02 00 00 00 mov DWORD PTR [rax],0x2
/mnt/share/demo1/pic.c:9
c = 4;
7d3: 48 8b 05 06 08 20 00 mov rax,QWORD PTR [rip+0x200806] # 200fe0 <_DYNAMIC+0x1d8>
7da: c7 00 04 00 00 00 mov DWORD PTR [rax],0x4
/mnt/share/demo1/pic.c:10
}
Idx Name Size VMA LMA File off Algn
CONTENTS, ALLOC, LOAD, DATA
20 .got 00000038 0000000000200fc8 0000000000200fc8 00000fc8 2**3
CONTENTS, ALLOC, LOAD, DATA
21 .got.plt 00000040 0000000000201000 0000000000201000 00001000 2**3
CONTENTS, ALLOC, LOAD, DATA
22 .data 00000004 0000000000201040 0000000000201040 00001040 2**2
CONTENTS, ALLOC, LOAD, DATA
23 .bss 0000000c 0000000000201044 0000000000201044 00001044 2**2
ALLOC
static int a; # 201048 <__TMC_END__> ==> .bss
extern int b; # 200fd0 <_DYNAMIC+0x1c8> ==> .got
int c; # 200fe0 <_DYNAMIC+0x1d8> ==> .got
Variable access follows the same pattern as the function calls discussed above. Static variables are accessed directly through an offset relative to rip. This is more efficient: because the scope of a static variable is limited to its compilation unit, its offset from the accessing instruction is fixed when the library is linked. Non-static variables (global variables defined in the current module as well as extern variables) may be referenced or modified by other modules, so their addresses need to be resolved by the dynamic linker at runtime. For these variables, shared libraries use rip-relative addressing to reach a slot in the .got section, and the dynamic linker fills that slot with the runtime address of the variable, which keeps the code position-independent.
There is no lazy binding for the addresses of global variables. Their GOT entries are resolved at load time (the R_X86_64_GLOB_DAT entries in .rela.dyn), because a data access is a plain memory load with no stub, like the one in the PLT, that could trigger the dynamic linker on first use. Deferring their resolution would bring no significant benefit and would only add a runtime cost.
If bar and variable c use symbols hidden by the __attribute__((visibility("hidden"))), what will happen to the function call redirection?
#include <stdio.h>
static int a;
extern int b;
__attribute__((visibility("hidden"))) int c = 3;
extern void ext();
void bar() __attribute__((visibility("hidden")));
void bar() {
a = 1;
b = 2;
c = 4;
}
static void inner() {}
void foo() {
inner();
bar();
ext();
printf("a = %d, b = %d, c = %d\n", a, b, c);
}
Results after disassembly
[root@docker-desktop demo1]# objdump -d -M intel -S -l libpic_hidden.so
Disassembly of section .text:
...
0000000000000738 <bar>:
bar():
/mnt/share/demo1/pic_hidden.c:7
static int a;
extern int b;
__attribute__((visibility("hidden"))) int c = 3;
extern void ext();
void bar() __attribute__((visibility("hidden")));
void bar() {
738: 55 push rbp
739: 48 89 e5 mov rbp,rsp
/mnt/share/demo1/pic_hidden.c:8
a = 1;
73c: c7 05 fa 08 20 00 01 mov DWORD PTR [rip+0x2008fa],0x1 # 201040 <__TMC_END__>
743: 00 00 00
/mnt/share/demo1/pic_hidden.c:9
b = 2;
746: 48 8b 05 8b 08 20 00 mov rax,QWORD PTR [rip+0x20088b] # 200fd8 <_DYNAMIC+0x1c8>
74d: c7 00 02 00 00 00 mov DWORD PTR [rax],0x2
/mnt/share/demo1/pic_hidden.c:10
c = 4;
753: c7 05 db 08 20 00 04 mov DWORD PTR [rip+0x2008db],0x4 # 201038 <c>
75a: 00 00 00
...
/mnt/share/demo1/pic_hidden.c:17
bar();
773: b8 00 00 00 00 mov eax,0x0
778: e8 bb ff ff ff call 738 <bar>
[root@docker-desktop demo1]# readelf -S libpic_hidden.so
There are 34 section headers, starting at offset 0x1470:
Section Headers:
[Nr] Name Type Address Offset
Size EntSize Flags Link Info Align
......
[23] .data PROGBITS 0000000000201038 00001038
0000000000000004 0000000000000000 WA 0 0 4
• bar: The disassembly shows that the call to bar jumps directly through a relative address, without runtime relocation.
• int c; # 201038 <c> ==> .data section
View the .rela.plt section.
[root@docker-desktop demo1]# readelf -r libpic_hidden.so
Relocation section '.rela.dyn' at offset 0x4a8 contains 9 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000200df0 000000000008 R_X86_64_RELATIVE 700
000000200df8 000000000008 R_X86_64_RELATIVE 6c0
000000200e08 000000000008 R_X86_64_RELATIVE 200e08
000000200fd0 000200000006 R_X86_64_GLOB_DAT 0000000000000000 _ITM_deregisterTMClone + 0
000000200fd8 000300000006 R_X86_64_GLOB_DAT 0000000000000000 b + 0
000000200fe0 000500000006 R_X86_64_GLOB_DAT 0000000000000000 __gmon_start__ + 0
000000200fe8 000700000006 R_X86_64_GLOB_DAT 0000000000000000 _Jv_RegisterClasses + 0
000000200ff0 000800000006 R_X86_64_GLOB_DAT 0000000000000000 _ITM_registerTMCloneTa + 0
000000200ff8 000900000006 R_X86_64_GLOB_DAT 0000000000000000 __cxa_finalize + 0
Relocation section '.rela.plt' at offset 0x580 contains 4 entries:
Offset Info Type Sym. Value Sym. Name + Addend
000000201018 000400000007 R_X86_64_JUMP_SLO 0000000000000000 printf + 0
000000201020 000500000007 R_X86_64_JUMP_SLO 0000000000000000 __gmon_start__ + 0
000000201028 000600000007 R_X86_64_JUMP_SLO 0000000000000000 ext + 0
000000201030 000900000007 R_X86_64_JUMP_SLO 0000000000000000 __cxa_finalize + 0
There is no bar() in .rela.plt, and no variable c in .rela.dyn, so after hiding, bar() does not need to be relocated, and variable c does not need to be indirectly redirected. The hidden symbols bar() and c also do not appear in the dynamic symbol table (.dynsym) of the dynamic-link library, so they are not visible to other shared objects or executable files during linking. As a result, there is no global symbol intervention for hidden symbols.
1. How to distinguish whether a DSO is a PIC?
readelf -d xxx.so | grep TEXTREL
If there is no output, the dynamic library is generated using PIC. Text relocation (TEXTREL) means that the code section (.text section) needs to be modified to reference the correct address. In non-PIC code, there will be references based on absolute addresses, which need to be modified when loading so that the code can run correctly. This process is text relocation.
2. How to distinguish whether a static library is PIC?
ar -t xxx.a
readelf -r xxx.o
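Check the relocation types in the output. PIC objects reference global symbols through GOT-based and PLT-based relocations such as R_X86_64_GOTPCREL and R_X86_64_PLT32. Non-PIC objects instead contain relocations that are not designed for PIC code, such as absolute-address relocations (for example, R_X86_64_32 or R_X86_64_32S) or direct PC-relative references to global symbols (R_X86_64_PC32).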
3. Assuming that the static library is compiled without -fPIC and the dynamic library is compiled with -fPIC, is it ok?
No. In this test, the static library libnopic_common.a is compiled without -fPIC, and the dynamic library libnopic.so is compiled with -fPIC and links against the static library. The link step fails with the following error log:
g++ -c nopic_common.c -o nopic_common.o
ar rcs libnopic_common.a nopic_common.o
g++ -shared -o libnopic.so pic.c -L. -lnopic_common -fPIC
/usr/bin/ld: ./libnopic_common.a(nopic_common.o): relocation R_X86_64_PC32 against symbol `b' can not be used when making a shared object; recompile with -fPIC
/usr/bin/ld: final link failed: Bad value
collect2: error: ld returned 1 exit status
The nopic_common.o object file is not compiled with -fPIC, so it references the global variable b in a PC-relative manner (R_X86_64_PC32 relocation type). A PC-relative reference assumes that b sits at a fixed distance from the referencing code, that is, in the same module. In a shared library this cannot be guaranteed: b is an exported symbol, so at runtime it may be bound to a definition in another module, and the address at which each module is loaded is unknown until runtime and may differ from run to run. The linker therefore refuses to create the shared object and asks for the code to be recompiled with -fPIC, which makes the access go through the GOT.
To fix this error, you need to recompile the code in nopic_common.o to position-independent code (PIC).
4. Why is PIC not used by default when compiling a dynamic library?
• Historical reason: Due to historical inertia, earlier compiler versions did not include PIC generation as a default option.
• Option delivery issue: -fPIC is a compiler option, which is determined at the source code compilation stage, while -shared is a linker option, which is determined at a different stage, so -fPIC cannot be automatically enabled through -shared.
• Performance: While PIC is important for efficient operations of shared libraries, in some cases, PIC code may be slightly slower than non-PIC code because it requires using indirect addresses to reference global variables and functions. This performance impact is generally small, but it can be a factor in applications with extremely high performance requirements.
• Compiler and build system design: Compilers and build systems often allow developers to choose whether to generate PIC based on project requirements. Support for flexible configuration enables developers to determine the most appropriate compilation option based on specific usage scenarios and requirements.
Aspect | Static Linking | Dynamic Linking |
---|---|---|
Phase | Compilation and linking | Loading and running |
Execution Control | Control is handed over to the executable file. | Control is handed over to the dynamic linker, and then to the executable file after mapping. |
Addressing Speed | Fast | About 1% to 5% slower than static linking because of indirect jumps. Lazy binding mitigates the cost. |
Relocation Tables | .rela.text: relocation table for the code segment. .rela.data: relocation table for the data segment. | .rela.plt: relocation table for function references through the PLT. .rela.dyn: relocation table for data references. |
The above section mainly introduces the dynamic loading process. In the initialization and de-initialization phases, special attention needs to be paid to the construction and destruction order of global variables and functions. These processes directly affect the dependencies between modules and the interactions between objects. Therefore, we need to understand how to control these sequences by using specific attributes to ensure the stability and expected behavior of the program. Especially in the multi-module dynamic library environment, reasonable arrangement of initialization and de-initialization order is an important measure to avoid runtime errors and crashes.
For global variables across shared libraries, their initialization order is affected by the dependencies between these shared libraries. If shared library A depends on shared library B, then the initialization code of B will be executed before the initialization code of A, so the global variables in B will be initialized before the global variables in A.
Let's take a look at the example in Chapter 1 Function Calls Between Two Modules and view the link order and initialization order through the LD_DEBUG=files ./main command.
[root@docker-desktop]# LD_DEBUG=files ./main
112: find library=b1.so [0]; searching
112: search path=/usr/local/gcc-5.4.0/lib64/tls/i686:/usr/local/gcc-5.4.0/lib64/tls:/usr/local/gcc-5.4.0/lib64/i686:/usr/local/gcc-5.4.0/lib64:tls/i686:tls:i686: (LD_LIBRARY_PATH)
112: trying file=/usr/local/gcc-5.4.0/lib64/tls/i686/b1.so
112: trying file=/usr/local/gcc-5.4.0/lib64/tls/b1.so
112: trying file=/usr/local/gcc-5.4.0/lib64/i686/b1.so
112: trying file=/usr/local/gcc-5.4.0/lib64/b1.so
112: trying file=tls/i686/b1.so
112: trying file=tls/b1.so
112: trying file=i686/b1.so
112: trying file=b1.so
112:
112: find library=b2.so [0]; searching
112: search path=/usr/local/gcc-5.4.0/lib64:tls/i686:tls:i686: (LD_LIBRARY_PATH)
112: trying file=/usr/local/gcc-5.4.0/lib64/b2.so
112: trying file=tls/i686/b2.so
112: trying file=tls/b2.so
112: trying file=i686/b2.so
112: trying file=b2.so
112:
112: find library=libstdc++.so.6 [0]; searching
112: search path=/usr/local/gcc-5.4.0/lib64:tls/i686:tls:i686: (LD_LIBRARY_PATH)
112: trying file=/usr/local/gcc-5.4.0/lib64/libstdc++.so.6
112:
112: find library=libm.so.6 [0]; searching
112: search path=/usr/local/gcc-5.4.0/lib64:tls/i686:tls:i686: (LD_LIBRARY_PATH)
112: trying file=/usr/local/gcc-5.4.0/lib64/libm.so.6
112: trying file=tls/i686/libm.so.6
112: trying file=tls/libm.so.6
112: trying file=i686/libm.so.6
112: trying file=libm.so.6
112: search cache=/etc/ld.so.cache
112: trying file=/lib64/libm.so.6
112:
112: find library=libgcc_s.so.1 [0]; searching
112: search path=/usr/local/gcc-5.4.0/lib64:tls/i686:tls:i686: (LD_LIBRARY_PATH)
112: trying file=/usr/local/gcc-5.4.0/lib64/libgcc_s.so.1
112:
112: find library=libc.so.6 [0]; searching
112: search path=/usr/local/gcc-5.4.0/lib64:tls/i686:tls:i686: (LD_LIBRARY_PATH)
112: trying file=/usr/local/gcc-5.4.0/lib64/libc.so.6
112: trying file=tls/i686/libc.so.6
112: trying file=tls/libc.so.6
112: trying file=i686/libc.so.6
112: trying file=libc.so.6
112: search cache=/etc/ld.so.cache
112: trying file=/lib64/libc.so.6
112:
112: find library=a1.so [0]; searching
112: search path=/usr/local/gcc-5.4.0/lib64:tls/i686:tls:i686: (LD_LIBRARY_PATH)
112: trying file=/usr/local/gcc-5.4.0/lib64/a1.so
112: trying file=tls/i686/a1.so
112: trying file=tls/a1.so
112: trying file=i686/a1.so
112: trying file=a1.so
112:
112: find library=a2.so [0]; searching
112: search path=/usr/local/gcc-5.4.0/lib64:tls/i686:tls:i686: (LD_LIBRARY_PATH)
112: trying file=/usr/local/gcc-5.4.0/lib64/a2.so
112: trying file=tls/i686/a2.so
112: trying file=tls/a2.so
112: trying file=i686/a2.so
112: trying file=a2.so
112:
112:
112: calling init: /lib64/libc.so.6
112:
112:
112: calling init: /lib64/libm.so.6
112:
112:
112: calling init: /usr/local/gcc-5.4.0/lib64/libgcc_s.so.1
112:
112:
112: calling init: /usr/local/gcc-5.4.0/lib64/libstdc++.so.6
112:
112:
112: calling init: a2.so
112:
112:
112: calling init: a1.so
112:
112:
112: calling init: b2.so
112:
112:
112: calling init: b1.so
112:
112:
112: initialize program: ./main
112:
112:
112: transferring control: ./main
112:
a1.c
a1.c
......
As the log shows, the dynamic libraries are loaded in the order b1.so, b2.so, a1.so, a2.so (the system libraries libstdc++.so.6, libm.so.6, libgcc_s.so.1, and libc.so.6 are loaded between b2.so and a1.so). The libraries are loaded according to their dependencies. Each find library entry lists the paths that are searched until the library is found.
The order of initialization is: a2.so, a1.so, b2.so, b1.so.
This sequence shows how the constructor of each library is called before the main function is executed. It can be seen that the initialization of dynamic libraries is carried out in the order of dependencies, that is, the initialization of a library will be performed after all the libraries it depends on are initialized.
__attribute__((init_priority(PRIORITY))) is a feature provided by GCC for C++ that controls the initialization priority of a global object. It can only be used on global or static object declarations. It changes the order in which object constructors are called, ensuring that the constructors of different objects are called in the specified priority order when the program starts (that is, before the main() function is executed). PRIORITY must be an integer between 101 and 65535, where 101 is the highest priority (initialized first) and 65535 is the lowest priority (initialized last).
• If no priority is defined, the initialization order depends on the order of the '.o' where the global variable is defined in the command line parameters when linking.
• If some global variables use init_priority and some do not, all global variables that use init_priority are initialized before global variables that do not use init_priority.
Sample code:
TestClass obj __attribute__((init_priority(102)));
Functions can use __attribute__((constructor(PRIORITY))) and __attribute__((destructor(PRIORITY))).
The __attribute__((constructor(PRIORITY))) attribute marks a function that should be executed automatically before the main() function is executed. If you specify PRIORITY, it affects the order in which multiple such functions are executed: a smaller PRIORITY value means that the initialization function is executed earlier.
A function marked with __attribute__((destructor(PRIORITY))) is called by the system after the main() function returns or exit() is called. PRIORITY has the same range as above.
Sample code:
void __attribute__((constructor(102))) test(void);
• Portability: __attribute__ is a GCC extension. Although many other compilers provide similar extensions, they are not compatible across compilers, so you should consider using other mechanisms or adding compatibility conditions.
• Initialization dependencies: Great care must be taken to manage dependencies between objects when using these attributes to modify the initialization order. Incorrectly planned initialization sequences can cause programs to crash when using uninitialized or semi-initialized objects.
• Default priority: The compiler also assigns a default initialization priority to global objects that do not have a specified priority. However, this default priority may vary from compiler to compiler, so it is best to specify the priority explicitly to avoid ambiguity.
• Compatibility with other features: When using constructor attributes, consider how they interact with other language features such as smart pointers and lazy initialization of static local variables.
The above sections describe the process of dynamic linking. From the perspective of the overall operation process of the program, it can be divided into several key stages: compilation, linking, loading, and running. The following table briefly summarizes these stages.
| Stage | Main work | Sample command |
|---|---|---|
| Compiling | gcc/g++ converts the source file into an ELF object file that contains compiled code but is not yet bound to the addresses of its dependencies. The .o file is generated on disk. | gcc -fPIC -c test.c -o test.o and gcc -c main.c -o main.o. -fPIC: generate position-independent code. -c: run only the compilation step, without linking. -o test.o: specify the name of the output file. |
| Linking | Set up the information that the dynamic linker (ld.so) needs, preparing the table structures and reference placeholders for runtime dynamic linking. The .so file is generated on disk. Detailed process: (1) Create a table of symbol references for subsequent resolution by the loader and dynamic linker. (2) Create the data structures for runtime symbol resolution, such as placeholders in the global offset table (GOT) and procedure linkage table (PLT). (3) Provide the necessary relocation entries to tell the loader where to find all references to dynamic libraries. | gcc -shared -o libtest.so test.o and gcc -o main main.o -L. -ltest. -shared: tell the linker to create a shared object, namely, a dynamic library. -o libtest.so: specify the name of the generated dynamic library file. |
| Loading (the focus of this article) | The dynamic linker loads the dynamic libraries into memory, resolves symbols, and performs relocation so that the program can run correctly in memory. Detailed process: (1) Start the dynamic linker, which relocates itself using its GOT and .dynamic information to complete bootstrapping. (2) Load the shared objects: merge the symbols of the executable file and of the linker itself into the global symbol table, then traverse the shared objects in breadth-first order, continuously merging their symbol tables into the global symbol table. If multiple shared objects define the same symbol, the one loaded first takes precedence and the later definitions are ignored. (3) Relocation (in memory): fix up the function calls and variable addresses that need correcting so that they point to the correct memory addresses. (4) Initialization: run the initialization code of the dynamic libraries, such as .init and constructors. | ./main |
| Running | Control is handed over to the main function. Further symbol references are resolved and updated when needed (for example, with lazy binding). | |
ELF: The Executable and Linkable Format is the standard binary file format in Unix systems, used for executable files, object code, shared libraries, and core dumps. An ELF file contains all the information needed to run a program, such as program instructions, program entry points, data, and symbol tables.
• Concept: Position-independent code refers to code that can be executed without depending on the specific loading address. Compiling to PIC means that the generated code can run anywhere in the address space of the process. This is especially crucial in dynamic libraries, because multiple programs may share a single copy of the same dynamic library, but the library may be loaded to different locations in the address spaces of the programs.
• Use phase: Compilation. Compiling with the '-fPIC' option generates position-independent code.
• Concept: The global offset table (GOT) provides a fixed location for storing the absolute addresses of external symbols and is populated by the dynamic linker. It is used to support position-independent code (PIC) in shared libraries.
• Use phase: Linking/loading. The linker creates the GOT and it is populated by the dynamic linker (part of the loader) when the program starts.
• Concept: The procedure linkage table (PLT) works with the GOT for function calls in dynamic linking. It contains code that finds the address of an external function from .got.plt. If the function is called for the first time, it triggers the dynamic linker to resolve the function address and fill it into the corresponding slot of .got.plt. If the function address has already been stored in .got.plt, it jumps directly to that address to continue execution.
• Use phase: Linking/loading. Similar to the GOT, the PLT is created in the linking phase, and its .got.plt slots are filled and updated when the program starts and when a dynamic symbol is accessed for the first time.
ld.so: The dynamic linker in Linux systems, responsible for loading shared libraries and performing dynamic linking and binding. It reads the dynamic library dependencies specified by the executable file and loads these libraries into memory, while also handling symbol resolution and relocation. When you run a dynamically linked executable file, the system actually runs ld.so first, and then your program itself.
| Section name | Description | Commands to view the information | Instance results |
|---|---|---|---|
| .interp | Stores the path to the dynamic linker. | | |
| .dynsym RA | Includes only the symbols that need to be dynamically linked during program execution. Symbols hidden by __attribute__((visibility("hidden"))) in GCC do not appear here. | | 'Ndx' (index) displayed as UND (short for "undefined") indicates that the symbol is not defined in the shared object and needs to be resolved (imported) from other shared objects. A non-zero address in the 'Value' column indicates the symbol's location in the shared object file (.so file). |
| .rela.dyn and .rela.plt RA | The relocation table sections that store the relocation information. .rela.dyn fixes data references, located in .got and the data segments. .rela.plt fixes function references (with PIC compilation enabled), located in .got.plt. Where there is a procedure linkage table, this table usually exists as well: the PLT performs absolute jumps, so every absolute address used by the PLT that needs dynamic linking or relocation (stored in .got.plt or .got, depending on whether lazy binding is enabled) must be recorded in .rela.plt. | | |
| .plt RA | A set of trampoline functions that implement lazy binding of shared library functions. | | |
| .text RA | The code section. | | |
| .dynamic RWA | Stores the basic information used by the dynamic linker, such as the dynamic symbol table (.dynsym), the string table (.dynstr), the relocation tables (.rela.dyn/.rela.plt), the runtime libraries the object depends on, and the library search paths. | | |
| .got and .got.plt RWA | Store the pointers that are filled in by relocation. | | |
| .data RWA | Stores initialized global and static variables. | | |
| .bss RWA | Stores uninitialized global and static variables. .bss does not occupy actual disk space, because it is just a placeholder. | | |
| .symtab | Includes not only the exported and imported symbols found in .dynsym but also local symbols (such as static functions and static global variables) and debug symbols. | | 'Ndx' (index) displayed as UND (short for "undefined") indicates that the symbol is not defined in the shared object and needs to be resolved (imported) from other shared objects. A non-zero address in the 'Value' column indicates the symbol's location in the shared object file (.so file). |
Show runtime links
Environment variables
Tools and commands
《Self-cultivation of Programmers》
Disclaimer: The views expressed herein are for reference only and don't necessarily represent the official views of Alibaba Cloud.