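This article draws mainly on yichen's blog, supplemented with my own hands-on experiments and understanding. Its main purpose is to understand some of the underlying principles better.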

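Linux Dynamic Linking

PLT and GOT

On Linux, dynamic linking is implemented mainly through the PLT and the GOT. Let's experiment with the following source code to understand them.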

#include <stdio.h>
void print_banner()
{
    printf("Welcome to World of PLT and GOT\n");
}
int main(void)
{
    print_banner();
    return 0;
}

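Compile with gcc -Wall -g -o test.o -c test.c -m32 and then link with gcc -o test test.o -m32. This yields both a .o object file and an executable. Next, disassemble the object file with objdump (for example, objdump -d test.o):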

test.o:     file format elf32-i386


Disassembly of section .text:

00000000 <print_banner>:
   0:	55                   	push   %ebp
   1:	89 e5                	mov    %esp,%ebp
   3:	53                   	push   %ebx
   4:	83 ec 04             	sub    $0x4,%esp
   7:	e8 fc ff ff ff       	call   8 <print_banner+0x8>
   c:	05 01 00 00 00       	add    $0x1,%eax
  11:	83 ec 0c             	sub    $0xc,%esp
  14:	8d 90 00 00 00 00    	lea    0x0(%eax),%edx
  1a:	52                   	push   %edx
  1b:	89 c3                	mov    %eax,%ebx
  1d:	e8 fc ff ff ff       	call   1e <print_banner+0x1e>
  22:	83 c4 10             	add    $0x10,%esp
  25:	90                   	nop
  26:	8b 5d fc             	mov    -0x4(%ebp),%ebx
  29:	c9                   	leave
  2a:	c3                   	ret

0000002b <main>:
  2b:	55                   	push   %ebp
  2c:	89 e5                	mov    %esp,%ebp
  2e:	83 e4 f0             	and    $0xfffffff0,%esp
  31:	e8 fc ff ff ff       	call   32 <main+0x7>
  36:	05 01 00 00 00       	add    $0x1,%eax
  3b:	e8 fc ff ff ff       	call   3c <main+0x11>
  40:	b8 00 00 00 00       	mov    $0x0,%eax
  45:	c9                   	leave
  46:	c3                   	ret

Disassembly of section .text.__x86.get_pc_thunk.ax:

00000000 <__x86.get_pc_thunk.ax>:
   0:	8b 04 24             	mov    (%esp),%eax
   3:	c3                   	ret

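In the object file the call targets are not yet resolved, so every call operand is the placeholder fc ff ff ff, i.e. -4. Library functions live in glibc, and their addresses are only known once the program runs. Moreover, run-time relocation cannot modify the code segment; it can only write function addresses into the data segment. The linker therefore generates an extra piece of stub code, and the program obtains the address of the target function through that stub, for example: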

.text
...

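// the call instruction that invokes printf
call printf_stub
...
printf_stub:
    mov rax, [slot holding printf's address] // load printf's address after relocation
    jmp rax // jump there and execute printf

.data
...
slot holding printf's address: stores printf's address after relocation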

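From this we can conclude that every dynamically linked function needs two things: a data-segment slot that stores the function's address, and a piece of code that fetches that address. These are the familiar GOT and PLT. The call instructions in the executable hold the address of a PLT entry, the PLT entry points to the corresponding GOT entry, and the GOT entry points to the address in glibc. So a function's address can be reached through the PLT, but only if the GOT already holds the correct address, i.e. the relocation has already been written into the data segment. Relocating every function up front would be too costly, which is why lazy binding exists.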

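Lazy Binding

With lazy binding, a function's address is resolved and relocated only when the function is first called, as in the following pseudocode: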

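// Initially, before any relocation, printf@got is filled with the address of lookup_printf
void printf@plt()
{
address_good:
    jmp *printf@got
lookup_printf:
    call the relocation routine to find printf's address and write it to printf@got
    goto address_good; // go back and execute address_good
}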

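On the first call to printf(), printf@got holds the address of lookup_printf, which finds the address of printf() and writes it into printf@got. Once it finishes, control returns to address_good and finally jumps to printf. On the second call the address of printf is already known, so printf() executes directly. Next, let's look at how the lookup actually works.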

With PIE enabled
00001040 <puts@plt>:
    1040:	ff a3 10 00 00 00    	jmp    *0x10(%ebx)
    1046:	68 08 00 00 00       	push   $0x8
    104b:	e9 d0 ff ff ff       	jmp    1020 <_init+0x20>

With PIE disabled
Disassembly of section .plt:

08049020 <__libc_start_main@plt-0x10>:
 8049020:	ff 35 f8 bf 04 08    	push   0x804bff8
 8049026:	ff 25 fc bf 04 08    	jmp    *0x804bffc
 804902c:	00 00                	add    %al,(%eax)
	...

08049030 <__libc_start_main@plt>:
 8049030:	ff 25 00 c0 04 08    	jmp    *0x804c000
 8049036:	68 00 00 00 00       	push   $0x0
 804903b:	e9 e0 ff ff ff       	jmp    8049020 <_init+0x20>

08049040 <puts@plt>:
 8049040:	ff 25 04 c0 04 08    	jmp    *0x804c004
 8049046:	68 08 00 00 00       	push   $0x8
 804904b:	e9 d0 ff ff ff       	jmp    8049020 <_init+0x20>

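Why is this puts@plt? When printf is called with a plain string that contains no format specifiers and ends with a newline, and there are no extra arguments, the compiler optimizes the printf call into puts.

Apart from the first entry, the first instruction of every PLT entry is a jump through some address (a GOT entry). Let's look at what that entry contains: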

pwndbg> x/x 0x804c004
0x804c004 <puts@got[plt]>:	0x08049046

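The slot holds the address of the second instruction of the PLT entry. Why? Because puts has not been called yet, the GOT does not contain its real address, so the address has to be looked up first.

The instructions executed next are: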

8049046:	68 08 00 00 00       	push   $0x8
804904b:	e9 d0 ff ff ff       	jmp    8049020 <_init+0x20>
8049020:	ff 35 f8 bf 04 08    	push   0x804bff8
8049026:	ff 25 fc bf 04 08    	jmp    *0x804bffc

So what is stored at 0x804bffc before the program runs, and what is there once it is running?

Before running
pwndbg> x/x 0x804bffc
0x804bffc:	0x00000000
After running
pwndbg> x/x 0x804bffc
0x804bffc:	0xf7fdaba0

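An address has appeared, and it corresponds to the function **_dl_runtime_resolve**, which finds the address of the required function and stores it in the corresponding GOT entry.

OK, a quick summary of the flow when a function is called for the first time:

xxx@plt -> xxx@got -> xxx@plt -> common PLT entry -> _dl_runtime_resolve

Next, let's study the _dl_runtime_resolve function.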

_dl_runtime_resolve

_dl_runtime_resolve resolves and binds the addresses of shared-library functions at run time.

Here is how _dl_runtime_resolve(link_map_obj, reloc_index) handles a program's first call to a function:

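First, use link_map to access .dynamic and fetch the addresses of .dynstr, .dynsym, and .rel.plt;
Add the reloc_index argument to .rel.plt to get a pointer to the function's relocation entry (Elf32_Rel), called rel;
Use rel->r_info >> 8 as an index into .dynsym to get a pointer to the function's symbol entry (Elf32_Sym), called sym;
.dynstr + sym->st_name gives a pointer to the symbol name string;
Look up the function's address in the dynamic libraries and store it at *rel->r_offset, i.e. the GOT entry;
Finally, call the function.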

Let's walk through this with an example challenge:

#include <unistd.h>
#include <stdio.h>
#include <string.h>

void vuln()
{
    char buf[100];
    setbuf(stdin, buf);
    read(0, buf, 256);
}
int main()
{
    char buf[100] = "Welcome to XDCTF2015~!\n";

    setbuf(stdout, buf);
    write(1, buf, strlen(buf));
    vuln();
    return 0;
}

With the source above, step through it in gdb:

0x8049254 <main+120>    call   strlen@plt                  <strlen@plt>
Set a breakpoint here and step in:

 0x8049066  <strlen@plt+6>              push   0x18
 0x804906b  <strlen@plt+11>             jmp    0x8049020                   <0x8049020>
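This is what was described earlier: on the first call, the argument used to look up the function is pushed onto the stack, and then execution jumps to the common PLT entry.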
 0x8049020                              push   0x804bff8
 0x8049026                              jmp    *0x804bffc
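One more argument is pushed and execution jumps to _dl_runtime_resolve. That makes exactly two arguments: 0x18 is reloc_index (the second argument), and the value stored at 0x804bff8 is the link_map pointer (the first argument).

Now let's see how these two arguments lead to the required function's address: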

pwndbg> x/x 0x804bff8
0x804bff8:	0xf7ffda20
This gives the link_map address 0xf7ffda20, from which we can find the .dynamic address:

pwndbg> x/10x 0xf7ffda20
0xf7ffda20:	0x00000000	0xf7ffdd28	0x0804bf00	0xf7fc1000
0xf7ffda30:	0x00000000	0xf7ffda20	0x00000000	0xf7ffdd1c
0xf7ffda40:	0x00000000	0x0804bf00                            //0x0804bf00 is the .dynamic address

pwndbg> x/40x 0x0804bf00
0x804bf00:	0x00000001	0x00000048	0x0000000c	0x08049000
0x804bf10:	0x0000000d	0x08049284	0x00000019	0x0804bef8
0x804bf20:	0x0000001b	0x00000004	0x0000001a	0x0804befc
0x804bf30:	0x0000001c	0x00000004	0x6ffffef5	0x080481ec
0x804bf40:	0x00000005	0x080482ac	0x00000006	0x0804820c
0x804bf50:	0x0000000a	0x00000076	0x0000000b	0x00000010
0x804bf60:	0x00000015	0xf7ffd93c	0x00000003	0x0804bff4
0x804bf70:	0x00000002	0x00000028	0x00000014	0x00000011
0x804bf80:	0x00000017	0x08048380	0x00000011	0x08048368
0x804bf90:	0x00000012	0x00000018	0x00000013	0x00000008    //0x080482ac is the .dynstr address, 0x0804820c is the .dynsym address, 0x08048380 is the .rel.plt address

pwndbg> x/10x 0x08048380
0x8048380:	0x0804c000	0x00000107	0x0804c004	0x00000207
0x8048390:	0x0804c008	0x00000307	0x0804c00c	0x00000507
0x80483a0:	0x0804c010	0x00000607                            //0x0804c00c is r_offset, 0x00000507 is r_info, 5 is the index into .dynsym

pwndbg> x/30x 0x0804820c
0x804820c:	0x00000000	0x00000000	0x00000000	0x00000000 [0]
0x804821c:	0x00000016	0x00000000	0x00000000	0x00000012 [1]
0x804822c:	0x00000030	0x00000000	0x00000000	0x00000012 [2]
0x804823c:	0x00000024	0x00000000	0x00000000	0x00000012 [3]
0x804824c:	0x00000067	0x00000000	0x00000000	0x00000020 [4]
0x804825c:	0x0000001d	0x00000000	0x00000000	0x00000012 [5]
0x804826c:	0x00000042	0x00000000	0x00000000	0x00000012 [6]
0x804827c:	0x00000010	0x00000000                            //name_offset=0x0000001d

pwndbg> x/s 0x080482c9
0x80482c9:	"strlen"                                          //.dynstr + name_offset

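The third word of link_map is the address of .dynamic, i.e. 0x0804bf00. From it we obtain the addresses of .dynstr, .dynsym, and .rel.plt.
The word at .dynamic + 0x44 is the .dynstr address, 0x080482ac; the word at .dynamic + 0x4c is the .dynsym address, 0x0804820c; the word at .dynamic + 0x84 is the .rel.plt address, 0x08048380.
Adding the reloc_index argument to the .rel.plt address gives a pointer to the function's relocation entry Elf32_Rel, which yields r_offset and r_info. r_info >> 8 is the index into .dynsym; the symbol entry at that index holds the offset of the function's name, name_offset (the st_name field), and .dynstr + name_offset is the function's symbol name string. Finally, the function's address is looked up in the dynamic libraries and stored at *rel->r_offset, i.e. the GOT entry.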

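Roughly speaking, _dl_runtime_resolve ultimately decides which function to resolve from the name string that st_name points to, so by tampering with that data we can get an arbitrary function executed. And since reloc_index is a value we can control, a series of steps lets us redirect the lookup (this is the ret2dlresolve attack technique).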
