
Injecting Code into a Running Linux Application

Gregory Shpitalnik


February 12, 2009

CPOL


How to inject code into a running Linux application

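Introduction

Suppose your program is running on Linux and will not terminate for a long time, much like a UNIX daemon. You want to upgrade it in some simple way without stopping it. One option is to upgrade a known function in the program so that it performs some additional work, without changing the function's normal behavior and without terminating the program. To do that, you would inject new code into the program so that it is triggered whenever another, already existing function is called. This may be a somewhat contrived example, but it shows why injecting code into a running program is sometimes needed. It is also worth mentioning that viruses use techniques for injecting code into running programs.

In this article, I explain how to inject a C function into a running program on Linux without terminating it. Along the way we will discuss the Linux Executable and Linkable Format (ELF), object file sections, symbols, and relocations.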

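Overview of the working example

I will use the following simple example to explain the code injection technique step by step. The example consists of 3 components:

The dynamic (shared) library libdynlib.so, built from the dynlib.hpp and dynlib.cpp C++ source files.
The application app, built from the app.cpp source file and linked against the libdynlib.so library.
The injection function, located in the injection.cpp file.

Let's walk through the code of these components.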

// dynlib.hpp

extern "C" void print();

The dynlib.hpp header declares the print() function.

// dynlib.cpp

#include <stdlib.h>
#include <unistd.h>   // getpid()
#include <iostream>
#include "dynlib.hpp"

using namespace std;


extern "C" void print()
{
    static unsigned int counter = 0;
    ++counter;

    cout << counter << ": PID " << getpid() << ": In print() " << endl;
}

dynlib.cpp implements the print() function, which simply prints a counter (incremented on every call), the process ID, and a message.

// app.cpp

#include <unistd.h>   // sleep()
#include <iostream>   // cout, endl
#include "dynlib.hpp"

using namespace std;


int main()
{
    while (1)
    {
        print();
        cout << "Going to sleep ..." << endl;
        sleep(3);
        cout << "Waked up ..." << endl;
    }

    return 0;
}

The app.cpp application calls print() (from the libdynlib.so dynamic library), sleeps for a few seconds, and repeats this in an infinite loop.

// injection.cpp

#include <stdlib.h>   // system()

extern "C" void print();

extern "C" void injection()
{
    print();        // do the original job: call the print() function
    system("date"); // do some additional work
}

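The call to injection() will replace the call to print() in the application's main() function. injection() first calls the original print() function and then does some additional work. For example, it could run an external executable through system() or, as I do here, print the current date.

Compiling and running the application

First, we build the components using the g++ C++ compiler and the gcc C compiler.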

g++ -ggdb -Wall dynlib.cpp -fPIC -shared -o libdynlib.so
g++ -ggdb app.cpp -ldynlib -L./ -o app
gcc  -Wall injection.cpp -c -o injection.o

-rwxr-xr-x  1 gregory ftp  52248 Feb 12 02:05 app
-rw-r--r--  1 gregory ftp   1088 Feb 12 02:05 injection.o
-rwxr-xr-x  1 gregory ftp  52505 Feb 12 02:05 libdynlib.so

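Note that the libdynlib.so dynamic library is compiled and linked with the -fPIC flag, which produces position-independent code, and that the injection object is compiled with the gcc driver (gcc still treats a .cpp file as C++ source, which is why the functions are declared extern "C"). Now we can run the app executable.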

[lnx63:code_injection] ==> ./app
1: PID 4184: In print()
Going to sleep ...
Waked up ...
2: PID 4184: In print()
Going to sleep ...
Waked up ...
3: PID 4184: In print()
Going to sleep ...

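Into the debugger

The app application has only gone through a few iterations, but let us pretend it has been running for weeks and that it is time to inject new code without terminating it. We will use the Linux gdb debugger for the injection. First we attach gdb to the application process 4184, the PID (process ID) printed in the output above.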

[lnx63:code_injection] ==> gdb app 4184
GNU gdb 6.3
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i686-pc-linux-gnu"...
	Using host libthread_db library "/lib/tls/libthread_db.so.1".

Attaching to program: /store/fileril104/project/gregory/code_injection/app, process 4184
Reading symbols from 
	/store/fileril104/project/gregory/code_injection/libdynlib.so...done.
Loaded symbols for /store/fileril104/project/gregory/code_injection/libdynlib.so
Reading symbols from /usr/lib/libstdc++.so.6...done.
Loaded symbols for /usr/lib/libstdc++.so.6
Reading symbols from /lib/tls/libm.so.6...done.
Loaded symbols for /lib/tls/libm.so.6
Reading symbols from /lib/libgcc_s.so.1...done.
Loaded symbols for /lib/libgcc_s.so.1
Reading symbols from /lib/tls/libc.so.6...done.
Loaded symbols for /lib/tls/libc.so.6
Reading symbols from /lib/ld-linux.so.2...done.
Loaded symbols for /lib/ld-linux.so.2
0x006e17a2 in _dl_sysinfo_int80 () from /lib/ld-linux.so.2
(gdb)

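Loading the injection code into the process memory

As noted above, the injection.o object file was not part of the app process image to begin with. We first need to load injection.o into the process address space. This can be done with the mmap() system call, which maps the injection.o file into the app process address space. Let's do it from the debugger.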

(gdb) call open("injection.o", 2)
$1 = 3
(gdb) call mmap(0, 1088, 1 | 2 | 4, 1, 3, 0)
$2 = 1073754112
(gdb)

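We first open the injection.o file with O_RDWR (value 2) read/write access. We need write access because we will later modify the loaded injection code. The file descriptor returned for the opened file is 3. We then bring the file into the process address space with mmap(). The call takes the file size (1088 bytes), the mapping protection PROT_READ | PROT_WRITE | PROT_EXEC (read, write, and execute: 1 | 2 | 4), the MAP_SHARED flag (1), and the open file descriptor 3. It returns the start address of the mapped file in the process address space: 1073754112. We can verify that injection.o is really mapped by looking at the /proc/[pid]/maps file (where pid is the process ID, 4184 in our example), which on Linux describes the memory layout of a running process.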
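The numeric arguments passed to open() and mmap() in the debugger correspond to symbolic constants. The following is a minimal sketch of the same two calls written in C; the helper names rwx_protection() and map_object_file() are hypothetical and not part of the original example, and the numeric values (2, 7, 1) are the ones Linux uses.

```c
#include <assert.h>
#include <fcntl.h>      /* open, O_RDWR */
#include <stddef.h>
#include <sys/mman.h>   /* mmap, PROT_*, MAP_SHARED, MAP_FAILED */
#include <unistd.h>     /* close */

/* Protection bits used in the gdb session: read | write | execute. */
static int rwx_protection(void)
{
    return PROT_READ | PROT_WRITE | PROT_EXEC;
}

/* Hypothetical helper: map `size` bytes of the file at `path` as a shared,
 * readable, writable, and executable mapping.
 * Returns MAP_FAILED if the file cannot be opened or mapped. */
static void *map_object_file(const char *path, size_t size)
{
    assert(path != NULL);

    int fd = open(path, O_RDWR);
    if (fd < 0)
        return MAP_FAILED;

    void *addr = mmap(NULL, size, rwx_protection(), MAP_SHARED, fd, 0);
    close(fd);  /* the mapping remains valid after the descriptor is closed */
    return addr;
}
```

Inside the target process, map_object_file("injection.o", 1088) would perform the same steps as the two gdb call commands.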

[lnx63:code_injection] ==> cat /proc/4184/maps
006e1000-006f6000 r-xp 00000000 fd:00 394811     /lib/ld-2.3.4.so
006f6000-006f7000 r-xp 00015000 fd:00 394811     /lib/ld-2.3.4.so
006f7000-006f8000 rwxp 00016000 fd:00 394811     /lib/ld-2.3.4.so
006ff000-00824000 r-xp 00000000 fd:00 394812     /lib/tls/libc-2.3.4.so
00824000-00825000 r-xp 00124000 fd:00 394812     /lib/tls/libc-2.3.4.so
00825000-00828000 rwxp 00125000 fd:00 394812     /lib/tls/libc-2.3.4.so
00828000-0082a000 rwxp 00828000 00:00 0
00832000-00853000 r-xp 00000000 fd:00 394813     /lib/tls/libm-2.3.4.so
00853000-00855000 rwxp 00020000 fd:00 394813     /lib/tls/libm-2.3.4.so
0096e000-00975000 r-xp 00000000 fd:00 394816     /lib/libgcc_s-3.4.6-20060404.so.1
00975000-00976000 rwxp 00007000 fd:00 394816     /lib/libgcc_s-3.4.6-20060404.so.1
00978000-00a38000 r-xp 00000000 fd:00 45535      /usr/lib/libstdc++.so.6.0.3
00a38000-00a3d000 rwxp 000bf000 fd:00 45535      /usr/lib/libstdc++.so.6.0.3
00a3d000-00a43000 rwxp 00a3d000 00:00 0
08048000-08049000 r-xp 00000000 00:34 30468731   /store/fileril104/project/gregory/code_injection/app
08049000-0804a000 rwxp 00000000 00:34 30468731   /store/fileril104/project/gregory/code_injection/app
0804a000-0806b000 rwxp 0804a000 00:00 0
40000000-40001000 r-xp 00000000 00:34 30468725   /store/fileril104/project/gregory/code_injection/libdynlib.so
40001000-40002000 rwxp 00000000 00:34 30468725   /store/fileril104/project/gregory/code_injection/libdynlib.so
40002000-40003000 rwxp 40002000 00:00 0
40003000-40004000 rwxs 00000000 00:34 30468724   /store/fileril104/project/gregory/code_injection/injection.o
4000f000-40011000 rwxp 4000f000 00:00 0
bfffe000-c0000000 rwxp bfffe000 00:00 0
ffffe000-fffff000 ---p 00000000 00:00 0

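You can see that /store/fileril104/project/gregory/code_injection/injection.o is mapped into the process address space starting at 0x40003000 (1073754112 decimal) and ending at 0x40004000. The output above also shows the mappings of the other dynamic libraries. We now have all the components loaded into the process memory.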
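The same check can be done programmatically. Below is a small sketch, assuming only the "start-end" hexadecimal prefix that every line of /proc/[pid]/maps begins with; the function names parse_maps_range() and maps_line_contains() are made up for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Parse the "start-end" address range at the beginning of one line of
 * /proc/[pid]/maps. Returns 1 on success, 0 if the line does not match. */
static int parse_maps_range(const char *line,
                            unsigned long *start, unsigned long *end)
{
    assert(line != NULL && start != NULL && end != NULL);
    return sscanf(line, "%lx-%lx", start, end) == 2;
}

/* Returns 1 if addr lies inside the half-open range [start, end)
 * described by the given maps line, 0 otherwise. */
static int maps_line_contains(const char *line, unsigned long addr)
{
    unsigned long start, end;

    if (!parse_maps_range(line, &start, &end))
        return 0;
    return addr >= start && addr < end;
}
```

Applied to the injection.o line of the listing, the value returned by mmap() (1073754112) falls at the very start of the range, while 0x40004000 is already outside it.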

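Relocations

Now it is time to look inside the ELF-format executable. We will use the Linux readelf tool, which displays various data from ELF object files (on Linux, any object file, library, or executable). Let's look at the symbol relocations in the app executable. We are interested in the relocation for the print() function call.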

[lnx63:code_injection] ==> readelf -r app

Relocation section '.rel.dyn' at offset 0x5ec contains 2 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
08049d58  00001706 R_386_GLOB_DAT    00000000   __gmon_start__
08049d60  00000305 R_386_COPY        08049d60   _ZSt4cout

Relocation section '.rel.plt' at offset 0x5fc contains 13 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
08049d24  00000107 R_386_JUMP_SLOT   0804868c   print
08049d28  00000207 R_386_JUMP_SLOT   0804869c   _ZNSt8ios_base4InitC1E
08049d2c  00000507 R_386_JUMP_SLOT   080486ac   _ZStlsISt11char_traits
08049d30  00000607 R_386_JUMP_SLOT   080486bc   _ZNSolsEPFRSoS_E
08049d34  00000707 R_386_JUMP_SLOT   08048664   _init
08049d38  00000807 R_386_JUMP_SLOT   080486dc   sleep
08049d3c  00000907 R_386_JUMP_SLOT   080486ec   _ZNKSsixEj
08049d40  00000b07 R_386_JUMP_SLOT   080486fc   _ZNKSs4sizeEv
08049d44  00000c07 R_386_JUMP_SLOT   0804870c   __libc_start_main
08049d48  00000d07 R_386_JUMP_SLOT   08048ae4   _fini
08049d4c  00001307 R_386_JUMP_SLOT   0804872c   _ZSt4endlIcSt11char_tr
08049d50  00001507 R_386_JUMP_SLOT   0804873c   __gxx_personality_v0
08049d54  00001607 R_386_JUMP_SLOT   0804874c   _ZNSt8ios_base4InitD1E

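You can see that the print symbol relocation is at absolute (virtual) address (offset) 0x08049d24 in the app executable, and that its type is R_386_JUMP_SLOT. The relocation address is the absolute virtual address after the executable has been loaded into memory and started. Note that this relocation is listed in the .rel.plt section of the executable image. PLT stands for **Procedure Linkage Table**, which provides indirect function calls. This means that when you call a function you do not jump directly to its address; you first jump to an entry in the **Procedure Linkage Table**, and from the PLT you jump to the actual function code. This is required when calling a function that lives in a dynamic library (libdynlib.so in our example), because you do not know in advance at which address the library will be loaded in the process address space, nor in which dynamic library the required function (print() in our example) will be found first. All of this is known only when the application is loaded into memory and about to run; at that point the dynamic linker (ld-linux.so on Linux) resolves the relocations so that the requested functions are called correctly. In our example, the dynamic linker loads libdynlib.so into the process address space, finds the address of print() in the library, and stores that address at the relocation address 0x08049d24.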

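Our goal is to replace the address of print() with the address of the injection() function from the injection.o object file, which was not part of the process image when the program started.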
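The idea of calling through a slot, and of redirecting every caller by overwriting that slot, can be illustrated in plain C with a function pointer. This is only an analogy, not how the PLT is implemented: print_slot stands in for the word at 0x08049d24, and all names below are hypothetical.

```c
#include <assert.h>

static int print_calls;   /* how many times the original function ran */
static int extra_calls;   /* how many times the added work ran */

static void original_print(void) { ++print_calls; }

/* Stand-in for the injected function: original job first, then extra work. */
static void injected(void)
{
    original_print();
    ++extra_calls;
}

/* Stand-in for the relocated slot: callers never call original_print()
 * directly, they always go through this pointer. */
static void (*print_slot)(void) = original_print;

static void caller_loop_iteration(void)
{
    assert(print_slot != 0);
    print_slot();
}

/* Makes one call before and one call after patching the slot.
 * Returns 1 if the counters show the expected redirection. */
static int demo_slot_patch(void)
{
    print_calls = 0;
    extra_calls = 0;
    print_slot = original_print;

    caller_loop_iteration();          /* reaches original_print() */
    int before_ok = (print_calls == 1 && extra_calls == 0);

    print_slot = injected;            /* analogous to overwriting the slot in the debugger */
    caller_loop_iteration();          /* now reaches injected() */

    return before_ok && print_calls == 2 && extra_calls == 1;
}
```

The caller's code is never modified; only the pointer it reads changes, which is why the running application does not need to be stopped or relinked.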

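For more information about the ELF format, relocations, and the dynamic linker, see the Executable and Linkable Format (ELF) documentation.

We can check that address 0x08049d24 currently holds the address of the print() function.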

(gdb) p & print
$4 = (void (*)(void)) 0x40000be8 <print>
(gdb) p/x * 0x08049d24
$5 = 0x40000be8
(gdb)

The address of the injection() function can be found by running readelf -s (which displays an object file's symbol table) on injection.o.

[lnx63:code_injection] ==> readelf -s injection.o

Symbol table '.symtab' contains 13 entries:
   Num:    Value  Size Type    Bind   Vis      Ndx Name
     0: 00000000     0 NOTYPE  LOCAL  DEFAULT  UND
     1: 00000000     0 FILE    LOCAL  DEFAULT  ABS injection.cpp
     2: 00000000     0 SECTION LOCAL  DEFAULT    1
     3: 00000000     0 SECTION LOCAL  DEFAULT    3
     4: 00000000     0 SECTION LOCAL  DEFAULT    4
     5: 00000000     0 SECTION LOCAL  DEFAULT    5
     6: 00000000     0 SECTION LOCAL  DEFAULT    6
     7: 00000000     0 SECTION LOCAL  DEFAULT    8
     8: 00000000     0 SECTION LOCAL  DEFAULT    9
     9: 00000000    25 FUNC    GLOBAL DEFAULT    1 injection
    10: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND system
    11: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND print
    12: 00000000     0 NOTYPE  GLOBAL DEFAULT  UND __gxx_personality_v0

The injection function (symbol) is at offset 0 of the .text section of injection.o. The .text section itself, however, starts at offset 0x000034 within the injection.o object file.

[lnx63:code_injection] ==> readelf -S injection.o
There are 13 section headers, starting at offset 0x104:

Section Headers:
  [Nr] Name              Type            Addr     Off    Size   ES Flg Lk Inf Al
  [ 0]                   NULL            00000000 000000 000000 00      0   0  0
  [ 1] .text             PROGBITS        00000000 000034 000019 00  AX  0   0  4
  [ 2] .rel.text         REL             00000000 000418 000018 08     11   1  4
  [ 3] .data             PROGBITS        00000000 000050 000000 00  WA  0   0  4
  [ 4] .bss              NOBITS          00000000 000050 000000 00  WA  0   0  4
  [ 5] .rodata           PROGBITS        00000000 000050 000005 00   A  0   0  1
  [ 6] .eh_frame         PROGBITS        00000000 000058 000038 00   A  0   0  4
  [ 7] .rel.eh_frame     REL             00000000 000430 000010 08     11   6  4
  [ 8] .note.GNU-stack   NOTE            00000000 000090 000000 00      0   0  1
  [ 9] .comment          PROGBITS        00000000 000090 000012 00      0   0  1
  [10] .shstrtab         STRTAB          00000000 0000a2 00005f 00      0   0  1
  [11] .symtab           SYMTAB          00000000 00030c 0000d0 10     12   9  4
  [12] .strtab           STRTAB          00000000 0003dc 00003b 00      0   0  1
Key to Flags:
  W (write), A (alloc), X (execute), M (merge), S (strings)
  I (info), L (link order), G (group), x (unknown)
  O (extra OS processing required) o (OS specific), p (processor specific)

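Replacing print() with injection()

Recall that injection.o is loaded into the process memory at address 0x40003000 (see above). The final absolute address of injection() in the process is therefore 0x40003000 + 0x000034.

Now we set this address at the relocation address of print(), 0x08049d24.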

(gdb) set * 0x08049d24 = 0x40003000 + 0x000034
(gdb)

At this point we have replaced the call to print() with a call to injection().
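The value written into the slot is the sum of three numbers taken from the earlier listings: the mapping base, the file offset of the section, and the symbol's offset inside that section. A small sketch of that arithmetic follows; runtime_address() is a hypothetical helper, and it assumes the object file is mapped from file offset 0, as in this article.

```c
#include <assert.h>
#include <stdint.h>

/* Run-time address of something inside a raw-mapped object file:
 * mapping base + file offset of the section + offset inside the section. */
static uint32_t runtime_address(uint32_t map_base,
                                uint32_t section_file_offset,
                                uint32_t offset_in_section)
{
    /* guard against wrapping past the end of the 32-bit address space */
    assert(section_file_offset <= UINT32_MAX - map_base);
    return map_base + section_file_offset + offset_in_section;
}
```

For injection(), runtime_address(0x40003000, 0x34, 0) gives 0x40003034, the value stored at 0x08049d24.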

Resolving the relocations of injection()

We still have some work to do, though. The code of injection() is not ready to run yet, because it has 3 unresolved relocations.

[lnx63:code_injection] ==> readelf -r injection.o

Relocation section '.rel.text' at offset 0x418 contains 3 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00000009  00000501 R_386_32          00000000   .rodata
0000000e  00000a02 R_386_PC32        00000000   system
00000013  00000b02 R_386_PC32        00000000   print

Relocation section '.rel.eh_frame' at offset 0x430 contains 2 entries:
 Offset     Info    Type            Sym.Value  Sym. Name
00000011  00000c01 R_386_32          00000000   __gxx_personality_v0
00000024  00000201 R_386_32          00000000   .text

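The first relocation, .rodata, points to the constant string `"date"` stored in the .rodata read-only data section; the second, system, refers to the call to system(); and the third, print, refers to the call to print(). Note that all three are listed in the .rel.text section and that their offsets are relative to the start of the .text section.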
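The Info column of the readelf -r output packs two fields for 32-bit ELF: the symbol table index in the upper 24 bits and the relocation type in the low 8 bits. The sketch below decodes it; the function and enum names are invented for this illustration (the system header elf.h provides the equivalent ELF32_R_SYM and ELF32_R_TYPE macros).

```c
#include <assert.h>
#include <stdint.h>

/* i386 relocation type numbers that appear in the listings of this article. */
enum {
    REL_386_32        = 1,
    REL_386_PC32      = 2,
    REL_386_JUMP_SLOT = 7
};

/* Symbol table index: upper 24 bits of the 32-bit Info field. */
static uint32_t rel_symbol_index(uint32_t info)
{
    return info >> 8;
}

/* Relocation type: low 8 bits of the 32-bit Info field. */
static uint32_t rel_type(uint32_t info)
{
    return info & 0xffu;
}
```

For example, Info 00000a02 decodes to symbol 10 (system in the symbol table shown earlier) with type 2 (R_386_PC32), and 00000501 decodes to symbol 5 (the section symbol for .rodata) with type 1 (R_386_32).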

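We resolve these three relocations by hand and store the appropriate addresses at the three memory locations. The address of each relocation in the process address space is the sum of:

The start address of injection.o in the process address space (0x40003000).
The starting offset of the .text section within the injection.o object file (0x000034).
The offset of the relocation relative to the .text section (0x00000009 for .rodata, 0x0000000e for system, and 0x00000013 for print).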

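Note that the system and print relocations are of type R_386_PC32. This means that the value stored at the relocation location (the resolved address) must be computed relative to the program counter (PC), that is, relative to the relocation location itself. In addition, an R_386_PC32 relocation requires the value previously stored at the relocation location (the addend) to be added to the resolved address. The R_386_32 .rodata relocation likewise adds its addend to its resolved address.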
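In the usual ELF notation, with S the symbol address, A the addend, and P the address of the relocation location, R_386_32 stores S + A and R_386_PC32 stores S + A - P. The following sketch, with hypothetical function names, expresses both rules in 32-bit arithmetic. The call_target() helper assumes the patched word is the 4-byte displacement of a call instruction, which the CPU adds to the address of the next instruction (P + 4); that is why the addend is -4.

```c
#include <assert.h>
#include <stdint.h>

/* R_386_32: absolute address, S + A. */
static uint32_t resolve_abs32(uint32_t symbol_addr, int32_t addend)
{
    return symbol_addr + (uint32_t)addend;
}

/* R_386_PC32: address relative to the patched location, S + A - P.
 * Unsigned arithmetic wraps modulo 2^32, as the hardware does. */
static uint32_t resolve_pc32(uint32_t symbol_addr, int32_t addend,
                             uint32_t place)
{
    return symbol_addr + (uint32_t)addend - place;
}

/* Where a call lands: address of the next instruction plus displacement.
 * The 4-byte displacement starts at `place`, so the next instruction
 * starts at place + 4. */
static uint32_t call_target(uint32_t place, uint32_t displacement)
{
    return place + 4u + displacement;
}
```

With the addresses from this article, the system relocation sits at P = 0x40003000 + 0x34 + 0x0e = 0x40003042, and the displacement computed by resolve_pc32(0x733650, -4, 0x40003042) makes the call land exactly on 0x733650.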

(gdb) p & system
$7 = (<text> *) 0x733650 <system>  // Address of the system() function
(gdb) p * (0x40003000 + 0x000034 + 0x0000000e)
$8 = -4                              // Addend of the system relocation
(gdb) set * (0x40003000 + 0x000034 + 0x0000000e) = 0x733650 - (0x40003000 + 0x000034 + 0x0000000e) - 4
(gdb) p & print
$9 = (void (*)(void)) 0x40000be8 <print>    // Address of the print() function
(gdb) p * (0x40003000 + 0x000034 + 0x00000013)
$10 = -4                             // Addend of the print relocation
(gdb) set * (0x40003000 + 0x000034 + 0x00000013) = 0x40000be8 - (0x40003000 + 0x000034 + 0x00000013) - 4
(gdb) p * (0x40003000 + 0x000034 + 0x00000009)
$11 = 0                              // Addend of the .rodata relocation
(gdb) set * (0x40003000 + 0x000034 + 0x00000009) = 0x40003000 + 0x000050
// 0x000050 is the offset of the .rodata section within the injection.o object file.

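We have now resolved all three relocations in the code of injection(). We are done and can quit the debugger. The application keeps running and now performs the additional task of printing the current date.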

(gdb) quit
The program is running.  Quit anyway (and detach it)? (y or n) y
Detaching from program:
	/store/fileril104/project/gregory/code_injection/app, process 4184
[lnx63:code_injection] ==>

// The application execution continues

Waked up ...
Thu Feb 12 20:09:40 IST 2009
4: PID 4184: In print()
Going to sleep ...
Waked up ...
Thu Feb 12 20:09:43 IST 2009
5: PID 4184: In print()
Going to sleep ...
Waked up ...
Thu Feb 12 20:09:46 IST 2009
6: PID 4184: In print()
Going to sleep ...
Waked up ...
Thu Feb 12 20:09:49 IST 2009
7: PID 18138: In print()
Going to sleep ...
Waked up ...

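That's it.

Conclusion

I have shown how to inject a C function into a running program on Linux without terminating it. Note that the process memory manipulation shown above is only allowed for processes that you own or for which you have the appropriate privileges.

History

February 12, 2009: Initial release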