An Accessible Deep Dive into the Sanitizer Interceptor Mechanism

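Background

C++ developers regularly run into memory errors such as buffer overflows and dangling pointers, and multithreading errors such as data races and deadlocks. These errors often cause unexpected program behavior and undermine the safety and stability of the program, and locating them quickly has long been a headache. Sanitizer, a set of dynamic analysis tools open-sourced by Google, helps C/C++ developers locate such problems efficiently and improves development productivity. Sanitizer is already widely used at ByteDance for crash/coredump analysis in core server-side businesses such as search, advertising, and recommendation, where it has resolved hundreds of hard problems caused by memory errors and multithreaded data races. This article explains how the sanitizer interceptor mechanism works, to help readers understand and use sanitizer better.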

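Introduction to Sanitizer

Sanitizer is a family of dynamic code analysis tools open-sourced by Google. It has been integrated into Clang since Clang 3.1 and into GCC since GCC 4.8, and it helps programmers locate memory errors and multithreading errors quickly and precisely at run time. The toolset includes:

AddressSanitizer (ASan): detects memory errors such as buffer overflows, accesses to freed memory, and null pointer dereferences
LeakSanitizer (LSan): detects memory leaks
ThreadSanitizer (TSan): detects multithreaded data races and deadlocks
UndefinedBehaviorSanitizer (UBSan): detects undefined behavior
MemorySanitizer (MSan): detects accesses to uninitialized memory

In terms of implementation, every sanitizer consists of two parts: compile-time instrumentation and a run-time library.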

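Taking ASan as an example:

At compile time, ASan inserts code before every memory read and write. That code checks whether the access is legal by inspecting the state of the shadow memory that corresponds to the accessed address (shadow memory is extra memory used to record the state of regular memory). ASan also allocates extra memory around stack variables and global variables as redzones, which are used to detect overflows.
The ASan run-time library replaces the implementations of memory allocation functions such as malloc/free and operator new/delete, so that all allocations made by the application are served by ASan's own allocator. The ASan allocator places extra memory around each heap block it hands out in order to detect heap overflows, and it puts freed memory into a quarantine first in order to detect heap errors such as heap-use-after-free.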
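The shadow memory check described above can be sketched as follows. This is a minimal illustration, not code from the ASan sources: the names MemToShadow and IsAccessInvalid are chosen for this sketch, and the constants assume the commonly documented default mapping on Linux x86_64 (one shadow byte per 8 application bytes, shadow offset 0x7fff8000).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Assumed constants (default Linux x86_64 mapping, not taken from this article).
constexpr std::uintptr_t kShadowScale = 3;            // 8 bytes -> 1 shadow byte
constexpr std::uintptr_t kShadowOffset = 0x7fff8000;  // start of shadow region

// Address of the shadow byte describing the 8-byte granule that holds `addr`.
std::uintptr_t MemToShadow(std::uintptr_t addr) {
  return (addr >> kShadowScale) + kShadowOffset;
}

// Decides whether an access of `size` bytes at `addr` is invalid, given the
// value of the shadow byte for that granule:
//   0     -> all 8 bytes of the granule are addressable
//   1..7  -> only the first k bytes of the granule are addressable
//   < 0   -> the whole granule is poisoned (redzone, freed memory, ...)
bool IsAccessInvalid(std::int8_t shadow_value, std::uintptr_t addr,
                     std::size_t size) {
  assert(size >= 1 && size <= 8);
  if (shadow_value == 0) return false;
  const auto last_byte = static_cast<std::int64_t>((addr & 7) + size - 1);
  return last_byte >= shadow_value;
}
```

For instance, with a shadow value of 4, a 4-byte read at the start of the granule is accepted, while the same read two bytes further in touches bytes 2 to 5 and is rejected.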

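In fact, the ASan run-time library replaces far more than malloc/free and operator new/delete. It also replaces a large number of library functions, such as memcpy, memmove, strcpy, strcat, and pthread_create.

So how does sanitizer replace the implementations of functions like malloc/free? The answer is the interceptor mechanism in sanitizer.

This article uses ASan as the example and analyzes how the sanitizer interceptor is implemented on Linux x86_64.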

Symbol interposition

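Before explaining how the sanitizer interceptor is implemented, we need one piece of background knowledge: symbol interposition.

Consider this question first: how can we replace libc's malloc with our own implementation in our application?

The simplest way is to define a function with the same name, malloc, in our application.
Another way is to implement our malloc in libmymalloc.so and set the environment variable LD_PRELOAD=/path/to/libmymalloc.so before running the application.

Why do these two approaches work? The answer is symbol interposition.

The ELF specification says in Chapter 5, Program Loading and Dynamic Linking: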

When resolving symbolic references, the dynamic linker examines the symbol tables with a breadth-first search. That is, it first looks at the symbol table of the executable program itself, then at the symbol tables of the DT_NEEDED entries (in order), and then at the second level DT_NEEDED entries, and so on.

When binding symbol references, the dynamic linker/loader looks up symbols in breadth-first order: executable, needed0.so, needed1.so, needed2.so, needed0_of_needed0.so, needed1_of_needed0.so, …

If LD_PRELOAD is set, the lookup order becomes: executable, preload0.so, preload1.so, needed0.so, needed1.so, needed2.so, needed0_of_needed0.so, needed1_of_needed0.so, …

If a symbol is defined in more than one component (executable or shared object), the dynamic linker picks the first definition it sees.

Let us walk through an example:

$ cat main.c
extern int W(), X();
int main() { return (W() + X()); }
$ cat W.c
extern int b();
int a() { return (1); }
int W() { return (a() - b()); }
$ cat w.c
int b() { return (2); }
$ cat X.c
extern int b();
int a() { return (3); }
int X() { return (a() - b()); }
$ cat x.c
int b() { return (4); }
$ gcc -o libw.so -shared w.c
$ gcc -o libW.so -shared W.c -L. -lw -Wl,-rpath=.
$ gcc -o libx.so -shared x.c
$ gcc -o libX.so -shared X.c -L. -lx -Wl,-rpath=.
$ gcc -o test-symbind main.c -L. -lW -lX -Wl,-rpath=.

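The dependencies between the executable and the shared libraries follow from the link commands above: test-symbind depends on libW.so and libX.so, libW.so depends on libw.so, and libX.so depends on libx.so.

As described earlier, when the dynamic linker binds symbol references in this example, it searches for symbol definitions in the order test-symbind, libW.so, libX.so, libc.so.6, libw.so, libx.so.

The dynamic linker provides the environment variable LD_DEBUG for printing debugging information. Setting LD_DEBUG="symbols:bindings" shows the symbol binding process of test-symbind: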

$ LD_DEBUG="symbols:bindings" ./test-symbind
   1884890:        symbol=a;  lookup in file=./test-symbind [0]
   1884890:        symbol=a;  lookup in file=./libW.so [0]
   1884890:        binding file ./libW.so [0] to ./libW.so [0]: normal symbol `a'
   1884890:        symbol=b;  lookup in file=./test-symbind [0]
   1884890:        symbol=b;  lookup in file=./libW.so [0]
   1884890:        symbol=b;  lookup in file=./libX.so [0]
   1884890:        symbol=b;  lookup in file=/lib/x86_64-linux-gnu/libc.so.6 [0]
   1884890:        symbol=b;  lookup in file=./libw.so [0]
   1884890:        binding file ./libW.so [0] to ./libw.so [0]: normal symbol `b'
   1884890:        symbol=a;  lookup in file=./test-symbind [0]
   1884890:        symbol=a;  lookup in file=./libW.so [0]
   1884890:        binding file ./libX.so [0] to ./libW.so [0]: normal symbol `a'
   1884890:        symbol=b;  lookup in file=./test-symbind [0]
   1884890:        symbol=b;  lookup in file=./libW.so [0]
   1884890:        symbol=b;  lookup in file=./libX.so [0]
   1884890:        symbol=b;  lookup in file=/lib/x86_64-linux-gnu/libc.so.6 [0]
   1884890:        symbol=b;  lookup in file=./libw.so [0]
   1884890:        binding file ./libX.so [0] to ./libw.so [0]: normal symbol `b'

Function a is defined in both libW.so and libX.so, but because symbol definitions are searched in the order test-symbind, libW.so, libX.so, libc.so.6, libw.so, libx.so, every reference to a is bound to the implementation of a in libW.so.
Function b is defined in both libw.so and libx.so, but because symbol definitions are searched in the same order, every reference to b is bound to the implementation of b in libw.so.
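The lookup rule can be modeled in a few lines of C++. The sketch below is a toy model, not how the dynamic linker is implemented: Object, SearchOrder, Resolve, and ExampleGraph are names invented for this illustration, and the graph is written down by hand from the link commands of the example (including the implicit dependency on libc.so.6).

```cpp
#include <cassert>
#include <map>
#include <queue>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of a loaded object: its DT_NEEDED entries (in order)
// and the symbols it defines.
struct Object {
  std::vector<std::string> needed;
  std::set<std::string> defines;
};
using ObjectGraph = std::map<std::string, Object>;

// Breadth-first walk from the executable; every object is listed once, at
// the position where it is first reached.
std::vector<std::string> SearchOrder(const ObjectGraph &graph,
                                     const std::string &executable) {
  assert(graph.count(executable) == 1);
  std::vector<std::string> order;
  std::set<std::string> seen{executable};
  std::queue<std::string> pending;
  pending.push(executable);
  while (!pending.empty()) {
    const std::string name = pending.front();
    pending.pop();
    order.push_back(name);
    const auto it = graph.find(name);
    if (it == graph.end()) continue;
    for (const std::string &dep : it->second.needed)
      if (seen.insert(dep).second) pending.push(dep);
  }
  return order;
}

// First object in search order that defines `symbol`, or "" if none does.
std::string Resolve(const ObjectGraph &graph, const std::string &executable,
                    const std::string &symbol) {
  for (const std::string &name : SearchOrder(graph, executable)) {
    const auto it = graph.find(name);
    if (it != graph.end() && it->second.defines.count(symbol)) return name;
  }
  return "";
}

// Hand-written dependency graph of the test-symbind example.
ObjectGraph ExampleGraph() {
  ObjectGraph g;
  g["test-symbind"] = Object{{"libW.so", "libX.so", "libc.so.6"}, {"main"}};
  g["libW.so"] = Object{{"libw.so", "libc.so.6"}, {"a", "W"}};
  g["libX.so"] = Object{{"libx.so", "libc.so.6"}, {"a", "X"}};
  g["libw.so"] = Object{{"libc.so.6"}, {"b"}};
  g["libx.so"] = Object{{"libc.so.6"}, {"b"}};
  g["libc.so.6"] = Object{{}, {}};
  return g;
}
```

Calling Resolve(ExampleGraph(), "test-symbind", "a") picks libW.so, and the same call for "b" picks libw.so, matching the bindings in the LD_DEBUG output.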

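Now it is clear why the two ways of replacing malloc mentioned at the start of this section work:

Approach 1: define a function named malloc in our application. The executable comes before libc.so.6 in the dynamic linker's lookup order, so every reference to malloc is bound to the implementation in the executable.
Approach 2: implement our malloc in libmymalloc.so and set the environment variable LD_PRELOAD=/path/to/libmymalloc.so before running the application. libmymalloc.so comes before libc.so.6 in the dynamic linker's lookup order, so every reference to malloc is bound to the implementation in libmymalloc.so.

Sanitizer replaces library functions such as malloc/free by relying on exactly this property, symbol interposition. Let us verify this with ASan.

Consider the following code: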

// test.cpp
#include <iostream>
int main() {
    std::cout << "Hello AddressSanitizer!\n";
}

Let us look at GCC first.

Compile test.cpp with GCC and ASan enabled, g++ -fsanitize=address test.cpp -o test-gcc-asan, to produce the binary test-gcc-asan. Because GCC links the ASan run-time library dynamically by default, we can list the shared objects that test-gcc-asan depends on with objdump -p test-gcc-asan | grep NEEDED:

$ objdump -p test-gcc-asan | grep NEEDED
  NEEDED               libasan.so.5
  NEEDED               libstdc++.so.6
  NEEDED               libm.so.6
  NEEDED               libgcc_s.so.1
  NEEDED               libc.so.6

It is clear that libasan.so comes before libc.so.6 among the shared objects test-gcc-asan depends on. In fact, the link-time option -fsanitize=address makes libasan.so the first dependency of the program.

Setting the environment variable LD_DEBUG="bindings" shows the symbol binding process of test-gcc-asan:

$ LD_DEBUG="bindings" ./test-gcc-asan
   3309213:        binding file /lib/x86_64-linux-gnu/libc.so.6 [0] to /usr/lib/x86_64-linux-gnu/libasan.so.5 [0]: normal symbol `malloc' [GLIBC_2.2.5]
   3309213:        binding file /lib64/ld-linux-x86-64.so.2 [0] to /usr/lib/x86_64-linux-gnu/libasan.so.5 [0]: normal symbol `malloc' [GLIBC_2.2.5]
   3309213:        binding file /usr/lib/x86_64-linux-gnu/libstdc++.so.6 [0] to /usr/lib/x86_64-linux-gnu/libasan.so.5 [0]: normal symbol `malloc' [GLIBC_2.2.5]

The dynamic linker binds the references to malloc in libc.so.6, ld-linux-x86-64.so.2, and libstdc++.so.6 to the malloc implementation in libasan.so.

Now for Clang. Because Clang links the ASan run-time library statically by default, we skip the list of shared objects that test-clang-asan depends on and go straight to its symbol binding process, obtained with LD_DEBUG="bindings" in the same way (output omitted here).

Likewise, the dynamic linker binds the references to malloc in libc.so.6, ld-linux-x86-64.so.2, and libstdc++.so.6 to the malloc implementation in test-clang-asan, because the ASan run-time library implements malloc and Clang links that library statically into test-clang-asan.

Sanitizer interceptor

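Next, let us study the implementation of the sanitizer interceptor from the source code's point of view.

A very useful way to read and learn LLVM code is to study it together with the corresponding test code.

The sanitizer interceptor has a test file, interception_linux_test.cpp: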

#include "interception/interception.h"
#include "gtest/gtest.h"
static int InterceptorFunctionCalled;
DECLARE_REAL(int, isdigit, int);
INTERCEPTOR(int, isdigit, int d) {
  ++InterceptorFunctionCalled;
  return d >= '0' && d <= '9';
}
namespace __interception {
TEST(Interception, Basic) {
  EXPECT_TRUE(INTERCEPT_FUNCTION(isdigit));
  // After interception, the counter should be incremented.
  InterceptorFunctionCalled = 0;
  EXPECT_NE(0, isdigit('1'));
  EXPECT_EQ(1, InterceptorFunctionCalled);
  EXPECT_EQ(0, isdigit('a'));
  EXPECT_EQ(2, InterceptorFunctionCalled);
  // Calling the REAL function should not affect the counter.
  InterceptorFunctionCalled = 0;
  EXPECT_NE(0, REAL(isdigit)('1'));
  EXPECT_EQ(0, REAL(isdigit)('a'));
  EXPECT_EQ(0, InterceptorFunctionCalled);
}
}  // namespace __interception

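This test uses the sanitizer interceptor mechanism to replace the implementation of isdigit. The replacement implemented in the test file increments the variable InterceptorFunctionCalled every time isdigit is called, and the test then checks the value of InterceptorFunctionCalled to verify that the interceptor mechanism works correctly.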

The core of the replacement of isdigit in interception_linux_test.cpp is the following snippet:

INTERCEPTOR(int, isdigit, int d) {
  ++InterceptorFunctionCalled;
  return d >= '0' && d <= '9';
}
INTERCEPT_FUNCTION(isdigit);
DECLARE_REAL(int, isdigit, int);
REAL(isdigit)('1');

INTERCEPTOR(int, isdigit, int d) { ... } replaces the implementation of isdigit with the body { ... }.
Before isdigit is called in the code, INTERCEPT_FUNCTION(isdigit) must be called first. If INTERCEPT_FUNCTION(isdigit) returns true, the implementation of isdigit in libc has been replaced successfully.
REAL(isdigit)('1') calls the real isdigit implementation; DECLARE_REAL(int, isdigit, int) must appear before REAL(isdigit)('1') is used.

What each of these macros expands to is described step by step in the rest of this section.
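Let us first look at what the INTERCEPTOR macro does:
- First, it defines a function pointer real_isdigit in the __interception namespace. The INTERCEPT_FUNCTION macro later sets this pointer to the address of the real isdigit function.
- Then it declares isdigit as a weak symbol and makes isdigit an alias of __interceptor_isdigit.
- Finally, the logic of our own version of isdigit is implemented in the function __interceptor_isdigit.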
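The weak-alias arrangement can be tried out in isolation. The sketch below is not the expansion of INTERCEPTOR; it applies the same two attributes to a made-up function, demo_twice, so that it does not interfere with any libc symbol. It assumes GCC or Clang on an ELF platform such as Linux, where the alias attribute is available; the names demo_twice, g_interceptor_calls, and RunAliasDemo are hypothetical.

```cpp
#include <cassert>

int g_interceptor_calls = 0;  // hypothetical counter, for the demo only

// The body lives under the __interceptor_ prefixed name ...
extern "C" int __interceptor_demo_twice(int x) {
  ++g_interceptor_calls;
  return 2 * x;
}

// ... and the public name is only a weak alias of it. A strong definition of
// demo_twice elsewhere in the program would take precedence over this one
// instead of causing a multiple definition error.
extern "C" int demo_twice(int x)
    __attribute__((weak, alias("__interceptor_demo_twice")));

void RunAliasDemo() {
  g_interceptor_calls = 0;
  assert(demo_twice(21) == 42);      // the call reaches the wrapper via the alias
  assert(g_interceptor_calls == 1);  // so the wrapper's counter was incremented
}
```

Both names refer to the same code, so a caller may use either one; that is what lets user code define its own strong version of the public name and still forward to the __interceptor_ prefixed one.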

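From the section on symbol interposition we know that replacing the implementation of some function in libc.so.6 (call it foo) only requires defining a function with the same name, foo, in the sanitizer run-time library and making sure that the run-time library comes before libc.so.6 in the dynamic linker's lookup order.

Why, then, is the logic of our isdigit implemented in the function __interceptor_isdigit, with isdigit made an alias of __interceptor_isdigit?

Consider this scenario: the user code also replaces the implementation of isdigit and adds its own logic. The dynamic linker then picks the isdigit in the user code rather than the one in the sanitizer run-time library, and the sanitizer can no longer work correctly. (The sanitizer run-time library does not actually replace isdigit; isdigit is used here only as a convenient example.)

However, if the sanitizer run-time library makes isdigit an alias of __interceptor_isdigit, user code that replaces isdigit can call __interceptor_isdigit explicitly. This neither prevents users from replacing library functions themselves nor breaks the sanitizer: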

#include <cstdio>
extern "C" int __interceptor_isdigit(int d);
extern "C" int isdigit(int d) {
  fprintf(stderr, "my_isdigit_interceptor\n");
  return __interceptor_isdigit(d);
}

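Why does the sanitizer run-time library declare the replaced functions as weak symbols? Because otherwise, when the sanitizer run-time library is linked statically and user code also defines the same function, linking fails with a multiple definition error.

Next, let us look at what the INTERCEPT_FUNCTION macro does:
- INTERCEPT_FUNCTION expands to a call to __interception::InterceptFunction, which is defined as follows: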

namespace __interception {
static void *GetFuncAddr(const char *name, uptr wrapper_addr) {
  void *addr = dlsym(RTLD_NEXT, name);
  if (!addr) {
    // If the lookup using RTLD_NEXT failed, the sanitizer runtime library is
    // later in the library search order than the DSO that we are trying to
    // intercept, which means that we cannot intercept this function. We still
    // want the address of the real definition, though, so look it up using
    // RTLD_DEFAULT.
    addr = dlsym(RTLD_DEFAULT, name);
    // In case `name' is not loaded, dlsym ends up finding the actual wrapper.
    // We don't want to intercept the wrapper and have it point to itself.
    if ((uptr)addr == wrapper_addr)
      addr = nullptr;
  }
  return addr;
}
bool InterceptFunction(const char *name, uptr *ptr_to_real, uptr func,
                       uptr wrapper) {
  void *addr = GetFuncAddr(name, wrapper);
  *ptr_to_real = (uptr)addr;
  return addr && (func == wrapper);
}
}  // namespace __interception

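The implementation of InterceptFunction is simple: it obtains the address of the original function named name through GetFuncAddr and stores that address in the memory that ptr_to_real points to. It returns true only if an address was found and func equals wrapper, that is, only if the public name really resolves to the wrapper.

GetFuncAddr is simple as well; its core is dlsym:

When the first argument of dlsym is RTLD_DEFAULT, the function named name is looked up in the order described earlier: executable, preload0.so, preload1.so, needed0.so, needed1.so, needed2.so, needed0_of_needed0.so, needed1_of_needed0.so, …
When the first argument of dlsym is RTLD_NEXT, the lookup starts from the shared objects that come after the current object in that order.

This is why GetFuncAddr first uses dlsym(RTLD_NEXT, name) to find the real address of the replaced function: the sanitizer run-time library, as a dependency, comes before the shared object that actually contains name.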
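The difference between the two handles is easy to observe from a small program. The sketch below assumes Linux with glibc (older glibc versions need -ldl at link time) and a C++ compiler that defines _GNU_SOURCE by default, as g++ and clang++ do, so that RTLD_NEXT is visible. FindRealFunction and CallRealGetpid are names invented for this illustration; getpid is used only because it is a harmless libc function.

```cpp
#include <cassert>
#include <dlfcn.h>
#include <unistd.h>

// Hypothetical helper that follows the same lookup strategy as GetFuncAddr:
// search the objects after the caller first, then fall back to the global
// lookup order. Returns nullptr if the symbol is not found at all.
void *FindRealFunction(const char *name) {
  void *addr = dlsym(RTLD_NEXT, name);
  if (!addr) addr = dlsym(RTLD_DEFAULT, name);
  return addr;
}

using getpid_type = pid_t (*)();

// Looks up getpid by name and calls it through the returned pointer, the way
// an interceptor would call the function it replaced.
pid_t CallRealGetpid() {
  void *addr = FindRealFunction("getpid");
  assert(addr != nullptr);
  return reinterpret_cast<getpid_type>(addr)();
}
```

When this code is part of the executable, RTLD_NEXT skips the executable itself and continues with the shared objects that follow it, so the lookup lands on the definition in libc.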

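Finally, let us look at what the DECLARE_REAL and REAL macros do.

DECLARE_REAL expands to a declaration stating that the __interception namespace contains a function pointer to the real implementation of the replaced function. The REAL macro calls the real implementation through this function pointer.

In the test case, for example, DECLARE_REAL(int, isdigit, int); declares that the __interception namespace contains a function pointer real_isdigit that points to the address of the real isdigit function, and REAL(isdigit) calls the real isdigit through it.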
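A simplified model of this pair of macros is sketched below. These are not the definitions from interception.h: the DEMO_ prefixed macros, the demo_interception namespace, and the function OriginalAddOne are made up for this illustration, and the pointer is filled in by hand where InterceptFunction would store the address returned by dlsym.

```cpp
#include <cassert>

// Hypothetical, simplified stand-ins for DEFINE_REAL / DECLARE_REAL / REAL.
#define DEMO_DEFINE_REAL(ret, func, ...)      \
  namespace demo_interception {               \
  ret (*real_##func)(__VA_ARGS__) = nullptr;  \
  }
#define DEMO_DECLARE_REAL(ret, func, ...)     \
  namespace demo_interception {               \
  extern ret (*real_##func)(__VA_ARGS__);     \
  }
#define DEMO_REAL(func) demo_interception::real_##func

// Plays the role of the original library function in this sketch.
int OriginalAddOne(int x) { return x + 1; }

DEMO_DECLARE_REAL(int, add_one, int)  // declares demo_interception::real_add_one
DEMO_DEFINE_REAL(int, add_one, int)   // defines it, initially null

void RunRealDemo() {
  // In the real mechanism, InterceptFunction stores the looked-up address.
  DEMO_REAL(add_one) = &OriginalAddOne;
  assert(DEMO_REAL(add_one)(41) == 42);  // call through the stored pointer
}
```

The declaration and the definition are separate macros so that a file which only wants to call the real function can declare the pointer without defining it a second time.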

Summary

We have now covered the underlying principle of the sanitizer interceptor mechanism on Linux.

ASan uses the sanitizer interceptor mechanism to replace memory allocation and deallocation functions such as malloc/free, so that every allocation and deallocation is handled by ASan's own allocator. This makes it easy for ASan to detect heap errors such as heap-use-after-free and double-free.

For sanitizer users, knowing how sanitizer works makes it easier to understand the tool and to use its mechanisms to track down hard bugs more efficiently.

References

ELF interposition and -Bsymbolic | MaskRay
Observing Symbol Bindings, Linker and Libraries Guide
dlsym(3), Linux manual page
asan/tsan: weak interceptors (llvm/llvm-project@7fb7330, GitHub)
