Tried and Tested! How to Pinpoint Memory Bugs with Address Sanitizer: A Step-by-Step Guide

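AddressSanitizer is a tool developed by Google that detects memory access errors such as use-after-free and memory leaks. It is built into GCC 4.8 and later and can be used with both C and C++ code. AddressSanitizer uses runtime instrumentation to track memory allocations, which means you must build your code with AddressSanitizer enabled to take advantage of its features.

Extensive documentation is available on the AddressSanitizer GitHub wiki.

Memory leaks increase the total memory used by a program. It is important to free memory properly once it is no longer needed. For small programs, losing a few bytes here and there may not seem like a big deal. For long-running programs that use gigabytes of memory, however, avoiding leaks becomes increasingly important. If your program fails to free memory it no longer needs, it can run out of memory and terminate prematurely. AddressSanitizer can help detect these leaks.

AddressSanitizer can also detect use-after-free bugs. A use-after-free occurs when a program tries to read or write memory that has already been freed. This is undefined behavior and can lead to data corruption, incorrect results, or even a crash.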

Building With Address Sanitizer

We need to use gcc to build our code, so we load the gcc module:

  
module load gnu/9.1.0  

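The -fsanitize=address flag tells the compiler to add AddressSanitizer.

In addition, because of some environment configuration settings on OSC systems, we must also link against ASan statically. This is done with the -static-libasan or -l:libasan.a flag.

It is helpful to compile with debug symbols. If debug symbols are present, AddressSanitizer prints line numbers. To enable them, add the -g flag. If your stack traces do not look quite right, the -fno-omit-frame-pointer flag may also help.

In a single command, this looks like one of the following: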

  
   
gcc main.c -o main -fsanitize=address -l:libasan.a -g  
gcc main.c -o main -fsanitize=address -static-libasan -g  

Or, split into separate compile and link steps:

  
gcc -c main.c -fsanitize=address -g  
gcc main.o -o main -fsanitize=address -static-libasan  

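Note that both the compile and link steps need the -fsanitize=address flag, but only the link step needs -static-libasan. If your build system is more complex, it may make sense to put these flags in the CFLAGS and LDFLAGS environment variables.

That's it!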

Examples

No Leak

First, let's look at a program with no memory leak (noleak.c):

  
#include <stdio.h>  
#include <stdlib.h>  
#include <string.h>  
   
int main(int argc, const char *argv[]) {  
    char *s = malloc(100);  
    strcpy(s, "Hello world!");  
    printf("string is: %s\n", s);  
    free(s);  
    return 0;  
}  

To build it, we run:

  
gcc noleak.c -o noleak -fsanitize=address -static-libasan -g  

Running it gives the output:

  
string is: Hello world!  

Looks right! Because this program has no memory leak, AddressSanitizer prints nothing. But what if there is a leak?

Missing free

Let's look at the program above again, but this time with the call to free removed (leak.c):

  
#include <stdio.h>  
#include <stdlib.h>  
#include <string.h>  
   
int main(int argc, const char *argv[]) {  
    char *s = malloc(100);  
    strcpy(s, "Hello world!");  
    printf("string is: %s\n", s);  
    return 0;  
}  

Then, build:

  
gcc leak.c -o leak -fsanitize=address -static-libasan -g

Output:

  
string is: Hello world!  
  
=================================================================  
==235624==ERROR: LeakSanitizer: detected memory leaks  
  
Direct leak of 100 byte(s) in 1 object(s) allocated from:  
    #0 0x4eaaa8 in __interceptor_malloc ../../.././libsanitizer/asan/asan_malloc_linux.cc:144
    #1 0x5283dd in main /users/PZS0710/edanish/test/asan/leak.c:6
    #2 0x2b0c29909544 in __libc_start_main (/lib64/libc.so.6+0x22544)
  
SUMMARY: AddressSanitizer: 100 byte(s) leaked in 1 allocation(s).  

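This is AddressSanitizer's leak report. It detected that 100 bytes were allocated but never freed. Looking at the stack trace it provides, we can see that the memory was allocated on line 6 of leak.c.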

Use After Free

Suppose we found the leak above in our code and want to fix it. We need to add the call to free. But what if we add it in the wrong place?

  
#include <stdio.h>  
#include <stdlib.h>  
#include <string.h>  
   
int main(int argc, const char *argv[]) {  
    char *s = malloc(100);  
    free(s);  
    strcpy(s, "Hello world!");  
    printf("string is: %s\n", s);  
    return 0;  
}  

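The above (uaf.c) is clearly wrong. In this contrived example, the allocated memory (pointed to by "s") is written to and read from after it has been freed.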

To Build:

  
gcc uaf.c -o uaf -fsanitize=address -static-libasan -g

Building and running it, we get the following report from AddressSanitizer:

  
=================================================================  
==244157==ERROR: AddressSanitizer: heap-use-after-free on address 0x60b0000000f0 at pc 0x00000047a560 bp 0x7ffcdf0d59f0 sp 0x7ffcdf0d51a0  
WRITE of size 13 at 0x60b0000000f0 thread T0  
    #0 0x47a55f in __interceptor_memcpy ../../.././libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:790
    #1 0x528403 in main /users/PZS0710/edanish/test/asan/uaf.c:8  
    #2 0x2b47dd204544 in __libc_start_main (/lib64/libc.so.6+0x22544)
    #3 0x405f5c  (/users/PZS0710/edanish/test/asan/uaf+0x405f5c)  
  
0x60b0000000f0 is located 0 bytes inside of 100-byte region [0x60b0000000f0,0x60b000000154)  
freed by thread T0 here:  
    #0 0x4ea6f7 in __interceptor_free ../../.././libsanitizer/asan/asan_malloc_linux.cc:122
    #1 0x5283ed in main /users/PZS0710/edanish/test/asan/uaf.c:7
    #2 0x2b47dd204544 in __libc_start_main (/lib64/libc.so.6+0x22544)
  
previously allocated by thread T0 here:  
    #0 0x4eaaa8 in __interceptor_malloc ../../.././libsanitizer/asan/asan_malloc_linux.cc:144
    #1 0x5283dd in main /users/PZS0710/edanish/test/asan/uaf.c:6
    #2 0x2b47dd204544 in __libc_start_main (/lib64/libc.so.6+0x22544)
  
SUMMARY: AddressSanitizer: heap-use-after-free ../../.././libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:790 in __interceptor_memcpy
Shadow bytes around the buggy address:  
  0x0c167fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
  0x0c167fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
  0x0c167fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
  0x0c167fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00  
  0x0c167fff8000: fa fa fa fa fa fa fa fa fd fd fd fd fd fd fd fd  
=>0x0c167fff8010: fd fd fd fd fd fa fa fa fa fa fa fa fa fa[fd]fd  
  0x0c167fff8020: fd fd fd fd fd fd fd fd fd fd fd fa fa fa fa fa  
  0x0c167fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa  
  0x0c167fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa  
  0x0c167fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa  
  0x0c167fff8060: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa  
Shadow byte legend (one shadow byte represents 8 application bytes):  
  Addressable:           00  
  Partially addressable: 01 02 03 04 05 06 07  
  Heap left redzone:       fa  
  Freed heap region:       fd  
  Stack left redzone:      f1  
  Stack mid redzone:       f2  
  Stack right redzone:     f3  
  Stack after return:      f5  
  Stack use after scope:   f8  
  Global redzone:          f9  
  Global init order:       f6  
  Poisoned by user:        f7  
  Container overflow:      fc  
  Array cookie:            ac  
  Intra object redzone:    bb  
  ASan internal:           fe  
  Left alloca redzone:     ca  
  Right alloca redzone:    cb  
  Shadow gap:              cc  
==244157==ABORTING  

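This looks a bit intimidating. A lot seems to be going on, but it is not as bad as it looks. Starting at the top, we see what AddressSanitizer detected: in this case, a "WRITE" of 13 bytes (from our strcpy). Immediately after that comes a stack trace of where the write happened. It tells us the write occurred on line 8 of uaf.c, in the function named "main".

Next, AddressSanitizer reports where the memory is located. We can ignore this for now, but depending on your use case it can be useful information.

Then come two key pieces of information. AddressSanitizer tells us where the memory was freed (the "freed by thread T0 here:" section), giving us another stack trace that shows the memory was freed on line 7. It then reports where the memory was originally allocated ("previously allocated by thread T0 here:"), on line 6 of uaf.c.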

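This is probably enough information to start debugging the problem. The rest of the report gives details about how the memory is laid out and the exact address that was accessed or written. You likely do not need to pay much attention to that part; it is a bit "in the weeds" for most use cases.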

Heap Overflow

AddressSanitizer can also detect heap overflows. Consider the following code (overflow.c):

  
#include <stdio.h>  
#include <stdlib.h>  
#include <string.h>  
int main(int argc, const char *argv[]) {  
    // whoops, forgot c strings are null-terminated  
    // and not enough memory was allocated for the copy  
    char *s = malloc(12);  
    strcpy(s, "Hello world!");  
    printf("string is: %s\n", s);  
    free(s);  
    return 0;  
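}

The string "Hello world!" is 13 characters long including the null terminator, but we allocated only 12 bytes, so the strcpy above overflows the allocated buffer. To build this: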

  
gcc overflow.c -o overflow -fsanitize=address -static-libasan -g -Wall  

Then, running it, we get the following report from AddressSanitizer:

  
==168232==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x60200000003c at pc 0x000000423454 bp 0x7ffdd58700e0 sp 0x7ffdd586f890  
WRITE of size 13 at 0x60200000003c thread T0  
    #0 0x423453 in __interceptor_memcpy /apps_src/gnu/8.4.0/src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:737
    #1 0x5097c9 in main /users/PZS0710/edanish/test/asan/overflow.c:8
    #2 0x2ad93cbd7544 in __libc_start_main (/lib64/libc.so.6+0x22544)
    #3 0x405d7b  (/users/PZS0710/edanish/test/asan/overflow+0x405d7b)
  
0x60200000003c is located 0 bytes to the right of 12-byte region [0x602000000030,0x60200000003c)  
allocated by thread T0 here:  
    #0 0x4cd5d0 in __interceptor_malloc /apps_src/gnu/8.4.0/src/libsanitizer/asan/asan_malloc_linux.cc:86
    #1 0x5097af in main /users/PZS0710/edanish/test/asan/overflow.c:7
    #2 0x2ad93cbd7544 in __libc_start_main (/lib64/libc.so.6+0x22544)
  
SUMMARY: AddressSanitizer: heap-buffer-overflow /apps_src/gnu/8.4.0/src/libsanitizer/sanitizer_common/sanitizer_common_interceptors.inc:737 in __interceptor_memcpy
Shadow bytes around the buggy address:  
  0x0c047fff7fb0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fc0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fd0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7fe0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
  0x0c047fff7ff0: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
=>0x0c047fff8000: fa fa 00 fa fa fa 00[04]fa fa fa fa fa fa fa fa
  0x0c047fff8010: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8020: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8030: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8040: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
  0x0c047fff8050: fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa fa
Shadow byte legend (one shadow byte represents 8 application bytes):  
  Addressable:           00  
  Partially addressable: 01 02 03 04 05 06 07
  Heap left redzone:       fa  
  Freed heap region:       fd  
  Stack left redzone:      f1  
  Stack mid redzone:       f2  
  Stack right redzone:     f3  
  Stack after return:      f5  
  Stack use after scope:   f8  
  Global redzone:          f9  
  Global init order:       f6  
  Poisoned by user:        f7  
  Container overflow:      fc  
  Array cookie:            ac  
  Intra object redzone:    bb  
  ASan internal:           fe  
  Left alloca redzone:     ca  
  Right alloca redzone:    cb  
==168232==ABORTING  

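This is similar to the use-after-free report above. It tells us that a heap buffer overflow occurred, then reports where the write happened and where the memory was originally allocated. Again, the rest of the report describes the layout of the heap and is probably less important for your use case.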

C++ Delete Mismatch

AddressSanitizer can also be used with C++ code. Consider the following code (bad_delete.cxx):

  
#include <iostream>  
#include <cstring>  
   
int main(int argc, const char *argv[]) {  
    char *cstr = new char[100];  
    strcpy(cstr, "Hello World");  
    std::cout << cstr << std::endl;  
   
    delete cstr;  
    return 0;  
}  

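What is the problem here? The memory pointed to by "cstr" was allocated with new[]. An array allocation must be released with the delete[] operator, not plain "delete".

To build this code, just use g++ instead of gcc: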

  
g++ bad_delete.cxx -o bad_delete -fsanitize=address -static-libasan -g

Running it, we get the following output:

  
Hello World  
=================================================================  
==257438==ERROR: AddressSanitizer: alloc-dealloc-mismatch (operator new [] vs operator delete) on 0x60b000000040  
    #0 0x4d0a78 in operator delete(void*, unsigned long) /apps_src/gnu/8.4.0/src/libsanitizer/asan/asan_new_delete.cc:151
    #1 0x509ea8 in main /users/PZS0710/edanish/test/asan/bad_delete.cxx:9
    #2 0x2b8232878544 in __libc_start_main (/lib64/libc.so.6+0x22544)
    #3 0x40642b  (/users/PZS0710/edanish/test/asan/bad_delete+0x40642b)
  
0x60b000000040 is located 0 bytes inside of 100-byte region [0x60b000000040,0x60b0000000a4)  
allocated by thread T0 here:  
    #0 0x4cf840 in operator new[](unsigned long) /apps_src/gnu/8.4.0/src/libsanitizer/asan/asan_new_delete.cc:93
    #1 0x509e5f in main /users/PZS0710/edanish/test/asan/bad_delete.cxx:5
    #2 0x2b8232878544 in __libc_start_main (/lib64/libc.so.6+0x22544)
  
SUMMARY: AddressSanitizer: alloc-dealloc-mismatch /apps_src/gnu/8.4.0/src/libsanitizer/asan/asan_new_delete.cc:151 in operator delete(void*, unsigned long)
==257438==HINT: if you don't care about these errors you may set ASAN_OPTIONS=alloc_dealloc_mismatch=0
==257438==ABORTING  

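This is similar to the other AddressSanitizer output we have seen. This time, it tells us there is a mismatch between new and delete. It prints a stack trace of where the delete happened (line 9) and one of where the allocation happened (line 5).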

Performance

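The documentation states:

This tool is very fast. The average slowdown of the instrumented program is ~2x.

AddressSanitizer is much faster than similar analysis tools such as valgrind. This makes it practical to use on HPC codes.

However, if you find AddressSanitizer too slow for your code, you can use a compiler attribute to disable AddressSanitizer for specific functions. That way you can use AddressSanitizer on the colder parts of your code while auditing the hot paths by hand.

The compiler directive to skip instrumenting a function is: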

  
__attribute__((no_sanitize_address))
