Initial Leak Sanitizer port to Windows.
compiler-rt/lib/lsan/lsan_common.cpp | ||
---|---|---|
411 | I guess do needs this as well on windows | |
compiler-rt/lib/lsan/lsan_common_win.cpp | ||
49 | How can this work without ProcessGlobalRegions? | |
compiler-rt/lib/lsan/lsan_interceptors.cpp | ||
506 | Can you please just convert unneeded stuff from INTERCEPT_FUNCTION into LSAN_MAYBE_INTERCEPT | |
compiler-rt/lib/sanitizer_common/sanitizer_stoptheworld_win.cpp | ||
4 | sanitizer_stoptheworld_test.cpp needs to be ported to windows, probably with std:: |
compiler-rt/lib/lsan/lsan.h | ||
---|---|---|
20 | Sorry, is there a reason we can't follow the current style in this patch? Did I miss something? It's not about fighting clang-format; it's about following the style of the surrounding lines, which follow the clang-format style. I don't need you to clang-format the whole file (although that would be great, because this directory has very low clang-format status): https://clang.llvm.org/docs/ClangFormattedStatus.html All I'm asking is that we don't indent the preprocessor directives in a non-LLVM style, i.e. #include "lsan_win.h" #if !SANITIZER_WINDOWS #endif vs # include "lsan_win.h" # if !SANITIZER_WINDOWS # endif |
compiler-rt/lib/lsan/lsan_common.h | ||
---|---|---|
48 | It might not seem useful to individuals, but I think it's the guidance, and a consistent style really helps new people understand what's going on. https://llvm.org/docs/Contributing.html#how-to-submit-a-patch I'm a little surprised the "llvm-premerge-checks" didn't highlight this @kuhnel |
Oh gosh, I'm sorry, I didn't realize that lsan has this non-standard LLVM style in its .clang-format, my apologies! D100238: [sanitizer] Set IndentPPDirectives: AfterHash in .clang-format
Kind of surprised TBH; it doesn't feel like the indentation happens often enough to warrant it, but that's up to you, it's not my backyard.
$ grep "#if" *.cpp | wc -l
461
llvm-project/compiler-rt/lib/sanitizer_common $ grep "# if" *.cpp | wc -l
27
compiler-rt/lib/lsan/lsan.h | ||
---|---|---|
20 | Yes, this is caused by clang-format. What should I do about it? | |
20 | @vitalybuka Should I create a patch, where I format the whole file or just these #includes? |
compiler-rt/lib/lsan/lsan.h | ||
---|---|---|
20 | You can ignore my comment. Whilst I think the lsan .clang-format file is incorrect in terms of matching the LLVM style, it is correct from the lsan subproject's point of view. I see now that you are doing what is in the local .clang-format file and it's the rest of the file that is incorrect (you would handle that separately or not at all). The fact that the .clang-format file doesn't follow the LLVM style is unfortunate, but as I expressed, that's up to you all. |
Please check "Edit Related Revisions" on the right of Phabricator. You can nicely chain these patches into a "Stack".
@vitalybuka could you or someone else help me get the lsan tests running on Windows?
For example, let's look at compiler-rt\test\lsan\TestCases\leak_check_at_exit.cpp (basically all other tests have the same issue; this one serves as an easy example).
I imagine that the following defines a variable called LSAN_BASE:
// RUN: LSAN_BASE="use_stacks=0:use_registers=0"
This doesn't seem to work on Windows.
It gives the following error message:
[build] $ ":" "RUN: at line 2"
[build] $ "LSAN_BASE=use_stacks=0:use_registers=0"
[build] # command stderr:
[build] 'LSAN_BASE=use_stacks=0:use_registers=0': command not found
Can you try to remove LSAN_BASE and just hardcode use_stacks=0:use_registers=0?
If it works, I don't mind if we do so for the rest (in a separate patch).
The problem is that LSAN_BASE is not always use_stacks=0:use_registers=0.
Also, there are many test cases where LSAN_BASE gets used multiple times.
Removing it would result in bad repetition.
Of course, I am asking to replace it with the correct value in each test.
I don't like LSAN_BASE even without Windows.
Usually when a test fails, I have a single command line to reproduce it or to put into a debugger for just that step. Here I need to cherry-pick two lines: the LSAN_BASE= line and the actual step.
Also, in most cases it is used just twice, so the repetition is not a big loss.
@vitalybuka I have now removed all LSAN_BASE occurrences.
However the invocations of the compiled test executables don't seem to work either.
This doesn't work:
// RUN: %env_lsan_opts=use_stacks=0:use_registers=0 not %run %t foo 2>&1 | FileCheck %s --check-prefix=CHECK-do
It generates this error:
[build] $ ":" "RUN: at line 3"
[build] $ "env" "LSAN_OPTIONS=:detect_leaks=1:use_stacks=0:use_registers=0" "not" "E:\git\llvm-project\llvm\build\ninja_debug\projects\compiler-rt\test\lsan\X86_64LsanConfig\TestCases\Output\leak_check_at_exit.cpp.tmp.exe" "foo"
[build] note: command had no output on stdout or stderr
[build] error: command failed with exit status: True
One of the problems seems to be that %t doesn't have a .exe ending on Windows.
I also tried changing %t to %t.exe, but that doesn't fix the problem and produces the same error.
I think the not command(?) is the part that fails?
When I run it on my own, it does work:
PS E:\git\llvm-project> & 'E:\git\llvm-project\llvm\build\ninja_debug\projects\compiler-rt\test\lsan\X86_64LsanConfig\TestCases\Output\leak_check_at_exit.cpp.tmp.exe'
Test alloc: 000061A000000000.
=================================================================
==25768==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 1337 byte(s) in 1 object(s) allocated from:
    #0 0x7ff6e2f745be in __asan_wrap_malloc E:\git\llvm-project\compiler-rt\lib\lsan\lsan_interceptors.cpp:75
    #1 0x7ff6e2f2aade in main E:\git\llvm-project\compiler-rt\test\lsan\TestCases\leak_check_at_exit.cpp:13:40
    #2 0x7ff6e2f79b5b in __scrt_common_main_seh d:\a01\_work\6\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
    #3 0x7fffde187033 (C:\WINDOWS\System32\KERNEL32.DLL+0x180017033)
    #4 0x7fffde8a2650 (C:\WINDOWS\SYSTEM32\ntdll.dll+0x180052650)
SUMMARY: LeakSanitizer: 1337 byte(s) leaked in 1 allocation(s).
@vitalybuka
No, check-asan doesn't work for me. It just hangs forever and does absolutely nothing: no output, and nothing shows up in Task Manager with high CPU usage.
However, check-clang does work. Is there some documentation on running check-asan on Windows?
I also saw that none of the Windows buildbots seem to test anything related to sanitizers. It looks to me as if none of the sanitizer code ever gets tested on Windows.
Do the tests for sanitizer_common/asan/ubsan/etc. even work on Windows?
This one runs asan on Windows; you can try to replicate its setup using the logs. https://lab.llvm.org/buildbot/#/builders/127
I rarely work on Windows, but I did a similar setup some time ago.
I will just confirm that check-asan is supposed to work, it worked for me a few months ago, and the linked buildbot tests it continuously.
However, WinASan is very sensitive to the OS and SDK version. The failure mode you described sounds plausible, something like a crash on startup that results in a message box that you can't see or something like that.
See a recent change rG7cd109b92c72855937273a6c8ab19016fbe27d33 by @aganea, who presumably was able to run the ASan tests for it.
It does work indeed. I am running under a "x64 Native Tools Command Prompt for VS 2019" prompt with MSVC 16.11.8.
D:\git\llvm-project>mkdir release && cd release
D:\git\llvm-project\release>cmake ../llvm -G Ninja -DLLVM_ENABLE_PROJECTS="clang;lld;llvm;compiler-rt" -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_PDB=ON -DLLVM_ENABLE_ASSERTIONS=ON
...
D:\git\llvm-project\release>ninja check-asan
[0/1] Running the AddressSanitizer tests
-- Testing: 669 tests, 16 workers --
Testing: 0.. 10.. 20.. 30.. 40.. 50.. 60.. 70.. 80.. 90..
Testing Time: 86.33s
  Unsupported      : 229
  Passed           : 426
  Expectedly Failed: 14
However you might need some Unix tools that don't come out-of-the-box on Windows.
You could first install the scoop package manager and then install any tool that you need:
D:\git\llvm-project>powershell
...
D:\git\llvm-project [main ↓9 +10 ~0 -0 !]> iwr -useb get.scoop.sh | iex
...
D:\git\llvm-project [main ↓9 +10 ~0 -0 !]> scoop install coreutils grep make sed unxutils sysinternals
And then run ninja check-asan again.
If you install sysinternals with scoop above, you can run procexp64 prior to ninja check-asan. That will give a much finer view on the running processes, as compared to the Task Manager. Let me know if I can help further.
@aganea Thanks for your help. I already did everything you wrote (I am however using the coreutils provided by Git).
check-asan works on my new computer. I still don't know why it didn't work on my older one...
I will try to get the lsan tests working (and hopefully passing 😉) on windows soon.
All the lsan tests, when run in lsan+asan mode, deadlock in StopTheWorld when creating the tracer thread...
-> check-lsan gets stuck because of all the deadlocking tests :(
Could it be caused by asan intercepting the Win32 calls in StopTheWorld?
@vitalybuka Could this be the problem? If so, how can I call the "real" functions without being intercepted by asan?
Is there a way (for now) to run the lsan tests in lsan-only mode (no asan)?
Updated the tests.
The problem with lsan+asan still exists. I have just disabled those tests for now.
I hope someone can help me get the tests to run correctly.
Hi,
Today I found some more time to make progress on this patch. I'm generally optimistic that most of the basic lsan functionality already works on Windows.
However, without some help from you, I'm unable to get the tests to work properly. The main problem I looked into today is that llvm-lit.py seems to be unable to capture the stdout/stderr of the tests. More precisely, my debugging showed that when I insert a print(procs[-1].stdout.read()) at TestRunner.py:790, I get the correctly captured stdout of a test, e.g.:
=================================================================
==25028==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 1337 byte(s) in 1 object(s) allocated from:
    #0 0x7ff77fc0488e in __asan_wrap_malloc C:\Users\cleme\dev\cpp\llvm-project\compiler-rt\lib\lsan\lsan_interceptors.cpp:75
    #1 0x7ff77fbbaafe in main C:\Users\cleme\dev\cpp\llvm-project\compiler-rt\test\lsan\TestCases\leak_check_at_exit.cpp:13:40
    #2 0x7ff77fc09e2b in invoke_main d:\a01\_work\43\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:78
    #3 0x7ff77fc09e2b in __scrt_common_main_seh d:\a01\_work\43\s\src\vctools\crt\vcstartup\src\startup\exe_common.inl:288
    #4 0x7ff8cb5654df (C:\Windows\System32\KERNEL32.DLL+0x1800154df)
    #5 0x7ff8cc9e485a (C:\Windows\SYSTEM32\ntdll.dll+0x18000485a)
SUMMARY: LeakSanitizer: 1337 byte(s) leaked in 1 allocation(s).
However, further below at lines 816-829 this output is lost (proc.stdout.read() returns an empty string, which obviously causes any FileCheck afterwards to fail), and I can't figure out why. The only possible explanation I can think of is that the captured stdout is lost during the next loop iteration over cmd.commands.
I was wrong: the problem was that I didn't change the exit code when a leak gets detected.
I now intercept ExitProcess and change the exit code there. With this, 12 tests are already passing on Windows.
Since I am currently unable to intercept the ExitProcess/TerminateProcess call in the ucrt (exit_or_terminate_process in Windows Kits\10\Source\10.0.22000.0\ucrt\startup\exit.cpp:143),
I inserted a call to Die in HandleLeaks (just like on Linux). This brings the number of passing tests up to 22.
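To make the exit-code issue concrete, here is a minimal sketch of why the `not %run %t` step failed while the binary printed a correct leak report. Both functions are hypothetical models written for this comment, not code from lit, from the `not` tool, or from the patch; the crash handling of `not` is left out, and the leak exit code is passed in as a parameter rather than taken from any real flag.

```cpp
#include <cassert>

// Simplified model of the `not` helper used in RUN lines: it succeeds
// (returns 0) only when the wrapped command fails. Crash handling is omitted.
constexpr int NotTool(int child_exit_code) {
  return child_exit_code == 0 ? 1 : 0;
}

// Hypothetical helper: the exit code a process should end with once the leak
// check has run. If leaks were found and a nonzero leak exit code is
// configured, that code replaces whatever the program asked for.
constexpr int FinalExitCode(int requested, bool leaks_found,
                            int leak_exit_code) {
  return (leaks_found && leak_exit_code != 0) ? leak_exit_code : requested;
}

// Without the override the leaking test exits with 0, so `not` fails the step.
static_assert(NotTool(0) == 1, "a clean exit makes `not` fail");
// With the override (23 is only an assumed example value) `not` succeeds.
static_assert(NotTool(FinalExitCode(0, true, 23)) == 0,
              "a nonzero exit makes `not` succeed");
```

In this model, the report on stderr is irrelevant to `not`; only the exit status decides whether the RUN line passes, which matches the "command had no output" failure seen earlier.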
I have now split the lsan TestCases changes into https://reviews.llvm.org/D124322
and managed to get the lsan+asan TestCases nearly working. The current problem is
that during the start of the RunThread in StopTheWorld, we intercept a call
to calloc made by the ucrt in ucrtbase!DllMainDispatch, which somehow results in a deadlock.
If someone has an idea how we could avoid or fix this, I would be very happy to hear it.
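One possible explanation, stated purely as an assumption, is that the allocator lock is already held by the thread that stops the world, so an intercepted calloc on the tracer thread can never acquire it. The sketch below models only that lock ordering with a toy spin lock; `AllocatorLockModel` and `TracerAllocationCanProceed` are hypothetical names, and nothing here is taken from the sanitizer runtime or confirmed as the actual cause on Windows.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical stand-in for an allocator lock implemented as a spin lock.
class AllocatorLockModel {
 public:
  // Returns true if the lock was free and is now held by the caller.
  bool TryLock() { return !held_.exchange(true, std::memory_order_acquire); }
  void Unlock() { held_.store(false, std::memory_order_release); }

 private:
  std::atomic<bool> held_{false};
};

// Returns true if an allocation made on the tracer thread could proceed.
// The second TryLock stands in for the intercepted calloc; a real blocking
// lock would hang at that point instead of returning false.
bool TracerAllocationCanProceed(bool world_is_stopped) {
  AllocatorLockModel lock;
  if (world_is_stopped) {
    // Assumption: the stopping thread takes the allocator lock first.
    bool taken = lock.TryLock();
    assert(taken);
    (void)taken;
  }
  bool acquired = lock.TryLock();  // tracer-side allocation attempt
  if (acquired) lock.Unlock();
  return acquired;
}
```

If this model matches what happens, the tracer thread would need to avoid the intercepted allocator entirely while the world is stopped; whether that is feasible for allocations made inside the CRT thread setup is exactly the open question.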
@vitalybuka
I've been working on this on and off for the past months and have gotten the tests into the following state:
Unsupported: 41
Passed     : 19
Failed     : 48
There is still the bug with the tracer thread deadlocking during an allocation made in the Microsoft CRT thread setup code when using Asan+Lsan.
I hope I can get the failing tests passing.
You can probably try to finish pure lsan first, and then get back to lsan+asan.
So if you continue to work on this, please send smaller, ready pieces for review.
@vitalybuka There seems to be a bug in MSVC's bit-field implementation, which causes the disable.c test to fail.
This reproducer (the assert) passes with gcc and fails with MSVC (m.tag is 0xffffffff):
#include <stdint.h>
#include <assert.h>

enum ChunkTag {
  kDirectlyLeaked = 0,  // default
  kIndirectlyLeaked = 1,
  kReachable = 2,
  kIgnored = 3
};

struct ChunkMetadata {
  uint8_t allocated : 8;  // Must be first.
  ChunkTag tag : 2;
  uintptr_t requested_size : 54;
  uint32_t stack_trace_id;
};

int main() {
  ChunkMetadata m;
  m.tag = kIgnored;
  assert(m.tag == kIgnored);
}

Do you have a suggestion for how I could fix this on MSVC?
@clemenswasser The bug has been there for years; I'm surprised it hasn't been fixed yet. MSVC doesn't support mixing different types in bit-fields. You should do:
struct ChunkMetadata {
  uint64_t allocated : 8;  // Must be first.
  uint64_t tag : 2;
  uint64_t requested_size : 54;
  uint32_t stack_trace_id;
};
Wow, this fixed a lot of tests, down to only 17 lsan standalone tests failing :) (26 already passing)
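For reference, a minimal sketch of the suggested layout with a check that the tag round-trips. The 0xffffffff value in the reproducer is most likely because MSVC gives a plain enum a signed underlying type, so a 2-bit enum field can only hold -2..1 and kIgnored reads back as -1. The accessor names `GetTag`, `SetTag` and `TagRoundTrips` are hypothetical, and the size check assumes a 64-bit target.

```cpp
#include <cstdint>

enum ChunkTag {
  kDirectlyLeaked = 0,  // default
  kIndirectlyLeaked = 1,
  kReachable = 2,
  kIgnored = 3
};

// All bit-fields share one unsigned 64-bit type, so they pack into a single
// storage unit and the 2-bit tag can hold the values 0..3.
struct ChunkMetadata {
  uint64_t allocated : 8;  // Must be first.
  uint64_t tag : 2;
  uint64_t requested_size : 54;
  uint32_t stack_trace_id;
};

// Assumption: 64-bit target, where uint64_t is 8-byte aligned inside structs.
static_assert(sizeof(ChunkMetadata) == 16, "bit-fields should share one word");

// Hypothetical accessors: the enum type is restored at the boundary instead of
// being used as the bit-field type.
inline ChunkTag GetTag(const ChunkMetadata &m) {
  return static_cast<ChunkTag>(m.tag);
}
inline void SetTag(ChunkMetadata &m, ChunkTag t) { m.tag = t; }

// Returns true if the largest tag value survives a store and a load.
inline bool TagRoundTrips() {
  ChunkMetadata m{};
  SetTag(m, kIgnored);
  return GetTag(m) == kIgnored;
}
```

Keeping the enum out of the bit-field declaration costs a cast at each access, but it avoids depending on how a particular compiler chooses the signedness of the enum's underlying type.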