This is an archive of the discontinued LLVM Phabricator instance.

[ASan] Use debuginfo for symbolization.
ClosedPublic

Authored by hctim on Jun 10 2022, 4:30 PM.

Details

Summary

Hint: Looking here because your manual invocation of something in
'check-asan' broke? You need a new symbolizer (after D123538).

An upcoming patch will remove the internal metadata for global
variables. With D123534 and D123538, clang now emits DWARF debug info
for constant strings (the only global variable type it was missing), and
llvm-symbolizer is now able to symbolize all global variable addresses
(where previously it wouldn't give you the file:line information).

Move ASan's runtime over from the internal metadata to DWARF.

Diff Detail

Event Timeline

hctim created this revision.Jun 10 2022, 4:30 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 10 2022, 4:30 PM
Herald added a subscriber: Enna1. · View Herald Transcript
hctim requested review of this revision.Jun 10 2022, 4:30 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 10 2022, 4:30 PM
Herald added a subscriber: Restricted Project. · View Herald Transcript
eugenis added inline comments.Jun 14 2022, 4:26 PM
compiler-rt/lib/asan/asan_globals.cpp
93

remove the "(from debuginfo)" part? It's a waste of bytes :)

compiler-rt/lib/asan/asan_interface_internal.h
56–58

Do we have to do this every time we change a struct definition now? Sounds like a windows problem.

compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_libcdep.cpp
377

How does this fail with the old symbolizer? Is it possible to detect it and be nice about it?

hctim marked 3 inline comments as done.Jun 15 2022, 2:53 PM
hctim added a subscriber: rnk.
hctim added inline comments.
compiler-rt/lib/asan/asan_interface_internal.h
56–58

Yeah, definitely does. Need to sit down with @rnk at some point and figure out how we might best work around this (maybe for COFF we emit a single huge global that's actually an array, or we disable incremental linkage).

FYI - there's a static assertion for this behaviour in compiler-rt/lib/asan/asan_globals.cpp:375 (or grep for "MSVC incremental linker").

compiler-rt/lib/sanitizer_common/sanitizer_symbolizer_libcdep.cpp
377

It just sets them to empty strings. I've also noticed an outdated function comment above, have updated that, and added an explicit comment about why the behaviour succeeds here.

The *-NO-G tests in compiler-rt/test/asan/TestCases/global-location.cpp exercise this behaviour.

hctim updated this revision to Diff 437353.Jun 15 2022, 2:53 PM
hctim marked 2 inline comments as done.

(update)

eugenis accepted this revision.Jun 15 2022, 3:11 PM

LGTM

This revision is now accepted and ready to land.Jun 15 2022, 3:11 PM
This revision was landed with ongoing or failed builds.Jun 15 2022, 3:36 PM
This revision was automatically updated to reflect the committed changes.
kstoimenov reopened this revision.Jun 15 2022, 4:43 PM
This revision is now accepted and ready to land.Jun 15 2022, 4:43 PM
hctim updated this revision to Diff 437633.Jun 16 2022, 11:35 AM

Split global-location.cpp to only run the tests with debuginfo against non-windows, make the linker invoke strip in order to avoid problems with the android wrapper, and remove a not-possible line/column number check from windows (this is no longer possible).

Hopefully the presubmit bot kicks the windows and I can see there.

This revision was landed with ongoing or failed builds.Jun 16 2022, 1:59 PM
This revision was automatically updated to reflect the committed changes.
hctim added a comment.Jun 16 2022, 2:11 PM

Looks green on Windows: https://lab.llvm.org/buildbot/#/builders/127/builds/31176 (which was the one I was more concerned about as I couldn't test it).

Waiting on the Android bot, but I tested locally without issue: https://lab.llvm.org/buildbot/#/builders/77/builds/18674 (bot was already red, this should be the first build that actually tests the change).

thakis added a subscriber: thakis.Jul 12 2022, 8:17 AM

I'm seeing some warnings after this:

[1/6] CXX stage2_unix/obj/compiler-rt/lib/asan/asan_shared_library.asan_globals.o
../../compiler-rt/lib/asan/asan_globals.cpp:93:52: warning: format specifies type 'int' but the argument has type '__sanitizer::uptr' (aka 'unsigned long') [-Wformat]
    Report("  location: name=%s, %d\n", info.file, info.line);
                                 ~~                ^~~~~~~~~
                                 %lu
../../compiler-rt/lib/asan/asan_globals.cpp:304:37: warning: format specifies type 'int' but the argument has type '__sanitizer::uptr' (aka 'unsigned long') [-Wformat]
    str->append("%s:%d", info.file, info.line);
                    ~~              ^~~~~~~~~
                    %lu

Should we maybe just change DataInfo::line to int, like it is in AddressInfo?

Alternatively, see 44f0e14537801743ffc2c0d40e35b8335d0c1699 for prior art, but printing line numbers as pointers seems a bit weird.

(Or we could just add the explicit cast to int, but 4 billion line numbers should be enough for anyone :) )

But anything that makes the warning go away is fine with me.

I'm seeing some warnings after this:

[1/6] CXX stage2_unix/obj/compiler-rt/lib/asan/asan_shared_library.asan_globals.o
../../compiler-rt/lib/asan/asan_globals.cpp:93:52: warning: format specifies type 'int' but the argument has type '__sanitizer::uptr' (aka 'unsigned long') [-Wformat]
    Report("  location: name=%s, %d\n", info.file, info.line);
                                 ~~                ^~~~~~~~~
                                 %lu
../../compiler-rt/lib/asan/asan_globals.cpp:304:37: warning: format specifies type 'int' but the argument has type '__sanitizer::uptr' (aka 'unsigned long') [-Wformat]
    str->append("%s:%d", info.file, info.line);
                    ~~              ^~~~~~~~~
                    %lu

Should we maybe just change DataInfo::line to int, like it is in AddressInfo?

Alternatively, see 44f0e14537801743ffc2c0d40e35b8335d0c1699 for prior art, but printing line numbers as pointers seems a bit weird.

(Or we could just add the explicit cast to int, but 4 billion line numbers should be enough for anyone :) )

But anything that makes the warning go away is fine with me.

Casted it in 81c48436bbd29736f77a111fc207e28854939907

Hint: Looking here because your manual invocation of something in
'check-asan' broke? You need a new symbolizer (after D123538).

with the runtimes build, doing a ninja check-runtimes without building everything beforehand fails because llvm-symbolizer isn't built. we should add llvm-symbolizer as test dependency, although I'm not familiar enough with the compiler-rt testing CMake machinery to know where that would go