Page MenuHomePhabricator

[2b/3][ASan][libcxx] std::basic_string annotations
Needs ReviewPublic

Authored by AdvenamTacet on Aug 26 2022, 4:04 PM.

Details

Reviewers
philnik
Group Reviewers
Restricted Project
Summary

This revision is a part of a series of patches extending AddressSanitizer C++ container overflow detection capabilities by adding annotations, similar to those existing in std::vector, to std::string and std::deque collections. These changes allow ASan to detect cases when the instrumented program accesses memory which is internally allocated by the collection but is still not in-use (accesses before or after the stored elements for std::deque, or between the size and capacity bounds for std::string).

The motivation for the research and those changes was a bug, found by Trail of Bits, in a real code where an out-of-bounds read could happen as two strings were compared via a std::equals function that took iter1_begin, iter1_end, iter2_begin iterators (with a custom comparison function). When object iter1 was longer than iter2, read out-of-bounds on iter2 could happen. Container sanitization would detect it.

This revision adds annotations for std::basic_string. Long string is very similar to std::vector, and therefore works well with __sanitizer_annotate_contiguous_container from LLVM 15 and therefore annotations works if that implementation is compiled with previous LLVM.
However, only the standard allocator is supported.

As D132522 extended possibilities of __sanitizer_annotate_contiguous_container, now long string annotations are supported with all allocators.

There is also support for annotating objects using short string optimization.
This is limited functionality. If ASan is extended and (for example) objects on stack should not be annotated (note that not every short string is on stack), __annotate_short_string_check should be modified to return false in those situations.

If you have any questions, please email:

  • advenam.tacet@trailofbits.com
  • disconnect3d@trailofbits.com

Diff Detail

Unit TestsFailed

TimeTest
2,550 mslibcxx CI ASAN > llvm-libc++-shared-cfg-in.libcxx/algorithms/alg_sorting::assert.min.max.pass.cpp
Script: -- : 'COMPILED WITH'; /usr/bin/clang++-16 /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/libcxx/test/libcxx/algorithms/alg.sorting/assert.min.max.pass.cpp --target=x86_64-unknown-linux-gnu -g -fno-omit-frame-pointer -fsanitize=address -nostdinc++ -I /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/build/generic-asan/include/c++/v1 -I /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/build/generic-asan/include/c++/v1 -I /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/libcxx/test/support -std=c++2b -Werror -Wall -Wctad-maybe-unsupported -Wextra -Wshadow -Wundef -Wno-unused-command-line-argument -Wno-attributes -Wno-pessimizing-move -Wno-c++11-extensions -Wno-noexcept-type -Wno-atomic-alignment -Wno-user-defined-literals -Wno-tautological-compare -Wsign-compare -Wunused-variable -Wunused-parameter -Wunreachable-code -Wno-unused-local-typedef -D_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER -D_LIBCPP_ENABLE_EXPERIMENTAL -D_LIBCPP_DISABLE_AVAILABILITY -fcoroutines-ts -Werror=thread-safety -Wuser-defined-warnings -D_LIBCPP_ENABLE_ASSERTIONS=1 -lc++experimental -nostdlib++ -L /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/build/generic-asan/lib -Wl,-rpath,/home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/build/generic-asan/lib -lc++ -pthread -o /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/build/generic-asan/test/libcxx/algorithms/alg.sorting/Output/assert.min.max.pass.cpp.dir/t.tmp.exe
2,250 mslibcxx CI ASAN > llvm-libc++-shared-cfg-in.libcxx/containers/sequences/array/array_zero::assert.back.pass.cpp
Script: -- : 'COMPILED WITH'; /usr/bin/clang++-16 /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/libcxx/test/libcxx/containers/sequences/array/array.zero/assert.back.pass.cpp --target=x86_64-unknown-linux-gnu -g -fno-omit-frame-pointer -fsanitize=address -nostdinc++ -I /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/build/generic-asan/include/c++/v1 -I /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/build/generic-asan/include/c++/v1 -I /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/libcxx/test/support -std=c++2b -Werror -Wall -Wctad-maybe-unsupported -Wextra -Wshadow -Wundef -Wno-unused-command-line-argument -Wno-attributes -Wno-pessimizing-move -Wno-c++11-extensions -Wno-noexcept-type -Wno-atomic-alignment -Wno-user-defined-literals -Wno-tautological-compare -Wsign-compare -Wunused-variable -Wunused-parameter -Wunreachable-code -Wno-unused-local-typedef -D_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER -D_LIBCPP_ENABLE_EXPERIMENTAL -D_LIBCPP_DISABLE_AVAILABILITY -fcoroutines-ts -Werror=thread-safety -Wuser-defined-warnings -D_LIBCPP_ENABLE_ASSERTIONS=1 -lc++experimental -nostdlib++ -L /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/build/generic-asan/lib -Wl,-rpath,/home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/build/generic-asan/lib -lc++ -pthread -o /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/build/generic-asan/test/libcxx/containers/sequences/array/array.zero/Output/assert.back.pass.cpp.dir/t.tmp.exe
2,230 mslibcxx CI ASAN > llvm-libc++-shared-cfg-in.libcxx/containers/sequences/array/array_zero::assert.front.pass.cpp
Script: -- : 'COMPILED WITH'; /usr/bin/clang++-16 /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/libcxx/test/libcxx/containers/sequences/array/array.zero/assert.front.pass.cpp --target=x86_64-unknown-linux-gnu -g -fno-omit-frame-pointer -fsanitize=address -nostdinc++ -I /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/build/generic-asan/include/c++/v1 -I /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/build/generic-asan/include/c++/v1 -I /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/libcxx/test/support -std=c++2b -Werror -Wall -Wctad-maybe-unsupported -Wextra -Wshadow -Wundef -Wno-unused-command-line-argument -Wno-attributes -Wno-pessimizing-move -Wno-c++11-extensions -Wno-noexcept-type -Wno-atomic-alignment -Wno-user-defined-literals -Wno-tautological-compare -Wsign-compare -Wunused-variable -Wunused-parameter -Wunreachable-code -Wno-unused-local-typedef -D_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER -D_LIBCPP_ENABLE_EXPERIMENTAL -D_LIBCPP_DISABLE_AVAILABILITY -fcoroutines-ts -Werror=thread-safety -Wuser-defined-warnings -D_LIBCPP_ENABLE_ASSERTIONS=1 -lc++experimental -nostdlib++ -L /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/build/generic-asan/lib -Wl,-rpath,/home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/build/generic-asan/lib -lc++ -pthread -o /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/build/generic-asan/test/libcxx/containers/sequences/array/array.zero/Output/assert.front.pass.cpp.dir/t.tmp.exe
2,350 mslibcxx CI ASAN > llvm-libc++-shared-cfg-in.libcxx/containers/sequences/array/array_zero::assert.subscript.pass.cpp
Script: -- : 'COMPILED WITH'; /usr/bin/clang++-16 /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/libcxx/test/libcxx/containers/sequences/array/array.zero/assert.subscript.pass.cpp --target=x86_64-unknown-linux-gnu -g -fno-omit-frame-pointer -fsanitize=address -nostdinc++ -I /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/build/generic-asan/include/c++/v1 -I /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/build/generic-asan/include/c++/v1 -I /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/libcxx/test/support -std=c++2b -Werror -Wall -Wctad-maybe-unsupported -Wextra -Wshadow -Wundef -Wno-unused-command-line-argument -Wno-attributes -Wno-pessimizing-move -Wno-c++11-extensions -Wno-noexcept-type -Wno-atomic-alignment -Wno-user-defined-literals -Wno-tautological-compare -Wsign-compare -Wunused-variable -Wunused-parameter -Wunreachable-code -Wno-unused-local-typedef -D_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER -D_LIBCPP_ENABLE_EXPERIMENTAL -D_LIBCPP_DISABLE_AVAILABILITY -fcoroutines-ts -Werror=thread-safety -Wuser-defined-warnings -D_LIBCPP_ENABLE_ASSERTIONS=1 -lc++experimental -nostdlib++ -L /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/build/generic-asan/lib -Wl,-rpath,/home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/build/generic-asan/lib -lc++ -pthread -o /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/build/generic-asan/test/libcxx/containers/sequences/array/array.zero/Output/assert.subscript.pass.cpp.dir/t.tmp.exe
2,470 mslibcxx CI ASAN > llvm-libc++-shared-cfg-in.libcxx/containers/sequences/deque::assert.pop_back.empty.pass.cpp
Script: -- : 'COMPILED WITH'; /usr/bin/clang++-16 /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/libcxx/test/libcxx/containers/sequences/deque/assert.pop_back.empty.pass.cpp --target=x86_64-unknown-linux-gnu -g -fno-omit-frame-pointer -fsanitize=address -nostdinc++ -I /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/build/generic-asan/include/c++/v1 -I /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/build/generic-asan/include/c++/v1 -I /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/libcxx/test/support -std=c++2b -Werror -Wall -Wctad-maybe-unsupported -Wextra -Wshadow -Wundef -Wno-unused-command-line-argument -Wno-attributes -Wno-pessimizing-move -Wno-c++11-extensions -Wno-noexcept-type -Wno-atomic-alignment -Wno-user-defined-literals -Wno-tautological-compare -Wsign-compare -Wunused-variable -Wunused-parameter -Wunreachable-code -Wno-unused-local-typedef -D_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER -D_LIBCPP_ENABLE_EXPERIMENTAL -D_LIBCPP_DISABLE_AVAILABILITY -fcoroutines-ts -Werror=thread-safety -Wuser-defined-warnings -D_LIBCPP_ENABLE_ASSERTIONS=1 -lc++experimental -nostdlib++ -L /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/build/generic-asan/lib -Wl,-rpath,/home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/build/generic-asan/lib -lc++ -pthread -o /home/libcxx-builder/.buildkite-agent/builds/aaa2661d6269-1/llvm-project/libcxx-ci/build/generic-asan/test/libcxx/containers/sequences/deque/Output/assert.pop_back.empty.pass.cpp.dir/t.tmp.exe
View Full Test Results (850 Failed)

Event Timeline

AdvenamTacet created this revision.Aug 26 2022, 4:04 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 26 2022, 4:04 PM
AdvenamTacet requested review of this revision.Aug 26 2022, 4:04 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 26 2022, 4:04 PM
Herald added a reviewer: Restricted Project. · View Herald Transcript
This comment was removed by AdvenamTacet.

Code formatting and naming imporvements.

TODO comment removed.

philnik requested changes to this revision.Sep 9 2022, 9:31 AM
philnik added a subscriber: philnik.

Thanks for working on this! I haven't looked at any of the tests yet.

libcxx/include/string
605
1782–1785

I don't think this should be marked _LIBCPP_INTERNAL_MEMORY_ACCESS. Or is the size/is_long byte also poisoned for some reason?

1830–1841

Two of these are redundant. I've uploaded D132951 to remove the differences. Also, would it be possible to annotate the special member functions of __rep instead? I think adding _LIBCPP_HIDE_FROM_ABI _LIBCPP_STRING_INTERNAL_MEMORY_ACCESS __rep(const __rep&) = default; should do the trick.

1830–1960

This is now unnecessary.

1843–1845

Couldn't we check whether the allocation satisfies the asan requirements? It looks like the overhead of checking pointer alignment seems quite manageable: https://godbolt.org/z/WeE7Wxx73.

1846

Could you mention common_interface_defs.h here instead? I couldn't find any documentation generated by the comments in that header.

1851

You can't check for non-_Uglified defines inside the headers. You should instead use _LIBCPP_HAS_NO_ASAN. You also want this to work outside our test environment, right?

1854

The two characters really don't hurt anybody and makes it a lot nicer to read.

1855–1856

Could you make this a single if?

1865–1868
1887–1891

Could we use a common interface for this and the vector-equivalent? It would be nice if we could add a __debug_utils/address_sanitizer.h where we define __annotate_contiguous_container() and some CRTP empty base that defines the functions used here. AFAICT it should be possible to define all of these functions for the long string with the public API of both vector and string. Then we could simply add a function here with the same name to special-case the short string. I hope this would simplify the implementation here quite a bit. I'm not sure though, so tell me if there is some problem.

1950–1953

Why do you do this special-casing here?

2251–2256

When is __init called after poisoning the memory? __init should only ever be called by a constructor.

libcxx/test/std/strings/basic.string/string.asan/replace.pass.cpp
319

You're missing a few newlines.

This revision now requires changes to proceed.Sep 9 2022, 9:31 AM
This comment was removed by AdvenamTacet.
Herald added a project: Restricted Project. · View Herald TranscriptSep 19 2022, 6:37 PM
Herald added subscribers: Restricted Project, Enna1. · View Herald Transcript
This comment was removed by AdvenamTacet.

Previous push was designed for a different revision. Sorry for it.

API change from D132522 - no new function, annotating function changed.

Harbormaster completed remote builds in B188096: Diff 462067.
AdvenamTacet marked 2 inline comments as done.Sep 22 2022, 5:26 AM
AdvenamTacet added inline comments.
libcxx/include/string
1830–1841

I think adding _LIBCPP_HIDE_FROM_ABI _LIBCPP_STRING_INTERNAL_MEMORY_ACCESS rep(const rep&) = default; should do the trick.

No, because we can also write to poisoned memory.

1843–1845

We can and now (due to changes in D132522) we do. I hope the check should be safe to remove. I can do it, but then it will be different in std::vector. Do we want to do it now and not (for example) wait for another revision to remove it here and in std::vector?

1854

I agree, but everywhere else those arguments are called beg, I'm happy to use begin, if inconsistency is not a problem. Is it?

1865–1868

This code has different behavior and we cannot really remove if from here.

1887–1891

Possible that it's possible for long string, not sure tho. But can we leave it for future improvements outside of that patch?
I will think about potential problems. Future patch may be for both std::basic_string and std::vector and maybe std::deque, if we make it more general.

philnik added inline comments.Sep 22 2022, 6:04 AM
libcxx/include/string
1830–1841

Yes, and why would that break if we annotate special member functions instead?

1843–1845

I think we should fix it here now and update vector in a later patch.

1854

I guess you mean in vector? IMO we should just use proper names here and update vector in another patch. That should remove the inconsistency. We also never use beg anywhere else in the code base AFAIK, so vector is actually inconsistent IMO.

1865–1868

Oh yeah you're right. But you changed the code now, so AFAICT it should now be return !__libcpp_is_constant_evaluated(). Also, could you move the #if into the function to avoid declaring it twice?

AdvenamTacet marked 11 inline comments as done.

This update make annotations work with new string implementation and fixes some issues from code review.

AdvenamTacet marked an inline comment as done.Sun, Jan 22, 7:30 PM
AdvenamTacet added inline comments.
libcxx/include/string
1782–1785

Short size / is_longis poisoned when it's at the end of the buffer. That's after the content and unaligned, so it has to be poisoned. It may help with detecting BO when short string has maximal size.

So __get_short_size should be marked _LIBCPP_INTERNAL_MEMORY_ACCESS when metadata byte is at the end.

1830–1960

I'm not sure if I understand. Could you explain?

1843–1845

D136765 updates the vector.

1854

Not only in vector, but also in ASan API/implementation, but I agree that it's more readable in longer form, so I changed it.

1855–1856

After changes from D132522 we have to check if is_same<allocator_type, __default_allocator_type>::value only for compilers before that patch,, for that only I made a second if.

1950–1953

In some places, it is unknown if object is in long or short state, there it has to be runtime check. In other situations, it is known and we can pass that information in template. It's not only about (insignificant) optimization, but it may be necessary if bytes in object contain trash (eg. object in the middle of modification) and __is_long may return incorrect value.

2251–2256

Situations like __str.__default_init(); in move constructor.

AdvenamTacet edited the summary of this revision. (Show Details)Mon, Jan 23, 1:08 PM
AdvenamTacet marked 2 inline comments as done.
  • Removing __raw_copy.* functions (as suggested in code review)
  • Fixing problem with unused arguments.

Moving #if into a function with arguments is not always possible.

  • Removed packed test.

Annotation logic has been removed from string, because capabilities of
ASan API (__sanitizer_annotate_contiguous_container) were extended.
That logic is moved to compiler-rt and tested there.
There is no reason to keep that test here now.

For more details check D132522