This is an archive of the discontinued LLVM Phabricator instance.

[asan][ARMCXXABI] Added missing asan poison array cookie hooks.
Needs RevisionPublic

Authored by rsundahl on May 8 2022, 5:09 PM.

Details

Summary

Hooks into the address sanitizer that support array cookie poisoning and
validation were being generated for x86_64 but not for ARM. (amended)

In addition to the ItaniumCXXABI array cookie of a single size_t element
containing the number of elements in the allocated array, the ARMCXXABI adds
a second size_t element containing the sizeof(element). This difference in
cookie size created the need to override the methods ::InitializeArrayCookie()
and ::readArrayCookieImpl(). Later, in support of ASAN poison array cookies,
calls to asan_poison_cxx_array_cookie() and asan_load_cxx_array_cookie()
were added to each method respectively. However, these "hooks" were only
implemented for the ItaniumCXXABI. This commit adds the same functionality
to the overridden ARMCXXABI methods.

rdar://92765369

Diff Detail

Event Timeline

rsundahl created this revision.May 8 2022, 5:09 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 8 2022, 5:09 PM
rsundahl requested review of this revision.May 8 2022, 5:09 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptMay 8 2022, 5:09 PM
Herald added subscribers: Restricted Project, cfe-commits. · View Herald Transcript
rsundahl updated this revision to Diff 428099.May 9 2022, 8:42 AM

Adding ASAN test "new_array_cookie_with_new_from_class" (rdar://92884511)

This test "new_array_cookie_with_new_from_class " was dependent on
the cookie size on x86_64 and was either unsupported or expected to
fail on arm. Those restrictions can be removed with this revision.

delcypher added inline comments.May 9 2022, 10:21 AM
clang/lib/CodeGen/ItaniumCXXABI.cpp
2443

Why is there a restriction on the address space?

2478

This comment is confusing. What's returning 0? __asan_load_cxx_array_cookie doesn't do that and AFAICT neither does this code.

2479

I also don't understand what you mean by the comment. Could you elaborate?

Adding Vitaly Buka and Kostya Serebryany (sanitizer maintainers).

yln added inline comments.May 9 2022, 10:51 AM
clang/lib/CodeGen/ItaniumCXXABI.cpp
2453

This code is very similar to what we have in ItaniumCXXABI::InitializeArrayCookie(). Can we try to extract a local handleASanArrayCookiePoisoning() helper?

2485

Same here: extract common helper for the ASan stuff to be shared by ARM and ItaniumCXXABI.

compiler-rt/lib/asan/asan_poisoning.cpp
262–265 ↗(On Diff #428099)

Nitpicking extreme:

  • I think the code inside #if shouldn't have extra indent (because preprocessor sections don't establish code). If in doubt, let clang-format do it for you.
  • Let's move the comment inside the #if, just above before the line of code. If you ever read the pre-processed source-code, then the comment "lives and dies" with the line of code it relates too, i.e, on x86, currently there would be a comment without the code.
compiler-rt/test/asan/TestCases/Posix/new_array_cookie_test.cpp
3 ↗(On Diff #428099)

❤ The boy scouts rule:

Always leave the campground cleaner than you found it.

19 ↗(On Diff #428099)

Can we try to simulate a 1-byte buffer-underflow (and leave the definition of struct C as-is)? This would show that ASan still reports the smallest possible infraction.

compiler-rt/test/asan/TestCases/Posix/new_array_cookie_uaf_test.cpp
20 ↗(On Diff #428099)

Can we try to simulate a 1-byte buffer-underflow (and leave the definition of struct C as-is)? This would show that ASan still reports the smallest possible infraction.

compiler-rt/test/asan/TestCases/Posix/new_array_cookie_with_new_from_class.cpp
5 ↗(On Diff #428099)

sanitizer -> ASan

12 ↗(On Diff #428099)

Avoid NOT in FileCheck prefix names, because -NOT is the directive to do a "negative match", that is, "make sure this text does not appear".

How about:
--check-prefix=COOKIE for the cookie case and just skip --check-prefix for the "no cookie" (it's the default after all)

35 ↗(On Diff #428099)

Why remove this debug output? It might be useful...

43–47 ↗(On Diff #428099)
yln added inline comments.May 9 2022, 10:55 AM
compiler-rt/lib/asan/asan_poisoning.cpp
262 ↗(On Diff #428099)
yln added inline comments.May 9 2022, 11:03 AM
compiler-rt/lib/asan/asan_poisoning.cpp
262 ↗(On Diff #428099)

I find this a bit confusing

  • x86_64: cookie is 1 word and passed p points to it
  • arm64: cookie is 2 words and passed p points to second half of it

Would it be worth to take the extra care in CodeGen to always pass the "beginning of the cookie" to __asan_poison_cxx_array_cookie() and then have something like that:

size_t shadow_cookie_size = SANITIZER_ARM64 ? 2 : 1:
internal_memset(s, kAsanArrayCookieMagic, shadow_cookie_size);

Adding dialog to comments made by reviewers.

clang/lib/CodeGen/ItaniumCXXABI.cpp
2443

It's only there because it's also there in ItaniumCXXABI::InitializeArrayCookie() method and I have no reason to think that the ARMCXXABI version would be different. That said, this is the revision that introduced the notion originally: https://reviews.llvm.org/D5111

2478

(Same answer) It's only there because it's also there in ItaniumCXXABI::InitializeArrayCookie() method and I have no reason to think that the ARMCXXABI version would be different. That said, this is the revision that introduced the notion originally: https://reviews.llvm.org/D5111

2479

(Same answer) It's only there because it's also there in ItaniumCXXABI::InitializeArrayCookie() method and I have no reason to think that the ARMCXXABI version would be different. That said, this is the revision that introduced the notion originally: https://reviews.llvm.org/D5111

2485

Was limiting my impact but yes, I agree with you and thanks for expecting it. FYI and FWIW: the same concern that @delcypher has in the preceding comments will still exist because the factored code will come from the existing Itanium methods and those concerns come from that existing code that was introduced by: https://reviews.llvm.org/D5111.

compiler-rt/lib/asan/asan_poisoning.cpp
262 ↗(On Diff #428099)

I don't think so for a collection of reasons. When the ItaniumCXXABI defines the "array cookie" it states: "The cookie will have size sizeof(size_t)" and it then describes how there may be padding preceding the cookie but makes no mention of a second element anywhere. (https://itanium-cxx-abi.github.io/cxx-abi/abi.html#array-cookies). Now, while the ARMCXXABI(32) specification defines the cookie as two 4-byte (int) values (sizeof(element),numElements), the ARMCXXABI(64) (https://github.com/ARM-software/abi-aa/blob/320a56971fdcba282b7001cf4b84abb4fd993131/cppabi64/cppabi64.rst#the-generic-c-abi) does no such similar thing and defers to the Itanium specification of a single, type size_t (8 bytes) cookie which contains only (numElements). The initial commit creating the ARMCXXABI(64) mimicking the ARMCXXABI(32) "pair" of values occurred here: https://marc.info/?l=cfe-commits&m=135915714232578&w=2 and is then (unfortunately) this interpretation was reiterated here: https://developer.apple.com/documentation/xcode/writing-arm64-code-for-apple-platforms
which (imo incorrectly) states:
"The ABI provides a fixed layout of two size_t words for array cookies, with no extra alignment requirements. This behavior matches the ARM 32-bit C++ ABI." However... there is no supporting documentation for this two-element cookie in the ARMCXXABI(64). Furthermore, throughout the llvm source code, including sanitizer, the term "array cookie" ALWAYS means the (size_t) bytes immediately preceding the array pointer that is returned by the allocator. For these reasons, I have chosen to treat the "array cookie" uniformly as a size_t singleton and the additional sizeof(element) as the padding that it is, but with the additional information. (There is no use in llvm of the second element either in the Itanium nor the ARM ABIs including the 32 bit ARM ABI. I'm confident that we should support only the defined array cookie as a single value but also fill the padding with sizeof(element) and deprecate the whole idea eventually.)

262–265 ↗(On Diff #428099)

Good stuff. Please nit-pick all you like. Will update.

compiler-rt/test/asan/TestCases/Posix/new_array_cookie_test.cpp
3 ↗(On Diff #428099)

I pick up litter on my walks too. ;-)

19 ↗(On Diff #428099)

That's a good idea, but it's not about leaving the struct the same because it's definitely wrong in that type "int" has nothing to do with the cookie and it only happens to be true that (buffer[-2].x) gets to the beginning of the cookie since two "int"s happen to fit inside of a "size_t". (This may not always be true as in most 64-on-32 type arrangements) It's better to treat the cookie as [-1] of an array of size_t elements so that when indexing [-1] you get all of the cookie at once (see: https://itanium-cxx-abi.github.io/cxx-abi/abi.html#array-cookies). The reason I write to the *whole* cookie at once is because of endian-ness. The existing code which only writes to the first half of the cookie would reach a different half of the cookie on a big-endian machine so this covers both. But you have a good point, we should see if the minimum underflow is caught so I'll think of a way to do that in an endian-agnostic way...

compiler-rt/test/asan/TestCases/Posix/new_array_cookie_uaf_test.cpp
20 ↗(On Diff #428099)

(Same answer as above)

rsundahl marked an inline comment as not done.May 11 2022, 7:18 AM
rsundahl added inline comments.
compiler-rt/test/asan/TestCases/Posix/new_array_cookie_test.cpp
19 ↗(On Diff #428099)

I believe that because the existing code:

buffer[-1].x = 10

uses a negative address from an array of int, I took that as something unchangeable and if it were, my arguments for making the size of the elements size_t have some context, But...now I see that unjustified and merely clever in some way and your approach explicitly separating the type of the elements of the allocation from the overflow is better in at least two ways. First, it removes any such "cleverness" from the test (which is really more distracting than it is useful) and second, it allows us to test the smallest possible ingress into the cookie area. Sorry that was a hard sell but appreciate the insight.

buffer[-1].x = 10
compiler-rt/test/asan/TestCases/Posix/new_array_cookie_uaf_test.cpp
20 ↗(On Diff #428099)

In this situation the UAF isn't caught until the corrupted cookie has called delete on all 42 (supposed) elements of the array. The test depends on seeing all 42 members' being deleted so the write must go the low-byte. If you place the 42 (or anything non-zero) in the high byte, the cookie becomes > 2^54 with a corresponding number of calls to delete each member. If all the deletes are not allowed to execute, the actual UAF detection does not get executed. Therefore I'm going to leave this one as it is and leave the "minimum-underflow" to new_array_cookie_test. (The test explicitly disables sanitizing the underflow to the cookie but what is not clear to me is why access to the cookie by the second "delete" after the free doesn't trap the read of the poisoned cookie before deleting the (supposed) members. We might want to instrument ASan to check for a poisoned cookie earlier.)

rsundahl updated this revision to Diff 428716.May 11 2022, 11:00 AM

Implement suggestions from reviews. (Incremental update.)

rsundahl updated this revision to Diff 428718.May 11 2022, 11:18 AM

Revert ItaniumCXXABI.cpp for now (unintentional push)

rsundahl updated this revision to Diff 428731.May 11 2022, 11:40 AM
rsundahl marked 6 inline comments as done.

Update diff back to dc7e02d4b4dc30d44ddebd832719a6e63396e718

rsundahl marked 3 inline comments as done and 4 inline comments as done.May 11 2022, 12:04 PM
yln added inline comments.May 11 2022, 4:11 PM
compiler-rt/lib/asan/asan_poisoning.cpp
263 ↗(On Diff #428731)
compiler-rt/test/asan/TestCases/Posix/new_array_cookie_test.cpp
19 ↗(On Diff #428731)

C *buffer = new C[argc];

Although this looks weird, I think this is to prevent the compiler from reasoning about the array size.
I know that we had tests in the past that were "silently passing", because the compiler optimized away parts of the test.

I am not sure if it applies here, but let's keep it just in case ...

20 ↗(On Diff #428731)
compiler-rt/test/asan/TestCases/Posix/new_array_cookie_with_new_from_class.cpp
4–15 ↗(On Diff #428731)

Thanks for improving the test like this! This is now better than "just a test", it's essentially "executable documentation"! :)

36–38 ↗(On Diff #428731)

Is the reason that we had to remove this assert (and change how we overwrite the cookie) that on arm the cookie is 2 words?

Can we do the following?

int cooke_words = <preprocessor for "is ARM?"> : 2 : 1;
assert(reinterpret_cast<uintptr_t>(foo) ==
         reinterpret_cast<uintptr_t>(Foo::allocated) + cookie_words * sizeof(void*));
rsundahl updated this revision to Diff 429135.May 12 2022, 8:57 PM
  • Refactor InitializeArrayCookie and readArrayCookieImpl.

This update to the differential implements the final suggestions
for refactoring ItaniumCXXABI to remove duplicated code in the
function InitializeArrayCookie() and readArrayCookieImpl().

rsundahl marked 8 inline comments as done.May 12 2022, 9:23 PM

The update completes the suggested changes. The generated code is slightly different around initialization of the array cookie due to choosing one implementation over another when I "folded" ARMCXXABI::InitializeArrayCookie() into ItaniumCXXABI::InitializeArrayCookie(). The generated code LGTM but I would appreciate added scrutiny for the ItaniumCXXABI.cpp changes. check-sanitizer and check-asan completed successfully on x86_64, arm64 and arm64e.

clang/lib/CodeGen/ItaniumCXXABI.cpp
2478

I'm not sure where to find a definitive spec for __asan_load_cxx_array_cookie() so I'm conservatively leaving it as it currently is.

yln added inline comments.May 13 2022, 10:28 AM
clang/lib/CodeGen/ItaniumCXXABI.cpp
2359

Variable names should start with uppercase:
https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly

Please also be conservative with "rename cleanups" in reviews. The smaller your patch is, the easier it is to review. You can do these in a follow-up NFC commit.

compiler-rt/lib/asan/asan_poisoning.cpp
266 ↗(On Diff #429135)

I just realized that we are poisoning the cookie, but we are not doing anything on the detection side.
Is this actually required to make detection work/the tests pass?

If this bit isn't needed to make the existing tests pass, then I would like to suggest 1 of 2 things:

  • Remove the additional poisoning (recommended), -or-
  • Add another (ARM-specific?) test that requires poisoning of the second cookie word. I think this would require adaptation of __asan_load_cxx_array_cookie as well.
rsundahl updated this revision to Diff 429352.May 13 2022, 2:12 PM

This update corrects merge conflicts in Build #104091

rsundahl added inline comments.May 13 2022, 2:24 PM
clang/lib/CodeGen/ItaniumCXXABI.cpp
2373

NumElementsPtr.getRawPointer(CGF) had to be changed to CGF.Builder.CreateCall(F, NumElementsPtr.getPointer() but likely will have to be switched back when it appears downstream.

2404

NumElementsPtr.getRawPointer(CGF) had to be changed to CGF.Builder.CreateCall(F, NumElementsPtr.getPointer() but likely will have to be switched back when it appears downstream.

rsundahl updated this revision to Diff 430461.May 18 2022, 11:24 AM

Fix (and cleanup) for failure of CodeGen arm.c check in ci/cd pipeline.

vitalybuka requested changes to this revision.Dec 7 2022, 1:39 PM

Is this still relevant?
If so, I would recommend to split ItaniumCXXABI from asan changes.

This revision now requires changes to proceed.Dec 7 2022, 1:39 PM

Is this still relevant?
If so, I would recommend to split ItaniumCXXABI from asan changes.

I haven't proceeded with this because this would be ABI breaking. I intend to support the (I believe) wrong ARM64 array cookie implementation and add the poisoning and tests re-enabled separately. Thanks for your input @vitalybuka.