This is an archive of the discontinued LLVM Phabricator instance.

[hwasan] Don't report short-granule shadow as overwritten.
ClosedPublic

Authored by hctim on Aug 11 2021, 5:59 PM.

Details

Summary

The shadow for a short granule is stored in the last byte of the
granule. Currently, if there's a tail-overwrite report (a
buffer-overflow-write in uninstrumented code), we report the shadow byte
as a mismatch against the magic.

Fix this bug by slapping the shadow into the expected value. This also
makes sure that if the uninstrumented WRITE does clobber the shadow
byte, it reports the shadow was actually clobbered as well.

Diff Detail

Event Timeline

hctim requested review of this revision.Aug 11 2021, 5:59 PM
hctim created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptAug 11 2021, 5:59 PM
Herald added a subscriber: Restricted Project. · View Herald Transcript
hctim added inline comments.Aug 11 2021, 6:00 PM
compiler-rt/test/hwasan/TestCases/tail-magic.c
39

(side note, maybe we should add a __hwasan_get_tag() interface function now that it's not just "grab the top byte". this is a nice test hack for now though.)

Is there any particular reason we need the last byte in the output, even though it doesn't even get compared in the caller? Otherwise it might be cleaner to just propagate the tail_size from the caller.

hctim added a comment.Aug 13 2021, 9:42 AM

Is there any particular reason we need the last byte in the output, even though it doesn't even get compared in the caller? Otherwise it might be cleaner to just propagate the tail_size from the caller.

An overwrite of the last granule containing the shadow tag can also be reported.

Is there any particular reason we need the last byte in the output, even though it doesn't even get compared in the caller? Otherwise it might be cleaner to just propagate the tail_size from the caller.

An overwrite of the last granule containing the shadow tag can also be reported.

Wouldn't that be reported as a tag mismatch instead?

Maybe I am missing something, but as far as I can tell the only callsite into this is

uptr tail_size = tagged_size - orig_size - 1;
CHECK_LT(tail_size, kShadowAlignment);
void *tail_beg = reinterpret_cast<void *>(
    reinterpret_cast<uptr>(aligned_ptr) + orig_size);
if (tail_size && internal_memcmp(tail_beg, tail_magic, tail_size))
  ReportTailOverwritten(stack, reinterpret_cast<uptr>(tagged_ptr),
                        orig_size, tail_magic);

This doesn't look at the last byte of the tail (actual_expected[tail_size - 1]) that contains the tag for a short granule (because of the -1 in the tail_size computation) at all.

So we only call this function if some other byte of the tail also mismatches.

This revision is now accepted and ready to land.Aug 13 2021, 10:13 AM

Right so if the last byte does not match, we would crash earlier with an invalid-free error, I think. So it should match, and then there is no harm in printing it and highlighting any difference.

Right so if the last byte does not match, we would crash earlier with an invalid-free error, I think. So it should match, and then there is no harm in printing it and highlighting any difference.

No strong opinion, I just thought it would make the code a bit simpler without leaving out any information (because by construction the two always match).

Right so if the last byte does not match, we would crash earlier with an invalid-free error, I think. So it should match, and then there is no harm in printing it and highlighting any difference.

Overwrite of the short granule only results in invalid free if the allocation only consists a single granule.

Wouldn't that be reported as a tag mismatch instead?

Only would be caught as a tag mismatch if dereferenced by instrumented code before the free :).

Maybe I am missing something, but as far as I can tell the only callsite into this is

uptr tail_size = tagged_size - orig_size - 1;
CHECK_LT(tail_size, kShadowAlignment);
void *tail_beg = reinterpret_cast<void *>(
    reinterpret_cast<uptr>(aligned_ptr) + orig_size);
if (tail_size && internal_memcmp(tail_beg, tail_magic, tail_size))
  ReportTailOverwritten(stack, reinterpret_cast<uptr>(tagged_ptr),
                        orig_size, tail_magic);

This doesn't look at the last byte of the tail (actual_expected[tail_size - 1]) that contains the tag for a short granule (because of the -1 in the tail_size computation) at all.

So we only call this function if some other byte of the tail also mismatches.

Yeah, this was just a "free reporting" of whether the short granule tag was clobbered by uninstrumented code. In saying that, the change is actually pretty cheap to detect magic-overwrite of only the short granule tag as well.

Right so if the last byte does not match, we would crash earlier with an invalid-free error, I think. So it should match, and then there is no harm in printing it and highlighting any difference.

Overwrite of the short granule only results in invalid free if the allocation only consists a single granule.

Wouldn't that be reported as a tag mismatch instead?

Only would be caught as a tag mismatch if dereferenced by instrumented code before the free :).

The free does check this, look at CheckInvalidFree.

if (!allocator.PointerIsMine(untagged_ptr) ||
    !PointerAndMemoryTagsMatch(tagged_ptr)) {
  ReportInvalidFree(stack, reinterpret_cast<uptr>(tagged_ptr));
fmayer added a comment.EditedAug 13 2021, 11:13 AM

Overwrite of the short granule only results in invalid free if the allocation only consists a single granule.

Ah, sorry, of course! Then I suggest we change this to also do the correct memcmp in the caller?

So something like internal_memcmp(tail_beg, tail_magic, tail_size) || tail_beg[tail_size] != pointer_tag)

hctim updated this revision to Diff 366328.Aug 13 2021, 11:48 AM

Detect only-short-granule overwrites.

Overwrite of the short granule only results in invalid free if the allocation only consists a single granule.

Ah, sorry, of course! Then I suggest we change this to also do the correct memcmp in the caller?

So something like internal_memcmp(tail_beg, tail_magic, tail_size) || tail_beg[tail_size] != pointer_tag)

Roughly speaking, yep :).

fmayer accepted this revision.Aug 13 2021, 11:51 AM

LGTM thanks!

This revision was automatically updated to reflect the committed changes.

Hi, I am not 100% sure but it seems to trigger a regression on aarch64 bot [1]. I can reproduce with current master as well:

******************** TEST 'HWAddressSanitizer-aarch64 :: TestCases/Linux/create-thread-stress.cpp' FAILED ********************
Script:
--
: 'RUN: at line 2';      /home/adhemerval.zanella/projects/llvm/llvm-src-build-stage2/./bin/clang  --driver-mode=g++   -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta   -gline-tables-only -fsanitize=hwaddress -fuse-ld=lld -mllvm -hwasan-globals -mllvm -hwasan-use-short-granules -mllvm -hwasan-instrument-landing-pads=0 -mllvm -hwasan-instrument-personality-functions -DREUSE=0 /home/adhemerval.zanella/projects/llvm/llvm-project-git/compiler-rt/test/hwasan/TestCases/Linux/create-thread-stress.cpp -pthread -O2 -o /home/adhemerval.zanella/projects/llvm/llvm-src-build-stage2/projects/compiler-rt/test/hwasan/AARCH64/TestCases/Linux/Output/create-thread-stress.cpp.tmp &&  /home/adhemerval.zanella/projects/llvm/llvm-src-build-stage2/projects/compiler-rt/test/hwasan/AARCH64/TestCases/Linux/Output/create-thread-stress.cpp.tmp 2>&1
: 'RUN: at line 3';      /home/adhemerval.zanella/projects/llvm/llvm-src-build-stage2/./bin/clang  --driver-mode=g++   -Wthread-safety -Wthread-safety-reference -Wthread-safety-beta   -gline-tables-only -fsanitize=hwaddress -fuse-ld=lld -mllvm -hwasan-globals -mllvm -hwasan-use-short-granules -mllvm -hwasan-instrument-landing-pads=0 -mllvm -hwasan-instrument-personality-functions -DREUSE=1 /home/adhemerval.zanella/projects/llvm/llvm-project-git/compiler-rt/test/hwasan/TestCases/Linux/create-thread-stress.cpp -pthread -O2 -o /home/adhemerval.zanella/projects/llvm/llvm-src-build-stage2/projects/compiler-rt/test/hwasan/AARCH64/TestCases/Linux/Output/create-thread-stress.cpp.tmp_reuse &&  /home/adhemerval.zanella/projects/llvm/llvm-src-build-stage2/projects/compiler-rt/test/hwasan/AARCH64/TestCases/Linux/Output/create-thread-stress.cpp.tmp_reuse 2>&1
--
Exit Code: 1

Command Output (stdout):
--
==2654179==ERROR: HWAddressSanitizer: allocation-tail-overwritten; heap object [0xefdeffff0300,0xefdeffff0308) of size 8

Stack of invalid access unknown. Issue detected at deallocation time.
deallocated here:
    #0 0xaaaac4855e74 in operator delete(void*) /home/adhemerval.zanella/projects/llvm/llvm-project-git/compiler-rt/lib/hwasan/hwasan_new_delete.cpp:78:3
    #1 0xaaaac4857890 in deallocate /usr/lib/gcc/aarch64-linux-gnu/10/../../../../include/c++/10/ext/new_allocator.h:133:2
    #2 0xaaaac4857890 in deallocate /usr/lib/gcc/aarch64-linux-gnu/10/../../../../include/c++/10/bits/alloc_traits.h:492:13
    #3 0xaaaac4857890 in _M_deallocate /usr/lib/gcc/aarch64-linux-gnu/10/../../../../include/c++/10/bits/stl_vector.h:354:4
    #4 0xaaaac4857890 in _M_realloc_insert<(lambda at /home/adhemerval.zanella/projects/llvm/llvm-project-git/compiler-rt/test/hwasan/TestCases/Linux/create-thread-stress.cpp:24:28)> /usr/lib/gcc/aarch64-linux-gnu/10/../../../../include/c++/10/bits/vector.tcc:500:7
    #5 0xaaaac4857890 in emplace_back<(lambda at /home/adhemerval.zanella/projects/llvm/llvm-project-git/compiler-rt/test/hwasan/TestCases/Linux/create-thread-stress.cpp:24:28)> /usr/lib/gcc/aarch64-linux-gnu/10/../../../../include/c++/10/bits/vector.tcc:121:4
    #6 0xaaaac4857890 in Thread() /home/adhemerval.zanella/projects/llvm/llvm-project-git/compiler-rt/test/hwasan/TestCases/Linux/create-thread-stress.cpp:24:15
    #7 0xffffa1364c28  (/lib/aarch64-linux-gnu/libstdc++.so.6+0xccc28)
    #8 0xffffa11c34f8 in start_thread /build/glibc-iW00TY/glibc-2.31/nptl/pthread_create.c:477:8
    #9 0xffffa10c9678  /build/glibc-iW00TY/glibc-2.31/misc/../sysdeps/unix/sysv/linux/aarch64/clone.S:78

allocated here:
    #0 0xaaaac4855bc4 in operator new(unsigned long) /home/adhemerval.zanella/projects/llvm/llvm-project-git/compiler-rt/lib/hwasan/hwasan_new_delete.cpp:64:35
    #1 0xaaaac48576ec in allocate /usr/lib/gcc/aarch64-linux-gnu/10/../../../../include/c++/10/ext/new_allocator.h:115:27
    #2 0xaaaac48576ec in allocate /usr/lib/gcc/aarch64-linux-gnu/10/../../../../include/c++/10/bits/alloc_traits.h:460:20
    #3 0xaaaac48576ec in _M_allocate /usr/lib/gcc/aarch64-linux-gnu/10/../../../../include/c++/10/bits/stl_vector.h:346:20
    #4 0xaaaac48576ec in _M_realloc_insert<(lambda at /home/adhemerval.zanella/projects/llvm/llvm-project-git/compiler-rt/test/hwasan/TestCases/Linux/create-thread-stress.cpp:24:28)> /usr/lib/gcc/aarch64-linux-gnu/10/../../../../include/c++/10/bits/vector.tcc:440:33
    #5 0xaaaac48576ec in emplace_back<(lambda at /home/adhemerval.zanella/projects/llvm/llvm-project-git/compiler-rt/test/hwasan/TestCases/Linux/create-thread-stress.cpp:24:28)> /usr/lib/gcc/aarch64-linux-gnu/10/../../../../include/c++/10/bits/vector.tcc:121:4
    #6 0xaaaac48576ec in Thread() /home/adhemerval.zanella/projects/llvm/llvm-project-git/compiler-rt/test/hwasan/TestCases/Linux/create-thread-stress.cpp:24:15
    #7 0xffffa1364c28  (/lib/aarch64-linux-gnu/libstdc++.so.6+0xccc28)
    #8 0xffffa11c34f8 in start_thread /build/glibc-iW00TY/glibc-2.31/nptl/pthread_create.c:477:8
    #9 0xffffa10c9678  /build/glibc-iW00TY/glibc-2.31/misc/../sysdeps/unix/sysv/linux/aarch64/clone.S:78
                                                            ^^ 
This error occurs when a buffer overflow overwrites memory
to the right of a heap object, but within the 16-byte granule, e.g.
   char *x = new char[20];
   x[25] = 42;
HWAddressSanitizer does not detect such bugs in uninstrumented code at the time of write,
but can detect them at the time of free/delete.
To disable this feature set HWASAN_OPTIONS=free_checks_tail_magic=0
Thread: T1 0xeffe00006000 stack: [0xffff9dfb3000,0xffff9e7b21a0) sz: 8384928 tls: [0xffff9e7b21a0,0xffff9e7b3000)
Memory tags around the buggy address (one tag corresponds to 16 bytes):
  0xfefcefffefb0: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00 
  0xfefcefffefc0: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00 
  0xfefcefffefd0: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00 
  0xfefcefffefe0: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00 
  0xfefcefffeff0: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00 
  0xfefceffff000: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00 
  0xfefceffff010: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00 
  0xfefceffff020: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00 
=>0xfefceffff030:[00] 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00 
  0xfefceffff040: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00 
  0xfefceffff050: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00 
  0xfefceffff060: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00 
  0xfefceffff070: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00 
  0xfefceffff080: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00 
  0xfefceffff090: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00 
  0xfefceffff0a0: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00 
  0xfefceffff0b0: 00  00  00  00  00  00  00  00  00  00  00  00  00  00  00  00 
Tags for short granules around the buggy address (one tag corresponds to 16 bytes):
  0xfefceffff020: ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  .. 
=>0xfefceffff030:[..] ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  .. 
  0xfefceffff040: ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  ..  .. 
See https://clang.llvm.org/docs/HardwareAssistedAddressSanitizerDesign.html#short-granules for a description of short granule tags
SUMMARY: HWAddressSanitizer: allocation-tail-overwritten /home/adhemerval.zanella/projects/llvm/llvm-project-git/compiler-rt/lib/hwasan/hwasan_new_delete.cpp:78:3 in operator delete(void*)
failed at iteration 0 / 20

However I can't reliable reproduce it by issuing the tests command directly, only with 'check-all' with some system load. Any ideas? This test is being triggering the regression for a couple of days.

[1] https://lab.llvm.org/buildbot/#/builders/185/builds/359

hctim added a comment.Aug 23 2021, 2:37 PM

Thanks for the pointer. Not immediately obvious what could cause that breakage. Digging into it now.

hctim added a comment.Aug 23 2021, 4:14 PM

Found the bug:

  1. Allocator tagging is disabled.
  2. void* p = malloc(16); memset(p, 16, 16); free(p); - Allocate and use the full memory. Magic string is still written when allocator tagging is disabled. Magic string does NOT write the last granule tag when allocator tagging is disabled.
  3. void* p = malloc(8); free(p); - Re-use the allocation from #2, but with a short granule. Because allocator tagging is disabled, magic string short granule is the 16 leftover from the previous allocation, which causes magic mismatch.

I'll send a fix ASAP. Might end up being tomorrow morning though.