This is an archive of the discontinued LLVM Phabricator instance.

[hwasan] Add __hwasan_add_frame_record to the hwasan interface
ClosedPublic

Authored by leonardchan on Jun 22 2022, 3:27 PM.

Details

Summary

Hwasan includes instructions in the prologue that mix the PC and SP and store it into the stack ring buffer stored at __hwasan_tls. This is a thread_local global exposed from the hwasan runtime. However, if TLS-mechanisms or the hwasan runtime haven't been setup yet, it will be invalid to access __hwasan_tls. This is the case for Fuchsia where we instrument libc, so some functions that are instrumented but can run before hwasan initialization will incorrectly access this global. Additionally, libc cannot have any TLS variables, so we cannot weakly define __hwasan_tls until the runtime is loaded.

A way we can work around this is by moving the instructions into a hwasan function that does the store into the ring buffer and creating a weak definition of that function locally in libc. This way __hwasan_tls will not actually be referenced. This is not our long-term solution, but this will allow us to roll out hwasan in the meantime.

This patch includes:

  • A new llvm flag for choosing to emit a libcall rather than instructions in the prologue (off by default)
  • The libcall for storing into the ringbuffer (__hwasan_record_frame_record)

Diff Detail

Event Timeline

leonardchan created this revision.Jun 22 2022, 3:27 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 22 2022, 3:27 PM
leonardchan requested review of this revision.Jun 22 2022, 3:27 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 22 2022, 3:27 PM

Could you please move refactoring (like extraction emitPrologue) into a separate CL?

We also need one test for runtime and one test for generated IR.

leonardchan edited the summary of this revision. (Show Details)

Added tests and this patch doesn't contain refactoring.

Could you please move refactoring (like extraction emitPrologue) into a separate CL?

We also need one test for runtime and one test for generated IR.

Should I break this up into 2 compiler-rt and llvm patches?

I don't see a problem having the compiler-rt and compiler changes in the same commit. Seems desirable in this case, actually.

compiler-rt/lib/hwasan/hwasan_interface_internal.h
172

A comment here about when the compiler generates calls to this would be helpful.

llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp
1237

It seems better to call the subroutine you just broke out above rather than to have its logic still open-coded here as well.
Conversely, perhaps just move this code up before the conditional and reuse the same FrameRecordInfo local variable in both the libcall and direct-store cases, rather than having a subroutine.

Could you please move refactoring (like extraction emitPrologue) into a separate CL?

We also need one test for runtime and one test for generated IR.

Should I break this up into 2 compiler-rt and llvm patches?

I don't have preference regarding compiler-rt / llvm split. Seems having them together is useful.
It's more about moving as much as possible into NFC patches. Also in this snapshot it's not obvious what can be move out of the patch.

compiler-rt/lib/hwasan/hwasan.cpp
579

can have a nicer name?
maybe __hwasan_add_frame_record

llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp
182–207

maybe better to convert hwasan-record-stack-history into enum?
hwasan-record-stack-history-with-calls is meaningless if hwasan-record-stack-history=0

1237

Also looks like can be inserted twice
Value *ThreadLong = IRB.CreateLoad(IntptrTy, SlotPtr);

1241–1244

When I asked about refactoring, you can do all NFC changes in a separate patch,
maybe you can reorder ThreadLongMaybeUntagged calculations in NFC patch so you don't have to touch it here

leonardchan marked 5 inline comments as done.
leonardchan retitled this revision from [hwasan] Add __hwasan_record_frame_record to the hwasan interface to [hwasan] Add __hwasan_add_frame_record to the hwasan interface.

Addressed some comments. Also updated the current stack history codegen path such that getFrameRecordInfo can be reused. Although this changes some test cases. If the patch is too big with this, I think I can move that into a separate one.

llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp
182–207

Done and updated tests. I couldn't think of a better enum value for the default behavior other than instr :/

1237

It seems better to call the subroutine you just broke out above rather than to have its logic still open-coded here as well. Conversely, perhaps just move this code up before the conditional and reuse the same FrameRecordInfo local variable in both the libcall and direct-store cases, rather than having a subroutine.

Can do. I initially wanted to leave that out since it would mean this patch changes the IR in the prologue slightly and updating some tests that I could save for another patch. Currently, the order is (1) inttoptr (2) or PC, SP (3) store. But reusing the subroutine or FrameRecordInfo would change the order to (1) or SP, PC (2) inttoptr (3) store, but I think that shouldn't affect lowering from IR much.

Also looks like can be inserted twice
Value *ThreadLong = IRB.CreateLoad(IntptrTy, SlotPtr);

I think the second Value *ThreadLong = IRB.CreateLoad(IntptrTy, SlotPtr); in the conditional below shouldn't happen if ThreadLongMaybeUntagged is set, which will be set in this block where the first CreateLoad would happen since it looks like ThreadLongMaybeUntagged = TargetTriple.isAArch64() ? ThreadLong : untagPointer(IRB, ThreadLong) should never be nullptr, unless I'm perhaps missing something.

1241–1244

So this part's a bit awkward bc in the above block, the intermediate variables SlotPtr and ThreadLong are used for a bunch of other things like WrapMask and StackBaseTag which aren't needed here. That is, if ThreadLongMaybeUntagged isn't set above, then neither would SlotPtr or ThreadLong. I suppose I could have something like:

Value *getThreadLongMaybeUntagged(Value *&SlotPtr, Value *&ThreadLong) {
    if (!SlotPtr)
        SlotPtr = getHwasanThreadSlotPtr(IRB, IntptrTy);
    if (!ThreadLong)
        ThreadLong = IRB.CreateLoad(IntptrTy, SlotPtr);
    // Extract the address field from ThreadLong. Unnecessary on AArch64 with TBI.
    return TargetTriple.isAArch64() ? ThreadLong : untagPointer(IRB, ThreadLong);
}

  Value *ThreadLongMaybeUntagged = nullptr, *SlotPtr = nullptr, *ThreadLong = nullptr;
  if (WithFrameRecord) {
    if (ClRecordStackHistoryWithCalls) {
        ....
    } else {
        ThreadLongMaybeUntagged = getThreadLongMaybeUntagged(SlotPtr, ThreadLong);
        ....
    }
  }

  if (!ShadowBase) {
    if (!ThreadLongMaybeUntagged) {
      ThreadLongMaybeUntagged = getThreadLongMaybeUntagged(SlotPtr, ThreadLong);
      ...

where I can save intermediate values?

LGTM

llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp
1160

you can add this function in a separate patch

1200–1238

switch/case

1241–1244

function local lambda works for me

auto getThreadLongMaybeUntagged = [&]() {
    if (!SlotPtr)
        SlotPtr = getHwasanThreadSlotPtr(IRB, IntptrTy);
    if (!ThreadLong)
        ThreadLong = IRB.CreateLoad(IntptrTy, SlotPtr);
    // Extract the address field from ThreadLong. Unnecessary on AArch64 with TBI.
    return TargetTriple.isAArch64() ? ThreadLong : untagPointer(IRB, ThreadLong);
}
vitalybuka accepted this revision.Jul 7 2022, 10:26 AM
vitalybuka added 1 blocking reviewer(s): mcgrathr.
leonardchan marked an inline comment as done.
leonardchan added inline comments.
llvm/lib/Transforms/Instrumentation/HWAddressSanitizer.cpp
1160
leonardchan marked 3 inline comments as done.
leonardchan marked an inline comment as done.

ping @mcgrathr does this look good to you?

mcgrathr accepted this revision.Jul 13 2022, 1:28 PM

LGTM, I didn't realize Phabricator considered me to be blocking.

This revision is now accepted and ready to land.Jul 13 2022, 1:28 PM
This revision was landed with ongoing or failed builds.Jul 13 2022, 2:09 PM
This revision was automatically updated to reflect the committed changes.