This is an archive of the discontinued LLVM Phabricator instance.

Modify BoundsSan to improve debuggability
Needs ReviewPublic

Authored by oskarwirga on Apr 18 2023, 12:45 PM.

Details

Summary

Context

BoundsSanitizer is a mitigation that is part of UBSAN. It can be enabled in "trap" mode to crash on OOB array accesses.

Problem

BoundsSan has zero false positives meaning every crash is a OOB array access, unfortunately optimizations cause these crashes in production builds to be a bit useless because we only know which function is crashing but not which line of code.

Godbolt example of the optimization: https://godbolt.org/z/6qjax9z1b

This Diff

I wanted to provide a way to know exactly which LOC is responsible for the crash. What we do here is use the size of the basic block as an iterator to an immediate value for the ubsan trap.

Diff Detail

Event Timeline

oskarwirga created this revision.Apr 18 2023, 12:45 PM
oskarwirga created this object with visibility "oskarwirga (Oskar Wirga)".
oskarwirga created this object with edit policy "Subscribers".
Herald added a project: Restricted Project. · View Herald TranscriptApr 18 2023, 12:45 PM
oskarwirga requested review of this revision.Apr 18 2023, 12:45 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptApr 18 2023, 12:45 PM
oskarwirga changed the visibility from "oskarwirga (Oskar Wirga)" to "All Users".Apr 18 2023, 12:46 PM
oskarwirga changed the edit policy from "Subscribers" to "All Users".

You should upload this with full context and add some test cases :)

Thinking about this a bit more, should the trap not have an associated stack trace that can be symbolicated to tell you which line of code was crashing? If the issue is that multiple traps can get folded together, the nomerge attribute (D78659) could be useful.

oskarwirga added a comment.EditedMay 2 2023, 6:10 PM

Thinking about this a bit more, should the trap not have an associated stack trace that can be symbolicated to tell you which line of code was crashing? If the issue is that multiple traps can get folded together, the nomerge attribute (D78659) could be useful.

I tried adding the nomerge attribute to TrapCall but I still found the call being optimized to a single site :(

Edit: Added it incorrectly, seems to work fine!

oskarwirga edited the summary of this revision. (Show Details)May 2 2023, 6:11 PM
oskarwirga updated this revision to Diff 518933.May 2 2023, 7:00 PM
oskarwirga edited the summary of this revision. (Show Details)

Update the diff to use the nomerge attribute

oskarwirga edited the summary of this revision. (Show Details)

Turns out the lowering of ubsantrap() to a single instruction resulted in binaries that did NOT have nonmerged traps, so this is going back to what we had before. I also added tests to show that the trap gets preserved.

Rebase on trunk :)

kyulee added a subscriber: kyulee.May 10 2023, 6:36 PM

clang-format 😺

CC: @nlopes @chandlerc @jgalenson

I have y'all added here because of your past work on BoundsSan, if you know of anyone else who may be able to provide review please tag them!

The patch should be uploaded with full context to make review easier.

Adding another potential reviewer.

Add full context

nlopes added inline comments.May 12 2023, 2:17 PM
llvm/lib/Transforms/Instrumentation/BoundsChecking.cpp
187–219

this seems like code duplication. This pass already has the single-trap flag to exactly control if you get a single trap BB or one per check for better debug info.

oskarwirga added inline comments.May 12 2023, 3:35 PM
llvm/lib/Transforms/Instrumentation/BoundsChecking.cpp
187–219

Unfortunately, even with the single trap flag it gets optimized out in later passes because the machine code emitted is the exact same

smeenai added inline comments.May 12 2023, 3:41 PM
llvm/lib/Transforms/Instrumentation/BoundsChecking.cpp
187–219

I believe we end up tail merging the trap instructions. A previous iteration of this patch attempted to use the nomerge attribute to directly avoid the tail merging, but that only works for function calls, not for the trap instruction ultimately emitted here.

oskarwirga changed the visibility from "All Users" to "Public (No Login Required)".May 15 2023, 11:09 AM

Jobs are now passing, CC: @nlopes @chandlerc @jgalenson for review :)

vitalybuka added inline comments.Jul 13 2023, 11:22 AM
clang/lib/CodeGen/CGExpr.cpp
3612–3633

looks like a lot of code duplication

3630

wouldn't be you issues solved with
TrapCall->setCannotMerge() here?

oskarwirga added inline comments.Jul 13 2023, 1:10 PM
clang/lib/CodeGen/CGExpr.cpp
3612–3633

Let me refactor this

3630

ubsantrap gets lowered to an instruction in MIR which then gets merged later. This was the only way I was able to create unique instruction per check.

Refactor CodeGen code

oskarwirga marked 2 inline comments as done.Jul 13 2023, 2:10 PM
aeubanks added inline comments.Jul 13 2023, 2:32 PM
clang/lib/CodeGen/CGExpr.cpp
3630

isn't that a nomerge bug that should get fixed instead?

oskarwirga added inline comments.Jul 13 2023, 2:44 PM
clang/lib/CodeGen/CGExpr.cpp
3630

I don't think so because nomerge is a function attribute and the ubsantrap intrinsic gets lowered as an instruction. Creating instruction attributes from lowered function attributes seems a bit involved to me.

Properly refactor the code bc my last attempt was flawed -_-

oskarwirga planned changes to this revision.Aug 3 2023, 2:26 AM

Fix clang crash and retest

vitalybuka added inline comments.Sep 7 2023, 3:48 PM
clang/lib/CodeGen/CGExpr.cpp
47

this file and BoundsChecking.cpp belong to different patches

56

this applies only to fsanitize=undefined, and does not apply to llvm level sanitizers, like msan, asan
we need better name: maybe ubsan-unique-traps

BTW do we want this as frontend flag?

3581

so here we have two problems?

  1. OptimizationLevel > 0 clang creates only one TrapBB per check type
  2. even if we create multiple bb here, branch-folder will merge them later
3597–3599
(TrapBB->getParent()->size() * 0x10000 + CheckHandlerID)
llvm/lib/Transforms/Instrumentation/BoundsChecking.cpp
187–219

branches of if (DebugTrapBB) condition has a lot of code duplication, can you try to imrove?

196

why Fn->size(), to make a counter?

202

can you please create a test where bounds-checking-single-trap=0 and setCannotMerge produce invalid result.

oskarwirga added inline comments.Sep 11 2023, 2:59 AM
clang/lib/CodeGen/CGExpr.cpp
47

Just to be clear on terms here, these changes should be two different commits?

56

Good point, as for being a frontend flag I don't feel strongly one way or the other, not sure how useful this is outside of my use-case.

3581

Yeah exactly, despite my best efforts to come up with an existing way to fix this issue, because the ubsantrap intrinsic ends up being an instruction in MIR, it loses any function attributes we tack on now.

3597–3599

Why * 0x10000 ?

llvm/lib/Transforms/Instrumentation/BoundsChecking.cpp
187–219

Yes, I will address all your comments in a new patch on github, thank you for your feedback I appreciate it :)

196

I was looking for a way to create a more or less unique value without creating a global iterator. It doesn't have to be this, I just thought that the fn->size would be increasing as checks go in so it would give a unique value per check.

202

Can do!

oskarwirga marked 5 inline comments as done.Sep 11 2023, 9:04 AM