This is an archive of the discontinued LLVM Phabricator instance.

[dfsan] Fix origin tracking for fast8
ClosedPublic

Authored by gbalats on Apr 29 2021, 3:59 PM.

Details

Summary

The problem is the following. With fast8, we broke an important invariant when loading shadows.
A wide shadow of 64 bits used to correspond to 4 application bytes with fast16; so, generating a single load was okay since those 4 application bytes would share a single origin.
Now, using fast8, a wide shadow of 64 bits corresponds to 8 application bytes that should be backed by 2 origins (but we kept generating just one).

Let’s say our wide shadow is 64-bit and consists of the following: 0xABCDEFGH. To check if we need the second origin value, we could do the following (on the 64-bit wide shadow) case:

  • bitwise shift the wide shadow left by 32 bits (yielding 0xEFGH0000)
  • push the result along with the first origin load to the shadow/origin vectors
  • load the second 32-bit origin of the 64-bit wide shadow
  • push the wide shadow along with the second origin to the shadow/origin vectors.

The combineOrigins would then select the second origin if the wide shadow is of the form 0xABCDE0000.
The tests illustrate how this change affects the generated bitcode.

Diff Detail

Event Timeline

gbalats created this revision.Apr 29 2021, 3:59 PM
gbalats requested review of this revision.Apr 29 2021, 3:59 PM
llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp
505

-> loadNextOriginAndIncAddr or a similar name? So a reader can follow the main code w/o looking into the method.

2150

The LangRef needs both operands have the same type.
I guess on a 64bit system, this is fine. But making this one have type WideShadowTy would make them always consistent.

2195

If we lift this origin load out of the if-else, this if-else could be shared with the if-else before the for-loop, so the comments and the assertion assert(BytesPerWideShadow == 8); will also be shared.

FirstOrigin = DFS.loadNextOrigin(Pos, OriginAlign, &OriginAddr);
if () {
   ...
   Origins.push_back(FirstOrigin)
} else {
   ...
   Origins.push_back(FirstOrigin)
}
llvm/test/Instrumentation/DataFlowSanitizer/origin_load.ll
258

If WIDE_SHADOW_LO is named to be WIDE_SHADOW_1; and WIDE_SHADOW is named WIDE_SHADOW_12; and ORIGIN is named ORIGIN1;
and we do not reuse ORIGIN, but named new assignment to ORIGIN as ORIGIN23 if it is a select from ORIGIN2 and ORIGIN3, it helps read the code.

gbalats updated this revision to Diff 342049.Apr 30 2021, 2:45 PM
gbalats marked an inline comment as done.

Address reviewer comments.

Add lambda to remove code duplication and description for loadOrigin.

gbalats marked an inline comment as done.Apr 30 2021, 2:47 PM
gbalats added inline comments.
llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp
505

The "Next" part of the name alludes to a pointer advance (how can you load the next thing without advancing the pointer). Also, the fact that the OriginAddr param is passed by pointer (indicating that it will be updated). However, I did add some documentation to make it clearer.

2150

Done. Thanks for catching this!

2195

Done. Extracted to a lambda.

llvm/test/Instrumentation/DataFlowSanitizer/origin_load.ll
258

I think if we use WIDE_SHADOW_1 instead of _LO it will be difficult to differentiate between the different wide shadows and their left-shifted counterparts. The naming pattern I used is adding _LO suffix to indicate that WIDE_SHADOW(N)_LO is the left-shifted version of WIDE_SHADOW(N). Overwriting ORIGIN is essential to keep the test smaller by reusing the same expression later on (see lines 312, 316). We need a way to state that, whatever path has been taken (fast8 or fast16, combine load ptr or not), the ORIGIN will refer to the name of the final origin that we have computed. Otherwise, we'd have to replicate those lines as well even though they are essentially the same.

However, I can partially address this (remove some of the overwriting) when removing fast16.

stephan.yichao.zhao added inline comments.
llvm/test/Instrumentation/DataFlowSanitizer/origin_load.ll
258

Got it. I missed the pattern needs to be shared. Yes. Please see if this can be simplified after fast16 is removed.

This revision is now accepted and ready to land.Apr 30 2021, 2:52 PM
gbalats updated this revision to Diff 342056.Apr 30 2021, 2:57 PM
gbalats marked an inline comment as done.

Pulling latest changes

This revision was landed with ongoing or failed builds.Apr 30 2021, 3:58 PM
This revision was automatically updated to reflect the committed changes.