This is an archive of the discontinued LLVM Phabricator instance.

[DFSan] Handle landingpad inst explicitly as zero shadow.
ClosedPublic

Authored by browneee on Jun 14 2021, 2:34 PM.

Details

Summary

Before this change, DFSan was relying fallback cases when getting origin
address.

Diff Detail

Event Timeline

browneee created this revision.Jun 14 2021, 2:34 PM
browneee requested review of this revision.Jun 14 2021, 2:34 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 14 2021, 2:34 PM
stephan.yichao.zhao added inline comments.
llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp
2567

This link does not explain this issue in details.
https://github.com/llvm/llvm-project/blame/dde9dcc24b23e0b8185cf7ce5072c0dc8ff086c9/llvm/lib/Transforms/Instrumentation/MemorySanitizer.cpp shows that at the very beginning, MemorySanitizer has the comment here.

From https://llvm.org/docs/ExceptionHandling.html#try-catch, we can see in practice the first element of the return value of a landing pad is an exception.
So this means it is possible to have data flows from an exception raise to here, but not 100% sure.
For MSan, it is only to ignore tracking this because at runtime all returned values can be assumed initialized. For DFSan, this may miss some flows.

@eugenis: Do you still happen to remember how https://github.com/google/sanitizers/issues/504 affected MSan? I did not see how other sanitizers handle landing pads. Will DFSan be affected too?

Not sure if https://github.com/google/sanitizers/issues/504 implies any

browneee added inline comments.Jun 15 2021, 12:32 PM
llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp
2567

Testing throwing data:

int foo(int a) {
    throw a;
}

int bar(int a) {
    try {
        foo(a);
    } catch (int x) {
        return x;
    }
    return 0;
}
; Function Attrs: noinline optnone uwtable
define dso_local i32 @_Z3fooi(i32 %0) #0 {
  %2 = alloca i32, align 4
  store i32 %0, i32* %2, align 4
  %3 = call i8* @__cxa_allocate_exception(i64 4) #4    ; STORAGE FROM ABI
  %4 = bitcast i8* %3 to i32*
  %5 = load i32, i32* %2, align 4
  store i32 %5, i32* %4, align 16        ; STORE VALUE TO THROW
  call void @__cxa_throw(i8* %3, i8* bitcast (i8** @_ZTIi to i8*), i8* null) #5
  unreachable
}

declare dso_local i8* @__cxa_allocate_exception(i64)

declare dso_local void @__cxa_throw(i8*, i8*, i8*)

; Demangled: "bar(int)"
; Function Attrs: noinline optnone uwtable
define dso_local i32 @_Z3bari(i32 %0) #0 personality i8* bitcast (i32 (...)* @__gxx_personality_v0 to i8*) {
  %2 = alloca i32, align 4
  %3 = alloca i32, align 4
  %4 = alloca i8*, align 8
  %5 = alloca i32, align 4
  %6 = alloca i32, align 4
  store i32 %0, i32* %3, align 4
  %7 = load i32, i32* %3, align 4
  %8 = invoke i32 @_Z3fooi(i32 %7)
          to label %9 unwind label %10

9:                                                ; preds = %1
  br label %24

10:                                               ; preds = %1
  %11 = landingpad { i8*, i32 }
          catch i8* bitcast (i8** @_ZTIi to i8*)
  %12 = extractvalue { i8*, i32 } %11, 0
  store i8* %12, i8** %4, align 8
  %13 = extractvalue { i8*, i32 } %11, 1
  store i32 %13, i32* %5, align 4
  br label %14

14:                                               ; preds = %10
  %15 = load i32, i32* %5, align 4
  %16 = call i32 @llvm.eh.typeid.for(i8* bitcast (i8** @_ZTIi to i8*)) #4
  %17 = icmp eq i32 %15, %16
  br i1 %17, label %18, label %27

18:                                               ; preds = %14
  %19 = load i8*, i8** %4, align 8
  %20 = call i8* @__cxa_begin_catch(i8* %19) #4
  %21 = bitcast i8* %20 to i32*
  %22 = load i32, i32* %21, align 4    ; LOAD VALUE FROM THROW
  store i32 %22, i32* %6, align 4
  %23 = load i32, i32* %6, align 4
  store i32 %23, i32* %2, align 4
  call void @__cxa_end_catch() #4
  br label %25

24:                                               ; preds = %9
  store i32 0, i32* %2, align 4
  br label %25

25:                                               ; preds = %24, %18
  %26 = load i32, i32* %2, align 4
  ret i32 %26

27:                                               ; preds = %14
  %28 = load i8*, i8** %4, align 8
  %29 = load i32, i32* %5, align 4
  %30 = insertvalue { i8*, i32 } undef, i8* %28, 0
  %31 = insertvalue { i8*, i32 } %30, i32 %29, 1
  resume { i8*, i32 } %31
}

It seems like the ABI will provide storage for data flows. DFSan does handle this flow correctly.

As for:

%12 = extractvalue { i8*, i32 } %11, 0
%13 = extractvalue { i8*, i32 } %11, 1

I'd expect the indirection in i8* %12 would allow DFSan to have a shadow for any data at load(%12).
i32 %13 is a type ID, and should never be tainted.

I'm thinking I should:

  • remove the out of date issue link from MSan and
  • add comments about not needing to track flows through landingpad.
browneee updated this revision to Diff 352229.Jun 15 2021, 1:19 PM

Updating comments.

browneee marked an inline comment as done.Jun 15 2021, 1:20 PM
llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp
2567

Thank you for extracting this.

  %12 = extractvalue { i8*, i32 } %11, 0
  store i8* %12, i8** %4, align 8

...

  %19 = load i8*, i8** %4, align 8
  %20 = call i8* @__cxa_begin_catch(i8* %19) #4
  %21 = bitcast i8* %20 to i32*
  %22 = load i32, i32* %21, align 4    ; LOAD VALUE FROM THROW
  store i32 %22, i32* %6, align 4
  %23 = load i32, i32* %6, align 4
  store i32 %23, i32* %2, align 4
  ...
  br label %25

25:                                               ; preds = %24, %18
  %26 = load i32, i32* %2, align 4
  ret i32 %26

From the above, %12 eventually goes to %26 via __cxa_begin_catch.

If we build the above c code with dfsan and C++ exception enabled, does it have a link error saying dfs$cxa_begin_catch cannot be found? If there is no such an error, I think our problem is that we do not have a correct wrapper of cxa_begin_catch, so %19 may not go to %20 correctly, by default its return value will have 0 shadow. .../compiler-rt/lib/dfsan/done_abilist.txt does not have this.

https://llvm.org/docs/ExceptionHandling.html says "__cxa_begin_catch takes an exception structure reference as an argument and returns the value of the exception object." To support this, the wrapper needs to help load correct shadow and assign it to the shadow of what it ret points to.

But we cannot provide a wrapper to this builtin libcall yet, and can only wrap glibc calls.
So I think zero-out here is like a workaround before we can have a correct wrapper of __cxa_begin_catch.
Please add a comment to explain this as a TODO after we can wrapp libcall APIs.

llvm/lib/Transforms/Instrumentation/DataFlowSanitizer.cpp
2567

I missed that part that we verified that cxa_begin_catch is able to 'propagate' shadows even if we did not instrument it. Please update the comments about cxa_begin_catch's behavior.

browneee updated this revision to Diff 352310.Jun 15 2021, 6:12 PM
browneee marked 2 inline comments as done.

Updated comments.

This revision is now accepted and ready to land.Jun 15 2021, 6:14 PM
This revision was landed with ongoing or failed builds.Jun 15 2021, 6:30 PM
This revision was automatically updated to reflect the committed changes.