Page MenuHomePhabricator

[ubsan] Function Sanitizer: Don't require writable text segments
ClosedPublic

Authored by vsk on Sep 7 2017, 3:27 PM.

Details

Summary

This change will make it possible to use -fsanitize=function on Darwin and
possibly on other platforms. It fixes an issue with the way RTTI is stored into
function prologue data.

On Darwin, addresses stored in prologue data can't require run-time fixups and
must be PC-relative. Run-time fixups are undesirable because they necessitate
writable text segments, which can lead to security issues. And absolute
addresses are undesirable because they break PIE mode.

The fix is to create a private global which points to the RTTI, and then to
encode a PC-relative reference to the global into prologue data.

Diff Detail

Repository
rL LLVM

Event Timeline

pcc edited edge metadata.Sep 7 2017, 3:56 PM

Thanks. Once we apply this fix to other platforms this would seem to fix PR17633.

We may also want to change the ubsan function signature in order to avoid bad pointer reads in case of version mismatches.

lib/CodeGen/CodeGenFunction.cpp
434 ↗(On Diff #114267)

I think you can just do this unconditionally. As far as I know, all three object formats should support 32-bit relative relocations on x86 and x86_64, which are the only two architectures which currently support -fsanitize=function.

445 ↗(On Diff #114267)

This can be constant I think.

463 ↗(On Diff #114267)

Maybe use Int32Ty (here and below). That should be sufficient under the small code model.

pcc added inline comments.Sep 7 2017, 3:58 PM
lib/CodeGen/CodeGenFunction.cpp
463 ↗(On Diff #114267)

Sorry, I meant that the difference could be truncated to Int32Ty, and stored as an integer, not a pointer.

vsk added a comment.Sep 7 2017, 5:03 PM

Thanks for your comments!

lib/CodeGen/CodeGenFunction.cpp
434 ↗(On Diff #114267)

Ok! I'll make this change in the next version of this patch (probably tomorrow).

445 ↗(On Diff #114267)

Can the value of the global change if a dynamic library is unloaded? I'm unsure about this case: A.dylib and B.dylib both have linkonce_odr definitions of X, dyld picks the definition from A, and A is unloaded.

463 ↗(On Diff #114267)

I tried this out but it resulted in a mysterious dyld crash after the indirect callee returns, which I've yet to understand. I'll have to try using a Debug build of dyld to see what's happening.

vsk updated this revision to Diff 114303.Sep 7 2017, 6:57 PM
vsk edited the summary of this revision. (Show Details)

Address review feedback:

  • Enable this change on all platforms.
  • Bump the function signature version.
  • Store the indirect RTTI pointer as an IntPtrTy. This saves a "ptrtoint" instruction. Truncating to Int32Ty should be possible, but causes dyld to crash, and also requires an extra sign-extension.
vsk added inline comments.Sep 12 2017, 12:15 PM
lib/CodeGen/CodeGenFunction.cpp
463 ↗(On Diff #114267)

@pcc It turned out that the jump we encode in getUBSanFunctionSignature not correct if we truncate a 64-bit address to a 32-bit address. If we jump to +8 instead of +12, we can make the truncation work :). There is no dyld bug.

That said, I think we should store the pc-rel address as an IntPtrTy to keep things a bit simpler. We could avoid logic to truncate/sign-extend conditionally, and not emit those instructions at all. Wdyt?

pcc added inline comments.Sep 12 2017, 1:06 PM
lib/CodeGen/CodeGenFunction.cpp
445 ↗(On Diff #114267)

I don't think it is possible to get into a situation where a global's address can change if a library is unloaded. In your scenario, suppose that B takes the address of X and then A is dlclosed. In that scenario the program would rightly expect the address to continue to be valid; to do otherwise would break the C++ standard's guarantee that each global has a unique address. That I believe is why POSIX does not provide a mechanism to require that a DSO is unloaded, only to request it to be unloaded; see the specification of dlclose [0].

The Linux manpage for dlclose is more specific: "If the reference count drops to zero and no other loaded libraries use symbols in it, then the dynamic library is unloaded." I believe that any references to X in B would count as a use of X in A.

[0] http://pubs.opengroup.org/onlinepubs/009695399/functions/dlclose.html

463 ↗(On Diff #114267)

The problem is that COFF does not have a relocation that can express a 64-bit relative reference. If we want something that will work with all object formats, 32-bit is probably best.

vsk updated this revision to Diff 114932.Sep 12 2017, 4:01 PM
vsk marked 8 inline comments as done.

Address review feedback.

lib/CodeGen/CodeGenFunction.cpp
445 ↗(On Diff #114267)

I see, thanks for explaining!

463 ↗(On Diff #114267)

Got it.

pcc accepted this revision.Sep 12 2017, 4:22 PM

LGTM, thanks!

lib/CodeGen/CodeGenFunction.cpp
461 ↗(On Diff #114932)

Is this conditional necessary? I believe that IRBuilder will return the same value back if you cast to the same type.

This revision is now accepted and ready to land.Sep 12 2017, 4:22 PM
This revision was automatically updated to reflect the committed changes.
vsk marked 4 inline comments as done.Sep 12 2017, 5:06 PM
vsk added inline comments.
lib/CodeGen/CodeGenFunction.cpp
461 ↗(On Diff #114932)

Thanks, it's not needed. I've gotten rid of it.