This is an archive of the discontinued LLVM Phabricator instance.

[ubsan] Add -fubsan-strip-path-components=N
ClosedPublic

Authored by filcab on Apr 28 2016, 9:30 AM.

Details

Summary

This option allows the user to control how much of the file name is
emitted by UBSan. Tuning this option allows one to save space in the
resulting binary, which is helpful for restricted execution
environments.

With a positive N, UBSan skips the first N path components.
With a negative N, UBSan only keeps the last N path components.

Diff Detail

Event Timeline

filcab updated this revision to Diff 55427.Apr 28 2016, 9:30 AM
filcab retitled this revision from to [ubsan] Add -fubsan-strip-path-components=N.
filcab updated this object.
filcab added a reviewer: rsmith.
filcab added a subscriber: cfe-commits.
rsmith added inline comments.Apr 28 2016, 11:45 AM
include/clang/Driver/Options.td
1102

This should be spelled -fsanitize-undefined-strip-path-components= or similar to match the other -fsanitize-... options.

lib/CodeGen/CGExpr.cpp
2400–2402

This is a fragile way to compute the resulting string: you're relying on PLoc.getFilename() returning a nul-terminated string; this code would fail if PLoc.getFilename() started (very reasonably) returning a StringRef. Also, *I on a path iterator doesn't necessarily produce a StringRef whose contents are taken from the path being iterated (particularly for paths ending in a /, which can't happen here, but there seem to be no guarantees that this doesn't happen in other cases too).

In the forward iteration case, you can use FilenameString.substr(I - path::begin(FilenameString)) to get a correct StringRef denoting the relevant piece of the path. The path reverse iterators don't have an operator- but one could presumably be added.

filcab updated this revision to Diff 55605.Apr 29 2016, 8:41 AM
filcab marked 2 inline comments as done.

Addressed Richard's comments.
Waiting on D19724 (operator- on Path's reverse_iterator) before landing this.

filcab updated this revision to Diff 56164.May 4 2016, 9:51 AM

Remove unneeded comments.
Simplify code.

filcab updated this revision to Diff 56262.May 5 2016, 4:23 AM

Remove .data() call

ygribov added a subscriber: ygribov.May 5 2016, 4:29 AM

Can we have generic option for other sanitizers?

filcab added a comment.May 5 2016, 5:26 AM

Can we have generic option for other sanitizers?

Do other sanitizers emit paths this way?
For ASan, for example, we end up emitting them only when there are global constructors involved, where we'd emit an __asan_global.
I don't know much about msan and tsan, but I'd think it wouldn't make as much a difference as this does in ubsan, since ubsan basically emits checks in every (non-trivial) file you compile.

I'm not opposed to doing it for other sanitizers, but the better option would be to measure first to see if it's useful. If it is, then deprecate this flag and do a more generic one.
BTW, how would we pass this information along to llvm-land so the other sanitizers could use it? Is there a mechanism already in place?

filcab updated this revision to Diff 56289.May 5 2016, 7:48 AM

Improve Windows support

rsmith added inline comments.May 5 2016, 2:38 PM
docs/UndefinedBehaviorSanitizer.rst
235

"option" -> "Option" or "The option"

lib/CodeGen/CGExpr.cpp
2385–2386

This doesn't look right: if I == E, we were asked to keep at least as many components as there were, so we should keep the entire string. I think the substr(I - E) codepath is appropriate in either case here.

test/CodeGen/ubsan-strip-path-components.cpp
7

I think this test is incorrect too.

filcab added inline comments.May 6 2016, 12:51 AM
lib/CodeGen/CGExpr.cpp
2385–2386

Ah, right.
My rationale was "if the number of stuff to strip (positive or negative) is greater than the number of path components, keep only the filename", which doesn't really fit with the negative case.
Will change today.

test/CodeGen/ubsan-strip-path-components.cpp
7

Yep, test update coming.

filcab updated this revision to Diff 56396.May 6 2016, 4:15 AM

Address Richard's comments.
Final Windows fixes (force a Linux target so our mangled label matches).

rsmith accepted this revision.May 6 2016, 2:43 PM
rsmith edited edge metadata.

LGTM

This revision is now accepted and ready to land.May 6 2016, 2:43 PM
This revision was automatically updated to reflect the committed changes.