This is an archive of the discontinued LLVM Phabricator instance.

[clang][UBSan] Sanitization for alignment assumptions.
ClosedPublic

Authored by lebedev.ri on Nov 15 2018, 10:25 AM.

Details

Summary

UB isn't nice. It's cool and powerful, but not nice.
Having a way to detect it is nice though.
P1007R3: std::assume_aligned / http://www.open-std.org/jtc1/sc22/wg21/docs/papers/2018/p1007r2.pdf says:

We propose to add this functionality via a library function instead of a core language attribute.
...
If the pointer passed in is not aligned to at least N bytes, calling assume_aligned results in undefined behaviour.

This differential teaches clang to sanitize all the various variants of this assume-aligned attribute.

Requires D54588 for LLVM IRBuilder changes.
The compiler-rt part is D54590.

Diff Detail

Repository
rC Clang

Event Timeline

Adapt to D54588 no longer providing the actual pointer.
The handled itself will have to deal with subtracting the offset.

rjmccall added inline comments.Nov 26 2018, 11:58 AM
docs/UndefinedBehaviorSanitizer.rst
75

assume_aligned-like

199

Is there a reason for this exception?

lib/CodeGen/CGBuiltin.cpp
1916–1917

Is this {}-initializing a SourceLocation? Please use SourceLocation() instead and put the comment before it.

lib/CodeGen/CodeGenFunction.cpp
2509

What's the deal with the two different source locations?

lib/CodeGen/CodeGenFunction.h
2827

dyn_cast, please. And these two methods should really be defined out-of-line.

lebedev.ri added inline comments.Nov 26 2018, 12:16 PM
docs/UndefinedBehaviorSanitizer.rst
199

Are you asking about the LHS of the diff, or about adding an exception to that for this sanitizer?
I'm adding an exception here because i don't know what should be done here.
Does it make sense to emit an assumptions for volatile pointers, but do not sanitize these assumptions?

lib/CodeGen/CGBuiltin.cpp
1916–1917

Is this {}-initializing a SourceLocation?

Yes

Ok.

lib/CodeGen/CodeGenFunction.cpp
2509

The first one points to the source-location of this alignment assumption.
The second one *may* point to the location where the alignment was specified.
See e.g. "test/ubsan/TestCases/Pointer/alignment-assumption-attribute-align_value-on-lvalue.cpp" in https://reviews.llvm.org/D54590#change-jI44M13yrBNo

rjmccall added inline comments.Nov 26 2018, 12:54 PM
docs/UndefinedBehaviorSanitizer.rst
199

Are you asking about the LHS of the diff, or about adding an exception to that for this sanitizer?

I'm asking about adding a new exception for one portion of one sanitizer.

I'm adding an exception here because i don't know what should be done here.

Okay, that's not a good enough reason.

The overall rule for annotation-based language/tool designs is that explicit/specific/close wins over implicit/general/distant. So the question is: how does that rule apply here?

You can't end up with a pointer to volatile completely implicitly — at some point, a programmer was explicit about requesting volatile semantics, and that has somehow propagated to this particular access/assumption site. So that's a pretty strong piece of information, and if we have a general rule for the sanitizers that volatile bypasses the check, it's generally a good idea to be consistent with that.

On the other hand, these assumption annotations are themselves always explicit, right? If you have to be explicit about putting align_value on a specific pointer variable, and that pointer just happens to be volatile-qualified, we probably *shouldn't* bypass the check: that's about an explicit, specific, and close as a programmer can get, just short of literally writing it on every access to the variable. The only counter-argument is that maybe the pointer is only volatile-qualified because of template instantiation or something.

So I think it makes sense to enforce it for at least some of these annotations and/or builtin calls, but we should be clear about *why* it makes sense. However, it's possible that I may be misunderstanding part of the motivation behind the general exception for volatile, so you should reach out for input from the UBSan etc. people.

lib/CodeGen/CodeGenFunction.cpp
2509

I see, so the runtime diagnostic can point to the second location if it's known.

The parameter names do not make that clear at all. It should at least be documented, and you should probably change the variable names to be clearer.

lebedev.ri marked an inline comment as done.Nov 26 2018, 1:16 PM
lebedev.ri added inline comments.
docs/UndefinedBehaviorSanitizer.rst
199

Tried with nullability attributes https://godbolt.org/z/rJUb9U
They do not bypass this "ignore volatile".
So i guess i will drop this exception.

lebedev.ri marked an inline comment as not done.Nov 26 2018, 1:18 PM
lebedev.ri added inline comments.
docs/UndefinedBehaviorSanitizer.rst
199

Whoops, i meant nullable of course, showed the wrong snippet https://godbolt.org/z/DbzKK0

lebedev.ri marked 11 inline comments as done.

Address @rjmccall review notes.

rjmccall added inline comments.Nov 27 2018, 8:13 AM
docs/ReleaseNotes.rst
337

unfinished

docs/UndefinedBehaviorSanitizer.rst
199

Okay, WFM.

lib/CodeGen/CGStmtOpenMP.cpp
1475–1476

Same comment here about SourceLocation().

lib/CodeGen/CodeGenFunction.cpp
2506

typo

lebedev.ri marked 4 inline comments as done.

Address @rjmccall review notes.

rjmccall accepted this revision.Nov 27 2018, 9:39 AM

LGTM.

This revision is now accepted and ready to land.Nov 27 2018, 9:39 AM

LGTM.

Thank you for the review!

Rebased, NFC.

Rebased before commit, NFC.