This is an archive of the discontinued LLVM Phabricator instance.

Implement nonnull-attribute sanitizer
ClosedPublic

Authored by samsonov on Aug 26 2014, 4:30 PM.

Details

Summary

This patch implements a new UBSan check, which verifies
that function arguments declared to be nonnull with attribute((nonnull))
are actually nonnull in runtime.

To implement this check, we pass FunctionDecl to CodeGenFunction::EmitCallArgs
(where applicable) and if function declaration has nonnull attribute specified
for a certain formal parameter, we compare the corresponding RValue to null as
soon as it's calculated.

Diff Detail

Event Timeline

samsonov updated this revision to Diff 12975.Aug 26 2014, 4:30 PM
samsonov retitled this revision from to Implement nonnull-attribute sanitizer.
samsonov updated this object.
samsonov edited the test plan for this revision. (Show Details)
samsonov added a reviewer: rsmith.
samsonov added subscribers: rnk, Unknown Object (MLST).
rsmith added inline comments.Aug 26 2014, 4:51 PM
projects/compiler-rt/lib/ubsan/ubsan_handlers.cc
378

Please consistently use either NonNull or Nonnull.

tools/clang/lib/CodeGen/CGCall.cpp
2423–2424

What should happen here:

__attribute__((nonnull)) void f(const char *, ...);
int main() { void *p = 0; f("%s", p); }

(I have no idea if the attribute applies in this case.)

2425–2429

Can a function have multiple __attribute__((nonnull(N)))s on it?

2431

What guarantees this? I don't see where you're checking that the parameter is of a pointer type.

samsonov added inline comments.Aug 26 2014, 6:24 PM
projects/compiler-rt/lib/ubsan/ubsan_handlers.cc
378

Done

tools/clang/lib/CodeGen/CGCall.cpp
2423–2424

Neither am I :( GCC docs say that "If no argument index list is given to the nonnull attribute, all pointer arguments are marked as non-null", but it's not clear if it refers to call arguments, or formal parameters. What is worse, GCC and Clang seem to disagree on this:

$ cat tmp/ubsan/bad.cpp
__attribute__((nonnull)) void *foo2(int *a, ...);

void bar() { foo2(0, (void*)0); }
$ g++ -Wnonnull -Wall tmp/ubsan/bad.cpp -c -o /dev/null
tmp/ubsan/bad.cpp: In function ‘void bar()’:
tmp/ubsan/bad.cpp:3:30: warning: null argument where non-null required (argument 1) [-Wnonnull]
void bar() { foo2(0, (void*)0); }
                            ^
tmp/ubsan/bad.cpp:3:30: warning: null argument where non-null required (argument 2) [-Wnonnull]
$ ./bin/clang++ -Wnonnull -Wall tmp/ubsan/bad.cpp -c -o /dev/null
tmp/ubsan/bad.cpp:3:30: warning: null passed to a callee which requires a non-null argument [-Wnonnull]
void bar() { foo2(0, (void*)0); }
                  ~          ^
1 warning generated.

Sigh. Should Clang strive for GCC-compatible behavior, or you have the idea of which behavior is "more predictable" in this case?

2425–2429

Nice catch! Both Clang and GCC accept this code and seem to calculate union of all arguments passed into nonnull().

$ cat tmp/ubsan/bad.cpp
__attribute__((nonnull(1)))
__attribute__((nonnull(2)))
void foo(int *a, int *a2);
void bar() { foo(0, 0); }
$ g++ -Wnonnull -Wall tmp/ubsan/bad.cpp -c -o /dev/null
tmp/ubsan/bad.cpp: In function ‘void bar()’:
tmp/ubsan/bad.cpp:9:11: warning: null argument where non-null required (argument 1) [-Wnonnull]
   foo(0, 0);
           ^
tmp/ubsan/bad.cpp:9:11: warning: null argument where non-null required (argument 2) [-Wnonnull]
$ ./bin/clang++ -Wnonnull -Wall tmp/ubsan/bad.cpp -c -o /dev/null
tmp/ubsan/bad.cpp:9:11: warning: null passed to a callee which requires a non-null argument [-Wnonnull]
  foo(0, 0);
         ~^
tmp/ubsan/bad.cpp:9:11: warning: null passed to a callee which requires a non-null argument [-Wnonnull]
  foo(0, 0);
      ~   ^
2 warnings generated.

The code as written won't handle this case properly (the last nonnull() attribute in function declaration will win). Sadly, there is the same problem already present in EmitFunctionProlog() function. I will work on a fix in a separate change.

2431

This seems to be handled by Sema:

tmp/ubsan/bad.cpp:1:33: warning: 'nonnull' attribute only applies to pointer arguments [-Wignored-attributes]
void foo(int *a, __attribute__((nonnull)) int b);
                 ~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~
rsmith added inline comments.Aug 26 2014, 8:44 PM
tools/clang/lib/CodeGen/CGCall.cpp
2423–2424

I think the GCC behavior makes more sense. We should fix the frontend to also understand this, and model it properly in the AST. Our current representation doesn't work.

2431

I was thinking about this case:

void f(non_scalar_type x) __attribute__((nonnull));
void g() { f(x); }

... but it looks like the attribute handling flattens the attribute into a list of parameters when it's applied. That's broken in several ways, alas. For instance, we fail to diagnose this properly:

template<typename T> __attribute__((nonnull)) void f(T t);
void g() { f((void*)0); }

... because we can't preserve that we had a parameter-less nonnull attribute until we get to template instantiation.

rsmith edited edge metadata.Aug 26 2014, 10:10 PM

The frontend should deal with this a lot better as of r216520.

samsonov updated this revision to Diff 13017.Aug 27 2014, 6:43 PM
samsonov edited edge metadata.

Rebase patch to incorporate various recent fixes to nonnull
attribute handling. Add support for vararg functions with nonnull attribute.

tools/clang/lib/CodeGen/CGCall.cpp
2423–2424

Vararg sanitization should now work.

2425–2429

Fixed

2431

Fixed

rsmith added inline comments.Aug 27 2014, 6:52 PM
tools/clang/lib/CodeGen/CGCall.cpp
2437

Is there a reason to pass ArgNo as dynamic data rather than static?

samsonov updated this revision to Diff 13046.Aug 28 2014, 10:58 AM

Move ArgNo to static data.

samsonov added inline comments.Aug 28 2014, 10:59 AM
tools/clang/lib/CodeGen/CGCall.cpp
2437

Right, we can move it to static data now, as we now discriminate checks for different call arguments by source location anyway. Done.

Any more comments on this?

rsmith added a comment.Sep 5 2014, 2:12 PM

One diagnostic quality comment, otherwise this looks fine.

projects/compiler-rt/lib/ubsan/ubsan_handlers.cc
365

It would seem better to point to the attribute rather than just to the function.

samsonov updated this revision to Diff 13346.Sep 5 2014, 2:44 PM

Provide source location of nonnull attribute declaration,
instead of function declaration.

samsonov added inline comments.Sep 5 2014, 2:45 PM
projects/compiler-rt/lib/ubsan/ubsan_handlers.cc
365

Done.

Side note: do you it makes sense to provide source location of returns_nonnull attribute as well? (I'd say it could make sense, as attribute is attached to function declaration, which might be in a completely different where definition is).

rsmith accepted this revision.Sep 6 2014, 3:42 AM
rsmith edited edge metadata.

Yes, I think it makes sense for returns_nonnull too.

projects/compiler-rt/lib/ubsan/ubsan_handlers.h
135

Maybe AttrLoc?

This revision is now accepted and ready to land.Sep 6 2014, 3:42 AM
samsonov added inline comments.Sep 8 2014, 10:00 AM
projects/compiler-rt/lib/ubsan/ubsan_handlers.h
135

Done

samsonov updated this revision to Diff 13406.Sep 8 2014, 10:00 AM
samsonov edited edge metadata.

Address review comment.

samsonov updated this revision to Diff 13407.Sep 8 2014, 10:31 AM

Document new UBSan option (-fsanitize=nonnull-attribute).

samsonov closed this revision.Sep 8 2014, 10:32 AM