This is an archive of the discontinued LLVM Phabricator instance.

[FunctionAttrs] Added nonnull atribute to libc function args
AbandonedPublic

Authored by xbolva00 on Jul 30 2018, 11:20 PM.

Details

Summary

This informs other passes that we can optimize e.g.:
int f(char *s) {

puts(s);
if (s == NULL)
        puts("was null");
return 99;

}

to:

int f(char *s) {

 puts(s);
return 99;

}

Diff Detail

Event Timeline

xbolva00 created this revision.Jul 30 2018, 11:20 PM
xbolva00 added a comment.EditedJul 30 2018, 11:22 PM

if @efriedma, @bkramer is OK with this approach, I will add some tests, etc.

For libc functions the approach seems right for me, but not sure about mem* intrinsics, if the IR builder is a good place where we should add the nonnull arg info.

Reviewers, please check the places where I added nonnull function arg attribute, I may missed something.

xbolva00 added a comment.EditedJul 31 2018, 9:01 AM

Calls which are not reordered in the IR -> transformation already works

int foo(void *s) {

memset(s, 0, 10);
if (!s) // eliminated
    abort();
return l;

}

int foo(char *s) {

int l = strlen(s);
if (!s)
    abort();
return l;

}

for strlen, we have IR:

; Function Attrs: nounwind
define dso_local i32 @foo(i8* readonly %s) local_unnamed_addr #0 {
entry:

%tobool = icmp eq i8* %s, null
br i1 %tobool, label %if.then, label %if.end

if.then: ; preds = %entry

tail call void @abort() #3
unreachable

if.end: ; preds = %entry

%call = tail call i32 @strlen(i8* %s) #4
ret i32 %call

}

; Function Attrs: argmemonly nounwind readonly
declare dso_local i32 @strlen(i8* nocapture nonnull) local_unnamed_addr #

Where should we work on above case? SCCP?

I don't think we can reasonably do this for the functions which take a pointer+size pair; see http://lists.llvm.org/pipermail/cfe-dev/2017-January/052066.html etc. Maybe we can mark specific calls where we know the size is nonzero.

Not sure how this should interact with -fno-delete-null-pointer-checks; maybe we can make that work "correctly" by just doing all our nonnull marking on individual calls, even for the functions which take null-terminated strings.

xbolva00 added inline comments.Jul 31 2018, 1:44 PM
lib/Transforms/Utils/BuildLibCalls.cpp
477
  • 1 is correct
xbolva00 added a subscriber: rsmith.EditedJul 31 2018, 1:59 PM

I don't think we can reasonably do this for the functions which take a pointer+size pair; see http://lists.llvm.org/pipermail/cfe-dev/2017-January/052066.html etc. Maybe we can mark specific calls where we know the size is nonzero.

Not sure how this should interact with -fno-delete-null-pointer-checks; maybe we can make that work "correctly" by just doing all our nonnull marking on individual calls, even for the functions which take null-terminated strings.

so @rsmith 's list is what we want to follow for this type of transformation?

ISO C:

memcpy, memmove, memset, memcmp, memchr, memrchr
strncpy [param 1 only], strncat [param 1 only], strncmp

Extensions:

strndup, memccpy, mempcpy, strnlen, bcopy, bzero, bcmp, memfrob,
strncasecmp, strncasecmp_l [not parameter 4], stpncpy

Side note: GCC has no problem to optimize/remove condition (there is already ubsan checker for 0/null for mem* functions, or maybe is under development, I saw it somewhere)
int foo(void *p, int n) {

memset(p, 0, n);
if (!p)
    abort();
return 1;

}

int foo(void *p, int n) {

memset(p, 0, 0);
if (!p)
    abort();
return 1;

}
is not optimized by GCC.

Shouldn't we just do it like GCC? Do not optimize when explicit NULL or zero, otherwise be optimistic.

Yes, that's the right list of functions. Like the discussion noted, we don't really want to match gcc's behavior here because it breaks legacy code for little practical benefit.

Yes, that's the right list of functions. Like the discussion noted, we don't really want to match gcc's behavior here because it breaks legacy code for little practical benefit.

Okay. I will update this patch tomorrow.

xbolva00 updated this revision to Diff 158754.Aug 2 2018, 7:37 AM
  • Updated the list of functions where we want to add this attribute
  • Updated tests
xbolva00 edited the summary of this revision. (Show Details)Aug 2 2018, 7:38 AM
xbolva00 updated this revision to Diff 158771.Aug 2 2018, 8:55 AM
  • Small fix
xbolva00 updated this revision to Diff 161212.Aug 17 2018, 5:01 AM
  • Rebased
xbolva00 marked an inline comment as done.Aug 17 2018, 5:01 AM

Should I reduce this patch to just string functions for easier review?

I'm still concerned that this doesn't really interact appropriately with -fno-delete-null-pointer-checks... specifically, the nonnull markings could cause unexpected optimizations. I mean, it's sort of dubious to use -fno-delete-null-pointer-checks without -ffreestanding anyway, given that many C library APIs special-case null, but we should still try to make it work as well as we can. I think we can solve that problem by changing how the nonnull markings are implemented: instead of marking the functions themselves as non-null, we should mark calls to those functions (as long as the calls are not in "null-pointer-is-valid" functions).

In terms of splitting the patch, yes, it probably makes sense to split into markings for char* pointers, and markings for other pointers.

lib/Transforms/Utils/BuildLibCalls.cpp
272

For sprintf, I think we can mark both 0 and 1.

Ok, makes sense. Just looking into FunctionAttrs, there is a FIXME comment for "EnableNonnullArgPropagation". If we carefully handle these nonnull arg attributes (e.g. avoid memcpy(d, s, 0) optimization), are there any reasons whyEnableNonnullArgPropagation should be still disabled?

Off the top of my head, I think we fixed clang, so it should be okay... but please verify that the clang fix actually went in. And it should be a separate patch, of course.

Adding @spatel since he is the author of "EnableNonnullArgPropagation" (https://reviews.llvm.org/D27855)

Adding @spatel since he is the author of "EnableNonnullArgPropagation" (https://reviews.llvm.org/D27855)

From what I can tell, the patch that was supposed to make the C functions safe and let us flip the setting of EnableNonnullArgPropagation is here:
D30806

It was approved, but it was never committed?

xbolva00 added a comment.EditedSep 28 2018, 1:51 AM

Since we are gonna mark args as nonull at the callsites, we can be more precisice since we know if "size" arg is constant and > 0.

e.g.
strncmp(d,s, 10) - we can mark "d", "s" as nonnull, right?

strncmp(d,s, 10) - we can mark "d", "s" as nonnull, right?

Yes, sure.

xbolva00 added a comment.EditedOct 13 2018, 3:25 PM

I am trying to do this in functionattrs but problem is that the call was sunk to the successor before the functionattrs ran and in the functionattrs we just go thru instructionsof EntryBlock :/

xbolva00 updated this revision to Diff 169624.Oct 14 2018, 1:42 PM
xbolva00 retitled this revision from [LibCalls] Added nonnull atribute to libc function args to [FunctionAttrs] Added nonnull atribute to libc function args.
  • Reimplemented

@efriedma , please, can you look at it?

Tests will be added later after major design issues are addressed.

xbolva00 updated this revision to Diff 169627.Oct 14 2018, 1:44 PM
xbolva00 added inline comments.Oct 14 2018, 1:48 PM
lib/Transforms/IPO/FunctionAttrs.cpp
636 ↗(On Diff #169627)

As a possible follow up for FunctionAttrs, is there any interest to implement TODO comment (check for post-dominance, be unrestricted to just entry block)?

xbolva00 updated this revision to Diff 169629.Oct 14 2018, 1:50 PM
xbolva00 updated this revision to Diff 169632.Oct 14 2018, 3:15 PM
  • Fixed strtok

This doesn't look like the patch I was expecting.

Propogating nonnull like this probably makes sense, but this patch should be split into two parts. In the first part, you can extend SimplifyLibCalls to just call Call->addParamAttr(ArgNo, Attribute::NonNull) on calls to known library functions, or something like that. Then the second part does the inference from callee to caller; essentially this patch, but looking for nonnull attributes instead of specific libcalls.

Alright, definitely a better solution! Thanks, I will split it.

Then the second part does the inference from callee to caller; essentially this patch, but looking for nonnull attributes instead of specific libcalls.

But basically, this is what addArgumentAttrsFromCallsites does. So if I do this correctly in Simplifylibcalls, can I remove "EnableNonnullArgPropagation" flag which blocks the propagation to function args?

I think we still need the clang patch to preemptively strip nonnull annotations from C library functions before we turn on EnableNonnullArgPropagation by default, to be safe. But yes, the right approach is to add the correct nonnull attributes, then enable nonnull propagation.

Thanks :)

What I would like to do as follow up is to turn FunctionAttrs to use dominators info.

xbolva00 added a comment.EditedOct 17 2018, 4:04 PM

InstCombine does something weird for me.

define dso_local i32 @fff6(i8* nocapture readonly %s) local_unnamed_addr #5 {
entry:

%call = tail call i64 @strlen(i8* %s) #8
%conv = trunc i64 %call to i32
ret i32 %conv

}

I add nonnull for strlen so... -> call i64 @strlen(i8* nonnull %s) but after -instcombine we have undef here (@spatel ?)
define dso_local i32 @fff6(i8* nocapture readonly %s) local_unnamed_addr #5 {
entry:

ret i32 undef

}

I will look more closely to visitCallSite...

xbolva00 updated this revision to Diff 170039.Oct 17 2018, 4:30 PM
  • More marked functions
xbolva00 abandoned this revision.Oct 18 2018, 2:18 AM