Page MenuHomePhabricator

try to extend nonnull-ness of arguments from a callsite back to its parent function
ClosedPublic

Authored by spatel on Dec 16 2016, 10:00 AM.

Details

Summary

As discussed here:
http://lists.llvm.org/pipermail/llvm-dev/2016-December/108182.html
...we should be able to propagate 'nonnull' info from a callsite back to its parent.

The original motivation for this patch is our botched optimization of "dyn_cast" (PR28430), but this won't solve that problem alone.

I think this is the strongest nonnull transform of this type because tagging the argument can enable transforms for all other instructions in the function. Possible follow-ups would handle cases where we can use nonnull to directly eliminate an icmp (InstSimplify+ValueTracking), or add llvm.assume to trigger existing transforms.

Diff Detail

Repository
rL LLVM

Event Timeline

spatel updated this revision to Diff 81766.Dec 16 2016, 10:00 AM
spatel retitled this revision from to [InstCombine] try to extend nonnull-ness of arguments from a callsite back to its parent function.
spatel updated this object.
spatel added reviewers: hfinkel, mkuper, majnemer.
spatel added a subscriber: llvm-commits.
majnemer added inline comments.Dec 16 2016, 12:02 PM
lib/Transforms/InstCombine/InstCombineCalls.cpp
2779–2820 ↗(On Diff #81766)

Any chance this logic could live in llvm::isKnownNonNullAt

spatel added inline comments.
lib/Transforms/InstCombine/InstCombineCalls.cpp
2779–2820 ↗(On Diff #81766)

Hmm...yes, that would probably be better. Today's comments in the earlier linked llvm-dev thread suggest that this approach (adding attributes that enable existing analysis rather than doing more analysis) is not ideal, so we might want to abandon this entirely for simplify/combines that do the work directly.

Isn't this something that could be part of the "Deduce function attributes" pass?

Isn't this something that could be part of the "Deduce function attributes" pass?

I actually looked at that first, but this (nonnull) is a parameter attribute rather than a function attribute, so it didn't seem like the right place. If I misunderstood the boundary, please let me know.

efriedma added inline comments.
lib/Transforms/InstCombine/InstCombineCalls.cpp
2799 ↗(On Diff #81766)

I don't think this check is sufficient; consider something like this:

void g(int* NONNULL notnull_ptr);
void f(int *ptr, bool ptr_is_nonnull) {
  if (ptr_is_nonnull) g(ptr);
}

I think your check marks the "ptr" argument to f() as nonnull. This deduction gets propagated out to callers of f(), which then assume the pointer isn't null.

spatel added inline comments.Dec 16 2016, 5:10 PM
lib/Transforms/InstCombine/InstCombineCalls.cpp
2799 ↗(On Diff #81766)

Nice catch. Yes, that would be wrong. We can only do this if the callsite is in the entry block?

mkuper added inline comments.Dec 16 2016, 5:23 PM
lib/Transforms/InstCombine/InstCombineCalls.cpp
2799 ↗(On Diff #81766)

I'd say "if the callsite post-dominates the entry block", except that LLVM's notion of post-dominance isn't really strong enough to be useful, IIRC.

efriedma added inline comments.Dec 16 2016, 5:37 PM
lib/Transforms/InstCombine/InstCombineCalls.cpp
2799 ↗(On Diff #81766)

You need to prove that it would be undefined behavior if the argument were in fact null, i.e. the if the function is called, the callsite will be executed.

You can break it down like this: the callsite needs to dominate every exit from the function, and every instruction which might execute before the callsite in question needs to be isGuaranteedToTransferExecutionToSuccessor() . The first is true for every instruction in the entry block, but the second isn't.

Isn't this something that could be part of the "Deduce function attributes" pass?

I actually looked at that first, but this (nonnull) is a parameter attribute rather than a function attribute, so it didn't seem like the right place. If I misunderstood the boundary, please let me know.

This time I looked closer, and sure enough we have FunctionAttrs.cpp::addArgumentAttrs(). So there already exists a place where we could splice this in. However, it still seems mismatched in intent because that function is dealing with interprocedural transforms, while this patch is limited to analysis within a single function. If anyone thinks it's better handled there, I'm not opposed, but I haven't thought of a compelling reason for that yet. It seems logical to me to keep this next to the existing InstCombine nonnull transform since this transform can enable that one.

lib/Transforms/InstCombine/InstCombineCalls.cpp
2779–2820 ↗(On Diff #81766)

I drafted a patch to split some of the code over to isKnownNonNullAt(), but I don't think it belongs there. In this case, we're not asking if the value is known non-null at some point; we're asking if the value is known non-null throughout because of the context instruction. So it feels like a better fit for the plain isKnownNonNull() except it could be expensive to determine this nonnull-ness from there?

I'll post an updated patch with the changes suggested by @efriedma and @mkuper that hopefully makes it clearer.

spatel updated this revision to Diff 81847.Dec 17 2016, 10:39 AM

Patch updated:

  1. Add checks for callsite-in-entry-block + isGuaranteedToTransferExecutionToSuccessor().
  2. Add tests for those situations.
  3. Add comments to this transform and the existing nonnull transform to provide motivational/philosophical context as noted in the llvm-dev thread.
sanjoy added a subscriber: sanjoy.Dec 18 2016, 10:55 AM

Few random unsolicited comments inline.

lib/Transforms/InstCombine/InstCombineCalls.cpp
2784 ↗(On Diff #81847)

We pass CallSite by value.

2791 ↗(On Diff #81847)

When can these be false (i.e. when can we have a call site not in a basic block or inside a function)?

2801 ↗(On Diff #81847)

Maybe use range for here?

for (Instruction &I : make_range(BB->begin(), CS.getInstruction()->getIterator()))

at which point you can just std::all_of over the above.

2827 ↗(On Diff #81847)

I don't think this is necessary. You've proved that once you enter the caller, you are going to pass the said argument as a nonnull parameter to CS. This means that the incoming value is nonnull too, irrespective of where all it is used.

That is, instead of NonNullParams.push_back(V), you should be able to do:

if (auto *A = dyn_cast<Argument>(V))
  A->addAttr(NonNull);
spatel marked 2 inline comments as done.Dec 18 2016, 1:23 PM

Few random unsolicited comments inline.

Always appreciated!

lib/Transforms/InstCombine/InstCombineCalls.cpp
2791 ↗(On Diff #81847)

Honestly, I don't know. I saw some other code - that I'm not finding now - that scared me into thinking that it's not safe to assume that an instruction is always tied to a block/function. I can make it an assert if we believe it's safe.

2827 ↗(On Diff #81847)

We have to prove that the first use of the arg is a nonnull use, and so the 'dominates' check is needed. Let me know if I'm not understanding the suggestion.

The first test case was supposed to demonstrate this, but it doesn't fulfill that intent after the addition of the isGuaranteedToTransferExecutionToSuccessor() check. I can add a test case like below to show this possibility.

declare i8 @use1readonly(i8* %x) readonly ; readonly guarantees that execution continues to successor
declare void @use1nonnull(i8* nonnull %x);

define i8 @parent7(i8* %a) {
  %ret = call i8 @use1readonly(i8* %a) ; %a could be null in here
  call void @use1nonnull(i8* %a) ; guaranteed nonnull use, but can't extend nonnull to parent
  ret i8 %ret
}
silvas added a subscriber: silvas.

This transform seems like it is sensitive to the order in which functions are visited when running function passes, which I think we try to avoid. I.e. if the ordering is right, we can propagate across arbitrarily many transitive calls; but if the ordering is wrong (e.g. we visit callers before callees) then we don't be able to propagate as much.

You may want to check with Chandler about whether it makes sense to do this. Based on my understanding of the rules, this is too strong of a transformation (looks at too large of a scope) to be a function pass (the rules as far as I understand them are essentially "we should be able to conceptually process functions in parallel"; and if the order of visitation matters, then our output will suddenly be sensitive to thread execution order; I don't think we meet this bar today, but I think it is a useful guideline / thing to strive for).

spatel added inline comments.Dec 18 2016, 2:55 PM
lib/Transforms/InstCombine/InstCombineCalls.cpp
2827 ↗(On Diff #81847)

On 2nd thought, my reply makes no sense. :)
You're right - we're now guaranteeing that the call with nonnull must execute, so the arg must be nonnull for the parent.
I think we still need the extra test case to show this because the first test's call doesn't guarantee execution to successor.

spatel updated this revision to Diff 81896.Dec 18 2016, 3:05 PM

Patch updated based on Sanjoy's suggestions:

  1. Pass CallSite by value.
  2. Change parent null checks to assert.
  3. Use std::all_of()
  4. Simplify logic for propagation of null attribute.
  5. Add test case with 'readonly' call to show it is ok to propagate nonnull even if the call with nonnull is not first.

This transform seems like it is sensitive to the order in which functions are visited when running function passes, which I think we try to avoid. I.e. if the ordering is right, we can propagate across arbitrarily many transitive calls; but if the ordering is wrong (e.g. we visit callers before callees) then we don't be able to propagate as much.

You may want to check with Chandler about whether it makes sense to do this. Based on my understanding of the rules, this is too strong of a transformation (looks at too large of a scope) to be a function pass (the rules as far as I understand them are essentially "we should be able to conceptually process functions in parallel"; and if the order of visitation matters, then our output will suddenly be sensitive to thread execution order; I don't think we meet this bar today, but I think it is a useful guideline / thing to strive for).

This would be a good argument for Mehdi's suggestion? Ie, this should be in FunctionAttrs.cpp?

This would be a good argument for Mehdi's suggestion? Ie, this should be in FunctionAttrs.cpp?

Yes. I think that's the natural place to put this.

This would be a good argument for Mehdi's suggestion? Ie, this should be in FunctionAttrs.cpp?

Yes. I think that's the natural place to put this.

I have a draft of a patch to put this in FunctionAttrs instead of InstCombine which looks straightforward, but there's a problem:

define i1 @bar(i32* nonnull %x) {
  %ld = load i32, i32* %x
  %cmp = icmp eq i32 %ld, 42
  ret i1 %cmp
}

define i1 @foo(i32* %x) { ; we want this param marked 'nonnull'
  %d = call i1 @bar(i32* %x)
  %null_check = icmp eq i32* %x, null ; foo() always returns the result of bar()
  br i1 %null_check, label %t, label %f
t:
  ret i1 1
f:
  ret i1 %d
}

$ ./opt -functionattrs -inline -instcombine -simplifycfg isa12.ll -S
...

define i1 @foo(i32* nonnull readonly %x) #0 {
f:
  %ld.i = load i32, i32* %x, align 4
  %cmp.i = icmp eq i32 %ld.i, 42 ; success! no null check
  ret i1 %cmp.i
}

$ ./opt -O2 isa12.ll -S
...

define i1 @foo(i32* readonly %x) local_unnamed_addr #0 {
t:
  %ld.i = load i32, i32* %x, align 4
  %cmp.i = icmp eq i32 %ld.i, 42
  %null_check = icmp eq i32* %x, null ; FAIL
  %.d = or i1 %null_check, %cmp.i
  ret i1 %.d
}

$ ./opt -O2 -debug-pass=Arguments -S < /dev/null
...
Pass Arguments:
...
-inline -functionattrs

In the -O2 pipeline, -functionattrs is only called once (directly after -inline), so we hit the exact problem that I'm trying to avoid: the information provided by the callsite's arg attribute is gone. Is a change to the -O2 pipeline an acceptable follow-up to this patch?

spatel updated this revision to Diff 82030.Dec 19 2016, 4:01 PM
spatel retitled this revision from [InstCombine] try to extend nonnull-ness of arguments from a callsite back to its parent function to try to extend nonnull-ness of arguments from a callsite back to its parent function.

Patch updated:
Move the logic and test cases from InstCombine to FuncAttrs to avoid sensitivity to function visit ordering, but this exposes a different ordering problem: -functionattrs is currently run after -inline, so this doesn't appear to help the motivating case at all.

chandlerc edited edge metadata.Dec 19 2016, 4:11 PM

Patch updated:
Move the logic and test cases from InstCombine to FuncAttrs to avoid sensitivity to function visit ordering, but this exposes a different ordering problem: -functionattrs is currently run after -inline, so this doesn't appear to help the motivating case at all.

I don't think this is an issue of pass ordering. IMO, the issue is that we're losing information when inlining rather than preserving it. For example, your patch will catch cases where the called function *isn't* inlined. The patch is doing the correct thing, its just that you need two strategies here, not just one:

  1. Handle case where secondary call is inlined.
  2. Handle case where secondary call is not inlined.

Your patch handles #2 but not #1.

This is also somewhat abusing functionattrs IMO. You don't need this to be a function attribute at all.

I think the transformation this patch is trying to do is exactly what the "assume" framework tries to do -- you have one point in your code that makes an assumption valid (nonnull) and you want to use it later in that same code. The fact that in some cases you can promote this to an attribute on the argument is kind of incidental.

I think it would be better to teach valuetracking to look at whether a value is passed to an argument attributed 'nonnull' and/or teach the inliner to introduce an assume when inlining a function with a nonnull argument attribute to avoid losing that information.

If you want to keep the functionattrs change to try and propagate this across long distances of procedure calls, I'm all down with that. We should be able to do this kind of bottom-up inference. But it isn't really the right way to solve your proximate use case IMO, you should use that to solve more complex cases further down the road.

Patch updated:
Move the logic and test cases from InstCombine to FuncAttrs to avoid sensitivity to function visit ordering, but this exposes a different ordering problem: -functionattrs is currently run after -inline, so this doesn't appear to help the motivating case at all.

I don't think this is an issue of pass ordering. IMO, the issue is that we're losing information when inlining rather than preserving it. For example, your patch will catch cases where the called function *isn't* inlined. The patch is doing the correct thing, its just that you need two strategies here, not just one:

  1. Handle case where secondary call is inlined.
  2. Handle case where secondary call is not inlined.

    Your patch handles #2 but not #1.

    This is also somewhat abusing functionattrs IMO. You don't need this to be a function attribute at all.

    I think the transformation this patch is trying to do is exactly what the "assume" framework tries to do -- you have one point in your code that makes an assumption valid (nonnull) and you want to use it later in that same code. The fact that in some cases you can promote this to an attribute on the argument is kind of incidental.

    I think it would be better to teach valuetracking to look at whether a value is passed to an argument attributed 'nonnull' and/or teach the inliner to introduce an assume when inlining a function with a nonnull argument attribute to avoid losing that information.

    If you want to keep the functionattrs change to try and propagate this across long distances of procedure calls, I'm all down with that. We should be able to do this kind of bottom-up inference. But it isn't really the right way to solve your proximate use case IMO, you should use that to solve more complex cases further down the road.

Thanks for the guidance! I agree that this a special case, but I thought I'd try this one first for the learning experience. On that count at least, progress. :)

On the associated llvm-dev thread, we've discussed related solutions involving llvm.assume and ValueTracking, so yes, I'll see what I can do with those too. And the interplay of analysis and (abuse of) attributes was also nicely explained.

Hopefully, I'll pick this back up in a couple of weeks, but if anyone else is interested (dyn_cast!), feel free to push ahead.

spatel added inline comments.Dec 19 2016, 5:24 PM
lib/Transforms/IPO/FunctionAttrs.cpp
586–587 ↗(On Diff #82030)

Side note for reference: this gets in the way of any hoped-for change with the dyn_cast example. Those functions (there are ~25 functions to inline underneath one call to dyn_cast) are marked "linkonce_odr", so we don't even consider them with this patch.

I've spent some more time looking at the dyn_cast motivating case, and it seems we have multiple analysis and transform holes; it's going to take several fixes to correctly propagate and act on the nonnull info.

Related bugs:
https://llvm.org/bugs/show_bug.cgi?id=31512
https://llvm.org/bugs/show_bug.cgi?id=31518

If there isn't moral objection to this patch, I'd like to get this in just so I can cross it off the list of potential places where we might lose the nonnull. If the consensus is that this crosses the line into attribute abuse, I'll abandon and keep chasing the other leads.

Isn't this something that could be part of the "Deduce function attributes" pass?

I actually looked at that first, but this (nonnull) is a parameter attribute rather than a function attribute, so it didn't seem like the right place. If I misunderstood the boundary, please let me know.

This time I looked closer, and sure enough we have FunctionAttrs.cpp::addArgumentAttrs(). So there already exists a place where we could splice this in. However, it still seems mismatched in intent because that function is dealing with interprocedural transforms, while this patch is limited to analysis within a single function.

It seems that FunctionAttrs.cpp::addArgumentAttrs() is doing the same level of thing:

Patch updated:
Move the logic and test cases from InstCombine to FuncAttrs to avoid sensitivity to function visit ordering, but this exposes a different ordering problem: -functionattrs is currently run after -inline, so this doesn't appear to help the motivating case at all.

I don't think this is an issue of pass ordering. IMO, the issue is that we're losing information when inlining rather than preserving it. For example, your patch will catch cases where the called function *isn't* inlined. The patch is doing the correct thing, its just that you need two strategies here, not just one:

  1. Handle case where secondary call is inlined.
  2. Handle case where secondary call is not inlined.

    Your patch handles #2 but not #1.

    This is also somewhat abusing functionattrs IMO. You don't need this to be a function attribute at all.

Uh? How do you handle 2) without an attribute? It seems fundamentally that we want to do bottom-up inter-procedural propagation here, which is what attributes are for AFAIK.

Are you saying that relying on the inter-procedural analysis for the purpose of intra-procedural analysis is not "the right thing to do"? I'd agree, but that does not invalidate the use case of inter-procedural though, right?

I think it would be better to teach valuetracking to look at whether a value is passed to an argument attributed 'nonnull' and/or teach the inliner to introduce an assume when inlining a function with a nonnull argument attribute to avoid losing that information.

That seems complementary to the "infer attribute approach", or I missed something?

If you want to keep the functionattrs change to try and propagate this across long distances of procedure calls, I'm all down with that. We should be able to do this kind of bottom-up inference. But it isn't really the right way to solve your proximate use case IMO, you should use that to solve more complex cases further down the road.

I think what you're saying here is matching my comments above, but I was not totally sure to parse it correctly.

I'll try and get a proper response to this later, but I wanted to very quickly mention that I'm somewhat opposed to us doing *any* stronger optimization on non-null until we teach Clang to strip off that attribute from memcpy, memmove, and memset.

Several versions of glibc have unfortunately added this attribute. There have already been *several* critical security vulnerabilities from optimizing based on on the attribute because code was never written to avoid a null pointer in the case where the *size was zero*.

I really want these optimizations on nonnull to go in, but I'd like to avoid having a (large) window of time where Clang will "miscompile" code using memcpy and friends in this way.

I've spoken with Richard Smith and he's going to send an email to cfe-dev about handling this in Clang. I'm also writing a paper for the C++ committee to standardize on somewhat more sane handling here. But I think we should at least defend users against the known misuses of this attribute and *then* start optimizing it harder. Are folks OK with that?

spatel added a comment.Jan 3 2017, 2:29 PM

I've spoken with Richard Smith and he's going to send an email to cfe-dev about handling this in Clang. I'm also writing a paper for the C++ committee to standardize on somewhat more sane handling here. But I think we should at least defend users against the known misuses of this attribute and *then* start optimizing it harder. Are folks OK with that?

That sounds ok to me. We could add a patch dependency here in phab to this, D28204, D27114 (any others?) once the clang patch is posted.
Given how many places drop the nonnull metadata, I'm surprised we ever get any nullptr optimizations. :)

Sounds to me like a good motivation indeed to wait for the change in clang first!

reames edited edge metadata.Jan 3 2017, 9:19 PM

Minor comments inline. I'm going to defer to Chandler on the clang specific implications here. This will strictly benefit my frontend and I'd like to see it land.

lib/Transforms/IPO/FunctionAttrs.cpp
537 ↗(On Diff #82030)

I know this was request earlier in the review thread, but there are lots of cases where inlining drops information available at the call site. This isn't specific to this case. Doesn't hurt to call it out, but the wording seems to imply this is unique to this inference which it really isn't.

540 ↗(On Diff #82030)

minor: call this Changed for consistency with the rest of the code.

547 ↗(On Diff #82030)

Scanning only the entry block is perfectly reasonable for an initial patch, but could be easily extended, even without dominator information. walking forwards along single target branches or backwards from return points along single predecessor branches would give a more robust analysis.

(I would prefer this be done as a separate patch to make it easier to review.)

spatel updated this revision to Diff 83057.Jan 4 2017, 7:59 AM
spatel edited edge metadata.
spatel marked 3 inline comments as done.

Patch updated:

  1. Improved code comments and variable name as suggested by Philip.
  2. Updated the last test to include 'nounwind' because the implementation of isGuaranteedToTransferExecutionToSuccessor() changed with rL290794.

Ping.

I don't follow cfe closely, but I don't see any cfe-dev posts about the libc work-around since Jan 4. Any updates or progress?

Aren't we blocked on clang? Or did it already stop emitting the null attr for memcpy and cie.

spatel added a subscriber: luqmana.Feb 9 2017, 6:04 AM

Aren't we blocked on clang? Or did it already stop emitting the null attr for memcpy and cie.

That's what I'd like to know. There was no reply to my question here or the pings in D27114. I'm hoping to hear that the clang blocker is in someone's queue?
Apart from that, if there is any other feedback for this patch itself, please let me know. Thanks!

That's what I'd like to know. There was no reply to my question here or the pings in D27114. I'm hoping to hear that the clang blocker is in someone's queue?

I think the blocker is real: we need clang to not annotate a few libc function before optimizing aggressively. At least this is my understanding of the email thread on clang-dev linked above.
(that does not mean that anyone is actively working on it at the moment)

reames accepted this revision.Feb 10 2017, 9:45 AM

So as to unblock this review, I suggest that we introduce this change behind an off by default flag. Separately, we should pursue the clang side changes need to enable this flag by default. As stated previously, this will benefit my frontend and I'd like to see it land so that I can enable it.

Given the above, LGTM w/the off by default flag and minor comments addressed.

lib/Transforms/IPO/FunctionAttrs.cpp
550 ↗(On Diff #83057)

minor: Should handle an invoke terminator as well.

This revision is now accepted and ready to land.Feb 10 2017, 9:45 AM

So as to unblock this review, I suggest that we introduce this change behind an off by default flag. Separately, we should pursue the clang side changes need to enable this flag by default. As stated previously, this will benefit my frontend and I'd like to see it land so that I can enable it.

Sounds like a good plan!

spatel updated this revision to Diff 88228.Feb 13 2017, 11:04 AM

Patch updated:

  1. Hide the transform behind a disabled-by-default hidden flag.
  2. Use CallSite so the transform works for invokes as well as calls.
  3. Add a test case with invoke to show that works.
This revision was automatically updated to reflect the committed changes.