Page MenuHomePhabricator

[instcombine] Generalize one-compare rule from foldAllocaCmp for noalias calls

Authored by reames on Feb 22 2022, 5:04 PM.



This change generalizes the handling for a potentially capturing comparison of an unescaped alloca to handle noalias calls as well. In the process, it also generalizes the existing alloca handling to reuse the generic capture tracking logic.

Doing this requires reverting 43d7e1c at the same time, and thus fixing If we add nocapture argument support without reverting that change, we take a currently mostly silent miscompile and risk increasing it's frequency.

Diff Detail

Event Timeline

reames created this revision.Feb 22 2022, 5:04 PM
reames requested review of this revision.Feb 22 2022, 5:04 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 22 2022, 5:04 PM
reames planned changes to this revision.Feb 22 2022, 5:12 PM

Uploaded wrong patch version. Fixing...

reames updated this revision to Diff 410685.Feb 22 2022, 5:34 PM

Corrected patch. Sorry for the noise.

reames added inline comments.Feb 22 2022, 5:44 PM
302 ↗(On Diff #410685)

To help others avoid the same confusion I had: This case isn't being caught by foldAllocCmp. We have another transform in visitAllocSite which is doing escape analysis and removing the allocation in one go - and thus doing consistent folding.

I should probably add a non-capturing use to this test.

nikic added a comment.Feb 23 2022, 1:07 AM

This looks basically right to me.


Can you please add an explicit check here that this is an icmp user? I guess you omitted the check on the premise that you have a known icmp user, so if you see a single capture it must be that icmp. That may be true, but the fact that we also have certain non-capturing icmps makes the reasoning here very non-obvious.


malloc(4) + 1 == 1 would be a hidden null check. Though from a quick look, we would not recognize malloc(4) + 1 as non-zero.

malloc(4) < 4 would be another possibility, making use of malloc alignment guarantees to perform a hidden null pointer check.

Not sure what to do about these, it's rather contrived...

nikic added inline comments.Feb 23 2022, 2:22 AM

Or rather, we should check that this user it the same one we're trying to fold.

And it may be worth pointing out that this critically depends on the fact that captures are per-use, not per-user, so if we do something like alloc + x == alloc + y, then we'll end up not folding because we have two capturing uses in the same icmp.

reames added inline comments.Feb 23 2022, 7:52 AM

I'd tried to clarify the later point in the comment, but I will expand.


Oh, yuck. I'd completely missed this possibility.

I see two variants here, and they're a bit different.

  1. If we ptrtoint the allocation (to do the comparison), then that's a capture, and we're fine.
  1. If we do inttoptr of the RHS instead, that's essentially a *guess* of the address (i.e. it's null), but the logic about heap layouts doesn't apply because we can't just decide the allocation didn't fail (since some later use might fault.)

I'm going to add some test cases of this variety, then restrict the transform to the non-null allocation case OR when directly comparing a non-offset allocation.

Once that lands, I'll revisit and see if I can do something stronger.

nlopes added a subscriber: nlopes.Feb 24 2022, 11:12 AM
reames planned changes to this revision.Feb 24 2022, 3:57 PM
reames abandoned this revision.Oct 24 2022, 9:44 AM

Closing review I'm not currently working on, and am unlikely to get back to in near future. Will reopen if priorities change.

Herald added a project: Restricted Project. · View Herald TranscriptOct 24 2022, 9:44 AM