This is an archive of the discontinued LLVM Phabricator instance.

Remove unnecessary variable indexing into single-element arrays
ClosedPublic

Authored by hfinkel on Feb 19 2015, 4:06 PM.

Details

Summary

This patch is intended to address PR22629. To copy from the bug report:

[from the bug report]

Consider this code:

int f(int x) {
  int a[] = {12};
  return a[x];
}

GCC knows to optimize this to

movl     $12, %eax
ret

The code generated by recent Clang at -O3 is:

movslq   %edi, %rax
movl     .L_ZZ1fiE1a(,%rax,4), %eax
retq

.L_ZZ1fiE1a:
  .long    12                      # 0xc

[end from the bug report]

This definitely seems worth fixing. I've also seen this kind of code before (as the base case of generic vector wrapper templates with one element).

The general idea is to look at the GEP feeding a load or a store, which has some variable as its first non-zero index, and determine if that index must be zero (or else an out-of-bounds access would occur). We can do this for allocas and globals with constant initializers where we know the maximum size of the underlying object. When we find such a GEP, we create a new one for the memory access with that first variable index replaced with a constant zero.

Even if we can't eliminate the memory access (and sometimes we can't), it is still useful because it removes unnecessary indexing calculations.

Diff Detail

Event Timeline

hfinkel updated this revision to Diff 20351.Feb 19 2015, 4:06 PM
hfinkel retitled this revision from to Remove unnecessary variable indexing into single-element arrays.
hfinkel updated this object.
hfinkel edited the test plan for this revision. (Show Details)
hfinkel added a subscriber: Unknown Object (MLST).
chandlerc added inline comments.Feb 19 2015, 5:20 PM
lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
480

Why bother? We strip when popping.

509–510

We don't return true here though, we keep searching. I would state this inverted. "If we find an object that isn't less then MaxSize, we're done, bail." or some such.

520

What if this multiply wraps? I know, I know, it won't but it could in theory. I would write this in some way that no one has to think about that. Maybe just use the APInt we get from the array size?

541–550

Merge the comment blocks? Maybe make them doxygen?

I think it would be nice to generalize this to fold *any* GEP index which would trigger UB if > zero to zero. Maybe just leave a FIXME if doing this is hard? You could maybe put it better down below to not bail at the first non-zero constant index, but to keep going to the first non-constant index.

This is only sound for inbounds GEPs, no? I would probably restrict it to them, see below.

554

What GEP has fewer than 2 operands but passes the verifier?

588–591

You don't need to do this. If the GEP is inbounds, than all intermediate pointer values must be inbounds, so you can skip checking the tail once you have observed badness. However, see my comment below.

Also, the implementation checks for non-negative numbers, not positive numbers.

665–667

Gah. Almost. But not quite.

We have to be able to form one-past-the-end of the object. So if all the subsequent pointers are 0, the one that strides the object could be 1 instead of 0.

I think we need to look for a *positive* tail (not just non-negative) of the GEP, or we need to look at the users and fold if they would be UB for a one-past-the-end GEP. Essentially, as long as we're post-dominated or only used by a load or store, we're golden.

850–854

Fold this into the helper so that it returns the rewritten GEP?

856

As long as you use the instruction combiner's IR builder, this isn't necessary.

hfinkel added inline comments.Feb 19 2015, 6:24 PM
lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
480

Good point.

509–510

Ack. Code is new and already an out-of-sync comment ;)

520

Will do.

541–550

Merge the comment blocks? Maybe make them doxygen?

Merged.

I think it would be nice to generalize this to fold *any* GEP index which would trigger UB if > zero to zero. Maybe just leave a FIXME if doing this is hard? You could maybe put it better down below to not bail at the first non-zero constant index, but to keep going to the first non-constant index.

I'll add a FIXME (for both). I need to think about how to do that. For the second point, yes, although we'd also need to keep track of the offsets those indices implied from the start of the object.

This is only sound for inbounds GEPs, no? I would probably restrict it to them, see below.

Ah, yes, if there are other indices then, even if positive, they could wrap.

554

Good question. But SimplifyGEPInst in lib/Analysis/InstructionSimplify.cpp has:

// getelementptr P -> P.
if (Ops.size() == 1)
  return Ops[0];

So, to be safe...

588–591

You don't need to do this. If the GEP is inbounds, than all intermediate pointer values must be inbounds, so you can skip checking the tail once you have observed badness. However, see my comment below.

Right, but one past the end still works, and then we could use a negative offset of some contained array to get back inside the object.

Also, the implementation checks for non-negative numbers, not positive numbers.

Indeed; Positive -> NonNegative

665–667

No, I think this is fine. We do need to be able to form one-past-the-end, but we don't need to load from it. Here I know we're loading from it, so it can't be the one-past-the-end pointer.

This is why I clone the GEP and then change it, so only the load or store gets the rewritten GEP, not any other users of the original pointer.

850–854

Sure.

856

Right, but I'm not.

hfinkel updated this revision to Diff 20363.Feb 19 2015, 6:26 PM

Updates in response to Chandler's review.

chandlerc accepted this revision.Feb 19 2015, 6:39 PM
chandlerc edited edge metadata.

Generally looks fine to commit whenever. Mostly small bits below.

lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
474

Add a FIXME to hoist this into Analysis or somewhere it can be re-used? We should make it widely available. This is a great pattern to detect.

592–595

Maybe too clever, but what do you think about bailing the first time you see a positive index as well since you're checking for inbounds? You could make the predicate "canBacktrackFromPastEnd" or some such. As soon as you see a potentially negative index return true, as soon as you see a definitely positive index, return false, if you exhaust the indices return false.

596

nit: no need for explicit return type. You can make it more explicit this is a predicate with an 'is' or can' name prefix...

665–667

Totally missed that you were carefully only transforming a load or store's GEP. Yea, we're on the same page now.

856

Could you? I find that pattern cleaner. We pass IR builders into helper functions regularly.

This revision is now accepted and ready to land.Feb 19 2015, 6:39 PM
hfinkel added inline comments.Feb 19 2015, 6:58 PM
lib/Transforms/InstCombine/InstCombineLoadStoreAlloca.cpp
474

Sure.

592–595

I'm not sure that works. Even after a positive index, we could have a "larger" negative index, and we could still keep everthing inbounds.

596

Sure.

856

Could? Yes. But I'm not sure that's better (if I use the builder interface, I still need to worry about subsequently copying over the inbounds flag -- but only if the builder did not fold the GEP to something else anyway, in which case I need to be careful that it is not folded to some other GEP that used to be related to the original GEP's first operand, etc.).

hfinkel closed this revision.Feb 19 2015, 7:08 PM

r229959, thanks!