This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine][SelectionDAG][AArch64] fold gep into select to enable speculation of load
AbandonedPublic

Authored by labrinea on Nov 6 2018, 11:40 AM.

Details

Summary

The motivation here is to reduce the stall cycles spent when a load is waiting for its address to become available. I am using InstCombine to fold GEP into Select when it seems profitable. This can enable speculation of loads, which is already happening in InstCombine. The DAGCombiner tries to revert the folding of load into select at the moment.

I am planning to break this patch down to smaller pieces, but first I wanted to show what I am trying to achieve and hopefully get some feedback. Another place this optimization might fit is codegenprepare. I think implementing it at the backend is not an option as it's too late to know whether we can safely speculate the loads.

The codegen test below shows the motivating example. The load doesn't have to wait for the select.
Before:

add x8, x20, #8
add x9, x20, #4
cmp w0, w19
csel x8, x8, x9, gt
ldr w0, [x8]

After:

ldp w8, w9, [x20, #4]
cmp w0, w19
csel w0, w9, w8, gt

This change improves an internal benchmark approximately by 4%.

Diff Detail

Event Timeline

labrinea created this revision.Nov 6 2018, 11:40 AM

This looks like a bunch of separate changes which should be split into multiple patches. Especially the changes to DAGCombine and InstCombiner::visitZExt .

test/Transforms/InstCombine/gep-select.ll
29

I'm pretty sure this isn't safe, in general. The "inbounds" marker only guarantees that the arithmetic is in bounds; it doesn't make any guarantees about the type of the pointer.

This looks like a bunch of separate changes which should be split into multiple patches. Especially the changes to DAGCombine and InstCombiner::visitZExt .

Sure, I've explained in the description why I put everything together in this revision.

test/Transforms/InstCombine/gep-select.ll
29

My comment was not referring to the inbounds marker. I am actually looking whether the indices derived from the select fall into the boundaries of a static type. In this example the last index of the geps ( i32 1 and i32 2 respectively) indicate valid offsets of struct members (the last two i32 members of struct.A).

efriedma added inline comments.Nov 6 2018, 2:35 PM
test/Transforms/InstCombine/gep-select.ll
29

The static type of a pointer or GEP can't be used to prove anything about the memory it points to. In C, accessing a member does imply the base pointer points to a valid struct, but we don't record that anywhere in IR. (IIRC there was some discussion of trying to add metadata to model this, but it never went anywhere.)

john.brawn resigned from this revision.May 12 2020, 6:46 AM
labrinea abandoned this revision.Jan 28 2022, 3:12 AM