This is an archive of the discontinued LLVM Phabricator instance.

[LSR] Don't force bases of foldable formulae to the final type.
ClosedPublic

Authored by ebevhan on Jan 16 2018, 7:01 AM.

Details

Summary

Before emitting code for scaled registers, we prevent
SCEVExpander from hoisting any scaled addressing mode
by emitting all the bases first. However, these bases
are being forced to the final type, resulting in some
odd code.

For example, if the type of the base is an integer and
the final type is a pointer, we will emit an inttoptr
for the base, a ptrtoint for the scale, and then a
'reverse' GEP where the GEP pointer is actually the base
integer and the index is the pointer. It's more intuitive
to use the pointer as a pointer and the integer as index.

Diff Detail

Event Timeline

ebevhan created this revision.Jan 16 2018, 7:01 AM

Hi,

Could you upload the patch with the full context?

I'd like to check the type is consistent with the surrounding code.

Cheers,
-Quentin

ebevhan updated this revision to Diff 130837.Jan 22 2018, 12:05 AM

Included the complete context. Sorry for missing that!

qcolombet added inline comments.Jan 26 2018, 3:28 PM
lib/Transforms/Scalar/LoopStrengthReduce.cpp
4996

I don't think this is what we want, FullV is supposed to have the same type as F.

That said, I would have expected your test case to have been recognized by the check line 4948.

Could you check why this is not happening?

Out-of-curiosity do you actually see codegen differences with that patch?

lib/Transforms/Scalar/LoopStrengthReduce.cpp
5012

I believe we would have the same problem here.

5022

And here.

Basically, I am fine either way (with or without the cast), but we would need to be consistent.

Out-of-curiosity do you actually see codegen differences with that patch?

The only difference in the test case is that the base and scale registers on the memory operation are swapped.

In our downstream target we see fairly large differences because our pointers and integers go in different registers, so it's inefficient to do a GEP like this where the base and scale are 'swapped'.

lib/Transforms/Scalar/LoopStrengthReduce.cpp
4996

In the test case in this patch (and it's probably the same for the local test case I found this in originally), the check on 4948 does trigger. Ty is i64 and OpTy is i8*. Since the effective SCEV type for both of these is i64, Ty is set to i8* and the base (which is originally an integer) is converted into a pointer when it's expanded.

qcolombet accepted this revision.Jan 29 2018, 4:15 PM

Alright, LGTM then.
Please also patches the two other places I mentioned assuming you can have a test case for them.

lib/Transforms/Scalar/LoopStrengthReduce.cpp
4996

Thanks for double checking.

This revision is now accepted and ready to land.Jan 29 2018, 4:15 PM

I replaced the other two instances of Ty with nullptr, and only observed a difference in one AMDGPU test case (preserve-addrspace-assert.ll), likely caused by the change on line 5022. Before, GEPs for double* and i32* were produced, but with the patch, GEPs for the containing struct are produced instead. This is probably because the expander was being forced to emit the pre-offset expressions as the final types (which were double* and i32*) but with the patch it will emit GEPs that take the struct pointer instead.

It's not really clear if this is better or not, and I can't really say that I have a test case that represents this change specifically. Should I upload a diff with the fixes to the test case or leave this patch as is?

Let's leave the patch as is.

ebevhan updated this revision to Diff 132175.Jan 31 2018, 7:49 AM

Rebased. Another test case was affected, but as far as I can tell, the only difference is the elimination of two bitcasts.

This revision was automatically updated to reflect the committed changes.