This is an archive of the discontinued LLVM Phabricator instance.

[GlobalISel] Add legalization support for non-power-2 loads and stores
ClosedPublic

Authored by aemerson on Mar 28 2019, 4:58 PM.

Details

Summary

Legalize things like i24 load/store by splitting them into smaller power of 2 operations.

This change also adds an artifact combiner for G_INSERT -> G_EXTRACT where we can just forward the inserted value straight to the user of the extract. To do this I had to add G_INSERT to the list of artifact opcodes, and then to fix test failures where the legalization order meant that we need to try to combine them away before they're legalized, so that we don't have to look through G_TRUNC ops in between.

Diff Detail

Event Timeline

aemerson created this revision.Mar 28 2019, 4:58 PM

@arsenm Matt there are changes to the AMDGPU tests, but I'm not sure if they're ok or not. The tests run with -global-isel-abort=0 and the change of adding G_INSERT to the artifacts list means that it doesn't get legalized the same way (as an artifact, it doesn't get pushed onto the legalization list so it's deferred until later). Does it look benign?

@arsenm Matt there are changes to the AMDGPU tests, but I'm not sure if they're ok or not. The tests run with -global-isel-abort=0 and the change of adding G_INSERT to the artifacts list means that it doesn't get legalized the same way (as an artifact, it doesn't get pushed onto the legalization list so it's deferred until later). Does it look benign?

The -global-isel-abort=0 is a workaround for the broken artifact handling. I didn't have any particular expectations for these other than not crashing

llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
1379–1389

I think this breaks the alignment. You should use MF.getMachineMemOperand(MMO, Size, Offset)

arsenm added inline comments.Mar 28 2019, 8:03 PM
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
1379–1389

I got the order wrong, it's:

MachineMemOperand *getMachineMemOperand(const MachineMemOperand *MMO,
                                          int64_t Offset, uint64_t Size);
aemerson updated this revision to Diff 192779.Mar 28 2019, 10:48 PM

Fix alignment for smaller MMO. I couldn't use your suggested getMachineMemOperand() method because it doesn't update the alignment it seems.

aemerson abandoned this revision.Mar 29 2019, 11:17 AM

Actually I've realized now there's two problems with this approach:

  1. This doesn't work when we try to legalize other non power of 2 operations. The artifacts for arithmetic ops for example will try to use G_ANYEXT to legalize their source, but if loads are producing the type with G_INSERTS, it's difficult to combine them away.
  2. It's not correct when alignment is smaller than the largest sub-memop. E.g. if the i24 load had an alignment of 1. In this case, we have no choice but to split the op into the alignment sized components and use something like G_MERGE on the resulting values.

I'm instead going to work on first adding the legalization support for the pessimistic case of smaller alignment memops, and then deal with the common case.

Fix alignment for smaller MMO. I couldn't use your suggested getMachineMemOperand() method because it doesn't update the alignment it seems.

Are you confusing the alignment and base alignment? I had this problem last time I touched this. getMachineMemOperand should do the right thing

Fix alignment for smaller MMO. I couldn't use your suggested getMachineMemOperand() method because it doesn't update the alignment it seems.

Are you confusing the alignment and base alignment? I had this problem last time I touched this. getMachineMemOperand should do the right thing

Ah probably. When I changed to using the getMachineMemOperand() function the test still passed, I was expecting the MMO printer to instead print "align 2". It seems it prints the base pointer alignment, not the alignment of the access.

Fix alignment for smaller MMO. I couldn't use your suggested getMachineMemOperand() method because it doesn't update the alignment it seems.

Are you confusing the alignment and base alignment? I had this problem last time I touched this. getMachineMemOperand should do the right thing

aemerson reclaimed this revision.Apr 15 2019, 3:14 PM

Reviving this as the overall approach was fine, it seems the alignment of non pow2 types is assumed to be the alignment of the next largest pow-2 type, so we don't need to worry about alignment during the breakdown.

I did however change the legalization method to not use extracts/inserts, but instead use extending loads and truncating stores, so that the artifacts get combined away and it Just Works.

aemerson updated this revision to Diff 195260.Apr 15 2019, 3:16 PM

New and improved patch.

Reviving this as the overall approach was fine, it seems the alignment of non pow2 types is assumed to be the alignment of the next largest pow-2 type, so we don't need to worry about alignment during the breakdown.

I did however change the legalization method to not use extracts/inserts, but instead use extending loads and truncating stores, so that the artifacts get combined away and it Just Works.

I don't like how we do everything in bits, and then the mem operand forces bytes. Would it cost anything to switch MemOperands to also be in bits?

arsenm added inline comments.Apr 16 2019, 12:36 AM
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
1379–1389

This is still using the new function?

aemerson marked an inline comment as done.Apr 16 2019, 10:28 AM

Reviving this as the overall approach was fine, it seems the alignment of non pow2 types is assumed to be the alignment of the next largest pow-2 type, so we don't need to worry about alignment during the breakdown.

I did however change the legalization method to not use extracts/inserts, but instead use extending loads and truncating stores, so that the artifacts get combined away and it Just Works.

I don't like how we do everything in bits, and then the mem operand forces bytes. Would it cost anything to switch MemOperands to also be in bits?

Yes I think we could introduce a getSizeInBits() accessor to make it clearer, but I think it should be a separate patch from this.

llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
1379–1389

It should do, I forgot to make that change in the latest patch.

aemerson updated this revision to Diff 195412.Apr 16 2019, 10:40 AM

Use MF.getMachineMemOperand()

Reviving this as the overall approach was fine, it seems the alignment of non pow2 types is assumed to be the alignment of the next largest pow-2 type, so we don't need to worry about alignment during the breakdown.

I did however change the legalization method to not use extracts/inserts, but instead use extending loads and truncating stores, so that the artifacts get combined away and it Just Works.

I don't like how we do everything in bits, and then the mem operand forces bytes. Would it cost anything to switch MemOperands to also be in bits?

I just had at doing this and some places still work in terms of bytes, not bits. I don't think it's worth it to change the internal representation, the getSizeInBits() should help anyway.

arsenm added inline comments.Apr 17 2019, 2:41 AM
llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
1379–1384

Why can't uses use getMachineMemOperand for both halves from the original?

aemerson updated this revision to Diff 195597.Apr 17 2019, 10:26 AM

Yep, in that case might as well remove the entire helper.

arsenm accepted this revision.Apr 17 2019, 12:04 PM

LGTM

This revision is now accepted and ready to land.Apr 17 2019, 12:04 PM
This revision was automatically updated to reflect the committed changes.