This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU/GlobalISel: Stop using NarrowScalar/FewerElements for unaligned splitting
ClosedPublic

Authored by arsenm on Aug 2 2021, 6:30 AM.

Details

Summary

The NarrowScalar and FewerElements actions should only be used for
adjusting the register types (and the memory type as needed to satisfy
the register type). Unaligned accesses should be split as a type of lowering.

This improves the generated code in many cases, since we now produce
zextloads instead of separate loads followed by ands. The load/store
legality rules still seem far more complicated than necessary, though.
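
As a rough illustration of what splitting an unaligned access as a lowering means, here is a minimal standalone C++ sketch. It is not code from this patch and does not use any LLVM API: the helper names `zextLoad8` and `loadU32Unaligned` are hypothetical, and the choice of byte-sized pieces and little-endian order is an assumption made for the example. Each piece is loaded already zero-extended, so no separate and is needed to clear the high bits before the pieces are combined.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper: a single zero-extending byte load. The widened
// result already has its upper 24 bits clear, so no mask is required.
inline std::uint32_t zextLoad8(const unsigned char *p) {
  return static_cast<std::uint32_t>(*p);
}

// Hypothetical helper: read a 32-bit little-endian value from an address
// with no alignment guarantee by splitting it into four byte loads and
// recombining them with shifts and ors.
inline std::uint32_t loadU32Unaligned(const unsigned char *p) {
  std::uint32_t result = 0;
  for (std::size_t i = 0; i < 4; ++i)
    result |= zextLoad8(p + i) << (8 * i);
  return result;
}
```

The register type (32 bits) stays the same throughout; only the memory access is broken up, which is the distinction the summary draws between lowering and the type-adjusting actions.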

Diff Detail

Unit Tests: Failed

Event Timeline

arsenm created this revision. Aug 2 2021, 6:30 AM
arsenm requested review of this revision. Aug 2 2021, 6:30 AM
foad added a comment. Aug 2 2021, 6:55 AM

Looks reasonable to me.

llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp, line 1250

Is this case still required for some reason, even though you've removed the corresponding code from the scalar case above?

arsenm added inline comments. Aug 2 2021, 7:39 AM
llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp, line 1250

This one you can technically remove, but it results in worse code since it hits the full scalarization path below.

arsenm updated this revision to Diff 395294. Dec 18 2021, 8:13 AM

Rebase and remove another manual check.

foad accepted this revision. Dec 20 2021, 8:08 AM
This revision is now accepted and ready to land. Dec 20 2021, 8:08 AM