This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Better selection of base offset when merging DS reads/writes
ClosedPublic

Authored by foad on Feb 10 2021, 7:25 AM.

Details

Summary

When merging a pair of DS reads or writes needs to materialize the base
offset in a vgpr, choose a value that is aligned to as high a power of
two as possible. This maximises the chance that different pairs can use
the same base offset, in which case the base offset registers can be
commoned up by MachineCSE.

Diff Detail

Event Timeline

foad created this revision.Feb 10 2021, 7:25 AM
foad requested review of this revision.Feb 10 2021, 7:25 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 10 2021, 7:25 AM
arsenm added inline comments.Feb 10 2021, 8:12 AM
llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
802

Should probably use uint32_t throughout here

805

const

814–816

This bit magic is a bit fancy. Can you extract it to a separate function

rampitec accepted this revision.Feb 10 2021, 10:01 AM

Nice, thank you! LGTM with Matt's comments addressed.

This revision is now accepted and ready to land.Feb 10 2021, 10:01 AM
foad updated this revision to Diff 323004.Feb 11 2021, 7:32 AM

Split out mostAlignedValueInRange.

foad marked 2 inline comments as done.Feb 11 2021, 7:33 AM
foad added inline comments.
llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
802

I'd prefer not to, since the rest of the file is pretty consistent in using unsigned throughout.

arsenm added inline comments.Feb 11 2021, 7:47 AM
llvm/lib/Target/AMDGPU/SILoadStoreOptimizer.cpp
802

LLVM is pretty consistently wrong in using unsigned for 32-bit values when it's not guaranteed by the standard

foad updated this revision to Diff 323030.Feb 11 2021, 8:22 AM

Use uint32_t.

arsenm accepted this revision.Feb 11 2021, 8:54 AM
This revision was landed with ongoing or failed builds.Feb 11 2021, 9:46 AM
This revision was automatically updated to reflect the committed changes.