This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Split large offsets when selecting global saddr mode
ClosedPublic

Authored by arsenm on Nov 11 2020, 4:21 PM.

Details

Summary

When the offset doesn't fit in the immediate field, move some to
voffset.

Diff Detail

Event Timeline

arsenm created this revision.Nov 11 2020, 4:21 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 11 2020, 4:21 PM
arsenm requested review of this revision.Nov 11 2020, 4:21 PM
rampitec added inline comments.Nov 11 2020, 4:48 PM
llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
1830–1831

This todo is not relevant anymore.

1836

If you have several consequtive loads this logic will result in reinitialization of vaddr before each load. You could probably mask few low bits in MaxOffset for this purpose to create a window of instructions which can use the same base registers.

arsenm added inline comments.Nov 11 2020, 5:14 PM
llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
1836

I copied this from the new splitting logic in 760af7a0743278b6dd7782b177f4d6d086c726e0.

This won't be re-initialized since in the common case the common base part will CSE

rampitec added inline comments.Nov 11 2020, 5:30 PM
llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
1836

Yeah, I understand it is a copy. I had the same sentiment while looking at it earlier.

If you have a same base it will be resused. But then if you have an address different by 4 bytes these 4 bytes will go to register and immediate will be MaxOffset, right?

Also since this is a copy it makes sense to factor it into an utility function.

foad added inline comments.Nov 12 2020, 6:41 AM
llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
1834

I would add && COffsetVal >= 0. I don't think there's any point trying this for large negative offsets, because it will always fail at the isUint<32> check below. That will simplify the code because you don't need to bother with the "signed division by a power of two" thing.

arsenm updated this revision to Diff 304845.Nov 12 2020, 8:03 AM

Avoid duplication

foad added inline comments.Nov 12 2020, 9:06 AM
llvm/lib/Target/AMDGPU/AMDGPUInstructionSelector.cpp
3503

Don't bother if ConstOffset is negative.

3510–3512

This can be a bit simpler if you know ConstOffset isn't negative, e.g. just use a right shift instead of a divide.

arsenm updated this revision to Diff 304898.Nov 12 2020, 10:45 AM

Share globalisel code and restructure to avoid bug in future patch

rampitec accepted this revision.Nov 12 2020, 10:51 AM

LGTM, although I think we need to look into consecutive addressing which will always use max imm offset and reinitialize base. This problem is pre-existing though.

This revision is now accepted and ready to land.Nov 12 2020, 10:51 AM