This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Improve global SADDR selection
ClosedPublic

Authored by rampitec on Apr 29 2021, 5:18 PM.

Details

Summary

An address can be a uniform sum of two i64 bit values.
That regularly happens in a loop where index is an induction
variable promoted to 64 bit by the LSR. We can materialize
zero in a VGPR and still use SADDR form of the load.

Diff Detail

Event Timeline

rampitec created this revision.Apr 29 2021, 5:18 PM
rampitec requested review of this revision.Apr 29 2021, 5:18 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 29 2021, 5:18 PM
Herald added a subscriber: wdng. · View Herald Transcript

This looks like it's regressing the case where the offset is known constant, but just doesn't fit.

Can you also make the corresponding globalisel change?

llvm/test/CodeGen/AMDGPU/global-saddr-load.ll
90–91

This is more instructions

This looks like it's regressing the case where the offset is known constant, but just doesn't fit.

Do you mean atomic_add_i32_huge_offset() test? Will check it.

Can you also make the corresponding globalisel change?

Yes, will do.

llvm/test/CodeGen/AMDGPU/global-saddr-load.ll
90–91

It is one materialized zero, I think it is more or less degenarate case. In a real world there is usually a vgpr with zero. On the other hand this is one VGPR less.

This looks like it's regressing the case where the offset is known constant, but just doesn't fit.

Do you mean atomic_add_i32_huge_offset() test? Will check it.

That's one case, but it looks like all of the cases that add an extra v_mov_b32 (e.g. global_load_saddr_i8_offset_neg4096 which I commented on).

Can you also make the corresponding globalisel change?

Yes, will do.

llvm/test/CodeGen/AMDGPU/global-saddr-load.ll
90–91

This is the same pattern in all of the cases with a large immediate offset split. I think this is reasonably common and I would lean towards fewer instructions here

rampitec added inline comments.Apr 29 2021, 6:05 PM
llvm/test/CodeGen/AMDGPU/global-saddr-load.ll
90–91

We still need to materialize zero, but I got the idea, will check what is possible to do here.

One caveat with non zero vaddr is it will be impossible to switch to vaddr form if calculations will end up in sgprs (and they likely will in these cases) but for any reason we would need to move instruction to VALU which in turn will result in readfirstlane instructions for the address, like it is now.

rampitec updated this revision to Diff 342553.May 3 2021, 2:57 PM
rampitec marked 2 inline comments as done.

Handle illegal immediate offset case yielding more instructions.

rampitec updated this revision to Diff 342586.May 3 2021, 4:33 PM

Added global isel change.

I am not sure if that can be simplified. It seems we have already checked for base with constant offset, maybe I can move the new checks higher where we have constant offset. I will try to simplify it.

rampitec updated this revision to Diff 342798.May 4 2021, 10:43 AM

Simplified the logic. I believe it is ready now.

foad added inline comments.May 5 2021, 1:25 AM
llvm/test/CodeGen/AMDGPU/global-saddr-load.ll
2384

Could you pre-commit this bit so we can see the diff please?

rampitec updated this revision to Diff 343084.May 5 2021, 9:23 AM
rampitec marked an inline comment as done.

Pre-commited test and rebased.

arsenm accepted this revision.May 5 2021, 2:14 PM
This revision is now accepted and ready to land.May 5 2021, 2:14 PM
This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.May 5 2021, 3:38 PM

Any chance that this causes this test breakage http://45.33.8.238/linux/45940/step_12.txt ?

thakis added a comment.May 5 2021, 3:40 PM

Ah no, looks like this is due to @arsenm's https://github.com/llvm/llvm-project/commit/ef5f0adecd02d92cbb1a713ac7316f6768269412 which doesn't have a phab link :/

Could someone revert that? (On phone ATM, so can't do it myself)

Ah no, looks like this is due to @arsenm's https://github.com/llvm/llvm-project/commit/ef5f0adecd02d92cbb1a713ac7316f6768269412 which doesn't have a phab link :/

Could someone revert that? (On phone ATM, so can't do it myself)

Yes, it is this one: https://reviews.llvm.org/rGef5f0adecd02d92cbb1a713ac7316f6768269412

@arsenm you have missing colon in the pattern there.