This is an archive of the discontinued LLVM Phabricator instance.

Fix VGPR + offset Scratch offset folding
ClosedPublic

Authored by Petar.Avramovic on Feb 28 2023, 4:43 AM.

Details

Reviewers
dstuttard
foad
Group Reviewers
Restricted Project
Commits
rG3ae310d0ae34: Fix VGPR + offset Scratch offset folding
Summary

Values in VGPR register are treated as unsigned by hardware.

When value in 32-bit VGPR base can be negative calculate offset using
32-bit add instruction, otherwise use vgpr base(unsigned) + offset.
Does not affect case where whole offset comes from VGPR register
(immediate offset is 0).

LoopStrengthReduce.cpp changes offsets to negative and in some
iterations value in VGPR register could be negative.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptFeb 28 2023, 4:43 AM
Petar.Avramovic requested review of this revision.Feb 28 2023, 4:43 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 28 2023, 4:43 AM
arsenm added inline comments.Feb 28 2023, 5:17 AM
llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
1536–1537

But the offsets are negative for scratch and global, isLegalFLATOffset should have accounted for this

foad added inline comments.Feb 28 2023, 9:19 AM
llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
1541

How does this work? If this condition is false, then you never assign to Addr.

foad added inline comments.Feb 28 2023, 9:23 AM
llvm/test/CodeGen/AMDGPU/memory_clause.ll
212

This looks correct but it is obviously less optimized than before. I wonder if we could get back to the optimized version by using dereferenceable info, or looking at the underlying object from the MMO, or something like that. E.g. if the address is of the form "baseptr + 16" and we know that the access is 16 bytes into a dereferenceable object, then we know that baseptr points to the start of that object, so it should be safe to use the vaddr + offset addressing mode.

llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
1517

It stays the same, and offset stays 0. It is a little different from sgpr version

Petar.Avramovic added inline comments.Mar 7 2023, 8:37 AM
llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
1536–1537

isLegalFLATOffset checks for constant offset (the + offset), this check is for value in register (the base(unsigned))

foad added inline comments.Mar 7 2023, 8:46 AM
llvm/lib/Target/AMDGPU/AMDGPUISelDAGToDAG.cpp
1541

I think the isNonNegative check should not be inside the isLegalFLATOffset case, because we could also run into the same problem in the other case ("If the offset doesn't fit, put the low bits into the offset field and add the rest").

In other words I think it should be more like:

if (isBaseWithConstantOffset(N0, N1) && N0 is non-negative) {
  if (isLegalFLATOffset) {
    ...
  } else {
    ...
  }
}

Move check next to isBaseWithConstantOffset, introduce helper function that can be used in other two patches also.

foad accepted this revision.Mar 8 2023, 6:26 AM

LGTM, thanks.

This revision is now accepted and ready to land.Mar 8 2023, 6:26 AM

add same checks for globalisel

This revision was landed with ongoing or failed builds.Mar 9 2023, 1:55 AM
This revision was automatically updated to reflect the committed changes.