This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Optimize scalable frame offset calculation when VLEN is precisely known
ClosedPublic

Authored by reames on Nov 7 2022, 2:37 PM.

Details

Summary

When we have a precisely known VLEN, we can replace runtime usage of VLENB with compile time constants. This converts offsets involving both fixed and scalable components into fixed offsets. The result is that we avoid the csr read of vlenb, and can often fold the multiply as well.

Diff Detail

Event Timeline

reames created this revision.Nov 7 2022, 2:37 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 7 2022, 2:37 PM
reames requested review of this revision.Nov 7 2022, 2:37 PM
kito-cheng added inline comments.Nov 7 2022, 2:46 PM
llvm/test/CodeGen/RISCV/rvv/rv64-spill-vector-csr.ll
7

Should be SPILL-O2-VLEN256?

90

I thought there should not appear any read vlenb here? since according the description I expect we could just use 256 here?

kito-cheng added inline comments.Nov 7 2022, 2:51 PM
llvm/test/CodeGen/RISCV/rvv/rv64-spill-vector-csr.ll
6

And this also an old problem here, should we emit a warring or error when zvl*b info is mismatch with -riscv-v-vector-bits-min=? min vlen is 128 according the zvl128b (which implied by v), but we force that become 256 by -riscv-v-vector-bits-min=, that became an issue when we use Tag_RISCV_arch* to check the expected minimal vlen requirement for the object?

  • There is still controversial about how to interpret Tag_RISCV_arch, but eventually will resolved that by some way.
kito-cheng added inline comments.Nov 7 2022, 3:00 PM
llvm/test/CodeGen/RISCV/rvv/rv64-spill-vector-csr.ll
90

my bad, it's separated patch, I saw https://reviews.llvm.org/D137593 now

reames added inline comments.Nov 7 2022, 3:30 PM
llvm/test/CodeGen/RISCV/rvv/rv64-spill-vector-csr.ll
6

There's no conflict here. Zvl128b is a minimum, not a maximum. As such, specifying a larger VLEN via -riscv-v-vector-bits-min does not conflict.

There's a separate question of whether the -riscv-v-vector-bits-min should effect the attributes, but that's well out of scope for this patch.

7

This is a typo, and I'll fix.

reames updated this revision to Diff 473853.Nov 7 2022, 6:26 PM

Fix test line per reviewer comment

asb accepted this revision.Nov 8 2022, 3:15 AM

This looks good to me.

This revision is now accepted and ready to land.Nov 8 2022, 3:15 AM
reames added inline comments.Nov 8 2022, 7:48 AM
llvm/test/CodeGen/RISCV/rvv/rv64-spill-vector-csr.ll
98

The offset here is wrong. I missed a factor of 8 divide in my code which exists in the original getVLENFactoredAmount path. It's not really clear to me where that factor comes from, but what's here is definitely wrong in this test. Updated version forthcoming.

reames updated this revision to Diff 474557.Nov 10 2022, 8:39 AM

Fix bug around missing divide by 8.

I have to admit I don't know why that divide is needed. It parallels the code in getVLENFactoredAmount, but it's not clear to me where the factor of 8 comes from.

reames requested review of this revision.Nov 10 2022, 8:40 AM

Fix bug around missing divide by 8.

I have to admit I don't know why that divide is needed. It parallels the code in getVLENFactoredAmount, but it's not clear to me where the factor of 8 comes from.

I think all scalable vector stack objects are rounded up to be a multiple of 8 by this code in RISCVFrameLowering::assignRVVStackObjectOffsets

// If the data type is the fractional vector type, reserve one vector        
// register for it.                                                          
if (ObjectSize < 8)                                                          
  ObjectSize = 8;

I think this is done because there is no fractional whole register load/store. The divide by 8 is trying to figure out how many whole registers there are.

llvm/lib/Target/RISCV/RISCVRegisterInfo.cpp
185

I'm sure you copy and pasted this from somewhere else, but this message isn't great. It should be something like "Scalable offset is not a multiple of a single vector size."

reames updated this revision to Diff 474567.Nov 10 2022, 9:42 AM

Address reviewer comment on assert message

Fix bug around missing divide by 8.

I have to admit I don't know why that divide is needed. It parallels the code in getVLENFactoredAmount, but it's not clear to me where the factor of 8 comes from.

I think all scalable vector stack objects are rounded up to be a multiple of 8 by this code in RISCVFrameLowering::assignRVVStackObjectOffsets

// If the data type is the fractional vector type, reserve one vector        
// register for it.                                                          
if (ObjectSize < 8)                                                          
  ObjectSize = 8;

I think this is done because there is no fractional whole register load/store. The divide by 8 is trying to figure out how many whole registers there are.

The bit I don't understand is the interpretation of object size for a scalable value. Why is "8" the right number for a single VLEN sized register?

This revision is now accepted and ready to land.Nov 18 2022, 9:13 AM
This revision was landed with ongoing or failed builds.Nov 18 2022, 9:57 AM
This revision was automatically updated to reflect the committed changes.