Page MenuHomePhabricator

[NOT YET FOR REVIEW][AArch64][LV] Implement AArch64TTIImpl::getRegisterClassForType
Changes PlannedPublic

Authored by peterwaller-arm on May 19 2022, 2:36 AM.

Details

Reviewers
None
Summary
NOTE: I'm sending this for early review. Here's a possible solution to fixing the register class issue, if it needs fixing. Another solution would be to have getRegUsageForType return a tuple of (Count, Class), but this might not be straightforward to implement for all targets, I haven't investigated.

The loop vectorizer uses getRegisterClassForType to determine the
register class occupied by a given an llvm::Type* for the purposes of
register pressure estimation, which in turn feeds into the choice of the
a maximum VF to use.

As part of this, the loop vectorizer has a heuristic
isScalarAfterVectorization, which `getRegisterClassForType(bool
Vector, Type *Ty)` effectively receives as its 'Vector' parameter. The
default implementation of getRegisterClassForType ignores the Type
parameter, just returning a 'VectorRC' if Vector is True.

However, some large vector types (e.g. v128i1) are represented with a
number of scalar registers. This is known by TLI->getRegisterType, so
add a TTI interface getRegMVTForType and propagate this information
up to the newly implemented getRegisterClassForType.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptMay 19 2022, 2:36 AM
peterwaller-arm requested review of this revision.May 19 2022, 2:36 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 19 2022, 2:36 AM

It might be worth teaching AArch64TTIImpl::getRegisterClassForType that fp regs and vector regs are the same thing. That would require some testing to make sure it actually produces better results though.

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
229

Can this just use the existing TLI->getTypeLegalizationCost call to get the MVT?

A fair point on the vector/fp overlap, will consider.

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
229

As far as I can see, no, because the return value of getTypeLegalizationCost is wrong, deriving its value from getValueType. This just does EVT::getVectorVT(Ty->getContext(), EVT::getEVT(EltTy, false), under the hood, which is different than TLI->getRegisterType, which uses information derived from computeRegisterProperties.

A fair point on the vector/fp overlap, will consider.

Yeah, it at least deserves to be a separate patch.

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
229

I would expect getTypeLegalizationCost(..).second to be a scalar type if the original vector was scalarized. It goes via the legalization mechanism in getTypeConversion.

peterwaller-arm marked 2 inline comments as done.May 19 2022, 8:01 AM

On the effect of this patch: there are 10 translation units out of ~2,000 in the LNT benchmarks which have differing codegen as a consequence. I'll check for any performance effect next week before proposing this.

MultiSource/Applications/JM/ldecod/CMakeFiles/ldecod.dir/erc_do_p.o
MultiSource/Applications/JM/lencod/CMakeFiles/lencod.dir/me_distortion.o
MultiSource/Benchmarks/mediabench/mpeg2/mpeg2dec/CMakeFiles/mpeg2decode.dir/recon.o
MultiSource/Benchmarks/mediabench/jpeg/jpeg-6a/CMakeFiles/cjpeg.dir/jquant2.o
MultiSource/Benchmarks/Bullet/CMakeFiles/bullet.dir/btQuantizedBvh.cpp.opt.yaml
MultiSource/Benchmarks/MiBench/automotive-susan/CMakeFiles/automotive-susan.dir/susan.o
MultiSource/Benchmarks/MiBench/consumer-jpeg/CMakeFiles/consumer-jpeg.dir/jquant2.o
MultiSource/Benchmarks/7zip/CMakeFiles/7zip-benchmark.dir/CPP/7zip/Crypto/WzAes.cpp.opt.yaml
MultiSource/Benchmarks/7zip/CMakeFiles/7zip-benchmark.dir/C/Delta.o
SingleSource/Benchmarks/Misc/CMakeFiles/mandel.dir/mandel.o
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
229

getTypeConversion appears to use TransformToType. Here is the information I've dumped from computeRegisterProperties with vscale_range(4,4) (512 bit reg):

CRP:     v1i1 PreferredAction=TypeScalarizeVector NumRegisters=1 NumIntermediates=1 VT=    v1i1 IntermediateVT=      i1 RegisterVT=     i32 Final ValueTypeActions[VT]= TypeScalarizeVector TransformToType=    v1i1
CRP:     v2i1 PreferredAction=TypePromoteInteger Promote    v2i32
CRP:     v4i1 PreferredAction=TypePromoteInteger Promote    v4i16
CRP:     v8i1 PreferredAction=TypePromoteInteger Promote     v8i8
CRP:    v16i1 PreferredAction=TypePromoteInteger Promote    v16i8
CRP:    v32i1 PreferredAction=TypePromoteInteger Promote    v32i8
CRP:    v64i1 PreferredAction=TypePromoteInteger Promote    v64i8
CRP:   v128i1 PreferredAction=TypePromoteInteger NumRegisters=128 NumIntermediates=128 VT=  v128i1 IntermediateVT=      i1 RegisterVT=     i32 Final ValueTypeActions[VT]= TypeSplitVector TransformToType=  v128i1
CRP:   v256i1 PreferredAction=TypePromoteInteger NumRegisters=256 NumIntermediates=256 VT=  v256i1 IntermediateVT=      i1 RegisterVT=     i32 Final ValueTypeActions[VT]= TypeSplitVector TransformToType=  v256i1
CRP:   v512i1 PreferredAction=TypePromoteInteger NumRegisters=512 NumIntermediates=512 VT=  v512i1 IntermediateVT=      i1 RegisterVT=     i32 Final ValueTypeActions[VT]= TypeSplitVector TransformToType=  v512i1
CRP:  v1024i1 PreferredAction=TypePromoteInteger NumRegisters=1024 NumIntermediates=1024 VT= v1024i1 IntermediateVT=      i1 RegisterVT=     i32 Final ValueTypeActions[VT]= TypeSplitVector TransformToType= v1024i1

For example, getTypeLegalizationCost uses the TransformToType of v128i1, not understanding that it use the RegisterVT of i32.

This could be considered buggy too, I started playing around with this but found myself mired in failing tests, so I chose the easier option as a first pass.

peterwaller-arm marked an inline comment as done.May 23 2022, 2:22 AM

I'm going to abandon this for now, though I may take another swipe at this later. I've identified while testing other types that this patch is wrong because it's not returning VectorRC when it should.

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp
229

Can this just use the existing TLI->getTypeLegalizationCost call to get the MVT?

It can however directly use getTLI()->getRegisterType, so I'm thinking I shouldn't introduce another TTI. I was however thinking along the lines of getRegUsage returning this information alongside the register count, hence putting it nearby.

peterwaller-arm planned changes to this revision.May 23 2022, 2:23 AM

Forgot to press the abandon button, and changed my mind, I'll leave it in changes planned for a bit since there are no reviewers tagged yet.

Matt added a subscriber: Matt.May 23 2022, 8:39 AM