This is an archive of the discontinued LLVM Phabricator instance.

[LV] Update logic for calculating register usage due to invariants
ClosedPublic

Authored by sushgokh on Feb 6 2023, 10:30 AM.

Details

Summary

Previously, while calculating register usage due to invariants, it was assumed that invariant would always be part of widening instructions. This resulted in calculating vector register types for vectors which cant be legalized. e.g. For the attached SVE test case, it resulted in calculating register usage for type nxv4i128 resulting in legalization error(and crash).

An invariant might not always need a vector register. For e.g., invariant might just be used for iteration check.

This patch checks if the invariant is part of any widening instruction and considers register usage accordingly. Fixes issue 60493

Diff Detail

Event Timeline

sushgokh created this revision.Feb 6 2023, 10:30 AM
Herald added a project: Restricted Project. · View Herald Transcript
sushgokh requested review of this revision.Feb 6 2023, 10:31 AM
Matt added a subscriber: Matt.Feb 6 2023, 2:32 PM

I see the problem of calculating incorrect register usage. However, is it just limited to loop invariants?

I see the problem of calculating incorrect register usage. However, is it just limited to loop invariants?

Yes

sushgokh updated this revision to Diff 495724.Feb 7 2023, 10:06 PM
sushgokh retitled this revision from [LV][SVE] Don't consider vector types for loop invariants when calculating register usage to [LV] Don't consider vector types for loop invariants when calculating register usage.Feb 7 2023, 10:08 PM

One thing I missed out is a check if the invariant is taking part in vector ops. If its indeed taking part in vector ops, we will need vector registers for it. But, if it doesnt, then scalar registers should be sufficient.

I need to rework on this patch

sushgokh updated this revision to Diff 496931.Feb 13 2023, 4:30 AM
sushgokh retitled this revision from [LV] Don't consider vector types for loop invariants when calculating register usage to [LV] Update logic for calculating register usage due to invariants.
sushgokh edited the summary of this revision. (Show Details)Feb 13 2023, 4:40 AM

Hi,
Could someone please have a look at the patch?

efriedma edited reviewers, added: sdesmalen, david-arm, reames; removed: efriedma.Feb 16 2023, 6:06 PM

any suggestions/comments on the patch ?

sdesmalen added inline comments.Feb 21 2023, 1:57 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
6107–6108

I think you can write the algorithm above a bit shorter using a lambda:

bool IsScalar = all_of(cast<Instruction>(Inst)->users(), [&](User *U) {
  auto *I = dyn_cast<Instruction>(U);
  return !I || TheLoop != LI->getLoopFor(I->getParent()) ||
         isScalarAfterVectorization(I, VFs[i]);
});      

ElementCount VF = IsScalar ? ElementCount::getFixed(1) : VFs[i];
unsigned  ClassID = TTI.getRegisterClassForType(VF.isVector(), Inst->getType());
Invariant[ClassID] += GetRegusage(Inst->getType(), VF);;
6110

It seems that Inst can only ever be an Instruction, so this can be a cast. It would be even better to make LoopInvariants be a SmallVector of Instruction rather than Value (as a NFC precursory patch) , given that the loop that populates it skips non-Instruction values.

6112

If User is not a Instruction, the dyn_cast will result in a nullptr and the call to getParent() will fail.

sushgokh updated this revision to Diff 499132.Feb 21 2023, 5:49 AM
sushgokh updated this revision to Diff 499232.Feb 21 2023, 10:42 AM
sdesmalen added inline comments.Feb 22 2023, 12:55 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
6107–6108

I think you can write the algorithm above a bit shorter using a lambda:

bool IsScalar = all_of(cast<Instruction>(Inst)->users(), [&](User *U) {
  auto *I = dyn_cast<Instruction>(U);
  return !I || TheLoop != LI->getLoopFor(I->getParent()) ||
         isScalarAfterVectorization(I, VFs[i]);
});      

ElementCount VF = IsScalar ? ElementCount::getFixed(1) : VFs[i];
unsigned  ClassID = TTI.getRegisterClassForType(VF.isVector(), Inst->getType());
Invariant[ClassID] += GetRegusage(Inst->getType(), VF);;

@sushgokh did you try the above suggestion?

When I try it locally, all the tests in your patch still pass.

sushgokh added inline comments.Feb 22 2023, 3:02 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
6107–6108

@sdesmalen Yes, I tried your suggestion and all the tests passed. But,my only intention was not to assume anything(as I have stated in the comment). But if you advise, will go with your suggestion

sdesmalen added inline comments.Feb 22 2023, 9:36 AM
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
6107–6108

That case should already be covered because in getMaxLegalScalableVF it goes through all instructions in the loop to see if the element type is legal for scalable vectors.
So if there are any uses of this scalar value that are non-scalar, then the code in getMaxLegalScalableVF should have prevented the LV from considering a scalable VF for this loop.

Based on that, I believe my suggestion is safe.

sushgokh updated this revision to Diff 499724.Feb 22 2023, 9:59 PM

@sdesmalen Have updated the patch. Thanks for the explanation

Thanks for making the changes. I've just left a few more comments on the test.

llvm/test/Transforms/LoopVectorize/AArch64/reg-usage.ll
2 ↗(On Diff #499724)

nit: ; Run:

2 ↗(On Diff #499724)

If the test uses -debug-only, then it also needs REQUIRES: asserts

4 ↗(On Diff #499724)

Can you add a more elaborate description here of what it's testing?

7 ↗(On Diff #499724)

nit: Can you remove the dso_local and the local_unnamed_addr, as they're not needed.

28 ↗(On Diff #499724)

Can this test be simplified a bit further? A lot of the instructions here (e.g. some the PHIs in the loop, the load %0 and subsequent GEP, etc) are not necessary for this test.

sushgokh updated this revision to Diff 499921.Feb 23 2023, 11:08 AM

@sdesmalen I couldnt simplify the crashing test further. I hope this suffices

fhahn accepted this revision.Feb 24 2023, 9:46 AM

LGTM, thanks!

llvm/lib/Transforms/Vectorize/LoopVectorize.cpp
6107–6108

nit: LoopInvariants stores Instruction* ,so there should be no need to use cast.

6108

nit: should be able to use cast, as all users of instructions should be other instructions. Then there's also no need to check !I in the lambda.

llvm/test/Transforms/LoopVectorize/AArch64/reg-usage.ll
1 ↗(On Diff #499921)

this is outdated?

41 ↗(On Diff #499921)

nit: change the function to return void?

This revision is now accepted and ready to land.Feb 24 2023, 9:46 AM
This revision was landed with ongoing or failed builds.Feb 27 2023, 9:37 PM
This revision was automatically updated to reflect the committed changes.
bjope added a subscriber: bjope.Feb 28 2023, 8:50 AM
paulkirth added inline comments.
llvm/test/Transforms/LoopVectorize/AArch64/reg-usage.ll
20–21 ↗(On Diff #501024)

I think this test fails when building w/ -LLVM_REVERSE_ITERATION=On. Would you mind addressing this?

For context, I was testing something else and ran into this failure. I bisected to this commit w/ reverse iteration enabled.

@paulkirth Trying to address the issue you have mentioned in D145472. Thanks

Thanks for the fix. That change looks good to me. Its unfortunate that we can express that ordering is unimportant in lit for some range of code. The only reason I found the issue here was that I had a similar problem when testing stack layout analysis, and never updated my build flags.