This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Select 5 bit immediate for VSETIVLI during isel rather than peepholing in the custom inserter.
ClosedPublic

Authored by craig.topper on Apr 24 2021, 11:56 PM.

Details

Summary

This adds a special operand type that is allowed to be either
an immediate or a register. Giving it a unique operand type makes
the machine verifier ignore it.
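
To illustrate the decision being moved into isel, here is a minimal sketch, not code from this patch. It assumes only that VSETIVLI encodes its AVL as an unsigned 5-bit immediate (0 to 31). The names `AVLOperand`, `isUInt5`, and `selectAVL` are hypothetical and do not come from the LLVM sources.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical model of an AVL operand that may be either an immediate or a
// register. None of these names come from the patch.
struct AVLOperand {
  bool IsImm;     // true: Value is a uimm5; false: Value is a register id
  uint64_t Value;
};

// VSETIVLI encodes its AVL in an unsigned 5-bit field, so only 0..31 fit.
constexpr bool isUInt5(uint64_t V) { return V < 32; }

// Pick the immediate form when the AVL is a known constant that fits,
// otherwise fall back to the register that carries the AVL.
inline AVLOperand selectAVL(std::optional<uint64_t> ConstAVL, uint64_t Reg) {
  if (ConstAVL && isUInt5(*ConstAVL))
    return {true, *ConstAVL};
  return {false, Reg};
}
```

Under these assumptions, a constant AVL of 4 would take the immediate form, while a constant of 32 or a non-constant AVL would stay in a register.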

This perturbs a lot of tests but mostly it is just slightly different
instruction orders. Something bad did happen to some min/max reduction
tests. We're spilling vector registers when we weren't before.

Event Timeline

craig.topper created this revision. Apr 24 2021, 11:56 PM
craig.topper requested review of this revision. Apr 24 2021, 11:56 PM
Herald added a project: Restricted Project. Apr 24 2021, 11:56 PM
Herald added a subscriber: MaskRay.
craig.topper edited the summary of this revision. Apr 24 2021, 11:57 PM
craig.topper retitled this revision from "[RISCV] Select 5 bit for VSETIVLI during isel rather than peepholing in the custom inserter." to "[RISCV] Select 5 bit immediate for VSETIVLI during isel rather than peepholing in the custom inserter." Apr 25 2021, 11:06 PM

Selecting the immediate during isel makes sense to me, but unfortunately some cases produce a slower result.
Do you know of any cases where the new scheme gives a better instruction order or reduces register spilling?
I'm just afraid that the new scheme will always generate the slower instruction order.

Rebase after switching SEW=64 splats on RV32 to use a stack slot.

Selecting the immediate during isel makes sense to me, but unfortunately some cases produce a slower result.
Do you know of any cases where the new scheme gives a better instruction order or reduces register spilling?
I'm just afraid that the new scheme will always generate the slower instruction order.

I think the extra spilling cases are gone now. Those were LMUL=8 cases on RV32 where we used 2 vector register groups to splat a 64-bit element. This left 2 register groups to use for other things. It looks like we're also using a vlmax splat when we have a fixed-length vector, which causes an extra vsetvli that acts as a scheduling barrier. Some of this should be improved now because we're using a stack store instead of tying up 2 vector registers. We're at least not spilling anymore.
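
The register-group arithmetic behind that comment can be sketched as follows. This is an illustration only, assuming the 32 architectural vector registers of the RISC-V vector extension. The helper names `numRegisterGroups` and `groupsLeft` are hypothetical.

```cpp
// The RISC-V vector extension provides 32 architectural vector registers.
constexpr unsigned NumVectorRegs = 32;

// With a given integer LMUL, registers are allocated in aligned groups of
// LMUL registers, so there are 32 / LMUL such groups.
constexpr unsigned numRegisterGroups(unsigned LMUL) {
  return NumVectorRegs / LMUL;
}

// Groups still available once some are tied up (e.g. by the splat).
constexpr unsigned groupsLeft(unsigned LMUL, unsigned GroupsInUse) {
  return numRegisterGroups(LMUL) - GroupsInUse;
}

// LMUL=8 gives 4 groups; using 2 of them for the splat leaves 2.
static_assert(numRegisterGroups(8) == 4, "LMUL=8 yields four groups");
static_assert(groupsLeft(8, 2) == 2, "two groups remain for other values");
```

With so few groups available at LMUL=8, freeing the two groups used by the splat plausibly accounts for the spills disappearing.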

khchen accepted this revision. Apr 27 2021, 12:42 AM

Thanks for the clarification, LGTM.

This revision is now accepted and ready to land. Apr 27 2021, 12:42 AM

LGTM!

llvm/lib/Target/RISCV/MCTargetDesc/RISCVBaseInfo.h
121

Should we add a comment explaining what this is, since it's not as obvious as the IMMs above? Or is it expected that people look at the TableGen Operand definition?