This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Fix a vsetvli insertion bug involving loads/stores.
ClosedPublic

Authored by craig.topper on Jan 31 2022, 4:20 PM.

Details

Summary

The first phase of the analysis can avoid a vsetvli if an earlier
instruction in the block used an SEW and LMUL that when combined with
the EEW of the load/store would produce the desired EMUL. If we
avoided a vsetvli this will affect the global analysis we do in the
second phase.

The third phase where we really insert the vsetvlis needs to agree
with the first phase. If it doesn't we can insert vsetvlis that
invalidate the global analysis.

In the test case there is a VSETVLI in the preheader that sets
SEW=64 and LMUL=1. Inside the loop there is a VADD with SEW=64 and LMUL=1.
This VADD is followed by a store that wants wants SEW=32 LMUL=1/2.
Because it has EEW=32 as part of the opcode the SEW=64 LMUL=1 from the
VADD can be become EMUL=1 for the store. So the first phase determines no
vsetvli is needed.

The third phase manages CurInfo differently than BBInfo.Change from the
first phase. CurInfo is only updated when we see a vsetvli or insert
a vsetvli. This was done to allow predecessor block information from
the global analysis to be applied to multiple instructions. Since the
loop body has no vsetvli we won't update CurInfo for either the VADD
or the VSE. This prevented us from checking the store vsetvli elision
for the VSE resulting in a vsetvli SEW=32 LMUL=1/2 being emitted which
invalidated the global analysis.

To mitigate this, I've added a BBLocalInfo variable that more closely
matches the first phase propagation. This gets updated based on the
VADD and prevents emitting a vsetvli for the store like we did in the
first phase.

I wonder if we should do an earlier phase to handle the load/store case
by adding more pseudo opcodes and changing the SEW/LMUL for those
instructions before the insertion analysis. That might be more robust
than trying to guarantee two phases make the same decision.

Fixes the test from D118629.

Diff Detail

Event Timeline

craig.topper created this revision.Jan 31 2022, 4:20 PM
craig.topper requested review of this revision.Jan 31 2022, 4:20 PM
frasercrmck accepted this revision.Feb 1 2022, 1:53 AM

LGTM, thank you! Certainly saved me some time today.

I do think that the cracks are showing in this pass, though. I think spending time making it more robust would benefit us in the long term. If we could get the optimization parts of the pass working as actual optimizations on top of a valid MIR program (rather than as speculative changes on data structures over invalid MIR), that'd be a good thing. Might help make it more maintainable too.

llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
1002

Missing a word here, perhaps?

This revision is now accepted and ready to land.Feb 1 2022, 1:53 AM
This revision was landed with ongoing or failed builds.Feb 1 2022, 7:37 AM
This revision was automatically updated to reflect the committed changes.