This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Hoist vsetvli with vreg operand out of loops
Changes PlannedPublic

Authored by reames on Jun 2 2022, 7:54 AM.

Details

Summary

This is an extension to the PRE line of work. The goal is to be able to handle large constant AVLs which are too large to be encoded in a five bit immediate and thus need to be materialized in register. Handling other register cases is a nice to have.

This patch removes a predicate from the correctness logic for the PRE transform, and replaces it with an alternate one. The code being removed was phrased as "having a fixed result"; it's being replaced with a standard (if handrolled to avoid needing dominators) availability check.

When I wrote the original code, I was concerned about three cases. Thinking about each further, I think I was simply being too conservative from the beginning.

First, we might be in a loop where the VL is potentially changing. The availability check handles this as the value would not be available in the preheader in this case. PREing into a rarer path in the loop is fine, as by assumption the final VL state is the same along all paths.

Second, vsetvli is undefined in certain cases - specifically the vsetvli x0, x0, vtype case. If we moved it above control flow, could we introduce a fault? I believe the answer to be yes, but the key point here is that we're *not* moving an existing instruction. We're moving a state transition. If VL actually changes, the insert code would not use the x0, x0 form at the new point. (And couldn't at the original either since VL could change on at least one incoming path.)

To say this point differently, we don't need the standard "anticipation" check from PRE as the data flow and insertion logic covers that bit for us. By construction, there isn't some unmodeled control dependence for the "instruction being moved". Or said differently, state transitions are always safe to speculate.

Third, the specification leaves the hardware significant flexibility to handle VLMAX < AVL < 2 x VLMAX. Originally, I think my concern had been that hoisting such cases might allow different VL values, but thinking about this further, the spec explicitly requires determinism for each AVL value. Given that, we'd only run into trouble if we considered two different AVLs (which we'd somehow falsely proven to produce the same VL) along different paths. Doing this would be an unrelated bug in the dataflow, and thus we don't have to account for it here.

Skeptical review to make sure I haven't missed something here is appreciated.

Diff Detail

Event Timeline

reames created this revision.Jun 2 2022, 7:54 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 2 2022, 7:54 AM
reames requested review of this revision.Jun 2 2022, 7:54 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 2 2022, 7:54 AM
craig.topper added inline comments.Jun 2 2022, 10:07 PM
llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
1321

Does the end iterator of a basic block point a unique state such that this returning true means that DefMI is in UnavailablePred? Or could they both be at the end of different basic blocks?

reames added inline comments.Jun 3 2022, 7:56 AM
llvm/lib/Target/RISCV/RISCVInsertVSETVLI.cpp
1321

They could be in different basic blocks, so there's an iterator comparison bug here. Thanks!

Will rev with a fix.

reames updated this revision to Diff 434032.Jun 3 2022, 8:09 AM

Fix bug noted by reviewer.

This revision is now accepted and ready to land.Jun 15 2022, 4:09 PM
reames planned changes to this revision.Jun 20 2022, 10:43 AM

Went to land this today - I've been intentionally spacing out changes since I have a lot of churn in this file - and encountered a new crash in a test failure. Need to investigate and probably re-review once cause is understood.