This is an archive of the discontinued LLVM Phabricator instance.

[RISCV][ISel] Refactor the formation of VW operations
ClosedPublic

Authored by qcolombet on Sep 26 2022, 9:58 PM.

Details

Summary

This patch centralizes all the combines of add|sub|mul with extended operands in one "framework".
The rationale for this change is to offer a one-stop-shop for all these transformations so that, in the future (next patch), it is easier to make combine decisions for a web of instructions (i.e., instructions connected through s|zext operands).

Technically this patch is not NFC because the new version is more powerful than the previous version.
In particular, it diverges in two cases:

  • VWMULSU can now also be produced from mul(splat, zext), whereas previously only mul(sext, splat) were supported when splats were involved. (As demonstrated in rvv/fixed-vectors-vwmulsu.ll)
  • VWSUB(U) can now also be produced from sub(splat, ext), whereas previously only sub(ext, splat) were supported when splats were involved. (As demonstrated in rvv/fixed-vectors-vwsub.ll)

If we wanted, we could block these transformations to make this patch really NFC. For instance, we could do something similar to AllowSplatInVW_W, which prevents the combines to form vw(add|sub)(u)_w when the RHS is a splat.

Regarding the "framework" itself, the bulk of the patch is some boilerplate code that abstracts away the actual extensions that are present in the DAG. This allows us to handle vwadd_w(ext a, b) as if it was a regular add(ext a, ext b). Since the node ext b doesn't actually exist in the DAG, we have a bunch of methods (all in the NodeExtensionHelper class) that fake all that for us.

The other half of the change is around CombineToTry and CombineResult. These helper structures respectively:

  • Represent the kind of combines that can be applied to a node, and
  • Store what needs to happen to do that combine.

This can be viewed as a two step approach:

  • First, check if a pattern applies, and
  • Second apply it.

The checks and the materialization of the combines are decoupled so that in the future we can perform several checks and do all the related applies in one go.

Diff Detail

Event Timeline

qcolombet created this revision.Sep 26 2022, 9:58 PM
qcolombet requested review of this revision.Sep 26 2022, 9:58 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 26 2022, 9:58 PM
qcolombet added inline comments.Sep 26 2022, 10:01 PM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
8214–8217

Everything in this anonymous namespace is the boilderplate code I was talking about in the description.

8677

The main entry point is here.

Note: https://reviews.llvm.org/harbormaster/unit/view/4989286/ was failing because D134701 hadn't landed yet (and thus was exposing the bug)

craig.topper edited the summary of this revision. (Show Details)Sep 30 2022, 10:38 AM

I've reversed the parent/child links so that the patches that need to come first are parents and the later patches are children.

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
8294

I think using auto [Mask, VL] = getMaskAndVL(Root); would be consistent with D134442.

8346

Unexpected*

8366

Unexpected*

8410

Would Root ever be null?

8685

call -> called

I've reversed the parent/child links so that the patches that need to come first are parents and the later patches are children.

Oh thanks!
Stupid me!!

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
8294

Sounds good to me.

8346

Good catch!

8410

Technically no.
We can get rid of the check.

craig.topper added inline comments.Sep 30 2022, 2:35 PM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
8403

I'm little confused by this check. The number of operands should be a property of the RISCVISD opcode. All RISCVISD::VMV_V_X_VL are supposed to have the same number of operands.

8492

This can also use structured binding

8509

Can this be llvm_unreachable?

8522

I think these can be references to NodeExtensionHelper? That would avoid needing to copy them. But I don't know if that would be fragile.

qcolombet added inline comments.Sep 30 2022, 3:14 PM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
8403

I was surprised too but I've seen them in some tests.
I assumed this was expected and didn't dig more.
For instance, I see that we create some without VLs at line 1935 before this patch:

SDValue SplatZero = DAG.getNode(
    RISCVISD::VMV_V_X_VL, DL, DstContainerVT, DAG.getUNDEF(DstContainerVT),
    DAG.getConstant(0, DL, Subtarget.getXLenVT()));

You can find other instances here and there in that file. E.g., another one:

// Create 2^eltbits - 1 copies of V2 by multiplying by the largest integer.
SDValue Multiplier = DAG.getNode(RISCVISD::VMV_V_X_VL, DL, IntHalfVT,
                                    DAG.getUNDEF(IntHalfVT),
                                    DAG.getAllOnesConstant(DL, XLenVT));
8509

Yep

8522

You're right, for this patch, they could be references.
But for the next patch they need to be copied because we apply the changes much later and the references can change between the "match" and "apply".

I've happy to introduce the copy only in the next patch (in particular since the next patch is probably more controversial/may not land.)

Let me know what you prefer.

qcolombet added inline comments.Oct 3 2022, 11:44 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
8403

@craig.topper ^^^
Tagging you to make sure you saw that I needed some feedback on that front.

Put differently, is it expected to have VMV_V_X_VL with no VL or should the snippet I showed need fixing?
(Unless I missed something and the VL are indeed added at some point.)

craig.topper added inline comments.Oct 3 2022, 11:53 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
8403

Those are both bugs. I guess they don't get noticed because isel matches them into the .vx or .vi form of the instruction that uses them. I'll fix them.

craig.topper added inline comments.Oct 3 2022, 12:14 PM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
8403

I fixed those and a few others in 4c03c9f

qcolombet added inline comments.Oct 3 2022, 12:18 PM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
8403

Awesome, thanks for the quick fix!
I'll remove this check then.

qcolombet updated this revision to Diff 464859.Oct 3 2022, 5:27 PM
  • Fix typos
  • Use structured binding
  • Replace some asserts with unreachable
  • Use references for NodeExtensionHelper for now (we don't need the copy for this patch).
qcolombet marked 5 inline comments as done.Oct 3 2022, 5:30 PM
This revision is now accepted and ready to land.Oct 4 2022, 9:12 AM
This revision was landed with ongoing or failed builds.Oct 5 2022, 10:49 AM
This revision was automatically updated to reflect the committed changes.
craig.topper added a comment.EditedOct 31 2022, 12:00 PM

@qcolombet We're seeing a crash from this in our downstream

define <vscale x 4 x i64> @test(<vscale x 4 x i1> %0, <vscale x 4 x i1> %1) {    
entry:                                                                           
  %vp.cast = call <vscale x 4 x i64> @llvm.vp.zext.nxv4i64.nxv4i8(<vscale x 4 x i8> zeroinitializer, <vscale x 4 x i1> %0, i32 1)
  %vp.cast42 = call <vscale x 4 x i64> @llvm.vp.sext.nxv4i64.nxv4i16(<vscale x 4 x i16> zeroinitializer, <vscale x 4 x i1> zeroinitializer, i32 1)
  %vp.op = call <vscale x 4 x i64> @llvm.vp.add.nxv4i64(<vscale x 4 x i64> %vp.cast42, <vscale x 4 x i64> %vp.cast, <vscale x 4 x i1> %0, i32 1)
  ret <vscale x 4 x i64> %vp.op                                                  
}                                                                                
                                                                                 
declare <vscale x 4 x i64> @llvm.vp.zext.nxv4i64.nxv4i8(<vscale x 4 x i8>, <vscale x 4 x i1>, i32)
declare <vscale x 4 x i64> @llvm.vp.sext.nxv4i64.nxv4i16(<vscale x 4 x i16>, <vscale x 4 x i1>, i32)
declare <vscale x 4 x i64> @llvm.vp.add.nxv4i64(<vscale x 4 x i64>, <vscale x 4 x i64>, <vscale x 4 x i1>, i32)

We end up with this

Vector/type-legalized selection DAG: %bb.0 'test:entry'                          
SelectionDAG has 15 nodes:                                                       
  t0: ch,glue = EntryToken                                                       
  t2: nxv4i1,ch = CopyFromReg t0, Register:nxv4i1 %0                             
        t12: nxv4i16 = splat_vector Constant:i64<0>                              
        t14: nxv4i1 = splat_vector Constant:i64<0>                               
      t23: nxv4i64 = RISCVISD::VSEXT_VL t12, t14, Constant:i64<1>                
        t7: nxv4i8 = splat_vector Constant:i64<0>                                
      t22: nxv4i64 = RISCVISD::VZEXT_VL t7, t2, Constant:i64<1>                  
    t25: nxv4i64 = RISCVISD::ADD_VL t23, t22, undef:nxv4i64, t2, Constant:i64<1> 
  t19: ch,glue = CopyToReg t0, Register:nxv4i64 $v8m4, t25                       
  t20: ch = RISCVISD::RET_FLAG t19, Register:nxv4i64 $v8m4, t19:1                
                                                                                 
                                                                                 
                                                                                 
Combining: t25: nxv4i64 = RISCVISD::ADD_VL t23, t22, undef:nxv4i64, t2, Constant:i64<1>
Creating new node: t26: nxv4i32 = RISCVISD::VZEXT_VL t7, t2, Constant:i64<1>     
Creating new node: t27: nxv4i64 = RISCVISD::VWADDU_W_VL t12, t26, undef:nxv4i64, t2, Constant:i64<1>
 ... into: t27: nxv4i64 = RISCVISD::VWADDU_W_VL t12, t26, undef:nxv4i64, t2, Constant:i64<1>

The first operand to the VWADDU_W_VL has type nxv4i16 instead of the nxv4i64 it should have. We appear to have dropped the original VSEXT_VL.

Proposed patch D137106

Hi @craig.topper ,

Thanks for the heads-up!
I see that you've already gone to the bottom of it with https://reviews.llvm.org/D137106. I'll review that patch.

Sorry for the breakage.

Cheers,
-Quentin