Currently when folding vmerge into its operands, we stop if the VLs aren't
identical.  However since the body of (vmerge (vop)) is the intersection of
vmerge and vop's bodies, we can use the smaller of the two VLs if we know it
ahead of time.  This patch relaxes the constraint on VL if they are both
constants, or if either of them are VLMAX.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
| llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp | ||
|---|---|---|
| 3338 | Is the below code more concise? if (auto *CLHS = dyn_cast<ConstantSDNode>(LHS)) if (auto *CRHS = dyn_cast<ConstantSDNode>(RHS)) | |
Address review comment
| llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp | ||
|---|---|---|
| 3338 | Yeah that's much more readable, thanks! | |
Generally seems correct, but enough minor comments that I want to see a revision before giving an LGTM.
| llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp | ||
|---|---|---|
| 3331–3348 | Naming: GetMinVL | |
| 3338 | Suggestion: auto *CLHS = dyn_cast<ConstantSDNode>(LHS); auto *CRHS = dyn_cast<ConstantSDNode>(RHS); if (!CRHS || !CLHS) return SDValue(); return CLHS->getZExtValue() <= CRHS->getZExtValue() ? LHS : RHS; | |
| 3344 | Nit pick - have the same merge operand *or the merge on the True instruction doesn't exist and is thus undef*. | |
| 3345 | This comment is awfully verbose, and I don't think the example helps much. | |
| 3354 | The comment change needs reverted. | |
| llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vselect.ll | ||
| 250–251 | Totally off topic for this review, but we can do way better for odd sized mask loads. The scalarization code here already touches all bits in the byte, we can just do a single byte vector load and mask off the high bits directly. Not sure we care, just noticed it as I was looking at this test. | |
| llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp | ||
|---|---|---|
| 3344 | My mental model of this is that we need to have the same merge operand for both, but if the merge operand on True is undefined or missing, then we assume it's the same merge operand as N. Could we make this reasoning more explicit in the preceding code? | |
| 3345 | After reading this the next day, I agree. I think I was using the comments as a mental scratch pad | |
| llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp | ||
|---|---|---|
| 3344 | Something like: diff --git a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp index d956b303e584..6a084efa0a8f 100644 --- a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp +++ b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp @@ -3245,18 +3245,20 @@ bool RISCVDAGToDAGISel::performCombineVMergeAndVOps(SDNode *N) { if (!Info) return false; - if (HasTiedDest && !isImplicitDef(True->getOperand(0))) { - // The vmerge instruction must be TU. - // FIXME: This could be relaxed, but we need to handle the policy for the - // resulting op correctly. - if (isImplicitDef(Merge)) - return false; - SDValue MergeOpTrue = True->getOperand(0); - // Both the vmerge instruction and the True instruction must have the same - // merge operand. - if (False != MergeOpTrue) - return false; - } + SDValue TrueMerge = HasTiedDest ? True->getOperand(0) : False; + if (isImplicitDef(TrueMerge)) + TrueMerge = False; + + // Both the vmerge instruction and the True instruction must have the same + // merge operand. + if (TrueMerge != False) + return false; + + // The vmerge instruction must be TU. + // FIXME: This could be relaxed, but we need to handle the policy for the + // resulting op correctly. + if (isImplicitDef(Merge) && !isImplicitDef(TrueMerge)) + return false; if (IsMasked) { assert(HasTiedDest && "Expected tied dest"); | |
I think I'm seeing a miscompile from this
I have a tail undefined unmasked vrgather.vi with VL=2 followed by a tail undefined vmerge with VL=4. This gets combined to a tail agnostic masked undisturbed masked vrgather.vi with VL =2. I think this needs to be a tail undisturbed VL to account for elements 2 and 3 from the vmerge becoming part of the tail?
Here's a hack to avoid the buggy case
diff --git a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp index dfa54740c0d6..ad3f0e98d12f 100644 --- a/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp +++ b/llvm/lib/Target/RISCV/RISCVISelDAGToDAG.cpp @@ -3629,7 +3629,7 @@ bool RISCVDAGToDAGISel::performCombineVMergeAndVOps(SDNode *N) { SDValue TrueVL = True.getOperand(TrueVLIndex); SDValue SEW = True.getOperand(TrueVLIndex + 1); - auto GetMinVL = [](SDValue LHS, SDValue RHS) { + auto GetMinVL = [&](SDValue LHS, SDValue RHS) { if (LHS == RHS) return LHS; if (isAllOnesConstant(LHS)) @@ -3640,7 +3640,11 @@ bool RISCVDAGToDAGISel::performCombineVMergeAndVOps(SDNode *N) { auto *CRHS = dyn_cast<ConstantSDNode>(RHS); if (!CLHS || !CRHS) return SDValue(); - return CLHS->getZExtValue() <= CRHS->getZExtValue() ? LHS : RHS; + if (CRHS->getZExtValue() <= CLHS->getZExtValue()) + return RHS; + if (isImplicitDef(Merge)) + return SDValue(); + return LHS; }; // Because N and True must have the same merge operand (or True's operand is
| llvm/test/CodeGen/RISCV/rvv/fixed-vectors-vselect.ll | ||
|---|---|---|
| 62 | I think this shows the bug. This is a ta load, formed from a vl=6 load and vl=8 merge both ta. Elements 6 and 7 are now part of the tail after merging. Elements 6 and 7 were defined if the mask was 0 for those elements. We're now ignoring mask bits 6 and 7 with the reduce vl. So we need to use tu to make them defined. | |
Is the below code more concise?