This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Truncate constants to EltSize when combine store of BUILD_VECTOR
ClosedPublic

Authored by wangpc on Aug 10 2023, 4:48 AM.

Details

Summary

The constants can be with larger bit width, so we need to truncate
them to EltSize or we will exceed the width of fixed-length vector.

Fixes #64588

Diff Detail

Event Timeline

wangpc created this revision.Aug 10 2023, 4:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 10 2023, 4:48 AM
wangpc requested review of this revision.Aug 10 2023, 4:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 10 2023, 4:48 AM
wangpc edited the summary of this revision. (Show Details)Aug 10 2023, 4:50 AM
wangpc added inline comments.Aug 10 2023, 4:52 AM
llvm/test/CodeGen/RISCV/rvv/pr64588.ll
11

There is still a problem here that this insertelement is discarded somehow:

Initial selection DAG: %bb.0 'bar:'
SelectionDAG has 19 nodes:
  t0: ch,glue = EntryToken
  t2: i64,ch = CopyFromReg t0, Register:i64 %0
  t4: i64,ch = CopyFromReg t0, Register:i64 %1
  t7: i32 = Constant<1>
            t6: v64i64 = BUILD_VECTOR Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>
          t9: v64i64 = insert_vector_elt t6, Constant:i64<0>, Constant:i64<1>
        t10: v64i1 = truncate t9
      t14: ch = store<(store (s64) into %ir.p11)> t0, t10, t2, undef:i64
      t16: v8i8 = BUILD_VECTOR Constant:i8<0>, Constant:i8<0>, Constant:i8<0>, Constant:i8<0>, Constant:i8<0>, Constant:i8<0>, Constant:i8<0>, Constant:i8<0>
      t12: i64 = add t2, Constant:i64<8>
    t17: ch = store<(store (s64) into %ir.p2)> t14, t16, t12, undef:i64
  t18: ch = RISCVISD::RET_GLUE t17



Combining: t18: ch = RISCVISD::RET_GLUE t17

Combining: t17: ch = store<(store (s64) into %ir.p2)> t14, t16, t12, undef:i64
Creating new node: t19: ch = store<(store (s64) into %ir.p2)> t0, t16, t12, undef:i64
Creating new node: t20: ch = TokenFactor t14, t19

Replacing.1 t17: ch = store<(store (s64) into %ir.p2)> t14, t16, t12, undef:i64

With: t20: ch = TokenFactor t14, t19
 and 0 other values

Combining: t19: ch = store<(store (s64) into %ir.p2)> t0, t16, t12, undef:i64
Creating new node: t21: ch = store<(store (s64) into %ir.p2)> t0, Constant:i64<0>, t12, undef:i64
 ... into: t21: ch = store<(store (s64) into %ir.p2)> t0, Constant:i64<0>, t12, undef:i64

Combining: t21: ch = store<(store (s64) into %ir.p2)> t0, Constant:i64<0>, t12, undef:i64

Combining: t20: ch = TokenFactor t14, t21

Combining: t18: ch = RISCVISD::RET_GLUE t20

Combining: t14: ch = store<(store (s64) into %ir.p11)> t0, t10, t2, undef:i64

Combining: t13: i64 = undef

Combining: t12: i64 = add t2, Constant:i64<8>

Combining: t11: i64 = Constant<8>

Combining: t10: v64i1 = truncate t9
Creating new node: t22: v64i1 = BUILD_VECTOR Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>

Replacing.2 t10: v64i1 = truncate t9

With: t22: v64i1 = BUILD_VECTOR Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>


Combining: t22: v64i1 = BUILD_VECTOR Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>

Combining: t14: ch = store<(store (s64) into %ir.p11)> t0, t22, t2, undef:i64
Creating new node: t23: ch = store<(store (s64) into %ir.p11)> t0, Constant:i64<0>, t2, undef:i64
 ... into: t23: ch = store<(store (s64) into %ir.p11)> t0, Constant:i64<0>, t2, undef:i64

Combining: t13: i64 = undef

Combining: t23: ch = store<(store (s64) into %ir.p11)> t0, Constant:i64<0>, t2, undef:i64

Combining: t20: ch = TokenFactor t23, t21

Combining: t5: i64 = Constant<0>

Combining: t2: i64,ch = CopyFromReg t0, Register:i64 %0

Combining: t1: i64 = Register %0

Combining: t0: ch,glue = EntryToken
Optimized lowered selection DAG: %bb.0 'bar:'
SelectionDAG has 11 nodes:
  t0: ch,glue = EntryToken
  t2: i64,ch = CopyFromReg t0, Register:i64 %0
      t23: ch = store<(store (s64) into %ir.p11)> t0, Constant:i64<0>, t2, undef:i64
        t12: i64 = add t2, Constant:i64<8>
      t21: ch = store<(store (s64) into %ir.p2)> t0, Constant:i64<0>, t12, undef:i64
    t20: ch = TokenFactor t23, t21
  t18: ch = RISCVISD::RET_GLUE t20
wangpc updated this revision to Diff 548987.Aug 10 2023, 5:24 AM

Update test.

wangpc added inline comments.Aug 10 2023, 5:27 AM
llvm/test/CodeGen/RISCV/rvv/pr64588.ll
11

OK I know, I thought the second operand is index.
Not a problem here and I changed the test.

luke accepted this revision.Aug 10 2023, 6:10 AM

LGTM, thanks for the quick fix!

This revision is now accepted and ready to land.Aug 10 2023, 6:10 AM
wangpc added a comment.EditedAug 10 2023, 9:22 PM

The root cause of this assertion is:

  • The stacktrace is:
llvm::SelectionDAG::getConstant(llvm::ConstantInt const&, llvm::SDLoc const&, llvm::EVT, bool, bool) (llvm\lib\CodeGen\SelectionDAG\SelectionDAG.cpp:1571)
llvm::SelectionDAG::getConstant(llvm::APInt const&, llvm::SDLoc const&, llvm::EVT, bool, bool) (llvm\lib\CodeGen\SelectionDAG\SelectionDAG.cpp:1555)
llvm::TargetLowering::SimplifyDemandedBits(llvm::SDValue, llvm::APInt const&, llvm::APInt const&, llvm::KnownBits&, llvm::TargetLowering::TargetLoweringOpt&, unsigned int, bool) const (llvm\lib\CodeGen\SelectionDAG\TargetLowering.cpp:2778)
llvm::TargetLowering::SimplifyDemandedBits(llvm::SDValue, llvm::APInt const&, llvm::KnownBits&, llvm::TargetLowering::TargetLoweringOpt&, unsigned int, bool) const (llvm\lib\CodeGen\SelectionDAG\TargetLowering.cpp:646)
(anonymous namespace)::DAGCombiner::SimplifyDemandedBits(llvm::SDValue, llvm::APInt const&) (llvm\lib\CodeGen\SelectionDAG\DAGCombiner.cpp:335)
(anonymous namespace)::DAGCombiner::SimplifyDemandedBits(llvm::SDValue) (llvm\lib\CodeGen\SelectionDAG\DAGCombiner.cpp:329)
(anonymous namespace)::DAGCombiner::visitTRUNCATE(llvm::SDNode*) (llvm\lib\CodeGen\SelectionDAG\DAGCombiner.cpp:14588)
(anonymous namespace)::DAGCombiner::visit(llvm::SDNode*) (llvm\lib\CodeGen\SelectionDAG\DAGCombiner.cpp:1987)
(anonymous namespace)::DAGCombiner::combine(llvm::SDNode*) (llvm\lib\CodeGen\SelectionDAG\DAGCombiner.cpp:2064)
(anonymous namespace)::DAGCombiner::Run(llvm::CombineLevel) (llvm\lib\CodeGen\SelectionDAG\DAGCombiner.cpp:1856)
  • The code should be responsible for the assertion:

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp

SDValue SelectionDAG::getConstant(const ConstantInt &Val, const SDLoc &DL,
                                  EVT VT, bool isT, bool isO) {
  //...
  // In some cases the vector type is legal but the element type is illegal and
  // needs to be promoted, for example v8i8 on ARM.  In this case, promote the
  // inserted value (the type does not need to match the vector element type).
  // Any extra bits introduced will be truncated away.
  if (VT.isVector() && TLI->getTypeAction(*getContext(), EltVT) ==
                           TargetLowering::TypePromoteInteger) {
    EltVT = TLI->getTypeToTransformTo(*getContext(), EltVT);
    APInt NewVal = Elt->getValue().zextOrTrunc(EltVT.getSizeInBits());
    Elt = ConstantInt::get(*getContext(), NewVal);
  }
  // ...
}

Somehow we need to create a v64i1 constant with all elements is 0 when combine TRUNCATE. And element type i1 is illegal, so we get v64i64 actually.

  • DAG Combine log:
Combining: t8: v64i1 = truncate t7
Creating new node: t20: v64i1 = BUILD_VECTOR Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>

Replacing.2 t8: v64i1 = truncate t7

With: t20: v64i1 = BUILD_VECTOR Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>, Constant:i64<0>

For the test, the insertelement is a nop actually. If we change it to non-nop(we will actually insert a value), the assertion won't be triggered (because the different code path of DAGCombine is taken):

define void @assertion(ptr %p) {
  %v = insertelement <64 x i64> zeroinitializer, i64 0, i32 0
  %trunc = trunc <64 x i64> %v to <64 x i1>
  %p1 = getelementptr i8, ptr %p, i32 0
  %p2 = getelementptr i8, ptr %p, i32 8
  store <64 x i1> %trunc, ptr %p1
  store <8 x i8> zeroinitializer, ptr %p2
  ret void
}

define void @no_assertion(ptr %p) {
  %v = insertelement <64 x i64> zeroinitializer, i64 1, i32 0
  %trunc = trunc <64 x i64> %v to <64 x i1>
  %p1 = getelementptr i8, ptr %p, i32 0
  %p2 = getelementptr i8, ptr %p, i32 8
  store <64 x i1> %trunc, ptr %p1
  store <8 x i8> zeroinitializer, ptr %p2
  ret void
}

So I think this kind of IR sequences won't be common in real code.
My fix is truncating all elemments to element size. I'm not so confident that this is the right place to fix, I will wait a few days before landing this.

wangpc updated this revision to Diff 549238.Aug 10 2023, 9:22 PM
  • Rebase.
  • Update test.
wangpc updated this revision to Diff 549244.Aug 10 2023, 10:05 PM

Use zextOrTrunc.

wangpc retitled this revision from [RISCV] Truncate constants to EleSize when combine store of BUILD_VECTOR to [RISCV] zextOrTrunc constants to EleSize when combine store of BUILD_VECTOR.Aug 10 2023, 10:06 PM
wangpc edited the summary of this revision. (Show Details)
wangpc added a reviewer: bjope.
craig.topper added inline comments.Aug 10 2023, 10:14 PM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
13482

Ele -> Elt

13486–13487

Why does it need to be zextOrTrunc? Can it just be trunc? It never need to be extended right?

wangpc added inline comments.Aug 10 2023, 10:18 PM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
13486–13487

Actually I don't know, here is just defensive programming.

wangpc updated this revision to Diff 549246.Aug 10 2023, 10:23 PM

EleSize -> EltSize

wangpc retitled this revision from [RISCV] zextOrTrunc constants to EleSize when combine store of BUILD_VECTOR to [RISCV] zextOrTrunc constants to EltSize when combine store of BUILD_VECTOR.Aug 10 2023, 10:23 PM
wangpc edited the summary of this revision. (Show Details)
craig.topper added inline comments.Aug 10 2023, 10:38 PM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
13486–13487

I would use trunc so we get the assertion. If we did need to extend for some reason, I don't know that zero extend would be correct.

wangpc updated this revision to Diff 549250.Aug 10 2023, 11:10 PM

Use trunc back

wangpc retitled this revision from [RISCV] zextOrTrunc constants to EltSize when combine store of BUILD_VECTOR to [RISCV] Truncate constants to EltSize when combine store of BUILD_VECTOR.Aug 10 2023, 11:11 PM
wangpc edited the summary of this revision. (Show Details)
wangpc marked 3 inline comments as done.
wangpc added inline comments.
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
13486–13487

OK, it makes sense to me.

bjope accepted this revision.Aug 11 2023, 3:22 AM

LG, thanks!

michaelmaitland added inline comments.
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
13486–13487

What happens if we're truncating a constant that is larger than EltSize?

craig.topper added inline comments.Aug 11 2023, 10:34 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
13486–13487

BUILD_VECTOR nodes have a vector result and scalar operands. The scalar operands go through type legalization before the BUILD_VECTOR is lowered. Any scalar operand that is smaller than XLen will be type legalized to XLen by any extending. The semantics of BUILD_VECTOR is that the extra bits of the input aren't supposed to be used. They just exist to make all the types legal. It is safe to drop the extra bits here.

On RV32, it is possible for type legalization to see a vXi64 BUILD_VECTOR with i64 scalar operands that need to be legalized to i32. In that case type legalization will create a v2Xi32 BUILD_VECTOR with twice as many elements and i32 scalar operands followed by a bitcast to vXi64.

It is never possible for a BUILD_VECTOR to have a scalar operand that is smaller than the element size of the result type.

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
13486–13487

Why not change the i64 operand to BUILD_VECTOR to i32 before we get here if thats the case?

michaelmaitland accepted this revision.Aug 11 2023, 11:35 AM
michaelmaitland added inline comments.
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
13486–13487

I understand why its not i32 before we get here -- because i32 ints are not a thing on i64, but i32 vectors are, so it wouldn't make sense for an i32 scalar to come in as an operand of the BUILD_VECTOR.

LGTM.

This revision was landed with ongoing or failed builds.Aug 13 2023, 7:56 PM
This revision was automatically updated to reflect the committed changes.
wangpc marked an inline comment as done.