This is an archive of the discontinued LLVM Phabricator instance.

[ConstantFold][SVE] Fix constant folding for bitcast.
ClosedPublic

Authored by huihuiz on Dec 11 2019, 9:06 PM.

Details

Summary

Do not iterate on scalable vector type in BitCastConstantVector.
Continuation work of D70985, D71147.

Add support to fold bitcast into splat value for splat vector.

Diff Detail

Event Timeline

huihuiz created this revision.Dec 11 2019, 9:06 PM

Current upstream crash with llvm/lib/IR/Value.cpp:404: void llvm::Value::doRAUW(llvm::Value *, llvm::Value::ReplaceMetadataUses): Assertion `New->getType() == getType() && "replaceAllUses of value with new value of different type!"' failed.

run: opt -S -constprop bitcast.ll -o -

bitcast.ll

define <vscale x 4 x float> @bitcast_scalable_constant() {
  %i1 = insertelement <vscale x 4 x i32> undef, i32 1, i32 0
  %i2 = shufflevector <vscale x 4 x i32> %i1, <vscale x 4 x i32> undef, <vscale x 4 x i32> zeroinitializer
  %i3 = bitcast <vscale x 4 x i32> %i2 to <vscale x 4 x float>
  ret <vscale x 4 x float> %i3
}
spatel accepted this revision.Dec 12 2019, 4:41 AM

LGTM. I don't know how many of these patches are needed, but it would be slightly more efficient to create a "vscale.ll" test file and put all of the related tests in that 1 file.

This revision is now accepted and ready to land.Dec 12 2019, 4:41 AM
sdesmalen added inline comments.Dec 12 2019, 5:14 AM
llvm/lib/IR/ConstantFold.cpp
52

Do we want to add support for a special case here for splat operations (like the one in the test)?

huihuiz marked an inline comment as done.Dec 12 2019, 11:54 AM

LGTM. I don't know how many of these patches are needed, but it would be slightly more efficient to create a "vscale.ll" test file and put all of the related tests in that 1 file.

Thank you Sanjay!
Yes, I will try my best to combine them. The current code base has a lot of assumption based on fixed length vector.
I am going through the IR passes, and try to fix them to the best I can.

llvm/lib/IR/ConstantFold.cpp
52

Actually we can not, we will end up in scalable vector constant.
Instead of <4 x i32> <i32 1, i32 1, i32 1, i32 1>, we will need a form like "<vscale x 4 x i32> [?]"

I don't know how to express scalable vector constant in current upstream, please let me know if I get anything wrong?

efriedma added inline comments.Dec 12 2019, 12:17 PM
llvm/lib/IR/ConstantFold.cpp
52

I think the suggestion is to special-case the specific sequence of insertelement+splatvector+bitcast, and bitcast the operand of the insertelement. Not sure how useful that would be in general, though.

spatel added inline comments.Dec 12 2019, 12:40 PM
llvm/lib/IR/ConstantFold.cpp
52

If we want to do that transform, it would go in instsimplify since we're doing a multi-instruction analysis/fold that becomes a constant.

spatel added inline comments.Dec 12 2019, 12:42 PM
llvm/lib/IR/ConstantFold.cpp
52

Sorry - might've misread that. If we're creating a new bitcast, that would have to go in instcombine (we don't create new instructions in instsimplify).

Yes, this transformation will need to go in InstCombine.
We should probably implement this when constant scalable vector support is ready, so that the effect will be most obvious.

efriedma added inline comments.Dec 12 2019, 1:30 PM
llvm/lib/IR/ConstantFold.cpp
52

By the time we get here, the shuffle and insertelement are ConstantExprs, not instructions.

sdesmalen added inline comments.Dec 13 2019, 2:57 AM
llvm/lib/IR/ConstantFold.cpp
52

In terms of usefulness, this would make it easier to determine whether the value is a splat when calling Constant::getSplatValue without having to look through the bitcast first.

huihuiz updated this revision to Diff 234432.Dec 17 2019, 5:32 PM

Add special case handling:

For splat vector, fold bitcast to splat value.

Thanks for adding the extra constant fold @huihuiz!

llvm/lib/IR/ConstantFold.cpp
580

Because we're matching for a zero mask, explicitly matching m_Undef here seems unnecessarily restrictive, m_Value() should be sufficient.

587

Can we match this value directly with m_Zero(Zero), rather than needing auto InsertElem and InsertElem->getOperand(2) ?

590

Can we match the value for CE->getOperand(1) value directly with something like m_Value(Vec2). I'd expect CE->getOperand(2) can reuse Zero as suggested above.

bjope added a subscriber: bjope.Dec 18 2019, 6:52 AM
bjope added inline comments.
llvm/lib/IR/ConstantFold.cpp
572

Don't you need some check that both sides of the bitcast are vectors with matching number of elements?

huihuiz updated this revision to Diff 234639.Dec 18 2019, 4:46 PM
huihuiz marked 4 inline comments as done.
huihuiz edited the summary of this revision. (Show Details)

Clean up code based on reviewer feedback.

llvm/lib/IR/ConstantFold.cpp
580

Yes, we don't need to restrict it to m_Undef().

I think we should restrict it with m_Constant(). I would expect operands to be constant at ConstantFoldCastInstruction().
Doing this can also help avoid unnecessary cast<constant> when calling ConstantExpr to get new InsertElement or cast operands into new types.

587

Current pattern matcher doesn't support m_Zero(Zero).
I use m_CombineAnd(m_Zero(), m_Constant(Zero)) instead.

This revision was automatically updated to reflect the committed changes.