This is an archive of the discontinued LLVM Phabricator instance.

[SLP] Fix assert from non-constant index in insertelement
ClosedPublic

Authored by bcahoon on Feb 20 2022, 4:43 PM.

Details

Summary

A call to getInsertIndex() in getTreeCost() is returning None,
which causes an assert because a non-constant index value for
insertelement was not expected. This case occurs when the
insertelement index value is defined with a PHI.

I'm not sure if this is the best way to fix this issue, but it does
fix the assert.

Diff Detail

Event Timeline

bcahoon created this revision.Feb 20 2022, 4:43 PM
bcahoon requested review of this revision.Feb 20 2022, 4:43 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 20 2022, 4:43 PM

This is a partial revert of rG954ea0f044e0. It looks like there is still an issue with cost computation here, but I'm to fix it in a separate patch and to backport both patches to 14.0.0. Therefore looks good to me. Add @ABataev.

ABataev added inline comments.Feb 21 2022, 5:14 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5921–5922

I think we need an extra check here. If the index is undefined - continue, but if the index is not constant, need to calculate the cost as extractelement+insertelement.

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5921–5922

This is just a reversion to previous state, we can proceed with this and commit a separate patch for correct check. There is also another check we need here: that EU.Scalar is used as a first operand of insertelement.

5921–5922

As for undef index check, I'm not sure it should be processed specially: we still generate extractelement for EU.Scalar at vectorizeTree() stage.

ABataev added inline comments.Feb 21 2022, 6:06 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5921–5922

No, we don't need this check here.
I was not quite correct in my previous comment. No need to to calculate the cost as extractelement+insertelement, just extractelement.
I mean, we need to have something like this:

Optional<unsigned> Idx = getInsertIndex(VU);
if (Idx) {
  // Process as shuffle.
  ...
} else if (isa<Constant>(VY->getOperand(2))
  continue;
5921–5922

Yes, but it will optimized out by InstCombiner, since the user with undef index is just an undef.

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5921–5922

No, we don't need this check here.

What if EU.Scalar is used a second operand?

5921–5922

Sure, it will optimized in that case, but why do we need this special check for insertelement here? Why not for mul i32 ..., poison, for instance? If we want to check that the value extracted is used by poison only, we can do this generally in special place.

5921–5922

I propose to proceed with current fix and then go on further.

ABataev added inline comments.Feb 21 2022, 6:43 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5921–5922

Still need to extract it and count the cost of extract.

5921–5922

Generally speaking, it would be good to have such checks, we just can't check everything because of the compile time and amount of work and rely on InstCombiner here, that it transforms such instructions to undefs before SLP pass. We just have some artificial tests, which has undef indices, this check is mostly for such test cases.

5921–5922

We already underestimate the cost in many cases, would be good to have the proper fix.

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5921–5922

Sure. So this is another, main branch. So we need something like EU.Scalar == VU.getOperand(0) check here.

5921–5922

Ok, I see. So do you believe @bcahoon could commit this and then I add other checks?

ABataev added inline comments.Feb 21 2022, 10:05 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5921–5922

It means that the scalar is an insertelement instruction, i.e. has vector type, right? It is checked already in line 5908.

5921–5922

For the 14.0 branch? I would suggest following more conservative solution, probably:

Optional<unsigned> Idx = getInsertIndex(VU);
if (Idx) {
  // Process as shuffle.
  ...
}

i.e. do not ignore it but instead count as extractelement cost

bcahoon added inline comments.Feb 22 2022, 10:06 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5921–5922

Thanks for the reviews so far. I'm trying to figure out the next steps to fix the assert due to dereferencing the result of getInsertIndex(VU) when it returns None. What's involved with the comment,

// Process as shuffle.

Is that the existing code that appears in lines 5923-5949?

ABataev added inline comments.Feb 22 2022, 10:07 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5921–5922

Yes

bcahoon added inline comments.Feb 22 2022, 11:13 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5921–5922

Isn't that the behavior of the patch? The existing code is skipped if (!Idx), otherwise Idx contains a valid value, so execute the existing code. Does something else need to be added?

ABataev added inline comments.Feb 22 2022, 11:16 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5921–5922

No, the patch does continue, i.e. skips everything, but instead it should fallback to the cost for extractelement.

bcahoon added inline comments.Feb 22 2022, 11:28 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5921–5922

Ah, I think I understand. When getInsertValue returns None, the lines 5962-5970 should still occurs rather than being skipped. Those lines are performing the "cost for extractelement"? I will try that.

ABataev added inline comments.Feb 22 2022, 11:30 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
5921–5922

Yep, you got it right.

bcahoon updated this revision to Diff 410614.Feb 22 2022, 12:55 PM

Changes based upon review. Don't just skip over insertelements with a
non-constant index. Instead, fall through to include additional cost
calculation.

This revision is now accepted and ready to land.Feb 22 2022, 12:56 PM
This revision was landed with ongoing or failed builds.Feb 22 2022, 2:01 PM
This revision was automatically updated to reflect the committed changes.
bcahoon marked an inline comment as done.