This is an archive of the discontinued LLVM Phabricator instance.

[SLP] Avoid converting undef to poison when gathering.
AbandonedPublic

Authored by vporpo on Jun 2 2022, 7:48 PM.

Details

Summary

The reduced test case is in phi-undef-input.ll

We expect SLP to generate code like this:

entry:
  br i1 %cond, label %bb2, label %bb3
bb2:
  br label %bb3
bb3:
  %phi = phi <2 x i8> [ %arg0, %bb2 ], [ <i8 0, i8 undef> %entry ]
  %zext = zext <2 x i8> %phi to <2 x i32>

If we get to bb3 through the entry block, then %zext lane 1 should have its high-order bits 0.

But instead it generates a poison in phi's input coming from %entry:

entry:
  br i1 %cond, label %bb2, label %bb3
bb2:
  br label %bb3
bb3:
  %phi = phi <2 x i8> [ %arg0, %bb2 ], [ <i8 0, i8 poison> %entry ]
  %zext = zext <2 x i8> %phi to <2 x i32>

This time if we get to bb3 through the entry block, then %zext lane 1 is poison, so no guarantees about the high-order bits.

Please also check: https://alive2.llvm.org/ce/z/aMa3qD

Diff Detail

Event Timeline

vporpo created this revision.Jun 2 2022, 7:48 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 2 2022, 7:48 PM
vporpo requested review of this revision.Jun 2 2022, 7:48 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 2 2022, 7:48 PM
vporpo added a comment.Jun 2 2022, 7:51 PM

Nuno was right, the previous patch was over reduced. The issue happens when the input is reachable and undef. I think in this case SLP is actually changing the program behavior.

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7559

@ABataev I am not sure this is the right fix for it, please take a look.

nlopes added a comment.Jun 3 2022, 2:22 AM

That's a bug, yes. I won't comment on the fix as I don't know enough about SLP's code.

ABataev added inline comments.Jun 3 2022, 4:14 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7559

I think you need to drop && NumValues == 1 check, otherwise we may miss something like undef, 0, under, undef

vporpo updated this revision to Diff 434139.Jun 3 2022, 2:05 PM

I think this fix should work better.

MaskRay accepted this revision.EditedJun 3 2022, 3:04 PM

Better to have @ABataev's approval.

Was the issue introduced by D103458?

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7564–7567
7566

Note: I have tried isa<PoisonValue>(V) ? PoisonValue::get(VL[0]->getType()) : UndefValue::get(VL[0]->getType()) but that seems to break a number of tests.

This revision is now accepted and ready to land.Jun 3 2022, 3:04 PM

Better to have @ABataev's approval.

Was the issue introduced by D103458?

Looks so

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7565

Why just not UniqueValue.append(std::next(VL.begin(), VF-UniqueValues.size()), VL.end());?

vporpo marked 2 inline comments as done.Jun 3 2022, 5:22 PM

Thanks @nlopes for confirming that this is a bug.

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7565

Yes, that should work.

7566

I changed it to what Alexey suggested, it may work better.

vporpo updated this revision to Diff 434215.Jun 3 2022, 5:24 PM
vporpo marked an inline comment as done.

Updated the fix. It seems to be generating broadcasts with insertelements now instead of shuffles.

Please change the summary (commit message) to describe the commit caused regression.
It helps other rolling-update users to find a stable state in llvm.

llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7564–7567

Better to reword the comment what this append does, and how it avoids appending poison.

The new code isn't clear how poison is avoided (while the old is).

vporpo updated this revision to Diff 434231.Jun 3 2022, 6:30 PM
vporpo marked an inline comment as done.

Addressed comments.

hvdijk added a comment.Jun 3 2022, 9:44 PM

The updates to the tests suggest that this doesn't just fix the broken cases, it generates worse code where the previously generated code was already correct: I'm seeing insertelements into a vector immediately followed by a shufflevector that discards the inserted element, where previously those elements were not set. Is there some way that we can avoid that?

nlopes added inline comments.Jun 4 2022, 12:59 AM
llvm/test/Transforms/SLPVectorizer/X86/slp-fma-loss.ll
55

I second @hvdijk's comment. Your last update to the patch introduced significant regressions.
This function was just fine and shouldn't be affected by the patch.

ABataev added inline comments.Jun 4 2022, 3:07 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7565

Try this to fix the regressions:

UniqueValues.append(NumValues-UniqueValues.size(), PoisonValue::get(VL[0]->getType())); 
UniqueValue.append(std::next(VL.begin(), NumValues), VL.end());
ABataev added inline comments.Jun 4 2022, 3:21 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7565

Though, I think, it still may lead to incorrect results. We'd better to keep the original VL here or just do something like:

UniqueValues.assign(VL.begin(), VL end());
hvdijk added inline comments.Jun 4 2022, 7:26 PM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7530

What if we try this the other way around? We have the initialisation of NumValues here to find the last non-undef value and then assume we do not need to look beyond that. What if we look for the last non-poison value there, by changing this to return !isa<PoisonValue>(V); and treat undef as just another constant that needs to be preserved? (The isa<UndefValue> inside the next loop would also need to be changed to isa<PoisonValue>.) That appears to fix your test just as well, and the impact on other existing SLPVectorizer tests is far less great. Does that also fix your original test?

ABataev added inline comments.Jun 4 2022, 7:38 PM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7530

It won't help, there is still a bug, better just to copy VL in the corner case.

hvdijk added inline comments.Jun 5 2022, 2:54 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7530

Why won't it help? It means we only reach the append of PoisonValue when we actually want to append PoisonValue, so we won't need to fix things up there, does it not?

ABataev added inline comments.Jun 5 2022, 4:08 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7530

UniqueValues may have different order of values than VL, it is not enough just append the tail, also need to reorder values. It is easier just to copy the whole VL.

hvdijk added inline comments.Jun 5 2022, 4:13 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7530

Yeah, but if we treat UndefValue as just another constant, the logic for handling all other constants already covers that too. I think I may not be explaining accurately what I mean, I can put that up as a separate change later to make it clearer.

ABataev added inline comments.Jun 5 2022, 4:58 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7530

Yes, possible. Just not sure if we need it. I think it is just 3nough to use original VL in tgis corner case, other cases should be handled by UndefMaskElem.

hvdijk added inline comments.Jun 5 2022, 5:09 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7530

Should be, but aren't (at least not always) because when we get to ShuffleInstructionBuilder::finalize, we check ShuffleVectorInst::isIdentityMask which can return true for masks containing undef despite the fact that it's not actually an identity mask, and the correct shufflevector we would otherwise generate gets optimised away, so we again let poison through. We avoid that whole problem if we don't use undef masks to get undef values.

ABataev added inline comments.Jun 5 2022, 5:30 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7530

Ah, yes, you're right. Ok, go ahead with your solution.

hvdijk added inline comments.
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7530

Okay, I have created D127073 for that. I would encourage @vporpo to check if the other approach can be made to have a smaller impact on tests though, and then go with whatever shows the best results for tests.

vporpo abandoned this revision.Jun 6 2022, 10:22 AM

Thanks everyone for taking a look. Yeah this patch is broken, we can go ahead with whichever version works best. I just pushed the test and will abandon this revision.