Page MenuHomePhabricator

[SLP] Treat undef as any other constant
Needs ReviewPublic

Authored by hvdijk on Jun 5 2022, 7:51 AM.

Details

Summary

vporpo's phi-undef-input.ll shows how we would build vectors with poison where we wanted undef. Rather than continuing to treat undef as a special case and fixing the handling of it, this change fixes the problem by limiting the special casing to poison. undef is now treated as any other constant, by selecting from a specific input, and preserved that way.

This is an alternative to D126939.

Diff Detail

Event Timeline

hvdijk created this revision.Jun 5 2022, 7:51 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 5 2022, 7:51 AM
hvdijk requested review of this revision.Jun 5 2022, 7:51 AM
ABataev added inline comments.Jun 5 2022, 8:03 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7598

For undefs need to push UndefMaskElem:

ReuseShuffleIndicies.emplace_back(isa<UndefValue>(V) ? UndefMaskElem : UniqueValues.size());
hvdijk added inline comments.Jun 5 2022, 8:05 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7598

I commented in D126939 that if we use UndefMaskElem when we want undef results, we run into another issue later on where we mistakenly treat a shuffle mask as an identity mask and remove the shuffle.

ABataev added inline comments.Jun 5 2022, 8:08 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7598

But here it should be a case? You're pushing original UndefValue to the UniqueValues vector. So, you have undervalue in the vector and can have UndefMaskElem for it in the mask.

hvdijk added inline comments.Jun 5 2022, 8:13 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7598

I'm pushing UndefValue in the UniqueValues vector, and its index into ReuseShuffleIndicies, to ensure that that UndefValue I just pushed gets used. If we push UndefMaskElem into ReuseShuffleIndicies (that's what we did before when undef was covered by the earlier if block), we may end up getting another value from UniqueValues instead, and that other value from UniqueValues may be poison.

hvdijk added inline comments.Jun 5 2022, 8:18 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7598

A concrete case may be easier to talk about: consider what happens with your idea when we want to build a vector [0, poison, poison, undef]. We get UniqueValues = [0, undef], and ReuseShuffleIndicies = [0, undef, undef, undef]. We then pad UniqueValues to get [0, undef, poison, poison], and interpret the ReuseShuffleIndicies as an identity mask. Thus we actually build a different vector from what we wanted. By preserving the index to get ReuseShuffleIndicies = [0, undef, undef, 1], we get a correct result.

ABataev added inline comments.Jun 5 2022, 8:36 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7598

But it should not be treated as an identity mask since it changes the size. If it does, it is a bug in finalize function.

hvdijk added inline comments.Jun 5 2022, 8:40 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7598

It does not change the size? The shuffle's operand (if it would actually be created) would be [0, undef, poison, poison] of size 4, with a shuffle mask of [0, undef, undef, undef], also of size 4.

ABataev added inline comments.Jun 5 2022, 8:45 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7598

But it should not. This vector should be built as is, without an extra shuffle, just a constant buildvector of the elements in their original order.

hvdijk added inline comments.Jun 5 2022, 8:59 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7598

Oh, you're right, I picked a bad example. If all inputs are constant we should end up with a constant, sure. And we also have special logic when we have a single non-constant value. That's separate from the undef/poison problem we're looking at though, please imagine what happens when we try to build [%a, %b, poison, undef] then (and I hope I picked a better example now): we get UniqueValues = [%a, %b, undef], ReuseShuffleIndicies = [0, 1, undef, undef], pad UniqueValues to [%a, %b, undef, poison], interpret [0, 1, undef, undef] as an identity mask, and get a wrong result. By preserving the index for undef we get ReuseShuffleIndicies = [0, 1, undef, 2], which gives a right result.

ABataev added inline comments.Jun 5 2022, 9:08 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7598

But here we also do not need a mask, actually, we need to build a vector in its original order, without extra shuffle.

hvdijk added inline comments.Jun 5 2022, 9:24 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7598

That would be better, sure. I don't think it's one we would do for that example (but I could be wrong), and a patch that fixes the poison/undef bug while also optimising that example would be better than this patch. We don't have that yet, but I'm happy to wait a little bit to see if someone (whether you, me, or someone else) can figure out how to do that.

ABataev added inline comments.Jun 5 2022, 9:33 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7598

I can do it tomorrow, if it is ok.

hvdijk added inline comments.Jun 5 2022, 9:53 AM
llvm/lib/Transforms/Vectorize/SLPVectorizer.cpp
7598

Thanks, much appreciated. If I can figure that out today, I'll update, otherwise I'll wait for you.

Thanks. I like this version which is cleaner:)

nlopes added inline comments.Jun 5 2022, 11:41 AM
llvm/test/Transforms/SLPVectorizer/X86/diamond_broadcast_extra_shuffle.ll
120

this is a regression. poison was just fine there

121

this is also a regression.
Before we were doing a broadcast of %ld and doing mul %ld, %ld in all 4 lanes, as the undefs were replaced with %ld.

hvdijk added inline comments.Jun 5 2022, 12:13 PM
llvm/test/Transforms/SLPVectorizer/X86/diamond_broadcast_extra_shuffle.ll
121

Replacing undef with %ld is only valid if we can prove %ld is not poison, and we have no checks for that. It might be valid here, but it is not valid in general, is it?

nlopes added inline comments.Jun 5 2022, 1:39 PM
llvm/test/Transforms/SLPVectorizer/X86/diamond_broadcast_extra_shuffle.ll
121

True in general. Here it's fine because %ld is already present in the expression tree.
Plus these days clang emits the noundef attribute, so it's worth checking for it.

FWIW, the good news is that there's no information loss, so we can have instcombine recover from some of these patterns. But I would suggest checking for isGuaranteedNotToBePoison() (cf ValueTracking).