Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp | ||
---|---|---|
1029–1030 | Can VecTy be changed to FixedVectorType*? | |
llvm/lib/Transforms/Vectorize/VectorCombine.cpp | ||
431 | IIUC, we can't safely cast to FixedVectorType at this point (the dyn_cast may have failed). Should we add a test like this: define <vscale x 4 x float> @scalable_bitcast_same_elt_size(<vscale x 4 x i32> %v) { %shuf = shufflevector <vscale x 4 x i32> %v, <vscale x 4 x i32> undef, <vscale x 4 x i32> zeroinitializer %r = bitcast <vscale x 4 x i32> %shuf to <vscale x 4 x float> ret <vscale x 4 x float> %r } I would add that in a preliminary commit myself, but it already crashes somewhere in the TTI cost model. |
llvm/lib/Transforms/Vectorize/VectorCombine.cpp | ||
---|---|---|
431 | I can take a look at adding this test case and seeing if I can get to this line with a scalable vector. |
llvm/lib/Transforms/Vectorize/VectorCombine.cpp | ||
---|---|---|
431 | Looking at this more closely, it occurs to me that the TTI stuff is completely unimplemented for scalable vectors currently, so I'm not going to be able to add this test case. That said, for now, the scope of the cast to FixedVectorType can be reduced to just the two calls to getNumElements(). In principle, it should be possible to implement this function in terms of ElementCount. I'll add a fixme for this. I think this should be safe, assuming it passes all the tests. |
llvm/lib/Transforms/Vectorize/VectorCombine.cpp | ||
---|---|---|
431 | I think there's still a potential bug in doing the plain cast<> even if we can't show it in a test at this time. auto *SrcTy = dyn_cast<FixedVectorType>(V->getType()); if (!DestTy || !SrcTy || ...) |
llvm/lib/Transforms/Vectorize/VectorCombine.cpp | ||
---|---|---|
431 | What is the bug? V must be nonnull because the matcher succeeded, and it must be a value with type VectorType, because it is the first argument to a shufflevector. What am I missing? |
llvm/lib/Transforms/Vectorize/VectorCombine.cpp | ||
---|---|---|
431 | The diff has changed, so the above code is fine now. But the diff below is nakedly casting to FixedVectorType. How did we ensure that our generic VectorType values actually are FixedVectorType? Ie, if we re-arrange this code for some reason to move the TTI checks after the diff below, then my test example will crash on this line: unsigned DestNumElts = cast<FixedVectorType>(DestTy)->getNumElements(); |
llvm/lib/Transforms/Vectorize/VectorCombine.cpp | ||
---|---|---|
431 | Oh yeah, sorry about that. I posted the comments before I pushed the diff. To answer your question, nothing prevents foldBitcastShuf from being called with a scalable vector. It just happens to not happen yet. Prior to this change, the code just does the wrong thing for scalable vectors. It might just happen to work, given how shuffle masks for scalable vectors work, but if masks other than all 0, or all -1 are ever added, this will break down. For all these patches to remove calls to getNumElements(), the strategy has been to assume that existing calls to getNumElements() that don't check if the vector is scalable are explicitly expecting the vector to have fixed width. All these calls are being changed to unconditionally cast to FixedVectorType. If the existing tests don't break, then it must be the case that the assumption was correct. I acknowledge that on some level, substituting a miscompile with a crash is actually a behavior change. However, my goal has been to prevent different control flow paths from being taken after the point of the bug. At the point of a call to getNumElements(), either the compiler will crash if it gets a scalable vector, or it will behave the same as it did before. For this case, TTI.getShuffleCost will already crash if it DestTy or SrcTy are scalable vectors, so this really is an NFC. |
llvm/lib/Transforms/Vectorize/VectorCombine.cpp | ||
---|---|---|
431 | Thanks for explaining. I'm ok with that reasoning. And as noted, we're going to crash in TTI if anyone tries to push scalable vector code through here or the other vector passes. Note: Since I became aware of the scalable vector changes that you've been making, I have tried to be safer by using dyn_cast<FixedVectorType> in this pass (because I don't know if the fixed vector transforms I'm looking at translate to scalable). So I'd have gone with some variation of the earlier version of this diff to avoid being blamed for crashing, but we can leave this as-is if that makes it easier to get scalable vectors off the ground. |
Can VecTy be changed to FixedVectorType*?