It's possible for a scalar bitcast takes vector input and the current
LoopVectorize cannot manage it correctly and crash. Because a scalar
loop only contains scalar instructions that cannot update vectors, the
first vector operand of such bitcasts must be loop-invariant and should
be uniform.
Details
- Reviewers
reames fhahn craig.topper
Diff Detail
Event Timeline
Any suggestion on that? IIUC there's also some inconsistency as the current legality check fails if ExtractElementInst is found:
// Check that the instruction return type is vectorizable. // Also, we can't vectorize extractelement instructions. if ((!VectorType::isValidElementType(I.getType()) && !I.getType()->isVoidTy()) || isa<ExtractElementInst>(I)) { reportVectorizationFailure("Found unvectorizable type", "instruction return type cannot be vectorized", "CantVectorizeInstructionReturnType", ORE, TheLoop, &I); return false; }
which I suppose is uniform.
For context. loading i64 and loading <8 x i8> and bitcasting to i64 should be the same thing.
IIUC there's also some inconsistency as the current legality check fails if ExtractElementInst is found:
// Check that the instruction return type is vectorizable. // Also, we can't vectorize extractelement instructions. if ((!VectorType::isValidElementType(I.getType()) && !I.getType()->isVoidTy()) || isa<ExtractElementInst>(I)) { reportVectorizationFailure("Found unvectorizable type", "instruction return type cannot be vectorized", "CantVectorizeInstructionReturnType", ORE, TheLoop, &I); return false; }which I suppose is uniform.
I'm not really familiar with this area, so i can't quite help with this.
Let me try to catch your point, the key difference here is loading <8 x i8> cannot appear in a scalar loop and the legality check will ensure that. Thus the vector operand of bitcast must be defined outside the loop.
llvm/lib/Transforms/Vectorize/LoopVectorize.cpp | ||
---|---|---|
4736 | This should use isOutOfScope like the ExtractValueInst case? It would also be good to add a test case where the vector bit cast is loop-dependent. At the moment, we bail out on any instruction with a vector return type, but it would be good to have a case if the restriction gets removed some day. | |
4738 | No need to dump there I think, the message will be printed once the entry in the worklist has been processed. | |
llvm/test/Transforms/LoopVectorize/vector2scalar-bitcast.ll | ||
1 | the test doesn't requires asserts. | |
2 | This is missing CHECK lines? Should probably check at least the vector code generated. | |
4 | not needed | |
5 | dso_local not needed. | |
8 | those checks should not be needed | |
11 | no need for the zext here, could either use constant or pass %n as i64. | |
14 | for easier readability, move the block to the end and call something like exit:? | |
19 | indvars.* prefix makes the names unnecessarily long, could be dropped. | |
21 | Could you also add a variant where %illegal is used in the loop? |
This should use isOutOfScope like the ExtractValueInst case?
It would also be good to add a test case where the vector bit cast is loop-dependent. At the moment, we bail out on any instruction with a vector return type, but it would be good to have a case if the restriction gets removed some day.