This change enables cases for which the index value for the first load/store instruction
in a pair could be a function argument. This allows using llvm.assume to provide known
bits information in such cases.
Details
Diff Detail
Event Timeline
llvm/test/Transforms/LoadStoreVectorizer/X86/vectorize-i8-nested-add.ll | ||
---|---|---|
400 | I've realized this should be "v2i8" in the test name. |
llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp | ||
---|---|---|
521 | By the way, we have discussed some of these points with @rtereshin and @bogner Neither of OpA and OpB is a point of the vectorization. They both dominate the point of vectorization, but there's no information here about which one is closer to it. This change only enables more cases to pass: if OpB is nullptr then it means we have bailed long ago in the function: // At this point A could be a function parameter, i.e. not an instruction Value *ValA = OpA->getOperand(0); OpB = dyn_cast<Instruction>(OpB->getOperand(0)); if (!OpB || ValA->getType() != OpB->getType()) return false; ) I just allow it not to bail if OpA appears to be nullptr. There's no assumption as to which one goes at the point or closer to the point of vectorization. Also without llvm.assume I couldn't differentiate the two contexts in a test. And I added a test with the llvm.assume - ld_v2i8_add_different_contexts. Now, sometimes OpA might be with a better context for the calculation of known bits. In such cases we might fail to vectorize. If llvm.assume's are inconsistent in the two points then this ought to be an UB. For an improvement in the future we might need to use the actual vectorization point as the context. |
llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp | ||
---|---|---|
521 | Don't you think ld_v2i8_add_different_contexts would count as one, or do you prefer a test without a control flow in it? | |
521 |
Having said that, if a case is a result of unrolling, in many such occasions, OpB indeed naturally goes after OpA. And that means the context is more precise |
llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp | ||
---|---|---|
521 | ld_v2i8_add_different_contexts does not have an assume between the two loads, so I don't mean that |
llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp | ||
---|---|---|
521 | I see some unexpected results when I place llvm.assume between loads, or just before the first load. A safer version could be to still use OpA if it's not nullptr and OpB if it is. |
llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp | ||
---|---|---|
521 | In fact llvm.assume itself did block the transformation as it was considered to be writing to the memory. |
Added tests where the assume intrinsic is placed between the vectorized instructions or even after them in the end of the basic block.
Added assume intrinsic to the list of ignored instructions when detecting whether it's legal to vectorize over an instruction which may read or write to memory.
Update the test ld_v2i8_add_context to align it with the comment put for the test: the llvm.assume should be placed between the two loads.
llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp | ||
---|---|---|
521 | Now the tests ld_v2i8_add_different_contexts1 and ld_v2i8_add_context have llvm.assume between the vectorized loads. |
llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp | ||
---|---|---|
521 | I've added the tests along with the patch to let them show a difference (ignore llvm.assume itself when checking on memory conflicts between the two loads). Are there more concerns? |
llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp | ||
---|---|---|
678 | isa<AssumeInst> |
llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp | ||
---|---|---|
678 | Thanks! This could be addressed later on with the other intrinsics where that is applicable. |
How do we know A is the point the vectorized instruction will be inserted, and not B?