Currently, AArch64 doesn't support vectorization of non-temporal loads because isLegalNTLoad is not implemented for the target.
This patch adds functionality similar to D73158, but for non-temporal loads.
Repository: rG LLVM Github Monorepo
Event Timeline
Could you please update the description to provide some details about the fix? It will also need to build on top of D131773, which adds support for generating LDNP for types smaller than 256 bits.
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h | ||
---|---|---|
329 | This can be dyn_cast<FixedVectorType>, which avoids the need for the cast<..> below. The same thing can be done above in isLegalNTStore too. |
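The suggestion above relies on dyn_cast<> returning a null pointer when the type does not match, so a single checked cast can replace a type test followed by cast<>. The following is a minimal standalone sketch of that pattern, using standard C++ dynamic_cast as a stand-in for LLVM's dyn_cast<>. TypeModel, ScalarModel, FixedVectorModel and both helper functions are hypothetical names made up for illustration; none of this is code from the patch.

```cpp
#include <cassert>

// Hypothetical stand-ins for a small type hierarchy; these are NOT LLVM classes.
struct TypeModel {
  virtual ~TypeModel() = default;
};
struct ScalarModel : TypeModel {};
struct FixedVectorModel : TypeModel {
  explicit FixedVectorModel(unsigned N) : NumElements(N) {}
  unsigned NumElements; // number of lanes in the vector
};

// Check-then-cast style: the type is tested first and cast a second time.
inline unsigned lanesCheckThenCast(const TypeModel &Ty) {
  if (dynamic_cast<const FixedVectorModel *>(&Ty) != nullptr) {
    const auto &VT = static_cast<const FixedVectorModel &>(Ty);
    return VT.NumElements;
  }
  return 0; // not a fixed vector
}

// Single checked cast: the test and the cast happen in one step, and the
// typed pointer is only in scope where it is known to be valid.
inline unsigned lanesSingleCast(const TypeModel &Ty) {
  if (const auto *VT = dynamic_cast<const FixedVectorModel *>(&Ty))
    return VT->NumElements;
  return 0; // not a fixed vector
}
```

Both helpers return the same result for every input; the second form simply removes the redundant cast.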
This sounds good to me if Florian has no further comments.
llvm/test/Transforms/LoopVectorize/AArch64/nontemporal-load-store.ll | ||
---|---|---|
283–284 | Is it worth adding a quick check line for the generated code, e.g. ; CHECK: = load <16 x i8>, <16 x i8>* {{.*}}, align 1, !nontemporal !0 so that the new load is tested? |
Refactor the common code between isLegalNTStore and isLegalNTLoad and move it to a separate function.
Add extra checks in nontemporal-load-store.ll to look for the generated load instruction.
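The shape of this refactoring can be sketched in standalone C++ as one shared helper with two thin wrappers. This is only an illustration under stated assumptions: FixedVecModel, the *Model function names, and the BaseAnswer parameter (standing in for the call to the base-class implementation) are hypothetical, and the specific legality thresholds are assumed for the example rather than taken from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical stand-in for a fixed-width vector type; NOT an LLVM class.
struct FixedVecModel {
  unsigned NumElements; // number of lanes
  unsigned EltSizeBits; // size of one lane in bits
};

// True if V is a non-zero power of two.
inline bool isPowerOfTwo(uint64_t V) { return V != 0 && (V & (V - 1)) == 0; }

// Shared helper. Assumed rule for illustration: a vector qualifies when it
// has a power-of-two lane count greater than 1 and a power-of-two lane size
// between 8 and 128 bits. Non-vector types (std::nullopt) defer to
// BaseAnswer, which models the generic base-class result.
inline bool isLegalNTStoreLoadModel(std::optional<FixedVecModel> Ty,
                                    bool BaseAnswer) {
  if (Ty) {
    return Ty->NumElements > 1 && isPowerOfTwo(Ty->NumElements) &&
           Ty->EltSizeBits >= 8 && Ty->EltSizeBits <= 128 &&
           isPowerOfTwo(Ty->EltSizeBits);
  }
  return BaseAnswer;
}

// Thin wrappers: both queries share the same rule, so neither duplicates it.
inline bool isLegalNTStoreModel(std::optional<FixedVecModel> Ty,
                                bool BaseAnswer) {
  return isLegalNTStoreLoadModel(Ty, BaseAnswer);
}
inline bool isLegalNTLoadModel(std::optional<FixedVecModel> Ty,
                               bool BaseAnswer) {
  return isLegalNTStoreLoadModel(Ty, BaseAnswer);
}
```

Under these assumed thresholds, a 16-lane vector of 8-bit elements (the shape of the <16 x i8> load checked in the test) is accepted, while a 3-lane vector is rejected regardless of what the base answer would be.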
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h | ||
---|---|---|
324 | Could this also be folded into isLegalNTStoreLoad? The same applies to return BaseT::isLegalNTStore(DataType, Alignment); |
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h | ||
---|---|---|
312 | This should also be moved to isLegalNTStoreLoad. |
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h | ||
---|---|---|
312 | This line only, or the whole comment block? |
This should also be moved to isLegalNTStoreLoad.