Use shufflevector to extract the subvectors. This allows considerably more
load merging on AMDGPU, and also on NVPTX when <2 x half> is involved.
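As a rough illustration of the idea (not the patch's actual code): a subvector extract can be expressed as a shufflevector whose mask is a run of consecutive lane indices. The sketch below emulates that on plain C++ vectors. `subvectorMask` and `applyShuffle` are hypothetical names made up for this example, and no LLVM API is called.

```cpp
#include <cassert>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: build the mask that selects `numElts` consecutive
// lanes of a wider vector, starting at lane `firstLane`.
std::vector<int> subvectorMask(int firstLane, int numElts) {
  std::vector<int> mask(static_cast<std::size_t>(numElts));
  std::iota(mask.begin(), mask.end(), firstLane); // firstLane, firstLane+1, ...
  return mask;
}

// Emulates a single-input shufflevector on plain data:
// result[i] = wide[mask[i]].
template <typename T>
std::vector<T> applyShuffle(const std::vector<T> &wide,
                            const std::vector<int> &mask) {
  std::vector<T> out;
  out.reserve(mask.size());
  for (int lane : mask) {
    // Every mask entry must name a lane that exists in the wide vector.
    assert(lane >= 0 && static_cast<std::size_t>(lane) < wide.size());
    out.push_back(wide[static_cast<std::size_t>(lane)]);
  }
  return out;
}
```

For a merged 4-lane value, the masks {0, 1} and {2, 3} recover the two original 2-lane halves, which is the shape of extract needed when two <2 x half> loads are merged into one wider load.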
Details
Diff Detail
- Repository: rG LLVM Github Monorepo
Event Timeline
Thanks, I completely forgot that this didn't handle this case.
llvm/lib/Transforms/Vectorize/LoadStoreVectorizer.cpp | ||
---|---|---|
1303 | Too many spaces before In? | |
llvm/test/Transforms/LoadStoreVectorizer/NVPTX/4x2xhalf.ll | ||
5 | Should use named values in tests | |
33 | Should add a larger variety of tests with more vector widths, particularly some 3-element vectors |
llvm/test/Transforms/LoadStoreVectorizer/NVPTX/4x2xhalf.ll | ||
---|---|---|
33 | Anything in particular you're looking for? Currently the pass also has the limitation of only creating vector sizes that are a power of 2, which means there is no interesting case for 3-element vectors. |
llvm/test/Transforms/LoadStoreVectorizer/NVPTX/4x2xhalf.ll | ||
---|---|---|
33 | Not handling those is of interest in itself? Vectors of pointers are always an interesting source of edge cases. |
llvm/test/Transforms/LoadStoreVectorizer/NVPTX/4x2xhalf.ll | ||
---|---|---|
33 | I don't think that's a particularly valuable test, but added some. |
llvm/test/Transforms/LoadStoreVectorizer/NVPTX/4x2xhalf.ll | ||
---|---|---|
95–96 | I would have expected this to vectorize |
llvm/test/Transforms/LoadStoreVectorizer/NVPTX/4x2xhalf.ll | ||
---|---|---|
95–96 | The pass explicitly disallows vectors of pointers. It shouldn't be hard to turn that on, though; I guess it didn't really matter so far. |