Diff Detail
Event Timeline
lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
6402 | I don't think this is correct. Instead, you should try to collect GEPs with the different address spaces into different nodes |
lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
6402 | That's what I thought at first, but his isn't vectorizing the GEP itself as far as I can see. It only cares about the index operation, which could be the same size between different address spaces |
lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
6402 | Yes, I see now. But I still think this not quite correct, just like the original code. As I understand, it tries to vectorize indices with the same base. So, when we gather GEPs, we just don't need to use GetUnderlyingObject(GEP->getPointerOperand(), *DL) as the key, just GEP->getPointerOperand(). I think, we should not dig deep into the base of the GEP. |
lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
6402 | This isn't considering the bases at all, which makes sense based on this comment: // Vectorize the index computations of getelementptr instructions. This // is primarily intended to catch gather-like idioms ending at // non-consecutive loads. |
lib/Transforms/Vectorize/SLPVectorizer.cpp | ||
---|---|---|
6402 | No, it considers the bases when gathers the GEPs in bundles. These bundles are grouped using GetUnderlyingObject(GEP->getPointerOperand(), *DL) as the key, but instead it should use just GEP->getPointerOperand(). This may effect in best vectorization result, may reduce the compilation time + fix your problem with the compiler crash as all GEPs in bundles will have the same base + index. It means that the types of the SCEVs also will be the same. |
test/Transforms/SLPVectorizer/AMDGPU/address-space-ptr-sze-gep-index-assert.ll | ||
---|---|---|
2 | Use utils/update_test_checks.py to generate correct checks for the test |
test/Transforms/SLPVectorizer/AMDGPU/address-space-ptr-sze-gep-index-assert.ll | ||
---|---|---|
10–15 | You need to manually cleanup the old manually-generated check-lines. |
I don't think this is correct. Instead, you should try to collect GEPs with the different address spaces into different nodes