Diff Detail
- Repository
- rL LLVM
Event Timeline
I don't think this should be needed. Is this representative of the case where you saw the problem? If I run this testcase through -separate-const-offset-from-gep first, it is handled. This and similar cases are expected to be cleaned up by that pass first.
It is quite representative. Here is the real piece of code from an app, exactly as it reaches the vectorizer:
    %97 = zext i32 %96 to i64
    %98 = getelementptr inbounds float, float addrspace(1)* %1, i64 %97
    %99 = bitcast float addrspace(1)* %98 to i32 addrspace(1)*
    %100 = load i32, i32 addrspace(1)* %99, align 4, !tbaa !10
    %101 = add i32 %96, 1
    %102 = zext i32 %101 to i64
    %103 = getelementptr inbounds float, float addrspace(1)* %1, i64 %102
    %104 = bitcast float addrspace(1)* %103 to i32 addrspace(1)*
    %105 = load i32, i32 addrspace(1)* %104, align 4, !tbaa !10
    %106 = add i32 %96, 2
    %107 = zext i32 %106 to i64
    %108 = getelementptr inbounds float, float addrspace(1)* %1, i64 %107
    %109 = bitcast float addrspace(1)* %108 to i32 addrspace(1)*
    %110 = load i32, i32 addrspace(1)* %109, align 4, !tbaa !10
    %111 = add i32 %96, 3
    %112 = zext i32 %111 to i64
    %113 = getelementptr inbounds float, float addrspace(1)* %1, i64 %112
    %114 = bitcast float addrspace(1)* %113 to i32 addrspace(1)*
    %115 = load i32, i32 addrspace(1)* %114, align 4, !tbaa !10
What does it look like immediately after SeparateConstOffsetFromGEP? Does one of the other passes break this somehow?
This is after SeparateConstOffsetFromGEP:
    %408 = zext i32 %407 to i64
    %409 = getelementptr inbounds float, float addrspace(1)* %1, i64 %408
    %410 = bitcast float addrspace(1)* %409 to i32 addrspace(1)*
    %411 = load i32, i32 addrspace(1)* %410, align 4, !tbaa !10
    %412 = or i32 %407, 1
    %413 = zext i32 %412 to i64
    %414 = getelementptr inbounds float, float addrspace(1)* %1, i64 %413
    %415 = bitcast float addrspace(1)* %414 to i32 addrspace(1)*
    %416 = load i32, i32 addrspace(1)* %415, align 4, !tbaa !10
    %417 = or i32 %407, 2
    %418 = zext i32 %417 to i64
    %419 = getelementptr inbounds float, float addrspace(1)* %1, i64 %418
    %420 = bitcast float addrspace(1)* %419 to i32 addrspace(1)*
    %421 = load i32, i32 addrspace(1)* %420, align 4, !tbaa !10
    %422 = or i32 %407, 3
    %423 = zext i32 %422 to i64
    %424 = getelementptr inbounds float, float addrspace(1)* %1, i64 %423
    %425 = bitcast float addrspace(1)* %424 to i32 addrspace(1)*
    %426 = load i32, i32 addrspace(1)* %425, align 4, !tbaa !10
The bitcasts are still there, and we now have ors instead of adds.
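For readers following the thread: an or with a small constant computes the same value as an add only when the two operands share no set bits, for instance when the base index is a multiple of 4 and the constant is 1, 2, or 3. Whether that is why the index arithmetic was rewritten in the dump above is an assumption. The sketch below only illustrates the arithmetic identity; the function name is hypothetical and is not taken from the vectorizer.

```cpp
#include <cstdint>

// Hypothetical helper, for illustration only: returns true when
// `base | c` and `base + c` yield the same value. That holds whenever
// the operands have no set bits in common, because then no bit position
// produces a carry.
constexpr bool orEqualsAdd(uint32_t base, uint32_t c) {
  return (base & c) == 0;
}

// A base index that is a multiple of 4 has its low two bits clear,
// so or-ing in 1, 2, or 3 is the same as adding them.
static_assert(orEqualsAdd(8, 1) && (8u | 1u) == 8u + 1u, "disjoint bits");
static_assert(orEqualsAdd(8, 3) && (8u | 3u) == 8u + 3u, "disjoint bits");

// Counter-example: 5 already has bit 0 set, so 5 | 1 is 5, not 6.
static_assert(!orEqualsAdd(5, 1) && (5u | 1u) != 5u + 1u, "overlapping bits");
```

A consecutive-access check that pattern-matches only on add would miss the or form even though both describe the same offsets under this condition.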
Can you add a test with bitcasts between pointers with different element types? I thought the or problem was also supposed to be solved.
Matt, what exactly do you mean by pointers with different element types? They are different in this test: float vs. i32.
I meant type sizes. float and i32 are both 4 bytes; I could see something going wrong if later code relied on that assumption and the source type were i8, for example.
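To make the size concern concrete, here is a minimal sketch of the arithmetic involved. It assumes consecutiveness is decided by comparing byte offsets, computed as GEP index times the size of the GEP element type; the function names are hypothetical and this is not the code in LoadStoreVectorizer.cpp.

```cpp
#include <cstdint>

// Byte distance between two GEP indices on a pointer whose element type
// is `gepElemSize` bytes wide.
constexpr int64_t byteDelta(int64_t idxA, int64_t idxB, int64_t gepElemSize) {
  return (idxB - idxA) * gepElemSize;
}

// Two accesses of `accessSize` bytes are consecutive when the second one
// starts exactly where the first one ends.
constexpr bool isConsecutive(int64_t idxA, int64_t idxB, int64_t gepElemSize,
                             int64_t accessSize) {
  return byteDelta(idxA, idxB, gepElemSize) == accessSize;
}

// float GEP (4 bytes) feeding i32 loads (4 bytes): index + 1 is consecutive.
static_assert(isConsecutive(0, 1, 4, 4), "float GEP, i32 load");

// i8 GEP (1 byte) feeding i32 loads: index + 1 is only 1 byte apart,
// so the two loads overlap instead of being adjacent.
static_assert(!isConsecutive(0, 1, 1, 4), "i8 GEP, i32 load");

// With an i8 GEP the indices must differ by 4 for the i32 loads to be adjacent.
static_assert(isConsecutive(0, 4, 1, 4), "i8 GEP stepping by 4");
```

Looking through the bitcast and comparing indices alone gives the right answer only in the first case, which is why a test with mismatched sizes is worth having.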
LGTM.
| lib/Transforms/Vectorize/LoadStoreVectorizer.cpp | |
|---|---|
| 295 | Maybe turn it into a helper function? |