Fix https://bugs.llvm.org/show_bug.cgi?id=44419 by preserving the nuw on sub of geps. We only do this for the simple case where we have one non-swapped gep. We could also preserve some information for the two gep case, but this would be more complicated.
Details
Diff Detail
Event Timeline
Thank you for looking into this.
llvm/test/Transforms/InstCombine/sub-gep.ll | ||
---|---|---|
30–41 | ---------------------------------------- define i64 @test_nuw(* %base, i64 %idx) { %0: %p1 = gep * %base, 0 x i64 0, 4 x i64 0 %p2 = gep * %base, 0 x i64 0, 4 x i64 %idx %i1 = ptrtoint * %p1 to i64 %i2 = ptrtoint * %p2 to i64 %d = sub nuw i64 %i2, %i1 ret i64 %d } => define i64 @test_nuw(* %base, i64 %idx) { %0: %P2_IDX = shl nuw i64 %idx, 2 ret i64 %P2_IDX } Transformation doesn't verify! ERROR: Target is more poisonous than source Example: * %base = null i64 %idx = undef Source: * %p1 = null * %p2 = null [based on undef value] i64 %i1 = #x0000000000000000 (0) i64 %i2 = #x0000000000000000 (0) i64 %d = #x0000000000000000 (0) Target: i64 %P2_IDX = poison Source value: #x0000000000000000 (0) Target value: poison Summary: 0 correct transformations 1 incorrect transformations 0 errors |
llvm/test/Transforms/InstCombine/sub-gep.ll | ||
---|---|---|
30–41 | Thanks for catching this! To give a more specific example, if %base=0 and %idx has only the top bit set, then %p1 and %p2 will both be zero and the original sub is nuw, but shifting the %idx is not nuw. |
I think this is still not quite right for the case where there are multiple GEP indexes. For example:
define i64 @test_inbounds_nuw_multi_index([0 x [2 x i32]]* %base, i64 %idx, i64 %idx2) { ; CHECK-LABEL: @test_inbounds_nuw_multi_index( ; CHECK-NEXT: [[P2_IDX:%.*]] = shl nuw nsw i64 [[IDX:%.*]], 3 ; CHECK-NEXT: [[P2_IDX1:%.*]] = shl nuw nsw i64 [[IDX2:%.*]], 2 ; CHECK-NEXT: [[P2_OFFS2:%.*]] = add i64 [[P2_IDX]], [[P2_IDX1]] ; CHECK-NEXT: ret i64 [[P2_OFFS2]] ; %p1 = getelementptr inbounds [0 x [2 x i32]], [0 x [2 x i32]]* %base, i64 0, i64 0, i64 0 %p2 = getelementptr inbounds [0 x [2 x i32]], [0 x [2 x i32]]* %base, i64 0, i64 %idx, i64 %idx2 %i1 = ptrtoint i32* %p1 to i64 %i2 = ptrtoint i32* %p2 to i64 %d = sub nuw i64 %i2, %i1 ret i64 %d }
Let's say %idx=-1, %idx2=4, then the overall result is 8 * %idx1 + 4 * %idx2 = 8, which is positive, even though one of the intermediate indexes is negative and as such can't use shl nuw. I think in this case the nuw can't be on any of the instructions (including also not the add). Right?
Sure.
---------------------------------------- define i64 @test_inbounds_nuw_multi_index(* %base, i64 %idx, i64 %idx2) { %0: %p1 = gep inbounds * %base, 0 x i64 0, 8 x i64 0, 4 x i64 0 %p2 = gep inbounds * %base, 0 x i64 0, 8 x i64 %idx, 4 x i64 %idx2 %i1 = ptrtoint * %p1 to i64 %i2 = ptrtoint * %p2 to i64 %d = sub nuw i64 %i2, %i1 ret i64 %d } => define i64 @test_inbounds_nuw_multi_index(* %base, i64 %idx, i64 %idx2) { %0: %P2_IDX = shl nsw nuw i64 %idx, 3 %P2_IDX1 = shl nsw nuw i64 %idx2, 2 %P2_OFFS2 = add i64 %P2_IDX, %P2_IDX1 ret i64 %P2_OFFS2 } Transformation doesn't verify! ERROR: Target is more poisonous than source Example: * %base = pointer(non-local, block_id=1, offset=498773658766409739) i64 %idx = #x0322800007800000 (225883668936130560) i64 %idx2 = #xfa00000000000000 (18014398509481984000, -432345564227567616) Source: * %p1 = pointer(non-local, block_id=1, offset=498773658766409739) * %p2 = pointer(non-local, block_id=1, offset=576460753345183755) i64 %i1 = #x171bffffc3ffffff (1665205961213607935) i64 %i2 = #x182fffffffffffff (1742893055792381951) i64 %d = #x011400003c000000 (77687094578774016) Target: i64 %P2_IDX = #x191400003c000000 (1807069351489044480) i64 %P2_IDX1 = poison i64 %P2_OFFS2 = poison Source value: #x011400003c000000 (77687094578774016) Target value: poison Summary: 0 correct transformations 1 incorrect transformations 0 errors
---------------------------------------- define i64 @test_inbounds_nuw_multi_index(* %base, i64 %idx, i64 %idx2) { %0: %p1 = gep inbounds * %base, 0 x i64 0, 8 x i64 0, 4 x i64 0 %p2 = gep inbounds * %base, 0 x i64 0, 8 x i64 %idx, 4 x i64 %idx2 %i1 = ptrtoint * %p1 to i64 %i2 = ptrtoint * %p2 to i64 %d = sub nuw i64 %i2, %i1 ret i64 %d } => define i64 @test_inbounds_nuw_multi_index(* %base, i64 %idx, i64 %idx2) { %0: %P2_IDX = shl nsw i64 %idx, 3 %P2_IDX1 = shl nsw i64 %idx2, 2 %P2_OFFS2 = add i64 %P2_IDX, %P2_IDX1 ret i64 %P2_OFFS2 } Transformation seems to be correct! Summary: 1 correct transformations 0 incorrect transformations 0 errors
Fun fact: If you change https://github.com/llvm/llvm-project/blob/773667b8c20d35c18334f8c7663df8ceacfdd2e5/llvm/include/llvm/Analysis/Utils/Local.h#L92 to not emit a useless add 0, %x you get a regression in https://github.com/llvm/llvm-project/blob/773667b8c20d35c18334f8c7663df8ceacfdd2e5/llvm/test/Transforms/InstCombine/icmp.ll#L513-L524, because IRBuilder instructions get inserted into the InstCombine worklist in the wrong order :(
Only add nuw to final mul.
As the scope of this optimization turned out to be a lot smaller than I originally expected, I've moved this as a fixup operation in OptimizePointerDifference(), instead of directly emitting the nuw flag in EmitGEPOffset(), which is somewhat tricky to do.
The code is slightly odd due to the need to match the zero add, which I wasn't able to simply drop for the reason mentioned above.
Not really familiar with this code.
This seems correct to me, but maybe i'm missing some subtlety.
llvm/test/Transforms/InstCombine/sub-gep.ll | ||
---|---|---|
88 | Precommit this? |