Same as D60846 but with a fix for the problem encountered there which was a missing context adjustment in the handling of PHI nodes.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/lib/Analysis/InstructionSimplify.cpp | ||
---|---|---|
556 | Thx, will do. | |
llvm/lib/Analysis/ValueTracking.cpp | ||
1906 | I thought so too, run it, and caused the assert to trigger. The last case has two ifs nested. That is already a way out of the first case, might be more. I don't think we want to look at dominance of constants (for now) so I just return false. There is actually something we could do here, but hat is a different story. |
llvm/lib/Analysis/ValueTracking.cpp | ||
---|---|---|
1906 | I don't think so, and unfortunately !isa<ConstantPointerNull> doesn't work either (but that would be closer). I was hoping that "easy" constants don't make it here and hard ones would need us to inspect the surrounding (and are unlikely anyway, e.g., globals with weak linkage) |
For reference, here's the IR reproducer based on the provided C code:
define zeroext i8 @test(i8* nocapture readonly %0) local_unnamed_addr #0 { br label %2 2: ; preds = %5, %1 %.06 = phi i8 [ 0, %1 ], [ %11, %5 ] %.0 = phi i32 [ 0, %1 ], [ 1, %5 ] %3 = icmp eq i32 %.0, 0 br i1 %3, label %5, label %4 4: ; preds = %2 ret i8 %.06 5: ; preds = %2 %6 = load i8, i8* %0, align 1 %7 = zext i8 %6 to i32 %8 = shl nuw nsw i32 %.0, 3 %9 = shl nuw nsw i32 %7, %8 %10 = trunc i32 %9 to i8 %11 = or i8 %.06, %10 %12 = add nuw nsw i32 %.0, 1 br label %2 }
The ret gets optimized to ret i8 0 with this patch.
Yep, that looks like a miscompile to me.
alive doesn't support phi's yet, so i'm not 100% sure,
but if i manually unroll it (since we know we will return after first iteration),
i get
; ModuleID = './example.ll' source_filename = "./example.ll" define zeroext i8 @test(i8* %arg) local_unnamed_addr { bb: %tmp4 = load i8, i8* %arg, align 1 %tmp5 = zext i8 %tmp4 to i32 %tmp6 = shl nuw nsw i32 0, 3 %tmp7 = shl nuw nsw i32 %tmp5, %tmp6 %tmp8 = trunc i32 %tmp7 to i8 %tmp9 = or i8 0, %tmp8 %tmp10 = add nuw nsw i32 0, 1 ret i8 %tmp9 }
which is just
define zeroext i8 @test(i8* nocapture readonly %arg) local_unnamed_addr #0 { bb: %tmp4 = load i8, i8* %arg, align 1 ret i8 %tmp4 }
And that replacement says the load is always 0.
@nikic Thanks for reopening this, I did not see the problem at all!
Same thing as last time. When we recurse through PHI nodes we need to update the context instruction or we get information for the wrong place. This time I fixed it in computeKnownBits.
I created D71181 with the proper fixes and additional test case.
You can use Q.getWithInstruction(InTI) here.