[[ https://reviews.llvm.org/D116332 | Make icmp of gep fold offset based ]] fold icmp of gep with globals variables, to a constant expression. This fails an assertion in reuseTableCompare function in SimplifyCFG.cpp where the expectation is either a true constant or a false constant, since now it can return a icmp constant expression which result is either a true or a false, we can remove that assertion without any functional changes to the code.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
@nikic generating a small test-case is really hard, since it was triggering due to small peephole under very specific constraints. what would you advise here?
If you extract the bitcode or IR from clang (-emit-llvm -Xclang -disable-llvm-optzns) and then run it through either llvm-reduce or bugpoint, it should be possible to create a small reproducer.
hi @nikic I tired both bugpoint and llvm-reduce both of them, failed to reduce the issue. It would be very much helpful to know how to proceed further. I'm new to the llvm middle-end, I could really use some help here.
If it will be helpful I will share the complete IR module which fails at assert.
Hi @nikic
https://gist.github.com/pvellien/e4f519d17b25adf10fdd5978ee0b1de9
After trying a lot, this is minimal IR i can able to find. It is reduction from 84k lines to 617 lines.
Please compile with Opt with O3.
LLVM top commit.
commit 5afbfe33e7d6ce40af0ad6d99421b443c45b351b (HEAD)
Author: Nikita Popov <npopov@redhat.com>
Date: Tue Dec 28 12:27:04 2021 +0100
[ConstantFold] Make icmp of gep fold offset based
Thanks!
Here's what I did: Create a crash.sh with:
#!/bin/sh ! build/bin/opt -S -O2 -o /dev/null < $1
And then run build/bin/llvm-reduce --test crash.sh input.ll. This produces a 424 line file.
Then I ran llvm/utils/reduce_pipeline.py --opt-binary build/bin/opt --passes 'default<O2>' --input reduced.ll --output output.ll. This produces a 231 line file (https://gist.github.com/nikic/9abff537ba4641543951c8d06aa35a3b) and a reduced pipeline: -passes="cgscc(devirt<4>(inline,function-attrs,function<eager-inv>(instcombine,loop-mssa(licm,loop-rotate),simplifycfg<bonus-inst-threshold=1;no-forward-switch-cond;no-switch-to-lookup;keep-loops;no-hoist-common-insts;no-sink-common-insts>,instcombine,loop(loop-unroll-full),sroa,jump-threading,memcpyopt))),function<eager-inv>(simplifycfg<bonus-inst-threshold=1;forward-switch-cond;switch-to-lookup;no-keep-loops;hoist-common-insts;sink-common-insts>)"
For some reason, the pipeline does not get reduced further (cc @markus) even though we can run the first part (the cgscc pipeline) and then feed it into the second part (the function pipeline). This gets us down to https://gist.github.com/nikic/506f4dd532e313ac5b3dcd2759fd209a and -passes="simplifycfg<switch-to-lookup>".
Now we can run it through llvm-reduce again, and get this reduction: https://gist.github.com/nikic/3d15c42b7b8f54e6253215576a4cba07 Using opt -S -metarenamer we can clean up the symbol names. With a tiny bit of manual cleanup we get this test case:
; RUN: opt -S -passes="simplifycfg<switch-to-lookup>" < %s | FileCheck %s target triple = "x86_64-unknown-linux-gnu" %struct.ham = type <{ i32, i32, i32, i8, i8, [2 x i8] }> @global = external constant [75 x { i32, i32, i32, i8, i8 }] define i1 @zot(i32 %arg) { bb: %tmp = icmp eq i32 %arg, 1 br i1 %tmp, label %bb6, label %bb1 bb1: ; preds = %bb %tmp2 = icmp eq i32 %arg, 2 br i1 %tmp2, label %bb6, label %bb3 bb3: ; preds = %bb1 %tmp4 = icmp eq i32 %arg, 0 br i1 %tmp4, label %bb6, label %bb5 bb5: ; preds = %bb3 br label %bb6 bb6: ; preds = %bb5, %bb3, %bb1, %bb %tmp7 = phi %struct.ham* [ null, %bb5 ], [ bitcast (i32* getelementptr inbounds ([75 x { i32, i32, i32, i8, i8 }], [75 x { i32, i32, i32, i8, i8 }]* @global, i64 0, i64 6, i32 0) to %struct.ham*), %bb ], [ null, %bb1 ], [ null, %bb3 ] %tmp8 = icmp eq %struct.ham* %tmp7, bitcast (i32* getelementptr inbounds ([75 x { i32, i32, i32, i8, i8 }], [75 x { i32, i32, i32, i8, i8 }]* @global, i64 1, i64 0, i32 0) to %struct.ham*) ret i1 %tmp8 }
For some reason, the pipeline does not get reduced further (cc @markus) even though we can run the first part (the cgscc pipeline) and then feed it into the second part (the function pipeline).
Thanks for letting me know @nikic. I will have a look to see what could be the problem and provide a fix if possible.
For some reason, the pipeline does not get reduced further (cc @markus) even though we can run the first part (the cgscc pipeline) and then feed it into the second part (the function pipeline).
Just looked into this a bit. Apparently the pipeline reduction looks into nested pass managers and will split up the cgscc pass manager, doing something like cgscc(inline) and cgscc(the-rest),function(simplifycfg). This no longer reproduces the issue, and the reduction is aborted early.
I think that either the early abort should be dropped so we later still try the cgscc(inline,the-rest) and function(simplifycfg) split, or pass manager splitting should be an entirely separate reduction step.
Right, I think what would make most sense would be to do multiple reduction steps where the tree depth where splitting is performed is increased for each reduction step.
@nikic Thanks a lot for the detailed description to reduce the test-case, it greatly helped me.
Updated the patch with test-cases.
llvm/lib/Transforms/Utils/SimplifyCFG.cpp | ||
---|---|---|
5808–5809 | I think this should be something like: if (!CaseConst || CaseConst == DefaultConst || (CaseConst != TrueConst && CaseConst != FalseConst)) return; If we don't get back true or false, we don't assume which way the comparison goes. | |
llvm/test/Transforms/SimplifyCFG/X86/switch-to-lookup-globals.ll | ||
11 | Can you please use update_test_checks.py instead? |
LGTM
llvm/lib/Transforms/Utils/SimplifyCFG.cpp | ||
---|---|---|
5808 | You can drop the isa<UndefValue> check here, because it is already covered by the other condition (undef is neither true nor false). |
@nikic thanks a lot for accepting. Could you please commit this patch on my behalf? I don't have a commit access to llvm. thanks
You can drop the isa<UndefValue> check here, because it is already covered by the other condition (undef is neither true nor false).