e.g. if a>=b && a!=b, then we can infer a>b. this patch will combine the
a>=b and a!=b into a new compare instruction a>b first, then do the next
level implication calculation by reusing the existing isImpliedCondition
function.
Details
Diff Detail
Event Timeline
llvm/lib/IR/Instructions.cpp | ||
---|---|---|
3401–3402 ↗ | (On Diff #175016) | This should already exist elsewhere, i think |
use existing logic to implement 'AND' operation of two compare instruction.
getICmpCode in InstCombine already done this job, reuse this logic instead of adding redundant function.
add most test cases to cover.
Please always upload all patches with full context (-U99999)
llvm/lib/Transforms/Utils/SimplifyCFG.cpp | ||
---|---|---|
5953–5955 | And this too might already exist.. |
llvm/lib/Transforms/Utils/SimplifyCFG.cpp | ||
---|---|---|
5953–5955 | Thanks for the comments. |
llvm/lib/Transforms/Utils/SimplifyCFG.cpp | ||
---|---|---|
5953–5955 | Perhaps it can be promoted to not be private? (cc @spatel) |
upload full diff with -U99999 for other's review comments.
lebedev.ri suggested to use InstCombiner::foldAndOfICmps() instead of duplicate logic, this requires function change from private to public, also a new instance of InstComber is needed in SimplifyCFGOpt, pending for discussions.
Thanks.
llvm/lib/Transforms/Utils/SimplifyCFG.cpp | ||
---|---|---|
5953–5955 | I think the way to share the code is to move it from InstCombine to CmpInstAnalysis.cpp. That's how CmpInstAnalysis was created in the first place. I'm not sure if would avoid that work, but have you looked at trying to simplify the icmp in CorrelatedValuePropagation instead of here in SimplifyCFG? |
llvm/lib/Transforms/Utils/SimplifyCFG.cpp | ||
---|---|---|
5953–5955 | Thanks for the reply. Also, I've investigated about moving InstCombiner::foldAndOfICmps() to other file like CmpInstAnalysis.cpp, more than 1000+ LOC changes required as this function calls 10+ member functions like foldAndOrOfICmpsOfAndWithPow2(), simplifyRangeCheck(), insertRangeTest() and static functions getNewICmpValue(), foldLogOpOfMaskedICmps,foldLogOpOfMaskedICmps(), foldAndOrOfEqualityCmpsWithConstants(), foldSignedTruncationCheck(). In addition, the variable IRBuilder<TargetFolder, IRBuilderCallbackInserter> is different from outside need change. |
llvm/lib/Transforms/Utils/SimplifyCFG.cpp | ||
---|---|---|
5953–5955 | Thanks for checking on CVP. I haven't looked into that pass very much, but if you can simplify the icmp to true/false within that pass, then doesn't that allow dead code elimination/instcombine/simplifycfg to later kill all of the unused instructions? I agree that any refactoring of the code from instcombine should be done in separate patches from this one. I also agree that it's a big mess of code. Maybe we can break that job up to allow progress for this patch? As a preliminary step, you can commit the test cases from this patch to trunk with the current output. For that, please use FileCheck rather than 'grep' and use utils/update_test_checks.py to generate the assertions. There are many examples of test files that use that script, but feel free to ask questions if it's not clear how to use it. I also ask that you give each test a comment, so we can tell what the differences are between the tests. |
llvm/lib/Transforms/Utils/SimplifyCFG.cpp | ||
---|---|---|
5953–5955 | Thanks, spatel. The test case is already split and uploaded to https://reviews.llvm.org/D54994. The problem is this type of icmps ( a>=b && a!=b => a>b) can't be simplified, they can only be folded by the function InstCombiner::foldAndOfICmps(). I just come up with another idea that whether can this be implemented in this way of combination optimization: the first round of 'simplifycfg' will do the condition check , if BCmpPred equals to (ACmpPred && BCmpPred), do nothing(means it may be folded, to avoid endless loop), else replace the BCmpPred with the AND instruction, this AND instruction can be folded to BCmpPred2(i.e. a>b) by the next 'instcombine' pass in foldAndOfICmps(). then the second round of 'simplifycfg' can do the implication check with BCmpPred2 && CCmpPred, to merge Block 3 or eliminated it. I tried this experiment on my code base, it can work. But I am not sure this kind of pass combination introduces PASS DEPENDENCY will cause side effect or break the LLVM design rule. If this implementation is also not acceptable, then possibly the only way is refactoring the function foldAndOfICmps() from instcombine. Fundamentally, a shared foldAndOfICmps() can achieve best benefits. |
I've been looking at this:
rL347868
rL347896
rL348088
...and I'm still not exactly sure how we should fix it, but I'm confident that what this patch is proposing is not the right answer. We don't want to create a dummy instruction only to analyze and delete it.
My best guess is that we can handle this case using a small enhancement to ValueTracking plus a minimal change to the callers in InstCombine and/or SimplifyCFG. These are hacks vs. a proper solution that uses a dominator tree and a more significant fix to CVP, but it's what we already do in both InstCombine and SimplifyCFG, so there's really no additional cost for optimizing simple cases like this.
And this too might already exist..
Have you seen [[ https://github.com/llvm-mirror/llvm/blob/5e3c9a56cb080c7a5131d6063d04d1d4a22044d0/lib/Transforms/InstCombine/InstCombineAndOrXor.cpp | InstCombiner::foldAndOfICmps() ]] ?