Fold integer comparisons where the known bits of the operands allows the result to be constant folded.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
You need to upload the diff between the master branch and your branch, not the last commit in your branch.
What is the motivation for this patch? The added tests should already get folded before reaching the backend (by InstCombine). We should only add folds to DAGCombine if the folded pattern is created by legalization. What is the original context for this change?
This was an unassigned enhancement - https://bugs.llvm.org/show_bug.cgi?id=41182 - that I was interested in and wanted to have a go at. Unfortunately that's all I have for context on my end
@RKSimon Meant to say this before - Thank you for putting up additional reviewers Simon, I appreciate your help with this
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
14059 ↗ | (On Diff #288003) |
I found the UnreachableMachineBlockElim pass which may be able to be enhanced for this, but I'm still looking |
llvm/test/CodeGen/X86/test-shrink.ll | ||
574 ↗ | (On Diff #288003) | Thank you for taking a look at the changes.
Sorry, would you be able to elaborate what you mean here? Do you mean separate these out into another pre-commit review? |
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
14059 ↗ | (On Diff #288003) | Sorry, I meant is there anywhere else that does something like this while SelectionDAG exists? |
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp | ||
4237–4238 | Is this just a formatting change? | |
llvm/test/CodeGen/X86/test-shrink.ll | ||
574 ↗ | (On Diff #288003) | I went ahead and modified the tests in trunk. Can you rebase this patch? |
llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp | ||
---|---|---|
14059 ↗ | (On Diff #288003) | Thanks for elaborating, sorry for the confusion. I don't believe there is another place that does something like this while SelectionDAG exists. It seems like BranchFolding might be a good place for me to handle this? For some additional context, without this change if we have define i32 @simple(i32 %x) nounwind { %1 = and i32 %x, 2147483648 %2 = icmp sle i32 %1, 0 br i1 %2, label %yes, label %no yes: ; will always get here ret i32 1 no: ret i32 2 } we get ## %bb.0: xorl %eax, %eax testb %al, %al jne LBB0_2 ## %bb.1: ## %yes movl $1, %eax retq LBB0_2: ## %no movl $2, %eax retq so jne will never be taken instead of just removing the branch completely |
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp | ||
4237–4238 | Yes this is just formatting, I must have messed up the merge when I rebased and missed this. Sorry about that, should be fixed now | |
llvm/test/CodeGen/X86/test-shrink.ll | ||
574 ↗ | (On Diff #288003) |
Sure can, thanks for updating them. Sorry for the confusion |
@darcangelo.matthew are you still looking at this?
llvm/include/llvm/Support/KnownBits.h | ||
---|---|---|
145 ↗ | (On Diff #288883) | Please add unit test coverage in llvm\unittests\Support\KnownBitsTest.cpp |
@darcangelo.matthew Are you in a position to continue with this, if not I would like to commandeer and complete this please?
I’m not able to continue with this. You can go ahead and complete it. Sorry I missed your previous comment as well.
Updated the old patch to use the KnownBits comparison helpers that have been added since the original patch was created and add all test changes for review.
I've removed the branch removal code for now - which means we have a number of "constant" conditional branches that we currently have no way to fold to a jump, whatever we go for here handling this in the DAG would be rather ugly.
I'm open to suggestions for any individual test changes - some are definite improvements, some just need some tweaking and others probably affect the original regression the test was reduced from.
llvm/test/CodeGen/AMDGPU/wave32.ll | ||
---|---|---|
720 ↗ | (On Diff #400593) | @foad I guess there's no good way around this if on wave32 targets - unless we duplicate the test with a 16 divergent variant? |
llvm/test/CodeGen/X86/avx512-mask-op.ll | ||
623 | test7 (and test6) don't look great (and the codegen was already redundant) - I wonder if we should at least have them return something? |
llvm/test/CodeGen/AMDGPU/wave32.ll | ||
---|---|---|
720 ↗ | (On Diff #400593) | OK, please rebase on 7af959673e67256c5ba711070aca509dce794350. |
llvm/test/CodeGen/AMDGPU/wave32.ll | ||
---|---|---|
720 ↗ | (On Diff #400593) | Thanks - looks like we have an equivalent issue in wqm.ll |
llvm/test/CodeGen/X86/fold-rmw-ops.ll | ||
---|---|---|
1830 | Regression? |
llvm/test/CodeGen/X86/fold-rmw-ops.ll | ||
---|---|---|
1830 | I'm really at a loss as to what is best to do for these cases where we've created a constant conditional branch - the middle-end would have nuked most of these cases already (which probably means doing something in CGP is a waste of time as well) and I'm not certain how often they will be generated in backend. We can tweak the tests in some cases I suppose, which is a cheat but probably the most realistic approach. Opinions welcome! |
llvm/test/CodeGen/AMDGPU/wave32.ll | ||
---|---|---|
720 ↗ | (On Diff #400593) | Fixed in 0bc14a0a989fe4268b899100aafc07e3d94decbb. |
To repeat my previous question, because it's not obvious from tests: Which legalizations does this improve? That is, for input IR that has already gone through the opt -O2 pipeline, which cases does this improve?
Its quite a long time ago when I first encountered this but IIRC one of the places I hit this was in the sdiv/udiv by constant expansions
This review may be stuck/dead, consider abandoning if no longer relevant.
Removing myself as reviewer in attempt to clean dashboard.
llvm/lib/CodeGen/SelectionDAG/TargetLowering.cpp | ||
---|---|---|
3644–3646 | Just put Result = KnownBits::whatever(LHSKnown, RHSKnown) inside the switch, and sink the if (Result) return ... out? |
Just put Result = KnownBits::whatever(LHSKnown, RHSKnown) inside the switch, and sink the if (Result) return ... out?