As pointed out by Eli Friedman, the ARM backend should call setBooleanContents so that it can use known bits to make some optimizations.
Details
Diff Detail
Event Timeline
You might be able to rescue the testcases by adding a parameter "i1 %b" to the functions and using it instead of undef. If you can't fix them, though, just throw them away; a test isn't really useful if it just gets simplified away.
Does this have any code generation impact on real code?
Adding an i1 parameter saves two of the testcases. Thanks for the idea.
This does make the code worse on these benchmarks, mainly by doing things like moving 0 into an undef that is used as a condition. However, I compiled a dozen or so of the C programs in the LLVM test suite and there was no difference in the generated code.
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp | ||
---|---|---|
3674 ↗ | (On Diff #108102) | An AND of an AND should get optimized away by DAGCombine, I think? Why isn't that happening? |
test/CodeGen/ARM/cse-call.ll | ||
8 | If you delete the check line, this is no longer testing what it's supposed to test (which is that we don't CSE the cmp across the call). You might need to use an icmp to get the required output here. | |
test/CodeGen/ARM/i1.ll | ||
12 | What are you trying to test here? | |
test/CodeGen/Thumb2/2009-09-28-ITBlockBug.ll | ||
1 | This test is ridiculous; we should just delete the whole file. | |
test/CodeGen/Thumb2/2010-11-22-EpilogueBug.ll | ||
11 | This looks weird; what is the assembly for the whole function? |
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp | ||
---|---|---|
3674 ↗ | (On Diff #108102) | It is getting optimized away by DAGCombine. But in the wide-compares.ll test (in case I need to run it again) that happens right after it tries to combine a brcond, which fails to recognize the pattern if there are two ands (which feed into the brcond). So it happens too late and the code for that test becomes slightly worse. |
test/CodeGen/ARM/cse-call.ll | ||
8 | Good idea. That worked. | |
test/CodeGen/ARM/i1.ll | ||
12 | I couldn't easily come up with a simple test for this, so if you have any suggestions I'd be happy to try them instead. But my goal was to test that the compiler ensures that the i1 is either 0 or 1 (since this commit says it must be one of them). So my check for that first "mov" is the important part that is not there without my patch. | |
test/CodeGen/Thumb2/2009-09-28-ITBlockBug.ll | ||
1 | Yeah, that's a pretty crazy file. Sounds good to me. | |
test/CodeGen/Thumb2/2010-11-22-EpilogueBug.ll | ||
11 | _foo: push {r4, r7, lr} add r7, sp, #4 sub sp, #4 movs r4, #0 LBB0_1: @ %bb cbnz r4, LBB0_3 bl _bar cmp r4, #0 bne LBB0_1 LBB0_3: @ %return add sp, #4 pop {r4, r7, pc} Without this patch, that "movs r4, #0" isn't needed, so r4 isn't used, so the "add r7, sp, #4" is "mov r7, sp": _foo: push {r7, lr} mov r7, sp sub sp, #4 LBB0_1: @ %bb cbnz r0, LBB0_3 bl _bar cmp r0, #0 bne LBB0_1 LBB0_3: @ %return add sp, #4 pop {r7, pc} |
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp | ||
---|---|---|
3673 ↗ | (On Diff #108941) | Use isOne() here, not getZExtValue(). (getZExtValue() is generally a bad idea because it crashes on values which don't fit into 64 bits.) Use cast<>, not dyn_cast<>, when the cast must succeed. |
There are two patches here, please split them.
It may sound silly to split such small patches, but they're both behaviour changing, and it'd be easier to revert one and not the other if they're separate.
It'd also be clear what tests belong where, and to make sure we have tests for both cases.
cheers,
--renato
test/CodeGen/ARM/i1.ll | ||
---|---|---|
4 | arguments are never used, why do you need them? | |
12 | I'm not sure this is testing what you want and it may depend on the interpretation of undef. In this case, the compiler has selected to always branch to the first block, thus always returning one. Many things there could change, including the order of the blocks, the default value, etc. which would destroy your test. What would happen if you cast an i8 into an i1? Would your patch create masks to guarantee only the first bit is set? |
Thanks for the suggestions, Renato. This now includes only the setBooleanContents part of the patch. Removing the SelectionDAG optimization causes wide-compares.ll to generate worse code, but no other test is affected. If people would like, I can easily submit a follow-up patch with that change.
As for the i1 test, I couldn't figure out how to get it to work with a cast, so I instead I removed some of the overly-specific CHECK lines. It now simply checks for the "mov r0, #0" before the cmp.
Thanks! Could you commit it for me?
Also, should I submit the SelectionDAG optimization, or is it not worth it (given that it adds an extra case)?
If you delete the check line, this is no longer testing what it's supposed to test (which is that we don't CSE the cmp across the call). You might need to use an icmp to get the required output here.