Page MenuHomePhabricator

Have ARM call setBooleanContents(ZeroOrOneBooleanContent).
ClosedPublic

Authored by jgalenson on Jul 24 2017, 4:22 PM.

Details

Summary

As pointed out by Eli Friedman, the ARM backend should call setBooleanContents so that it can use known bits to make some optimizations.

Diff Detail

Event Timeline

jgalenson created this revision.Jul 24 2017, 4:22 PM
efriedma edited edge metadata.Jul 24 2017, 4:51 PM

You might be able to rescue the testcases by adding a parameter "i1 %b" to the functions and using it instead of undef. If you can't fix them, though, just throw them away; a test isn't really useful if it just gets simplified away.

Does this have any code generation impact on real code?

jgalenson updated this revision to Diff 108102.Jul 25 2017, 9:33 AM

Adding an i1 parameter saves two of the testcases. Thanks for the idea.

This does make the code worse on these benchmarks, mainly by doing things like moving 0 into an undef that is used as a condition. However, I compiled a dozen or so of the C programs in the LLVM test suite and there was no difference in the generated code.

efriedma added inline comments.Jul 28 2017, 5:22 PM
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
3674

An AND of an AND should get optimized away by DAGCombine, I think? Why isn't that happening?

test/CodeGen/ARM/cse-call.ll
5

If you delete the check line, this is no longer testing what it's supposed to test (which is that we don't CSE the cmp across the call). You might need to use an icmp to get the required output here.

test/CodeGen/ARM/i1.ll
11

What are you trying to test here?

test/CodeGen/Thumb2/2009-09-28-ITBlockBug.ll
10

This test is ridiculous; we should just delete the whole file.

test/CodeGen/Thumb2/2010-11-22-EpilogueBug.ll
11

This looks weird; what is the assembly for the whole function?

jgalenson added inline comments.Jul 31 2017, 9:24 AM
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
3674

It is getting optimized away by DAGCombine. But in the wide-compares.ll test (in case I need to run it again) that happens right after it tries to combine a brcond, which fails to recognize the pattern if there are two ands (which feed into the brcond). So it happens too late and the code for that test becomes slightly worse.

test/CodeGen/ARM/cse-call.ll
5

Good idea. That worked.

test/CodeGen/ARM/i1.ll
11

I couldn't easily come up with a simple test for this, so if you have any suggestions I'd be happy to try them instead. But my goal was to test that the compiler ensures that the i1 is either 0 or 1 (since this commit says it must be one of them). So my check for that first "mov" is the important part that is not there without my patch.

test/CodeGen/Thumb2/2009-09-28-ITBlockBug.ll
10

Yeah, that's a pretty crazy file. Sounds good to me.

test/CodeGen/Thumb2/2010-11-22-EpilogueBug.ll
11

_foo:

push    {r4, r7, lr}
add     r7, sp, #4
sub     sp, #4
movs    r4, #0

LBB0_1: @ %bb

cbnz    r4, LBB0_3
bl      _bar
cmp     r4, #0
bne     LBB0_1

LBB0_3: @ %return

add     sp, #4
pop     {r4, r7, pc}

Without this patch, that "movs r4, #0" isn't needed, so r4 isn't used, so the "add r7, sp, #4" is "mov r7, sp":

_foo:

push    {r7, lr}
mov     r7, sp
sub     sp, #4

LBB0_1: @ %bb

cbnz    r0, LBB0_3
bl      _bar
cmp     r0, #0
bne     LBB0_1

LBB0_3: @ %return

add     sp, #4
pop     {r7, pc}
jgalenson updated this revision to Diff 108941.Jul 31 2017, 9:25 AM
jgalenson edited the summary of this revision. (Show Details)
efriedma added inline comments.Aug 7 2017, 5:12 PM
lib/CodeGen/SelectionDAG/LegalizeDAG.cpp
3673

Use isOne() here, not getZExtValue(). (getZExtValue() is generally a bad idea because it crashes on values which don't fit into 64 bits.)

Use cast<>, not dyn_cast<>, when the cast must succeed.

jgalenson updated this revision to Diff 110120.Aug 7 2017, 5:58 PM

Thanks for the advice; isOne is much cleaner.

fhahn added a subscriber: fhahn.Aug 8 2017, 2:00 AM

Looks fine, but it's a significant change, so adding some other ARM reviewers.

rengolin edited edge metadata.Aug 12 2017, 6:29 AM

There are two patches here, please split them.

It may sound silly to split such small patches, but they're both behaviour changing, and it'd be easier to revert one and not the other if they're separate.

It'd also be clear what tests belong where, and to make sure we have tests for both cases.

cheers,
--renato

test/CodeGen/ARM/i1.ll
4

arguments are never used, why do you need them?

11

I'm not sure this is testing what you want and it may depend on the interpretation of undef. In this case, the compiler has selected to always branch to the first block, thus always returning one. Many things there could change, including the order of the blocks, the default value, etc. which would destroy your test.

What would happen if you cast an i8 into an i1? Would your patch create masks to guarantee only the first bit is set?

jgalenson edited the summary of this revision. (Show Details)

Thanks for the suggestions, Renato. This now includes only the setBooleanContents part of the patch. Removing the SelectionDAG optimization causes wide-compares.ll to generate worse code, but no other test is affected. If people would like, I can easily submit a follow-up patch with that change.

As for the i1 test, I couldn't figure out how to get it to work with a cast, so I instead I removed some of the overly-specific CHECK lines. It now simply checks for the "mov r0, #0" before the cmp.

rengolin accepted this revision.Aug 14 2017, 11:16 AM

LGTM, thanks!

This revision is now accepted and ready to land.Aug 14 2017, 11:16 AM

Thanks! Could you commit it for me?

Also, should I submit the SelectionDAG optimization, or is it not worth it (given that it adds an extra case)?

Submit as a different revision and we can discuss the merits there.

Could someone commit this for me?

rengolin closed this revision.Aug 22 2017, 4:04 AM