This is an archive of the discontinued LLVM Phabricator instance.

ARM: Better codegen for 64-bit compares.
ClosedPublic

Authored by pcc on Dec 4 2015, 7:30 PM.

Details

Summary

This introduces a custom lowering for ISD::SETCCE (introduced in r253572)
that allows us to emit a short code sequence for 64-bit compares.

Before:

push {r7, lr}
cmp r0, r2
mov.w r0, #0
mov.w r12, #0
it hs
movhs r0, #1
cmp r1, r3
it ge
movge.w r12, #1
it eq
moveq r12, r0
cmp.w r12, #0
bne .LBB1_2
@ BB#1: @ %bb1
bl f
pop {r7, pc}
.LBB1_2: @ %bb2
bl g
pop {r7, pc}

After:

push {r7, lr}
subs r0, r0, r2
sbcs.w r0, r1, r3
bge .LBB1_2
@ BB#1: @ %bb1
bl f
pop {r7, pc}
.LBB1_2: @ %bb2
bl g
pop {r7, pc}

Saves around 80KB in Chromium's libchrome.so.

Some notes on this patch:

  • I don't much like the ARMISD::BRCOND and ARMISD::CMOV combines I introduced (nothing else needs them). However, they are necessary in order to avoid poor codegen, and they seem similar to existing combines in other backends (e.g. X86 combines (brcond (cmp (setcc Compare))) to (brcond Compare)).
  • No support for Thumb-1. This is in principle possible, but we'd need to implement ARMISD::SUBE for Thumb-1.

Diff Detail

Repository
rL LLVM

Event Timeline

pcc updated this revision to Diff 41972.Dec 4 2015, 7:30 PM
pcc retitled this revision from to ARM: Better codegen for 64-bit compares..
pcc updated this object.
pcc added reviewers: rengolin, hans.
pcc added subscribers: llvm-commits, mkuper, DavidKreitzer.
hans edited edge metadata.Dec 7 2015, 5:14 PM

Looks reasonable to me, but I don't know enough about ARM to be a good reviewer here.

lib/Target/ARM/ARMISelLowering.cpp
4720 ↗(On Diff #41972)

I'm too unfamiliar with ARM to follow this.

Should I understand it that Cmp.getValue(1) is something suitable to be moved into the flags register, and that makes it possible to do the CMOV below? (Which the BRCOND combine below will eliminate if we really want to branch on this rather than get a value?)

pcc added inline comments.Dec 7 2015, 5:23 PM
lib/Target/ARM/ARMISelLowering.cpp
4720 ↗(On Diff #41972)

Yes, exactly. Cmp.getValue(1) represents the flags register itself; ARMISD::CMOV does not take a flags register as input so we need a CopyToReg to get the scheduling correct.

+Tilmann for size optimizations, +James for ARM

jmolloy requested changes to this revision.Mar 18 2016, 7:38 AM
jmolloy edited edge metadata.

Hi Peter,

Sorry for the round trip time on this, I've only just become aware of it.

Some comments but it generally looks fine. Thanks!

James

lib/Target/ARM/ARMISelLowering.cpp
4719 ↗(On Diff #41972)

This seems fishy to me. Shouldn't we be chaining to DAG.getRoot() (maybe?), rather than getEntryNode()?

Surely we want CPSR's value just before the SETCCE, not at the start of the BB?

10502 ↗(On Diff #41972)

This is very hard to read due to the many isa<> and casts<>. Can we please hoist these out as a dyn_cast to reduce code complexity?

10578 ↗(On Diff #41972)

Same here

This revision now requires changes to proceed.Mar 18 2016, 7:38 AM
pcc updated this revision to Diff 51085.Mar 18 2016, 4:01 PM
pcc edited edge metadata.
  • Use dyn_cast to improve readability of DAG matchers
pcc marked 2 inline comments as done.Mar 18 2016, 4:01 PM
pcc added inline comments.
lib/Target/ARM/ARMISelLowering.cpp
4719 ↗(On Diff #41972)

The correct ordering is created by the edge from the SUBE to the copy. I don't think the chain edge is significant here.

Using the graph root wouldn't be correct, as that would create cycles in the graph.

I'm also not sure if using the entry node is entirely correct, but in my defense, there seems to be similar code in other backends [1].

[1] http://llvm-cs.pcc.me.uk/?q=getCopyToReg.*getEntryNode

jmolloy accepted this revision.Mar 19 2016, 1:39 AM
jmolloy edited edge metadata.

Thanks Peter, LGTM

This revision is now accepted and ready to land.Mar 19 2016, 1:39 AM
tilmann accepted this revision.Mar 21 2016, 12:47 AM
tilmann edited edge metadata.

Great code size reduction, thanks for working on this!

LGTM

This revision was automatically updated to reflect the committed changes.