This is an archive of the discontinued LLVM Phabricator instance.

[Kryo] Canonicalize commutative operands based on LSLFast
AbandonedPublic

Authored by haicheng on Apr 3 2016, 10:53 PM.

Details

Summary

LSLFast: a subtarget feature indicating the CPU handles a logical shift left of up to 3 places cheaply.

On Kryo, if a commutative instruction has an LSL on both operands and the LSL can be folded into the instruction's shifted-register form (e.g., add x0, x1, x2, lsl #3), then we should canonicalize the operands so that the operand with the smaller shift amount is the one folded into the instruction.

For example, rather than

lsl x1, x1, #1
add x0, x1, x2, lsl #4

we should prefer

lsl x2, x2, #4
add x0, x2, x1, lsl #1

as this saves a cycle on the add instruction.
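The canonicalization rule above can be sketched as follows. This is an illustrative model only, not LLVM API; the function name and return convention are invented for the example.

```python
def choose_folded_operand(shift_a, shift_b):
    """Model of the proposed canonicalization for a commutative
    instruction whose second operand can carry a folded LSL.

    Given the shift amounts on the two operands, return a pair
    (folded, separate): the smaller shift goes into the folded
    (shifted-register) position, the larger one is materialized
    with a standalone lsl instruction.
    """
    folded = min(shift_a, shift_b)
    separate = max(shift_a, shift_b)
    return folded, separate

# The example from the summary: operand shifts of 1 and 4.
# Folding lsl #1 keeps the add within the cheap-shift window,
# while lsl #4 is done by a separate instruction.
print(choose_folded_operand(1, 4))
```

Note the rule is symmetric in the two operands, which is why it only applies to commutative instructions.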

I will add commutative instructions after this patch is approved.

Diff Detail

Repository
rL LLVM

Event Timeline

haicheng updated this revision to Diff 52521.Apr 3 2016, 10:53 PM
haicheng retitled this revision from to [Kryo] Canonicalize commutative operands based on LSLFast.
haicheng updated this object.
haicheng added reviewers: mcrosier, gberry, mssimpso.
haicheng set the repository for this revision to rL LLVM.
mcrosier edited edge metadata.Apr 5 2016, 11:37 AM

Any particular reason this is being done in the MachineCombiner pass? My first thought would be to do this as a DAG combine.

haicheng updated this revision to Diff 52847.Apr 6 2016, 1:25 PM
haicheng edited edge metadata.

Reimplemented in DAG Combine.

mcrosier added inline comments.Apr 6 2016, 3:18 PM
lib/Target/AArch64/AArch64ISelLowering.cpp
8292 ↗(On Diff #52847)

I would further narrow this such that we only commute the operands when we know the shift is 3 or fewer. Otherwise, code will be perturbed for no real reason.
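The narrowing suggested here can be sketched as a predicate: only commute when the swap actually moves a shift into the fast (3-or-fewer) window, so already-cheap or unhelpable cases are left untouched. This is a hypothetical model of the review comment, not the actual patch logic.

```python
def should_commute(folded_shift, other_shift):
    """Sketch of the suggested restriction.

    folded_shift: shift amount currently in the folded
                  (shifted-register) position.
    other_shift:  shift amount on the other operand.

    Commute only when it helps: the other operand's shift fits
    the cheap window (<= 3) while the currently folded shift
    does not. Otherwise the code is left unperturbed.
    """
    return other_shift <= 3 < folded_shift

# lsl #4 folded, lsl #1 on the other operand: commuting helps.
# Both shifts small (or both large): no gain, so no swap.
print(should_commute(4, 1), should_commute(2, 1), should_commute(5, 4))
```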

haicheng updated this revision to Diff 52884.Apr 6 2016, 10:17 PM
haicheng marked an inline comment as done.

Restrict further.

t.p.northover added inline comments.
lib/Target/AArch64/AArch64ISelLowering.cpp
8292 ↗(On Diff #52884)

I'd be very tempted to turn this on for all CPUs (perhaps with a comment that Kryo is particularly keen on it). The optimal checks would be slightly different for others, but it's already a transformation that allows the formation of extra "ldr Dst, [xA, xB, lsl #N]" instructions for everyone.

mcrosier added inline comments.Apr 8 2016, 2:30 PM
lib/Target/AArch64/AArch64ISelLowering.cpp
8292 ↗(On Diff #52884)

I very much agree with you, Tim. If the optimization is applicable to all subtargets we should not guard this with isKryo.

haicheng abandoned this revision.Apr 10 2016, 11:56 PM

Moved to D18949, which supports Add/Xor/Or/And, includes llvm-commits for review, and enables this feature for all AArch64 back-ends.