This is an archive of the discontinued LLVM Phabricator instance.

[PPC]: Peephole optimize small accesses to aligned globals.
ClosedPublic

Authored by iteratee on Nov 19 2015, 6:21 PM.

Details

Summary

[PPC]: Peephole optimize small accesses to aligned globals.

Access to aligned globals gives us a chance to peephole optimize nonzero
offsets. If a struct is 4-byte aligned, then accesses to bytes 0-3 won't
overflow the available displacement. For example:

addis 3, 2, b4v@toc@ha
addi 4, 3, b4v@toc@l
lbz 5, b4v@toc@l(3) ; This is the result of the current peephole
lbz 6, 1(4)         ; optimizer
lbz 7, 2(4)
lbz 8, 3(4)

If b4v is 4-byte aligned, we can skip using register 4 because we know
that b4v@toc@l+{1,2,3} won't overflow 32K, and instead generate:

addis 3, 2, b4v@toc@ha
lbz 4, b4v@toc@l(3)
lbz 5, b4v@toc@l+1(3)
lbz 6, b4v@toc@l+2(3)
lbz 7, b4v@toc@l+3(3)

This saves a register and an addition.
Larger alignments allow larger structures/arrays to be optimized.
On POWER8, this applies only if there is a single use, resulting in a new fusion opportunity, or when optimizing for size.
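
The no-overflow argument above can be checked mechanically. The following is a minimal C++ sketch, not the code from this patch: the helper names (fitsSigned16, canFoldAddend, alignedLowPartsNeverOverflow) are hypothetical, and it assumes that the symbol's TOC-relative offset is a multiple of its power-of-two alignment, so that the sign-extended low 16 bits (the @toc@l part) are as well.

```cpp
#include <cassert>  // only needed if you assert on the helpers below
#include <cstdint>

// D-form loads such as lbz take a signed 16-bit displacement.
constexpr int32_t kMinDisp = -32768;
constexpr int32_t kMaxDisp = 32767;

// True if `value` is representable as a signed 16-bit displacement.
bool fitsSigned16(int32_t value) {
  return value >= kMinDisp && value <= kMaxDisp;
}

// Hypothetical helper: true if byte `addend` of a symbol with the given
// alignment may be addressed as sym@toc@l+addend without disturbing the
// @toc@ha part. Assumes `alignment` is a positive power of two.
bool canFoldAddend(int32_t alignment, int32_t addend) {
  return addend >= 0 && addend < alignment;
}

// Brute-force check of the claim: for every low part that is a multiple of
// `alignment`, every addend accepted by canFoldAddend stays in range.
bool alignedLowPartsNeverOverflow(int32_t alignment) {
  for (int32_t low = kMinDisp; low <= kMaxDisp; low += alignment)
    for (int32_t addend = 0; canFoldAddend(alignment, addend); ++addend)
      if (!fitsSigned16(low + addend))
        return false;
  return true;
}
```

With only 2-byte alignment the low part can be as large as 32766, and an addend of 3 would give 32769, which no longer fits in the displacement field. That is why, in this sketch, the addend has to stay below the alignment.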

Event Timeline

iteratee updated this revision to Diff 40731. Nov 19 2015, 6:21 PM
iteratee retitled this revision to [PPC]: Peephole optimize small accesses to aligned globals.
iteratee updated this object.
iteratee added reviewers: wschmidt, kbarton, echristo.
echristo edited edge metadata. Nov 19 2015, 6:31 PM

One small inline comment.

lib/Target/PowerPC/PPCISelDAGToDAG.cpp
4167–4181

Not sure I understand the moves here?

iteratee updated this revision to Diff 40826. Nov 20 2015, 1:47 PM
iteratee updated this object.
iteratee edited edge metadata.
iteratee marked an inline comment as done.
iteratee added inline comments.
lib/Target/PowerPC/PPCISelDAGToDAG.cpp
4167–4181

They got sorted according to size when I thought I needed to know the size.
Reverted.

iteratee set the repository for this revision to rL LLVM. Dec 2 2015, 1:28 PM
iteratee added a subscriber: llvm-commits.

Ping? Any objections?

iteratee marked an inline comment as done. Dec 8 2015, 3:49 PM
hfinkel added inline comments. Dec 9 2015, 2:03 AM
lib/Target/PowerPC/PPCISelDAGToDAG.cpp
4143

But we should do this when optimizing for code size, even on the P8:

if (PPCSubTarget->hasFusion() && !MF->getFunction()->optForSize())

or, if this really *hurts* performance on the P8, use optForMinSize().

Also, we don't need to turn this off on the P8 when there is only a single (non-debug) user because, as the ELF v2 ABI spec points out:

addis r4, r3, upper
<lbz,lhz,lwz,ld> r4, lower(r4)

is also good.

iteratee updated this revision to Diff 42477. Dec 10 2015, 4:13 PM
iteratee removed rL LLVM as the repository for this revision.

Run the optimization on fusion platforms if it results in a new fusion opportunity, or when optimizing for size.

iteratee updated this object. Dec 10 2015, 4:14 PM
iteratee marked an inline comment as done.

I implemented your suggestion and fired this selectively for POWER8.

lib/Target/PowerPC/PPCISelDAGToDAG.cpp
4143

Nice catch. I've implemented that.

echristo added inline comments. Dec 10 2015, 4:16 PM
lib/Target/PowerPC/PPCISelDAGToDAG.cpp
4197–4199

Looks weird. Feel like collapsing it to a single if conditional?

iteratee updated this revision to Diff 42480. Dec 10 2015, 4:24 PM
iteratee marked an inline comment as done.

Collapse nested condition

iteratee marked an inline comment as done. Dec 10 2015, 4:24 PM
iteratee updated this revision to Diff 42482. Dec 10 2015, 4:35 PM

Adjust formatting.

hfinkel accepted this revision. Dec 10 2015, 4:42 PM
hfinkel edited edge metadata.

LGTM

This revision is now accepted and ready to land. Dec 10 2015, 4:42 PM
iteratee closed this revision. Dec 11 2015, 11:04 AM