This is an archive of the discontinued LLVM Phabricator instance.

[PowerPC] Update branch coalescing to be a PowerPC specific pass
ClosedPublic

Authored by lei on May 2 2017, 5:21 PM.

Details

Summary

Implementing this pass as a PowerPC specific pass.

This branch coalescing pass utilizes the analyzeBranch method which currently does not include any implicit operands.
This is not an issue on PPC but must be handled on other targets. For this pass to be made target-independent, the analyzeBranch API need to be updated to support implicit operands and there would need to be a way to verify that any implicit operands would not be clobbered by merging blocks.

Original patch implementation: https://reviews.llvm.org/D28249
Bugzilla: https://llvm.org/bugs/show_bug.cgi?id=25219

Diff Detail

Event Timeline

lei created this revision.May 2 2017, 5:21 PM
nemanjai added inline comments.May 2 2017, 8:46 PM
include/llvm/Target/TargetFrameLowering.h
149

I find this an unusual place for this check. What is the relationship between this pass and frame lowering?

test/CodeGen/PowerPC/select-i1-vs-i1.ll
1029

I don't think this comment is necessary here. It should suffice to just remove the FIXME comment. The reason I say that is that the user will not know that we once had this duplicate block issue.

lei added inline comments.May 4 2017, 10:42 AM
include/llvm/Target/TargetFrameLowering.h
149

None. Will move.

test/CodeGen/PowerPC/select-i1-vs-i1.ll
1029

k.

lei updated this revision to Diff 97851.May 4 2017, 10:44 AM

Move check to subtarget.

kbarton accepted this revision.May 4 2017, 8:04 PM

Aside from two minor comments, this LGTM.
Please make sure to update the release notes when this gets committed.

include/llvm/Target/TargetSubtargetInfo.h
139 ↗(On Diff #97851)

We no longer need \brief since autobrief has been turned on for doxygen.

test/CodeGen/PowerPC/branch_coalesce.ll
2

There is nothing specific to pwr8 in this patch so I think we're better to remove -mcpu=pwr8.

This revision is now accepted and ready to land.May 4 2017, 8:04 PM
echristo requested changes to this revision.May 4 2017, 8:49 PM

Hrm.

I can think of three ways to enable this:

a) Enable it all the time, this would take some benchmarking across platforms, but I think is the right long term direction.
b) Add it as a pass in the backend, and use something like a lambda passed into the constructor to conditionalize it if you want (see inline question below)
c) Do this for now, but do all of the work in a as well to turn it on (i.e. a above) and remove the subtarget query (caveat of the inline question below).

I guess this comes down to my (now expanded) inline question of:

why don't we just want this on all the time on all targets?

One additional followup question:

Did you see any other performance changes across spec/lnt? Any degredations? Any other benchmark runs?

Thanks!

-eric

lib/CodeGen/TargetPassConfig.cpp
722

You don't need this because of the skipFunction query in the pass.

lib/Target/PowerPC/PPCSubtarget.cpp
183 ↗(On Diff #97851)

Is this the right choice or is this just the only place it's been tested?

This revision now requires changes to proceed.May 4 2017, 8:49 PM
lei added a comment.May 10 2017, 9:45 AM

Hrm.

I can think of three ways to enable this:

a) Enable it all the time, this would take some benchmarking across platforms, but I think is the right long term direction.
b) Add it as a pass in the backend, and use something like a lambda passed into the constructor to conditionalize it if you want (see inline question below)
c) Do this for now, but do all of the work in a as well to turn it on (i.e. a above) and remove the subtarget query (caveat of the inline question below).

I guess this comes down to my (now expanded) inline question of:

why don't we just want this on all the time on all targets?

Investigation into this is now being done.

One additional followup question:

Did you see any other performance changes across spec/lnt? Any degredations? Any other benchmark runs?

The only performance run I did so far is SPEC2006 on a power8 LE machine.
A speed up of > 1 means better performance with branch coalescing enabled. Based on the speed up below, we see a 11% improvement for lbm and 28% improvement for mcf.

Benchmark Speed Up
433.milc 0.997
444.namd 1.000
450.soplex 1.002
453.povray 0.995
470.lbm 1.112
482.sphinx3 1.006
400.perlbench 1.001
401.bzip2 1.000
403.gcc 1.067
429.mcf 1.284
445.gobmk 1.001
456.hmmer 1.000
458.sjeng 1.000
462.libquantum 1.025
464.h264ref 0.997
473.astar 0.998
483.xalancbmk 1.043

Thanks!

-eric

lei updated this revision to Diff 110607.Aug 10 2017, 11:55 AM
lei edited edge metadata.

Updating this to be a PPC specific pass since it does not handle implicit operands which are used by other targets.

lei updated this revision to Diff 110615.Aug 10 2017, 12:00 PM

remove -mcpu=pwr8 from IR test case.

kbarton accepted this revision.Aug 22 2017, 1:26 PM

I would change the summary of this review to reflect the fact we are moving this pass back to the PPC target.
Aside from that, this LGTM.

It would be good if @hfinkel could take a quick look and make sure he has no further comments.

hfinkel edited edge metadata.Aug 22 2017, 1:54 PM

I would change the summary of this review to reflect the fact we are moving this pass back to the PPC target.
Aside from that, this LGTM.

It would be good if @hfinkel could take a quick look and make sure he has no further comments.

Yes, please go ahead.

lei retitled this revision from Enable branch coalescing on PowerPC to [PowerPC] Update branch coalescing to be a PowerPC specific pass.Aug 23 2017, 8:52 AM
lei edited the summary of this revision. (Show Details)
lei updated this revision to Diff 113448.Aug 31 2017, 11:50 AM

Turn patch off by default.

iteratee added inline comments.Aug 31 2017, 12:45 PM
lib/CodeGen/TargetPassConfig.cpp
40–41

These seem unrelated. Can you pull them out?

lei added inline comments.Sep 5 2017, 6:48 AM
lib/CodeGen/TargetPassConfig.cpp
40–41

I do not understand why you want me to pull these out. They were not added by me and it is being used later on in this src.

iteratee requested changes to this revision.Sep 5 2017, 11:52 AM

I don't think this should go in, even off by default as long as it's broken. You can pull the produceSameValue check or fix it, but I don't like having a known-broken pass in tree.

lib/CodeGen/TargetPassConfig.cpp
40–41

Sorry, I was trying to look at your original diff vs this one to see what changed.

This revision now requires changes to proceed.Sep 5 2017, 11:52 AM
hfinkel added inline comments.Sep 5 2017, 1:01 PM
lib/CodeGen/TargetPassConfig.cpp
40–41

What are you talking about? I see no changes here.

sfertile added inline comments.Sep 5 2017, 1:19 PM
lib/CodeGen/TargetPassConfig.cpp
40–41

If you compare diff 4 to diff 5 it makes it look like EnableIPRA got added in diff 5.

lei updated this revision to Diff 114117.Sep 6 2017, 7:28 PM
lei edited edge metadata.

Added logic to filter out physical register uses on line 347 in lib/Target/PowerPC/PPCBranchCoalescing.cpp.

@iteratee, please let me know if this fixes the issue you were seeing. Thanks!

lei added a comment.Sep 6 2017, 7:37 PM

Pass is currently off by default. Enable pass via -enable-ppc-branch-coalesce.

iteratee accepted this revision.Sep 11 2017, 1:47 PM

I'm OK with this.

echristo accepted this revision.Sep 12 2017, 2:34 PM

Acking to clear the no bit.

This revision is now accepted and ready to land.Sep 12 2017, 2:34 PM
lei closed this revision.Sep 13 2017, 6:34 AM