This is an archive of the discontinued LLVM Phabricator instance.

[PeepholeOptimizer] Refactor the advance copy optimization to take advantage of the isRegSequence property
ClosedPublic

Authored by qcolombet on Aug 12 2014, 2:09 PM.

Details

Reviewers
hfinkel
Summary

Hi,

This patch makes use of the isRegSequence property in the advance copy optimization of the peephole optimizer.

Note that the advance copy optimization is still disabled by default.

Thanks for your feedbacks.

  • Context **

This is a follow-up of r215394 and r215404, which respectively introduces the isRegSequence property and uses it for ARM.
This patch is the last of a series of three patches (see http://reviews.llvm.org/D4734 for the initial thread), it uses the property introduced by the previous two patches to improve the coalescing of target-specific instruction performing cross-register files copies.

Thanks to the property introduced in by the previous commits, this patch is able to optimize the following sequence:
_simpleVectorDiv: @ @simpleVectorDiv
@ BB#0: @ %entry
vmov d0, r2, r3
vmov d1, r0, r1
vmov r0, s0
vmov r1, s2
udiv r0, r1, r0
vmov r1, s1
vmov r2, s3
udiv r1, r2, r1
vmov.32 d16[0], r0
vmov.32 d16[1], r1
vmov r0, r1, d16
bx lr

into:
udiv r0, r0, r2
udiv r1, r1, r3
vmov.32 d16[0], r0
vmov.32 d16[1], r1
vmov r0, r1, d16
bx lr

  • Proposed Patch **

The proposed patch refactors how the copy optimizations are done in the peephole optimizer are done.
Prior to this patch, we had one copy-related optimization that replaced a copy or bitcast by a generic, more suitable (in terms of register file), copy.

With this patch, the peephole optimizer features two copy-related optimization:

  1. One for rewriting generic copies to generic copies: PeepholeOptimizer::optimizeCoalescableCopy.
  2. One for replacing non-generic copies with generic copies: PeepholeOptimizer::optimizeUncoalescableCopy.

The goals of these two optimizations are slightly different: one rewrite the operand of the instruction (#1), the other kills off the non-generic instruction and replace it by a (sequence of) generic instruction(s).

Both optimizations rely on the ValueTracker introduced in r212100.

The ValueTracker has been refactored to use the information from the TargetInstrInfo for non-generic instruction. As part of the refactoring we switch the tracking from the index of the definition to the actual register (virtual or physical). This one change is to provide a greater consistency with register related APIs and to ease the use of the TargetInstrInfo.

Moreover, this patch introduces a new helper class CopyRewriter used to ease the rewriting of generic copy (i.e., #1).

Finally, this patch adds a dead code elimination pass right after the peephole optimizer to get rid of dead code that may appear after rewriting.

Thanks,
-Quentin

Diff Detail

Event Timeline

qcolombet updated this revision to Diff 12415.Aug 12 2014, 2:09 PM
qcolombet retitled this revision from to [PeepholeOptimizer] Refactor the advance copy optimization to take advantage of the isRegSequence property.
qcolombet updated this object.
qcolombet edited the test plan for this revision. (Show Details)
qcolombet added a reviewer: hfinkel.
qcolombet added a subscriber: Unknown Object (MLST).

One quick comment/question:

You've got TargetInstrInfo as optional in the class, but as far as I can tell nothing checks it before using it in isLoadFoldable (or the path that gets there). I could easily be wrong, it was the best I could do with search on the diff. Can you confirm?

Thanks for asking!

The TargetInstrInfo is optional for the ValueTracker but not for the PeepholeOptimizer. isLoadFoldable is part of the PeepholeOptimizer.

That said, this is possible I missed some uses in the ValueTracker, but if it is the case, I do not see them!

-Quentin

Aha. Figured I'd missed something. Thanks :)

hfinkel added inline comments.Aug 19 2014, 12:54 PM
lib/CodeGen/PeepholeOptimizer.cpp
238

I would really like a more-verbose comment here (maybe with an example) to explain what this means. And, perhaps, why you might want to find an alternative source for a physical register or not.

550

dead -> untested

551

change -> changes

603

I don't understand this comment (and/or the associated logic below). When this is called we set CurrentSrcIdx = 1, and then we exit early if CurrentSrcIdx == 1. A comment on how this all works would be nice.

816

What code below enforces this "register bank" constraint?

856

What exactly is not supported?

1317–1318

require -> required

Hi Hal,

Thanks for your detailed feedbacks!

I’ll update the patch accordingly.

Cheers,
-Quentin

lib/CodeGen/PeepholeOptimizer.cpp
238

Sure, I’ll add a comment about that.
Technically, you cannot find a new source for a physical register with that interface because you do not know what instruction defines it.
The next constructor provides an interface to overcome that.

Regarding finding an alternative source for a physical register, this might make sense to do that to be able to get rid of a target specific instruction and thus, be able to take advantage of the register coalescer.

603

Nice catch, the comment is indeed very obscure.
I will update that.

The idea is the following.

Let CopyLike be the instruction being rewritten, where CopyLike has one definition and one source.
In other words, we have:
dst = CopyLike src

If CurrentSrcIdx != 0 (i.e., if CurrentSrcIdx == 1), it means we are looking for the next source of src, which cannot be determined from CopyLike.
However, if we are looking at CurrentSrcIdx == 0, we want the next source of dst, which is src.
Therefore SrcReg == src and TrackReg == dst.

816

PeepholeOptimizer::findNextSource.

856

My bad, I thought "v0 = INSERT_SUBREG v1, v1.sub0, sub0” was incorrectly expanded during TwoAddressInstructionPass, same for REG_SEQUENCE v1.sub0, sub0, v1.sub1, sub1 when we compose sub-registers. I have double checked and in fact, this works.
This should be a TODO, driven by actually seeing this pattern.

qcolombet updated this revision to Diff 12685.Aug 19 2014, 3:41 PM

Update according to Hal's feedbacks:

  • Add more comments.
  • Fix typos.
hfinkel accepted this revision.Aug 19 2014, 3:50 PM
hfinkel edited edge metadata.

LGTM. Thanks!

This revision is now accepted and ready to land.Aug 19 2014, 3:50 PM
qcolombet closed this revision.Aug 20 2014, 10:51 AM

Thanks Hal!

Committed revision 216088.

-Quentin