This is an archive of the discontinued LLVM Phabricator instance.

[RewriteStatepointsForGC] Avoid using unrelocated pointers after safepoints
ClosedPublic

Authored by reames on Aug 6 2015, 2:06 PM.

Details

Summary

To be clear: this is an *optimization* not a correctness change.

CodeGenPrep likes to duplicate icmps feeding branch instructions to take advantage of x86's ability to fuze many comparison/branch patterns into a single micro-op and to reduce the need for materializing i1s into general registers. PlaceSafepoints likes to place safepoint polls right at the end of basic blocks (immediately before terminators) when inserting entry and backedge safepoints. These two heuristics interact in a somewhat unfortunate way where the branch terminating the original block will be controlled by a condition driven by unrelocated pointers. This forces the register allocator to keep both the relocated and unrelocated values of the pointers feeding the icmp alive over the safepoint poll.

One simple fix would have been to just adjust PlaceSafepoints to move one back in the basic block, but you can reach similar cases as a result of LICM or other hoisting passes. As a result, doing a post insertion fixup seems to be more robust.

I considered doing this in CodeGenPrep itself, but having to update the live sets of already rewritten safepoints gets complicated fast. In particular, you can't just use def/use information because by moving the icmp, we're extending the live range of it's inputs potentially.

Instead, this patch teaches RewriteStatepointsForGC to make the required adjustments before making the relocations explicit in the IR. This change really highlights the fact that RSForGC is a CodeGenPrep-like pass which is performing target specific lowering. In the long run, we may even want to combine the two though this would require a lot more smarts to be integrated into RSForGC first. We currently rely on being able to run a set of cleanup passes post rewriting because the IR RSForGC generates is pretty damn ugly.

Diff Detail

Event Timeline

reames updated this revision to Diff 31472.Aug 6 2015, 2:06 PM
reames retitled this revision from to [RewriteStatepointsForGC] Avoid using unrelocated pointers after safepoints.
reames updated this object.
reames added a subscriber: llvm-commits.
sanjoy requested changes to this revision.Aug 11 2015, 3:33 PM
sanjoy edited edge metadata.
sanjoy added inline comments.
lib/Transforms/Scalar/RewriteStatepointsForGC.cpp
2472

A switch on an icmp is equivalent to a br; perhaps we should just teach LLVM to transform

%c = icmp slt i32 %x, 0
switch i1 %c, label %def [ i1 0, label %f
                           i1 1, label %t ]

to

%c = icmp slt i32 %x, 0
br i1 %c, label %t, label %f

so that we don't have to deal with switch separately here?

(Right now LLVM does not do the said transform, which surprised me).

If you remove the switch case (pun intended!) here then you should be able to use a simple expression from PatternMatch.h to match a conditional br on an icmp.

2474

I'd prefer typing the lambda with an explicit type, like

[](TerminatorInst *TI) -> Instruction * {
...
This revision now requires changes to proceed.Aug 11 2015, 3:33 PM
sanjoy edited edge metadata.Aug 11 2015, 3:36 PM
chenli added inline comments.Aug 11 2015, 3:41 PM
lib/Transforms/Scalar/RewriteStatepointsForGC.cpp
2472

Hmm, I think -simplifycfg should do the transformation. Do we ever see SwitchInst in practice?

sanjoy added inline comments.Aug 11 2015, 3:51 PM
lib/Transforms/Scalar/RewriteStatepointsForGC.cpp
2472

Here's the full example

target datalayout = "e-m:e-i64:64-f80:128-n8:16:32:64-S128"
target triple = "x86_64-unknown-linux-gnu"

declare void @g()
declare void @h()
declare void @i()

define void @f(i32 %x) {
 entry:
  %c = icmp slt i32 %x, 0
  switch i1 %c, label %def [ i1 0, label %f
                             i1 1, label %t ]

 t:
  call void @g()
  ret void

 f:
  call void @h()
  ret void

 def:
  call void @i()
  ret void
}

opt -O3 does not transform the switch to a br.

Sanjoy, you're example transform isn't directly legal. Your example switch is a three way branch (default, true, false), not a two way branch. There is no way to represent a three way branch in a single br instruction. Your example does point out that we're failing to remove a provably unreachable default case. If we did so, the switch would become a two way switch and then be converted to a branch.

Given the switch logic in the patch is pretty much useless as you point out, I'm just going to remove it for the moment. Longer term, I want to extend this code to reason about other classes of instructions and whether they're legal to move. I want that to be a separate patch though.

I will take your other comment and apply it.

reames updated this revision to Diff 31980.Aug 12 2015, 2:09 PM
reames edited edge metadata.

Revised per review comments.

sanjoy accepted this revision.Aug 12 2015, 2:21 PM
sanjoy edited edge metadata.

lgtm

This revision is now accepted and ready to land.Aug 12 2015, 2:21 PM
This revision was automatically updated to reflect the committed changes.