Page MenuHomePhabricator

[NVPTX] Add NVPTXHoistAddrSpaceCast pass.

Authored by jlebar on Aug 20 2016, 5:27 PM.



This pass transforms

gep (addrspacecast p)


addrspacecast (gep p)

Doing the addrspacecast instruction first (which is valid in NVPTX but
not on all targets) lets LLVM optimize better, because LLVM treats
addrspacecast as a black-box function.

This is particularly important for the idiom clang will use for lowering
the __ldg builtin.

Diff Detail

Event Timeline

jlebar updated this revision to Diff 68793.Aug 20 2016, 5:27 PM
jlebar retitled this revision from to [NVPTX] Add NVPTXHoistAddrSpaceCast pass..
jlebar updated this object.
jlebar added a reviewer: jingyue.
jlebar added subscribers: tra, llvm-commits, majnemer.
majnemer added inline comments.Aug 20 2016, 7:46 PM

Hmm, this looks a little suspicious to me.

What if you have:

%gep = ...
%call = call void @never_returns(%gep)
%asc = addrspacecast %gep ...

Just because the addrspacecast postdominates the GEP does not ensure that it will be executed.

Along slightly similar but slightly different lines, consider:

  %gep = ...
  %asc = addrspacecast %gep ...
  br i1 %cond, %then, %else

  [... never use %asc, only use %gep ...]

  [... use %asc ...]

In this case the addrspacecast postdominates the GEP and all its uses are control dependent on %cond. If %cond is true, then the addrspacecast is never used and all access to memory go through %gep instead of %asc.


auto *







arsenm added a subscriber: arsenm.Aug 20 2016, 8:40 PM
arsenm added inline comments.

Can you put this into a TTI hook?

jingyue edited edge metadata.Aug 22 2016, 12:03 PM

LGTM after you address David's comment -- A post-dominates B doesn't guarantee execution that reaches B will arrive A.


I suggest add a test that checks we don't miscompile when the addrspacecast doesn't post-dominate the gep.

LGTM after you address David's comment -- A post-dominates B doesn't guarantee execution that reaches B will arrive A.

Per my last mail on the list (*), I'm not convinced this is the right approach given the limitations to hoisting that you and David point out. We won't be able to hoist in a lot of situations that seem like they'll be common.

I gave a few alternatives in that mail. I got one of them, teaching SCEV to treat NVPTX addrspacecasts as ptr + offset, to maybe work this afternoon. I'm cleaning it up, will send it out soon.

(*) Do you have any idea why only some emails to the mailing list show up in phab? Having discussions in two places seems extremely broken, but here we are.

I am, at the moment, not convinced this is the right thing to do.

In the limited form where we only hoisted addrspacecasts up insofar as the new location "strongly dominates" the old location (that is, we hoist from location B to location A if A running guarantees that B will run), it would be possible for us to end up transforming one set of address computations into two parallel sets of computations, one in the generic space and one in the non-generic space.

It's not entirely clear to me what *is* the right thing to do, but teaching SCEV what's going on here may be a start.

jlebar abandoned this revision.Sep 30 2016, 9:42 AM