[NVPTX] kernel pointer arguments point to the global address space

Authored by jingyue on May 31 2015, 10:22 PM.



With this patch, NVPTXLowerKernelArgs converts a kernel pointer argument to a
pointer in the global address space. This change, along with
NVPTXFavorNonGenericAddrSpaces, allows the NVPTX backend to emit ld.global.*
and st.global.* for accessing kernel pointer arguments.
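
As a rough illustration of the summary above, the sketch below shows a kernel before lowering and the approximate shape of the IR once NVPTXLowerKernelArgs and NVPTXFavorNonGenericAddrSpaces have both run. The function names, value names, and the choice of `float` are illustrative assumptions, not taken from the patch. The syntax is the typed-pointer IR of mid-2015; recent LLVM releases write these types as `ptr` and `ptr addrspace(1)`. In NVPTX, address space 0 is generic and address space 1 is global.

```llvm
target triple = "nvptx64-nvidia-cuda"

; Before: both pointer arguments are in the generic address space (0),
; so the accesses below are generic loads and stores.
define void @kernel_before(float* %input, float* %output) {
  %v = load float, float* %input, align 4
  store float %v, float* %output, align 4
  ret void
}

; After (approximate, hypothetical names): each argument is cast to the
; global address space (1) and the memory accesses use the global pointers.
define void @kernel_after(float* %input, float* %output) {
  %input.global = addrspacecast float* %input to float addrspace(1)*
  %output.global = addrspacecast float* %output to float addrspace(1)*
  %v = load float, float addrspace(1)* %input.global, align 4
  store float %v, float addrspace(1)* %output.global, align 4
  ret void
}

; Mark both functions as kernels for the NVPTX backend.
!nvvm.annotations = !{!0, !1}
!0 = !{void (float*, float*)* @kernel_before, !"kernel", i32 1}
!1 = !{void (float*, float*)* @kernel_after, !"kernel", i32 1}
```

Accesses through address-space-1 pointers are the ones the backend can select as global-space loads and stores rather than generic ones.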

Minor changes:

  1. refactor: extract function convertToPointerInAddrSpace
  2. fix a bug in the test case in bug21465.ll

Event Timeline

jingyue updated this revision to Diff 26872. May 31 2015, 10:22 PM
jingyue retitled this revision to [NVPTX] kernel pointer arguments point to the global address space.
jingyue updated this object.
jingyue edited the test plan for this revision.
jingyue added reviewers: eliben, jholewinski, meheff.
jingyue added a subscriber: Unknown Object (MLST).
jholewinski edited edge metadata. Jun 2 2015, 5:45 AM

Comments inlined.

Are you planning on enabling this pass by default?


The address space conversion intrinsics are deprecated in favor of the new addrspacecast instruction (I need to document that in


Isn't it possible that an optimization would remove all of these casts before NVPTXFavorNonGenericAddrSpaces runs? I know we control the pass pipeline in the backend, but I worry about these pass ordering constraints.


Strictly speaking, this is only valid for CUDA (DrvInterface::CUDA).

Yes. I'll add that to NVPTXPassConfig::addIRPass.

jingyue added inline comments. Jun 2 2015, 12:38 PM

Thank you for pointing this out. So, do you think we should emit addrspacecast here too instead of the address space conversion intrinsics?


Not sure. Removing bitcast is fine for NVPTXFavorNonGenericAddrSpaces. Why would any pass remove addrspacecast? addrspacecast has richer semantics -- it changes the address space of a pointer, and accesses through pointers in different address spaces are not necessarily equally fast. If any pass wanted to remove an addrspacecast, it should reason about the performance effects of that too.



jingyue added inline comments. Jun 2 2015, 3:59 PM

Your concern makes more sense to me if we replace the conversion intrinsics with addrspacecast. In that case, this pass would addrspacecast values back and forth, e.g.,

t1 = cast t0 to global
t2 = cast t1 to generic

Then, it's reasonable for some optimizations to cancel such addrspacecast pairs.
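
To make the concern concrete, here is a minimal sketch of the cast pair written as LLVM IR. The function name and the `float` element type are assumptions for illustration; the syntax is the typed-pointer IR of mid-2015, with address space 0 as generic and 1 as global.

```llvm
define float @roundtrip(float* %t0) {
  ; generic -> global: records that %t0 points into global memory
  %t1 = addrspacecast float* %t0 to float addrspace(1)*
  ; global -> generic: restores the type that existing users expect
  %t2 = addrspacecast float addrspace(1)* %t1 to float*
  ; A later pass can look through %t2 to %t1 and load from the global
  ; pointer instead, but only if the pair is still present.
  %v = load float, float* %t2, align 4
  ret float %v
}
```

If some intermediate optimization decided that the round trip is a no-op and replaced `%t2` with `%t0`, the load would again see only a generic pointer, and the information that the argument lives in global memory would be lost before it could be used.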

Maybe not in this patch, but could we merge this pass and NVPTXFavorNonGeneric into something like NVPTXTypeInference? That should prevent other passes from messing up the intermediate code (besides, the name becomes fancier :). How does that sound to you?

jingyue updated this revision to Diff 27017. Jun 2 2015, 4:09 PM
jingyue edited edge metadata.

Enable NVPTXLowerKernelArgs by default

jingyue added inline comments. Jun 2 2015, 4:11 PM


jingyue updated this revision to Diff 27086. Jun 3 2015, 8:13 PM

use addrspacecast instead of the address space conversion intrinsics

jingyue added a subscriber: wengxt. Jun 3 2015, 8:13 PM
jingyue added inline comments. Jun 3 2015, 8:23 PM

The updated version emits addrspacecast instead of the conversion intrinsics.

jingyue updated this revision to Diff 27087. Jun 3 2015, 8:31 PM

more comments

jholewinski accepted this revision. Jun 4 2015, 11:36 AM
jholewinski edited edge metadata.

This looks good to me now. I agree that we should merge these two passes to prevent any ordering issues.

This revision is now accepted and ready to land. Jun 4 2015, 11:36 AM
jingyue closed this revision. Jun 4 2015, 1:23 PM

Thank you. FYI, we are working on merging LowerKernelArgs and FavorNonGeneric, and handling the local address space as well.