This is an archive of the discontinued LLVM Phabricator instance.

[NVPTX] kernel pointer arguments point to the global address space
ClosedPublic

Authored by jingyue on May 31 2015, 10:22 PM.

Details

Summary

With this patch, NVPTXLowerKernelArgs converts a kernel pointer argument to a
pointer in the global address space. This change, along with
NVPTXFavorNonGenericAddrSpaces, allows the NVPTX backend to emit ld.global.*
and st.global.* for accessing kernel pointer arguments.

Minor changes:

  1. refactor: extract function convertToPointerInAddrSpace
  2. fix a bug in the test case in bug21465.ll

Diff Detail

Event Timeline

jingyue updated this revision to Diff 26872.May 31 2015, 10:22 PM
jingyue retitled this revision from to [NVPTX] kernel pointer arguments point to the global address space.
jingyue updated this object.
jingyue edited the test plan for this revision. (Show Details)
jingyue added reviewers: eliben, jholewinski, meheff.
jingyue added a subscriber: Unknown Object (MLST).
jholewinski edited edge metadata.Jun 2 2015, 5:45 AM

Comments inlined.

Are you planning on enabling this pass by default?

lib/Target/NVPTX/NVPTXLowerKernelArgs.cpp
38

The address space conversion intrinsics are deprecated in favor of the new addrspacecast instruction (I need to document that in IntrinsicsNVVM.td).

41

Isn't it possible that an optimization would remove all of these casts before NVVMFavorNonGenericAddrSpaces runs? I know we control the pass pipeline in the backend, but I worry about these pass ordering constraints.

190

Strictly speaking, this is only valid for CUDA (DrvInterface::CUDA).

Yes. I'll add that to NVPTXPassConfig::addIRPass.

jingyue added inline comments.Jun 2 2015, 12:38 PM
lib/Target/NVPTX/NVPTXLowerKernelArgs.cpp
38

Thank you for pointing this out. So, do you think we should emit addrspacecast here too instead of gen.to.global and gen.to.param?

41

Not sure. Removing bitcast is fine for NVPTXFavorNonGenericAddrSpace. Why would any pass remove addrspacecast? addrspacecast has a richer semantics -- it changes the address space of a pointer and accessing pointers in different address spaces are not necessarily as fast. If any pass would want to remove an addrspacecast, it should reason about the performance effects of that too.

190

ACK

jingyue added inline comments.Jun 2 2015, 3:59 PM
lib/Target/NVPTX/NVPTXLowerKernelArgs.cpp
41

Your concern makes more sense to me if we replace llvm.nvvm.ptr.gen.to.* with addrspacecast. In that case, this pass would addrspacecasts value back and forth, e.g.,

t1 = cast t0 to global
t2 = cast t1 to generic

Then, it's reasonable for some optimizations to cancel such addrspacecast pairs.

Maybe not in this patch, we can merge this pass and NVPTXFavorNonGeneric into something like NVPTXTypeInference? That should prevent other passes from messing up the intermediate code (besides the name becomes fancier :) How does that sound to you?

jingyue updated this revision to Diff 27017.Jun 2 2015, 4:09 PM
jingyue edited edge metadata.

Enable NVPTXLowerKernelArgs by default

jingyue added inline comments.Jun 2 2015, 4:11 PM
lib/Target/NVPTX/NVPTXLowerKernelArgs.cpp
190

Done.

jingyue updated this revision to Diff 27086.Jun 3 2015, 8:13 PM

use addrspacecast instead of llvm.nvvm.ptr.gen.to.* intrinsics

jingyue added a subscriber: wengxt.Jun 3 2015, 8:13 PM
jingyue added inline comments.Jun 3 2015, 8:23 PM
lib/Target/NVPTX/NVPTXLowerKernelArgs.cpp
38

The updated version emits addrspacecast instead of llvm.nvvm.ptr.gen.to.*.

jingyue updated this revision to Diff 27087.Jun 3 2015, 8:31 PM

more comments

jholewinski accepted this revision.Jun 4 2015, 11:36 AM
jholewinski edited edge metadata.

This looks good to me now. I agree that we should merge these two passes to prevent any ordering issues.

This revision is now accepted and ready to land.Jun 4 2015, 11:36 AM
jingyue closed this revision.Jun 4 2015, 1:23 PM

Thank you. FYI, we are working on merging LowerKernelArgs and FavorNonGeneric, and handling the local address space as well.

lib/Target/NVPTX/NVPTXLowerKernelArgs.cpp