Page MenuHomePhabricator

Updated PPCG Code Generation for OpenCL compatibility
ClosedPublic

Authored by PhilippSchaad on Apr 19 2017, 1:28 AM.

Details

Summary

Added a small change to the way pointer arguments are set in the kernel code generation. The way the pointer is retrieved now, specifically requests global address space to be annotated. This is necessary, if the IR should be run through NVPTX to generate OpenCL compatible PTX. The changes do not affect the PTX Strings generated for the CUDA target (nvptx64-nvidia-cuda), but are necessary for OpenCL (nvptx64-nvidia-nvcl).

Additionally, the data layout has been updated to what the NVPTX Backend requests/recommends.

Diff Detail

Repository
rL LLVM

Event Timeline

PhilippSchaad created this revision.Apr 19 2017, 1:28 AM
grosser edited edge metadata.

Add Siddharth, as another review. Siddharth, could you take a look?

bollu added inline comments.Apr 20 2017, 9:41 AM
lib/CodeGen/PPCGCodeGeneration.cpp
1283 ↗(On Diff #95695)

Please add a test case to check that both code paths are taken.

1306 ↗(On Diff #95695)
  1. Change the magic constant 1 to a static const GlobalMemory = 1 or something of the sort.
  2. This change seems unrelated to the initial change. Can this be split into two separate patches?
  3. Please add a test case that makes sure that the correct code is generated.
PhilippSchaad added inline comments.Apr 20 2017, 10:01 AM
lib/CodeGen/PPCGCodeGeneration.cpp
1283 ↗(On Diff #95695)

Not both branches are being taken at the moment. The only invocation of this function, at this time, is being done with constant true. The 32-bit part was already present pre-patch, I just adapted BOTH strings to fit the ones recommended by the NVPTX backend. (See NVPTX Usage)

1306 ↗(On Diff #95695)
  1. What's the issue exactly?
  2. The change should do exactly the same thing, just in one statement instead of two. Before, I used getInt8Ty, and then got a pointer to it, declaring address space 1 = globa, now I'm doing it in one step.
  3. Sure, can do

Updated tests to fit address space changes and correct NVPTX data layout.

Each GPGPU test case now carries the correct NVPTX data-layout requested for 64-bit execution. Additionally, all parameters have been adapted to be global pointers. Consequently, each occurrence of said pointers in the following kernel, now carries the correct address space annotation as well.

jlebar added a subscriber: jlebar.Apr 22 2017, 9:31 AM

Additionally, all parameters have been adapted to be global pointers.

FYI, the LLVM backend has a pass that converts kernel args to address space 1 -- it's not necessary for front-ends to do this. Depending on what you're doing in these tests it may or may not be necessary -- I can't check because this patch was uploaded without context.

PhilippSchaad added a comment.EditedApr 22 2017, 10:17 AM

Additionally, all parameters have been adapted to be global pointers.

FYI, the LLVM backend has a pass that converts kernel args to address space 1 -- it's not necessary for front-ends to do this. Depending on what you're doing in these tests it may or may not be necessary -- I can't check because this patch was uploaded without context.

Oh, thanks for pointing that out! As long as the generated PTX kernel's arguments are annotated with .global it should do the trick.

PS: What I'm trying to do is building said PTX kernel with OpenCL, and OpenCL requires the memory-object pointers in the kernel arguments to be marked as .global. :-)

@jlebar From what I understand, NVPTX only marks a kernel argument as .global if its address space in the IR has been marked with addrspace(1), and said conversion to global address space would then take place inside the kernel, instead of marking the argument itself with .global. Am I wrong on this?

Am I wrong on this?

No, it looks like I was wrong and you are correct.

Sorry about that.

Am I wrong on this?

No, it looks like I was wrong and you are correct.

Sorry about that.

No problem.

In that case a review would be appreciated.

Since this patch was sent without sufficient context, I cannot really tell what it is dong. See http://llvm.org/docs/Phabricator.html#requesting-a-review-via-the-web-interface. I'm also not comfortable reviewing Polly code, which I am unfamiliar with.

But in principle what is described in the patch description sounds fine; perhaps that's all the LGTM you need.

bollu added inline comments.Apr 24 2017, 2:20 AM
lib/CodeGen/PPCGCodeGeneration.cpp
1306 ↗(On Diff #95695)

From what I understand, the constant 1 is used to refer to the global address space in the NVPTX backend (please correct me if I am wrong). Could you change the invocation to:

static const int UseGlobalMemory = 1;
Args.push_back(Builder.getInt8PtrTy(UseGlobalMemory));

I personally dislike magic constants in code. Documenting it this way should save the next person a google search.

PhilippSchaad added inline comments.Apr 24 2017, 2:21 AM
lib/CodeGen/PPCGCodeGeneration.cpp
1306 ↗(On Diff #95695)

Oh that's what you mean, yes, in that case I agree with you. This is a better solution. I will update that, thanks for pointing it out.

Removed magic constant in favor of more doc-friendly 'explanatory-constant'.

Hi Philippe,

one comment regarding the data-layout changes:

test/GPGPU/double-parallel-loop.ll
215 ↗(On Diff #96357)

Hi Philippe,

I am slightly surprised you change the data-layout of the input code. This is generally nothing we have control over. Is this change indeed needed to make your patch work?

bollu accepted this revision.EditedApr 24 2017, 2:37 AM

LGTM, thanks for the patch!

This revision is now accepted and ready to land.Apr 24 2017, 2:37 AM
PhilippSchaad added inline comments.Apr 24 2017, 3:07 AM
test/GPGPU/double-parallel-loop.ll
215 ↗(On Diff #96357)

It is not, this was only done for consistency reasons, which does not make sense. My bad, I will revert those.

Reverted unnecessary datalayout changes in tests.

@bollu Could you maybe land this when @grosser agrees with the revert?

grosser accepted this revision.Apr 24 2017, 3:29 AM

LGTM.

@bollu ping: I would love it if you could commit this change please ;-)

This revision was automatically updated to reflect the committed changes.