This is an archive of the discontinued LLVM Phabricator instance.

[Polly][PPCGCodeGen] OpenCL now gets kernel argument size from PPCG CodeGen
ClosedPublic

Authored by PhilippSchaad on May 8 2017, 12:35 AM.

Details

Summary

PPCGCodeGeneration now attaches the size of the kernel launch parameters at the end of the parameter list. For the existing CUDA Runtime, this gets ignored, but the OpenCL Runtime knows to check for kernel-argument size at the end of the parameter list. (The resulting parameters list is twice as long. This has been accounted for in the corresponding test cases).

Event Timeline

PhilippSchaad created this revision.May 8 2017, 12:35 AM
PhilippSchaad edited the summary of this revision. (Show Details)May 8 2017, 12:37 AM
PhilippSchaad set the repository for this revision to rL LLVM.
bollu added inline comments.May 8 2017, 1:34 AM
lib/CodeGen/PPCGCodeGeneration.cpp
1224–1227

I believe int NumArgs = F->getArgumentList().size() will also work? seems a little clearer to me.

1224–1227

NumArgs could be const? It doesn't seem to be mutated anywhere.

1225

why not std::vector<int>(NumArgs)? That removes the explicit memory allocation/deallocation. I wish std::dynarray existed, since that's exactly the use case for this.

1302

The computation of SizeInBytes seems to be a pure computation which is repeated thrice in this function (GPUNodeBuilder::createLaunchParameters). Could this be refactored into a standalone function?

That would also make the assignment look a little cleaner:

ArgSizes[Index] = computeSizeInBytes(Val->getType());
1363

Could the pattern of

to-split.cpp
Value *Slot = Builder.CreateGEP(
        Parameters, {Builder.getInt64(0), Builder.getInt64(Index)});
    Value *ParamTyped =
        Builder.CreatePointerCast(Param, Builder.getInt8PtrTy());
    Builder.CreateStore(ParamTyped, Slot);

be refactored into a separate function if it is not too much trouble? It occurs thrice in this function (GPUNodeBuilder::createLaunchParameters)

PhilippSchaad marked 3 inline comments as done.May 8 2017, 2:01 AM
PhilippSchaad added inline comments.
lib/CodeGen/PPCGCodeGeneration.cpp
1224–1227

The type llvm::Function does not seem to have a member getArgumentList()? Also, I have just adapted that from before, but I agree that something like that would be clearer.

1363

Looking into it.

  • Addressed some comments.
bollu added inline comments.May 8 2017, 3:32 AM
lib/CodeGen/PPCGCodeGeneration.cpp
1224–1227

Hm, it looks like it has been removed. [I was looking at an older reference to Function::getArgumentList](http://llvm.org/docs/doxygen/html/classllvm_1_1Function.html#a0a46edcf9b885556850d8ed9c49d9b52). Apologies!

I believe that [llvm::Function::arg_size()](http://llvm.org/doxygen/classllvm_1_1Function.html#abccf59dbcc12707d079124e6bcfb4a47) should work for this purpose.

Addressed comments

PhilippSchaad marked 5 inline comments as done.May 8 2017, 10:44 AM
bollu edited edge metadata.May 9 2017, 12:45 AM

This can be done in another patch, but I think the Index++ calls inside for loops can be hoisted as for(..; ..; i++, Index++). I personally find this clearer, but I'm not sure what the others think.

lib/CodeGen/PPCGCodeGeneration.cpp
146

getPrimitiveSizeInBits and getScalarSizeInBits accept const Type *T. Could you please change the Type *T to const Type *T?

1225

Have you considered std::vector<int> ArgSizes(NumArgs);? I find this more idiomatic.

I personally find it better like this, because the i is per-loop, so it is self contained in every iteration, the Index however is across loops. To me, this way it is more clear, that the Index is not a loop-specific iteration variable.
Let's see what the others think though. Either way, I would suggest moving that to a different patch indeed, should we decide on changing that.

(Will address your commented points in just a second, agree with both of them.)

Added constant identifier and mor idiomatic ArgSizes vector initialization

PhilippSchaad marked 2 inline comments as done.May 9 2017, 2:11 AM
grosser edited edge metadata.May 9 2017, 2:17 AM

I personally like the Index++ calls in the loop body, just to add another vote to the bikeshedding. I agree that such a change is unrelated to this patch and if anybody feels strong, we should have a dedicated discussion about this topic outside of this thread. In any case, this should not affect this patch.

bollu accepted this revision.May 9 2017, 2:24 AM

LGTM.

This revision is now accepted and ready to land.May 9 2017, 2:24 AM

Nice. @bollu, can you commit this change immediately.

This revision was automatically updated to reflect the committed changes.