Let llvm.invariant.group.barrier accepts pointer to any address space
ClosedPublic

Authored by yaxunl on Mon, Nov 13, 12:53 PM.

Details

Summary

llvm.invariant.group.barrier may accept pointers to arbitrary address space.

This patch let it accept pointers to i8 in any address space and returns
pointer to i8 in the same address space.

Corresponding clang change: https://reviews.llvm.org/D40062

Diff Detail

Repository
rL LLVM
yaxunl created this revision.Mon, Nov 13, 12:53 PM
rampitec added inline comments.Mon, Nov 13, 12:59 PM
lib/Analysis/CaptureTracking.cpp
247 ↗(On Diff #122705)

Can we mark it as "NoCapture<0>" instead?

yaxunl added inline comments.Mon, Nov 13, 1:47 PM
lib/Analysis/CaptureTracking.cpp
247 ↗(On Diff #122705)

line 243 also checks I->getType()->isVoidTy(), but this intrinsic returns a pointer. So even if we make it NoCapture, it still does not work.

rampitec added inline comments.Mon, Nov 13, 1:52 PM
lib/Analysis/CaptureTracking.cpp
247 ↗(On Diff #122705)

Checking for a specific intrinsic seems error-prone. What if the pointer returned by the call is further used in the caller?

rampitec added inline comments.Mon, Nov 13, 1:58 PM
lib/Analysis/CaptureTracking.cpp
247 ↗(On Diff #122705)

Maybe it is possible to add a check for VoidTy with the condition, that value is unused if returned from non-void?

yaxunl added inline comments.Mon, Nov 13, 2:34 PM
lib/Analysis/CaptureTracking.cpp
247 ↗(On Diff #122705)

Good idea. Will do.

yaxunl updated this revision to Diff 122731.Mon, Nov 13, 2:38 PM

Revised by Stas' comments.

This revision is now accepted and ready to land.Mon, Nov 13, 2:44 PM
arsenm added inline comments.Mon, Nov 13, 2:52 PM
lib/Analysis/CaptureTracking.cpp
244 ↗(On Diff #122731)

I don't understand the I->use_empty() check, or how this is related to the address space change

arsenm added inline comments.Mon, Nov 13, 2:54 PM
test/CodeGen/AMDGPU/promote-alloca-invariant-markers.ll
2 ↗(On Diff #122731)

Why is it necessary to add this? The datalayout is implied by the explicit triple

yaxunl added inline comments.Mon, Nov 13, 6:43 PM
lib/Analysis/CaptureTracking.cpp
244 ↗(On Diff #122731)

llvm.invariant.group.barrier only accepts pointer in default addr space. As such, the alloca has to be addrspacecasted then passed to llvm.invariant.group.barrier. In AMDGPUPromoteAlloca pass, addrspacecast is checked through CaptureTracking. The addrspacecast inst is passed to llvm.invariant.group.barrier, CaptureTracking thinks the intrinsic captures the addrspacecast inst because the intrinsic does not have void return type, which is incorrect. If the intrinsic only reads mem, does not throw, even if it returns a pointer, but if the returned pointer is not used, it still does not capture the pointer. So after this is fixed, CaptureTracking will identify the addrspacecast as not captured, then AMDGPUPromoteAlloca will identify the alloca as not captured, and then it can be promoted. Otherwise the alloca will not be promoted.

test/CodeGen/AMDGPU/promote-alloca-invariant-markers.ll
2 ↗(On Diff #122731)

Removing this will cause error:

llc: <stdin>:13:31: error: address space must match datalayout

%tmp = alloca i32, align 4, addrspace(5)

This is a workaround for invariant.group.barrier not having a mangled type. That should probably be fixed instead. The others already do, so it doesn't make sense that group.barrier does not

test/CodeGen/AMDGPU/promote-alloca-invariant-markers.ll
2 ↗(On Diff #122731)

This sounds like a bug. This should be coming from the datalayout

arsenm requested changes to this revision.Mon, Nov 13, 10:18 PM
This revision now requires changes to proceed.Mon, Nov 13, 10:18 PM

Firstly, are you sure that the invariant.group.barrier is the problem here? I didn't hear that it is used anywhere besides devirtualization, that is not turned on by default.
Here is my old path that was never reviewed:
https://reviews.llvm.org/D32673

As you can see, in order to check if the pointer passed to the barrier is captured, we need to check if all uses of the pointer returned by the barrier is not captured.

Firstly, are you sure that the invariant.group.barrier is the problem here? I didn't hear that it is used anywhere besides devirtualization, that is not turned on by default.
Here is my old path that was never reviewed:
https://reviews.llvm.org/D32673

As you can see, in order to check if the pointer passed to the barrier is captured, we need to check if all uses of the pointer returned by the barrier is not captured.

Firstly, are you sure that the invariant.group.barrier is the problem here? I didn't hear that it is used anywhere besides devirtualization, that is not turned on by default.
Here is my old path that was never reviewed:
https://reviews.llvm.org/D32673

As you can see, in order to check if the pointer passed to the barrier is captured, we need to check if all uses of the pointer returned by the barrier is not captured.

I am trying to fix a lit test failure related to invariant.group.barrier. There are two issues:

  1. as Matt pointed out, invariant.group.barrier should accept pointer to any address space as parameter, since alloca address space may not be 0, e.g. for amdgcn---amdgiz triple, the pointer argument of invariant.group.barrier is in addr space 5.
  1. CaptureTracking does not handle invariant.group.barrier correctly, which is what your patch and this patch tries to fix. Since you already have a patch for that, I would drop my fix and focus on the first issue. However as Stas suggested, it may be better to make the fix more generic instead of do special handling for invariant.group.barrier, e.g. is it OK to make it a generic checking for all functions?

if the func does not read mem
if the func does not throw
if the func returns nothing, or the returned pointer is not captured

then pointer argument is not captured

yaxunl added inline comments.Tue, Nov 14, 11:11 AM
test/CodeGen/AMDGPU/promote-alloca-invariant-markers.ll
2 ↗(On Diff #122731)

The error msg comes from LLParser::ParseAlloc.

llc load and parse IR before setting target triple and datalayout for the module.

When llc parses IR, it has to use the datalayout defined by the IR itself. If the IR does not define datalayout, the default layout created from empty string is used.

Currently LLParser does not allow overriding the datalayout of the IR by a given datalayout.

If we want to change this, we need to add an API to LLParser to allow datalayout to be set before parsing IR, and change llc to set datalayout before parsing IR. I think this is better left for another patch.

yaxunl updated this revision to Diff 122958.Tue, Nov 14, 6:19 PM
yaxunl retitled this revision from Fix CaptureTracking for llvm.invariant.group.barrier to Let llvm.invariant.group.barrier accepts pointer to any address space.
yaxunl edited the summary of this revision. (Show Details)

Make llvm.invariant.group.barrier mangled by Matt's comments.

yaxunl edited the summary of this revision. (Show Details)Tue, Nov 14, 6:25 PM
Prazek accepted this revision.Thu, Nov 16, 4:24 AM

LGTM

This revision was automatically updated to reflect the committed changes.