Page MenuHomePhabricator

[CUDA][HIP][OpenMP] Emit deferred diagnostics by a post-parsing AST travese
ClosedPublic

Authored by yaxunl on Nov 13 2019, 4:35 AM.

Details

Summary

This patch removes the explicit call graph for CUDA/HIP/OpenMP deferred diagnostics generated during parsing
since it is error prone due to incomplete information about function declarations during parsing. In stead,
this patch does a post-parsing AST traverse and emits deferred diagnostics based on the use graph implicitly
generated during the traverse.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes

revised by John's comments.

rjmccall added inline comments.Feb 3 2020, 3:10 PM
clang/lib/Sema/SemaExpr.cpp
17127

This should inherit from EvaluatedExprVisitor<Derived>, or else calls from EvaluatedExprVisitor and above won't dispatch all the way down to the subclass. This will allow subclasses to do node-specific logic, like your subclass's handling of InOMPDeviceContext or EvaluatedExprMarker's need to do custom things with local variables, DREs, and MEs.

Please also define this in a header; it doesn't need to be file-specific. I guess it needs a Sema & because of the call to LookupDestructor, so lib/Sema is probably the right place for that header.

17152

Let's not have both a visitDeclRefExpr and a VisitDeclRefExpr, distinguished only by capitalization.

17158

Please have all these call sites call asImpl().visitUsedDecl directly, and then don't define it in this class.

17195

This should be in your OMP-specific subclass.

yaxunl updated this revision to Diff 242251.Feb 3 2020, 8:29 PM
yaxunl marked 5 inline comments as done.

revised by John's comments.

rjmccall added inline comments.Feb 3 2020, 8:46 PM
clang/lib/Sema/SemaExpr.cpp
17254

Thanks, this looks a lot better.

Should this be moved to SemaOpenMP.cpp (and renamed to be OpenMP-specific), or do you think it's going to be useful in other modes?

clang/lib/Sema/UsedDeclVisitor.h
1

Please fix this line.

9

"a CRTP class which visits all the declarations that are ODR-used by an expression or statement."

65

It's generally best to asImpl() when restarting on a sub-expression like this, just in case the derived class wants to do something there. Same thing in VisitCXXBindTemporaryExpr.

yaxunl updated this revision to Diff 242296.Feb 4 2020, 4:59 AM
yaxunl marked 4 inline comments as done.

revised by John's comments.

clang/lib/Sema/SemaExpr.cpp
17254

It is not just for OpenMP. Deferred diagnostics are also emitted by CUDA/HIP.

One minor request, but otherwise LGTM; feel free to commit with that change.

clang/lib/Sema/SemaExpr.cpp
17254

Okay. Can it go in Sema.cpp next to the other overload of emitDeferredDiags, then? There isn't really much purpose to it being in this file.

yaxunl marked 2 inline comments as done.Feb 4 2020, 8:08 AM
yaxunl added inline comments.
clang/lib/Sema/SemaExpr.cpp
17254

will do when committing. thanks.

This revision was not accepted when it landed; it landed in state Needs Review.Feb 16 2020, 7:47 PM
This revision was automatically updated to reflect the committed changes.
yaxunl marked an inline comment as done.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 16 2020, 7:47 PM
hliao added a subscriber: hliao.Feb 16 2020, 8:00 PM

one header is missing and breaks the build

clang/lib/Sema/Sema.cpp
14

this file is missing and breaks the build

MaskRay added inline comments.
clang/lib/Sema/Sema.cpp
14
Fznamznon added inline comments.Feb 17 2020, 8:17 AM
clang/lib/Sema/Sema.cpp
1469

This particular change causes duplication of deferred diagnostics.
Consider the following example (please correct me if I'm doing something wrong, I'm not an expert in OpenMP):

int foobar1() { throw 1; } // error is expected here

// let's try to use foobar1 in the code where exceptions aren't allowed
#pragma omp declare target    
int (*B)() = &foobar1;        
#pragma omp end declare target

// and in some other place let's use foobar1 in device code again
#pragma omp declare target    
int a = foobar1();            
#pragma omp end declare target

Then diagnostic for foobar1 will be duplicated for each use of foobar1 under target directive.
I first experienced this behavior not with OpenMP, so I suppose reproducer can be done for each programming model which uses deferred diagnostics.

yaxunl marked an inline comment as done.Feb 17 2020, 10:27 AM
yaxunl added inline comments.
clang/lib/Sema/Sema.cpp
1469

The change is intentional so that each call chain causing the diagnostic can be identified. The drawback is that it is more verbose.

I can change this behavior so that the diagnostic will be emitted only for the first call chain that causes the diagnostic, if less verbose diagnostics is preferred.

This seems to result in triggering clang/lib/CodeGen/CGExpr.cpp:2626 when compiling mlir/lib/Transforms/AffineDataCopyGeneration.cpp with clang build with assertions on (clean build at e8e078c just before this change, broken at this, assert triggering at build fix commit).

https://buildkite.com/mlir/mlir-core/builds/2792#a54fb239-718b-4f0b-a309-f83e46ceb252

This seems to result in triggering clang/lib/CodeGen/CGExpr.cpp:2626 when compiling mlir/lib/Transforms/AffineDataCopyGeneration.cpp with clang build with assertions on (clean build at e8e078c just before this change, broken at this, assert triggering at build fix commit).

https://buildkite.com/mlir/mlir-core/builds/2792#a54fb239-718b-4f0b-a309-f83e46ceb252

Seems reasonable to revert if there's a testcase that they can get from rebuilding llvm with mlir enabled.

erichkeane added inline comments.
clang/lib/Sema/Sema.cpp
1521

Note that when recommitting this (if you choose to), this needs to also handle NamespaceDecl. We're a downstream and discovered that this doesn't properly handle functions or records handled in a namespace.

It can be implemented identically to TranslationUnitDecl.

rjmccall added inline comments.Feb 19 2020, 10:31 AM
clang/lib/Sema/Sema.cpp
1521

Wait, what? We shouldn't be doing this for TranslationUnitDecl either. I don't even know how we're "using" a TranslationUnitDecl, but neither this case not the case for NamespaceDecl should be recursively using every declaration declared inside it. If there's a declaration in a namespace that's being used, it should be getting visited as part of the actual use of it.

The logic for RecordDecl has the same problem.

erichkeane added inline comments.Feb 19 2020, 10:44 AM
clang/lib/Sema/Sema.cpp
1521

Despite the name, this seems to be more of a home-written ast walking class. The entry point is the 'translation unit' which seems to walk through everything in an attempt to find all the functions (including those that are 'marked' as used by an attribute).

You'll see the FunctionDecl section makes this assumption as well (not necessarily that we got to a function via a call). IMO, this approach is strange, and we should register entry points in some manner (functions marked as emitted to the device in some fashion), then just follow its call-graph (via the clang::CallGraph?) to emit all of these functions.

It seemed really odd to see this approach here, but it seemed well reviewed by the time I noticed it (via a downstream bug) so I figured I'd lost my chance to disagree with the approach.

rjmccall added inline comments.Feb 19 2020, 10:56 AM
clang/lib/Sema/Sema.cpp
1521

Sure, but visitUsedDecl isn't the right place to be entering the walk. visitUsedDecl is supposed to be the *callback* from the walk. If they need to walk all the global declarations to find kernels instead of tracking the kernels as they're encountered (which would be a *much* better approach), it should be done as a separate function.

I just missed this in the review.

Seems to me, it causes some other issues. See https://bugs.llvm.org/show_bug.cgi?id=44948 for example

Seems to me, it causes some other issues. See https://bugs.llvm.org/show_bug.cgi?id=44948 for example

I will fix that bug.

yaxunl marked an inline comment as done.Feb 19 2020, 2:05 PM
yaxunl added inline comments.
clang/lib/Sema/Sema.cpp
1521

The deferred diagnostics could be initiated by non-kernel functions or even host functions.

Let's consider a device code library where no kernels are defined. A device function is emitted, which calls a host device function which has a deferred diagnostic. All device functions that are emitted need to be checked.

Same with host functions that are emitted, which may call a host device function which has deferred diagnostic.

Also not just function calls need to be checked. A function address may be taken then called through function pointer. Therefore any reference to a function needs to be followed.

In the case of OpenMP, the initialization of a global function pointer which refers a function may trigger a deferred diangostic. There are tests for that.

rjmccall added inline comments.Feb 19 2020, 3:32 PM
clang/lib/Sema/Sema.cpp
1521

Right, I get that emitting deferred diagnostics for a declaration D needs to trigger any deferred diagnostics in declarations used by D, recursively. You essentially have a graph of lazily-emitted declarations (which may or may not have deferred diagnostics) and a number of eagerly-emitted "root" declarations with use-edges leading into that graph. Any declaration that's reachable from a root will need to be emitted and so needs to have any deferred diagnostics emitted as well. My question is why you're finding these roots with a retroactive walk of the entire translation unit instead of either building a list of roots as you go or (better yet) building a list of lazily-emitted declarations that are used by those roots. You can unambiguously identify at the point of declaration whether an entity will be eagerly or lazily emitted, right? If you just store those initial edges into the lazily-emitted declarations graph and then initiate the recursive walk from them at the end of the translation unit, you'll only end up walking declarations that are actually relevant to your compilation, so you'll have much better locality and (if this matters to you) you'll naturally work a lot better with PCH and modules.

yaxunl marked an inline comment as done.Feb 20 2020, 8:57 AM
yaxunl added inline comments.
clang/lib/Sema/Sema.cpp
1521

I will try the approach you suggested. Basically I will record the emitted functions and variables during parsing and use them as starting point for the final traversal.

This should work for CUDA/HIP. However it may be tricky for OpenMP since the emission of some entities depending on pragmas. Still it may be doable. If I encounter difficulty I will come back for discussion.

I will post the change for review.

Thanks.

bader added inline comments.Feb 20 2020, 9:06 AM
clang/lib/Sema/Sema.cpp
1521

FYI: SYCL is also using deferred diagnostics engine to emit device side diagnostics, although this part hasn't been up-streamed yet, but we are tracking changes in this area.
SYCL support implementation should be quite similar to CUDA/HIP.

I tried recording functions to be emitted during normal parsing and using it as starting point for the final traversal. It is quite promising. I only get one lit test failure for OpenMP:

int foobar2();

#pragma omp declare target
int (*B)() = &foobar2;
#pragma omp end declare target

int foobar2() { throw 1; } // expected-error {{cannot use 'throw' with exceptions disabled}}

In this case, the emission state of foobar2 cannot be determined by itself. It can only be determined to be emitted through variable B. Therefore, I also need to record variables that are potentially emitted.

I tried recording functions to be emitted during normal parsing and using it as starting point for the final traversal. It is quite promising. I only get one lit test failure for OpenMP:

int foobar2();

#pragma omp declare target
int (*B)() = &foobar2;
#pragma omp end declare target

int foobar2() { throw 1; } // expected-error {{cannot use 'throw' with exceptions disabled}}

In this case, the emission state of foobar2 cannot be determined by itself. It can only be determined to be emitted through variable B. Therefore, I also need to record variables that are potentially emitted.

Okay. Sounds like you have some common cause with https://reviews.llvm.org/D71227, then. Pinging @hliao.

Also, we cannot remove traversing of RecordDecl and CapturedDecl encountered in function body since we have OpenMP test like this:

int main() {
#pragma omp target
  {
    t1(0);
  }
  return 0;
}

This results in a kernel function embedded in a captured record decl in AST. We have to drill into the record decl to get the kernel and the function called by it.

I still got assertion when I use the built clang with check-mlir. The reduced testcase is

class A {
public:
  int foo();
};

static A a;

struct B {
  B(int x = a.foo());
};

void test() {
  B x;
}

The assertion I got is:

clang: /home/yaxunl/git/llvm/llvm/tools/clang/lib/CodeGen/CGExpr.cpp:2628: clang::CodeGen::LValue clang::CodeGen::CodeGenFunction::EmitDeclRefLValue(const clang::DeclRefExpr *): Assertion `(ND->isUsed(false) || !isa<VarDecl>(ND) || E->isNonOdrUse() || !E->getLocation().isValid()) && "Should not use decl without marking it used!"' failed.
Stack dump:


 #0 0x000000000258c614 PrintStackTraceSignalHandler(void*) (/home/yaxunl/git/llvm/assert/bin/clang+0x258c614)
 #1 0x000000000258a1ae llvm::sys::RunSignalHandlers() (/home/yaxunl/git/llvm/assert/bin/clang+0x258a1ae)
 #2 0x000000000258b7a2 llvm::sys::CleanupOnSignal(unsigned long) (/home/yaxunl/git/llvm/assert/bin/clang+0x258b7a2)
 #3 0x000000000251d0c3 (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) (/home/yaxunl/git/llvm/assert/bin/clang+0x251d0c3)
 #4 0x000000000251d1fc CrashRecoverySignalHandler(int) (/home/yaxunl/git/llvm/assert/bin/clang+0x251d1fc)
 #5 0x00007f0dde3bf390 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x11390)
 #6 0x00007f0ddcf29428 raise /build/glibc-LK5gWL/glibc-2.23/signal/../sysdeps/unix/sysv/linux/raise.c:54:0
 #7 0x00007f0ddcf2b02a abort /build/glibc-LK5gWL/glibc-2.23/stdlib/abort.c:91:0
 #8 0x00007f0ddcf21bd7 __assert_fail_base /build/glibc-LK5gWL/glibc-2.23/assert/assert.c:92:0
 #9 0x00007f0ddcf21c82 (/lib/x86_64-linux-gnu/libc.so.6+0x2dc82)
#10 0x0000000002a1a5df clang::CodeGen::CodeGenFunction::EmitDeclRefLValue(clang::DeclRefExpr const*) (/home/yaxunl/git/llvm/assert/bin/clang+0x2a1a5df)
#11 0x0000000002a0dfb6 clang::CodeGen::CodeGenFunction::EmitLValue(clang::Expr const*) (/home/yaxunl/git/llvm/assert/bin/clang+0x2a0dfb6)
#12 0x0000000002a39973 clang::CodeGen::CodeGenFunction::EmitCXXMemberOrOperatorMemberCallExpr(clang::CallExpr const*, clang::CXXMethodDecl const*, clang::CodeGen::ReturnValueSlot, bool, clang::NestedNameSpecifier*, bool, clang::Expr const*) (/home/yaxunl/git/llvm/assert/bin/clang+0x2a39973)
#13 0x0000000002a389b9 clang::CodeGen::CodeGenFunction::EmitCXXMemberCallExpr(clang::CXXMemberCallExpr const*, clang::CodeGen::ReturnValueSlot) (/home/yaxunl/git/llvm/assert/bin/clang+0x2a389b9)
#14 0x0000000002a28f95 clang::CodeGen::CodeGenFunction::EmitCallExpr(clang::CallExpr const*, clang::CodeGen::ReturnValueSlot) (/home/yaxunl/git/llvm/assert/bin/clang+0x2a28f95)
#15 0x0000000002a5be29 (anonymous namespace)::ScalarExprEmitter::VisitCallExpr(clang::CallExpr const*) (/home/yaxunl/git/llvm/assert/bin/clang+0x2a5be29)
#16 0x0000000002a55b19 clang::StmtVisitorBase<std::add_pointer, (anonymous namespace)::ScalarExprEmitter, llvm::Value*>::Visit(clang::Stmt*) (/home/yaxunl/git/llvm/assert/bin/clang+0x2a55b19)
#17 0x0000000002a4b615 clang::CodeGen::CodeGenFunction::EmitScalarExpr(clang::Expr const*, bool) (/home/yaxunl/git/llvm/assert/bin/clang+0x2a4b615)
#18 0x0000000002a0da30 clang::CodeGen::CodeGenFunction::EmitAnyExpr(clang::Expr const*, clang::CodeGen::AggValueSlot, bool) (/home/yaxunl/git/llvm/assert/bin/clang+0x2a0da30)
#19 0x0000000002a0edde clang::CodeGen::CodeGenFunction::EmitAnyExprToTemp(clang::Expr const*) (/home/yaxunl/git/llvm/assert/bin/clang+0x2a0edde)
#20 0x00000000029cdd6b clang::CodeGen::CodeGenFunction::EmitCallArg(clang::CodeGen::CallArgList&, clang::Expr const*, clang::QualType) (/home/yaxunl/git/llvm/assert/bin/clang+0x29cdd6b)
#21 0x00000000029ccc41 clang::CodeGen::CodeGenFunction::EmitCallArgs(clang::CodeGen::CallArgList&, llvm::ArrayRef<clang::QualType>, llvm::iterator_range<clang::Stmt::CastIterator<clang::Expr, clang::Expr const* const, clang::Stmt const* const> >, clang::CodeGen::CodeGenFunction::AbstractCallee, unsigned int, clang::CodeGen::CodeGenFunction::EvaluationOrder) (/home/yaxunl/git/llvm/assert/bin/clang+0x29ccc41)
#22 0x00000000028d8e7b void clang::CodeGen::CodeGenFunction::EmitCallArgs<clang::FunctionProtoType>(clang::CodeGen::CallArgList&, clang::FunctionProtoType const*, llvm::iterator_range<clang::Stmt::CastIterator<clang::Expr, clang::Expr const* const, clang::Stmt const* const> >, clang::CodeGen::CodeGenFunction::AbstractCallee, unsigned int, clang::CodeGen::CodeGenFunction::EvaluationOrder) (/home/yaxunl/git/llvm/assert/bin/clang+0x28d8e7b)
#23 0x00000000029de431 clang::CodeGen::CodeGenFunction::EmitCXXConstructorCall(clang::CXXConstructorDecl const*, clang::CXXCtorType, bool, bool, clang::CodeGen::AggValueSlot, clang::CXXConstructExpr const*) (/home/yaxunl/git/llvm/assert/bin/clang+0x29de431)
#24 0x0000000002a3b84e clang::CodeGen::CodeGenFunction::EmitCXXConstructExpr(clang::CXXConstructExpr const*, clang::CodeGen::AggValueSlot) (/home/yaxunl/git/llvm/assert/bin/clang+0x2a3b84e)
#25 0x0000000002a32a8b (anonymous namespace)::AggExprEmitter::VisitCXXConstructExpr(clang::CXXConstructExpr const*) (/home/yaxunl/git/llvm/assert/bin/clang+0x2a32a8b)
#26 0x0000000002a2d44f clang::CodeGen::CodeGenFunction::EmitAggExpr(clang::Expr const*, clang::CodeGen::AggValueSlot) (/home/yaxunl/git/llvm/assert/bin/clang+0x2a2d44f)
#27 0x00000000029f96fc clang::CodeGen::CodeGenFunction::EmitExprAsInit(clang::Expr const*, clang::ValueDecl const*, clang::CodeGen::LValue, bool) (/home/yaxunl/git/llvm/assert/bin/clang+0x29f96fc)
#28 0x00000000029f68d9 clang::CodeGen::CodeGenFunction::EmitAutoVarInit(clang::CodeGen::CodeGenFunction::AutoVarEmission const&) (/home/yaxunl/git/llvm/assert/bin/clang+0x29f68d9)
#29 0x00000000029f1ca5 clang::CodeGen::CodeGenFunction::EmitVarDecl(clang::VarDecl const&) (/home/yaxunl/git/llvm/assert/bin/clang+0x29f1ca5)
#30 0x00000000029f1935 clang::CodeGen::CodeGenFunction::EmitDecl(clang::Decl const&) (/home/yaxunl/git/llvm/assert/bin/clang+0x29f1935)
#31 0x00000000027e07fb clang::CodeGen::CodeGenFunction::EmitDeclStmt(clang::DeclStmt const&) (/home/yaxunl/git/llvm/assert/bin/clang+0x27e07fb)
#32 0x00000000027d7a4c clang::CodeGen::CodeGenFunction::EmitSimpleStmt(clang::Stmt const*) (/home/yaxunl/git/llvm/assert/bin/clang+0x27d7a4c)
#33 0x00000000027d66cb clang::CodeGen::CodeGenFunction::EmitStmt(clang::Stmt const*, llvm::ArrayRef<clang::Attr const*>) (/home/yaxunl/git/llvm/assert/bin/clang+0x27d66cb)
#34 0x00000000027e15f0 clang::CodeGen::CodeGenFunction::EmitCompoundStmtWithoutScope(clang::CompoundStmt const&, bool, clang::CodeGen::AggValueSlot) (/home/yaxunl/git/llvm/assert/bin/clang+0x27e15f0)
#35 0x000000000282ffb6 clang::CodeGen::CodeGenFunction::GenerateCode(clang::GlobalDecl, llvm::Function*, clang::CodeGen::CGFunctionInfo const&) (/home/yaxunl/git/llvm/assert/bin/clang+0x282ffb6)
#36 0x000000000284dc52 clang::CodeGen::CodeGenModule::EmitGlobalFunctionDefinition(clang::GlobalDecl, llvm::GlobalValue*) (/home/yaxunl/git/llvm/assert/bin/clang+0x284dc52)
#37 0x0000000002845cc7 clang::CodeGen::CodeGenModule::EmitGlobalDefinition(clang::GlobalDecl, llvm::GlobalValue*) (/home/yaxunl/git/llvm/assert/bin/clang+0x2845cc7)
#38 0x0000000002852271 clang::CodeGen::CodeGenModule::EmitTopLevelDecl(clang::Decl*) (/home/yaxunl/git/llvm/assert/bin/clang+0x2852271)

It is weird since this is not en OpenMP nor CUDA program and there is no deferred diags involved.

It seems somehow my change caused some decl missing used flag.

yaxunl updated this revision to Diff 247275.Feb 28 2020, 8:24 AM

Do not traverse the whole CU. Record potentially emitted functions and variables in the normal parsing and traverse them instead.

Also fixed bug 44948 and regression in check-mlir.

rnk removed a subscriber: rnk.Mar 6 2020, 2:14 PM
rjmccall added inline comments.Mar 7 2020, 11:57 AM
clang/include/clang/Sema/Sema.h
1498

This needs to be saved and restored in modules / PCH.

clang/lib/Sema/Sema.cpp
1479–1555

Hmm. I know this is existing code, but I just realized something. I think it's okay to not emit the notes on every diagnostic, but you might want to emit them on the first diagnostic from a function instead of after the last. If the real bug is that the program is using something it's not supposed to use, and there are enough errors in that function to reach the error limit, then the diagnostics emitter will associate these notes with a diagnostic it's suppressing and so presumably suppress them as well, leaving the user with no way to find this information.

1501

This needs to trigger if you use a variable with delayed diagnostics, too, right?

When you add these methods to UsedDeclVisitor, you'll be able to remove them here.

1519

Should this also go in the base UsedDeclVisitor? I'm less sure about that because the captured statement is really always a part of the enclosing function, right? Should the delay mechanism just be looking through local sub-contexts instead?

clang/lib/Sema/UsedDeclVisitor.h
21

Could you add this in a separate patch?

30

There should definitely be cases in here for every expression that uses a declaration, including both DeclRefExpr and MemberExpr. Those might be overridden in subclasses, but that's their business; the default behavior should be to visit every used decl.

yaxunl marked 23 inline comments as done.Mar 16 2020, 6:01 PM
yaxunl added inline comments.
clang/include/clang/Sema/Sema.h
1498

done

clang/lib/Sema/Sema.cpp
1469

the change is intentional to report all use chains which result in deferred diagnostics, otherwise user may fix one issue then see another issue, instead of see all of the issues in one compilation.

1479–1555

done

1501

fixed

1519

yes this one should also go to UsedDeclVisitor since this statement causes a RecordDecl generated which includes a FunctionDecl for a kernel, therefore this RecordDecl needs to be visited as used decl. I am not sure if other sub-context have the same effect. If so, I think they need to be handled case by case.

clang/lib/Sema/UsedDeclVisitor.h
21
30

done

yaxunl updated this revision to Diff 250666.Mar 16 2020, 6:06 PM
yaxunl marked 6 inline comments as done.

revised by John's comments.

rjmccall added inline comments.Mar 16 2020, 7:30 PM
clang/lib/Sema/Sema.cpp
1521

Okay, thank you. Do you still need all the cases in here for records, templates, and so on? It looks to me like you should always end up here with exactly the variables and functions that are being used, and you should never need to make special efforts to e.g. visit all the specializations of a template or visit all the methods of a class.

1540

Can there also be deferred diagnostics associated with this initializer?

clang/lib/Sema/SemaDecl.cpp
12229

DeclsToCheckForDeferredDiags is basically a set of declarations that you know to have to emit, right? It doesn't seem right to be adding every variable with an initializer to that set — especially because I'm pretty sure this function gets called for literally every variable with an initializer, including local variables. Presumably you only need to do this for global variables that you're definitely going to emit in the current mode.

yaxunl updated this revision to Diff 250960.Mar 17 2020, 6:49 PM
yaxunl marked 6 inline comments as done.

revised by John's comments.

clang/lib/Sema/Sema.cpp
1521

I can remove handling of templates and records. However I have to keep the handling of CapturedDecl. It is generated from code like

void t1(int r) {}

int main() {
#pragma omp target
  {
    t1(0);
  }
  return 0;
}

And it is like a function decl embeded in function main, e.g.

-FunctionDecl 0x86f7c70 <line:8:1, line:15:1> line:8:5 main 'int ()'
  `-CompoundStmt 0x873c3f8 <col:12, line:15:1>
    |-OMPTargetDirective 0x873c3a0 <line:9:1, col:19>
    | `-CapturedStmt 0x873c378 <line:10:3, line:13:3>
    |   `-CapturedDecl 0x873bd18 <<invalid sloc>> <invalid sloc> nothrow
    |     |-CapturedStmt 0x873c350 <line:10:3, line:13:3>
    |     | `-CapturedDecl 0x873c198 <<invalid sloc>> <invalid sloc> nothrow
    |     |   |-CompoundStmt 0x873c338 <line:10:3, line:13:3>
    |     |   | `-CallExpr 0x873c310 <line:12:5, col:9> 'void'
    |     |   |   |-ImplicitCastExpr 0x873c2f8 <col:5> 'void (*)(int)' <FunctionToPointerDecay>
    |     |   |   | `-DeclRefExpr 0x873c290 <col:5> 'void (int)' Function 0x86f7b18 't1' 'void (int)'
    |     |   |   `-IntegerLiteral 0x873c2b0 <col:8> 'int' 0
    |     |   `-ImplicitParamDecl 0x873c228 <line:9:1> col:1 implicit __context 'struct (anonymous at nvptx_va_arg_delayed_diags2.c:9:1) *const restrict'
    |     |-AlwaysInlineAttr 0x873c040 <<invalid sloc>> Implicit __forceinline
    |     |-ImplicitParamDecl 0x873bda0 <col:1> col:1 implicit .global_tid. 'const int'
    |     |-ImplicitParamDecl 0x873be08 <col:1> col:1 implicit .part_id. 'const int *const restrict'
    |     |-ImplicitParamDecl 0x873be70 <col:1> col:1 implicit .privates. 'void *const restrict'
    |     |-ImplicitParamDecl 0x873bed8 <col:1> col:1 implicit .copy_fn. 'void (*const restrict)(void *const restrict, ...)'
    |     |-ImplicitParamDecl 0x873bf40 <col:1> col:1 implicit .task_t. 'void *const'
    |     |-ImplicitParamDecl 0x873bfd8 <col:1> col:1 implicit __context 'struct (anonymous at nvptx_va_arg_delayed_diags2.c:9:1) *const restrict'
    |     |-RecordDecl 0x873c098 <col:1> col:1 implicit struct definition
    |     | `-CapturedRecordAttr 0x873c140 <<invalid sloc>> Implicit
    |     `-CapturedDecl 0x873c198 <<invalid sloc>> <invalid sloc> nothrow
    |       |-CompoundStmt 0x873c338 <line:10:3, line:13:3>
    |       | `-CallExpr 0x873c310 <line:12:5, col:9> 'void'
    |       |   |-ImplicitCastExpr 0x873c2f8 <col:5> 'void (*)(int)' <FunctionToPointerDecay>
    |       |   | `-DeclRefExpr 0x873c290 <col:5> 'void (int)' Function 0x86f7b18 't1' 'void (int)'
    |       |   `-IntegerLiteral 0x873c2b0 <col:8> 'int' 0
    |       `-ImplicitParamDecl 0x873c228 <line:9:1> col:1 implicit __context 'struct (anonymous at nvptx_va_arg_delayed_diags2.c:9:1) *const restrict'
    `-ReturnStmt 0x873c3e8 <line:14:3, col:10>
      `-IntegerLiteral 0x873c3c8 <col:10> 'int' 0

If I do not handle it, I will not be able to reach the call of t1().

1540

Yes. A global variable may be marked by omp declare target directive to be emitted on device. If the global var is initialized with the address of a function, the function will be emitted on device. If the device function calls a host device function which contains a deferred diag, that diag will be emitted. This can only be known after everything is parsed.

clang/lib/Sema/SemaDecl.cpp
12229

Yes we only need to check global variables. Fixed.

rjmccall added inline comments.Mar 18 2020, 10:37 AM
clang/lib/Sema/Sema.cpp
1521

Sure, although I wonder if it might be more reasonable to just make UsedDeclVisitor walk into CapturedDecls (and BlockDecls) when it sees the corresponding statements/expressions. Unlike other declaration references, those are never "cross-references"; they're just local code tied to a declaration for representational reasons.

1540

I meant directly with the initializer. Is there a way today to defer a diagnostic that you would emit while processing an initializer expression? If so, this needs to trigger that.

yaxunl marked 4 inline comments as done.Mar 18 2020, 1:43 PM
yaxunl added inline comments.
clang/lib/Sema/Sema.cpp
1521

done

1540

I don't think the initializer itself (without a target declare directive) will cause a deferred diagnostic since it does not cause change of emission states of functions.

yaxunl updated this revision to Diff 251167.Mar 18 2020, 1:46 PM
yaxunl marked 2 inline comments as done.

revised by John's comments

This looks good, assuming there's either no issue with the lazy emission of variables or that you just intend to tackle that later.

clang/lib/Sema/Sema.cpp
1540

Okay, so if I'm getting this right: only functions are emitted lazily, and variables have to be marked specially in order to get emitted on the device, so there's no need to defer diagnostics within variable initializations because we always know at the time of processing the variable where it will be emitted?

yaxunl marked 2 inline comments as done.Mar 18 2020, 3:18 PM
yaxunl added inline comments.
clang/lib/Sema/Sema.cpp
1540

right.