This is an archive of the discontinued LLVM Phabricator instance.

[CUDA][HIP] Fix non-ODR-use of static device variable
ClosedPublic

Authored by yaxunl on May 11 2021, 7:15 AM.

Details

Summary

Currently non-ODR-use of static device variable by host code
causes static device variable to be emitted and registered,
which should not.

This patch fixes that.

Diff Detail

Event Timeline

yaxunl requested review of this revision.May 11 2021, 7:15 AM
yaxunl created this revision.
tra accepted this revision.May 11 2021, 10:37 AM

LGTM with few nits.

clang/lib/Sema/SemaExpr.cpp
17145

This can be collapsed with the if (SemaRef.LangOpts.CUDA) above.

clang/test/CodeGenCUDA/static-device-var-rdc.cu
118–124

Positive checks should be -DAG and negative checks should be done in a separate RUN as we can't really guarantee the order in which registration calls are emitted.

This revision is now accepted and ready to land.May 11 2021, 10:37 AM
yaxunl marked 2 inline comments as done.May 11 2021, 5:01 PM
yaxunl added inline comments.
clang/lib/Sema/SemaExpr.cpp
17145

This cannot be collapsed since there is else if.

clang/test/CodeGenCUDA/static-device-var-rdc.cu
118–124

will add separate run lines for negative tests

This revision was automatically updated to reflect the committed changes.
yaxunl marked 2 inline comments as done.
Herald added a project: Restricted Project. · View Herald TranscriptMay 12 2021, 8:14 AM
tra added a comment.May 18 2021, 1:56 PM

Sam, this patch has apparently triggered some unwanted side effects. I'm still reducing the failures to something that could be used for debugging, but the rough symptoms are:

We now end up emitting the code for the host-only static member functions of instantiated class templates during device-side compilation.
Clang now complains about not allowed nontrivial static initializers for some variables in class templates.

Sam, this patch has apparently triggered some unwanted side effects. I'm still reducing the failures to something that could be used for debugging, but the rough symptoms are:

We now end up emitting the code for the host-only static member functions of instantiated class templates during device-side compilation.
Clang now complains about not allowed nontrivial static initializers for some variables in class templates.

Sam, this patch has apparently triggered some unwanted side effects. I'm still reducing the failures to something that could be used for debugging, but the rough symptoms are:

We now end up emitting the code for the host-only static member functions of instantiated class templates during device-side compilation.
Clang now complains about not allowed nontrivial static initializers for some variables in class templates.

We limit the variables to device or constant variables, e.g. https://godbolt.org/z/seh1n7KnY

xxx is emitted because it is implicitly constant, but yyy is not emitted.

It would be interesting to know what kind of variables are emitted.

Would you like the change reverted?

tra added a comment.May 18 2021, 2:42 PM

It would be interesting to know what kind of variables are emitted.

I'm still reducing the failure. I'll send you a reproducer once I have it.

Would you like the change reverted?

We have a bit of time -- till the end of the week. Let's wait till we have a reproducer and a better idea what's going on and whether it's something easy to fix.

tra added a comment.May 18 2021, 4:50 PM

Here's one example reproducer: https://godbolt.org/z/77M596W89
It's rather hairy, but should be usable for further debugging.

There are no CUDA attributes anywhere in sight, but we do end up emitting a host-only constructor for o_u which calls strlen.

tra added a comment.EditedMay 18 2021, 5:44 PM

Here's a slightly simpler reproducer: https://godbolt.org/z/rW6P9e37s

Here's a slightly simpler reproducer: https://godbolt.org/z/rW6P9e37s

I have a fix for this: https://reviews.llvm.org/D102801

It seems the issue was due to clang emits the implicit constant variable aw<ar>::c but it is invalid to be emitted on device side since it is initialized with a host function pointer. The fix is not to force emit implicit constant variables in device compilation. This basically limits our previous fix to explicit device variables.

tra added a comment.May 19 2021, 4:25 PM

Here's a slightly simpler reproducer: https://godbolt.org/z/rW6P9e37s

I have a fix for this: https://reviews.llvm.org/D102801

Here's the second regression introduced by this patch. This one triggers an assertion in clang:

class a {
public:
  a(char *);
};
void b() {
  [](char *c) {
    static a d(c);
    d;
  };
}