This is an archive of the discontinued LLVM Phabricator instance.

[CUDA] Do not allow dynamic initialization of global device side variables.
ClosedPublic

Authored by tra on Dec 7 2015, 2:04 PM.

Details

Summary

In general CUDA does not allow dynamic initialization of
global device-side variables except for records with empty constructors as described in section E.2.3.1 of CUDA 7.5 Programming guide:

device, constant and shared variables defined in namespace scope,
that are of class type, cannot have a non-empty constructor or a non-empty destructor.
A constructor for a class type is considered empty at a point in the translation unit,
if it is either a trivial constructor or it satisfies all of the following conditions:

  • The constructor function has been defined.
  • The constructor function has no parameters, the initializer list is empty and the function body is an empty compound statement.
  • Its class has no virtual functions and no virtual base classes.
  • The default constructors of all base classes of its class can be considered empty.
  • For all the nonstatic data members of its class that are of class type (or array thereof), the default constructors can be considered empty.

Clang is already enforcing no-initializers for shared variables, but currently allows dynamic initialization for device and constant variables.

This patch applies initializer checks for all device-side variables.
Empty constructors are accepted, but no code is generated for them.

Diff Detail

Repository
rL LLVM

Event Timeline

tra updated this revision to Diff 42101.Dec 7 2015, 2:04 PM
tra retitled this revision from to [CUDA] Do not allow dynamic initialization of global device side variables..
tra updated this object.
tra added reviewers: rsmith, jingyue, jpienaar.
tra added a subscriber: cfe-commits.
tra added inline comments.Dec 7 2015, 2:11 PM
lib/CodeGen/CGDeclCXX.cpp
329 ↗(On Diff #42101)

@rsmith: is this a good way to find member initializer list items?

struct S {
    int a,b,c;
    S() : a(1),b(2),c(3) {}
};

I'm looking for a(),b(),c() which is what I think CUDA spec wants to check for, but CD->inits() appears to have other initializers on the list as well.

tra added a comment.Dec 16 2015, 3:41 PM

Ping.
@rsmith -- Richard, can you take a look?

rsmith edited edge metadata.Jan 7 2016, 6:17 PM

This should be checked and diagnosed in Sema, not in CodeGen.

lib/CodeGen/CGDeclCXX.cpp
323–324 ↗(On Diff #42101)

What if the constructor is a C-style varargs function:

struct X { X(...) {} };

?

329 ↗(On Diff #42101)

You shouldn't need to check isAnyMemberInitializer: if there's any written inits, the constructor violates the rules.

333 ↗(On Diff #42101)

What doesn't always work?

347–367 ↗(On Diff #42101)

Rather than checking these properties this way, I'd suggest you check the initialization expression in each CXXCtorInitializer only contains CXXConstructExprs for empty constructors (or any other whitelisted constructs). Your current approach will miss a couple of cases which the CUDA spec misses but presumably meant to exclude:

  1. Default member initializers

    int f(); struct X { int n = f(); X() {} };
  1. Cases where a constructor other than a default constructor is implicitly invoked

    struct A { template<typename ...T> A(T...); }; struct B : A { B() {} };
lib/CodeGen/CodeGenModule.cpp
1347–1351 ↗(On Diff #42101)

According to the quoted specification, you're supposed to check whether the constructor can be considered empty at the point in the translation unit where the definition of the variable occurs, so I don't think you need to delay anything.

tra updated this revision to Diff 44687.Jan 12 2016, 3:21 PM
tra edited edge metadata.

Check all variable initializers and only allow 'empty constructors' as Richard has suggested.
Changed test structure so that we test for allowed/disallowed constructors separately from testing how we handle initialization of base classes or member fields.

tra added a comment.Jan 12 2016, 3:46 PM

Richard, I've updated the patch as you've suggested -- it indeed simplifies things quite a bit and handles the corner cases you've mentioned.

lib/CodeGen/CGDeclCXX.cpp
323–324 ↗(On Diff #42101)

CUDA does not support varargs on device side. nvcc fails with an error:

error: a "device" function cannot have ellipsis

That's another thing I'll need to fix (as a separate patch) as clang currently accepts varargs everywhere.

This patch will ignore number of arguments passed to varargs constructor, but the checks for empty body still do apply.

329 ↗(On Diff #42101)

As it turns out, the rules don't apply to all written initializers. For instance, nvcc allows empty constructor on init list:

struct A {  __device__ A(){}; };
struct B {  __device__ B(){}; };

struct C : A {
  B b;
  __device__ C() : A(), b() {}
};

__device__ C c;

I've simplified the patch so that in only checks for constructor's 'emptiness', but disregards how that constructor gets to be executed.

333 ↗(On Diff #42101)

It was leftover from early patch variant that didn't defer emitting global vars.
If I don't defer and need to emit a global var before constructor definition is available, hasTrivialBody() returns false and triggers diagnostics.

347–367 ↗(On Diff #42101)

Nice. This has simplified the checks a lot.

lib/CodeGen/CodeGenModule.cpp
1347–1351 ↗(On Diff #42101)

I guess it's a bug in their guide as nvcc accepts following code with constructor definition appearing *after* the variable:

struct S {  S(); };
__device__ S s;
S::S() {}

I think you missed this from my previous review:

This should be checked and diagnosed in Sema, not in CodeGen.

lib/CodeGen/CGDeclCXX.cpp
333–337 ↗(On Diff #44687)

You can check these conditions with RD->isDynamicClass().

tra updated this revision to Diff 45044.Jan 15 2016, 3:33 PM
tra added a reviewer: jlebar.

Moved initializer checks from CodeGen to Sema.
Added test cases for initializers of non-class variables.

tra marked an inline comment as done.Jan 15 2016, 3:35 PM

I think you missed this from my previous review:

This should be checked and diagnosed in Sema, not in CodeGen.

Done.

rsmith added inline comments.Jan 15 2016, 4:10 PM
lib/CodeGen/CGDeclCXX.cpp
312 ↗(On Diff #45044)

areallowed -> are allowed

lib/CodeGen/CodeGenModule.cpp
2334 ↗(On Diff #45044)

As this is a global variable, it should presumably still be statically zero-initialized.

tra updated this revision to Diff 45051.Jan 15 2016, 4:22 PM
tra marked an inline comment as done.

Typo fix.

lib/CodeGen/CodeGenModule.cpp
2334 ↗(On Diff #45044)

There is no way to initialize shared variables. They are rough equivalent of local variables, only in this case CUDA allocates them per kernel invocation from a shared buffer with no guarantees regarding its contents.

They used to be zero-initialized by compiler, but that was intentionally changed to undef in r245786 / http://reviews.llvm.org/D12241

jlebar edited edge metadata.Jan 17 2016, 11:26 AM

tra asked me to check for coverage. Looks pretty good in that respect.

include/clang/Basic/DiagnosticSemaKinds.td
6419 ↗(On Diff #45051)

Nit, but, since we're all language nerds here, suggest adding an Oxford comma.

lib/Sema/SemaCUDA.cpp
436 ↗(On Diff #45051)

The test passes if I comment out this if statement. I'm not sure if that's expected; this may or may not be entirely covered below.

442 ↗(On Diff #45051)

Tests pass if I comment out the isDefined check.

lib/Sema/SemaDecl.cpp
10183 ↗(On Diff #45051)

We also allow constant initializers for constant and device variables.

Consider rephrasing this -- it sounds like this is a clang extension, but I just checked and it does not appear to be.

10186 ↗(On Diff #45051)

Test passes if I comment out IsGlobal or CUDAIsDevice. (I'm not sure if you care to test the latter, but the former seems important.)

tra updated this revision to Diff 45312.Jan 19 2016, 3:17 PM
tra edited edge metadata.
tra marked 2 inline comments as done.

Addressed Justin's comments.

tra marked 3 inline comments as done.Jan 19 2016, 3:19 PM
tra added inline comments.
lib/Sema/SemaCUDA.cpp
436 ↗(On Diff #45051)

According to CPP reference trivial constructor will pass all other checks below.

442 ↗(On Diff #45051)

hasTrivialBody() would only return true if we have a body which only happens if function is defined. isDefined() is mostly for readability here.

lib/Sema/SemaDecl.cpp
10186 ↗(On Diff #45051)

IsGlobal -- all test cases were using either global or local variables. I've added a static shared variable in the device function. Now IsGlobal check (or, rather !isStaticLocal() part of it) is required in order for the tests to succeed.

CUDAIsDevice is not triggered because all test cases are run with -fcuda-is-device.
It's hard to run host-side test with -verify here because I'd have to put #ifdef around every 'expected-error'

jlebar removed a reviewer: jlebar.Jan 22 2016, 5:44 PM
jlebar added a subscriber: jlebar.

jingyue/jpienaar/rsmith - friendly ping? Without this, -O0 builds don't work, because they emit empty global initializers that don't get optimized out.

jpienaar edited edge metadata.Feb 1 2016, 9:53 AM

@jlebar: We defer it to your and Richard's approval. Thanks

rsmith accepted this revision.Feb 1 2016, 1:17 PM
rsmith edited edge metadata.

Some minor things, but feel free to commit after addressing them.

I agree that we should figure out what to do about the zero/undef initialization separately.

lib/Sema/SemaCUDA.cpp
429–430 ↗(On Diff #45312)

The function might still not be defined after this (if the template is not defined); you should presumably return false here in that case.

442 ↗(On Diff #45312)

Please do remove the isDefined check here. Including it makes a reader wonder what case it's trying to handle.

455–457 ↗(On Diff #45312)

Maybe reorder this before the CXXCtorInitializer check? It's a much cheaper test.

lib/Sema/SemaDecl.cpp
10191–10198 ↗(On Diff #45312)

What should happen if the init is a constant initializer that is a CXXConstructExpr, but it uses a constructor that is not empty from CUDA's perspective? Such as:

struct X { constexpr X() { int n = 0; } };
__device__ X x;

I would assume this should be valid, but I think you'll reject it. Maybe change else if ( to if (!AllowedInit &&?

10196–10198 ↗(On Diff #45312)

Might be clearer as

if (__device__ || __constant__)
  AllowedInit = isConstantInitializer(...)
This revision is now accepted and ready to land.Feb 1 2016, 1:17 PM
tra updated this revision to Diff 46696.Feb 2 2016, 1:45 PM
tra edited edge metadata.
tra marked 8 inline comments as done.

Addressed Richard's comments.
Relaxed restrictions a bit to allow constant initializers even those CUDA would not considered to be empty.
Updated test case accordingly.

This revision was automatically updated to reflect the committed changes.
tra added inline comments.Feb 2 2016, 5:17 PM
lib/Sema/SemaCUDA.cpp
429–430 ↗(On Diff #45312)

I don't think it's needed. If it's still not definied, it will be caught by hasTrivialBody() check below.

lib/Sema/SemaDecl.cpp
10191–10198 ↗(On Diff #45312)

NVCC produces an error (probably because it does not support c++14):
zz.cu(1): error: statement may not appear in a constexpr constructor

clang w/ this patch indeed considers it to be a non-empty initializer and produces an error.

I agree that allowing constant initializer is the right thing to do. Your example requires c++14, so there's no direct comparison with nvcc, but I think allowing it is indeed the right thing to do here.