- User Since
- Jan 8 2015, 1:53 PM (245 w, 3 d)
Fri, Sep 20
Looks like CUDA test-suite is triggering the assertion added by this patch:
Thu, Sep 19
LGTM. Thank you!
Wed, Sep 18
LGTM. You may want to wait a bit for Justin's feedback, in case he has some concerns.
Fri, Sep 13
Thu, Sep 12
Example of the actual error produced by clang: https://godbolt.org/z/Dl1FfC
Tue, Sep 3
Aug 23 2019
Aug 12 2019
Aug 1 2019
It looks to me that the root cause is in
Jul 30 2019
Jul 18 2019
Jul 12 2019
Jul 11 2019
Jul 10 2019
Jun 26 2019
Jun 25 2019
LGTM. Thank you!
Jun 24 2019
Jun 19 2019
Overall looks good. Thank you for making the change.
SGTM in principle. Folding the stubs would be bad as their addresses are implicitly used to identify the kernels to launch.
Jun 17 2019
LGTM, but I'm not familiar with the use of the stuff under Testing/. I've added @zturner who did some work there.
Jun 14 2019
LGTM. This is a cleaner way to provide stub name tweaks.
Is there particular reason you need to switch to this naming scheme?
Jun 13 2019
Yay! Thank you. Lack of this has stopped me when I tried gn last time. I'll try again once this patch lands.
@reames , @tra , @Hahnfeld , @jlebar , @chandlerc, I see that this was discussed in D50391 (in terms of using PTX's volatile to provide atomic lowering), but it's not clear to me that using volatile with atomic semantics in LLVM is something we guarantee will work (even in CUDA mode). I imagine that we'd need to lower these in terms of relaxed atomics in Clang for this to really be guaranteed to work correctly.
Jun 11 2019
So, in short, what you're saying is that lambda type may leak into the mangled name of a __global__ function and ne need to ensure that the mangled name is identical for both host and device, hence the need for consistent naming of lambdas.
So, the only thing this patch appears to do is make everything with this attribute uninitialized on device side and give protected visibility.
If I understand it correctly, you're using the attribute in order to construct something that's sort of opposite of the currently used device vars with host-side shadows. Only now the real variable lives on the host side and it's device side that gets the 'shadow' copy. Do I understand it correctly?
Syntactically the patch looks OK to me, but I think the purpose and meaning of the builtin type should be documented in more details. Based on this patch alone it's not clear at all what it's supposed to be used for and how.
Jun 10 2019
Jun 7 2019
May 29 2019
NVCC also allows that: https://godbolt.org/z/t78RvM
BTW, that code posted looks quite weird to me, how the code could make sense by return a pointer of device variable? or a pointer of shadow host variable?
Note for the future -- it would be great if we could finish discussing the patch before landing it.
I would still like to see the host-side test.
May 28 2019
May 17 2019
I'd add a comment with a brief explanation for the const variant and a TODO() to remove it.
May 16 2019
May 15 2019
May 13 2019
May 10 2019
May 3 2019
May 2 2019
Perhaps we should allow this in all unevaluated contexts?
I.e. int s = sizeof(foo(x)); should also work.
May 1 2019
LGTM, but I've added @rsmith who is way more familiar with this code.