This is an archive of the discontinued LLVM Phabricator instance.

[Clang] Fix pretty printing of CUDA address spaces
ClosedPublic

Authored by richardmembarth on Nov 8 2018, 6:36 AM.

Details

Summary

The current pretty-printer emits OpenCL-style memory spaces specifiers: __device , __constant, and __shared.
The correct CUDA memory space specifiers are: __device__, __constant__, and __shared__:
https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#variable-memory-space-specifiers

Diff Detail

Repository
rL LLVM

Event Timeline

richardmembarth created this revision.Nov 8 2018, 6:36 AM
richardmembarth edited the summary of this revision. (Show Details)Nov 8 2018, 6:45 AM

Can you add tests for this change? We typically have these in Misc by passing -ast-print.

I think it's not so easy to provide such tests for CUDA.
CUDA memory space specifiers are implemented via attributes, e.g. #define __shared__ __attribute__((shared)).
As a result of this, they are pretty-printed via a different code path.
In my example, I call Ctx.getAddrSpaceQualType(QT, LangAS::cuda_shared), which is then pretty-printed via the code above.
Any hints how to provide tests for this one?

Could this be tested using diagnostics that prints the type? Like in test/SemaOpenCL/address-spaces.cl.

Same problem here: The CUDA memory space specifiers are represented via attributes, e.g. CUDASharedAttr and only converted in CodeGen to LangAS::cuda_shared.
We would need a different frontend that annotates LangAS::cuda_shared.

I think it's not so easy to provide such tests for CUDA.
CUDA memory space specifiers are implemented via attributes, e.g. #define __shared__ __attribute__((shared)).
As a result of this, they are pretty-printed via a different code path.
In my example, I call Ctx.getAddrSpaceQualType(QT, LangAS::cuda_shared), which is then pretty-printed via the code above.
Any hints how to provide tests for this one?

If there's no way to trigger a different spelling that a user would see, why is this change needed?

There are external tools (e.g. hipacc) that generate Clang AST. This AST uses LangAS annotations and emits incorrect memory space specifiers for CUDA when pretty-printed.

There are external tools (e.g. hipacc) that generate Clang AST. This AST uses LangAS annotations and emits incorrect memory space specifiers for CUDA when pretty-printed.

That's good to know!

We would need a different frontend that annotates LangAS::cuda_shared.

Do you happen to know why this behaves the way it does? e.g., is the bug that the frontend is annotating incorrectly and forces codegen to work around it, and fixing the frontend to annotate properly lets us remove some workarounds and fixes your issue?

CUDA maps __shared__ internally also to __attribute__((shared)):

#define __annotate__(a) \
        __attribute__((a))
#define __location__(a) \
        __annotate__(a)
...
#define __shared__ \
        __location__(shared)

My guess is that Clang does it just the same way and only converts to LangAS::cuda_shared for code generation in GetGlobalVarAddressSpace:
https://clang.llvm.org/doxygen/CodeGenModule_8cpp_source.html#l03305
In contrast, OpenCL uses keywords that are mapped directly to LangAS::opencl_local etc.

CUDA maps __shared__ internally also to __attribute__((shared)):

#define __annotate__(a) \
        __attribute__((a))
#define __location__(a) \
        __annotate__(a)
...
#define __shared__ \
        __location__(shared)

My guess is that Clang does it just the same way and only converts to LangAS::cuda_shared for code generation in GetGlobalVarAddressSpace:
https://clang.llvm.org/doxygen/CodeGenModule_8cpp_source.html#l03305
In contrast, OpenCL uses keywords that are mapped directly to LangAS::opencl_local etc.

I agree with the change itself... but it's quite annoying that such things can't be tested. :(

In D54258#1297191, @Anastasia wrote:

I agree with the change itself... but it's quite annoying that such things can't be tested. :(

Yes, that's a pity :(

Is there anything missing so that this can be merged?

aaron.ballman accepted this revision.Nov 14 2018, 8:19 AM

In D54258#1297191, @Anastasia wrote:

I agree with the change itself... but it's quite annoying that such things can't be tested. :(

Yes, that's a pity :(

Yeah, it seems like trying to change this in the frontend would also be a fair amount of work.

Is there anything missing so that this can be merged?

No, I think this truly is untestable without major work. Given how trivial this change is, I'm okay with it.

This revision is now accepted and ready to land.Nov 14 2018, 8:19 AM

Do you know when this will be merged?

Herald added a project: Restricted Project. · View Herald TranscriptMay 31 2019, 1:14 AM
Herald added a subscriber: ebevhan. · View Herald Transcript

Do you know when this will be merged?

I apologize, I wasn't aware you needed this merged on your behalf. I normally would be happy to do so, but I'm currently traveling. Maybe @Anastasia or someone else can help, otherwise I will merge it when I get back to the office in two weeks.

Merging in two weeks is fine for me.

My assumption was that accepted patches are merged into upstream in a timely manner.
Maybe this is not how it works?

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJun 5 2019, 10:26 AM

Merging in two weeks is fine for me.

No worries. I have just committed this.

My assumption was that accepted patches are merged into upstream in a timely manner.
Maybe this is not how it works?

No this is a manual process. The author is expected to commit unless he/she doesn't have commit rights. Then it has to be requested explicitly.

Thanks for clarification and merging!