This is an archive of the discontinued LLVM Phabricator instance.

[CUDA][HIP] Fix mangling number for local struct
ClosedPublic

Authored by yaxunl on Mar 30 2022, 8:26 AM.

Details

Summary

MSVC and Itanium mangling use different mangling numbers
for function-scope structs, which causes inconsistent
mangled kernel names in device and host compilations.

This patch uses Itanium mangling number for structs
in for mangling device side names in CUDA/HIP host
compilation on Windows to fix this issue.

A CUDA mangling context is introduced so that ASTContext
knows whether the current name mangling is for device side
names in host compilation. Device and host mangling number
are encoded/decoded as upper and lower half of 32 bit
unsigned integer to fit into the original mangling number
field for AST.

Diff Detail

Event Timeline

yaxunl created this revision.Mar 30 2022, 8:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 30 2022, 8:26 AM
yaxunl requested review of this revision.Mar 30 2022, 8:26 AM
tra added subscribers: rnk, rsmith.Apr 6 2022, 11:36 AM

@rnk, @rsmith -- PTAL. Changing mangling for CXXABI on Windows is way out of my comfort zone.

This patch uses Itanium mangling number for structs in HIP host compilation on Windows to fix this issue.

It does not appear to be HIP-specific and I do not know enough to tell whether the change is safe for all users. Or even for HIP in all cases. Let's see what Reid or Richard think.

yaxunl added a comment.Apr 6 2022, 9:09 PM

This patch takes a similar approach as https://reviews.llvm.org/D69322 has done for lambda. When doing host compilation for CUDA/HIP on Windows with MSVC toolchain, mangling number of lambda always uses Itanium mangling number. In this case, mangling number of local struct types always uses Itanium mangling number. I assume this should be fine as long as it is consistent for all HIP programs.

tra added a comment.Apr 7 2022, 1:59 PM

This patch takes a similar approach as https://reviews.llvm.org/D69322 has done for lambda. When doing host compilation for CUDA/HIP on Windows with MSVC toolchain, mangling number of lambda always uses Itanium mangling number. In this case, mangling number of local struct types always uses Itanium mangling number. I assume this should be fine as long as it is consistent for all HIP programs.

I'm fairly sure that the code does what you need it to do for HIP. What I'm not sure is whether it will impact other users.

I've missed that getManglingNumber is part of MSHIPNumberingContext, so it is HIP-specific and should not change mangling for non-HIP users.

Would it make it possible to end up with different mangling for the shared code compiled as regular C++ code and host-side code in HIP sources? E.g. from a common header included in both.

yaxunl added a comment.Apr 7 2022, 8:28 PM

This patch takes a similar approach as https://reviews.llvm.org/D69322 has done for lambda. When doing host compilation for CUDA/HIP on Windows with MSVC toolchain, mangling number of lambda always uses Itanium mangling number. In this case, mangling number of local struct types always uses Itanium mangling number. I assume this should be fine as long as it is consistent for all HIP programs.

I'm fairly sure that the code does what you need it to do for HIP. What I'm not sure is whether it will impact other users.

I've missed that getManglingNumber is part of MSHIPNumberingContext, so it is HIP-specific and should not change mangling for non-HIP users.

Would it make it possible to end up with different mangling for the shared code compiled as regular C++ code and host-side code in HIP sources? E.g. from a common header included in both.

It may cause a template instantiation of host functions to have different mangled names when compiled as HIP program, versus compiled as C++ program.

To avoid that, I need to restrict using of Itanium mangling number for mangling device side names only, then the mangling of host functions will not be affected in HIP programs.

yaxunl updated this revision to Diff 422001.Apr 11 2022, 12:27 PM
yaxunl retitled this revision from [HIP] Fix mangling number for local struct to [CUDA][HIP] Fix mangling number for local struct.
yaxunl edited the summary of this revision. (Show Details)

Use Itaninium mangling number for mangling device side name
in host compilation only. Keep host name mangling consistent
with C++ programs.

tra added inline comments.Apr 12 2022, 2:41 PM
clang/lib/AST/ASTContext.cpp
11760–11763

I think we should change the MangleNumbers instead to use uin64_t or, perhaps unsigned[2] or even struct{ unsigned host; unsigned device;}.
Reducing the max number of anything to just 64K will be prone to failing sooner or later.

clang/test/CodeGenCUDA/struct-mangling-number.cu
25

Using HOST/DEV prefixes would work better, IMO to tell what's expected to happen where.

rnk added inline comments.Apr 19 2022, 10:23 AM
clang/include/clang/AST/ASTContext.h
682

It doesn't feel right to store mutable state here in ASTContext. You are effectively using this as a global variable to thread a boolean through to ASTContext::getManglingNumber, which is called from the mangler. The boolean indicates of host or device mangling numbers are desired for this particular mangling. Is that more or less correct?

This makes me think that it was perhaps a mistake to store mangling numbers in the AST Context in the first place. If we stored them in the mangling context instead, I think we'd get the right behavior out of the box. Is that correct? It it feasible to make that change?

684

By the way, this looks like SaveAndRestore<bool>. Maybe you can use that in the future.

clang/lib/AST/ASTContext.cpp
11760

IMO it would be simpler to set up a ternary instead of a lambda and two returns:

Res = Cond ? (Res >> 16) : (Res & 0xFFFF);
return Res > 1 ? Res : 1;
11760–11763

I suspect we could live with a 64K mangling number limit, but that is low enough that we need to put in a check for overflow somewhere.

tra added inline comments.Apr 19 2022, 11:21 AM
clang/lib/AST/ASTContext.cpp
11760–11763

If we'd produce a reasonable diagnostic for that, it might work.

yaxunl marked 7 inline comments as done.Apr 22 2022, 8:26 PM
yaxunl added inline comments.
clang/include/clang/AST/ASTContext.h
682

It doesn't feel right to store mutable state here in ASTContext. You are effectively using this as a global variable to thread a boolean through to ASTContext::getManglingNumber, which is called from the mangler. The boolean indicates of host or device mangling numbers are desired for this particular mangling. Is that more or less correct?

This makes me think that it was perhaps a mistake to store mangling numbers in the AST Context in the first place. If we stored them in the mangling context instead, I think we'd get the right behavior out of the box. Is that correct? It it feasible to make that change?

It is possible but I feel it is too invasive. Mangling numbers need to be calculated during construction of AST since it is context dependent. Mangle contexts are created in ABI of codegen. To store mangling numbers in mangling contexts, we need to move mangling contexts from ABI to ASTContext. We also need to add device mangling context in ASTContext. Then for each use of get/set mangling number API, we need to add get/set device mangling number API. That would need lots of changes.

I would see the state added to ASTContext by my approach from the perspective of single source language. The program contains both device code and host code. Ideally the compiler should be able to compile the device code with device target and ABI and compile the host code with host target ABI and switch the target freely during codegen. When we try to get the device side mangled name in host compilation, what we trying to do is switch to device target momoentorily during the mangling. This needs a state in ASTContext indicating that we are doing mangling for device instead of host, therefore getManglingNumber should return device mangling number instead of host mangling number.

684

Yes. SaveAndRestore serves the purpose. will use it.

clang/lib/AST/ASTContext.cpp
11760

will do

11760–11763

mangling number is member of some AST nodes. Increasing its size may increase overall AST memory usage and PCH size. I will add diagnostic for mangling number exceeding limit. If it happens in real applications I will revisit this.

clang/test/CodeGenCUDA/struct-mangling-number.cu
25

will do

yaxunl updated this revision to Diff 424688.Apr 22 2022, 8:29 PM
yaxunl marked 5 inline comments as done.

Revised by Artem's and Reid's comments.

tra accepted this revision.Apr 28 2022, 12:06 PM
This revision is now accepted and ready to land.Apr 28 2022, 12:06 PM
This revision was landed with ongoing or failed builds.Apr 28 2022, 4:55 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptApr 28 2022, 4:55 PM
rnk added inline comments.Apr 29 2022, 11:18 AM
clang/include/clang/AST/ASTContext.h
682

After auditing call sites for setManglingNumber, I see that many of them relate to template instantiation and AST serialization. With that in mind, I think it makes sense to leave the ManglingNumbers DenseMap as you have it, it's not worth updating all of those serialization and cloning codepaths.

However, I really would prefer to remove this boolean on the ASTContext, which effectively stands in as an extra parameter to all getManglingNumber calls. The manglers should be able to calculate this boolean when they are created (they should know if they are the Itanium device mangler or the MS host mangler) and pass the correct value to every call to ASTContext::getManglingNumber. Does that make sense?

yaxunl marked an inline comment as done.
yaxunl added inline comments.
clang/include/clang/AST/ASTContext.h
682

However, I really would prefer to remove this boolean on the ASTContext, which effectively stands in as an extra parameter to all getManglingNumber calls. The manglers should be able to calculate this boolean when they are created (they should know if they are the Itanium device mangler or the MS host mangler) and pass the correct value to every call to ASTContext::getManglingNumber. Does that make sense?

Yes. Opened https://reviews.llvm.org/D124842 for this.

Hi,

I noticed when compiling with gcc 9.3.0 that we get a bunch of new warnings with this patch:

[1/351] Building CXX object tools/clang/lib/AST/CMakeFiles/obj.clangAST.dir/MicrosoftCXXABI.cpp.o
../../clang/lib/AST/MicrosoftCXXABI.cpp:57:12: warning: 'virtual unsigned int {anonymous}::MicrosoftNumberingContext::getManglingNumber(const clang::VarDecl*, unsigned int)' was hidden [-Woverloaded-virtual]
   57 |   unsigned getManglingNumber(const VarDecl *VD,
      |            ^~~~~~~~~~~~~~~~~
../../clang/lib/AST/MicrosoftCXXABI.cpp:80:12: warning:   by 'virtual unsigned int {anonymous}::MSHIPNumberingContext::getManglingNumber(const clang::TagDecl*, unsigned int)' [-Woverloaded-virtual]
   80 |   unsigned getManglingNumber(const TagDecl *TD,
      |            ^~~~~~~~~~~~~~~~~
../../clang/lib/AST/MicrosoftCXXABI.cpp:46:12: warning: 'virtual unsigned int {anonymous}::MicrosoftNumberingContext::getManglingNumber(const clang::BlockDecl*)' was hidden [-Woverloaded-virtual]
   46 |   unsigned getManglingNumber(const BlockDecl *BD) override {
      |            ^~~~~~~~~~~~~~~~~
../../clang/lib/AST/MicrosoftCXXABI.cpp:80:12: warning:   by 'virtual unsigned int {anonymous}::MSHIPNumberingContext::getManglingNumber(const clang::TagDecl*, unsigned int)' [-Woverloaded-virtual]
   80 |   unsigned getManglingNumber(const TagDecl *TD,
      |            ^~~~~~~~~~~~~~~~~
../../clang/lib/AST/MicrosoftCXXABI.cpp:42:12: warning: 'virtual unsigned int {anonymous}::MicrosoftNumberingContext::getManglingNumber(const clang::CXXMethodDecl*)' was hidden [-Woverloaded-virtual]
   42 |   unsigned getManglingNumber(const CXXMethodDecl *CallOperator) override {
      |            ^~~~~~~~~~~~~~~~~
../../clang/lib/AST/MicrosoftCXXABI.cpp:80:12: warning:   by 'virtual unsigned int {anonymous}::MSHIPNumberingContext::getManglingNumber(const clang::TagDecl*, unsigned int)' [-Woverloaded-virtual]
   80 |   unsigned getManglingNumber(const TagDecl *TD,
      |            ^~~~~~~~~~~~~~~~~

No idea if it's important or if gcc is overly picky.

yaxunl marked an inline comment as done.May 12 2022, 8:15 AM

Hi,

I noticed when compiling with gcc 9.3.0 that we get a bunch of new warnings with this patch:

[1/351] Building CXX object tools/clang/lib/AST/CMakeFiles/obj.clangAST.dir/MicrosoftCXXABI.cpp.o
../../clang/lib/AST/MicrosoftCXXABI.cpp:57:12: warning: 'virtual unsigned int {anonymous}::MicrosoftNumberingContext::getManglingNumber(const clang::VarDecl*, unsigned int)' was hidden [-Woverloaded-virtual]
   57 |   unsigned getManglingNumber(const VarDecl *VD,
      |            ^~~~~~~~~~~~~~~~~
../../clang/lib/AST/MicrosoftCXXABI.cpp:80:12: warning:   by 'virtual unsigned int {anonymous}::MSHIPNumberingContext::getManglingNumber(const clang::TagDecl*, unsigned int)' [-Woverloaded-virtual]
   80 |   unsigned getManglingNumber(const TagDecl *TD,
      |            ^~~~~~~~~~~~~~~~~
../../clang/lib/AST/MicrosoftCXXABI.cpp:46:12: warning: 'virtual unsigned int {anonymous}::MicrosoftNumberingContext::getManglingNumber(const clang::BlockDecl*)' was hidden [-Woverloaded-virtual]
   46 |   unsigned getManglingNumber(const BlockDecl *BD) override {
      |            ^~~~~~~~~~~~~~~~~
../../clang/lib/AST/MicrosoftCXXABI.cpp:80:12: warning:   by 'virtual unsigned int {anonymous}::MSHIPNumberingContext::getManglingNumber(const clang::TagDecl*, unsigned int)' [-Woverloaded-virtual]
   80 |   unsigned getManglingNumber(const TagDecl *TD,
      |            ^~~~~~~~~~~~~~~~~
../../clang/lib/AST/MicrosoftCXXABI.cpp:42:12: warning: 'virtual unsigned int {anonymous}::MicrosoftNumberingContext::getManglingNumber(const clang::CXXMethodDecl*)' was hidden [-Woverloaded-virtual]
   42 |   unsigned getManglingNumber(const CXXMethodDecl *CallOperator) override {
      |            ^~~~~~~~~~~~~~~~~
../../clang/lib/AST/MicrosoftCXXABI.cpp:80:12: warning:   by 'virtual unsigned int {anonymous}::MSHIPNumberingContext::getManglingNumber(const clang::TagDecl*, unsigned int)' [-Woverloaded-virtual]
   80 |   unsigned getManglingNumber(const TagDecl *TD,
      |            ^~~~~~~~~~~~~~~~~

No idea if it's important or if gcc is overly picky.

Thanks for letting me know. This is due to the newly added function hiding other overloaded functions in the parent class. I will fix it by adding using MicrosoftNumberingContext::getManglingNumber to MSHIPNumberingContext.

Hi,

I noticed when compiling with gcc 9.3.0 that we get a bunch of new warnings with this patch:

[1/351] Building CXX object tools/clang/lib/AST/CMakeFiles/obj.clangAST.dir/MicrosoftCXXABI.cpp.o
../../clang/lib/AST/MicrosoftCXXABI.cpp:57:12: warning: 'virtual unsigned int {anonymous}::MicrosoftNumberingContext::getManglingNumber(const clang::VarDecl*, unsigned int)' was hidden [-Woverloaded-virtual]
   57 |   unsigned getManglingNumber(const VarDecl *VD,
      |            ^~~~~~~~~~~~~~~~~
../../clang/lib/AST/MicrosoftCXXABI.cpp:80:12: warning:   by 'virtual unsigned int {anonymous}::MSHIPNumberingContext::getManglingNumber(const clang::TagDecl*, unsigned int)' [-Woverloaded-virtual]
   80 |   unsigned getManglingNumber(const TagDecl *TD,
      |            ^~~~~~~~~~~~~~~~~

fixed by 0f292141aadb0489231de31de966c239486e019d

Hi,

I noticed when compiling with gcc 9.3.0 that we get a bunch of new warnings with this patch:

[1/351] Building CXX object tools/clang/lib/AST/CMakeFiles/obj.clangAST.dir/MicrosoftCXXABI.cpp.o
../../clang/lib/AST/MicrosoftCXXABI.cpp:57:12: warning: 'virtual unsigned int {anonymous}::MicrosoftNumberingContext::getManglingNumber(const clang::VarDecl*, unsigned int)' was hidden [-Woverloaded-virtual]
   57 |   unsigned getManglingNumber(const VarDecl *VD,
      |            ^~~~~~~~~~~~~~~~~
../../clang/lib/AST/MicrosoftCXXABI.cpp:80:12: warning:   by 'virtual unsigned int {anonymous}::MSHIPNumberingContext::getManglingNumber(const clang::TagDecl*, unsigned int)' [-Woverloaded-virtual]
   80 |   unsigned getManglingNumber(const TagDecl *TD,
      |            ^~~~~~~~~~~~~~~~~

fixed by 0f292141aadb0489231de31de966c239486e019d

Thanks!