This is an archive of the discontinued LLVM Phabricator instance.

[NVPTX] Assign valid global names
ClosedPublic

Authored by Hahnfeld on Nov 28 2017, 12:21 PM.

Details

Summary

PTX requires that identifiers consist only of [a-zA-Z0-9_$]. The
existing pass already ensured this for globals and this patch adds
the cleanup for functions with local linkage.

However, there was a different problem in the case of collisions
of the adjusted name: The ValueSymbolTable then automatically
appended ".N" with increasing Ns to get a unique name while helping
the ABI demangling. Special case this behavior to omit the dots and
append N directly. This will always give us legal names according
to the PTX requirements.

Diff Detail

Event Timeline

Hahnfeld created this revision.Nov 28 2017, 12:21 PM

This is essentially the fourth attempt after https://reviews.llvm.org/D17738, https://reviews.llvm.org/D29883, and https://reviews.llvm.org/D39005.

Let me come back to some of the previous comments:

In D17738#416060, @rnk wrote:

The problem there is that we would have to make the IR symbol renamer not append '.N' when renaming symbols with name collisions. Until then, we need something like this.

It is also true that we need some way to generate unique symbol names that does not interfere with existing mangling schemes. This means that we need to have some character that is not in [A-Za-z0-9_], and is not @ (because that's used by ELF symbol versioning). ?$@' are used by MSVC mangling. PTX allows only [A-Za-z0-9_$] (and also %, but only as the first character, so that doesn't help). Unfortunately, this does not seem to leave anything we can use (we can use '$' for PTX, but not for MSVC).

One option is that we add a function to LLVM get an available separator character, which can default to '.', but we set to '$' for nvptx, and use that for generating new names at the IR level. Thoughts?

I opted to implement this approach as a special case in ValueSymbolTable::makeUniqueName(). That way we can't get an "invalid" result from setName() which avoids the loop that the previous patches used for cleanup.
I basically followed @hfinkel's analysis of which characters we can use which doesn't leave much choice.

In D17738#661168, @rnk wrote:

This seems practical. Perhaps it could be part of the name mangling scheme already encoded in DataLayout?

DataLayout generally holds information that the target-independent optimizer needs in order to simplify the IR into our canonical form. This is as opposed to TargetTransformInfo, which provides data necessary to optimize the IR in target-aware ways (e.g., do things that are orthogonal to canonicalization such as inlining and vectorization). It is also as opposed to external utility functions that might be used by the frontend (e.g., llvm::sys::getHostCPUName()). If I recall correctly, this is information that would be used by the frontend when generating the IR, and the function results are controlled by the triple. As a result, I think that a general utility function somewhere would be fine.

I explicitly chose not to encode that information in the DataLayout:

  1. I'm not sure the core IR library has access to the currently used DataLayout.
  2. If I understood the code correctly, the DataLayout is set by the frontend, probably Clang in most cases. I don't think we should teach the frontend about a (correctness) decision that the backend can get right based on the selected target.
tra edited edge metadata.Nov 29 2017, 2:03 PM

There must be some truth in the saying "naming is one of the hardest problems in computer science". :-/

lib/IR/ValueSymbolTable.cpp
54–59

This patch addresses "we can't compile generated PTX because LLVM uses illegal characters", but exposes another issue -- having potentially different names on host and device is a problem for CUDA. For some objects host side may need to know what it's called on device side. We need it in order to access it from host (eg cudaMemcpyToSymbol(), or initializing static variables) and we currently assume that the names are the same. If such symbol gets different names on host and device, compilation will succeed, but we'll have problems at runtime.

Does "." have any special meaning? Can we skip the unique delimiter altogether?

If we can't find a suitable way to guarantee identical naming, we'll need a way to have a reliable way to determine the name used on the other side of the compilation.

In D40573#939676, @tra wrote:

There must be some truth in the saying "naming is one of the hardest problems in computer science". :-/

Indeed, and the conflicting requirements listed by Hal makes it even harder to be correct for all cases :-(

lib/IR/ValueSymbolTable.cpp
54–59

So the interesting question is: When will this code ever be hit? Most programming languages (including C and C++) obviously don't allow multiple variables of the same name - how would the compiler say which symbol you meant. The use case I've mostly seen is for compiler generated function, omp_outlined for example. These can be generated multiple times in the same translation unit and have to get unique names. Do you have another example where this could happen?

I'm not really sure '.' has a special meaning. Maybe @rafael can help because one of his old commits (https://reviews.llvm.org/rL253804) says For globals it is important to use "foo.1" to help C++ name demangling.

tra added inline comments.Nov 29 2017, 2:25 PM
lib/IR/ValueSymbolTable.cpp
54–59

I vaguely recall that '.' was an indication for demangler that it should not proceed further. I.e. a sort-of-special character to indicate the end of the C++-mangled part of the symbol name.

If name mangling can't be made identical (and it looks like it may be the case), we can probably work around it. I.e. for symbols that must have identical names on both sides we can generate a unique alias that's identical on both sides and use it instead when CUDA needs it.

Hahnfeld added inline comments.Nov 29 2017, 3:13 PM
lib/IR/ValueSymbolTable.cpp
54–59

But why is LLVM responsible for mangling? Shouldn't this be done by the Clang frontend?
I've found this test in libcxxabi/test/test_demangle.pass.cpp:

{"_ZNK10__cxxabiv111__libcxxabi5__sub20first_demangled_nameEPc.eh", "__cxxabiv1::__libcxxabi::__sub::first_demangled_name(char*) const (.eh)"},

As said, I can't imagine a user defined value where LLVM needs to generate a unique name, so this should be fine for the CUDA functions you mentioned because they operate on variables.

tra added inline comments.Nov 30 2017, 10:12 AM
lib/IR/ValueSymbolTable.cpp
54–59

Sorry. I should've said "the way LLVM creates unique names". C++ mangling is indeed handled by clang.

AFAICT, we do need to consider possibility of LLVM generating unique name whenever Clang calls LLVM's Value::setName(). The API explicitly guarantees that the name will be unique.

Another case would be an identifier with a unicode symbol in it. NVPTX would have to sanitize it with the result potentially clashing with existing name -- similar to @.str test case below. This is hypothetical at the moment as clang does not support unicode (yet?).

Hahnfeld added inline comments.Nov 30 2017, 10:26 AM
lib/IR/ValueSymbolTable.cpp
54–59

All right, then the next natural question is: Do we need to keep the C++ mangling in mind when creating the unique name? Because that is currently the reason that there is a dot separator. If we don't need it we could just drop the dot and every target would be happy. In addition this would guarantee that LLVM will produce a "valid" name after NVPTX sanitized the name and there was a crash.

tra added inline comments.Nov 30 2017, 10:49 AM
lib/IR/ValueSymbolTable.cpp
54–59

Alas dropping the dot does mess with the mangling. In that respect '$' would be lesser evil, at least we'll get the pre-uniquification C++ name demangled correctly. I'm not sure what effect that would have on demangling on windows.

$ c++filt _Z1fv _Z1fv.1 _Z1fv$2 _Z1fv3
f()
f() [clone .1]
f()
_Z1fv3

$ bin/llvm-cxxfilt _Z1fv _Z1fv.1 _Z1fv$2 _Z1fv3
f()
f() (.1)
f()
_Z1fv3

It looks like we'll have to use '$' for uniquification in nvptx as your patch does. That, at least, will deal with the part of the issue we need right now -- generate PTX which ptxas can compile and keep demanglers working. The cases where uniquification mismatch would cause problems should be rare. We can solve that problem separately. It's not perfect, but it looks like we don't have many options here.

I don't have any better ideas.
@hfinkel, @rnk -- your thoughts?

Hahnfeld added inline comments.Nov 30 2017, 11:26 AM
lib/IR/ValueSymbolTable.cpp
54–59

Hehe, I guess you executed this on a shell? Then $2 is an (empty) variable and the right commands are:

$ c++filt _Z1fv _Z1fv.1 _Z1fv\$2 _Z1fv3
f()
f() [clone .1]
_Z1fv$2
_Z1fv3
$ bin/llvm-cxxfilt _Z1fv _Z1fv.1 _Z1fv\$2 _Z1fv3
f()
f() (.1)
_Z1fv$2
_Z1fv3

(at least the two demanglers agree...)

So a dollar sign doesn't help much which means we could also drop it entirely.

tra added inline comments.Nov 30 2017, 11:56 AM
lib/IR/ValueSymbolTable.cpp
54–59

Oops. :-(
Well, bummer. I guess we'll have to live with un-demangle-able unique names on NVPTX side. It's still an improvement over not being able to compile anything with such names.

Hahnfeld added inline comments.Nov 30 2017, 11:59 AM
lib/IR/ValueSymbolTable.cpp
54–59

Ok, I'll update the patch to not even use $ which saves us quotation (see D40572).

Hahnfeld updated this revision to Diff 125007.Nov 30 2017, 1:54 PM
Hahnfeld marked 10 inline comments as done.
Hahnfeld edited the summary of this revision. (Show Details)

Drop '$' and use no separator as it doesn't help with ABI demangling.

tra accepted this revision.Nov 30 2017, 2:11 PM

I am OK with the change, but please wait a bit in case @rnk or @hfinkel have further comments.

This revision is now accepted and ready to land.Nov 30 2017, 2:11 PM
rnk accepted this revision.Nov 30 2017, 2:24 PM

lgtm

This is silly. This bug has been open for so long that nvidia could've just fixed their toolchain by now to accept dots in symbol names. =p

jlebar removed a reviewer: jlebar.Nov 30 2017, 5:52 PM
jlebar added a subscriber: jlebar.

I defer to the others here.

This revision was automatically updated to reflect the committed changes.