This is an archive of the discontinued LLVM Phabricator instance.

[NVPTX] Prefix "$L__" for branch label names
ClosedPublic

Authored by slydiman on Feb 13 2022, 9:25 AM.

Details

Summary

A global variable may have the same name as a label, and ptxas does not accept it.
Prefix labels with $L__ to fix this.

Diff Detail

Event Timeline

slydiman created this revision.Feb 13 2022, 9:25 AM
slydiman requested review of this revision.Feb 13 2022, 9:25 AM
MaskRay added a comment.EditedFeb 14 2022, 10:23 AM

I know nearly nothing about NVPTX.
GNU assembler calls this property "local label prefix" (https://sourceware.org/binutils/docs/as/Symbol-Names.html#Symbol-Names), which is typically .L on ELF.
This is usually specified by the assembler manual or psABI for the target. For example. AArch64 psABI has a note (can be seen as nonnormative) that ARMCC uses L.

Does NVPTX has such a specification? Also, if you grep PrivateGlobalPrefix = , having such a long PrivateGlobalPrefix is an outlier.

tra accepted this revision.Feb 14 2022, 11:17 AM

LGTM.

Does NVPTX has such a specification?

Official spec is here: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html
Unfortunately, it does not say much about labels and whether they have a concept of a local labal.

Also, if you grep PrivateGlobalPrefix = , having such a long PrivateGlobalPrefix is an outlier.

PTX is a rather quirky assembly, so being an outlier is business as usual. :-/

For what it's worth, $L__ is what NVCC uses since CUDA-11.3 : https://godbolt.org/z/3q4be9PrG.
I guess $ alone or $L may be sufficient to avoid name conflicts, but $L__ is fine, too, IMO.

llvm/test/CodeGen/NVPTX/label-var-prefix.ll
17 ↗(On Diff #408274)

You should also match the trailing : otherwise the check could match one of the the label uses.

This revision is now accepted and ready to land.Feb 14 2022, 11:17 AM
MaskRay accepted this revision.Feb 14 2022, 11:23 AM

LGTM.

Does NVPTX has such a specification?

Official spec is here: https://docs.nvidia.com/cuda/parallel-thread-execution/index.html
Unfortunately, it does not say much about labels and whether they have a concept of a local labal.

Thanks. If there is any channel communicating with them, it'd be nice to have this documented somewhere.

Also, if you grep PrivateGlobalPrefix = , having such a long PrivateGlobalPrefix is an outlier.

PTX is a rather quirky assembly, so being an outlier is business as usual. :-/

For what it's worth, $L__ is what NVCC uses since CUDA-11.3 : https://godbolt.org/z/3q4be9PrG.
I guess $ alone or $L may be sufficient to avoid name conflicts, but $L__ is fine, too, IMO.

Cool. If NVCC uses $L__, this is definitely safe.
@slydiman please update the message to mention that the choice of $L__ matches NVCC (not LLVM local invention).

MaskRay added inline comments.Feb 14 2022, 11:28 AM
llvm/test/CodeGen/NVPTX/label-var-prefix.ll
6 ↗(On Diff #408274)
6 ↗(On Diff #408274)

In case there is an error, make sure $LBB0_2 does not match the pattern.

This revision was landed with ongoing or failed builds.Feb 14 2022, 12:52 PM
This revision was automatically updated to reflect the committed changes.
slydiman marked 2 inline comments as done.Feb 14 2022, 12:54 PM
tra added a comment.Feb 14 2022, 1:15 PM

Looks like it broke CUDA:
https://lab.llvm.org/buildbot/#/builders/46/builds/23372/steps/3/logs/stdio

ptxas /tmp/complex-78abf1/complex-sm_75.s, line 250; fatal   : Parsing error near '$L__BB6_2': syntax error

The commit message does not mention NVCC uses $L__.

tra added a comment.Feb 14 2022, 1:27 PM

The commit message does not mention NVCC uses $L__.

Thank you for reverting the patch.

@slydiman please verify that clang-generated PTX with the patch does get accepted by ptxas. I'm not quite sure what went wrong here and we can't test it in clang/llvm in-tree tests as they don't have access to CUDA SDK.

@slydiman please verify that clang-generated PTX with the patch does get accepted by ptxas. I'm not quite sure what went wrong here and we can't test it in clang/llvm in-tree tests as they don't have access to CUDA SDK.

It seems this patch requires CUDA 11.3 and PTX ISA v7.3 at least.

slydiman reopened this revision.Apr 28 2022, 1:20 PM
This revision is now accepted and ready to land.Apr 28 2022, 1:20 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 28 2022, 1:20 PM
tra added a comment.Apr 28 2022, 2:32 PM

It seems this patch requires CUDA 11.3 and PTX ISA v7.3 at least.

Can you elaborate on what exactly is failing with older versions? AFAICT, $... is accepted by ptxas going back as far as CUDA-9.0: https://godbolt.org/z/G5n7E9xYz

On a side note, it's apparently possible to have a global or function to start with $: https://godbolt.org/z/zW41xPnPd, https://godbolt.org/z/srjdh3a46
So, prefixing with $ does not solve the problem completely.
We may be better off using % or just __ as the prefix. While it's possible to have a conflict with the latter one, at least the end users are not supposed to use it.

@slydiman please verify that clang-generated PTX with the patch does get accepted by ptxas. I'm not quite sure what went wrong here and we can't test it in clang/llvm in-tree tests as they don't have access to CUDA SDK.

It seems this patch requires CUDA 11.3 and PTX ISA v7.3 at least.

I believe the problem is solved here https://reviews.llvm.org/D123702
I will check everything and resubmit this patch.

slydiman updated this revision to Diff 426213.Apr 30 2022, 1:27 AM
This revision was automatically updated to reflect the committed changes.
tra added a comment.May 2 2022, 2:44 PM

So, prefixing with $ does not solve the problem completely.

Indeed it's still possible to break compilation for both clang and nvcc: https://godbolt.org/z/9h649eMMn

So, prefixing with $ does not solve the problem completely.

Indeed it's still possible to break compilation for both clang and nvcc: https://godbolt.org/z/9h649eMMn

I guess that $__ prefix was chosen because it matches nvcc.

We may be better off using % or just __ as the prefix. While it's possible to have a conflict with the latter one, at least the end users are not supposed to use it.

Using % instead of $ sounds like a good idea - it is not a valid identifier in C, and it seems to be supported by ptxas.

You are right. I’ll look into this more.