This is an archive of the discontinued LLVM Phabricator instance.

Differential D124866

[CUDA][HIP] support noinline as keyword
ClosedPublic

Authored by yaxunl on May 3 2022, 11:27 AM.

Details

Reviewers

tra
aaron.ballman
rsmith

Commits

rGafc9d674fe5a: [CUDA][HIP] support __noinline__ as keyword

Summary

CUDA/HIP programs use __noinline__ like a keyword, e.g.
__noinline__ void foo() {}, because __noinline__ is defined
as the macro __attribute__((noinline)) in the CUDA/HIP runtime
header files.

However, gcc and clang support __attribute__((__noinline__))
as equivalent to __attribute__((noinline)), and some C++ libraries
use __attribute__((__noinline__)) in their header files.
When CUDA/HIP programs include such header files, the macro expands
inside the attribute and clang emits an error about invalid attributes.

This patch fixes the issue by supporting __noinline__ as
a keyword, so that the CUDA/HIP runtime can remove
the macro definition.
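
The collision described in the summary can be reproduced in plain C++ with the preprocessor alone. The following is a minimal sketch, not code from the patch: the `SKETCH_STR` helpers and `countOccurrences` are hypothetical names introduced here, and the `#define` is a stand-in for the macro that the summary says the runtime headers provide. It stringifies the expanded attribute so the nested `__attribute__` becomes visible.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Two-level stringification: the argument is macro-expanded before '#' applies.
#define SKETCH_STR_(x) #x
#define SKETCH_STR(x) SKETCH_STR_(x)

// Stand-in for the macro defined by the CUDA/HIP runtime headers (per the summary).
#define __noinline__ __attribute__((noinline))

// Text of a library header's GNU-style spelling after the macro has been applied.
static const char *const kExpanded = SKETCH_STR(__attribute__((__noinline__)));

// Drop the stand-in macro again so it cannot affect any later code.
#undef __noinline__

// Hypothetical helper: counts non-overlapping occurrences of a non-empty needle.
inline int countOccurrences(const char *haystack, const char *needle) {
  int count = 0;
  const std::size_t step = std::strlen(needle);
  for (const char *p = std::strstr(haystack, needle); p != nullptr;
       p = std::strstr(p + step, needle)) {
    ++count;
  }
  return count;
}
```

With the macro visible, `kExpanded` contains `__attribute__` twice and no longer contains the identifier `__noinline__`, which is the malformed nested attribute that clang rejects. Once `__noinline__` is a keyword, the runtime headers no longer need the macro and the library spelling is left untouched.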

Diff Detail

Unit Tests: Failed

Time	Test
40 ms	x64 debian > LLVM.CodeGen/Thumb2::bti-indirect-branches.ll

Event Timeline

yaxunl created this revision. May 3 2022, 11:27 AM

Herald added a reviewer: aaron.ballman. May 3 2022, 11:27 AM

Herald added a project: Restricted Project.

Herald added subscribers: mattd, carlosgalvezp, dexonsmith.

yaxunl requested review of this revision. May 3 2022, 11:27 AM

add feature cuda_noinline_keyword to facilitate CUDA/HIP headers removing noinline macro
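
A header could use this feature test roughly as follows. This is a hedged sketch rather than the actual CUDA/HIP header code: `kNoinlineIsKeyword` and `answer` are hypothetical names, and the fallback definition of `__has_feature` is only there so the snippet also compiles on compilers that lack it. The feature name `cuda_noinline_keyword` is the one added by this revision.

```cpp
#include <cassert>

// Fallback so the check compiles on compilers without __has_feature.
#ifndef __has_feature
#define __has_feature(x) 0
#endif

#if __has_feature(cuda_noinline_keyword)
// The compiler treats __noinline__ as a keyword; no macro is needed.
constexpr bool kNoinlineIsKeyword = true;
#else
// Older compiler or non-CUDA compilation: fall back to the macro spelling.
constexpr bool kNoinlineIsKeyword = false;
#define __noinline__ __attribute__((noinline))
#endif

// User code is spelled the same way in both configurations.
__noinline__ int answer() { return 42; }
```

Either way, user code keeps writing `__noinline__ int answer()`. Only the header's decision about whether to define the macro changes.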

Harbormaster completed remote builds in B162526: Diff 426794. May 3 2022, 1:19 PM

I don't know how language extensions come about in CUDA or HIP -- is there an appropriate standards body (or something similar) that's aware of this extension and supports it?

The changes should likely come with a release note entry about the new functionality, and some documentation changes as well.

clang/include/clang/Basic/Attr.td
1777–1780
clang/include/clang/Basic/Features.def
274	Do the CUDA or HIP specs define `__noinline__` as a keyword specifically? If not, this isn't a `FEATURE`, it's an `EXTENSION` because it's specific to Clang, not the language standard.
clang/lib/Parse/ParseDecl.cpp
902	I think we should be issuing a pedantic "this is a clang extension" warning here, WDYT?
clang/test/SemaCUDA/noinline.cu
9	I think there should also be a test like `[[gnu::__noinline__]] void fun4() {}` to verify that the double square bracket syntax also correctly handles this being a keyword now (I expect the test to pass).

I don't know how language extensions come about in CUDA or HIP -- is there an appropriate standards body (or something similar) that's aware of this extension and supports it?

Summoning @rsmith for his language lawyer expertise.

yaxunl marked 4 inline comments as done. May 6 2022, 11:12 AM

yaxunl added inline comments.

clang/include/clang/Basic/Attr.td
1777–1780	will do
clang/include/clang/Basic/Features.def
274	CUDA/HIP do not have language spec. In their programming guide, they do not define `__noinline__` as a keyword. Will make it an extension.
clang/lib/Parse/ParseDecl.cpp
902	will do
clang/test/SemaCUDA/noinline.cu
9	will do

revised by Aaron's comments

aaron.ballman added inline comments. May 6 2022, 11:23 AM

clang/include/clang/Basic/Features.def
274	"CUDA/HIP do not have language spec." Then what body of people governs changes to the language? Basically, I'm trying to understand whether this patch meets the community requirements for adding an extension: https://clang.llvm.org/get_involved.html#criteria, specifically #4 (though the rest of the points are worth keeping in mind). I don't want Clang to end up stepping on toes by defining this extension only to accidentally frustrate the CUDA community.

yaxunl marked an inline comment as done. May 6 2022, 11:46 AM

yaxunl added inline comments.

clang/include/clang/Basic/Features.def
274	specific to `__noinline__`, it is largely determined by the existing behaviour of CUDA SDK. The CUDA SDK defines `__noinline__` as a macro `__attribute__((noinline))`. However, it is not compatible with some C++ headers which use `__attribute__((__noinline__))`. This patch will not change the usage pattern of `__noinline__`. It is equivalent to the original behaviour with the benefit of being compatible with C++ headers.

added release note and documentation

CUDA/HIP do not have language spec.

Well. It's not completely true. CUDA programming guide does serve as the de-facto spec for CUDA. It's far from perfect, but it does mention __noinline__ and __forceinline__ as function qualifiers: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#noinline-and-forceinline

In D124866#3497439, @tra wrote:

CUDA/HIP do not have language spec.

Well. It's not completely true. CUDA programming guide does serve as the de-facto spec for CUDA. It's far from perfect, but it does mention __noinline__ and __forceinline__ as function qualifiers: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#noinline-and-forceinline

Thanks for the pointer. I missed that part.

CUDA SDK implements __noinline__ as attribute __attribute__((noinline)) though. Some requirements may not have diagnostics.

In D124866#3497439, @tra wrote:

CUDA/HIP do not have language spec.

Well. It's not completely true. CUDA programming guide does serve as the de-facto spec for CUDA. It's far from perfect, but it does mention __noinline__ and __forceinline__ as function qualifiers: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#noinline-and-forceinline

Thank you, that's the magic words I was hoping for -- because they're described as function qualifiers, I think it's justifiable to add them as a keyword implementation in Clang and not worry about stepping on the toes of the CUDA spec (it's adhering to what the current spec requires).

Should we do __forceinline__ at the same time so that there's consistency?

clang/lib/Parse/ParseDecl.cpp
902	I'm questioning whether my advice here was good or not -- now that I see the CUDA spec already calls these function qualifiers... it's debatable whether this is a Clang extension or just the way in which Clang implements the CUDA function qualifiers. @tra -- do you have opinions? I'm sort of leaning towards dropping the extension warning, but the only reason I can think of for keeping it is if Clang is the only CUDA compiler that doesn't require you to include a header before using the function qualifiers. If that's the case, there is a portability concern.

In D124866#3500761, @aaron.ballman wrote:

Should we do __forceinline__ at the same time so that there's consistency?

__forceinline__ does not have the issue that __noinline__ has, since it is not a GCC attribute. The current CUDA/HIP implementation of __forceinline__ in header files is sufficient. I do not see the benefit of implementing __forceinline__ as a keyword.

In D124866#3501181, @yaxunl wrote:

In D124866#3500761, @aaron.ballman wrote:

Should we do __forceinline__ at the same time so that there's consistency?

__forceinline__ does not have the issue that __noinline__ has, since it is not a GCC attribute. The current CUDA/HIP implementation of __forceinline__ in header files is sufficient. I do not see the benefit of implementing __forceinline__ as a keyword.

Primarily to reduce user confusion. It's kind of weird for __noinline__ to be a keyword and __forceinline__ to not be a keyword when they're both defined the same way by the CUDA spec. This means you can #undef one of them but not the other, that sort of thing.

clang/test/CodeGenCUDA/noinline.cu
2	I've asked @erichkeane to weigh in on whether there's a better approach here than specifying an optimization level.
clang/test/SemaCUDA/noinline.cu
9	Ah, I just noticed we also have no tests for the behavior of the keyword in the presence of the macro being defined, e.g.:
	#define __noinline__ __attribute__((__noinline__))
	__noinline__ void fun5() {}

erichkeane added inline comments. May 9 2022, 10:15 AM

clang/test/CodeGenCUDA/noinline.cu
2	You don't need to do this; it looks like all you're trying to do is keep 'clang' out of `O0` mode. However, what you do NOT want is the optimizations to run. The common way to do that is to combine `O1`/`O2`/etc. with `-disable-llvm-passes`, like: `-O2 -disable-llvm-passes`. This will keep clang in `O2` mode, but will keep the optimizer from running anything, which might mess with the test later on.

In D124866#3501203, @aaron.ballman wrote:

Should we do __forceinline__ at the same time so that there's consistency?

Primarily to reduce user confusion. It's kind of weird for __noinline__ to be a keyword and __forceinline__ to not be a keyword when they're both defined the same way by the CUDA spec. This means you can #undef one of them but not the other, that sort of thing.

I'm slightly biased towards making them both a keyword. That said, I may be convinced otherwise if we discover that it may break some assumptions in existing C++ code. I just don't know enough.

clang/lib/Parse/ParseDecl.cpp
902	I'm not sure if such a warning would be useful. "the only reason I can think of for keeping it is if Clang is the only CUDA compiler that doesn't require you to include a header before using the function qualifiers. If that's the case, there is a portability concern." I don't think it's an issue. We already have similar divergence between nvcc/clang, e.g. built-in variables like `threadIdx`: Clang implements them in a header, but NVCC provides them in the compiler itself. With both compilers the variables are available by the time we get to compile user code. Virtually all CUDA compilations are done with tons of CUDA headers pre-included by the compiler. Those that do not do that are already on their own and have to provide many other 'standard' CUDA things like target attributes. I don't think we need to worry about that.

In D124866#3501203, @aaron.ballman wrote:

__forceinline__ does not have the issue that __noinline__ has, since it is not a GCC attribute. The current CUDA/HIP implementation of __forceinline__ in header files is sufficient. I do not see the benefit of implementing __forceinline__ as a keyword.

Primarily to reduce user confusion. It's kind of weird for __noinline__ to be a keyword and __forceinline__ to not be a keyword when they're both defined the same way by the CUDA spec. This means you can #undef one of them but not the other, that sort of thing.

If we are to add __forceinline__ as a keyword, I feel it better be a separate patch to be cleaner.

clang/lib/Parse/ParseDecl.cpp
902	I can remove the diagnostics since it seems unnecessary. I tend to treat it as an extension since nvcc is the de facto standard implementation, which does not implement it as a keyword. Compared to that, this is like an extension.
clang/test/CodeGenCUDA/noinline.cu
2	will use -O2 -disable-llvm-passes
clang/test/SemaCUDA/noinline.cu
9	will do

removed diagnostics and added more tests

In D124866#3501641, @yaxunl wrote:

If we are to add __forceinline__ as a keyword, I feel it better be a separate patch to be cleaner.

Fine with me.

clang/lib/Parse/ParseDecl.cpp
902	I'd argue that NVCC does implement it (as in "documents and makes it available"). Providing the documented functionality using a different implementation does not reach the point of being an extension, IMO. While there are observable differences between implementations, depending on them would be a portability error for the user.

This revision is now accepted and ready to land. May 9 2022, 1:17 PM

Harbormaster completed remote builds in B163550: Diff 428167. May 9 2022, 3:00 PM

In D124866#3501641, @yaxunl wrote:

If we are to add __forceinline__ as a keyword, I feel it better be a separate patch to be cleaner.

I'm fine with that.

A few nits and a question about the test recently added.

clang/docs/ReleaseNotes.rst
343–348
clang/include/clang/Basic/AttrDocs.td
543
clang/test/SemaCUDA/noinline.cu
9	I missed an important detail -- I think this is now going to generate a warning in `-pedantic` mode (through `-Wkeyword-macro`) when compiling for CUDA; is that going to be a problem for CUDA headers, or are those always included as a system header (and so the diagnostics will be suppressed)?

yaxunl marked 4 inline comments as done. May 10 2022, 9:30 AM

yaxunl added inline comments.

clang/docs/ReleaseNotes.rst
343–348	will fix
clang/include/clang/Basic/AttrDocs.td
543	will fix.
clang/lib/Parse/ParseDecl.cpp
902	that makes sense. will change the extension to feature
clang/test/SemaCUDA/noinline.cu
9	I could not find how clang driver adds CUDA include path https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Cuda.cpp#L284 @tra do you know how CUDA include path is added? is it done by CMake? For HIP the HIP include path is added as a system include path by clang driver.

dexonsmith removed a subscriber: dexonsmith. May 10 2022, 9:31 AM

aaron.ballman added inline comments. May 10 2022, 9:33 AM

clang/test/SemaCUDA/noinline.cu
9	Whatever we find out, we can emulate its behavior here in the test file to see what the diagnostic behavior will be (you can use GNU linemarkers to convince the compiler parts of the source are in a system header).

yaxunl marked 4 inline comments as done. May 10 2022, 10:10 AM

yaxunl added inline comments.

clang/test/SemaCUDA/noinline.cu
9	will add tests for that. It seems that no warnings are emitted even with -pedantic, regardless of whether the macro is defined in a system header or a normal header.

aaron.ballman added inline comments. May 10 2022, 10:11 AM

clang/test/SemaCUDA/noinline.cu
9	Excellent, thank you!

make it a feature, add tests for pedantic, fix release notes and documentation

tra added inline comments. May 10 2022, 10:51 AM

clang/test/SemaCUDA/noinline.cu
9	CUDA includes are added via `-internal-isystem` here: https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Cuda.cpp#L892

Harbormaster completed remote builds in B163727: Diff 428423. May 10 2022, 11:29 AM

This revision was landed with ongoing or failed builds. May 10 2022, 11:34 AM

Closed by commit rGafc9d674fe5a: [CUDA][HIP] support __noinline__ as keyword (authored by yaxunl).

This revision was automatically updated to reflect the committed changes.

yaxunl marked an inline comment as done.

yaxunl added a commit: rGafc9d674fe5a: [CUDA][HIP] support __noinline__ as keyword.

Herald added a project: Restricted Project. May 10 2022, 11:34 AM

delcypher added a subscriber: delcypher. May 10 2022, 5:47 PM

delcypher added inline comments.

clang/lib/Basic/IdentifierTable.cpp
111	@yaxunl Is it intentional that you didn't update `KEYALL` here? That means `KEYALL` doesn't include the bit for `KEYCUDA`. If that was your intention then this will break if someone adds a new key. E.g.

KEYCUDA = 0x2000000,
KEYSOMENEWTHING = 0x4000000,
// ...
// KEYALL now includes `KEYCUDA`, whereas it didn't before.
// KEYALL includes KEYSOMENEWTHING
KEYALL = (0x7ffffff & ~KEYNOMS18 & ~KEYNOOPENCL) // KEYNOMS18 and KEYNOOPENCL are used to exclude.
...

Updating the `0x1ffffff` constant to `0x3ffffff` so that `KEYALL` includes `KEYCUDA`

If your intention is to not have `KEYCUDA` set in `KEYALL` then amend `KEYALL` to be.

KEYALL = (0x7ffffff & ~KEYNOMS18 & ~KEYNOOPENCL & ~KEYCUDA ) // KEYNOMS18 and KEYNOOPENCL are used to exclude.
// KEYCUDA is not included in KEYALL

yaxunl added inline comments. May 10 2022, 7:52 PM

clang/lib/Basic/IdentifierTable.cpp

111

My intention is not to include KEYCUDA in KEYALL.

Should I change KEYALL to

KEYALL = (0x3ffffff & ~KEYNOMS18 &
              ~KEYNOOPENCL & ~KEYCUDA ) // KEYNOMS18 and KEYNOOPENCL are used to exclude.
// KEYCUDA is not included in KEYALL

instead of

KEYALL = (0x7ffffff & ~KEYNOMS18 &
              ~KEYNOOPENCL & ~KEYCUDA ) // KEYNOMS18 and KEYNOOPENCL are used to exclude.
// KEYCUDA is not included in KEYALL

since the current maximum mask is 0x3ffffff instead of 0x7ffffff

delcypher added inline comments. May 10 2022, 10:28 PM

clang/lib/Basic/IdentifierTable.cpp

111

Oops, you're right it would be 0x3ffffff. I wonder though if we should clean this up so we don't need to manually update the bit mask every time... what if it was written like this?

 enum {
    KEYC99        = 0x1,
    KEYCXX        = 0x2,
    KEYCXX11      = 0x4,
    ....
    KEYSYCL       = 0x1000000,
    KEYCUDA       = 0x2000000,
    KEYMAX = KEYCUDA, // Must be set to the largest KEY enum value
    KEYALLCXX = KEYCXX | KEYCXX11 | KEYCXX20,

    // KEYNOMS18 and KEYNOOPENCL are used to exclude.
    // KEYCUDA is not included in KEYALL because <FIXME add reason here>
    KEYALL = (((KEYMAX & (KEYMAX-1)) & ~KEYNOMS18 & ~KEYNOOPENCL & ~KEYCUDA)
};

yaxunl added inline comments. May 11 2022, 8:18 AM

clang/lib/Basic/IdentifierTable.cpp
111	On second thought, KEYALL does not need to exclude KEYCUDA. However, it would be good to set KEYALL in a generic approach. I will open a separate review.

yaxunl marked 2 inline comments as done. May 11 2022, 8:48 AM

yaxunl added inline comments.

clang/lib/Basic/IdentifierTable.cpp
111	opened https://reviews.llvm.org/D125396 to fix KEYALL

delcypher added inline comments. May 11 2022, 8:50 AM

clang/lib/Basic/IdentifierTable.cpp
111	Oops, that should say KEYALL = ((KEYMAX | (KEYMAX-1)) & ~KEYNOMS18 & ~KEYNOOPENCL & ~KEYCUDA)
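
The mask arithmetic in this exchange can be checked at compile time. The sketch below is an illustration only, not the change that landed in D125396: the values of `KEYNOMS18` and `KEYNOOPENCL` are assumptions (they are not shown in this review), and the enum is heavily abbreviated. It follows yaxunl's later remark that `KEYALL` does not need to exclude `KEYCUDA`.

```cpp
#include <cassert>
#include <cstdint>

enum : std::uint32_t {
  KEYC99      = 0x1,
  KEYCXX      = 0x2,
  KEYCXX11    = 0x4,
  KEYNOMS18   = 0x800,      // assumed value, for illustration only
  KEYNOOPENCL = 0x1000,     // assumed value, for illustration only
  KEYSYCL     = 0x1000000,
  KEYCUDA     = 0x2000000,
  KEYMAX      = KEYCUDA,    // must track the largest single-bit flag
  // KEYMAX | (KEYMAX - 1) sets every bit up to and including KEYMAX;
  // the exclusion-only flags are then masked out.
  KEYALL      = (KEYMAX | (KEYMAX - 1)) & ~KEYNOMS18 & ~KEYNOOPENCL
};

// The first suggestion used '&', which yields 0 for any single-bit KEYMAX.
static_assert((KEYMAX & (KEYMAX - 1)) == 0, "'&' clears every bit");
// The corrected '|' form produces the full mask mentioned in the thread.
static_assert((KEYMAX | (KEYMAX - 1)) == 0x3ffffff, "full mask up to KEYCUDA");
static_assert((KEYALL & KEYCUDA) != 0, "KEYCUDA is part of KEYALL here");
static_assert((KEYALL & KEYNOMS18) == 0, "exclusion flags are masked out");
```

Deriving the mask from `KEYMAX` means that adding a new flag only requires updating `KEYMAX`, rather than recomputing a hexadecimal constant by hand.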

Pierre-vh mentioned this in D137251: [clang][cuda/hip] Allow `__noinline__` lambdas. Nov 2 2022, 6:42 AM

Pierre-vh mentioned this in rGc05f1639f7f4: [clang][cuda/hip] Allow `__noinline__` lambdas. Nov 4 2022, 12:33 AM

yaxunl mentioned this in D149364: [CUDA] Temporarily undefine __noinline__ when including bits/shared_ptr_base.h. Apr 28 2023, 7:38 AM

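Revision Contents

Path	Size
clang/docs/ReleaseNotes.rst	7 lines
clang/include/clang/Basic/Attr.td	5 lines
clang/include/clang/Basic/AttrDocs.td	4 lines
clang/include/clang/Basic/Features.def	3 lines
clang/include/clang/Basic/TokenKinds.def	3 lines
clang/include/clang/Parse/Parser.h	1 line
clang/lib/Basic/IdentifierTable.cpp	3 lines
clang/lib/Parse/ParseDecl.cpp	14 lines
clang/test/CodeGenCUDA/noinline.cu	34 lines
clang/test/Lexer/has_feature.cu	8 lines
clang/test/SemaCUDA/noinline.cu	19 lines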

Diff 428423

clang/docs/ReleaseNotes.rst

  - The mangling scheme for C++20 modules has incompatibly changed. The
  ...
  symbols with named module attachment.

  C++2b Feature Support
  ^^^^^^^^^^^^^^^^^^^^^
  - Implemented `P2128R6: Multidimensional subscript operator <https://wg21.link/P2128R6>`_.
  - Implemented `P0849R8: auto(x): decay-copy in the language <https://wg21.link/P0849R8>`_.
  - Implemented `P2242R3: Non-literal variables (and labels and gotos) in constexpr functions <https://wg21.link/P2242R3>`_.

- CUDA Language Changes in Clang
+ CUDA/HIP Language Changes in Clang
- ------------------------------
+ ----------------------------------

+ - Added `__noinline__` as a keyword to avoid diagnostics due to usage of
+   `__attribute__((__noinline__))` in CUDA/HIP programs.

  Objective-C Language Changes in Clang
  -------------------------------------

  OpenCL C Language Changes in Clang
  ----------------------------------
  ...

Inline comments (lines 343–348):

aaron.ballman suggested:

  CUDA/HIP Language Changes in Clang
- ------------------------------
+ ----------------------------------

yaxunl: will fix

clang/include/clang/Basic/Attr.td

  def Convergent : InheritableAttr {
    let Spellings = [Clang<"convergent">];
    let Subjects = SubjectList<[Function]>;
    let Documentation = [ConvergentDocs];
    let SimpleHandler = 1;
  }

  def NoInline : DeclOrStmtAttr {
-   let Spellings = [GCC<"noinline">, CXX11<"clang", "noinline">,
-       C2x<"clang", "noinline">, Declspec<"noinline">];
+   let Spellings = [Keyword<"__noinline__">, GCC<"noinline">,
+       CXX11<"clang", "noinline">, C2x<"clang", "noinline">,
+       Declspec<"noinline">];
    let Accessors = [Accessor<"isClangNoInline", [CXX11<"clang", "noinline">,
        C2x<"clang", "noinline">]>];
    let Documentation = [NoInlineDocs];
    let Subjects = SubjectList<[Function, Stmt], WarnDiag,
        "functions and statements">;
    let SimpleHandler = 1;
  }
  ...

Inline comments (lines 1777–1780):

aaron.ballman suggested:

  def NoInline : DeclOrStmtAttr {
- let Spellings = [Keyword<"__noinline__">, GCC<"noinline">, CXX11<"clang", "noinline">,
- C2x<"clang", "noinline">, Declspec<"noinline">];
+ let Spellings = [Keyword<"__noinline__">, GCC<"noinline">,
+ CXX11<"clang", "noinline">, C2x<"clang", "noinline">,
+ Declspec<"noinline">];

yaxunl: will do

clang/include/clang/Basic/AttrDocs.td

  This function attribute suppresses the inlining of a function at the call sites
  of the function.

  ``[[clang::noinline]]`` spelling can be used as a statement attribute; other
  spellings of the attribute are not supported on statements. If a statement is
  marked ``[[clang::noinline]]`` and contains calls, those calls inside the
  statement will not be inlined by the compiler.

+ ``__noinline__`` can be used as a keyword in CUDA/HIP languages. This is to
+ avoid diagnostics due to usage of ``__attribute__((__noinline__))``
+ with ``__noinline__`` defined as a macro as ``__attribute__((noinline))``.

  .. code-block:: c

    int example(void) {
      int r;
      [[clang::noinline]] foo();
      [[clang::noinline]] r = bar();
      return r;
    }
  ...

Inline comments (line 543):

aaron.ballman suggested:

  avoid diagnostics due to usage of ``__attribute__((__noinline__))``
- with ``__noinline__`` defined as a macro as ``__attribute__((noinline))`.
+ with ``__noinline__`` defined as a macro as ``__attribute__((noinline))``.

yaxunl: will fix.

clang/include/clang/Basic/Features.def

  ...
  EXTENSION(gnu_asm, LangOpts.GNUAsm)
  EXTENSION(gnu_asm_goto_with_outputs, LangOpts.GNUAsm)
  EXTENSION(matrix_types, LangOpts.MatrixTypes)
  EXTENSION(matrix_types_scalar_division, true)
  EXTENSION(cxx_attributes_on_using_declarations, LangOpts.CPlusPlus11)

  FEATURE(cxx_abi_relative_vtable, LangOpts.CPlusPlus && LangOpts.RelativeCXXABIVTables)

+ // CUDA/HIP Features
+ FEATURE(cuda_noinline_keyword, LangOpts.CUDA)

  #undef EXTENSION
  #undef FEATURE

clang/include/clang/Basic/TokenKinds.def

  ...
  KEYWORD(__builtin_astype , KEYOPENCLC | KEYOPENCLCXX)
  UNARY_EXPR_OR_TYPE_TRAIT(vec_step, VecStep, KEYOPENCLC | KEYOPENCLCXX | KEYALTIVEC | KEYZVECTOR)
  #define GENERIC_IMAGE_TYPE(ImgType, Id) KEYWORD(ImgType##_t, KEYOPENCLC | KEYOPENCLCXX)
  #include "clang/Basic/OpenCLImageTypes.def"
  KEYWORD(pipe , KEYOPENCLC | KEYOPENCLCXX)
  // C++ for OpenCL s2.3.1: addrspace_cast operator
  KEYWORD(addrspace_cast , KEYOPENCLCXX)

+ // CUDA/HIP function attributes
+ KEYWORD(__noinline__ , KEYCUDA)

  // OpenMP Type Traits
  UNARY_EXPR_OR_TYPE_TRAIT(__builtin_omp_required_simd_align, OpenMPRequiredSimdAlign, KEYALL)

  // Borland Extensions.
  KEYWORD(__pascal , KEYALL)

  // Altivec Extension.
  KEYWORD(__vector , KEYALTIVEC|KEYZVECTOR)
  ...

clang/include/clang/Parse/Parser.h

  private:
  ...
  void ParseMicrosoftTypeAttributes(ParsedAttributes &attrs);
  void DiagnoseAndSkipExtendedMicrosoftTypeAttributes();
  SourceLocation SkipExtendedMicrosoftTypeAttributes();
  void ParseMicrosoftInheritanceClassAttributes(ParsedAttributes &attrs);
  void ParseBorlandTypeAttributes(ParsedAttributes &attrs);
  void ParseOpenCLKernelAttributes(ParsedAttributes &attrs);
  void ParseOpenCLQualifiers(ParsedAttributes &Attrs);
  void ParseNullabilityTypeSpecifiers(ParsedAttributes &attrs);
+ void ParseCUDAFunctionAttributes(ParsedAttributes &attrs);

  VersionTuple ParseVersionTuple(SourceRange &Range);
  void ParseAvailabilityAttribute(IdentifierInfo &Availability,
                                  SourceLocation AvailabilityLoc,
                                  ParsedAttributes &attrs,
                                  SourceLocation *endLoc,
                                  IdentifierInfo *ScopeName,
                                  SourceLocation ScopeLoc,
  ...

clang/lib/Basic/IdentifierTable.cpp

Show First 20 Lines • Show All 102 Lines • ▼ Show 20 Lines	enum {
KEYOBJC = 0x20000,		KEYOBJC = 0x20000,
KEYZVECTOR = 0x40000,		KEYZVECTOR = 0x40000,
KEYCOROUTINES = 0x80000,		KEYCOROUTINES = 0x80000,
KEYMODULES = 0x100000,		KEYMODULES = 0x100000,
KEYCXX20 = 0x200000,		KEYCXX20 = 0x200000,
KEYOPENCLCXX = 0x400000,		KEYOPENCLCXX = 0x400000,
KEYMSCOMPAT = 0x800000,		KEYMSCOMPAT = 0x800000,
KEYSYCL = 0x1000000,		KEYSYCL = 0x1000000,
		KEYCUDA = 0x2000000,
		delcypherUnsubmitted Done Reply Inline Actions @yaxunl Is it intentional that you didn't update `KEYALL` here? That means `KEYALL` doesn't include the bit for `KEYCUDA`. If that was your intention then this will break if someone adds a new key. E.g. KEYCUDA = 0x2000000, KEYSOMENEWTHING = 0x4000000, // ... // KEYALL now includes `KEYCUDA`, whereas it didn't before. // KEYALL includes KEYSOMENEWTHING KEYALL = (0x7ffffff & ~KEYNOMS18 & ~KEYNOOPENCL) // KEYNOMS18 and KEYNOOPENCL are used to exclude. ... Updating the `0x1ffffff` constant to `0x3ffffff` so that `KEYALL` includes `KEYCUDA` If your intention is to not have `KEYCUDA` set in `KEYALL` then amend `KEYALL` to be. KEYALL = (0x7ffffff & ~KEYNOMS18 & ~KEYNOOPENCL & ~KEYCUDA ) // KEYNOMS18 and KEYNOOPENCL are used to exclude. // KEYCUDA is not included in KEYALL delcypher: @yaxunl Is it intentional that you didn't update `KEYALL` here? That means `KEYALL` doesn't…
		yaxunlAuthorUnsubmitted Done Reply Inline Actions My intention is not to include KEYCUDA in KEYALL. Should I change KEYALL to KEYALL = (0x3ffffff & ~KEYNOMS18 & ~KEYNOOPENCL & ~KEYCUDA ) // KEYNOMS18 and KEYNOOPENCL are used to exclude. // KEYCUDA is not included in KEYALL instead of KEYALL = (0x7ffffff & ~KEYNOMS18 & ~KEYNOOPENCL & ~KEYCUDA ) // KEYNOMS18 and KEYNOOPENCL are used to exclude. // KEYCUDA is not included in KEYALL since the current maximum mask is 0x3ffffff instead of 0x7ffffff yaxunl: My intention is not to include KEYCUDA in KEYALL. Should I change KEYALL to ``` KEYALL =…
		delcypherUnsubmitted Done Reply Inline Actions Oops, you're right it would be `0x3ffffff`. I wonder though if we should clean this up so we don't need to manually update the bit mask every time... what if it was written like this? enum { KEYC99 = 0x1, KEYCXX = 0x2, KEYCXX11 = 0x4, .... KEYSYCL = 0x1000000, KEYCUDA = 0x2000000, KEYMAX = KEYCUDA, // Must be set to the largest KEY enum value KEYALLCXX = KEYCXX \| KEYCXX11 \| KEYCXX20, // KEYNOMS18 and KEYNOOPENCL are used to exclude. // KEYCUDA is not included in KEYALL because <FIXME add reason here> KEYALL = (((KEYMAX & (KEYMAX-1)) & ~KEYNOMS18 & ~KEYNOOPENCL & ~KEYCUDA) }; delcypher: Oops, you're right it would be `0x3ffffff`. I wonder though if we should clean this up so we…
		yaxunlAuthorUnsubmitted Done Reply Inline Actions On second thought, KEYALL does not need to exclude KEYCUDA. However, it would be good to set KEYALL in a generic approach. I will open a separate review. yaxunl: On second thought, KEYALL does not need to exclude KEYCUDA. However, it would be good to set…
		yaxunlAuthorUnsubmitted Done Reply Inline Actions opened https://reviews.llvm.org/D125396 to fix KEYALL yaxunl: opened https://reviews.llvm.org/D125396 to fix KEYALL
		delcypherUnsubmitted Not Done Reply Inline Actions Oops that should say KEYALL = (((KEYMAX \| (KEYMAX-1)) & ~KEYNOMS18 & ~KEYNOOPENCL & ~KEYCUDA) delcypher: Oops that should say ``` KEYALL = (((KEYMAX \| (KEYMAX-1)) & ~KEYNOMS18 & ~KEYNOOPENCL &…
KEYALLCXX = KEYCXX \| KEYCXX11 \| KEYCXX20,		KEYALLCXX = KEYCXX \| KEYCXX11 \| KEYCXX20,
KEYALL = (0x1ffffff & ~KEYNOMS18 &		KEYALL = (0x1ffffff & ~KEYNOMS18 &
~KEYNOOPENCL) // KEYNOMS18 and KEYNOOPENCL are used to exclude.		~KEYNOOPENCL) // KEYNOMS18 and KEYNOOPENCL are used to exclude.
};		};
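
The mask arithmetic from the inline thread can be checked in isolation. The following is a minimal sketch, not the code in IdentifierTable.cpp: the enum is cut down to four flags, `KEYALLBITS` and `KEYWRONG` are hypothetical names, and the values of `KEYSYCL` and `KEYCUDA` are the ones quoted in the comment. It shows why `KEYMAX | (KEYMAX-1)` yields `0x3ffffff` while `KEYMAX & (KEYMAX-1)` yields zero when `KEYMAX` is a single bit.

```cpp
#include <cstdint>

// Hypothetical, cut-down flag set; only the bit layout matters here.
enum : std::uint32_t {
  KEYC99 = 0x1,
  KEYCXX = 0x2,
  KEYSYCL = 0x1000000,
  KEYCUDA = 0x2000000,
  KEYMAX = KEYCUDA, // must be the largest single-bit flag
  // OR-ing the top bit with (top bit - 1) sets every bit up to KEYMAX.
  KEYALLBITS = KEYMAX | (KEYMAX - 1),
  // AND-ing a power of two with (itself - 1) clears every bit.
  KEYWRONG = KEYMAX & (KEYMAX - 1)
};

static_assert(KEYALLBITS == 0x3ffffff, "all 26 flag bits are set");
static_assert(KEYWRONG == 0, "the AND form produces an empty mask");
```

With this form, adding a new flag only requires updating `KEYMAX`; the exclusion flags would still be masked out separately, as in the original `KEYALL` definition.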

/// How a keyword is treated in the selected standard.		/// How a keyword is treated in the selected standard.
enum KeywordStatus {		enum KeywordStatus {
KS_Disabled, // Disabled		KS_Disabled, // Disabled
(34 lines not shown)	static KeywordStatus getKeywordStatus(const LangOptions &LangOpts,
if (LangOpts.CPlusPlus20 && (Flags & KEYCONCEPTS)) return KS_Enabled;		if (LangOpts.CPlusPlus20 && (Flags & KEYCONCEPTS)) return KS_Enabled;
if (LangOpts.Coroutines && (Flags & KEYCOROUTINES)) return KS_Enabled;		if (LangOpts.Coroutines && (Flags & KEYCOROUTINES)) return KS_Enabled;
if (LangOpts.ModulesTS && (Flags & KEYMODULES)) return KS_Enabled;		if (LangOpts.ModulesTS && (Flags & KEYMODULES)) return KS_Enabled;
if (LangOpts.CPlusPlus && (Flags & KEYALLCXX)) return KS_Future;		if (LangOpts.CPlusPlus && (Flags & KEYALLCXX)) return KS_Future;
if (LangOpts.CPlusPlus && !LangOpts.CPlusPlus20 && (Flags & CHAR8SUPPORT))		if (LangOpts.CPlusPlus && !LangOpts.CPlusPlus20 && (Flags & CHAR8SUPPORT))
return KS_Future;		return KS_Future;
if (LangOpts.isSYCL() && (Flags & KEYSYCL))		if (LangOpts.isSYCL() && (Flags & KEYSYCL))
return KS_Enabled;		return KS_Enabled;
		if (LangOpts.CUDA && (Flags & KEYCUDA))
		return KS_Enabled;
return KS_Disabled;		return KS_Disabled;
}		}
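
The flag test added above follows the same pattern as the SYCL check: a keyword is enabled when the language mode is on and the keyword carries the matching flag. The following is a simplified, self-contained model of that pattern. `LangOpts` and the two-value `KeywordStatus` are hypothetical stand-ins rather than clang's real types, and it assumes a C++14 compiler.

```cpp
// Hypothetical stand-in for clang's LangOptions; only three modes are modelled.
struct LangOpts {
  bool CPlusPlus;
  bool SYCL;
  bool CUDA;
};

// Flag values taken from the review thread; the real enum has many more.
enum KeywordFlag : unsigned {
  KEYCXX = 0x2,
  KEYSYCL = 0x1000000,
  KEYCUDA = 0x2000000
};

enum KeywordStatus { KS_Disabled, KS_Enabled };

// A keyword is enabled if any active language mode matches one of its flags.
constexpr KeywordStatus getKeywordStatus(const LangOpts &LO, unsigned Flags) {
  if (LO.CPlusPlus && (Flags & KEYCXX))
    return KS_Enabled;
  if (LO.SYCL && (Flags & KEYSYCL))
    return KS_Enabled;
  if (LO.CUDA && (Flags & KEYCUDA))
    return KS_Enabled;
  return KS_Disabled;
}

// A CUDA-only keyword is a keyword in CUDA mode but not in plain C++.
static_assert(getKeywordStatus(LangOpts{true, false, true}, KEYCUDA) == KS_Enabled,
              "enabled when compiling CUDA");
static_assert(getKeywordStatus(LangOpts{true, false, false}, KEYCUDA) == KS_Disabled,
              "disabled when compiling plain C++");
```

This mirrors what the SemaCUDA test in this revision expects: `__noinline__` parses in CUDA mode and is an unknown name with `-x c++`.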

/// AddKeyword - This method is used to associate a token ID with specific		/// AddKeyword - This method is used to associate a token ID with specific
/// identifiers because they are language keywords. This causes the lexer to		/// identifiers because they are language keywords. This causes the lexer to
/// automatically map matching identifiers to specialized token codes.		/// automatically map matching identifiers to specialized token codes.
static void AddKeyword(StringRef Keyword,		static void AddKeyword(StringRef Keyword,
tok::TokenKind TokenCode, unsigned Flags,		tok::TokenKind TokenCode, unsigned Flags,
(607 lines not shown)

clang/lib/Parse/ParseDecl.cpp


(891 lines not shown)	void Parser::ParseOpenCLKernelAttributes(ParsedAttributes &attrs) {
while (Tok.is(tok::kw___kernel)) {		while (Tok.is(tok::kw___kernel)) {
IdentifierInfo *AttrName = Tok.getIdentifierInfo();		IdentifierInfo *AttrName = Tok.getIdentifierInfo();
SourceLocation AttrNameLoc = ConsumeToken();		SourceLocation AttrNameLoc = ConsumeToken();
attrs.addNew(AttrName, AttrNameLoc, nullptr, AttrNameLoc, nullptr, 0,		attrs.addNew(AttrName, AttrNameLoc, nullptr, AttrNameLoc, nullptr, 0,
ParsedAttr::AS_Keyword);		ParsedAttr::AS_Keyword);
}		}
}		}

		void Parser::ParseCUDAFunctionAttributes(ParsedAttributes &attrs) {
		while (Tok.is(tok::kw___noinline__)) {
		IdentifierInfo *AttrName = Tok.getIdentifierInfo();
		aaron.ballman: I think we should be issuing a pedantic "this is a clang extension" warning here, WDYT?
		yaxunl: will do
		aaron.ballman: I'm questioning whether my advice here was good or not -- now that I see the CUDA spec already calls these function qualifiers... it's debatable whether this is a Clang extension or just the way in which Clang implements the CUDA function qualifiers. @tra -- do you have opinions? I'm sort of leaning towards dropping the extension warning, but the only reason I can think of for keeping it is if Clang is the only CUDA compiler that doesn't require you to include a header before using the function qualifiers. If that's the case, there is a portability concern.
		tra: I'm not sure if such a warning would be useful.
		> the only reason I can think of for keeping it is if Clang is the only CUDA compiler that doesn't require you to include a header before using the function qualifiers. If that's the case, there is a portability concern.
		I don't think it's an issue. We already have similar divergence between nvcc/clang. E.g. built-in variables like `threadIdx`. Clang implements them in a header, but NVCC provides them by compiler itself. With both compilers the variables are available by the time we get to compile user code. Virtually all CUDA compilations are done with tons of CUDA headers pre-included by compiler. Those that do not do that are already on their own and have to provide many other 'standard' CUDA things like target attributes. I don't think we need to worry about that.
		yaxunl: I can remove the diagnostics since it seems unnecessary. I tend to treat it as an extension since nvcc is the de facto standard implementation, which does not implement it as a keyword. Compared to that, this is like an extension.
		tra: I'd argue that NVCC does implement it (as in "documents and makes it available"). Providing the documented functionality using a different implementation does not reach the point of being an extension, IMO. While there are observable differences between implementations, depending on them would be a portability error for the user.
		yaxunl: that makes sense. will change the extension to feature
		SourceLocation AttrNameLoc = ConsumeToken();
		attrs.addNew(AttrName, AttrNameLoc, nullptr, AttrNameLoc, nullptr, 0,
		ParsedAttr::AS_Keyword);
		}
		}
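
The loop in `ParseCUDAFunctionAttributes` consumes every consecutive `__noinline__` token and records one keyword-style attribute per token. The following is a toy model of that control flow only. The token enum, `KeywordAttrRun`, and `parseNoinlineRun` are hypothetical names invented for this sketch and do not correspond to clang's Parser API; it assumes a C++14 compiler and a token stream terminated by an end-of-file token.

```cpp
#include <cstddef>

// Hypothetical token kinds; clang's real ones are generated from TokenKinds.def.
enum class Tok { KwNoinline, Identifier, Eof };

struct KeywordAttrRun {
  std::size_t Count;   // number of keyword attributes recorded
  std::size_t NextPos; // index of the first token that was not consumed
};

// Consume consecutive KwNoinline tokens starting at Pos, one attribute each.
// The stream must end with Tok::Eof so the loop always terminates.
constexpr KeywordAttrRun parseNoinlineRun(const Tok *Toks, std::size_t Pos) {
  std::size_t Count = 0;
  while (Toks[Pos] == Tok::KwNoinline) {
    ++Count; // stands in for attrs.addNew(..., ParsedAttr::AS_Keyword)
    ++Pos;   // stands in for ConsumeToken()
  }
  return KeywordAttrRun{Count, Pos};
}

// Sample stream: two keyword tokens, then an ordinary identifier.
constexpr Tok Sample[] = {Tok::KwNoinline, Tok::KwNoinline, Tok::Identifier,
                          Tok::Eof};

static_assert(parseNoinlineRun(Sample, 0).Count == 2, "both keywords consumed");
static_assert(parseNoinlineRun(Sample, 0).NextPos == 2, "stops at the identifier");
```

Because the loop stops at the first token that is not the keyword, the caller resumes normal declaration-specifier parsing from `NextPos`.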

void Parser::ParseOpenCLQualifiers(ParsedAttributes &Attrs) {		void Parser::ParseOpenCLQualifiers(ParsedAttributes &Attrs) {
IdentifierInfo *AttrName = Tok.getIdentifierInfo();		IdentifierInfo *AttrName = Tok.getIdentifierInfo();
SourceLocation AttrNameLoc = Tok.getLocation();		SourceLocation AttrNameLoc = Tok.getLocation();
Attrs.addNew(AttrName, AttrNameLoc, nullptr, AttrNameLoc, nullptr, 0,		Attrs.addNew(AttrName, AttrNameLoc, nullptr, AttrNameLoc, nullptr, 0,
ParsedAttr::AS_Keyword);		ParsedAttr::AS_Keyword);
}		}

void Parser::ParseNullabilityTypeSpecifiers(ParsedAttributes &attrs) {		void Parser::ParseNullabilityTypeSpecifiers(ParsedAttributes &attrs) {
(2,777 lines not shown)	case tok::kw___pascal:
ParseBorlandTypeAttributes(DS.getAttributes());		ParseBorlandTypeAttributes(DS.getAttributes());
continue;		continue;

// OpenCL single token adornments.		// OpenCL single token adornments.
case tok::kw___kernel:		case tok::kw___kernel:
ParseOpenCLKernelAttributes(DS.getAttributes());		ParseOpenCLKernelAttributes(DS.getAttributes());
continue;		continue;

		// CUDA/HIP single token adornments.
		case tok::kw___noinline__:
		ParseCUDAFunctionAttributes(DS.getAttributes());
		continue;

// Nullability type specifiers.		// Nullability type specifiers.
case tok::kw__Nonnull:		case tok::kw__Nonnull:
case tok::kw__Nullable:		case tok::kw__Nullable:
case tok::kw__Nullable_result:		case tok::kw__Nullable_result:
case tok::kw__Null_unspecified:		case tok::kw__Null_unspecified:
ParseNullabilityTypeSpecifiers(DS.getAttributes());		ParseNullabilityTypeSpecifiers(DS.getAttributes());
continue;		continue;

(3,856 lines not shown)

clang/test/CodeGenCUDA/noinline.cu

This file was added.

// Uses -O2 since the default -O0 option adds noinline to all functions.

aaron.ballman:
- // optimization is needed, otherwise by default all functions have noinline.
+ // Optimization is needed, otherwise by default all functions have noinline.
// RUN: %clang_cc1 -triple nvptx-nvidia-cuda -fcuda-is-device \
I've asked @erichkeane to weigh in on whether there's a better approach here than specifying an optimization level.

erichkeane: You don't need to do this, it looks like all you're trying to do is keep 'clang' out of `O0` mode. However, what you do NOT want is the optimizations to run. The common way to do that is to combine O1/O2/etc like: -O2 -disable-llvm-passes
This will keep clang in O2 mode, but will keep the optimizer from running anything, which might mess with the test later on.

yaxunl: will use -O2 -disable-llvm-passes

// RUN: %clang_cc1 -triple nvptx-nvidia-cuda -fcuda-is-device \
// RUN: -O2 -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s
// RUN: %clang_cc1 -triple amdgcn-amd-amdhsa -fcuda-is-device \
// RUN: -O2 -disable-llvm-passes -emit-llvm -o - -x hip %s | FileCheck %s
// RUN: %clang_cc1 -triple x86_64-unknown-gnu-linux \
// RUN: -O2 -disable-llvm-passes -emit-llvm -o - %s | FileCheck %s

#include "Inputs/cuda.h"

__noinline__ __device__ __host__ void fun1() {}
__attribute__((noinline)) __device__ __host__ void fun2() {}
__attribute__((__noinline__)) __device__ __host__ void fun3() {}
[[gnu::__noinline__]] __device__ __host__ void fun4() {}

#define __noinline__ __attribute__((__noinline__))
__noinline__ __device__ __host__ void fun5() {}

__device__ __host__ void fun6() {}

// CHECK: define{{.*}}@_Z4fun1v{{.*}}#[[ATTR1:[0-9]*]]
// CHECK: define{{.*}}@_Z4fun2v{{.*}}#[[ATTR1:[0-9]*]]
// CHECK: define{{.*}}@_Z4fun3v{{.*}}#[[ATTR1:[0-9]*]]
// CHECK: define{{.*}}@_Z4fun4v{{.*}}#[[ATTR1:[0-9]*]]
// CHECK: define{{.*}}@_Z4fun5v{{.*}}#[[ATTR1:[0-9]*]]
// CHECK: define{{.*}}@_Z4fun6v{{.*}}#[[ATTR2:[0-9]*]]
// CHECK: attributes #[[ATTR1]] = {{.*}}noinline
// CHECK-NOT: attributes #[[ATTR2]] = {{.*}}noinline

clang/test/Lexer/has_feature.cu

This file was added.

				// RUN: %clang_cc1 -E -triple x86_64-linux-gnu %s -o - | FileCheck %s

				// CHECK: has_noinline_keyword
				#if __has_feature(cuda_noinline_keyword)
				int has_noinline_keyword();
				#else
				int no_noinine_keyword();
				#endif
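
The feature test above is what lets a header decide whether it still needs a macro. The following is a hedged sketch of how a header might use it, assuming a GCC- or Clang-compatible compiler. `EXAMPLE_NOINLINE` and `exampleAnswer` are hypothetical names made up for illustration; this is not taken from the CUDA or HIP runtime headers.

```cpp
// Compilers without __has_feature get a fallback that reports "not available".
#ifndef __has_feature
#define __has_feature(x) 0
#endif

// EXAMPLE_NOINLINE is a hypothetical macro name used only in this sketch.
#if __has_feature(cuda_noinline_keyword)
// The compiler reports the keyword, so no attribute macro is needed.
#define EXAMPLE_NOINLINE __noinline__
#else
// Fall back to the GNU attribute spelling.
#define EXAMPLE_NOINLINE __attribute__((noinline))
#endif

// Either branch leaves this function marked as not inlinable.
EXAMPLE_NOINLINE int exampleAnswer() { return 42; }
```

Using a separately named macro here avoids redefining `__noinline__` itself, which is the pattern that conflicts with library headers spelling the attribute as `__attribute__((__noinline__))`.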

clang/test/SemaCUDA/noinline.cu

This file was added.

				// RUN: %clang_cc1 -fsyntax-only -verify=cuda %s
				// RUN: %clang_cc1 -fsyntax-only -verify=cuda -pedantic %s
				// RUN: %clang_cc1 -fsyntax-only -verify=cpp -x c++ %s

				// cuda-no-diagnostics

				__noinline__ void fun1() { } // cpp-error {{unknown type name '__noinline__'}}

				__attribute__((noinline)) void fun2() { }
				aaron.ballman: I think there should also be a test like:
				```
				[[gnu::__noinline__]] void fun4() {}
				```
				to verify that the double square bracket syntax also correctly handles this being a keyword now (I expect the test to pass).
				yaxunl: will do
				aaron.ballman: Ah, I just noticed we also have no tests for the behavior of the keyword in the presence of the macro being defined. e.g.,
				```
				#define __noinline__ __attribute__((__noinline__))
				__noinline__ void fun5() {}
				```
				yaxunl: will do
				aaron.ballman: I missed an important detail -- I think this is now going to generate a warning in `-pedantic` mode (through `-Wkeyword-macro`) when compiling for CUDA; is that going to be a problem for CUDA headers, or are those always included as a system header (and so the diagnostics will be suppressed)?
				yaxunl: I could not find how clang driver adds CUDA include path https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Cuda.cpp#L284 @tra do you know how CUDA include path is added? is it done by CMake? For HIP the HIP include path is added as a system include path by clang driver.
				aaron.ballman: Whatever we find out, we can emulate its behavior here in the test file to see what the diagnostic behavior will be (you can use GNU linemarkers to convince the compiler parts of the source are in a system header).
				yaxunl: will add tests for that. It seems no matter it is system header or normal header, no warnings are emitted even with -pedantic.
				aaron.ballman: Excellent, thank you!
				tra: CUDA includes are added via `-internal-isystem` here: https://github.com/llvm/llvm-project/blob/main/clang/lib/Driver/ToolChains/Cuda.cpp#L892
				__attribute__((__noinline__)) void fun3() { }
				[[gnu::__noinline__]] void fun4() { }

				#define __noinline__ __attribute__((__noinline__))
				__noinline__ void fun5() {}

				#undef __noinline__
				#10 "cuda.h" 3
				#define __noinline__ __attribute__((__noinline__))
				__noinline__ void fun6() {}

Diff 428423
