
yaxunl (Yaxun Liu)
User

Projects

User does not belong to any projects.

User Details

User Since
May 13 2015, 10:16 AM (373 w, 21 h)

Recent Activity

Tue, Jul 5

yaxunl accepted D128923: [LinkerWrapper] Add AMDGPU specific options to the LLD invocation.

LGTM. Thanks.

Tue, Jul 5, 10:34 AM · Restricted Project, Restricted Project
yaxunl accepted D128850: [HIP] Generate offloading entries for HIP with the new driver.

LGTM. Thanks.

Tue, Jul 5, 10:30 AM · Restricted Project, Restricted Project

Fri, Jun 24

yaxunl added a comment to D128346: [AMDGPU] Disable the private segment size fallback to 16k.

Will this patch cause HIP apps using the dynamic stack to fail if HIP runtime change is not in place?

Fri, Jun 24, 8:30 AM · Restricted Project, Restricted Project
yaxunl committed rG8ad4c6e4b129: [HIP] add -fhip-kernel-arg-name (authored by yaxunl).
[HIP] add -fhip-kernel-arg-name
Fri, Jun 24, 8:16 AM · Restricted Project, Restricted Project
yaxunl closed D128022: [HIP] add -fhip-kernel-arg-name.
Fri, Jun 24, 8:16 AM · Restricted Project, Restricted Project

Thu, Jun 23

yaxunl added inline comments to D128022: [HIP] add -fhip-kernel-arg-name.
Thu, Jun 23, 8:27 PM · Restricted Project, Restricted Project
yaxunl added inline comments to D128022: [HIP] add -fhip-kernel-arg-name.
Thu, Jun 23, 9:16 AM · Restricted Project, Restricted Project

Wed, Jun 22

yaxunl added inline comments to D128022: [HIP] add -fhip-kernel-arg-name.
Wed, Jun 22, 12:06 PM · Restricted Project, Restricted Project
yaxunl added a comment to D127142: [HIP] Link with clang_rt.builtins.

Magically deciding a default value for --unwindlib or --rtlib is not nice. You may emit a warning if the selected default happens to be incompatible with HIP.

Wed, Jun 22, 8:13 AM · Restricted Project

Tue, Jun 21

yaxunl updated the diff for D127142: [HIP] Link with clang_rt.builtins.

add -unwindlib=libgcc by default for --hip-link since -rtlib=compiler-rt needs it

Tue, Jun 21, 8:09 AM · Restricted Project

Fri, Jun 17

yaxunl added a comment to D127142: [HIP] Link with clang_rt.builtins.

If I use --rtlib=compiler-rt, does that also require --unwindlib=unwindlib?

No. --unwindlib=libunwind requires --rtlib=compiler-rt. --rtlib=compiler-rt is compatible with both --unwindlib=libgcc and --unwindlib=libunwind.

If I only use -rtlib=compiler-rt without changing the unwind lib, I get missing-symbol errors:

[2022-06-16T20:05:28.644Z] ld.lld: error: undefined symbol: _Unwind_Resume
[2022-06-16T20:05:28.644Z] >>> referenced by main.cpp
[2022-06-16T20:05:28.644Z] >>> CMakeFiles/MIOpenDriver.dir/main.cpp.o:(generate_skipahead_file())
[2022-06-16T20:05:28.644Z] >>> referenced by main.cpp
[2022-06-16T20:05:28.644Z] >>> CMakeFiles/MIOpenDriver.dir/main.cpp.o:(main)
[2022-06-16T20:05:28.644Z] >>> referenced by main.cpp
[2022-06-16T20:05:28.644Z] >>> CMakeFiles/MIOpenDriver.dir/main.cpp.o:(std::function<void (float&, float, bool&)> reduce::ReduceOpFn2<float>(miopenReduceTensorOp_t))
[2022-06-16T20:05:28.644Z] >>> referenced 1246 more times
[2022-06-16T20:05:28.644Z]
[2022-06-16T20:05:28.644Z] ld.lld: error: ../lib/libMIOpen.so.1.0.50300: undefined reference to _Unwind_Resume [--no-allow-shlib-undefined]
[2022-06-16T20:05:28.644Z] ld.lld: error: ../lib/libMIOpen.so.1.0.50300: undefined reference to _Unwind_Backtrace [--no-allow-shlib-undefined]
[2022-06-16T20:05:28.644Z] ld.lld: error: ../lib/libMIOpen.so.1.0.50300: undefined reference to _Unwind_GetIP [--no-allow-shlib-undefined]
[2022-06-16T20:05:28.644Z] clang-15: error: linker command failed with exit code 1 (use -v to see invocation)

Fri, Jun 17, 8:25 AM · Restricted Project

Thu, Jun 16

yaxunl requested review of D128022: [HIP] add -fhip-kernel-arg-name.
Thu, Jun 16, 7:26 PM · Restricted Project, Restricted Project
yaxunl added a comment to D127142: [HIP] Link with clang_rt.builtins.

If I use --rtlib=compiler-rt, does that also require --unwindlib=unwindlib?

No. --unwindlib=libunwind requires --rtlib=compiler-rt. --rtlib=compiler-rt is compatible with both --unwindlib=libgcc and --unwindlib=libunwind.
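
The compatibility rule in this reply can be written down as a small predicate. This is only a sketch of the rule as stated in the thread, not the clang driver's actual logic: `isSupportedCombination` is a hypothetical helper, and it models only the values named in the comments (`compiler-rt`, `libgcc`, `libunwind`).

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: encodes the pairing rule quoted in the review thread.
// It is not taken from the clang driver sources.
bool isSupportedCombination(const std::string &RtLib,
                            const std::string &UnwindLib) {
  // --rtlib=compiler-rt works with either unwinder.
  if (RtLib == "compiler-rt")
    return UnwindLib == "libgcc" || UnwindLib == "libunwind";
  // --rtlib=libgcc requires --unwindlib=libgcc, so libunwind is rejected.
  if (RtLib == "libgcc")
    return UnwindLib == "libgcc";
  // Any other value is outside what the thread discusses.
  return false;
}
```

Under these assumptions, `isSupportedCombination("libgcc", "libunwind")` is the one rejected pairing among the four named combinations.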

Thu, Jun 16, 6:37 PM · Restricted Project
yaxunl updated the diff for D127142: [HIP] Link with clang_rt.builtins.

use compiler-rt as runtime lib by default for --hip-link

Thu, Jun 16, 10:12 AM · Restricted Project

Tue, Jun 14

yaxunl committed rGaf9ee3357cec: [HIP] fix long double size (authored by yaxunl).
[HIP] fix long double size
Tue, Jun 14, 6:58 PM · Restricted Project, Restricted Project
yaxunl closed D127771: [HIP] fix long double size.
Tue, Jun 14, 6:58 PM · Restricted Project, Restricted Project
yaxunl added a comment to D127771: [HIP] fix long double size.

AFAICT, the test case you've added works fine with the compiler at HEAD: https://cuda.godbolt.org/z/q3xYMfdeb
I guess it only shows up in assertion-enabled builds. Can you check what happens if you run the test case compiled as CUDA? I suspect it will have the same issue -- we don't override setAuxTarget at all.

Tue, Jun 14, 11:49 AM · Restricted Project, Restricted Project
yaxunl requested review of D127771: [HIP] fix long double size.
Tue, Jun 14, 11:01 AM · Restricted Project, Restricted Project

Thu, Jun 9

yaxunl added a comment to D127142: [HIP] Link with clang_rt.builtins.

These functions are not available in libgcc but in libclang_rt.builtins. Therefore --hip-link needs to link with libclang_rt.builtins by default.

I think this is problematic.

The current link sequence is ... "-lamdhip64" "/tmp/Debug/lib/clang/15.0.0/lib/linux/libclang_rt.builtins-x86_64.a" "-lgcc" "--as-needed" "-lgcc_s" "--no-as-needed" "-lc" "-lgcc" "--as-needed" "-lgcc_s" "--no-as-needed" "/usr/lib/gcc/x86_64-linux-gnu/12/crtendS.o" "/lib/x86_64-linux-gnu/crtn.o"

Linking in libclang_rt.builtins is incompatible with libgcc. There may be subtle symbol resolving issues. I don't think silently changing the behavior is desired.
Perhaps --rocm-path should require --rtlib=compiler-rt. See the --rtlib=libgcc requires --unwindlib=libgcc diagnostic.

Your user will need to do a bit more work if CLANG_DEFAULT_RTLIB is not compiler-rt.

Thu, Jun 9, 8:17 AM · Restricted Project
yaxunl added a comment to D127267: [NVPTX] Add setAuxTarget override rather than make a new TargetInfo.

This patch is to fix an issue, right? At least we need a test to prevent that issue from happening again.

Thu, Jun 9, 8:06 AM · Restricted Project, Restricted Project

Wed, Jun 8

yaxunl added a comment to D127267: [NVPTX] Add setAuxTarget override rather than make a new TargetInfo.

need a test

Wed, Jun 8, 7:58 AM · Restricted Project, Restricted Project

Jun 7 2022

yaxunl updated the diff for D127142: [HIP] Link with clang_rt.builtins.

use getCompilerRT to get compiler-rt lib path

Jun 7 2022, 7:17 AM · Restricted Project
yaxunl added inline comments to D127142: [HIP] Link with clang_rt.builtins.
Jun 7 2022, 7:16 AM · Restricted Project

Jun 6 2022

yaxunl requested review of D127142: [HIP] Link with clang_rt.builtins.
Jun 6 2022, 12:00 PM · Restricted Project

May 31 2022

yaxunl committed rG92a606f6de77: [HIP] Pass -Xoffload-linker option to device linker (authored by yaxunl).
[HIP] Pass -Xoffload-linker option to device linker
May 31 2022, 7:39 PM · Restricted Project, Restricted Project
yaxunl closed D126704: [HIP] Pass -Xoffload-linker option to device linker.
May 31 2022, 7:38 PM · Restricted Project, Restricted Project
yaxunl committed rG377806a65ea9: [HIP] Fix static lib name on windows (authored by yaxunl).
[HIP] Fix static lib name on windows
May 31 2022, 7:14 PM · Restricted Project, Restricted Project
yaxunl closed D126681: [HIP] Fix static lib name on windows.
May 31 2022, 7:14 PM · Restricted Project, Restricted Project
yaxunl added inline comments to D126681: [HIP] Fix static lib name on windows.
May 31 2022, 2:51 PM · Restricted Project, Restricted Project
yaxunl requested review of D126704: [HIP] Pass -Xoffload-linker option to device linker.
May 31 2022, 8:21 AM · Restricted Project, Restricted Project

May 30 2022

yaxunl requested review of D126681: [HIP] Fix static lib name on windows.
May 30 2022, 5:07 PM · Restricted Project, Restricted Project

May 25 2022

yaxunl added a comment to D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given.

How much work would it take to add cuid generation in the new driver, similar to what the old driver does, using the same logic, however imperfect it is? I'd be OK with that as a possibly permanent solution.

I'm somewhat wary of temporary solutions as they tend to become permanent and age poorly.
That said, I'm OK with someone else tie-breaking us here.
@yaxunl -- Sam, do you have an opinion?

May 25 2022, 12:22 PM · Restricted Project, Restricted Project

May 24 2022

yaxunl added a comment to D125970: [amdgpu] Add amdgpu_kernel calling conv attribute to clang.

In HIP, kernels are represented by attribute global and not by calling convention in clang. This may be an alternative.

May 24 2022, 9:34 AM · Restricted Project, Restricted Project
yaxunl added inline comments to D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given.
May 24 2022, 9:20 AM · Restricted Project, Restricted Project
yaxunl added inline comments to D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given.
May 24 2022, 9:00 AM · Restricted Project, Restricted Project

May 19 2022

yaxunl accepted D125970: [amdgpu] Add amdgpu_kernel calling conv attribute to clang.

LGTM. Thanks.

May 19 2022, 2:38 PM · Restricted Project, Restricted Project
yaxunl added a comment to D125970: [amdgpu] Add amdgpu_kernel calling conv attribute to clang.

need a codegen test to make sure amdgpu_kernel ABI is used in C/C++ for functions with this attribute

We've already got tests that check the amdgpu_kernel calling conv is lowered correctly. This change only adds it to clang, so checking the IR seems sufficient. I can copy/paste/mod that test case if you like, but it doesn't give any extra coverage that I can see

May 19 2022, 1:20 PM · Restricted Project, Restricted Project
yaxunl added a comment to D125970: [amdgpu] Add amdgpu_kernel calling conv attribute to clang.

need a codegen test to make sure amdgpu_kernel ABI is used in C/C++ for functions with this attribute. https://github.com/llvm/llvm-project/blob/main/clang/test/CodeGenCUDA/amdgpu-kernel-arg-pointer-type.cu#L64 may be used as an example.

May 19 2022, 12:39 PM · Restricted Project, Restricted Project
yaxunl committed rG559b8fc17ef6: [AMDGPU] emit macro __GFX9__ etc (authored by yaxunl).
[AMDGPU] emit macro __GFX9__ etc
May 19 2022, 9:18 AM · Restricted Project, Restricted Project
yaxunl closed D125909: [AMDGPU] emit macro __GFX9__ etc.
May 19 2022, 9:18 AM · Restricted Project, Restricted Project
yaxunl committed rGcefe472c51fb: [clang] Fix __has_builtin (authored by yaxunl).
[clang] Fix __has_builtin
May 19 2022, 8:35 AM · Restricted Project, Restricted Project
yaxunl closed D125829: [clang] Fix __has_builtin.
May 19 2022, 8:35 AM · Restricted Project, Restricted Project

May 18 2022

yaxunl added inline comments to D125909: [AMDGPU] emit macro __GFX9__ etc.
May 18 2022, 10:52 AM · Restricted Project, Restricted Project
yaxunl added inline comments to D125829: [clang] Fix __has_builtin.
May 18 2022, 10:43 AM · Restricted Project, Restricted Project
yaxunl requested review of D125909: [AMDGPU] emit macro __GFX9__ etc.
May 18 2022, 10:16 AM · Restricted Project, Restricted Project
yaxunl updated the diff for D125829: [clang] Fix __has_builtin.

revised by Artem's comments

May 18 2022, 9:49 AM · Restricted Project, Restricted Project
yaxunl added inline comments to D125829: [clang] Fix __has_builtin.
May 18 2022, 9:48 AM · Restricted Project, Restricted Project

May 17 2022

yaxunl added inline comments to D125555: [clang] Add __has_target_feature.
May 17 2022, 12:19 PM · Restricted Project
yaxunl requested review of D125829: [clang] Fix __has_builtin.
May 17 2022, 12:18 PM · Restricted Project, Restricted Project
yaxunl added a comment to D125705: [OpenMP] Don't build the offloading driver without a source input.

HIP toolchain allows clang driver to compile bundled bitcode or assembly for mixed host/device compilation or device-only multi-GPU compilation.

e.g.

clang --offload-arch=gfx906 --offload-arch=gfx908 a.bc b.s

Can you add a test to make sure this does not break HIP toolchain? Thanks.

This only changes the new driver, which doesn't support HIP right now anyway. This should be captured by some existing tests but I could try to dig them up

May 17 2022, 9:09 AM · Restricted Project, Restricted Project
yaxunl added a comment to D125705: [OpenMP] Don't build the offloading driver without a source input.

HIP toolchain allows clang driver to compile bundled bitcode or assembly for mixed host/device compilation or device-only multi-GPU compilation.

May 17 2022, 8:58 AM · Restricted Project, Restricted Project

May 16 2022

yaxunl added inline comments to D125555: [clang] Add __has_target_feature.
May 16 2022, 7:47 AM · Restricted Project
yaxunl added inline comments to D125555: [clang] Add __has_target_feature.
May 16 2022, 7:43 AM · Restricted Project

May 13 2022

yaxunl added inline comments to D125555: [clang] Add __has_target_feature.
May 13 2022, 10:21 AM · Restricted Project
yaxunl updated the diff for D125555: [clang] Add __has_target_feature.

fix typo

May 13 2022, 9:12 AM · Restricted Project
yaxunl requested review of D125555: [clang] Add __has_target_feature.
May 13 2022, 9:07 AM · Restricted Project

May 12 2022

yaxunl added a comment to D122734: [CUDA][HIP] Fix mangling number for local struct.

Hi,

I noticed when compiling with gcc 9.3.0 that we get a bunch of new warnings with this patch:

[1/351] Building CXX object tools/clang/lib/AST/CMakeFiles/obj.clangAST.dir/MicrosoftCXXABI.cpp.o
../../clang/lib/AST/MicrosoftCXXABI.cpp:57:12: warning: 'virtual unsigned int {anonymous}::MicrosoftNumberingContext::getManglingNumber(const clang::VarDecl*, unsigned int)' was hidden [-Woverloaded-virtual]
   57 |   unsigned getManglingNumber(const VarDecl *VD,
      |            ^~~~~~~~~~~~~~~~~
../../clang/lib/AST/MicrosoftCXXABI.cpp:80:12: warning:   by 'virtual unsigned int {anonymous}::MSHIPNumberingContext::getManglingNumber(const clang::TagDecl*, unsigned int)' [-Woverloaded-virtual]
   80 |   unsigned getManglingNumber(const TagDecl *TD,
      |            ^~~~~~~~~~~~~~~~~
May 12 2022, 9:06 AM · Restricted Project, Restricted Project
yaxunl committed rG0f292141aadb: [clang]Silence warning in MicrosoftCXXABI.cpp (authored by yaxunl).
[clang]Silence warning in MicrosoftCXXABI.cpp
May 12 2022, 9:05 AM · Restricted Project, Restricted Project
yaxunl added a comment to D122734: [CUDA][HIP] Fix mangling number for local struct.

Hi,

I noticed when compiling with gcc 9.3.0 that we get a bunch of new warnings with this patch:

[1/351] Building CXX object tools/clang/lib/AST/CMakeFiles/obj.clangAST.dir/MicrosoftCXXABI.cpp.o
../../clang/lib/AST/MicrosoftCXXABI.cpp:57:12: warning: 'virtual unsigned int {anonymous}::MicrosoftNumberingContext::getManglingNumber(const clang::VarDecl*, unsigned int)' was hidden [-Woverloaded-virtual]
   57 |   unsigned getManglingNumber(const VarDecl *VD,
      |            ^~~~~~~~~~~~~~~~~
../../clang/lib/AST/MicrosoftCXXABI.cpp:80:12: warning:   by 'virtual unsigned int {anonymous}::MSHIPNumberingContext::getManglingNumber(const clang::TagDecl*, unsigned int)' [-Woverloaded-virtual]
   80 |   unsigned getManglingNumber(const TagDecl *TD,
      |            ^~~~~~~~~~~~~~~~~
../../clang/lib/AST/MicrosoftCXXABI.cpp:46:12: warning: 'virtual unsigned int {anonymous}::MicrosoftNumberingContext::getManglingNumber(const clang::BlockDecl*)' was hidden [-Woverloaded-virtual]
   46 |   unsigned getManglingNumber(const BlockDecl *BD) override {
      |            ^~~~~~~~~~~~~~~~~
../../clang/lib/AST/MicrosoftCXXABI.cpp:80:12: warning:   by 'virtual unsigned int {anonymous}::MSHIPNumberingContext::getManglingNumber(const clang::TagDecl*, unsigned int)' [-Woverloaded-virtual]
   80 |   unsigned getManglingNumber(const TagDecl *TD,
      |            ^~~~~~~~~~~~~~~~~
../../clang/lib/AST/MicrosoftCXXABI.cpp:42:12: warning: 'virtual unsigned int {anonymous}::MicrosoftNumberingContext::getManglingNumber(const clang::CXXMethodDecl*)' was hidden [-Woverloaded-virtual]
   42 |   unsigned getManglingNumber(const CXXMethodDecl *CallOperator) override {
      |            ^~~~~~~~~~~~~~~~~
../../clang/lib/AST/MicrosoftCXXABI.cpp:80:12: warning:   by 'virtual unsigned int {anonymous}::MSHIPNumberingContext::getManglingNumber(const clang::TagDecl*, unsigned int)' [-Woverloaded-virtual]
   80 |   unsigned getManglingNumber(const TagDecl *TD,
      |            ^~~~~~~~~~~~~~~~~

No idea if it's important or if gcc is overly picky.
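
For context, the pattern behind these warnings can be reproduced in a few lines. The sketch below uses hypothetical stand-in types (`VarDecl`, `TagDecl`, `NumberingContext`, `FixedContext`) rather than the real clang classes, and shows one common way to avoid the hiding, a using-declaration. Whether the follow-up commit used this approach is not stated in the feed.

```cpp
#include <cassert>

// Hypothetical stand-ins for the declaration types named in the warning.
struct VarDecl {};
struct TagDecl {};

// Hypothetical base class with one virtual overload.
struct NumberingContext {
  virtual ~NumberingContext() = default;
  virtual unsigned getManglingNumber(const VarDecl *, unsigned Scope) {
    return Scope + 1;
  }
};

// A derived class that declares a new overload with the same name hides the
// base overloads, which is what -Woverloaded-virtual reports. The
// using-declaration brings the base overloads back into scope.
struct FixedContext : NumberingContext {
  using NumberingContext::getManglingNumber;
  virtual unsigned getManglingNumber(const TagDecl *, unsigned Scope) {
    return Scope + 100;
  }
};

// Both overloads are reachable through the derived type.
unsigned numberVar(FixedContext &Ctx, const VarDecl &VD) {
  return Ctx.getManglingNumber(&VD, 0);
}
unsigned numberTag(FixedContext &Ctx, const TagDecl &TD) {
  return Ctx.getManglingNumber(&TD, 0);
}
```

Without the using-declaration, `numberVar` would not compile, because the only visible `getManglingNumber` would take a `const TagDecl *`.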

May 12 2022, 8:15 AM · Restricted Project, Restricted Project

May 11 2022

yaxunl committed rG84db35594953: [clang] Fix KEYALL (authored by yaxunl).
[clang] Fix KEYALL
May 11 2022, 11:30 AM · Restricted Project, Restricted Project
yaxunl closed D125396: [clang] Fix KEYALL.
May 11 2022, 11:29 AM · Restricted Project, Restricted Project
yaxunl added inline comments to D125165: [Clang] Introduce clang-offload-packager tool to bundle device files.
May 11 2022, 11:26 AM · Restricted Project, Restricted Project
yaxunl added inline comments to D125165: [Clang] Introduce clang-offload-packager tool to bundle device files.
May 11 2022, 10:23 AM · Restricted Project, Restricted Project
yaxunl added a comment to D125165: [Clang] Introduce clang-offload-packager tool to bundle device files.

We could add a "clang-offload-bundler and clang-offload-wrapper are deprecated, replace them with $whatever" in the release notes and then remove them a release later, assuming the replacement is straightforward.

I think it is still too early to say clang-offload-bundler is deprecated. It is used by HIP toolchain and has functionality currently not available in clang-offload-packager.

If I read the above right, jhuber says it's been merged into clang itself, not that it's being replaced by clang-offload-packager (?)

I'll clarify: the functionality of the clang-offload-bundler is to embed device files into the host. I now do this directly in clang by creating a global string in the LLVM-IR of the host rather than calling a tool. The HIP toolchain still uses the clang-offload-bundler, but I'm planning on putting patches up to move away from that. The current clang-offload-bundler and this new tool have different purposes; this one simply creates a binary that can then be embedded into the host. There is still functionality that the clang-offload-bundler provides that I don't intend to replace, namely the bundling and un-bundling of text files. I don't think we want to stick with the clang-offload-bundler approach, because the files that the clang-offload-bundler spat out weren't valid input to the rest of LLVM, e.g. clang -S -emit-llvm --offload-arch=gfx908 foo.hip -o - | opt would break.

May 11 2022, 9:43 AM · Restricted Project, Restricted Project
yaxunl added a comment to D125165: [Clang] Introduce clang-offload-packager tool to bundle device files.

We could add a "clang-offload-bundler and clang-offload-wrapper are deprecated, replace them with $whatever" in the release notes and then remove them a release later, assuming the replacement is straightforward.

I think it is still too early to say clang-offload-bundler is deprecated. It is used by HIP toolchain and has functionality currently not available in clang-offload-packager.

If I read the above right, jhuber says it's been merged into clang itself, not that it's being replaced by clang-offload-packager (?)

May 11 2022, 9:03 AM · Restricted Project, Restricted Project
yaxunl added a comment to D125396: [clang] Fix KEYALL.

@yaxunl Thanks for addressing my feedback so quickly. I think the commit message should also mention that KEYCUDA is now included in KEYALL. Other than that LGTM.

May 11 2022, 8:55 AM · Restricted Project, Restricted Project
yaxunl added a comment to D125165: [Clang] Introduce clang-offload-packager tool to bundle device files.

We could add a "clang-offload-bundler and clang-offload-wrapper are deprecated, replace them with $whatever" in the release notes and then remove them a release later, assuming the replacement is straightforward.

May 11 2022, 8:52 AM · Restricted Project, Restricted Project
yaxunl added inline comments to D124866: [CUDA][HIP] support __noinline__ as keyword.
May 11 2022, 8:48 AM · Restricted Project, Restricted Project
yaxunl requested review of D125396: [clang] Fix KEYALL.
May 11 2022, 8:46 AM · Restricted Project, Restricted Project
yaxunl added inline comments to D124866: [CUDA][HIP] support __noinline__ as keyword.
May 11 2022, 8:19 AM · Restricted Project, Restricted Project

May 10 2022

yaxunl added inline comments to D124866: [CUDA][HIP] support __noinline__ as keyword.
May 10 2022, 7:52 PM · Restricted Project, Restricted Project
yaxunl committed rG180a8536cec8: Fix indentation in ReleaseNotes.rst (authored by yaxunl).
Fix indentation in ReleaseNotes.rst
May 10 2022, 11:57 AM · Restricted Project, Restricted Project
yaxunl committed rGafc9d674fe5a: [CUDA][HIP] support __noinline__ as keyword (authored by yaxunl).
[CUDA][HIP] support __noinline__ as keyword
May 10 2022, 11:35 AM · Restricted Project, Restricted Project
yaxunl closed D124866: [CUDA][HIP] support __noinline__ as keyword.
May 10 2022, 11:34 AM · Restricted Project, Restricted Project
yaxunl updated the diff for D124866: [CUDA][HIP] support __noinline__ as keyword.

make it a feature, add tests for pedantic, fix release notes and documentation

May 10 2022, 10:13 AM · Restricted Project, Restricted Project
yaxunl added inline comments to D124866: [CUDA][HIP] support __noinline__ as keyword.
May 10 2022, 10:10 AM · Restricted Project, Restricted Project
yaxunl added inline comments to D124866: [CUDA][HIP] support __noinline__ as keyword.
May 10 2022, 9:30 AM · Restricted Project, Restricted Project
yaxunl added inline comments to D125165: [Clang] Introduce clang-offload-packager tool to bundle device files.
May 10 2022, 8:20 AM · Restricted Project, Restricted Project

May 9 2022

yaxunl updated the diff for D124866: [CUDA][HIP] support __noinline__ as keyword.

removed diagnostics and added more tests

May 9 2022, 12:36 PM · Restricted Project, Restricted Project
yaxunl added a comment to D124866: [CUDA][HIP] support __noinline__ as keyword.

__forceinline__ does not have the issue that __noinline__ has, since it is not a GCC attribute. The current CUDA/HIP implementation of __forceinline__ in header files is sufficient. I do not see the benefit of implementing __forceinline__ as a keyword.

Primarily to reduce user confusion. It's kind of weird for __noinline__ to be a keyword and __forceinline__ to not be a keyword when they're both defined the same way by the CUDA spec. This means you can #undef one of them but not the other, that sort of thing.

May 9 2022, 12:34 PM · Restricted Project, Restricted Project
yaxunl added a comment to D124866: [CUDA][HIP] support __noinline__ as keyword.

Should we do __forceinline__ at the same time so that there's consistency?

May 9 2022, 10:03 AM · Restricted Project, Restricted Project

May 6 2022

yaxunl added a comment to D124866: [CUDA][HIP] support __noinline__ as keyword.

CUDA/HIP do not have a language spec.

Well. It's not completely true. CUDA programming guide does serve as the de-facto spec for CUDA. It's far from perfect, but it does mention __noinline__ and __forceinline__ as function qualifiers: https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#noinline-and-forceinline

May 6 2022, 12:30 PM · Restricted Project, Restricted Project
yaxunl updated the diff for D124866: [CUDA][HIP] support __noinline__ as keyword.

added release note and documentation

May 6 2022, 11:48 AM · Restricted Project, Restricted Project
yaxunl added inline comments to D124866: [CUDA][HIP] support __noinline__ as keyword.
May 6 2022, 11:46 AM · Restricted Project, Restricted Project
yaxunl updated the diff for D124866: [CUDA][HIP] support __noinline__ as keyword.

revised by Aaron's comments

May 6 2022, 11:16 AM · Restricted Project, Restricted Project
yaxunl added inline comments to D124866: [CUDA][HIP] support __noinline__ as keyword.
May 6 2022, 11:12 AM · Restricted Project, Restricted Project
yaxunl added a comment to D125050: [OpenMP] Try to Infer target triples using the offloading architecture.

could you please add a test to https://github.com/llvm/llvm-project/blob/main/clang/test/Driver/hip-offload-arch.hip

May 6 2022, 10:48 AM · Restricted Project, Restricted Project
yaxunl added inline comments to D125050: [OpenMP] Try to Infer target triples using the offloading architecture.
May 6 2022, 10:41 AM · Restricted Project, Restricted Project
yaxunl added a comment to D123812: [CUDA] Add wrapper code generation for registering CUDA images.

LGTM.

May 6 2022, 8:43 AM · Restricted Project, Restricted Project

May 4 2022

yaxunl committed rG62501bc45a2f: [NFC][CUDA][HIP] rework mangling number for aux target (authored by yaxunl).
[NFC][CUDA][HIP] rework mangling number for aux target
May 4 2022, 10:06 AM · Restricted Project, Restricted Project
yaxunl closed D124842: [NFC][CUDA][HIP] rework mangling number for aux target.
May 4 2022, 10:06 AM · Restricted Project, Restricted Project
yaxunl added inline comments to D124842: [NFC][CUDA][HIP] rework mangling number for aux target.
May 4 2022, 8:27 AM · Restricted Project, Restricted Project

May 3 2022

yaxunl updated the diff for D124866: [CUDA][HIP] support __noinline__ as keyword.

add feature cuda_noinline_keyword to facilitate CUDA/HIP headers removing noinline macro
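
A header could use such a feature test roughly as follows. This is a sketch under assumptions: it takes the check to be spelled `__has_feature(cuda_noinline_keyword)`, it assumes a GCC- or Clang-style `__attribute__`, and `DEMO_NOINLINE` and `addOne` are hypothetical names used only here.

```cpp
#include <cassert>

// Compilers without __has_feature get a fallback that always reports "no".
#ifndef __has_feature
#define __has_feature(x) 0
#endif

// DEMO_NOINLINE is a hypothetical macro used only in this sketch.
#if __has_feature(cuda_noinline_keyword)
// The compiler treats __noinline__ as a keyword, so no macro of that name
// needs to be defined by the header.
#define DEMO_NOINLINE __noinline__
#else
// Fall back to the plain attribute spelling.
#define DEMO_NOINLINE __attribute__((noinline))
#endif

DEMO_NOINLINE int addOne(int X) { return X + 1; }
```

The point of the check is that a header can stop defining a `__noinline__` macro when the compiler already provides the keyword.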

May 3 2022, 11:48 AM · Restricted Project, Restricted Project
yaxunl requested review of D124866: [CUDA][HIP] support __noinline__ as keyword.
May 3 2022, 11:27 AM · Restricted Project, Restricted Project
yaxunl added a reviewer for D124842: [NFC][CUDA][HIP] rework mangling number for aux target: rnk.
May 3 2022, 5:06 AM · Restricted Project, Restricted Project
yaxunl added inline comments to D122734: [CUDA][HIP] Fix mangling number for local struct.
May 3 2022, 5:05 AM · Restricted Project, Restricted Project
yaxunl requested review of D124842: [NFC][CUDA][HIP] rework mangling number for aux target.
May 3 2022, 5:03 AM · Restricted Project, Restricted Project

Apr 28 2022

yaxunl committed rG11d3e31c60bd: [CUDA][HIP] Fix mangling number for local struct (authored by yaxunl).
[CUDA][HIP] Fix mangling number for local struct
Apr 28 2022, 4:55 PM · Restricted Project, Restricted Project
yaxunl closed D122734: [CUDA][HIP] Fix mangling number for local struct.
Apr 28 2022, 4:55 PM · Restricted Project, Restricted Project
yaxunl added a comment to D122734: [CUDA][HIP] Fix mangling number for local struct.

ping

Apr 28 2022, 11:10 AM · Restricted Project, Restricted Project