Page MenuHomePhabricator

yaxunl (Yaxun Liu)
User

Projects

User does not belong to any projects.

User Details

User Since
May 13 2015, 10:16 AM (296 w, 6 d)

Recent Activity

Yesterday

yaxunl updated the diff for D85223: [CUDA][HIP] Support accessing static device variable in host code for -fgpu-rdc.

separate CUID patch.

Tue, Jan 19, 3:24 PM
yaxunl requested review of D95007: [CUDA][HIP] Add -fuse-cuid.
Tue, Jan 19, 3:20 PM
yaxunl added a comment to D85223: [CUDA][HIP] Support accessing static device variable in host code for -fgpu-rdc.
In D85223#2507518, @tra wrote:

I'd propose splitting the patch into two. One with the addition of CUID and the other that changes the way we havdle static vars.
CUID is useful on its own and is relatively uncontroversial.

Externalizing static vars is a more interesting issue and I'm not sure what's the best way to handle it yet. On one hand it is necessary for visibility across host/device, on the other, externalizing all static vars will almost always have negative effect as very few of the static vars actually need this. As already pointed out in the #if 0 section of the patch, ideally we should externalize only the vars that need it. Generally speaking, I do not think we will be able to do that, because with -fgpu-rdc it may be used from the host code in some other TU.

We may need to explicitly annotate such the static variables that need to be visible on both sides and only apply externalization to the variables annotated this way. E.g. require them to be __host__ __device__.

WDYT?

Tue, Jan 19, 1:07 PM

Mon, Jan 18

yaxunl added a reviewer for D85223: [CUDA][HIP] Support accessing static device variable in host code for -fgpu-rdc: ronlieb.
Mon, Jan 18, 8:05 AM

Fri, Jan 15

yaxunl added a comment to D85223: [CUDA][HIP] Support accessing static device variable in host code for -fgpu-rdc.

I concede that making the variables external, and trying to give them unique names, does work around static variables not working. I believe static variables are subjected to more aggressive optimisation than external ones but the effect might not be significant.

This "works" in cuda today because the loader ignores the local annotation when accessing the variable. There is some probably unintended behaviour when multiple static variables have the same name in that the first one wins.

The corresponding change to the hsa loader is trivial. Why is making the symbols external, with the associated complexity in picking non-conflicting names, considered better than changing the loader?

Fri, Jan 15, 1:23 PM
yaxunl abandoned D80750: llvm-link: Add module flag behavior MergeTargetID.
Fri, Jan 15, 1:04 PM · Restricted Project
yaxunl abandoned D84822: Add documentation for target ID and ClangOffloadBundlerFormat.
Fri, Jan 15, 1:03 PM · Restricted Project
yaxunl abandoned D84824: [HIP] Emit target-id module flag.
Fri, Jan 15, 1:03 PM
yaxunl abandoned D92363: [HIP] Warn no --offload-arch option.
Fri, Jan 15, 1:02 PM
yaxunl requested review of D94814: [HIP] Support `__managed__` attribute.
Fri, Jan 15, 11:45 AM · Restricted Project

Thu, Jan 14

yaxunl added a comment to D93062: [HIP] Add signbit(long double) decl.

Let's land this fix now. It will take some time to set up the buildbot. I think we already have this issue covered in HIP directed tests, right?

Thu, Jan 14, 9:05 AM · Restricted Project
yaxunl added a comment to D92277: [OpenCL] Refactor of targets OpenCL option settings.

LGTM about r600 changes. Thanks.

Thu, Jan 14, 7:14 AM

Tue, Jan 12

yaxunl added a comment to D93525: [OpenMP] Add unbundling of archives containing bundled object files into device specific archives.

can you document this in ClangOffloadBundler.rst ? I think we need a clear description about how clang-offload-bundler knows which file in the .a file belongs to which target.

Tue, Jan 12, 8:28 AM · Restricted Project

Thu, Jan 7

yaxunl accepted D93638: [hip] Enable HIP compilation with `<complex`> on MSVC..

LGTM. Thanks.

Thu, Jan 7, 10:51 AM · Restricted Project

Wed, Jan 6

yaxunl committed rG90bf3ecef4bb: [clang-offload-bundler] Add option -list (authored by yaxunl).
[clang-offload-bundler] Add option -list
Wed, Jan 6, 1:23 PM
yaxunl closed D92954: [clang-offload-bundler] Add option -list.
Wed, Jan 6, 1:23 PM · Restricted Project
yaxunl updated the diff for D92954: [clang-offload-bundler] Add option -list.

revised by Artem's comments

Wed, Jan 6, 12:09 PM · Restricted Project
yaxunl added inline comments to D92954: [clang-offload-bundler] Add option -list.
Wed, Jan 6, 12:08 PM · Restricted Project
yaxunl added a comment to D93587: [hip] Fix HIP version parsing..

Pls address Artem's comments. LGTM otherwise.

Wed, Jan 6, 7:35 AM · Restricted Project

Tue, Jan 5

yaxunl added inline comments to D93638: [hip] Enable HIP compilation with `<complex`> on MSVC..
Tue, Jan 5, 12:36 PM · Restricted Project

Mon, Jan 4

yaxunl added a comment to D71726: Let clang atomic builtins fetch add/sub support floating point types.

ping

Mon, Jan 4, 8:15 AM
yaxunl added a comment to D92954: [clang-offload-bundler] Add option -list.

ping

Mon, Jan 4, 8:13 AM · Restricted Project

Dec 18 2020

yaxunl added a comment to D92954: [clang-offload-bundler] Add option -list.

ping

Dec 18 2020, 1:03 PM · Restricted Project

Dec 16 2020

yaxunl committed rGb9fb063e63c7: [clang-offload-bundler] Add option -allow-missing-bundles (authored by yaxunl).
[clang-offload-bundler] Add option -allow-missing-bundles
Dec 16 2020, 11:53 AM
yaxunl closed D93068: [clang-offload-bundler] Add option -allow-missing-bundles.
Dec 16 2020, 11:53 AM · Restricted Project
yaxunl accepted D93258: [amdgpu] Default to code object v3.

LGTM. Thanks.

Dec 16 2020, 11:10 AM · Restricted Project, Restricted Project
yaxunl added a comment to D93068: [clang-offload-bundler] Add option -allow-missing-bundles.

@ABataev Is this patch OK for OpenMP? It is NFC for OpenMP toolchain but affects using clang-offload-bundler as a standalone tool. Thanks.

Dec 16 2020, 7:42 AM · Restricted Project
yaxunl added a reviewer for D93068: [clang-offload-bundler] Add option -allow-missing-bundles: ABataev.
Dec 16 2020, 7:39 AM · Restricted Project
yaxunl added a comment to D93258: [amdgpu] Default to code object v3.

Thanks for fixing the lit tests. Using regex is the right choice.

Dec 16 2020, 7:22 AM · Restricted Project, Restricted Project

Dec 15 2020

yaxunl committed rG4f14b80803a4: [HIP] unbundle bundled preprocessor output (authored by yaxunl).
[HIP] unbundle bundled preprocessor output
Dec 15 2020, 7:15 PM
yaxunl closed D92720: [HIP] unbundle bundled preprocessor output.
Dec 15 2020, 7:14 PM · Restricted Project

Dec 14 2020

yaxunl added a comment to D93258: [amdgpu] Default to code object v3.

Thanks! Just saw the CI emails come through. I didn't realise code object version was hardcoded across various tests. That's a weird thing to do when it has no effect on the generated IR.

@yaxunl This would never pass the gerrit testing. AMD internal has moved to code object v4 so changing the default to v3 will break internal tests that assume that. The hsa runtime that can load v4 objects hasn't shipped, either in rocm or as source in github, and the backend that generates v4 object files also hasn't shipped, so defaulting to v4 in the trunk front end is a poor choice.

Dec 14 2020, 8:44 PM · Restricted Project, Restricted Project
yaxunl added a comment to D92720: [HIP] unbundle bundled preprocessor output.
In D92720#2453277, @tra wrote:

Output of -E for HIP combined host/device compilation is a plain text. It has C++ comments inserted between preprocessor outputs for host and different GPU arch's. The C++ comments follow the format of clang-offload-bundler bundled text files therefore clang-offload-bundler is able to unbundle it.

OK.

This actually exposed a minor issue. Using something that may legitemately occur in the user source as a separator is rather easy to break.
E.g. there's nothing wrong with the following code, but the bundler will not handle it well if it's used with hip-cpp-output:

const char* s1 = R"foo(
// __CLANG_OFFLOAD_BUNDLE____END__ hip-amdgcn-amd-amdhsa-gfx803
)foo";

It's not a showstopper, but it would be great if the separator would be something that can't be encountered in the preprocessed output.

Dec 14 2020, 8:26 PM · Restricted Project
yaxunl requested changes to D93258: [amdgpu] Default to code object v3.
Dec 14 2020, 8:20 PM · Restricted Project, Restricted Project
yaxunl reopened D93258: [amdgpu] Default to code object v3.

@JonChesterfield Did this patch pass ePSDB in gerrit? Better do that before committing it to trunk since we don't know if math libs are compatible with this patch. Also you need to fix lit tests.

Dec 14 2020, 8:15 PM · Restricted Project, Restricted Project

Dec 13 2020

yaxunl added inline comments to D92277: [OpenCL] Refactor of targets OpenCL option settings.
Dec 13 2020, 6:23 AM

Dec 11 2020

yaxunl added a comment to D92720: [HIP] unbundle bundled preprocessor output.
In D92720#2437621, @tra wrote:

-E by default prints preprocessed output to stdout. CUDA will print preprocessed output from all subcompilations. What does HIP do in this case? Printing out the bundle is probably not what the user will expect.
IMO preprocessed output is frequently used as a debugging tool, so it's important for users to be able to read it. Bundled output is rather cumbersome to deal with. It's possible to manually unbundle it, but the tool is not documented well and it's not particularly suitable for human use.

I think we should preserve the long-established meaning of -E and keep its output as plain text.

Dec 11 2020, 7:24 AM · Restricted Project
yaxunl updated the diff for D92720: [HIP] unbundle bundled preprocessor output.

revised by Artem's comments

Dec 11 2020, 7:19 AM · Restricted Project
yaxunl added a comment to D71726: Let clang atomic builtins fetch add/sub support floating point types.

ping

Dec 11 2020, 6:50 AM
yaxunl accepted D92782: [CodeGen][AMDGPU] Fix ICE for static initializer IR generation.

LGTM. Thanks!

Dec 11 2020, 6:13 AM · Restricted Project
yaxunl updated the diff for D92954: [clang-offload-bundler] Add option -list.

Revised by Artem's comments.

Dec 11 2020, 5:33 AM · Restricted Project
yaxunl added inline comments to D92954: [clang-offload-bundler] Add option -list.
Dec 11 2020, 5:30 AM · Restricted Project

Dec 10 2020

yaxunl added inline comments to D92893: [CUDA] Do not diagnose host/device variable access in dependent types..
Dec 10 2020, 8:11 PM · Restricted Project
yaxunl updated the diff for D93068: [clang-offload-bundler] Add option -allow-missing-bundles.

Revised by Artem's comments.

Dec 10 2020, 7:56 PM · Restricted Project
yaxunl added inline comments to D93068: [clang-offload-bundler] Add option -allow-missing-bundles.
Dec 10 2020, 7:45 PM · Restricted Project
yaxunl requested review of D93068: [clang-offload-bundler] Add option -allow-missing-bundles.
Dec 10 2020, 1:33 PM · Restricted Project
yaxunl requested changes to D92277: [OpenCL] Refactor of targets OpenCL option settings.
Dec 10 2020, 1:04 PM

Dec 9 2020

yaxunl updated the diff for D92954: [clang-offload-bundler] Add option -list.

Remove unnecessary formatting changes.

Dec 9 2020, 7:56 PM · Restricted Project
yaxunl added inline comments to D92954: [clang-offload-bundler] Add option -list.
Dec 9 2020, 7:40 PM · Restricted Project
yaxunl updated the diff for D92954: [clang-offload-bundler] Add option -list.

Revised by Artem's comments: removing unnecessary output to temporary file, extract forEachBundle.

Dec 9 2020, 7:39 PM · Restricted Project
yaxunl requested review of D92954: [clang-offload-bundler] Add option -list.
Dec 9 2020, 11:15 AM · Restricted Project

Dec 8 2020

yaxunl added a comment to D92893: [CUDA] Do not diagnose host/device variable access in dependent types..

LGTM. Can we have a test?

Dec 8 2020, 5:14 PM · Restricted Project

Dec 7 2020

yaxunl committed rGefc063b621ea: Fix lit test failure due to 0b81d9 (authored by yaxunl).
Fix lit test failure due to 0b81d9
Dec 7 2020, 4:51 PM
yaxunl committed rG0b81d9a99257: [AMDGPU] add -mcode-object-version=n (authored by yaxunl).
[AMDGPU] add -mcode-object-version=n
Dec 7 2020, 3:13 PM
yaxunl committed rG5cae70800266: [clang][AMDGPU] remove mxnack and msramecc options (authored by yaxunl).
[clang][AMDGPU] remove mxnack and msramecc options
Dec 7 2020, 3:13 PM
yaxunl committed rG4bed1d9b32b1: [HIP] fix bundle entry ID for -- (authored by yaxunl).
[HIP] fix bundle entry ID for --
Dec 7 2020, 3:13 PM
yaxunl closed D91310: [AMDGPU] Add -mcode-object-version=n.
Dec 7 2020, 3:13 PM · Restricted Project
yaxunl committed rG40ad476a3244: [clang][AMDGPU] rename sram-ecc as sramecc (authored by yaxunl).
[clang][AMDGPU] rename sram-ecc as sramecc
Dec 7 2020, 3:06 PM
yaxunl closed D86217: rename sram-ecc as sramecc in clang.
Dec 7 2020, 3:06 PM · Restricted Project, Restricted Project
yaxunl added a comment to rGcd95338ee302: [CUDA][HIP] Fix capturing reference to host variable.

What is the expected behavior in the event of no capture clause? The no capture clause situation was connected to why lambdas behaved this way in the first place.

Dec 7 2020, 1:54 PM

Dec 5 2020

yaxunl requested review of D92720: [HIP] unbundle bundled preprocessor output.
Dec 5 2020, 6:35 AM · Restricted Project

Dec 4 2020

yaxunl committed rG0519e1ddb388: [HIP] Fix bug in driver about wavefront size (authored by yaxunl).
[HIP] Fix bug in driver about wavefront size
Dec 4 2020, 5:37 AM
yaxunl closed D92628: [HIP] Fix bug in driver about wavefront size.
Dec 4 2020, 5:37 AM · Restricted Project

Dec 3 2020

yaxunl requested review of D92628: [HIP] Fix bug in driver about wavefront size.
Dec 3 2020, 7:01 PM · Restricted Project
yaxunl accepted D92130: [HIP] cmath demote long double args to double.

LGTM. Thanks.

Dec 3 2020, 2:26 PM · Restricted Project
yaxunl added a comment to D92363: [HIP] Warn no --offload-arch option.
In D92363#2426401, @tra wrote:

While I agree that the default GPU choice is not likely to be correct, or usable, for everyone, but the warning seems to be a half-measure.
If the default is not usable, then it should not be the default. If it's usable, then we don't need a warning.

Having a warning would make sense if it were a part of the plan to transition towards making GPU arch a mandatory option. Is that the case here?
Just a warning is not very useful, IMO. The users would need to specify the GPU in order to silence it, so why not just require it.

Dec 3 2020, 11:29 AM
yaxunl added a comment to D71726: Let clang atomic builtins fetch add/sub support floating point types.

ping

Dec 3 2020, 5:13 AM

Dec 2 2020

yaxunl committed rG3a781b912fc7: Fix assertion in tryEmitAsConstant (authored by yaxunl).
Fix assertion in tryEmitAsConstant
Dec 2 2020, 4:11 PM
yaxunl committed rGacb6f80d96b7: [CUDA][HIP] Fix overloading resolution (authored by yaxunl).
[CUDA][HIP] Fix overloading resolution
Dec 2 2020, 1:34 PM
yaxunl closed D80450: [CUDA][HIP] Fix HD function resolution.
Dec 2 2020, 1:34 PM · Restricted Project
yaxunl added a comment to D80450: [CUDA][HIP] Fix HD function resolution.
In D80450#2426507, @tra wrote:

LGTM.

I'd suggest adding more details on the background of this change to the commit log (point to the comment in the isBetterOverloadCandidate ?) and outline the intention to enable the new way to do overloading after some soak time.

Dec 2 2020, 8:27 AM · Restricted Project
yaxunl committed rG5c8911d0ba38: [CUDA][HIP] Diagnose reference of host variable (authored by yaxunl).
[CUDA][HIP] Diagnose reference of host variable
Dec 2 2020, 7:29 AM
yaxunl closed D91281: [CUDA][HIP] Diagnose reference of host variable.
Dec 2 2020, 7:28 AM · Restricted Project
yaxunl accepted D92418: [hip] Fix host object creation from fatbin.
Dec 2 2020, 7:21 AM · Restricted Project
yaxunl committed rGcd95338ee302: [CUDA][HIP] Fix capturing reference to host variable (authored by yaxunl).
[CUDA][HIP] Fix capturing reference to host variable
Dec 2 2020, 7:15 AM
yaxunl closed D91088: [CUDA][HIP] Fix capturing reference to host variable.
Dec 2 2020, 7:15 AM · Restricted Project
yaxunl added inline comments to D91088: [CUDA][HIP] Fix capturing reference to host variable.
Dec 2 2020, 6:43 AM · Restricted Project

Nov 30 2020

yaxunl requested review of D92363: [HIP] Warn no --offload-arch option.
Nov 30 2020, 9:14 PM
yaxunl added inline comments to D91088: [CUDA][HIP] Fix capturing reference to host variable.
Nov 30 2020, 8:17 PM · Restricted Project
yaxunl updated the diff for D91088: [CUDA][HIP] Fix capturing reference to host variable.

extract lambda as a function

Nov 30 2020, 8:10 PM · Restricted Project
yaxunl committed rG011bf4f55630: Add help text for -nogpuinc (authored by yaxunl).
Add help text for -nogpuinc
Nov 30 2020, 7:33 PM
yaxunl closed D92339: Add help text for -nogpuinc.
Nov 30 2020, 7:33 PM · Restricted Project
yaxunl updated the diff for D80450: [CUDA][HIP] Fix HD function resolution.

Add -ffix-overload-resolution and fix a regression.

Nov 30 2020, 7:11 PM · Restricted Project
yaxunl added inline comments to D80450: [CUDA][HIP] Fix HD function resolution.
Nov 30 2020, 1:18 PM · Restricted Project
yaxunl added a comment to D92342: [HIP] Fix HIP test on windows due to lld suffix.

Aaron, Could you please update the bugzilla after the fix is landed? Thanks.

Nov 30 2020, 1:02 PM · Restricted Project
yaxunl added inline comments to D80450: [CUDA][HIP] Fix HD function resolution.
Nov 30 2020, 12:35 PM · Restricted Project
yaxunl added a comment to D80450: [CUDA][HIP] Fix HD function resolution.
In D80450#2423706, @tra wrote:

SGTM. I'll check how the patch fares on our CUDA code.

Nov 30 2020, 12:07 PM · Restricted Project
yaxunl requested review of D92339: Add help text for -nogpuinc.
Nov 30 2020, 11:50 AM · Restricted Project

Nov 27 2020

yaxunl updated the diff for D80450: [CUDA][HIP] Fix HD function resolution.

If -fgpu-defer-diags is off, keep original behavior.

Nov 27 2020, 6:10 AM · Restricted Project

Nov 26 2020

yaxunl reopened D80450: [CUDA][HIP] Fix HD function resolution.

reopen for fixing the regression

Nov 26 2020, 8:43 AM · Restricted Project
yaxunl added a comment to D80450: [CUDA][HIP] Fix HD function resolution.
In D80450#2088129, @tra wrote:
In D80450#2087938, @tra wrote:

Reproducer for the regression. https://gist.github.com/Artem-B/183e9cfc28c6b04c1c862c853b5d9575
It's not particularly small, but that's as far as I could get it reduced.

With the patch, an attempt to instantiate ag on line 36 (in the reproducer sources I linked to above) results in ambiguity between two templates on lines 33 and 24 that are in different namespaces.
Previously it picked the template on line 28.

Managed to simplify the reproducer down to this which now reports that a host candidate has been ignored. This may explain why we ended up with the ambiguity when other overloads were present.

template <typename> struct a {};
namespace b {
struct c : a<int> {};
template <typename d> void ag(d);
} // namespace b
template <typename ae>
__attribute__((host)) __attribute__((device)) int ag(a<ae>) {
  ae e;
  ag(e);
}
void f() { ag<b::c>; }
Nov 26 2020, 8:42 AM · Restricted Project

Nov 24 2020

yaxunl added a comment to D91310: [AMDGPU] Add -mcode-object-version=n.

ping

Nov 24 2020, 7:19 AM · Restricted Project
yaxunl added a comment to D91088: [CUDA][HIP] Fix capturing reference to host variable.

ping

Nov 24 2020, 5:39 AM · Restricted Project
yaxunl committed rGcb08558caa3b: [HIP] Fix regressions due to fp contract change (authored by yaxunl).
[HIP] Fix regressions due to fp contract change
Nov 24 2020, 5:10 AM
yaxunl closed D90174: [HIP] Fix regressions due to fp contract change.
Nov 24 2020, 5:10 AM · Restricted Project

Nov 23 2020

yaxunl added a comment to D71726: Let clang atomic builtins fetch add/sub support floating point types.

Yes, there are no generically available libcalls for atomic float math -- but that's okay -- let LLVM handle transform into a cmpxchg loop when required.

I suspect Yaxun's target cannot provide libcalls at all, which is why he wants to diagnose up-front. But I agree that we should be thinking about this uniformly, and that his target should be diagnosing *all* unsupported atomics.

Nov 23 2020, 3:19 PM
yaxunl updated the diff for D71726: Let clang atomic builtins fetch add/sub support floating point types.

revised by John's comments. Added target hook and diagnostics for generic atomic operations.

Nov 23 2020, 3:12 PM

Nov 19 2020

yaxunl added a comment to D90174: [HIP] Fix regressions due to fp contract change.

ping

Nov 19 2020, 8:38 AM · Restricted Project

Nov 16 2020

yaxunl committed rG3f4b5893efed: [AMDGPU] Add option -munsafe-fp-atomics (authored by yaxunl).
[AMDGPU] Add option -munsafe-fp-atomics
Nov 16 2020, 6:53 PM
yaxunl closed D91546: [AMDGPU] Add option -munsafe-fp-atomics.
Nov 16 2020, 6:53 PM · Restricted Project