Page MenuHomePhabricator

yaxunl (Yaxun Liu)
User

Projects

User does not belong to any projects.

User Details

User Since
May 13 2015, 10:16 AM (218 w, 4 d)

Recent Activity

Thu, Jul 18

yaxunl added inline comments to D64364: [HIP] Add GPU arch gfx1010, gfx1011, and gfx1012.
Thu, Jul 18, 8:39 AM · Restricted Project

Thu, Jul 11

yaxunl committed rG6add24adaf6a: [HIP] Add GPU arch gfx1010, gfx1011, and gfx1012 (authored by yaxunl).
[HIP] Add GPU arch gfx1010, gfx1011, and gfx1012
Thu, Jul 11, 10:51 AM
yaxunl committed rL365799: [HIP] Add GPU arch gfx1010, gfx1011, and gfx1012.
[HIP] Add GPU arch gfx1010, gfx1011, and gfx1012
Thu, Jul 11, 10:50 AM
yaxunl closed D64364: [HIP] Add GPU arch gfx1010, gfx1011, and gfx1012.
Thu, Jul 11, 10:50 AM · Restricted Project

Mon, Jul 8

yaxunl created D64364: [HIP] Add GPU arch gfx1010, gfx1011, and gfx1012.
Mon, Jul 8, 1:04 PM · Restricted Project
yaxunl accepted D63756: [AMDGPU] Increased the number of implicit argument bytes for both OpenCL and HIP (CLANG)..

LGTM. Thanks.

Mon, Jul 8, 8:22 AM · Restricted Project, Restricted Project

Fri, Jul 5

yaxunl committed rGa62413526d80: [AMDGPU] Added a new metadata for multi grid sync implicit argument (authored by yaxunl).
[AMDGPU] Added a new metadata for multi grid sync implicit argument
Fri, Jul 5, 9:09 AM
yaxunl committed rL365217: [AMDGPU] Added a new metadata for multi grid sync implicit argument.
[AMDGPU] Added a new metadata for multi grid sync implicit argument
Fri, Jul 5, 9:09 AM
yaxunl closed D63886: [AMDGPU] Added a new metadata for multi grid sync implicit argument..
Fri, Jul 5, 9:09 AM · Restricted Project

Thu, Jul 4

yaxunl added a comment to D53295: Mark store and load of block invoke function as invariant.group.

Great, thank you. Yaxun, are you planning to pick this back up? I know it's been a long time.

Thu, Jul 4, 12:08 PM

Wed, Jul 3

yaxunl accepted D63886: [AMDGPU] Added a new metadata for multi grid sync implicit argument..

LGTM

Wed, Jul 3, 1:40 PM · Restricted Project
yaxunl accepted D63850: [AMDGPU] Kernel arg metadata: added support for "__hip_texture" type..

LGTM

Wed, Jul 3, 8:17 AM · Restricted Project

Tue, Jun 25

yaxunl committed rGc3dfe9082bce: [HIP] Support attribute hip_pinned_shadow (authored by yaxunl).
[HIP] Support attribute hip_pinned_shadow
Tue, Jun 25, 8:48 PM
yaxunl committed rL364381: [HIP] Support attribute hip_pinned_shadow.
[HIP] Support attribute hip_pinned_shadow
Tue, Jun 25, 8:48 PM
yaxunl closed D62738: [HIP] Support attribute hip_pinned_shadow.
Tue, Jun 25, 8:47 PM · Restricted Project
yaxunl committed rGd325eb3b56d5: Fix build failure due to missing break (authored by yaxunl).
Fix build failure due to missing break
Tue, Jun 25, 8:34 PM
yaxunl committed rL364380: Fix build failure due to missing break.
Fix build failure due to missing break
Tue, Jun 25, 8:34 PM
yaxunl updated the diff for D62738: [HIP] Support attribute hip_pinned_shadow.

rename the attribute and make it HIP only.

Tue, Jun 25, 12:35 PM · Restricted Project
yaxunl added a comment to D63756: [AMDGPU] Increased the number of implicit argument bytes for both OpenCL and HIP (CLANG)..

can you try compile an empty HIP kernel and see what metadata is generated by backend?

Tue, Jun 25, 7:57 AM · Restricted Project, Restricted Project

Mon, Jun 24

yaxunl added inline comments to D62738: [HIP] Support attribute hip_pinned_shadow.
Mon, Jun 24, 11:57 AM · Restricted Project

Sun, Jun 23

yaxunl added inline comments to D62738: [HIP] Support attribute hip_pinned_shadow.
Sun, Jun 23, 2:27 PM · Restricted Project

Jun 21 2019

yaxunl added inline comments to D63256: [OpenCL] Split type and macro definitions into opencl-c-base.h.
Jun 21 2019, 1:50 PM · Restricted Project, Restricted Project

Jun 20 2019

yaxunl added a comment to D62738: [HIP] Support attribute hip_pinned_shadow.

The problem is that we do not see generic usage of
Although there is no texture specific handling on the compiler side, there is texture specific handling of symbols

Jun 20 2019, 9:40 AM · Restricted Project
yaxunl added a comment to D62738: [HIP] Support attribute hip_pinned_shadow.
In D62738#1538900, @tra wrote:

So, the only thing this patch appears to do is make everything with this attribute uninitialized on device side and give protected visibility.
If I understand it correctly, you're using the attribute in order to construct something that's sort of opposite of the currently used device vars with host-side shadows. Only now the real variable lives on the host side and it's device side that gets the 'shadow' copy. Do I understand it correctly?

If so, then this functionality seems to be fairly general-purpose to me. I.e. it has literally nothing to do with textures other than the name.

Perhaps it would make more sense to rename this attribute to something along the lines of 'device_referenceable' and bring its implementation to somewhat more complete shape.

By 'complete' I mean that it would be great to flesh out what can and can't use the attribute. Does it have to be a type attribute, or can it be applied to variables?
The example in the patch suggests that it's the *variable* that's affected by the attribute.

Once it works, HIP's texture support can use it for its purposes.

E.g. your example could look like this:

#define  __attribute__((device_builtin_texture_type)) __texture__

template <class T, int texType, enum hipTextureReadMode>
    struct  texture
      : public textureReference { ... }

__texture__ texture<float, 2, hipReadModeElementType> tex;

This way compiler does not need to deal with the details of texture implementation on the HIP side.
Host/device visibility of the variables is easy to see in the source (similar to device, shared, etc) and there will be no need to dig into template defined somewhere else to become aware of this.
It will be potentially useful beyond HIP-only texture implementation.

What do you think?

Jun 20 2019, 9:38 AM · Restricted Project

Jun 18 2019

yaxunl updated the diff for D62738: [HIP] Support attribute hip_pinned_shadow.

Fix visibility and dso_local. Allow undefined symbol in code object. This is to allow merging the host and device symbols at run time.

Jun 18 2019, 9:34 PM · Restricted Project
yaxunl updated the diff for D62738: [HIP] Support attribute hip_pinned_shadow.

Revised by Artem's comments.

Jun 18 2019, 6:44 PM · Restricted Project

Jun 17 2019

yaxunl accepted D62697: AMDGPU: Disable errno by default.

LGTM. Sorry for the delay.

Jun 17 2019, 7:14 AM
yaxunl accepted D63335: [HIP] Add the interface deriving the stub name of device kernels..

LGTM. Thanks.

Jun 17 2019, 4:24 AM · Restricted Project, Restricted Project

Jun 14 2019

yaxunl committed rGcabce71845f4: [AMDGPU] Enable the implicit arguments for HIP (CLANG) (authored by yaxunl).
[AMDGPU] Enable the implicit arguments for HIP (CLANG)
Jun 14 2019, 8:52 AM
yaxunl committed rL363414: [AMDGPU] Enable the implicit arguments for HIP (CLANG).
[AMDGPU] Enable the implicit arguments for HIP (CLANG)
Jun 14 2019, 8:52 AM
yaxunl closed D62244: [AMDGPU] Enable the implicit arguments for HIP (CLANG).
Jun 14 2019, 8:52 AM · Restricted Project, Restricted Project

Jun 13 2019

yaxunl abandoned D59863: [HIP] Support gpu arch gfx906+sram-ecc.
Jun 13 2019, 12:16 PM
yaxunl closed D61112: AMDGPU: Enable _Float16.
Jun 13 2019, 12:16 PM
yaxunl accepted D62244: [AMDGPU] Enable the implicit arguments for HIP (CLANG).
Jun 13 2019, 12:15 PM · Restricted Project, Restricted Project

Jun 11 2019

yaxunl updated the diff for D62738: [HIP] Support attribute hip_pinned_shadow.

Revised by Artem's comments.

Jun 11 2019, 1:42 PM · Restricted Project
yaxunl added a comment to D62738: [HIP] Support attribute hip_pinned_shadow.

ping

Jun 11 2019, 10:53 AM · Restricted Project
yaxunl committed rG1362ffbc2106: Revert r344630 Disable code object version 3 for HIP toolchain. (authored by yaxunl).
Revert r344630 Disable code object version 3 for HIP toolchain.
Jun 11 2019, 8:04 AM
yaxunl committed rL363076: Revert r344630 Disable code object version 3 for HIP toolchain..
Revert r344630 Disable code object version 3 for HIP toolchain.
Jun 11 2019, 8:02 AM

Jun 10 2019

yaxunl added a comment to D62739: AMDGPU: Always emit amdgpu-flat-work-group-size.

My concern is that this essentially forcing user to add amdgpu_flat_work_group_size attribute to all kernels that are executed outside of (128,256). Potentially this can cause lots of regressions for existing OpenCL apps. I am not sure if it is feasible to force all OpenCL apps to make this change. Should we do some tests before making this change?

Jun 10 2019, 9:19 AM

Jun 7 2019

yaxunl accepted D62696: AMDGPU: Use AMDGPU toolchain for other OSes.

LGTM

Jun 7 2019, 7:31 AM

Jun 6 2019

yaxunl added a comment to D62971: [HIP] Remove the assertion on match between host/device names..

LGTM. It seems no reason to assume the mangled name to be same on host and device side once anonymous types are mangled differently in host and device code. On windows, kernel has totally different names on host and device side without issues.

Jun 6 2019, 12:50 PM · Restricted Project
yaxunl added a reviewer for D62971: [HIP] Remove the assertion on match between host/device names.: tra.
Jun 6 2019, 12:47 PM · Restricted Project

May 31 2019

yaxunl added inline comments to D62739: AMDGPU: Always emit amdgpu-flat-work-group-size.
May 31 2019, 2:01 PM
yaxunl created D62738: [HIP] Support attribute hip_pinned_shadow.
May 31 2019, 8:36 AM · Restricted Project

May 29 2019

yaxunl added a comment to D62603: [CUDA][HIP] Skip setting `externally_initialized` for static device variables..
In D62603#1521832, @tra wrote:

that should assume that variable is not declared with static. that's also the motivation of this patch.

cppreference defines internal linkage as 'The name can be referred to from all scopes in the current translation unit.'
The current translation unit in CUDA context gets a bit murky. On one hand host and device are compiled separately, and may conceivably be considered separate TUs. On the other hand, the fact that we mix host and device code in the same source file implies tight coupling and the users do expect them to be treated as if all host and device code in the source file is in the same TU. E.g. you may have a kernel in an anonymous namespace yet you do want to be able to launch it from the host side.

I think static __device__ globals would fall into the same category -- nominally they should not be visible outside of device-side object file, but in practice we do need to make them visible from the host side of the same TU.

May 29 2019, 1:32 PM · Restricted Project
yaxunl added a comment to D62603: [CUDA][HIP] Skip setting `externally_initialized` for static device variables..

LGTM. The externally initializable attribute causes some optimizations disabled. For static device variables it seems reasonable to remove the externaly initializable attribute.

May 29 2019, 9:01 AM · Restricted Project
yaxunl added a reviewer for D62603: [CUDA][HIP] Skip setting `externally_initialized` for static device variables.: tra.
May 29 2019, 8:56 AM · Restricted Project

May 28 2019

yaxunl committed rGdc805a49064b: Fix failure of lit test dependent-libs.cu (authored by yaxunl).
Fix failure of lit test dependent-libs.cu
May 28 2019, 6:32 PM
yaxunl committed rC361905: Fix failure of lit test dependent-libs.cu.
Fix failure of lit test dependent-libs.cu
May 28 2019, 6:32 PM
yaxunl committed rL361905: Fix failure of lit test dependent-libs.cu.
Fix failure of lit test dependent-libs.cu
May 28 2019, 6:32 PM
yaxunl committed rG02afe4e077c4: [CUDA][HIP] Emit dependent libs for host only (authored by yaxunl).
[CUDA][HIP] Emit dependent libs for host only
May 28 2019, 2:18 PM
yaxunl committed rC361880: [CUDA][HIP] Emit dependent libs for host only.
[CUDA][HIP] Emit dependent libs for host only
May 28 2019, 2:18 PM
yaxunl committed rL361880: [CUDA][HIP] Emit dependent libs for host only.
[CUDA][HIP] Emit dependent libs for host only
May 28 2019, 2:18 PM
yaxunl closed D62483: [CUDA][HIP] Emit dependent libs for host only.
May 28 2019, 2:18 PM · Restricted Project

May 27 2019

yaxunl created D62483: [CUDA][HIP] Emit dependent libs for host only.
May 27 2019, 5:18 AM · Restricted Project
yaxunl committed rGa53d48b7f45d: [OpenCL] Fix file-scope const sampler variable for 2.0 (authored by yaxunl).
[OpenCL] Fix file-scope const sampler variable for 2.0
May 27 2019, 4:18 AM
yaxunl committed rL361757: [OpenCL] Fix file-scope const sampler variable for 2.0.
[OpenCL] Fix file-scope const sampler variable for 2.0
May 27 2019, 4:18 AM
yaxunl committed rC361757: [OpenCL] Fix file-scope const sampler variable for 2.0.
[OpenCL] Fix file-scope const sampler variable for 2.0
May 27 2019, 4:18 AM
yaxunl closed D62197: [OpenCL] Fix file-scope const sampler variable for 2.0.
May 27 2019, 4:18 AM · Restricted Project

May 22 2019

yaxunl added a comment to D62244: [AMDGPU] Enable the implicit arguments for HIP (CLANG).

Currently HIP and CUDA share the same test directories, so better put the test in CodeGenCUDA.

May 22 2019, 8:16 AM · Restricted Project, Restricted Project

May 21 2019

yaxunl updated the diff for D62197: [OpenCL] Fix file-scope const sampler variable for 2.0.

Add full diff.

May 21 2019, 7:37 AM · Restricted Project
yaxunl created D62197: [OpenCL] Fix file-scope const sampler variable for 2.0.
May 21 2019, 7:34 AM · Restricted Project

May 1 2019

yaxunl added a comment to D61396: [hip] Fix ambiguity from `>>>` of CUDA..

LGTM too. Thanks Michael for fixing this.

May 1 2019, 2:14 PM · Restricted Project, Restricted Project

Apr 30 2019

yaxunl added inline comments to D61274: [Sema][AST] Explicit visibility for OpenCL/CUDA kernels/variables.
Apr 30 2019, 3:27 PM · Restricted Project
yaxunl committed rG9e67d129f079: Add requires amdgpu-registered-target for amdgpu-float16.cpp (authored by yaxunl).
Add requires amdgpu-registered-target for amdgpu-float16.cpp
Apr 30 2019, 12:06 PM
yaxunl committed rC359598: Add requires amdgpu-registered-target for amdgpu-float16.cpp.
Add requires amdgpu-registered-target for amdgpu-float16.cpp
Apr 30 2019, 12:05 PM
yaxunl committed rL359598: Add requires amdgpu-registered-target for amdgpu-float16.cpp.
Add requires amdgpu-registered-target for amdgpu-float16.cpp
Apr 30 2019, 12:05 PM
yaxunl committed rG44697012070c: AMDGPU: Enable _Float16 (authored by yaxunl).
AMDGPU: Enable _Float16
Apr 30 2019, 11:35 AM
yaxunl committed rL359594: AMDGPU: Enable _Float16.
AMDGPU: Enable _Float16
Apr 30 2019, 11:34 AM
yaxunl committed rC359594: AMDGPU: Enable _Float16.
AMDGPU: Enable _Float16
Apr 30 2019, 11:34 AM

Apr 26 2019

yaxunl accepted D61194: [HIP] Fix visibility of `__constant__` variables..

LGTM. Thanks!

Apr 26 2019, 12:26 PM · Restricted Project
yaxunl added inline comments to D61194: [HIP] Fix visibility of `__constant__` variables..
Apr 26 2019, 11:37 AM · Restricted Project

Apr 24 2019

yaxunl created D61112: AMDGPU: Enable _Float16.
Apr 24 2019, 8:08 PM

Apr 12 2019

yaxunl committed rG7bd8c37b1773: [HIP] Use -mlink-builtin-bitcode to link device library (authored by yaxunl).
[HIP] Use -mlink-builtin-bitcode to link device library
Apr 12 2019, 9:23 AM
yaxunl committed rC358290: [HIP] Use -mlink-builtin-bitcode to link device library.
[HIP] Use -mlink-builtin-bitcode to link device library
Apr 12 2019, 9:22 AM
yaxunl committed rL358290: [HIP] Use -mlink-builtin-bitcode to link device library.
[HIP] Use -mlink-builtin-bitcode to link device library
Apr 12 2019, 9:22 AM
yaxunl closed D60513: [HIP] Use -mlink-builtin-bitcode to link device library.
Apr 12 2019, 9:22 AM · Restricted Project
yaxunl updated the summary of D60620: [HIP] Support -offloading-target-id.
Apr 12 2019, 8:44 AM
yaxunl created D60620: [HIP] Support -offloading-target-id.
Apr 12 2019, 8:38 AM

Apr 10 2019

yaxunl created D60513: [HIP] Use -mlink-builtin-bitcode to link device library.
Apr 10 2019, 8:04 AM · Restricted Project

Apr 3 2019

yaxunl added inline comments to D59321: WIP: AMDGPU: Teach toolchain to link rocm device libs.
Apr 3 2019, 11:34 AM
yaxunl added reviewers for D59321: WIP: AMDGPU: Teach toolchain to link rocm device libs: b-sumner, t-tye.
Apr 3 2019, 11:23 AM

Apr 2 2019

yaxunl added inline comments to D60141: [HIP-Clang] Fat binary should not be produced for non GPU code.
Apr 2 2019, 1:32 PM · Restricted Project, Restricted Project
yaxunl accepted D60141: [HIP-Clang] Fat binary should not be produced for non GPU code.

LGTM. Thanks!

Apr 2 2019, 12:57 PM · Restricted Project, Restricted Project
yaxunl added a reviewer for D60141: [HIP-Clang] Fat binary should not be produced for non GPU code: tra.
Apr 2 2019, 11:04 AM · Restricted Project, Restricted Project

Mar 27 2019

yaxunl added inline comments to D59863: [HIP] Support gpu arch gfx906+sram-ecc.
Mar 27 2019, 11:05 AM

Mar 26 2019

yaxunl created D59863: [HIP] Support gpu arch gfx906+sram-ecc.
Mar 26 2019, 7:51 PM

Mar 21 2019

yaxunl created D59647: [CUDA][HIP] Warn shared var initialization.
Mar 21 2019, 8:39 AM · Restricted Project

Mar 15 2019

yaxunl added a comment to D59316: [HIP-Clang] propagate -mllvm options to opt and llc.
Mar 15 2019, 2:43 PM · Restricted Project
yaxunl added a comment to D59316: [HIP-Clang] propagate -mllvm options to opt and llc.

Here we are looking at the code which emulates a "linker" for HIP toolchain. The offloading action builder requests the offloading toolchain have a linker, but amdgpu does not have a real linker (ISA level linker), so we have to emulate that. If we have an ISA level linker we can get rid of all these stuff, but I don't think that will happen in short time.

This isn't really true. We do run lld to link the final executable. It also doesn't change that opt and llc should never be involved in the process

Can lld do ISA level linking? That is, one device function in one object file calls another device function in a different object file, and we let lld link them together?

We can't link multiple objects, but we do need to link the single object with lld. The relocations even for functions in the same module are 0 until lld fixes them up. Do we have execution tests for function calls using HIP? Since it looks like lld isn't getting used here, I suspect they aren't workingh

Mar 15 2019, 12:35 PM · Restricted Project
yaxunl added a comment to D59316: [HIP-Clang] propagate -mllvm options to opt and llc.

Here we are looking at the code which emulates a "linker" for HIP toolchain. The offloading action builder requests the offloading toolchain have a linker, but amdgpu does not have a real linker (ISA level linker), so we have to emulate that. If we have an ISA level linker we can get rid of all these stuff, but I don't think that will happen in short time.

This isn't really true. We do run lld to link the final executable. It also doesn't change that opt and llc should never be involved in the process

Mar 15 2019, 12:14 PM · Restricted Project

Mar 14 2019

yaxunl accepted D59316: [HIP-Clang] propagate -mllvm options to opt and llc.

LGTM. Thanks!

Mar 14 2019, 11:02 AM · Restricted Project
yaxunl added a comment to D59316: [HIP-Clang] propagate -mllvm options to opt and llc.

Here we are looking at the code which emulates a "linker" for HIP toolchain. The offloading action builder requests the offloading toolchain have a linker, but amdgpu does not have a real linker (ISA level linker), so we have to emulate that. If we have an ISA level linker we can get rid of all these stuff, but I don't think that will happen in short time.

Mar 14 2019, 11:02 AM · Restricted Project

Mar 5 2019

yaxunl committed rGc5be267003ef: [CUDA][HIP][Sema] Fix template kernel with function as template parameter (authored by yaxunl).
[CUDA][HIP][Sema] Fix template kernel with function as template parameter
Mar 5 2019, 10:20 AM
yaxunl committed rL355421: [CUDA][HIP][Sema] Fix template kernel with function as template parameter.
[CUDA][HIP][Sema] Fix template kernel with function as template parameter
Mar 5 2019, 10:20 AM
yaxunl committed rC355421: [CUDA][HIP][Sema] Fix template kernel with function as template parameter.
[CUDA][HIP][Sema] Fix template kernel with function as template parameter
Mar 5 2019, 10:20 AM
yaxunl closed D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter.
Mar 5 2019, 10:19 AM · Restricted Project
yaxunl committed rG071949c3afae: Allow bundle size to be 0 in clang-offload-bundler (authored by yaxunl).
Allow bundle size to be 0 in clang-offload-bundler
Mar 5 2019, 9:52 AM
yaxunl committed rC355419: Allow bundle size to be 0 in clang-offload-bundler.
Allow bundle size to be 0 in clang-offload-bundler
Mar 5 2019, 9:52 AM
yaxunl committed rL355419: Allow bundle size to be 0 in clang-offload-bundler.
Allow bundle size to be 0 in clang-offload-bundler
Mar 5 2019, 9:52 AM