Page MenuHomePhabricator

yaxunl (Yaxun Liu)
User

Projects

User does not belong to any projects.

User Details

User Since
May 13 2015, 10:16 AM (227 w, 4 d)

Recent Activity

Yesterday

yaxunl added a comment to D67509: [CUDA][HIP] Fix hostness of defaulted constructor.
In D67509#1677528, @tra wrote:

I am taking a look.

I can reproduce similar asserts locally. It seems the assertion I added assert(!(HasD || HasH) || (NeedsD == HasD && NeedsH == HasH)); is not always true. Since we do not have this assert before, I removed it. I will study what causes it to assert and post it later.

Sun, Sep 22, 6:14 PM · Restricted Project

Fri, Sep 20

yaxunl added a comment to D67509: [CUDA][HIP] Fix hostness of defaulted constructor.
In D67509#1677528, @tra wrote:

I am taking a look.

Fri, Sep 20, 8:04 PM · Restricted Project
yaxunl committed rG27a803917181: Revert assertion added by r372394 (authored by yaxunl).
Revert assertion added by r372394
Fri, Sep 20, 7:53 PM
yaxunl committed rL372452: Revert assertion added by r372394.
Revert assertion added by r372394
Fri, Sep 20, 7:53 PM
yaxunl added a comment to D67509: [CUDA][HIP] Fix hostness of defaulted constructor.
In D67509#1677528, @tra wrote:
Fri, Sep 20, 4:14 PM · Restricted Project
yaxunl added inline comments to D67837: [CUDA][HIP] Fix assertion in Sema::markKnownEmitted with -fopenmp.
Fri, Sep 20, 1:01 PM · Restricted Project
yaxunl updated the diff for D67837: [CUDA][HIP] Fix assertion in Sema::markKnownEmitted with -fopenmp.

reuse the call tree.

Fri, Sep 20, 12:59 PM · Restricted Project
yaxunl updated the summary of D67837: [CUDA][HIP] Fix assertion in Sema::markKnownEmitted with -fopenmp.
Fri, Sep 20, 8:47 AM · Restricted Project
yaxunl created D67837: [CUDA][HIP] Fix assertion in Sema::markKnownEmitted with -fopenmp.
Fri, Sep 20, 8:36 AM · Restricted Project
yaxunl committed rGe5d17c511fa6: [CUDA][HIP] Fix hostness of defaulted constructor Clang does not respect the… (authored by yaxunl).
[CUDA][HIP] Fix hostness of defaulted constructor Clang does not respect the…
Fri, Sep 20, 7:30 AM
yaxunl committed rL372394: [CUDA][HIP] Fix hostness of defaulted constructor.
[CUDA][HIP] Fix hostness of defaulted constructor
Fri, Sep 20, 7:30 AM
yaxunl closed D67509: [CUDA][HIP] Fix hostness of defaulted constructor.
Fri, Sep 20, 7:29 AM · Restricted Project

Thu, Sep 19

yaxunl updated the diff for D67509: [CUDA][HIP] Fix hostness of defaulted constructor.

revise by Artem's comments.

Thu, Sep 19, 2:15 PM · Restricted Project
yaxunl updated the diff for D67509: [CUDA][HIP] Fix hostness of defaulted constructor.

simplify logic by Artem's comments.

Thu, Sep 19, 1:56 PM · Restricted Project
yaxunl updated the diff for D67509: [CUDA][HIP] Fix hostness of defaulted constructor.

Skip inferring for explicit host/device attrs only. Adds checks for implicit device and host attrs and avoid duplicates.

Thu, Sep 19, 11:36 AM · Restricted Project
yaxunl added inline comments to D67509: [CUDA][HIP] Fix hostness of defaulted constructor.
Thu, Sep 19, 9:29 AM · Restricted Project
yaxunl updated the diff for D67509: [CUDA][HIP] Fix hostness of defaulted constructor.

Posts a new fix for this issue, where the defaulted constructor definition follows the hostness of the original declaration in the class. Also fix the issue when defaulted ctor has explicit host device attribs.

Thu, Sep 19, 4:54 AM · Restricted Project

Fri, Sep 13

yaxunl added a comment to D67509: [CUDA][HIP] Fix hostness of defaulted constructor.

Sorry I found some issue with the fix.

Fri, Sep 13, 1:16 PM · Restricted Project

Thu, Sep 12

yaxunl created D67509: [CUDA][HIP] Fix hostness of defaulted constructor.
Thu, Sep 12, 11:25 AM · Restricted Project

Tue, Sep 3

yaxunl committed rG1bea97c971d6: [AMDGPU] Set default flat work group size to (1,256) for HIP (authored by yaxunl).
[AMDGPU] Set default flat work group size to (1,256) for HIP
Tue, Sep 3, 12:10 PM
yaxunl committed rL370808: [AMDGPU] Set default flat work group size to (1,256) for HIP.
[AMDGPU] Set default flat work group size to (1,256) for HIP
Tue, Sep 3, 11:49 AM
yaxunl closed D67048: [AMDGPU] Set default flat work group size to (1,256) for HIP.
Tue, Sep 3, 11:49 AM · Restricted Project

Sun, Sep 1

yaxunl updated the diff for D67048: [AMDGPU] Set default flat work group size to (1,256) for HIP.

add test

Sun, Sep 1, 6:49 AM · Restricted Project
yaxunl created D67048: [AMDGPU] Set default flat work group size to (1,256) for HIP.
Sun, Sep 1, 6:16 AM · Restricted Project

Aug 22 2019

yaxunl committed rGaf478e240baa: [OpenCL] Fix declaration of enqueue_marker (authored by yaxunl).
[OpenCL] Fix declaration of enqueue_marker
Aug 22 2019, 4:22 AM
yaxunl committed rL369641: [OpenCL] Fix declaration of enqueue_marker.
[OpenCL] Fix declaration of enqueue_marker
Aug 22 2019, 4:22 AM
yaxunl closed D66512: [OpenCL] Fix declaration of enqueue_marker.
Aug 22 2019, 4:22 AM · Restricted Project

Aug 20 2019

yaxunl created D66512: [OpenCL] Fix declaration of enqueue_marker.
Aug 20 2019, 7:21 PM · Restricted Project

Jul 18 2019

yaxunl added inline comments to D64364: [HIP] Add GPU arch gfx1010, gfx1011, and gfx1012.
Jul 18 2019, 8:39 AM · Restricted Project

Jul 11 2019

yaxunl committed rG6add24adaf6a: [HIP] Add GPU arch gfx1010, gfx1011, and gfx1012 (authored by yaxunl).
[HIP] Add GPU arch gfx1010, gfx1011, and gfx1012
Jul 11 2019, 10:51 AM
yaxunl committed rL365799: [HIP] Add GPU arch gfx1010, gfx1011, and gfx1012.
[HIP] Add GPU arch gfx1010, gfx1011, and gfx1012
Jul 11 2019, 10:50 AM
yaxunl closed D64364: [HIP] Add GPU arch gfx1010, gfx1011, and gfx1012.
Jul 11 2019, 10:50 AM · Restricted Project

Jul 8 2019

yaxunl created D64364: [HIP] Add GPU arch gfx1010, gfx1011, and gfx1012.
Jul 8 2019, 1:04 PM · Restricted Project
yaxunl accepted D63756: [AMDGPU] Increased the number of implicit argument bytes for both OpenCL and HIP (CLANG)..

LGTM. Thanks.

Jul 8 2019, 8:22 AM · Restricted Project, Restricted Project

Jul 5 2019

yaxunl committed rGa62413526d80: [AMDGPU] Added a new metadata for multi grid sync implicit argument (authored by yaxunl).
[AMDGPU] Added a new metadata for multi grid sync implicit argument
Jul 5 2019, 9:09 AM
yaxunl committed rL365217: [AMDGPU] Added a new metadata for multi grid sync implicit argument.
[AMDGPU] Added a new metadata for multi grid sync implicit argument
Jul 5 2019, 9:09 AM
yaxunl closed D63886: [AMDGPU] Added a new metadata for multi grid sync implicit argument..
Jul 5 2019, 9:09 AM · Restricted Project

Jul 4 2019

yaxunl added a comment to D53295: Mark store and load of block invoke function as invariant.group.

Great, thank you. Yaxun, are you planning to pick this back up? I know it's been a long time.

Jul 4 2019, 12:08 PM

Jul 3 2019

yaxunl accepted D63886: [AMDGPU] Added a new metadata for multi grid sync implicit argument..

LGTM

Jul 3 2019, 1:40 PM · Restricted Project
yaxunl accepted D63850: [AMDGPU] Kernel arg metadata: added support for "__hip_texture" type..

LGTM

Jul 3 2019, 8:17 AM · Restricted Project

Jun 25 2019

yaxunl committed rGc3dfe9082bce: [HIP] Support attribute hip_pinned_shadow (authored by yaxunl).
[HIP] Support attribute hip_pinned_shadow
Jun 25 2019, 8:48 PM
yaxunl committed rL364381: [HIP] Support attribute hip_pinned_shadow.
[HIP] Support attribute hip_pinned_shadow
Jun 25 2019, 8:48 PM
yaxunl closed D62738: [HIP] Support attribute hip_pinned_shadow.
Jun 25 2019, 8:47 PM · Restricted Project
yaxunl committed rGd325eb3b56d5: Fix build failure due to missing break (authored by yaxunl).
Fix build failure due to missing break
Jun 25 2019, 8:34 PM
yaxunl committed rL364380: Fix build failure due to missing break.
Fix build failure due to missing break
Jun 25 2019, 8:34 PM
yaxunl updated the diff for D62738: [HIP] Support attribute hip_pinned_shadow.

rename the attribute and make it HIP only.

Jun 25 2019, 12:35 PM · Restricted Project
yaxunl added a comment to D63756: [AMDGPU] Increased the number of implicit argument bytes for both OpenCL and HIP (CLANG)..

can you try compile an empty HIP kernel and see what metadata is generated by backend?

Jun 25 2019, 7:57 AM · Restricted Project, Restricted Project

Jun 24 2019

yaxunl added inline comments to D62738: [HIP] Support attribute hip_pinned_shadow.
Jun 24 2019, 11:57 AM · Restricted Project

Jun 23 2019

yaxunl added inline comments to D62738: [HIP] Support attribute hip_pinned_shadow.
Jun 23 2019, 2:27 PM · Restricted Project

Jun 21 2019

yaxunl added inline comments to D63256: [OpenCL] Split type and macro definitions into opencl-c-base.h.
Jun 21 2019, 1:50 PM · Restricted Project, Restricted Project

Jun 20 2019

yaxunl added a comment to D62738: [HIP] Support attribute hip_pinned_shadow.

The problem is that we do not see generic usage of
Although there is no texture specific handling on the compiler side, there is texture specific handling of symbols

Jun 20 2019, 9:40 AM · Restricted Project
yaxunl added a comment to D62738: [HIP] Support attribute hip_pinned_shadow.
In D62738#1538900, @tra wrote:

So, the only thing this patch appears to do is make everything with this attribute uninitialized on device side and give protected visibility.
If I understand it correctly, you're using the attribute in order to construct something that's sort of opposite of the currently used device vars with host-side shadows. Only now the real variable lives on the host side and it's device side that gets the 'shadow' copy. Do I understand it correctly?

If so, then this functionality seems to be fairly general-purpose to me. I.e. it has literally nothing to do with textures other than the name.

Perhaps it would make more sense to rename this attribute to something along the lines of 'device_referenceable' and bring its implementation to somewhat more complete shape.

By 'complete' I mean that it would be great to flesh out what can and can't use the attribute. Does it have to be a type attribute, or can it be applied to variables?
The example in the patch suggests that it's the *variable* that's affected by the attribute.

Once it works, HIP's texture support can use it for its purposes.

E.g. your example could look like this:

#define  __attribute__((device_builtin_texture_type)) __texture__

template <class T, int texType, enum hipTextureReadMode>
    struct  texture
      : public textureReference { ... }

__texture__ texture<float, 2, hipReadModeElementType> tex;

This way compiler does not need to deal with the details of texture implementation on the HIP side.
Host/device visibility of the variables is easy to see in the source (similar to device, shared, etc) and there will be no need to dig into template defined somewhere else to become aware of this.
It will be potentially useful beyond HIP-only texture implementation.

What do you think?

Jun 20 2019, 9:38 AM · Restricted Project

Jun 18 2019

yaxunl updated the diff for D62738: [HIP] Support attribute hip_pinned_shadow.

Fix visibility and dso_local. Allow undefined symbol in code object. This is to allow merging the host and device symbols at run time.

Jun 18 2019, 9:34 PM · Restricted Project
yaxunl updated the diff for D62738: [HIP] Support attribute hip_pinned_shadow.

Revised by Artem's comments.

Jun 18 2019, 6:44 PM · Restricted Project

Jun 17 2019

yaxunl accepted D62697: AMDGPU: Disable errno by default.

LGTM. Sorry for the delay.

Jun 17 2019, 7:14 AM
yaxunl accepted D63335: [HIP] Add the interface deriving the stub name of device kernels..

LGTM. Thanks.

Jun 17 2019, 4:24 AM · Restricted Project, Restricted Project

Jun 14 2019

yaxunl committed rGcabce71845f4: [AMDGPU] Enable the implicit arguments for HIP (CLANG) (authored by yaxunl).
[AMDGPU] Enable the implicit arguments for HIP (CLANG)
Jun 14 2019, 8:52 AM
yaxunl committed rL363414: [AMDGPU] Enable the implicit arguments for HIP (CLANG).
[AMDGPU] Enable the implicit arguments for HIP (CLANG)
Jun 14 2019, 8:52 AM
yaxunl closed D62244: [AMDGPU] Enable the implicit arguments for HIP (CLANG).
Jun 14 2019, 8:52 AM · Restricted Project, Restricted Project

Jun 13 2019

yaxunl abandoned D59863: [HIP] Support gpu arch gfx906+sram-ecc.
Jun 13 2019, 12:16 PM
yaxunl closed D61112: AMDGPU: Enable _Float16.
Jun 13 2019, 12:16 PM
yaxunl accepted D62244: [AMDGPU] Enable the implicit arguments for HIP (CLANG).
Jun 13 2019, 12:15 PM · Restricted Project, Restricted Project

Jun 11 2019

yaxunl updated the diff for D62738: [HIP] Support attribute hip_pinned_shadow.

Revised by Artem's comments.

Jun 11 2019, 1:42 PM · Restricted Project
yaxunl added a comment to D62738: [HIP] Support attribute hip_pinned_shadow.

ping

Jun 11 2019, 10:53 AM · Restricted Project
yaxunl committed rG1362ffbc2106: Revert r344630 Disable code object version 3 for HIP toolchain. (authored by yaxunl).
Revert r344630 Disable code object version 3 for HIP toolchain.
Jun 11 2019, 8:04 AM
yaxunl committed rL363076: Revert r344630 Disable code object version 3 for HIP toolchain..
Revert r344630 Disable code object version 3 for HIP toolchain.
Jun 11 2019, 8:02 AM

Jun 10 2019

yaxunl added a comment to D62739: AMDGPU: Always emit amdgpu-flat-work-group-size.

My concern is that this essentially forcing user to add amdgpu_flat_work_group_size attribute to all kernels that are executed outside of (128,256). Potentially this can cause lots of regressions for existing OpenCL apps. I am not sure if it is feasible to force all OpenCL apps to make this change. Should we do some tests before making this change?

Jun 10 2019, 9:19 AM

Jun 7 2019

yaxunl accepted D62696: AMDGPU: Use AMDGPU toolchain for other OSes.

LGTM

Jun 7 2019, 7:31 AM

Jun 6 2019

yaxunl added a comment to D62971: [HIP] Remove the assertion on match between host/device names..

LGTM. It seems no reason to assume the mangled name to be same on host and device side once anonymous types are mangled differently in host and device code. On windows, kernel has totally different names on host and device side without issues.

Jun 6 2019, 12:50 PM · Restricted Project
yaxunl added a reviewer for D62971: [HIP] Remove the assertion on match between host/device names.: tra.
Jun 6 2019, 12:47 PM · Restricted Project

May 31 2019

yaxunl added inline comments to D62739: AMDGPU: Always emit amdgpu-flat-work-group-size.
May 31 2019, 2:01 PM
yaxunl created D62738: [HIP] Support attribute hip_pinned_shadow.
May 31 2019, 8:36 AM · Restricted Project

May 29 2019

yaxunl added a comment to D62603: [CUDA][HIP] Skip setting `externally_initialized` for static device variables..
In D62603#1521832, @tra wrote:

that should assume that variable is not declared with static. that's also the motivation of this patch.

cppreference defines internal linkage as 'The name can be referred to from all scopes in the current translation unit.'
The current translation unit in CUDA context gets a bit murky. On one hand host and device are compiled separately, and may conceivably be considered separate TUs. On the other hand, the fact that we mix host and device code in the same source file implies tight coupling and the users do expect them to be treated as if all host and device code in the source file is in the same TU. E.g. you may have a kernel in an anonymous namespace yet you do want to be able to launch it from the host side.

I think static __device__ globals would fall into the same category -- nominally they should not be visible outside of device-side object file, but in practice we do need to make them visible from the host side of the same TU.

May 29 2019, 1:32 PM · Restricted Project
yaxunl added a comment to D62603: [CUDA][HIP] Skip setting `externally_initialized` for static device variables..

LGTM. The externally initializable attribute causes some optimizations disabled. For static device variables it seems reasonable to remove the externaly initializable attribute.

May 29 2019, 9:01 AM · Restricted Project
yaxunl added a reviewer for D62603: [CUDA][HIP] Skip setting `externally_initialized` for static device variables.: tra.
May 29 2019, 8:56 AM · Restricted Project

May 28 2019

yaxunl committed rGdc805a49064b: Fix failure of lit test dependent-libs.cu (authored by yaxunl).
Fix failure of lit test dependent-libs.cu
May 28 2019, 6:32 PM
yaxunl committed rC361905: Fix failure of lit test dependent-libs.cu.
Fix failure of lit test dependent-libs.cu
May 28 2019, 6:32 PM
yaxunl committed rL361905: Fix failure of lit test dependent-libs.cu.
Fix failure of lit test dependent-libs.cu
May 28 2019, 6:32 PM
yaxunl committed rG02afe4e077c4: [CUDA][HIP] Emit dependent libs for host only (authored by yaxunl).
[CUDA][HIP] Emit dependent libs for host only
May 28 2019, 2:18 PM
yaxunl committed rC361880: [CUDA][HIP] Emit dependent libs for host only.
[CUDA][HIP] Emit dependent libs for host only
May 28 2019, 2:18 PM
yaxunl committed rL361880: [CUDA][HIP] Emit dependent libs for host only.
[CUDA][HIP] Emit dependent libs for host only
May 28 2019, 2:18 PM
yaxunl closed D62483: [CUDA][HIP] Emit dependent libs for host only.
May 28 2019, 2:18 PM · Restricted Project

May 27 2019

yaxunl created D62483: [CUDA][HIP] Emit dependent libs for host only.
May 27 2019, 5:18 AM · Restricted Project
yaxunl committed rGa53d48b7f45d: [OpenCL] Fix file-scope const sampler variable for 2.0 (authored by yaxunl).
[OpenCL] Fix file-scope const sampler variable for 2.0
May 27 2019, 4:18 AM
yaxunl committed rL361757: [OpenCL] Fix file-scope const sampler variable for 2.0.
[OpenCL] Fix file-scope const sampler variable for 2.0
May 27 2019, 4:18 AM
yaxunl committed rC361757: [OpenCL] Fix file-scope const sampler variable for 2.0.
[OpenCL] Fix file-scope const sampler variable for 2.0
May 27 2019, 4:18 AM
yaxunl closed D62197: [OpenCL] Fix file-scope const sampler variable for 2.0.
May 27 2019, 4:18 AM · Restricted Project

May 22 2019

yaxunl added a comment to D62244: [AMDGPU] Enable the implicit arguments for HIP (CLANG).

Currently HIP and CUDA share the same test directories, so better put the test in CodeGenCUDA.

May 22 2019, 8:16 AM · Restricted Project, Restricted Project

May 21 2019

yaxunl updated the diff for D62197: [OpenCL] Fix file-scope const sampler variable for 2.0.

Add full diff.

May 21 2019, 7:37 AM · Restricted Project
yaxunl created D62197: [OpenCL] Fix file-scope const sampler variable for 2.0.
May 21 2019, 7:34 AM · Restricted Project

May 1 2019

yaxunl added a comment to D61396: [hip] Fix ambiguity from `>>>` of CUDA..

LGTM too. Thanks Michael for fixing this.

May 1 2019, 2:14 PM · Restricted Project, Restricted Project

Apr 30 2019

yaxunl added inline comments to D61274: [Sema][AST] Explicit visibility for OpenCL/CUDA kernels/variables.
Apr 30 2019, 3:27 PM · Restricted Project
yaxunl committed rG9e67d129f079: Add requires amdgpu-registered-target for amdgpu-float16.cpp (authored by yaxunl).
Add requires amdgpu-registered-target for amdgpu-float16.cpp
Apr 30 2019, 12:06 PM
yaxunl committed rC359598: Add requires amdgpu-registered-target for amdgpu-float16.cpp.
Add requires amdgpu-registered-target for amdgpu-float16.cpp
Apr 30 2019, 12:05 PM
yaxunl committed rL359598: Add requires amdgpu-registered-target for amdgpu-float16.cpp.
Add requires amdgpu-registered-target for amdgpu-float16.cpp
Apr 30 2019, 12:05 PM
yaxunl committed rG44697012070c: AMDGPU: Enable _Float16 (authored by yaxunl).
AMDGPU: Enable _Float16
Apr 30 2019, 11:35 AM
yaxunl committed rL359594: AMDGPU: Enable _Float16.
AMDGPU: Enable _Float16
Apr 30 2019, 11:34 AM
yaxunl committed rC359594: AMDGPU: Enable _Float16.
AMDGPU: Enable _Float16
Apr 30 2019, 11:34 AM

Apr 26 2019

yaxunl accepted D61194: [HIP] Fix visibility of `__constant__` variables..

LGTM. Thanks!

Apr 26 2019, 12:26 PM · Restricted Project
yaxunl added inline comments to D61194: [HIP] Fix visibility of `__constant__` variables..
Apr 26 2019, 11:37 AM · Restricted Project