Page MenuHomePhabricator
Feed Advanced Search

Today

yaxunl updated the diff for D62738: [HIP] Support attribute hip_pinned_shadow.

rename the attribute and make it HIP only.

Tue, Jun 25, 12:35 PM
yaxunl added a comment to D63756: [AMDGPU] Increased the number of implicit argument bytes for both OpenCL and HIP (CLANG)..

can you try compile an empty HIP kernel and see what metadata is generated by backend?

Tue, Jun 25, 7:57 AM · Restricted Project

Yesterday

yaxunl added inline comments to D62738: [HIP] Support attribute hip_pinned_shadow.
Mon, Jun 24, 11:57 AM

Sun, Jun 23

yaxunl added inline comments to D62738: [HIP] Support attribute hip_pinned_shadow.
Sun, Jun 23, 2:27 PM

Fri, Jun 21

yaxunl added inline comments to D63256: [OpenCL] Split type and macro definitions into opencl-c-base.h.
Fri, Jun 21, 1:50 PM · Restricted Project, Restricted Project

Thu, Jun 20

yaxunl added a comment to D62738: [HIP] Support attribute hip_pinned_shadow.

The problem is that we do not see generic usage of
Although there is no texture specific handling on the compiler side, there is texture specific handling of symbols

Thu, Jun 20, 9:40 AM
yaxunl added a comment to D62738: [HIP] Support attribute hip_pinned_shadow.
In D62738#1538900, @tra wrote:

So, the only thing this patch appears to do is make everything with this attribute uninitialized on device side and give protected visibility.
If I understand it correctly, you're using the attribute in order to construct something that's sort of opposite of the currently used device vars with host-side shadows. Only now the real variable lives on the host side and it's device side that gets the 'shadow' copy. Do I understand it correctly?

If so, then this functionality seems to be fairly general-purpose to me. I.e. it has literally nothing to do with textures other than the name.

Perhaps it would make more sense to rename this attribute to something along the lines of 'device_referenceable' and bring its implementation to somewhat more complete shape.

By 'complete' I mean that it would be great to flesh out what can and can't use the attribute. Does it have to be a type attribute, or can it be applied to variables?
The example in the patch suggests that it's the *variable* that's affected by the attribute.

Once it works, HIP's texture support can use it for its purposes.

E.g. your example could look like this:

#define  __attribute__((device_builtin_texture_type)) __texture__

template <class T, int texType, enum hipTextureReadMode>
    struct  texture
      : public textureReference { ... }

__texture__ texture<float, 2, hipReadModeElementType> tex;

This way compiler does not need to deal with the details of texture implementation on the HIP side.
Host/device visibility of the variables is easy to see in the source (similar to device, shared, etc) and there will be no need to dig into template defined somewhere else to become aware of this.
It will be potentially useful beyond HIP-only texture implementation.

What do you think?

Thu, Jun 20, 9:38 AM

Tue, Jun 18

yaxunl updated the diff for D62738: [HIP] Support attribute hip_pinned_shadow.

Fix visibility and dso_local. Allow undefined symbol in code object. This is to allow merging the host and device symbols at run time.

Tue, Jun 18, 9:34 PM
yaxunl updated the diff for D62738: [HIP] Support attribute hip_pinned_shadow.

Revised by Artem's comments.

Tue, Jun 18, 6:44 PM

Mon, Jun 17

yaxunl accepted D62697: AMDGPU: Disable errno by default.

LGTM. Sorry for the delay.

Mon, Jun 17, 7:14 AM
yaxunl accepted D63335: [HIP] Add the interface deriving the stub name of device kernels..

LGTM. Thanks.

Mon, Jun 17, 4:24 AM · Restricted Project, Restricted Project

Fri, Jun 14

yaxunl committed rGcabce71845f4: [AMDGPU] Enable the implicit arguments for HIP (CLANG) (authored by yaxunl).
[AMDGPU] Enable the implicit arguments for HIP (CLANG)
Fri, Jun 14, 8:52 AM
yaxunl committed rL363414: [AMDGPU] Enable the implicit arguments for HIP (CLANG).
[AMDGPU] Enable the implicit arguments for HIP (CLANG)
Fri, Jun 14, 8:52 AM
yaxunl closed D62244: [AMDGPU] Enable the implicit arguments for HIP (CLANG).
Fri, Jun 14, 8:52 AM · Restricted Project, Restricted Project

Thu, Jun 13

yaxunl abandoned D59863: [HIP] Support gpu arch gfx906+sram-ecc.
Thu, Jun 13, 12:16 PM
yaxunl closed D61112: AMDGPU: Enable _Float16.
Thu, Jun 13, 12:16 PM
yaxunl accepted D62244: [AMDGPU] Enable the implicit arguments for HIP (CLANG).
Thu, Jun 13, 12:15 PM · Restricted Project, Restricted Project

Tue, Jun 11

yaxunl updated the diff for D62738: [HIP] Support attribute hip_pinned_shadow.

Revised by Artem's comments.

Tue, Jun 11, 1:42 PM
yaxunl added a comment to D62738: [HIP] Support attribute hip_pinned_shadow.

ping

Tue, Jun 11, 10:53 AM
yaxunl committed rG1362ffbc2106: Revert r344630 Disable code object version 3 for HIP toolchain. (authored by yaxunl).
Revert r344630 Disable code object version 3 for HIP toolchain.
Tue, Jun 11, 8:04 AM
yaxunl committed rL363076: Revert r344630 Disable code object version 3 for HIP toolchain..
Revert r344630 Disable code object version 3 for HIP toolchain.
Tue, Jun 11, 8:02 AM

Mon, Jun 10

yaxunl added a comment to D62739: AMDGPU: Always emit amdgpu-flat-work-group-size.

My concern is that this essentially forcing user to add amdgpu_flat_work_group_size attribute to all kernels that are executed outside of (128,256). Potentially this can cause lots of regressions for existing OpenCL apps. I am not sure if it is feasible to force all OpenCL apps to make this change. Should we do some tests before making this change?

Mon, Jun 10, 9:19 AM

Fri, Jun 7

yaxunl accepted D62696: AMDGPU: Use AMDGPU toolchain for other OSes.

LGTM

Fri, Jun 7, 7:31 AM

Thu, Jun 6

yaxunl added a comment to D62971: [HIP] Remove the assertion on match between host/device names..

LGTM. It seems no reason to assume the mangled name to be same on host and device side once anonymous types are mangled differently in host and device code. On windows, kernel has totally different names on host and device side without issues.

Thu, Jun 6, 12:50 PM · Restricted Project
yaxunl added a reviewer for D62971: [HIP] Remove the assertion on match between host/device names.: tra.
Thu, Jun 6, 12:47 PM · Restricted Project

Fri, May 31

yaxunl added inline comments to D62739: AMDGPU: Always emit amdgpu-flat-work-group-size.
Fri, May 31, 2:01 PM
yaxunl created D62738: [HIP] Support attribute hip_pinned_shadow.
Fri, May 31, 8:36 AM

Wed, May 29

yaxunl added a comment to D62603: [CUDA][HIP] Skip setting `externally_initialized` for static device variables..
In D62603#1521832, @tra wrote:

that should assume that variable is not declared with static. that's also the motivation of this patch.

cppreference defines internal linkage as 'The name can be referred to from all scopes in the current translation unit.'
The current translation unit in CUDA context gets a bit murky. On one hand host and device are compiled separately, and may conceivably be considered separate TUs. On the other hand, the fact that we mix host and device code in the same source file implies tight coupling and the users do expect them to be treated as if all host and device code in the source file is in the same TU. E.g. you may have a kernel in an anonymous namespace yet you do want to be able to launch it from the host side.

I think static __device__ globals would fall into the same category -- nominally they should not be visible outside of device-side object file, but in practice we do need to make them visible from the host side of the same TU.

Wed, May 29, 1:32 PM · Restricted Project
yaxunl added a comment to D62603: [CUDA][HIP] Skip setting `externally_initialized` for static device variables..

LGTM. The externally initializable attribute causes some optimizations disabled. For static device variables it seems reasonable to remove the externaly initializable attribute.

Wed, May 29, 9:01 AM · Restricted Project
yaxunl added a reviewer for D62603: [CUDA][HIP] Skip setting `externally_initialized` for static device variables.: tra.
Wed, May 29, 8:56 AM · Restricted Project

Tue, May 28

yaxunl committed rGdc805a49064b: Fix failure of lit test dependent-libs.cu (authored by yaxunl).
Fix failure of lit test dependent-libs.cu
Tue, May 28, 6:32 PM
yaxunl committed rC361905: Fix failure of lit test dependent-libs.cu.
Fix failure of lit test dependent-libs.cu
Tue, May 28, 6:32 PM
yaxunl committed rL361905: Fix failure of lit test dependent-libs.cu.
Fix failure of lit test dependent-libs.cu
Tue, May 28, 6:32 PM
yaxunl committed rG02afe4e077c4: [CUDA][HIP] Emit dependent libs for host only (authored by yaxunl).
[CUDA][HIP] Emit dependent libs for host only
Tue, May 28, 2:18 PM
yaxunl committed rC361880: [CUDA][HIP] Emit dependent libs for host only.
[CUDA][HIP] Emit dependent libs for host only
Tue, May 28, 2:18 PM
yaxunl committed rL361880: [CUDA][HIP] Emit dependent libs for host only.
[CUDA][HIP] Emit dependent libs for host only
Tue, May 28, 2:18 PM
yaxunl closed D62483: [CUDA][HIP] Emit dependent libs for host only.
Tue, May 28, 2:18 PM · Restricted Project

Mon, May 27

yaxunl created D62483: [CUDA][HIP] Emit dependent libs for host only.
Mon, May 27, 5:18 AM · Restricted Project
yaxunl committed rGa53d48b7f45d: [OpenCL] Fix file-scope const sampler variable for 2.0 (authored by yaxunl).
[OpenCL] Fix file-scope const sampler variable for 2.0
Mon, May 27, 4:18 AM
yaxunl committed rL361757: [OpenCL] Fix file-scope const sampler variable for 2.0.
[OpenCL] Fix file-scope const sampler variable for 2.0
Mon, May 27, 4:18 AM
yaxunl committed rC361757: [OpenCL] Fix file-scope const sampler variable for 2.0.
[OpenCL] Fix file-scope const sampler variable for 2.0
Mon, May 27, 4:18 AM
yaxunl closed D62197: [OpenCL] Fix file-scope const sampler variable for 2.0.
Mon, May 27, 4:18 AM · Restricted Project

May 22 2019

yaxunl added a comment to D62244: [AMDGPU] Enable the implicit arguments for HIP (CLANG).

Currently HIP and CUDA share the same test directories, so better put the test in CodeGenCUDA.

May 22 2019, 8:16 AM · Restricted Project, Restricted Project

May 21 2019

yaxunl updated the diff for D62197: [OpenCL] Fix file-scope const sampler variable for 2.0.

Add full diff.

May 21 2019, 7:37 AM · Restricted Project
yaxunl created D62197: [OpenCL] Fix file-scope const sampler variable for 2.0.
May 21 2019, 7:34 AM · Restricted Project

May 1 2019

yaxunl added a comment to D61396: [hip] Fix ambiguity from `>>>` of CUDA..

LGTM too. Thanks Michael for fixing this.

May 1 2019, 2:14 PM · Restricted Project, Restricted Project

Apr 30 2019

yaxunl added inline comments to D61274: [Sema][AST] Explicit visibility for OpenCL/CUDA kernels/variables.
Apr 30 2019, 3:27 PM · Restricted Project
yaxunl committed rG9e67d129f079: Add requires amdgpu-registered-target for amdgpu-float16.cpp (authored by yaxunl).
Add requires amdgpu-registered-target for amdgpu-float16.cpp
Apr 30 2019, 12:06 PM
yaxunl committed rC359598: Add requires amdgpu-registered-target for amdgpu-float16.cpp.
Add requires amdgpu-registered-target for amdgpu-float16.cpp
Apr 30 2019, 12:05 PM
yaxunl committed rL359598: Add requires amdgpu-registered-target for amdgpu-float16.cpp.
Add requires amdgpu-registered-target for amdgpu-float16.cpp
Apr 30 2019, 12:05 PM
yaxunl committed rG44697012070c: AMDGPU: Enable _Float16 (authored by yaxunl).
AMDGPU: Enable _Float16
Apr 30 2019, 11:35 AM
yaxunl committed rL359594: AMDGPU: Enable _Float16.
AMDGPU: Enable _Float16
Apr 30 2019, 11:34 AM
yaxunl committed rC359594: AMDGPU: Enable _Float16.
AMDGPU: Enable _Float16
Apr 30 2019, 11:34 AM

Apr 26 2019

yaxunl accepted D61194: [HIP] Fix visibility of `__constant__` variables..

LGTM. Thanks!

Apr 26 2019, 12:26 PM · Restricted Project
yaxunl added inline comments to D61194: [HIP] Fix visibility of `__constant__` variables..
Apr 26 2019, 11:37 AM · Restricted Project

Apr 24 2019

yaxunl created D61112: AMDGPU: Enable _Float16.
Apr 24 2019, 8:08 PM

Apr 12 2019

yaxunl committed rG7bd8c37b1773: [HIP] Use -mlink-builtin-bitcode to link device library (authored by yaxunl).
[HIP] Use -mlink-builtin-bitcode to link device library
Apr 12 2019, 9:23 AM
yaxunl committed rC358290: [HIP] Use -mlink-builtin-bitcode to link device library.
[HIP] Use -mlink-builtin-bitcode to link device library
Apr 12 2019, 9:22 AM
yaxunl committed rL358290: [HIP] Use -mlink-builtin-bitcode to link device library.
[HIP] Use -mlink-builtin-bitcode to link device library
Apr 12 2019, 9:22 AM
yaxunl closed D60513: [HIP] Use -mlink-builtin-bitcode to link device library.
Apr 12 2019, 9:22 AM · Restricted Project
yaxunl updated the summary of D60620: [HIP] Support -offloading-target-id.
Apr 12 2019, 8:44 AM
yaxunl created D60620: [HIP] Support -offloading-target-id.
Apr 12 2019, 8:38 AM

Apr 10 2019

yaxunl created D60513: [HIP] Use -mlink-builtin-bitcode to link device library.
Apr 10 2019, 8:04 AM · Restricted Project

Apr 3 2019

yaxunl added inline comments to D59321: WIP: AMDGPU: Teach toolchain to link rocm device libs.
Apr 3 2019, 11:34 AM
yaxunl added reviewers for D59321: WIP: AMDGPU: Teach toolchain to link rocm device libs: b-sumner, t-tye.
Apr 3 2019, 11:23 AM

Apr 2 2019

yaxunl added inline comments to D60141: [HIP-Clang] Fat binary should not be produced for non GPU code.
Apr 2 2019, 1:32 PM · Restricted Project, Restricted Project
yaxunl accepted D60141: [HIP-Clang] Fat binary should not be produced for non GPU code.

LGTM. Thanks!

Apr 2 2019, 12:57 PM · Restricted Project, Restricted Project
yaxunl added a reviewer for D60141: [HIP-Clang] Fat binary should not be produced for non GPU code: tra.
Apr 2 2019, 11:04 AM · Restricted Project, Restricted Project

Mar 27 2019

yaxunl added inline comments to D59863: [HIP] Support gpu arch gfx906+sram-ecc.
Mar 27 2019, 11:05 AM

Mar 26 2019

yaxunl created D59863: [HIP] Support gpu arch gfx906+sram-ecc.
Mar 26 2019, 7:51 PM

Mar 21 2019

yaxunl created D59647: [CUDA][HIP] Warn shared var initialization.
Mar 21 2019, 8:39 AM · Restricted Project

Mar 15 2019

yaxunl added a comment to D59316: [HIP-Clang] propagate -mllvm options to opt and llc.
Mar 15 2019, 2:43 PM · Restricted Project
yaxunl added a comment to D59316: [HIP-Clang] propagate -mllvm options to opt and llc.

Here we are looking at the code which emulates a "linker" for HIP toolchain. The offloading action builder requests the offloading toolchain have a linker, but amdgpu does not have a real linker (ISA level linker), so we have to emulate that. If we have an ISA level linker we can get rid of all these stuff, but I don't think that will happen in short time.

This isn't really true. We do run lld to link the final executable. It also doesn't change that opt and llc should never be involved in the process

Can lld do ISA level linking? That is, one device function in one object file calls another device function in a different object file, and we let lld link them together?

We can't link multiple objects, but we do need to link the single object with lld. The relocations even for functions in the same module are 0 until lld fixes them up. Do we have execution tests for function calls using HIP? Since it looks like lld isn't getting used here, I suspect they aren't workingh

Mar 15 2019, 12:35 PM · Restricted Project
yaxunl added a comment to D59316: [HIP-Clang] propagate -mllvm options to opt and llc.

Here we are looking at the code which emulates a "linker" for HIP toolchain. The offloading action builder requests the offloading toolchain have a linker, but amdgpu does not have a real linker (ISA level linker), so we have to emulate that. If we have an ISA level linker we can get rid of all these stuff, but I don't think that will happen in short time.

This isn't really true. We do run lld to link the final executable. It also doesn't change that opt and llc should never be involved in the process

Mar 15 2019, 12:14 PM · Restricted Project

Mar 14 2019

yaxunl accepted D59316: [HIP-Clang] propagate -mllvm options to opt and llc.

LGTM. Thanks!

Mar 14 2019, 11:02 AM · Restricted Project
yaxunl added a comment to D59316: [HIP-Clang] propagate -mllvm options to opt and llc.

Here we are looking at the code which emulates a "linker" for HIP toolchain. The offloading action builder requests the offloading toolchain have a linker, but amdgpu does not have a real linker (ISA level linker), so we have to emulate that. If we have an ISA level linker we can get rid of all these stuff, but I don't think that will happen in short time.

Mar 14 2019, 11:02 AM · Restricted Project

Mar 5 2019

yaxunl committed rGc5be267003ef: [CUDA][HIP][Sema] Fix template kernel with function as template parameter (authored by yaxunl).
[CUDA][HIP][Sema] Fix template kernel with function as template parameter
Mar 5 2019, 10:20 AM
yaxunl committed rL355421: [CUDA][HIP][Sema] Fix template kernel with function as template parameter.
[CUDA][HIP][Sema] Fix template kernel with function as template parameter
Mar 5 2019, 10:20 AM
yaxunl committed rC355421: [CUDA][HIP][Sema] Fix template kernel with function as template parameter.
[CUDA][HIP][Sema] Fix template kernel with function as template parameter
Mar 5 2019, 10:20 AM
yaxunl closed D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter.
Mar 5 2019, 10:19 AM · Restricted Project
yaxunl committed rG071949c3afae: Allow bundle size to be 0 in clang-offload-bundler (authored by yaxunl).
Allow bundle size to be 0 in clang-offload-bundler
Mar 5 2019, 9:52 AM
yaxunl committed rC355419: Allow bundle size to be 0 in clang-offload-bundler.
Allow bundle size to be 0 in clang-offload-bundler
Mar 5 2019, 9:52 AM
yaxunl committed rL355419: Allow bundle size to be 0 in clang-offload-bundler.
Allow bundle size to be 0 in clang-offload-bundler
Mar 5 2019, 9:52 AM
yaxunl closed D58057: Allow bundle size to be 0 in clang-offload-bundler.
Mar 5 2019, 9:52 AM · Restricted Project
yaxunl committed rGab851939fc6b: [HIP] Do not unbundle object files for -fno-gpu-rdc (authored by yaxunl).
[HIP] Do not unbundle object files for -fno-gpu-rdc
Mar 5 2019, 8:09 AM
yaxunl committed rC355410: [HIP] Do not unbundle object files for -fno-gpu-rdc.
[HIP] Do not unbundle object files for -fno-gpu-rdc
Mar 5 2019, 8:09 AM
yaxunl committed rL355410: [HIP] Do not unbundle object files for -fno-gpu-rdc.
[HIP] Do not unbundle object files for -fno-gpu-rdc
Mar 5 2019, 8:09 AM
yaxunl closed D58917: [HIP] Do not unbundle object files for -fno-gpu-rdc.
Mar 5 2019, 8:09 AM · Restricted Project, Restricted Project
yaxunl added a reviewer for D58057: Allow bundle size to be 0 in clang-offload-bundler: ABataev.
Mar 5 2019, 7:02 AM · Restricted Project
yaxunl added a comment to D58057: Allow bundle size to be 0 in clang-offload-bundler.

Alexey, could you please also review this patch? Thanks.

Mar 5 2019, 7:02 AM · Restricted Project
yaxunl added inline comments to D58917: [HIP] Do not unbundle object files for -fno-gpu-rdc.
Mar 5 2019, 7:02 AM · Restricted Project, Restricted Project

Mar 4 2019

yaxunl created D58917: [HIP] Do not unbundle object files for -fno-gpu-rdc.
Mar 4 2019, 12:01 PM · Restricted Project, Restricted Project

Feb 28 2019

yaxunl committed rGfd2c5c05fc8f: Partial revert of r353952: [HIP] Handle compile -m options and propagate into… (authored by yaxunl).
Partial revert of r353952: [HIP] Handle compile -m options and propagate into…
Feb 28 2019, 9:08 AM
yaxunl committed rC355106: Partial revert of r353952: [HIP] Handle compile -m options and propagate into….
Partial revert of r353952: [HIP] Handle compile -m options and propagate into…
Feb 28 2019, 9:08 AM
yaxunl committed rL355106: Partial revert of r353952: [HIP] Handle compile -m options and propagate into….
Partial revert of r353952: [HIP] Handle compile -m options and propagate into…
Feb 28 2019, 9:08 AM

Feb 27 2019

yaxunl committed rG785cbd850b76: [NFC] minor revision of r354929 [CUDA][HIP] Check calling convention based on… (authored by yaxunl).
[NFC] minor revision of r354929 [CUDA][HIP] Check calling convention based on…
Feb 27 2019, 7:48 AM
yaxunl committed rL354990: [NFC] minor revision of r354929 [CUDA][HIP] Check calling convention based on….
[NFC] minor revision of r354929 [CUDA][HIP] Check calling convention based on…
Feb 27 2019, 7:45 AM
yaxunl committed rC354990: [NFC] minor revision of r354929 [CUDA][HIP] Check calling convention based on….
[NFC] minor revision of r354929 [CUDA][HIP] Check calling convention based on…
Feb 27 2019, 7:45 AM
yaxunl added inline comments to D57716: [CUDA][HIP] Check calling convention based on function target.
Feb 27 2019, 7:11 AM · Restricted Project
yaxunl added inline comments to D57716: [CUDA][HIP] Check calling convention based on function target.
Feb 27 2019, 6:57 AM · Restricted Project