
tra (Artem Belevich)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 8 2015, 1:53 PM (315 w, 4 d)

Recent Activity

Today

tra accepted D95329: [llvm-link] Fix crash when materializing appending global.
Mon, Jan 25, 5:47 PM · Restricted Project
tra accepted D95299: Fix truncated __OPENMP_NVPTX__ preprocessor condition.

LGTM

Mon, Jan 25, 10:40 AM · Restricted Project
tra added inline comments to D95329: [llvm-link] Fix crash when materializing appending global.
Mon, Jan 25, 10:36 AM · Restricted Project
tra added inline comments to D95329: [llvm-link] Fix crash when materializing appending global.
Mon, Jan 25, 9:22 AM · Restricted Project

Thu, Jan 21

tra committed rG127091bfd5ed: [CUDA] Normalize handling of defauled dtor. (authored by tra).
[CUDA] Normalize handling of defauled dtor.
Thu, Jan 21, 10:49 AM
tra closed D94732: [CUDA] Normalize handling of defauled dtor..
Thu, Jan 21, 10:49 AM · Restricted Project
tra accepted D94814: [HIP] Support `__managed__` attribute.

LGTM.

Thu, Jan 21, 9:55 AM · Restricted Project, Restricted Project

Wed, Jan 20

tra added inline comments to D95104: [OpenMP] Replace `cuda.h` with (forward) declarations in the offload plugin.
Wed, Jan 20, 4:30 PM · Restricted Project
tra updated the diff for D94732: [CUDA] Normalize handling of defauled dtor..

Added a test for the corner case Richard has pointed out in the comments.

Wed, Jan 20, 4:13 PM · Restricted Project
tra updated the diff for D94732: [CUDA] Normalize handling of defauled dtor..

Removed unneeded changes.

Wed, Jan 20, 2:34 PM · Restricted Project
tra added inline comments to D94732: [CUDA] Normalize handling of defauled dtor..
Wed, Jan 20, 2:32 PM · Restricted Project
tra added inline comments to D85223: [CUDA][HIP] Support accessing static device variable in host code for -fgpu-rdc.
Wed, Jan 20, 11:43 AM
tra added inline comments to D94732: [CUDA] Normalize handling of defauled dtor..
Wed, Jan 20, 11:05 AM · Restricted Project
tra updated the diff for D94732: [CUDA] Normalize handling of defauled dtor..

Addressed Richard's comments.

Wed, Jan 20, 11:04 AM · Restricted Project
Herald added a reviewer for D95007: [CUDA][HIP] Add -fuse-cuid: jansvoboda11.
Wed, Jan 20, 10:47 AM

Tue, Jan 19

tra added a comment to D94884: [Clang][OpenMP] Include header for CUDA builtin vars into OpenMP wrapper header.

I won't object too strongly, as ultimately I don't care about cuda, but I view intertwining the two implementations as technical debt.

Tue, Jan 19, 2:15 PM · Restricted Project
tra accepted D94884: [Clang][OpenMP] Include header for CUDA builtin vars into OpenMP wrapper header.

LGTM for __clang_cuda_builtin_vars.h.

Tue, Jan 19, 1:24 PM · Restricted Project
tra added a comment to D94814: [HIP] Support `__managed__` attribute.

Presumably, __managed__ variables would have to be memory-mapped into the host address space.

Tue, Jan 19, 1:20 PM · Restricted Project, Restricted Project
tra added inline comments to D94884: [Clang][OpenMP] Include header for CUDA builtin vars into OpenMP wrapper header.
Tue, Jan 19, 12:43 PM · Restricted Project
tra added a comment to D85223: [CUDA][HIP] Support accessing static device variable in host code for -fgpu-rdc.

I'd propose splitting the patch into two: one that adds CUID, and another that changes the way we handle static vars.
CUID is useful on its own and is relatively uncontroversial.

Tue, Jan 19, 11:23 AM
tra added inline comments to D94884: [Clang][OpenMP] Include header for CUDA builtin vars into OpenMP wrapper header.
Tue, Jan 19, 10:54 AM · Restricted Project
tra accepted D94643: [llvm-link] Improve link time for bitcode archives [NFC].
Tue, Jan 19, 10:28 AM · Restricted Project

Thu, Jan 14

tra requested review of D94732: [CUDA] Normalize handling of defauled dtor..
Thu, Jan 14, 4:37 PM · Restricted Project
tra added a comment to D94643: [llvm-link] Improve link time for bitcode archives [NFC].

Do you have any measurements showing actual impact on performance or memory consumption?

Thu, Jan 14, 10:51 AM · Restricted Project
tra added a comment to D93062: [HIP] Add signbit(long double) decl.

Go ahead.

Thu, Jan 14, 10:00 AM · Restricted Project

Wed, Jan 13

tra added a comment to D94337: Add cuda header type for cuh files.

... The goal of __clang_cuda_standalone_defs.h is to make it possible to parse CUDA sources at all w/o having to rely on CUDA SDK. ...

Should __clang_cuda_standalone_defs.h depend on the arch? For example, __match_all_sync doesn't exist in sm_35 but does exist in sm_75.
For tooling this might not matter because the host cuda includes don't depend on the arch.

Wed, Jan 13, 1:25 PM · Restricted Project
tra added a comment to D93062: [HIP] Add signbit(long double) decl.

For CUDA we have tests in the test-suite (https://github.com/llvm/llvm-test-suite/tree/main/External/CUDA) and a handful of buildbots running them (e.g. http://lab.llvm.org:8011/#/builders/55).
AMD should probably set up some public build/test bots for HIP, too.
In this case, manual testing and rollback if something breaks is about all we can do.

Wed, Jan 13, 12:05 PM · Restricted Project
tra added a comment to D94337: Add cuda header type for cuh files.
In D94337#2491825, @tra wrote:

'Works' is not exactly the same as 'works correctly'. This example makes a() look like a regular host function, instead of the kernel, and that affects how the rest of the TU gets parsed.
I.e. you'll have further errors if somewhere down below the file has a<<<1,1>>>(). Similar story with ignoring __host__ and __device__ attributes -- that may lead to various overload resolution errors, or reporting conflicting redeclarations/redefinitions for the perfectly valid host/device function overloads. The list goes on.

No, this part is definitely working, the full set of SDK headers is included. I have confirmed this by compiling a .cu and a .cuh file with -E and checking that the output is identical. Further, I have confirmed that global<<<_, _>>>() builds, that __host__ only functions can't be used on the device, and that __device__ functions can't be used on the host.

Wed, Jan 13, 10:57 AM · Restricted Project

Mon, Jan 11

tra added a comment to D94337: Add cuda header type for cuh files.

For example consider the following header:

#pragma once

__global__ void a() {
  unsigned block_idx = blockIdx.x;
  unsigned thread_idx = threadIdx.x;

  __shfl_down_sync(1, 2, 1);
}

When saved as a .cuh and compiled as clang++ file.cuh -fsyntax-only --cuda-gpu-arch=sm_75 (using clang++ built from this commit), this works fine other than an invalid diagnostic for the #pragma once.

Mon, Jan 11, 5:37 PM · Restricted Project
tra added a comment to D94337: Add cuda header type for cuh files.

My primary goal for this change was to allow for language servers and other tooling to properly handle cuda header files. From my understanding the way that language servers handle c++ header files is by compiling them with -xc++-header and -fsyntax-only. This is certainly true for ccls and it seems to be true for clangd.
So this can be accomplished without actually being able to produce preprocessed output for cuda headers - it only requires handling the "-fsyntax-only" use case.

Mon, Jan 11, 2:52 PM · Restricted Project
tra added a comment to D94337: Add cuda header type for cuh files.

This adds a cuda header type with file extension "cuh". The output type file
extension is "cuhi" - not sure if this is a good choice. This allows
language servers to properly handle cuh files without additional arguments.

Mon, Jan 11, 11:02 AM · Restricted Project

Thu, Jan 7

tra added a comment to D93930: [NewPM][NVPTX] Port NVPTX opt passes.

LGTM

Thu, Jan 7, 3:15 PM · Restricted Project
tra accepted D93638: [hip] Enable HIP compilation with `<complex`> on MSVC..

Forgot that C functions can be overloaded in Clang with the overloadable
extension. With that, we don't need to mark functions from <ymath.h> as HD.
Instead, we can provide their device-side implementations directly.

Thu, Jan 7, 10:20 AM · Restricted Project

Wed, Jan 6

tra added inline comments to D93638: [hip] Enable HIP compilation with `<complex`> on MSVC..
Wed, Jan 6, 4:03 PM · Restricted Project
tra added inline comments to D93638: [hip] Enable HIP compilation with `<complex`> on MSVC..
Wed, Jan 6, 2:34 PM · Restricted Project
tra accepted D92954: [clang-offload-bundler] Add option -list.
Wed, Jan 6, 12:37 PM · Restricted Project
tra accepted D93587: [hip] Fix HIP version parsing..

LGTM overall.

Wed, Jan 6, 11:23 AM · Restricted Project
tra added inline comments to D93638: [hip] Enable HIP compilation with `<complex`> on MSVC..
Wed, Jan 6, 11:09 AM · Restricted Project
tra accepted D94123: [NVPTX] Fix debugging information being added to NVPTX target if remarks are enabled.

LGTM.

Wed, Jan 6, 10:37 AM · Restricted Project, Restricted Project

Tue, Jan 5

tra added a comment to D94123: [NVPTX] Fix debugging information being added to NVPTX target if remarks are enabled.

Okay, so without that flag Clang will not create debug symbols in the PTX assembly output.

Tue, Jan 5, 4:49 PM · Restricted Project, Restricted Project
tra added a comment to D94123: [NVPTX] Fix debugging information being added to NVPTX target if remarks are enabled.

There's --cuda-noopt-device-debug option specifically to allow compiling GPU code with full debug info. Clang will generate optimized PTX, but ptxas optimizations will be disabled.

Tue, Jan 5, 2:50 PM · Restricted Project, Restricted Project
tra added inline comments to D92434: [NFC][AMDGPU] AMDGPU code object V4 ABI documentation.
Tue, Jan 5, 2:20 PM · Restricted Project, Restricted Project
tra added inline comments to D93638: [hip] Enable HIP compilation with `<complex`> on MSVC..
Tue, Jan 5, 1:01 PM · Restricted Project
tra added inline comments to D93587: [hip] Fix HIP version parsing..
Tue, Jan 5, 12:37 PM · Restricted Project
tra added inline comments to D92954: [clang-offload-bundler] Add option -list.
Tue, Jan 5, 12:22 PM · Restricted Project
tra accepted D92535: [llvm-link] fix linker behavior when linking archives with --only-needed option.

LGTM.

Tue, Jan 5, 9:35 AM · Restricted Project

Dec 15 2020

tra added a comment to D92535: [llvm-link] fix linker behavior when linking archives with --only-needed option.

@jdoerfert -- do you happen to know how archives with bitcode are used by OpenMP? Does OpenMP ever link just the archive alone w/o any other bitcode files?

Dec 15 2020, 1:11 PM · Restricted Project
tra added a comment to D92535: [llvm-link] fix linker behavior when linking archives with --only-needed option.

The change looks OK. But we should still wait for @jsjodin to confirm that the first-file-is-different behavior for archive files is unintentional.

Dec 15 2020, 12:43 PM · Restricted Project

Dec 14 2020

tra accepted D92720: [HIP] unbundle bundled preprocessor output.

Output of -E for HIP combined host/device compilation is plain text. It has C++ comments inserted between the preprocessor outputs for the host and the different GPU archs. The C++ comments follow the format of clang-offload-bundler bundled text files, so clang-offload-bundler is able to unbundle it.

Dec 14 2020, 3:03 PM · Restricted Project
tra accepted D93068: [clang-offload-bundler] Add option -allow-missing-bundles.

The patch could use an OK from the OMP folks, considering that we've changed the way the offload bundler is invoked for OMP.

Dec 14 2020, 2:37 PM · Restricted Project
tra committed rG0936655bac78: [CUDA] Do not diagnose host/device variable access in dependent types. (authored by tra).
[CUDA] Do not diagnose host/device variable access in dependent types.
Dec 14 2020, 11:56 AM
tra closed D92893: [CUDA] Do not diagnose host/device variable access in dependent types..
Dec 14 2020, 11:56 AM · Restricted Project
tra updated the diff for D92893: [CUDA] Do not diagnose host/device variable access in dependent types..

Use device in the test case.

Dec 14 2020, 11:09 AM · Restricted Project
tra added inline comments to D93181: [NFC][AMDGPU] Reformat AMD GPU targets in cuda.cpp.
Dec 14 2020, 10:38 AM · Restricted Project

Dec 10 2020

tra updated subscribers of D92893: [CUDA] Do not diagnose host/device variable access in dependent types..

My first variant of the patch only helped with some cases when the surface/texture attribute type was used.
Trying to reduce a real-world failure produced an example, now added as the test case, which was still failing with that patch applied.

Dec 10 2020, 4:49 PM · Restricted Project
tra updated the diff for D92893: [CUDA] Do not diagnose host/device variable access in dependent types..

Found another corner case (reference within a template with the surface/texture
attribute) and figured out a better fix.

Dec 10 2020, 4:41 PM · Restricted Project
tra accepted D93062: [HIP] Add signbit(long double) decl.
Dec 10 2020, 2:58 PM · Restricted Project
tra added a comment to D93068: [clang-offload-bundler] Add option -allow-missing-bundles.
Dec 10 2020, 2:56 PM · Restricted Project
tra added inline comments to D92431: [SROA] Remove Dead Instructions while creating speculative instructions.
Dec 10 2020, 11:08 AM · Restricted Project
tra added inline comments to D92954: [clang-offload-bundler] Add option -list.
Dec 10 2020, 10:50 AM · Restricted Project

Dec 9 2020

tra accepted D92918: [llvm-link][NFC] Minor cleanup.

And, by the way, this function is called as a static member (i.e. Linker::linkModules()) everywhere in LLVM sources except llvm-link.cpp:

Dec 9 2020, 5:57 PM · Restricted Project
tra committed rG016e4ebfde28: [DWARF] Allow toolchain to adjust specified DWARF version. (authored by tra).
[DWARF] Allow toolchain to adjust specified DWARF version.
Dec 9 2020, 4:35 PM
tra closed D92617: [DWARF] Allow toolchain to adjust specified DWARF version..
Dec 9 2020, 4:35 PM · Restricted Project
tra added inline comments to D92617: [DWARF] Allow toolchain to adjust specified DWARF version..
Dec 9 2020, 12:57 PM · Restricted Project
tra updated the diff for D92617: [DWARF] Allow toolchain to adjust specified DWARF version..

Reorganized tests for unsupported debug options & dwarf version clamping.

Dec 9 2020, 12:53 PM · Restricted Project
tra added inline comments to D92954: [clang-offload-bundler] Add option -list.
Dec 9 2020, 11:45 AM · Restricted Project
tra added a comment to D92535: [llvm-link] fix linker behavior when linking archives with --only-needed option.

Sure. I believe llvm-link works incorrectly when linking --only-needed symbols from archives with bitcode files. As implemented now, llvm-link handles an archive by first linking the archive's modules together into an intermediate module and then trying to link the required symbols from that intermediate module into the result. The problem is that the archive modules are also linked together with the --only-needed flag, so the intermediate module always ends up empty: archive linking starts from scratch, so there are no dependencies and nothing gets imported into it.
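
To make the failure mode in this comment concrete, here is a rough toy model in C++. It is only a sketch under stated assumptions: ToyModule, linkOnlyNeeded, linkArchiveOnlyNeeded and linkArchiveAll are hypothetical names invented for illustration, and none of this is LLVM code or the llvm-link implementation. A "module" is reduced to a map from each defined symbol to the symbols that definition references.

```cpp
#include <map>
#include <set>
#include <string>
#include <vector>

// Toy stand-in for a bitcode module (hypothetical, not an LLVM type):
// each defined symbol maps to the set of symbols it references.
using ToyModule = std::map<std::string, std::set<std::string>>;

// Assumed model of --only-needed: import from Src only the definitions that
// Dest references but does not define yet, following references transitively.
void linkOnlyNeeded(ToyModule &Dest, const ToyModule &Src) {
  bool Changed = true;
  while (Changed) {
    Changed = false;
    // Collect symbols Dest refers to but does not define.
    std::set<std::string> Needed;
    for (const auto &Def : Dest)
      for (const auto &Ref : Def.second)
        if (!Dest.count(Ref))
          Needed.insert(Ref);
    // Pull in whichever of those Src can provide.
    for (const auto &Name : Needed) {
      auto It = Src.find(Name);
      if (It != Src.end()) {
        Dest.insert(*It);
        Changed = true;
      }
    }
  }
}

// Models the behavior described in the comment: archive members are merged
// into an empty intermediate module, also with only-needed semantics.
ToyModule linkArchiveOnlyNeeded(const std::vector<ToyModule> &Members) {
  ToyModule Intermediate; // empty, so it references nothing
  for (const auto &M : Members)
    linkOnlyNeeded(Intermediate, M); // nothing is ever "needed"
  return Intermediate;
}

// Plain merge of all archive members, shown only for comparison.
ToyModule linkArchiveAll(const std::vector<ToyModule> &Members) {
  ToyModule Intermediate;
  for (const auto &M : Members)
    Intermediate.insert(M.begin(), M.end());
  return Intermediate;
}
```

In this model, an archive with members defining foo (which references bar) and bar yields an empty result from linkArchiveOnlyNeeded, so a main module that references foo cannot resolve it. Merging the members with linkArchiveAll first lets the final only-needed step import both foo and bar.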

Dec 9 2020, 9:58 AM · Restricted Project
tra added a comment to D92918: [llvm-link][NFC] Minor cleanup.

llvm::Linker::linkModules() is a static member, so there is no need
to pass reference to llvm::Linker instance to loadArFile() function.
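
As an illustration of the cleanup described here, the sketch below uses a hypothetical ToyLinker in place of llvm::Linker. The loadArFileBefore and loadArFileAfter signatures are invented for the example and do not reproduce the real loadArFile() in llvm-link.cpp; the "false means success" return convention is also an assumption.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for llvm::Linker; only the shape of the API matters.
struct ToyLinker {
  // Static member: it needs no ToyLinker instance to be called.
  static bool linkModules(std::vector<std::string> &Dest,
                          const std::vector<std::string> &Src) {
    Dest.insert(Dest.end(), Src.begin(), Src.end());
    return false; // assumed convention: false means success
  }
};

// Before: an instance is passed in only so the static member can be called
// through it. This is legal C++, but the instance carries no state that matters.
bool loadArFileBefore(ToyLinker &L, std::vector<std::string> &Result,
                      const std::vector<std::string> &Members) {
  return L.linkModules(Result, Members);
}

// After: call through the class name and drop the extra parameter.
bool loadArFileAfter(std::vector<std::string> &Result,
                     const std::vector<std::string> &Members) {
  return ToyLinker::linkModules(Result, Members);
}
```

Both versions do the same work; the second simply makes it visible at the call site that no linker state is involved.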

Dec 9 2020, 9:29 AM · Restricted Project

Dec 8 2020

tra added inline comments to D92617: [DWARF] Allow toolchain to adjust specified DWARF version..
Dec 8 2020, 4:38 PM · Restricted Project
tra added a comment to D91281: [CUDA][HIP] Diagnose reference of host variable.
In D91281#2441147, @tra wrote:

I think isCUDADeviceBuiltinTextureType has a problem handling texture refs within templates.

Dec 8 2020, 4:11 PM · Restricted Project
tra requested review of D92893: [CUDA] Do not diagnose host/device variable access in dependent types..
Dec 8 2020, 3:17 PM · Restricted Project
tra added a comment to D91281: [CUDA][HIP] Diagnose reference of host variable.

It appears that we need to add special handling for texture/surface references. Nominally they are host-side objects, but they are accessed/used from device functions as far as Sema is concerned.

Dec 8 2020, 2:26 PM · Restricted Project
tra added a comment to D92535: [llvm-link] fix linker behavior when linking archives with --only-needed option.

The patch description describes what the patch does, but does not tell us much about the problem it is supposed to fix.
Could you give us more details on why the patch is needed?

Dec 8 2020, 9:27 AM · Restricted Project

Dec 7 2020

tra updated the diff for D92617: [DWARF] Allow toolchain to adjust specified DWARF version..

Adjusted openmp test for the changed -gembed-source warning.

Dec 7 2020, 5:12 PM · Restricted Project
tra updated the diff for D92617: [DWARF] Allow toolchain to adjust specified DWARF version..

Addressed comments.

Dec 7 2020, 5:01 PM · Restricted Project
tra updated the diff for D92617: [DWARF] Allow toolchain to adjust specified DWARF version..

Updated to address the comments. PTAL.

Dec 7 2020, 2:09 PM · Restricted Project
tra added a comment to D92720: [HIP] unbundle bundled preprocessor output.

-E by default prints preprocessed output to stdout. CUDA will print preprocessed output from all subcompilations. What does HIP do in this case? Printing out the bundle is probably not what the user will expect.
IMO preprocessed output is frequently used as a debugging tool, so it's important for users to be able to read it. Bundled output is rather cumbersome to deal with. It's possible to manually unbundle it, but the tool is not documented well and it's not particularly suitable for human use.

Dec 7 2020, 10:55 AM · Restricted Project

Dec 4 2020

tra added inline comments to D92617: [DWARF] Allow toolchain to adjust specified DWARF version..
Dec 4 2020, 1:50 PM · Restricted Project
tra requested review of D92684: [CUDA, tet-suite] enable testing with C++17 and C++20.
Dec 4 2020, 1:12 PM
tra committed rG43267929423b: [CUDA] Another attempt to fix early inclusion of <new> from libstdc++ (authored by tra).
[CUDA] Another attempt to fix early inclusion of <new> from libstdc++
Dec 4 2020, 12:04 PM
tra closed D91807: [CUDA] Unbreak CUDA compilation with -std=c++20.
Dec 4 2020, 12:04 PM · Restricted Project
tra updated the summary of D91807: [CUDA] Unbreak CUDA compilation with -std=c++20.
Dec 4 2020, 12:03 PM · Restricted Project
tra updated the diff for D91807: [CUDA] Unbreak CUDA compilation with -std=c++20.

Removed forgotten pop_macro(DEVICE)

Dec 4 2020, 11:46 AM · Restricted Project
tra requested review of D91807: [CUDA] Unbreak CUDA compilation with -std=c++20.

@jlebar PTAL. The previous fix did not quite fix the problem. __device__ operator new/delete was still not defined, but the issue was not reported due to deferred diags if the functions were not used.

Dec 4 2020, 11:37 AM · Restricted Project
tra updated the diff for D91807: [CUDA] Unbreak CUDA compilation with -std=c++20.

Better fix for the early inclusion of <new>

Dec 4 2020, 11:35 AM · Restricted Project
tra reopened D91807: [CUDA] Unbreak CUDA compilation with -std=c++20.

Reopening the tracker for a better fix.

Dec 4 2020, 11:33 AM · Restricted Project
tra added inline comments to D92617: [DWARF] Allow toolchain to adjust specified DWARF version..
Dec 4 2020, 11:08 AM · Restricted Project
tra updated the diff for D92617: [DWARF] Allow toolchain to adjust specified DWARF version..

Simplified dwarf version clamping.

Dec 4 2020, 11:07 AM · Restricted Project

Dec 3 2020

tra added inline comments to D92617: [DWARF] Allow toolchain to adjust specified DWARF version..
Dec 3 2020, 8:12 PM · Restricted Project
tra added inline comments to D92617: [DWARF] Allow toolchain to adjust specified DWARF version..
Dec 3 2020, 8:01 PM · Restricted Project
tra added inline comments to D92617: [DWARF] Allow toolchain to adjust specified DWARF version..
Dec 3 2020, 4:48 PM · Restricted Project
tra updated the diff for D92617: [DWARF] Allow toolchain to adjust specified DWARF version..

Updated according to David's feedback.

Dec 3 2020, 4:46 PM · Restricted Project
tra requested review of D92617: [DWARF] Allow toolchain to adjust specified DWARF version..
Dec 3 2020, 3:57 PM · Restricted Project
tra added a comment to D92363: [HIP] Warn no --offload-arch option.
Dec 3 2020, 12:17 PM

Dec 2 2020

tra added a comment to D80450: [CUDA][HIP] Fix HD function resolution.

Also, naming. -ffix-overload-resolution is rather non-specific. I didn't mean to use it literally. The problem is that I can't think of a good descriptive name for what we do here. -fgpu-fix-wrong-side-overloads ? Something else?

How about -fgpu-exclude-wrong-side-overloads? This patch always excludes wrong-side overloads, whereas previously they were excluded only if there were same-side overloads.
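
The difference between the two behaviors can be sketched with a deliberately simplified C++ model. This is an assumption-laden toy: Side, Candidate, excludeWrongSide and excludeIfSameSideExists are hypothetical names, the model ignores __host__ __device__ functions and all ordinary overload ranking, and it is not how Clang's Sema is written.

```cpp
#include <algorithm>
#include <string>
#include <vector>

enum class Side { Host, Device };

// Toy overload candidate: a label plus the side it is compiled for.
struct Candidate {
  std::string Name;
  Side Target;
};

// New behavior as described: wrong-side candidates are always dropped.
std::vector<Candidate> excludeWrongSide(std::vector<Candidate> Cands,
                                        Side Caller) {
  Cands.erase(std::remove_if(Cands.begin(), Cands.end(),
                             [Caller](const Candidate &C) {
                               return C.Target != Caller;
                             }),
              Cands.end());
  return Cands;
}

// Previous behavior as described: wrong-side candidates are dropped only
// when at least one same-side candidate exists.
std::vector<Candidate> excludeIfSameSideExists(std::vector<Candidate> Cands,
                                               Side Caller) {
  bool HasSameSide = std::any_of(
      Cands.begin(), Cands.end(),
      [Caller](const Candidate &C) { return C.Target == Caller; });
  if (!HasSameSide)
    return Cands; // wrong-side candidates stay in the set
  return excludeWrongSide(std::move(Cands), Caller);
}
```

The two functions differ only when every candidate is on the wrong side: the old rule keeps them all in the candidate set, while the new rule leaves the set empty.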

Dec 2 2020, 10:12 AM · Restricted Project

Dec 1 2020

tra accepted D80450: [CUDA][HIP] Fix HD function resolution.

I'd suggest adding more details on the background of this change to the commit log (point to the comment in the isBetterOverloadCandidate ?) and outline the intention to enable the new way to do overloading after some soak time.

Dec 1 2020, 1:02 PM · Restricted Project
tra accepted D91088: [CUDA][HIP] Fix capturing reference to host variable.
Dec 1 2020, 12:40 PM · Restricted Project
tra added a comment to D92363: [HIP] Warn no --offload-arch option.

While I agree that the default GPU choice is not likely to be correct, or usable, for everyone, the warning seems to be a half-measure.
If the default is not usable, then it should not be the default. If it's usable, then we don't need a warning.

Dec 1 2020, 12:23 PM

Nov 30 2020

tra added a comment to D91088: [CUDA][HIP] Fix capturing reference to host variable.

LGTM in general.

Nov 30 2020, 1:23 PM · Restricted Project
tra added inline comments to D80450: [CUDA][HIP] Fix HD function resolution.
Nov 30 2020, 1:04 PM · Restricted Project