tra (Artem Belevich)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 8 2015, 1:53 PM (157 w, 5 d)

Recent Activity

Today

tra added a comment to D41788: [DeclPrinter] Fix two cases that crash clang -ast-print..

@bkramer Can you take a look at the patch?

Tue, Jan 16, 5:09 PM
tra added a reviewer for D41788: [DeclPrinter] Fix two cases that crash clang -ast-print.: bkramer.
Tue, Jan 16, 5:08 PM
tra added a comment to D41827: [DEBUG] Initial adaptation of NVPTX target for debug info emission..

Looks OK to me. That said, I have little clue about DWARF, so I'll defer to echristo@ as it's his domain.

Tue, Jan 16, 12:01 PM

Fri, Jan 12

tra added a comment to D41788: [DeclPrinter] Fix two cases that crash clang -ast-print..

@arphaman: ping.

Fri, Jan 12, 10:08 AM
tra added inline comments to D14254: [OpenMP] Initial implementation of OpenMP offloading library - libomptarget device RTLs..
Fri, Jan 12, 9:52 AM · Restricted Project

Mon, Jan 8

tra added a reviewer for D41788: [DeclPrinter] Fix two cases that crash clang -ast-print.: jlebar.
Mon, Jan 8, 3:34 PM
tra committed rT322013: [test-suite, CUDA] Make sure we use the thrust library from test external dir..
Mon, Jan 8, 10:36 AM
tra committed rT322012: [test-suite, CUDA] Improve handling of GPUs not supported by particular CUDA….
Mon, Jan 8, 10:36 AM
tra committed rL322013: [test-suite, CUDA] Make sure we use the thrust library from test external dir..
Mon, Jan 8, 10:11 AM
tra committed rL322012: [test-suite, CUDA] Improve handling of GPUs not supported by particular CUDA….
Mon, Jan 8, 10:11 AM
tra closed D41685: [test-suite, CUDA] Make sure we use the thrust library from test external dir..
Mon, Jan 8, 10:11 AM
tra closed D41683: [test-suite, CUDA] Improve handling of GPUs not supported by particular CUDA version..
Mon, Jan 8, 10:11 AM

Fri, Jan 5

tra created D41788: [DeclPrinter] Fix two cases that crash clang -ast-print..
Fri, Jan 5, 3:08 PM
tra abandoned D41781: [DeclPrinter] Handle built-in C++ types in -ast-print..

Never mind. There must be something else going on in the case where I discovered the crash. The test case in this patch does not really reproduce the issue by itself. :-(

Fri, Jan 5, 1:11 PM
tra created D41781: [DeclPrinter] Handle built-in C++ types in -ast-print..
Fri, Jan 5, 11:23 AM
tra added a comment to D38978: [OpenMP] Enable the lowering of implicitly shared variables in OpenMP GPU-offloaded target regions to the GPU shared memory.

I'm still curious to hear what you plan to do when your depot use grows beyond a certain limit. At the very least there's the physical limit on shared memory size. Shared memory use also affects how many threads can be launched, which has a large impact on performance. IMO having some sort of user-controllable threshold would be very desirable.

When shared memory isn't enough to hold the shared depot, global memory will be used instead. That is a scheme which will be covered by a future patch.

Fri, Jan 5, 10:09 AM
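
A minimal sketch of the fallback discussed in the comment above, assuming a user-controllable threshold of the kind the comment asks for. The names `DepotStorage` and `chooseDepotStorage` and all parameters are hypothetical and do not come from the patch under review.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical: where the shared depot would be placed.
enum class DepotStorage { Shared, Global };

// Use shared memory only while the depot fits under a user-chosen threshold;
// otherwise fall back to global memory. The threshold is assumed not to
// exceed the physical shared memory limit, so fitting under the threshold
// implies fitting in hardware. Illustrative assumption, not the patch's API.
inline DepotStorage chooseDepotStorage(std::size_t depotBytes,
                                       std::size_t sharedLimitBytes,
                                       std::size_t userThresholdBytes) {
  // A threshold above the physical limit could not be honored.
  assert(userThresholdBytes <= sharedLimitBytes);
  return depotBytes <= userThresholdBytes ? DepotStorage::Shared
                                          : DepotStorage::Global;
}
```

Keeping the threshold separate from the physical limit would let a user trade depot placement against the number of threads that can be launched, which is the performance concern raised in the comment.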

Thu, Jan 4

tra added a comment to D38978: [OpenMP] Enable the lowering of implicitly shared variables in OpenMP GPU-offloaded target regions to the GPU shared memory.

Dotting the 'i's on the questions that were not replied to directly.

Thu, Jan 4, 10:30 AM

Tue, Jan 2

tra created D41685: [test-suite, CUDA] Make sure we use the thrust library from test external dir..
Tue, Jan 2, 3:01 PM
tra created D41683: [test-suite, CUDA] Improve handling of GPUs not supported by particular CUDA version..
Tue, Jan 2, 1:55 PM

Thu, Dec 21

tra committed rL321326: [CUDA] More fixes for __shfl_* intrinsics..
Thu, Dec 21, 3:53 PM
tra committed rC321326: [CUDA] More fixes for __shfl_* intrinsics..
Thu, Dec 21, 3:53 PM
tra closed D41521: [CUDA] fixes for __shfl_* intrinsics..
Thu, Dec 21, 3:53 PM
tra added a comment to D41521: [CUDA] fixes for __shfl_* intrinsics..

Added to my todo list. There are a few more gaps that I want to test in order to make sure we don't regress on compatibility with older CUDA versions while changing these wrappers.

Thu, Dec 21, 3:29 PM
tra created D41521: [CUDA] fixes for __shfl_* intrinsics..
Thu, Dec 21, 2:37 PM

Dec 11 2017

tra accepted D40996: Add --no-cuda-version-check in unknown-std.cpp.
Dec 11 2017, 11:46 AM

Dec 8 2017

tra added a comment to D40996: Add --no-cuda-version-check in unknown-std.cpp.

Ideally the tests should be hermetic and should use the mock CUDA installation that comes with the tests, e.g. --cuda-path=%S/Inputs/CUDA/usr/local/cuda

Dec 8 2017, 10:03 AM

Dec 6 2017

tra committed rC319909: [NVPTX,CUDA] Added llvm.nvvm.fns intrinsic and matching __nvvm_fns builtin in….
Dec 6 2017, 9:50 AM
tra committed rL319909: [NVPTX,CUDA] Added llvm.nvvm.fns intrinsic and matching __nvvm_fns builtin in….
Dec 6 2017, 9:50 AM
tra closed D40872: [NVPTX,CUDA] Added llvm.nvvm.fns intrinsic and matching __nvvm_fns builtin in clang. by committing rL319909: [NVPTX,CUDA] Added llvm.nvvm.fns intrinsic and matching __nvvm_fns builtin in….
Dec 6 2017, 9:50 AM
tra committed rL319908: [CUDA] Added overloads for '[unsigned] long' variants of shfl builtins..
Dec 6 2017, 9:41 AM
tra committed rC319908: [CUDA] Added overloads for '[unsigned] long' variants of shfl builtins..
Dec 6 2017, 9:41 AM
tra closed D40871: [CUDA] Added overloads for '[unsigned] long' variants of shfl builtins. by committing rC319908: [CUDA] Added overloads for '[unsigned] long' variants of shfl builtins..
Dec 6 2017, 9:41 AM
tra closed D40871: [CUDA] Added overloads for '[unsigned] long' variants of shfl builtins. by committing rL319908: [CUDA] Added overloads for '[unsigned] long' variants of shfl builtins..
Dec 6 2017, 9:41 AM

Dec 5 2017

tra created D40872: [NVPTX,CUDA] Added llvm.nvvm.fns intrinsic and matching __nvvm_fns builtin in clang..
Dec 5 2017, 4:22 PM
tra created D40871: [CUDA] Added overloads for '[unsigned] long' variants of shfl builtins..
Dec 5 2017, 4:19 PM
tra added a comment to D40033: [NVPTX] Initial adaptation of MCAsmStreamer/MCTargetStreamer for debug info in Cuda..

I'll defer to @echristo for the final approval.

Dec 5 2017, 12:03 PM
tra accepted D40033: [NVPTX] Initial adaptation of MCAsmStreamer/MCTargetStreamer for debug info in Cuda..

LGTM in general. Can you think of a way to verify the new functionality? It looks like we may have to wait until some target starts using it.

Dec 5 2017, 12:01 PM

Nov 30 2017

tra committed rC319485: [CUDA] Tweak CUDA wrappers to make cuda-9 work with libc++.
Nov 30 2017, 2:23 PM
tra committed rL319485: [CUDA] Tweak CUDA wrappers to make cuda-9 work with libc++.
Nov 30 2017, 2:23 PM
tra closed D40198: [CUDA] Tweak CUDA wrappers to make cuda-9 work with libc++ by committing rL319485: [CUDA] Tweak CUDA wrappers to make cuda-9 work with libc++.
Nov 30 2017, 2:22 PM
tra accepted D40573: [NVPTX] Assign valid global names.

I am OK with the change, but please wait a bit in case @rnk or @hfinkel have further comments.

Nov 30 2017, 2:11 PM
tra added inline comments to D40573: [NVPTX] Assign valid global names.
Nov 30 2017, 11:56 AM
tra added inline comments to D40573: [NVPTX] Assign valid global names.
Nov 30 2017, 10:49 AM
tra added inline comments to D40573: [NVPTX] Assign valid global names.
Nov 30 2017, 10:12 AM

Nov 29 2017

tra added a comment to D40198: [CUDA] Tweak CUDA wrappers to make cuda-9 work with libc++.

ping.

Nov 29 2017, 3:17 PM
tra added inline comments to D40573: [NVPTX] Assign valid global names.
Nov 29 2017, 2:25 PM
tra added a comment to D40573: [NVPTX] Assign valid global names.

There must be some truth in the saying "naming is one of the hardest problems in computer science". :-/

Nov 29 2017, 2:03 PM

Nov 28 2017

tra accepted D40453: Add the nvidia-cuda-toolkit Debian package path to search path.
Nov 28 2017, 3:24 PM
tra added inline comments to D40453: Add the nvidia-cuda-toolkit Debian package path to search path.
Nov 28 2017, 1:37 PM
tra committed rL319201: [CUDA] Report "unsupported VLA" errors only on device side..
Nov 28 2017, 10:52 AM
tra closed D40275: [CUDA] Report "unsupported VLA" errors only on device side. by committing rL319201: [CUDA] Report "unsupported VLA" errors only on device side..
Nov 28 2017, 10:52 AM
tra added inline comments to D40275: [CUDA] Report "unsupported VLA" errors only on device side..
Nov 28 2017, 10:27 AM

Nov 27 2017

tra added a comment to D40275: [CUDA] Report "unsupported VLA" errors only on device side..
In D40275#933253, @tra wrote:

@rjmccall: are you OK with this approach? If VLAs are not supported by the target, CUDA is handled as a special case so it can emit a deferred diagnostic, OpenMP reports an error only if shouldDiagnoseTargetSupportFromOpenMP() allows it, and everything else reports the error unconditionally.

Nov 27 2017, 3:54 PM
tra requested changes to D40453: Add the nvidia-cuda-toolkit Debian package path to search path.

I'm reluctant to add a distribution-specific search path unconditionally.

Nov 27 2017, 3:21 PM

Nov 22 2017

tra added a comment to D40275: [CUDA] Report "unsupported VLA" errors only on device side..

@rjmccall: are you OK with this approach? If VLAs are not supported by the target, CUDA is handled as a special case so it can emit a deferred diagnostic, OpenMP reports an error only if shouldDiagnoseTargetSupportFromOpenMP() allows it, and everything else reports the error unconditionally.

Nov 22 2017, 1:19 PM

Nov 21 2017

tra updated the diff for D40275: [CUDA] Report "unsupported VLA" errors only on device side..

Updated CUDA tests

Nov 21 2017, 11:33 AM
tra added a comment to D40275: [CUDA] Report "unsupported VLA" errors only on device side..

When Sema sees this code during compilation, it cannot tell whether there is an error. Calling foo from the host code is perfectly valid. Calling it from device code is not. CUDADiagIfDeviceCode creates a 'postponed' diagnostic which only gets emitted if we ever need to generate code for the function on the device.

Interesting. I suspect that we'll end up dealing with this problem for OpenMP as well (in the future - for OpenMP v5). In this next version (for which the draft is available here: http://www.openmp.org/wp-content/uploads/openmp-TR6.pdf), we'll have "implicit declare target" functions (whereby we generate target code based on the locally-defined subset of the transitive closure of the call graph starting from target regions).

Nov 21 2017, 10:37 AM
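
The 'postponed' diagnostics described in the comment above can be modeled roughly as follows. This is a toy sketch under stated assumptions, not clang's implementation: the class `DeferredDiags` and its methods `defer` and `emitOnDeviceCodegen` are hypothetical names introduced only for illustration.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Toy model of 'postponed' diagnostics; not clang's actual data structures.
class DeferredDiags {
public:
  // Record a diagnostic against a function without reporting it yet.
  void defer(const std::string &function, const std::string &message) {
    assert(!function.empty());
    pending_[function].push_back(message);
  }

  // Called when device code is generated for `function`: hand back (and
  // clear) whatever was postponed. Functions that are never emitted for the
  // device never report anything.
  std::vector<std::string> emitOnDeviceCodegen(const std::string &function) {
    std::vector<std::string> out;
    auto it = pending_.find(function);
    if (it != pending_.end()) {
      out = it->second;
      pending_.erase(it);
    }
    return out;
  }

private:
  std::map<std::string, std::vector<std::string>> pending_;
};
```

In this model a function that is only ever called from host code keeps its pending entry forever, which corresponds to the case where no error should be reported.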
tra added inline comments to D40275: [CUDA] Report "unsupported VLA" errors only on device side..
Nov 21 2017, 10:27 AM
tra updated the diff for D40275: [CUDA] Report "unsupported VLA" errors only on device side..

Updated to partially address rjmccall@'s comments.

Nov 21 2017, 10:27 AM
tra added inline comments to D40275: [CUDA] Report "unsupported VLA" errors only on device side..
Nov 21 2017, 9:46 AM

Nov 20 2017

tra updated the diff for D40275: [CUDA] Report "unsupported VLA" errors only on device side..

Updated CUDA's VLA test to use the nvptx triple.

Nov 20 2017, 5:15 PM
tra added a comment to D40275: [CUDA] Report "unsupported VLA" errors only on device side..

Please also add a regression test, which is apparently missing, for the case where a VLA is NOT diagnosed in CUDA mode.

Nov 20 2017, 4:55 PM
tra added a comment to D40275: [CUDA] Report "unsupported VLA" errors only on device side..

In D39505 @rjmccall requested that the check should be made independent of the language. To preserve this, I think the CUDA specific checks should be added to the generic case instead of restricting its evaluation.

Nov 20 2017, 4:52 PM
tra updated the diff for D40275: [CUDA] Report "unsupported VLA" errors only on device side..

Folded OpenCL check under if (T->isVariableArrayType())

Nov 20 2017, 4:46 PM
tra created D40275: [CUDA] Report "unsupported VLA" errors only on device side..
Nov 20 2017, 4:34 PM
tra added a comment to D40250: [OpenMP] Consistently use cubin extension for nvlink.

Looks OK to me. I'll defer to gtbercea@ for the final stamp.

Nov 20 2017, 1:55 PM
tra added inline comments to D40250: [OpenMP] Consistently use cubin extension for nvlink.
Nov 20 2017, 10:39 AM

Nov 17 2017

tra created D40198: [CUDA] Tweak CUDA wrappers to make cuda-9 work with libc++.
Nov 17 2017, 2:31 PM

Nov 16 2017

tra accepted D40151: [CUDA] [test-suite] Remove references to nexttoward in CUDA tests..
Nov 16 2017, 4:52 PM
tra accepted D40152: [CUDA] Remove implementations of nexttoward..
Nov 16 2017, 3:57 PM

Nov 14 2017

tra closed D39822: [NVPTX] Model (some) side effects of warp-synchronous data exchange intrinsics..

Landed in r318173

Nov 14 2017, 11:15 AM
tra committed rL318173: Mark intrinsics operating on the whole warp as IntrInaccessibleMemOnly.
Nov 14 2017, 11:14 AM

Nov 9 2017

tra accepted D39502: [Driver] Make clang/cc conforms to UNIX standard.

LGTM for CUDA-related functionality & tests. Thank you for the patch!

Nov 9 2017, 5:15 PM
tra added inline comments to D39502: [Driver] Make clang/cc conforms to UNIX standard.
Nov 9 2017, 4:58 PM
tra added inline comments to D39502: [Driver] Make clang/cc conforms to UNIX standard.
Nov 9 2017, 4:21 PM
tra added a comment to D39822: [NVPTX] Model (some) side effects of warp-synchronous data exchange intrinsics..

I was not sure if the *_sync intrinsics required preventing CSE since these intrinsics capture all state as arguments (lanes in a warp to sync as an argument). However, on Volta, I think different lanes in a warp can execute the intrinsic from different syntactic locations (i.e., different program counters). If true, then we do indeed have to model the data exchanged.

Nov 9 2017, 10:04 AM
tra updated the summary of D39822: [NVPTX] Model (some) side effects of warp-synchronous data exchange intrinsics..
Nov 9 2017, 9:43 AM

Nov 8 2017

tra added a comment to D39822: [NVPTX] Model (some) side effects of warp-synchronous data exchange intrinsics..

In the commit message, did you mean CSE (Common Subexpression Elimination) instead of CSI?

Nov 8 2017, 5:01 PM
tra updated the summary of D39822: [NVPTX] Model (some) side effects of warp-synchronous data exchange intrinsics..
Nov 8 2017, 5:00 PM
tra created D39822: [NVPTX] Model (some) side effects of warp-synchronous data exchange intrinsics..
Nov 8 2017, 4:18 PM
tra accepted D39818: [CUDA] [test-suite] Test std::min and std::max with C++11..

LGTM.

Nov 8 2017, 2:10 PM
tra accepted D39817: [CUDA] Fix std::min on device side to return the min, not the max..

Ouch. LGTM.

Nov 8 2017, 2:08 PM

Nov 7 2017

tra accepted D39638: [NVPTX] Implement __nvvm_atom_add_gen_d builtin..
Nov 7 2017, 1:50 PM

Nov 6 2017

tra added inline comments to D39502: [Driver] Make clang/cc conforms to UNIX standard.
Nov 6 2017, 5:12 PM
tra added a comment to D39502: [Driver] Make clang/cc conforms to UNIX standard.

Improve testcase according to review feedback.

In order for compilation to succeed for one input, I need to use
-fsyntax-only because I don't have the nvptx assembler. Enabling -fsyntax-only
somehow changes the order of the compilation, so I need to reorder the
errors and warnings a little bit.

Nov 6 2017, 4:31 PM
tra added a comment to D39703: [CUDA] Remove implementations of nexttoward and nextafter..

Libdevice does provide implementations of __nv_nextafterf() and __nv_nextafter(), and they have corresponding wrappers in math_functions.h[pp].

Nov 6 2017, 3:18 PM
tra added inline comments to D39502: [Driver] Make clang/cc conforms to UNIX standard.
Nov 6 2017, 3:02 PM
tra added a comment to D39502: [Driver] Make clang/cc conforms to UNIX standard.

Also, the reason I don't know how to craft a testcase is not that I have
trouble with the CUDA driver, but that I don't know how to write a test that
checks when the driver bailed out. Let me know if you have any suggestions.

Nov 6 2017, 2:08 PM

Nov 2 2017

tra accepted D39586: [CUDA] Mark CUDA as a no-errno platform..
Nov 2 2017, 7:30 PM

Oct 25 2017

tra accepted D39109: [CUDA] Print an error if you try to compile with < sm_30 on CUDA 9..

LGTM.

Oct 25 2017, 1:33 PM

Oct 24 2017

tra committed rL316495: [NVPTX] allow address space inference for volatile loads/stores..
Oct 24 2017, 1:32 PM
tra closed D39026: [NVPTX] allow address space inference for volatile loads/stores. by committing rL316495: [NVPTX] allow address space inference for volatile loads/stores..
Oct 24 2017, 1:32 PM

Oct 23 2017

tra accepted D39109: [CUDA] Print an error if you try to compile with < sm_30 on CUDA 9..

The point was that we have two error messages for one problem -- this CUDA version does not support this GPU. The new message you've added (CUDA 9, sm_20) has to be rather verbose in order to be correct as it must deal with the possibility of either of the relevant arguments being the source of the error. The other end of the problem (CUDA<9, sm_70) should ideally be phrased similarly. But why do we need both? IMO both cases could be reported more consistently with a single message similar to the one you've added -- "CUDA version X does not support compiling for GPU arch Y. Use --cuda-gpu-arch to specify a different GPU arch, use --cuda-path to specify a different CUDA install, or pass --no-cuda-version-check."

Oct 23 2017, 7:13 PM
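
A minimal sketch of the single-message approach proposed in the comment above. The struct `ArchRange` and the function `archMismatchMessage` are hypothetical helpers introduced for illustration and are not part of clang's driver; only the message wording is taken from the comment.

```cpp
#include <cassert>
#include <string>

// Hypothetical range of GPU architectures a given CUDA install can target.
struct ArchRange {
  int minSm;
  int maxSm;
};

// Returns an empty string when the arch is supported, otherwise the single
// unified message proposed in the comment. The same text covers both an
// arch that is too old and an arch that is too new for the CUDA version.
inline std::string archMismatchMessage(const std::string &cudaVersion,
                                       int smArch, ArchRange supported) {
  assert(supported.minSm <= supported.maxSm);
  if (smArch >= supported.minSm && smArch <= supported.maxSm)
    return "";
  return "CUDA version " + cudaVersion +
         " does not support compiling for GPU arch sm_" +
         std::to_string(smArch) +
         ". Use --cuda-gpu-arch to specify a different GPU arch, use "
         "--cuda-path to specify a different CUDA install, or pass "
         "--no-cuda-version-check.";
}
```

Because the check is symmetric, neither end of the problem needs its own wording, which is the consistency argument the comment makes.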
tra updated the diff for D39026: [NVPTX] allow address space inference for volatile loads/stores..

Grammar fix.

Oct 23 2017, 6:10 PM
tra added inline comments to D39026: [NVPTX] allow address space inference for volatile loads/stores..
Oct 23 2017, 3:21 PM
tra updated the diff for D39026: [NVPTX] allow address space inference for volatile loads/stores..

Addressed Justin's comments.

Oct 23 2017, 3:17 PM
tra added inline comments to D39109: [CUDA] Print an error if you try to compile with < sm_30 on CUDA 9..
Oct 23 2017, 1:38 PM

Oct 17 2017

tra created D39026: [NVPTX] allow address space inference for volatile loads/stores..
Oct 17 2017, 5:25 PM
tra requested changes to D39005: [OpenMP] Clean up variable and function names for NVPTX backend.

Justin is right. I completely forgot about this. :-/
Hal offered a possible solution: https://reviews.llvm.org/D17738#661115

Oct 17 2017, 11:03 AM
tra accepted D39005: [OpenMP] Clean up variable and function names for NVPTX backend.
Oct 17 2017, 9:18 AM