
tra (Artem Belevich)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 8 2015, 1:53 PM (227 w, 2 d)

Recent Activity

Fri, May 17

tra accepted D62046: [OpenMP][bugfix] Add missing math functions variants for log and abs..

I'd add a comment with a brief explanation for the const variant and a TODO() to remove it.

Fri, May 17, 9:32 AM · Restricted Project

Thu, May 16

tra added inline comments to D62046: [OpenMP][bugfix] Add missing math functions variants for log and abs..
Thu, May 16, 5:03 PM · Restricted Project

Wed, May 15

tra added a comment to D61949: [OpenMP][bugfix] Fix issues with C++ 17 compilation when handling math functions.

LGTM.

Wed, May 15, 12:12 PM · Restricted Project
tra added inline comments to D61949: [OpenMP][bugfix] Fix issues with C++ 17 compilation when handling math functions.
Wed, May 15, 9:43 AM · Restricted Project

Mon, May 13

tra accepted D61765: [OpenMP][Clang][BugFix] Split declares and math functions inclusion..

This won't affect CUDA in any way; everything we have added is OpenMP-specific.

Mon, May 13, 2:32 PM · Restricted Project
tra added a comment to D61765: [OpenMP][Clang][BugFix] Split declares and math functions inclusion..

With libc++, the limits header included in __clang_cuda_cmath.h:15 is not found:

__clang_cuda_cmath.h:15:10: fatal error: 'limits' file not found
#include <limits>

Not even CUDA works actually, so I'm not sure what the best answer to this problem is.
Mon, May 13, 1:06 PM · Restricted Project
tra added a comment to D61765: [OpenMP][Clang][BugFix] Split declares and math functions inclusion..

Two small changes and then it is fine with me. @tra ?

Mon, May 13, 10:32 AM · Restricted Project

Fri, May 10

tra added inline comments to D61765: [OpenMP][Clang][BugFix] Split declares and math functions inclusion..
Fri, May 10, 2:37 PM · Restricted Project
tra added inline comments to D61765: [OpenMP][Clang][BugFix] Split declares and math functions inclusion..
Fri, May 10, 2:01 PM · Restricted Project
tra added inline comments to D61765: [OpenMP][Clang][BugFix] Split declares and math functions inclusion..
Fri, May 10, 1:53 PM · Restricted Project
tra added inline comments to D61765: [OpenMP][Clang][BugFix] Split declares and math functions inclusion..
Fri, May 10, 1:24 PM · Restricted Project

Fri, May 3

tra committed rL359928: [CUDA buildbot] tell libunwind where to find libcxx..
[CUDA buildbot] tell libunwind where to find libcxx.
Fri, May 3, 1:38 PM
tra accepted D61474: [CUDA][Clang][Bugfix] Add missing CUDA 9.2 case.
Fri, May 3, 9:38 AM · Restricted Project
tra added inline comments to D61458: [hip] Relax CUDA call restriction within `decltype` context..
Fri, May 3, 9:24 AM · Restricted Project

Thu, May 2

tra committed rG4cbb23502612: [CUDA] Do not pass deprecated option fo fatbinary (authored by tra).
[CUDA] Do not pass deprecated option fo fatbinary
Thu, May 2, 3:36 PM
tra committed rL359838: [CUDA] Do not pass deprecated option fo fatbinary.
[CUDA] Do not pass deprecated option fo fatbinary
Thu, May 2, 3:35 PM
tra committed rC359838: [CUDA] Do not pass deprecated option fo fatbinary.
[CUDA] Do not pass deprecated option fo fatbinary
Thu, May 2, 3:35 PM
tra closed D61470: [CUDA] Do not pass deprecated option fo fatbinary.
Thu, May 2, 3:35 PM · Restricted Project
tra created D61470: [CUDA] Do not pass deprecated option fo fatbinary.
Thu, May 2, 3:03 PM · Restricted Project
tra added inline comments to D61458: [hip] Relax CUDA call restriction within `decltype` context..
Thu, May 2, 2:27 PM · Restricted Project
tra added a comment to D61458: [hip] Relax CUDA call restriction within `decltype` context..
In D61458#1488523, @tra wrote:

Perhaps we should allow this in all unevaluated contexts?
I.e. int s = sizeof(foo(x)); should also work.

Good point. Do we have a dedicated context for sizeof? That would make the checking easier.

Thu, May 2, 2:02 PM · Restricted Project
tra added a comment to D61458: [hip] Relax CUDA call restriction within `decltype` context..

Perhaps we should allow this in all unevaluated contexts?
I.e. int s = sizeof(foo(x)); should also work.
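
For readers less familiar with the term, here is a minimal plain C++ sketch of what "unevaluated context" means. It is not code from the patch: foo here is a hypothetical host-only stand-in, and the helper names are invented for illustration. The point is that decltype, sizeof and noexcept only inspect the type of the call and never execute it, which is why relaxing the call restriction there looks harmless.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical stand-in for a function that must not actually be called here.
// It is declared but never defined, so any evaluated call would fail to link.
int foo(double x);

// decltype: the operand is unevaluated; only the type of the call is inspected.
using FooResult = decltype(foo(1.0));
static_assert(std::is_same<FooResult, int>::value, "foo returns int");

// sizeof: likewise unevaluated; yields sizeof(int) without calling foo.
constexpr std::size_t kFooSize = sizeof(foo(1.0));
static_assert(kFooSize == sizeof(int), "sizeof sees the return type only");

// noexcept is a third unevaluated context; foo is not declared noexcept.
constexpr bool kFooNoexcept = noexcept(foo(1.0));

// Run-time check of the same facts; still no call to foo is ever emitted.
inline void check_unevaluated() {
  assert(kFooSize == sizeof(int));
  assert(!kFooNoexcept);
}
```

The program links even though foo has no definition, because none of the three operands is ever evaluated.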

Thu, May 2, 1:37 PM · Restricted Project
tra added a reviewer for D61458: [hip] Relax CUDA call restriction within `decltype` context.: jlebar.
Thu, May 2, 1:35 PM · Restricted Project
tra accepted D61399: [OpenMP][Clang] Support for target math functions.
Thu, May 2, 10:45 AM · Restricted Project

Wed, May 1

tra added a comment to D61396: [hip] Fix ambiguity from `>>>` of CUDA..

LGTM, but I've added @rsmith who is way more familiar with this code.

Wed, May 1, 1:22 PM · Restricted Project, Restricted Project
tra added a reviewer for D61396: [hip] Fix ambiguity from `>>>` of CUDA.: rsmith.
Wed, May 1, 1:16 PM · Restricted Project, Restricted Project

Tue, Apr 30

tra added a comment to D60907: [OpenMP] Add math functions support in OpenMP offloading.

I actually don't want to preinclude anything, and my arguments are (mostly) for the OpenMP offloading code path, not necessarily CUDA.
Maybe to clarify, what I want is:

  1. Make sure the clang/Headers/math.h is found first if math.h is included.
  2. Use a scheme similar to the one described https://reviews.llvm.org/D47849#1483653 in clang/Headers/math.h
  3. Only add math.h function overloads in our math.h. <- This is debatable
Tue, Apr 30, 11:42 AM · Restricted Project
tra added a comment to D60907: [OpenMP] Add math functions support in OpenMP offloading.

+1 to Hal's comments.

Tue, Apr 30, 10:33 AM · Restricted Project

Mon, Apr 29

tra added a comment to D61274: [Sema][AST] Explicit visibility for OpenCL/CUDA kernels/variables.

A kernel function in CUDA is actually two different functions. One is the real kernel we compile for the GPU, another is a host-side stub that launches the device-side kernel.

Mon, Apr 29, 2:21 PM · Restricted Project

Thu, Apr 25

tra committed rG5fe85a003f6b: [CUDA] Implemented _[bi]mma* builtins. (authored by tra).
[CUDA] Implemented _[bi]mma* builtins.
Thu, Apr 25, 3:28 PM
tra committed rG16737538f4fc: PTX 6.3 extends `wmma` instruction to support s8/u8/s4/u4/b1 -> s32. (authored by tra).
PTX 6.3 extends `wmma` instruction to support s8/u8/s4/u4/b1 -> s32.
Thu, Apr 25, 3:28 PM
tra committed rG8d825b38ed2c: [NVPTX] generate correct MMA instruction mnemonics with PTX63+. (authored by tra).
[NVPTX] generate correct MMA instruction mnemonics with PTX63+.
Thu, Apr 25, 3:27 PM
tra committed rG7ecd82ce19ae: [NVPTX] Refactor generation of MMA intrinsics and instructions. NFC. (authored by tra).
[NVPTX] Refactor generation of MMA intrinsics and instructions. NFC.
Thu, Apr 25, 3:27 PM
tra committed rL359248: [CUDA] Implemented _[bi]mma* builtins..
[CUDA] Implemented _[bi]mma* builtins.
Thu, Apr 25, 3:27 PM
tra committed rC359248: [CUDA] Implemented _[bi]mma* builtins..
[CUDA] Implemented _[bi]mma* builtins.
Thu, Apr 25, 3:27 PM
tra closed D60279: [CUDA] Implemented _[bi]mma* builtins..
Thu, Apr 25, 3:27 PM · Restricted Project, Restricted Project
tra committed rL359247: PTX 6.3 extends `wmma` instruction to support s8/u8/s4/u4/b1 -> s32..
PTX 6.3 extends `wmma` instruction to support s8/u8/s4/u4/b1 -> s32.
Thu, Apr 25, 3:27 PM
tra closed D60015: [NVPTX] Added intrinsics/instructions for MMA ops on (sub-)integers.
Thu, Apr 25, 3:26 PM · Restricted Project
tra committed rL359246: [NVPTX] generate correct MMA instruction mnemonics with PTX63+..
[NVPTX] generate correct MMA instruction mnemonics with PTX63+.
Thu, Apr 25, 3:26 PM
tra closed D59393: [NVPTX] generate correct MMA instruction mnemonics with PTX63+..
Thu, Apr 25, 3:26 PM · Restricted Project
tra committed rL359245: [NVPTX] Refactor generation of MMA intrinsics and instructions. NFC..
[NVPTX] Refactor generation of MMA intrinsics and instructions. NFC.
Thu, Apr 25, 3:26 PM
tra closed D59389: [NVPTX] Refactor generation of MMA intrinsics and instructions. NFC..
Thu, Apr 25, 3:26 PM · Restricted Project

Mon, Apr 22

tra added a comment to D60985: Fix compatability for cuda sm_75.

FYI, I have an almost-ready set of patches to implement missing bits of sm_75 support, including this change:
https://reviews.llvm.org/D60279

Mon, Apr 22, 4:32 PM · Restricted Project

Apr 17 2019

tra accepted D60818: [CUDA][Windows] restrict long double device functions declarations to Windows.

LGTM. Thank you for cleaning this up.

Apr 17 2019, 9:24 AM · Restricted Project

Apr 15 2019

tra added a comment to D60727: [NVPTXAsmPrinter] clean up dead code. NFC.

Nick: Can you do some archaeology on the original patch and find out if there was supposed to be something supported here?

Art: Any thoughts?

Apr 15 2019, 12:06 PM · Restricted Project

Apr 12 2019

tra updated subscribers of D60620: [HIP] Support -offloading-target-id.

@arsenm Matt, FYI, this patch seems to be a continuation of D59863 you've commented on.

Apr 12 2019, 10:28 AM
tra updated subscribers of D60620: [HIP] Support -offloading-target-id.

It looks like you are solving two problems here.
a) you want to create multiple device passes for the same GPU, but with different options.
b) you may want to pass different compiler options to different device compilations.
The patch effectively hard-codes {gpu, options} tuple into --offloading-target-id variant.
Is that correct?

Apr 12 2019, 10:22 AM

Apr 10 2019

tra accepted D60513: [HIP] Use -mlink-builtin-bitcode to link device library.
Apr 10 2019, 12:01 PM · Restricted Project

Apr 8 2019

tra updated the diff for D60279: [CUDA] Implemented _[bi]mma* builtins..
  • Converted class to struct+function as Tim suggested.
Apr 8 2019, 5:09 PM · Restricted Project, Restricted Project

Apr 5 2019

tra added a comment to D60220: [CUDA][Windows] Final fix for bug 38811 (Step 3 of 3).

Oooh, sorry, but I've just pushed the fix. But with the following words: "Add missing long double device functions' declarations. Provide only declarations to prevent any use of long double on the device side, because CUDA does not support long double on the device side."

Apr 5 2019, 10:08 AM · Restricted Project
tra added a comment to D60220: [CUDA][Windows] Final fix for bug 38811 (Step 3 of 3).

One more thing -- perhaps the long double declarations should be put under #ifndef _MSC_VER in all the files to make the change unobservable on non-windows platforms.

Apr 5 2019, 9:43 AM · Restricted Project
tra accepted D60220: [CUDA][Windows] Final fix for bug 38811 (Step 3 of 3).

Thank you for fixing this!

Apr 5 2019, 9:32 AM · Restricted Project

Apr 4 2019

tra updated the diff for D60279: [CUDA] Implemented _[bi]mma* builtins..
  • Added PTX64 to the list of builtins' constraints.
Apr 4 2019, 4:44 PM · Restricted Project, Restricted Project
tra updated the diff for D60279: [CUDA] Implemented _[bi]mma* builtins..
  • Fixed minor issues with parameters of the new builtins:
    • __imma*_st_c_i32 builtins have 'const int * src'
    • __bmma_m8n8k128_mma_xor_popc_b1 does not have 'satf' argument.
Apr 4 2019, 3:49 PM · Restricted Project, Restricted Project
tra updated the diff for D60279: [CUDA] Implemented _[bi]mma* builtins..

Cleaned up mma test generation.

Apr 4 2019, 1:51 PM · Restricted Project, Restricted Project
tra updated the diff for D60015: [NVPTX] Added intrinsics/instructions for MMA ops on (sub-)integers.
  • Enabled .satf for s4/u4.
Apr 4 2019, 11:35 AM · Restricted Project
tra added a child revision for D59393: [NVPTX] generate correct MMA instruction mnemonics with PTX63+.: D60015: [NVPTX] Added intrinsics/instructions for MMA ops on (sub-)integers.
Apr 4 2019, 11:32 AM · Restricted Project
tra added a parent revision for D60015: [NVPTX] Added intrinsics/instructions for MMA ops on (sub-)integers: D59393: [NVPTX] generate correct MMA instruction mnemonics with PTX63+..
Apr 4 2019, 11:32 AM · Restricted Project
tra added a child revision for D60015: [NVPTX] Added intrinsics/instructions for MMA ops on (sub-)integers: D60279: [CUDA] Implemented _[bi]mma* builtins..
Apr 4 2019, 11:32 AM · Restricted Project
tra added a parent revision for D60279: [CUDA] Implemented _[bi]mma* builtins.: D60015: [NVPTX] Added intrinsics/instructions for MMA ops on (sub-)integers.
Apr 4 2019, 11:32 AM · Restricted Project, Restricted Project
tra created D60279: [CUDA] Implemented _[bi]mma* builtins..
Apr 4 2019, 11:29 AM · Restricted Project, Restricted Project

Apr 3 2019

tra added inline comments to D60220: [CUDA][Windows] Final fix for bug 38811 (Step 3 of 3).
Apr 3 2019, 11:27 AM · Restricted Project
tra accepted D60168: [test-suite,CUDA] Add #include <stdio.h> to test_round.cu to fix a build error..
Apr 3 2019, 10:35 AM · Restricted Project

Apr 2 2019

tra added inline comments to D59152: [libc++] Build <filesystem> support as part of the dylib.
Apr 2 2019, 5:34 PM · Restricted Project
tra added inline comments to D59152: [libc++] Build <filesystem> support as part of the dylib.
Apr 2 2019, 2:42 PM · Restricted Project
tra added a comment to D60141: [HIP-Clang] Fat binary should not be produced for non GPU code.

Hi Artem, I had just committed the change. Is this change OK or should I revert it?

Apr 2 2019, 2:37 PM · Restricted Project, Restricted Project
tra added inline comments to D59152: [libc++] Build <filesystem> support as part of the dylib.
Apr 2 2019, 2:30 PM · Restricted Project
tra accepted D60141: [HIP-Clang] Fat binary should not be produced for non GPU code.
Apr 2 2019, 1:46 PM · Restricted Project, Restricted Project
tra added inline comments to D59152: [libc++] Build <filesystem> support as part of the dylib.
Apr 2 2019, 1:41 PM · Restricted Project
tra added a comment to D60141: [HIP-Clang] Fat binary should not be produced for non GPU code.

General nit: please use diffs with very large context when you submit patches with Phabricator.
https://llvm.org/docs/Phabricator.html#requesting-a-review-via-the-web-interface

Apr 2 2019, 1:20 PM · Restricted Project, Restricted Project

Apr 1 2019

tra added inline comments to D60015: [NVPTX] Added intrinsics/instructions for MMA ops on (sub-)integers.
Apr 1 2019, 4:37 PM · Restricted Project

Mar 29 2019

tra accepted D59947: [NVPTX] Fix the codegen for llvm.round..
Mar 29 2019, 4:17 PM · Restricted Project
tra created D60015: [NVPTX] Added intrinsics/instructions for MMA ops on (sub-)integers.
Mar 29 2019, 2:56 PM · Restricted Project
tra added inline comments to D59947: [NVPTX] Fix the codegen for llvm.round..
Mar 29 2019, 1:42 PM · Restricted Project

Mar 28 2019

tra accepted D59950: [test-suite,CUDA] Add a test case to test the edge cases for the implementation of llvm.round intrinsic in the PTX backend..
Mar 28 2019, 7:38 PM · Restricted Project
tra added a parent revision for D59950: [test-suite,CUDA] Add a test case to test the edge cases for the implementation of llvm.round intrinsic in the PTX backend.: D59947: [NVPTX] Fix the codegen for llvm.round..
Mar 28 2019, 7:34 PM · Restricted Project
tra added a child revision for D59947: [NVPTX] Fix the codegen for llvm.round.: D59950: [test-suite,CUDA] Add a test case to test the edge cases for the implementation of llvm.round intrinsic in the PTX backend..
Mar 28 2019, 7:34 PM · Restricted Project
tra added inline comments to D59947: [NVPTX] Fix the codegen for llvm.round..
Mar 28 2019, 3:36 PM · Restricted Project
tra added inline comments to D59950: [test-suite,CUDA] Add a test case to test the edge cases for the implementation of llvm.round intrinsic in the PTX backend..
Mar 28 2019, 3:22 PM · Restricted Project
tra added inline comments to D59900: [Sema] Fix a crash when nonnull checking.
Mar 28 2019, 3:05 PM · Restricted Project, Restricted Project

Mar 27 2019

tra updated subscribers of D59900: [Sema] Fix a crash when nonnull checking.

@rsmith, @jlebar I'm out of my depth here and could use some language lawyering help.

Mar 27 2019, 3:19 PM · Restricted Project, Restricted Project
tra added inline comments to D59863: [HIP] Support gpu arch gfx906+sram-ecc.
Mar 27 2019, 10:39 AM

Mar 23 2019

tra added a comment to D59451: Fix gettid warnings and one test on FreeBSD.

I am unsure whether the problem with the build system as referred to by the remark "switch to -pthread once the rest of the build system can deal with it" is now solved, though. @tra, any idea?

Mar 23 2019, 8:05 PM · Restricted Project, Restricted Project

Mar 21 2019

tra added a comment to D59647: [CUDA][HIP] Warn shared var initialization.

This looks like one of the things we should *not* do as it affects correctness -- non-trivial constructor may be arbitrarily complex and the per-TU flag to enable this behavior is way too coarse, IMO.
On the other hand, I can believe that someone somewhere did write the code and relies on NVCC accepting it.

Mar 21 2019, 11:14 AM · Restricted Project
tra added reviewers for D59647: [CUDA][HIP] Warn shared var initialization: jlebar, rsmith.
Mar 21 2019, 11:08 AM · Restricted Project

Mar 20 2019

tra added a comment to D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls.

This is, or is very similar to, the problem that the host/device overloading addresses in CUDA.

Mar 20 2019, 10:10 AM · Restricted Project

Mar 19 2019

tra added inline comments to D59393: [NVPTX] generate correct MMA instruction mnemonics with PTX63+..
Mar 19 2019, 11:27 AM · Restricted Project

Mar 18 2019

tra added inline comments to D59393: [NVPTX] generate correct MMA instruction mnemonics with PTX63+..
Mar 18 2019, 5:16 PM · Restricted Project
tra updated the diff for D59393: [NVPTX] generate correct MMA instruction mnemonics with PTX63+..

Rebased on updated D59389

Mar 18 2019, 5:09 PM · Restricted Project
tra updated the diff for D59389: [NVPTX] Refactor generation of MMA intrinsics and instructions. NFC..
  • Addressed Tim's comments.
Mar 18 2019, 4:14 PM · Restricted Project
tra updated the diff for D59389: [NVPTX] Refactor generation of MMA intrinsics and instructions. NFC..
  • Addressed Tim's comments.
Mar 18 2019, 4:10 PM · Restricted Project
tra added inline comments to D59389: [NVPTX] Refactor generation of MMA intrinsics and instructions. NFC..
Mar 18 2019, 4:10 PM · Restricted Project

Mar 15 2019

tra added a comment to D59423: [CUDA][Windows] Partial fix for bug 38811 (Step 2 of 3).

The intent is to avoid unintentional clashes with the preprocessor macros the user may have defined.
https://reviews.llvm.org/rL260647
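
A minimal sketch of the clash being avoided, written as plain C++ rather than taken from the CUDA wrapper headers. The macro ptr, the function load_first and the parameter __p are all hypothetical names chosen for illustration. Identifiers containing a double underscore are reserved for the implementation, so well-formed user code cannot define a macro that rewrites them.

```cpp
#include <cassert>

// Hypothetical user code: a macro with a common short name, defined before
// the wrapper header is included.
#define ptr 0

// Header-style wrapper (imitating a compiler-provided header). Because the
// parameter is spelled __p rather than ptr, the user's macro cannot touch it.
// Had it been named ptr, the declaration would expand to `const int *0`
// and fail to compile.
inline int load_first(const int *__p) { return *__p; }

// Small driver showing the wrapper still works with the macro in effect.
inline int demo_load() {
  int value = 7;
  return load_first(&value);
}

// Clean up the illustrative macro.
#undef ptr
```

Only headers that ship with the compiler or standard library should use such names; ordinary user code should stay out of the reserved namespace.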

Mar 15 2019, 11:59 AM · Restricted Project
tra accepted D59423: [CUDA][Windows] Partial fix for bug 38811 (Step 2 of 3).
Mar 15 2019, 11:41 AM · Restricted Project
tra added a comment to D59423: [CUDA][Windows] Partial fix for bug 38811 (Step 2 of 3).

Perhaps for consistency's sake it would be better to replace __sptr -> __s and __cptr -> __c.

Well, it came from NVIDIA code, you know, I mean all those double underscores.

Mar 15 2019, 11:40 AM · Restricted Project
tra added a comment to D59423: [CUDA][Windows] Partial fix for bug 38811 (Step 2 of 3).

___ sticks out like a sore thumb and raises unnecessary questions -- "why does it have three underscores, while __cptr is fine with two?".
Perhaps for consistency's sake it would be better to replace __sptr -> __s and __cptr -> __c.
Given that we're just passing the args through, we don't really need to have ptr here.

Mar 15 2019, 11:06 AM · Restricted Project

Mar 14 2019

tra created D59393: [NVPTX] generate correct MMA instruction mnemonics with PTX63+..
Mar 14 2019, 3:19 PM · Restricted Project
tra added a parent revision for D59393: [NVPTX] generate correct MMA instruction mnemonics with PTX63+.: D59389: [NVPTX] Refactor generation of MMA intrinsics and instructions. NFC..
Mar 14 2019, 3:19 PM · Restricted Project
tra added a child revision for D59389: [NVPTX] Refactor generation of MMA intrinsics and instructions. NFC.: D59393: [NVPTX] generate correct MMA instruction mnemonics with PTX63+..
Mar 14 2019, 3:19 PM · Restricted Project
tra created D59389: [NVPTX] Refactor generation of MMA intrinsics and instructions. NFC..
Mar 14 2019, 2:51 PM · Restricted Project