tra (Artem Belevich)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 8 2015, 1:53 PM (201 w, 6 d)

Recent Activity

Thu, Nov 15

tra committed rL347013: Added missing whitespace in the link..
Added missing whitespace in the link.
Thu, Nov 15, 5:26 PM
tra committed rL347007: [CUDA] updated CompileCudaWithLLVM.rst.
[CUDA] updated CompileCudaWithLLVM.rst
Thu, Nov 15, 5:05 PM
tra closed D54608: [CUDA] updated CompileCudaWithLLVM.rst.
Thu, Nov 15, 5:05 PM
tra added inline comments to D54608: [CUDA] updated CompileCudaWithLLVM.rst.
Thu, Nov 15, 5:04 PM
tra created D54608: [CUDA] updated CompileCudaWithLLVM.rst.
Thu, Nov 15, 4:49 PM

Tue, Nov 13

tra accepted D54496: [HIP] Fix device only compilation.

Do I understand it correctly that the bug appears to affect HIP compilation only?

Tue, Nov 13, 2:54 PM

Fri, Nov 9

tra accepted D54275: [HIP] Remove useless sections in linked files.
Fri, Nov 9, 10:09 AM

Tue, Nov 6

tra added a comment to D54183: [HIP] Change default optimization level to -O3.

I'm not convinced that nvcc's behavior is a good guide for clang's defaults.

Tue, Nov 6, 2:16 PM

Wed, Oct 24

tra accepted D53657: [DEBUG_INFO][NVPTX]Fix processing of DBG_VALUES..
Wed, Oct 24, 2:04 PM
tra added inline comments to D53657: [DEBUG_INFO][NVPTX]Fix processing of DBG_VALUES..
Wed, Oct 24, 10:58 AM
tra added inline comments to D53657: [DEBUG_INFO][NVPTX]Fix processing of DBG_VALUES..
Wed, Oct 24, 10:21 AM

Oct 22 2018

tra accepted D53472: Add gfx904 and gfx906 to GPU Arch.
Oct 22 2018, 1:59 PM

Oct 5 2018

tra committed rC343875: [CUDA] Use all 64 bits of GUID in __nv_module_id.
[CUDA] Use all 64 bits of GUID in __nv_module_id
Oct 5 2018, 11:41 AM
tra committed rL343875: [CUDA] Use all 64 bits of GUID in __nv_module_id.
[CUDA] Use all 64 bits of GUID in __nv_module_id
Oct 5 2018, 11:41 AM
tra closed D52938: [CUDA] Use all 64 bits of GUID in __nv_module_id.
Oct 5 2018, 11:41 AM
tra added a comment to D52938: [CUDA] Use all 64 bits of GUID in __nv_module_id.

This particular change is largely cosmetic. I've just spotted this nit while I was debugging a different problem.

Oct 5 2018, 11:40 AM
tra created D52938: [CUDA] Use all 64 bits of GUID in __nv_module_id.
Oct 5 2018, 10:44 AM

Oct 2 2018

tra accepted D52377: [HIP] Support early finalization of device code for -fno-gpu-rdc.

LGTM.

Oct 2 2018, 10:20 AM

Oct 1 2018

tra committed rL343551: [test-suite, CUDA] Run GPU tests in parallel..
[test-suite, CUDA] Run GPU tests in parallel.
Oct 1 2018, 5:13 PM
tra added inline comments to D52377: [HIP] Support early finalization of device code for -fno-gpu-rdc.
Oct 1 2018, 4:05 PM
tra closed D52438: [CUDA] Add basic support for CUDA-10.0.

Landed in rC342924

Oct 1 2018, 11:35 AM

Sep 25 2018

tra accepted D52259: [CUDA] Fix two failed test cases using --cuda-path-ignore-env.

It's unfortunate that lit does not scrub the environment in order to make tests hermetic.

Sep 25 2018, 9:37 AM · Restricted Project

Sep 24 2018

tra committed rT342931: [test-suite, CUDA] Enable tests w/ CUDA-10.0.
[test-suite, CUDA] Enable tests w/ CUDA-10.0
Sep 24 2018, 5:25 PM
tra committed rL342931: [test-suite, CUDA] Enable tests w/ CUDA-10.0.
[test-suite, CUDA] Enable tests w/ CUDA-10.0
Sep 24 2018, 4:59 PM
tra committed rC342924: [CUDA] Added basic support for compiling with CUDA-10.0.
[CUDA] Added basic support for compiling with CUDA-10.0
Sep 24 2018, 4:14 PM
tra committed rL342924: [CUDA] Added basic support for compiling with CUDA-10.0.
[CUDA] Added basic support for compiling with CUDA-10.0
Sep 24 2018, 4:14 PM
tra added a comment to D52437: [CUDA] Add preliminary support for CUDA 10.0.

Great to see someone beating me to add support for a new CUDA version. :-)
I've posted my patch in D52438. It's very similar to yours with a couple of other necessary changes.

Sep 24 2018, 2:48 PM
tra created D52438: [CUDA] Add basic support for CUDA-10.0.
Sep 24 2018, 2:42 PM

Sep 21 2018

tra added a comment to D52377: [HIP] Support early finalization of device code for -fno-gpu-rdc.

Overall the patch looks OK. I'll take a closer look on Monday.

Sep 21 2018, 2:02 PM
tra committed rL342752: [CUDA] Fixed parsing of optional template-argument-list..
[CUDA] Fixed parsing of optional template-argument-list.
Sep 21 2018, 10:50 AM
tra committed rC342752: [CUDA] Fixed parsing of optional template-argument-list..
[CUDA] Fixed parsing of optional template-argument-list.
Sep 21 2018, 10:50 AM
tra closed D52321: [CUDA] Fixed parsing of optional template-argument-list..
Sep 21 2018, 10:50 AM
tra updated the diff for D52321: [CUDA] Fixed parsing of optional template-argument-list..

Added '>=' and '>>=' to the list of tokens that may indicate the end of the
empty template argument list.

Sep 21 2018, 10:40 AM
tra committed rC342749: [CUDA] Ignore uncallable functions when we check for usual deallocators..
[CUDA] Ignore uncallable functions when we check for usual deallocators.
Sep 21 2018, 10:31 AM
tra committed rL342749: [CUDA] Ignore uncallable functions when we check for usual deallocators..
[CUDA] Ignore uncallable functions when we check for usual deallocators.
Sep 21 2018, 10:31 AM
tra closed D51808: [CUDA] Ignore uncallable functions when we check for usual deallocators..
Sep 21 2018, 10:31 AM

Sep 20 2018

tra created D52321: [CUDA] Fixed parsing of optional template-argument-list..
Sep 20 2018, 1:03 PM

Sep 19 2018

tra added a comment to D52259: [CUDA] Fix two failed test cases using --cuda-path-ignore-env.

The patch does not seem to match the description and appears to have nothing to do with rearranging include paths. Could you check whether these are the changes you intended to send for review?

Sep 19 2018, 1:10 PM · Restricted Project

Sep 18 2018

tra updated the diff for D51808: [CUDA] Ignore uncallable functions when we check for usual deallocators..

Renamed last instance of 'Matches' -> 'PreventedBy'.

Sep 18 2018, 5:19 PM
tra updated the diff for D51808: [CUDA] Ignore uncallable functions when we check for usual deallocators..

Updated assertion message.

Sep 18 2018, 5:05 PM
tra committed rL342514: [clang-tidy] Replace redundant checks with an assert()..
[clang-tidy] Replace redundant checks with an assert().
Sep 18 2018, 2:52 PM
tra committed rCTE342514: [clang-tidy] Replace redundant checks with an assert()..
[clang-tidy] Replace redundant checks with an assert().
Sep 18 2018, 2:52 PM
tra closed D52179: [clang-tidy] Replace redundant checks with an assert()..
Sep 18 2018, 2:52 PM · Restricted Project

Sep 17 2018

tra added inline comments to D52179: [clang-tidy] Replace redundant checks with an assert()..
Sep 17 2018, 4:32 PM · Restricted Project
tra added inline comments to D52179: [clang-tidy] Replace redundant checks with an assert()..
Sep 17 2018, 4:00 PM · Restricted Project
tra updated the diff for D52179: [clang-tidy] Replace redundant checks with an assert()..
  • Check that D is non-null
Sep 17 2018, 3:31 PM · Restricted Project
tra added a comment to D52179: [clang-tidy] Replace redundant checks with an assert()..

Is the condition for this assertion checked beforehand or could this create runtime failures?

Sep 17 2018, 12:57 PM · Restricted Project
tra added a dependency for D51808: [CUDA] Ignore uncallable functions when we check for usual deallocators.: D52179: [clang-tidy] Replace redundant checks with an assert()..
Sep 17 2018, 11:25 AM
tra added a dependent revision for D52179: [clang-tidy] Replace redundant checks with an assert().: D51808: [CUDA] Ignore uncallable functions when we check for usual deallocators..
Sep 17 2018, 11:24 AM · Restricted Project
tra updated the diff for D51808: [CUDA] Ignore uncallable functions when we check for usual deallocators..

Addressed Richard's comments.
Moved clang-tidy changes into separate review D52179.

Sep 17 2018, 11:24 AM
tra created D52179: [clang-tidy] Replace redundant checks with an assert()..
Sep 17 2018, 11:20 AM · Restricted Project

Sep 13 2018

tra added a comment to D51808: [CUDA] Ignore uncallable functions when we check for usual deallocators..

@rsmith ping.

Sep 13 2018, 10:07 AM

Sep 7 2018

tra added a comment to D51809: [CUDA][HIP] Fix ShouldDeleteSpecialMember for inherited constructors.

@jlebar Justin, can you take a look?

Sep 7 2018, 1:33 PM
tra added a reviewer for D51809: [CUDA][HIP] Fix ShouldDeleteSpecialMember for inherited constructors: jlebar.
Sep 7 2018, 1:32 PM
tra created D51808: [CUDA] Ignore uncallable functions when we check for usual deallocators..
Sep 7 2018, 11:46 AM

Sep 4 2018

tra committed rT341430: [test-suite, CUDA] Update CUDA cmake files..
[test-suite, CUDA] Update CUDA cmake files.
Sep 4 2018, 4:50 PM
tra committed rL341430: [test-suite, CUDA] Update CUDA cmake files..
[test-suite, CUDA] Update CUDA cmake files.
Sep 4 2018, 4:44 PM
tra closed D51663: [test-suite, CUDA] Update CUDA test suite cmake files..
Sep 4 2018, 4:44 PM
tra added a comment to D51663: [test-suite, CUDA] Update CUDA test suite cmake files..

LGTM, sorry for the breakage.

Sep 4 2018, 4:44 PM
tra updated the diff for D51663: [test-suite, CUDA] Update CUDA test suite cmake files..

Updated the changes according to Matthias' comments.

Sep 4 2018, 4:41 PM
tra added a comment to D51048: cmake: Specify reference outputs in llvm_test_data().

How about D51663?

Sep 4 2018, 4:22 PM
tra created D51663: [test-suite, CUDA] Update CUDA test suite cmake files..
Sep 4 2018, 4:21 PM
tra added a comment to D51048: cmake: Specify reference outputs in llvm_test_data().

Your patch above *almost* works, except that each test variant wants to create the same symlink $BUILD/ExternalCUDA/$TEST.reference_output -> $SRC/External/CUDA/$TEST.reference_output. If the symlinks are created at different points in time, ninja manages to avoid conflicts, but typically one or two attempts to create the symlink launch simultaneously and one of them fails with "symlink already exists". Perhaps the symlink target name should be uniquified in some way.

Sep 4 2018, 12:04 PM
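
A minimal sketch of the two remedies the comment above points toward, written in Python rather than CMake: either tolerate the "already exists" race, or give each test variant its own link name. The helpers `ensure_symlink` and `unique_link_name`, and the `variant` naming scheme, are hypothetical illustrations and are not part of the test-suite.

```python
import os
import tempfile


def ensure_symlink(target: str, link_path: str) -> None:
    """Create link_path -> target, tolerating a concurrent creator.

    Hypothetical helper: if another job already created the same link,
    treat that as success instead of failing the build step.
    """
    try:
        os.symlink(target, link_path)
    except FileExistsError:
        # Another job won the race. Accept the existing link only if it
        # points at the target we expected; otherwise re-raise.
        if os.readlink(link_path) != target:
            raise


def unique_link_name(test: str, variant: str) -> str:
    """Build a per-variant link name so variants never collide.

    The "<test>.<variant>.reference_output" scheme is an assumption
    made for this sketch.
    """
    return f"{test}.{variant}.reference_output"
```

With the first helper, two jobs that create the same link become idempotent; with the second, each variant writes to its own path, so no two jobs ever target the same name.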
tra added a comment to D51048: cmake: Specify reference outputs in llvm_test_data().

Though thinking about it, I don't really understand your problem:

How can there be multiple outputs with the same name? They have different names in the repository (algorithm.reference_output, assert.reference_output, ...)

Sep 4 2018, 11:41 AM
tra abandoned D51501: [CUDA] Fix CUDA compilation broken by D50845.

Not needed anymore after the reverts in rC341115 and rC341118, right?

Sep 4 2018, 9:52 AM

Aug 31 2018

tra added a comment to D51048: cmake: Specify reference outputs in llvm_test_data().

I've run into an unexpected problem after these patches landed.

Aug 31 2018, 5:33 PM
tra accepted D51554: [CUDA][OPENMP][NVPTX]Improve logic of the debug info support..

Nice. So, in effect, for optimized builds we'll generate pre-DWARF line info only, unless --cuda-noopt-device-debug is specified.
Will this deal with the warnings about the back-end being unable to handle particular debug options?

Aug 31 2018, 2:08 PM

Aug 30 2018

tra added inline comments to D51507: Allow all supportable attributes to be used with #pragma clang attribute..
Aug 30 2018, 3:06 PM
tra added a comment to D51441: Add predefined macro __gnu_linux__ for proper aux-triple.

Tests reverted in rL341118.

Aug 30 2018, 1:55 PM
tra added a comment to D51312: [OpenMP][NVPTX] Use appropriate _CALL_ELF macro when offloading.

Reverted in rL341115 & rL341118.

Aug 30 2018, 1:55 PM
tra committed rC341118: Revert the tests that should've been reverted in rL341115.
Revert the tests that should've been reverted in rL341115
Aug 30 2018, 1:55 PM
tra committed rL341118: Revert the tests that should've been reverted in rL341115.
Revert the tests that should've been reverted in rL341115
Aug 30 2018, 1:55 PM
tra added a comment to D50845: [CUDA/OpenMP] Define only some host macros during device compilation.

Tests reverted in rL341118

Aug 30 2018, 1:54 PM
tra added a comment to D51441: Add predefined macro __gnu_linux__ for proper aux-triple.

Reverted in rL341115.

Aug 30 2018, 1:47 PM
tra committed rL341115: Reverted the "[CUDA/OpenMP] Define only some host macros during device….
Reverted the "[CUDA/OpenMP] Define only some host macros during device…
Aug 30 2018, 1:44 PM
tra added a comment to D50845: [CUDA/OpenMP] Define only some host macros during device compilation.

Reverted in rL341115

Aug 30 2018, 1:44 PM
tra committed rC341115: Reverted the "[CUDA/OpenMP] Define only some host macros during device….
Reverted the "[CUDA/OpenMP] Define only some host macros during device…
Aug 30 2018, 1:44 PM
tra added a comment to D50845: [CUDA/OpenMP] Define only some host macros during device compilation.

That, and r340967 D51441. I'm running check-clang now and will land reverted changes shortly.

Aug 30 2018, 1:26 PM
tra added a comment to D50845: [CUDA/OpenMP] Define only some host macros during device compilation.

Ok, the top preprocessor condition for that function is #ifndef __SSE2_MATH__ - the exact same macro that was part of the motivation. Can you please test compiling a simple C file (including math.h) with -mno-sse? My guess would be that this is broken as well.
If yes I'm fine with reverting because I need to teach Clang to allow anonymous unions in type specifiers to make that weird system header work with this patch.

Aug 30 2018, 1:00 PM
tra added a comment to D50845: [CUDA/OpenMP] Define only some host macros during device compilation.

I've sent out D51501. It unbreaks CUDA compilation and keeps OpenMP unchanged.

Aug 30 2018, 12:07 PM
tra created D51501: [CUDA] Fix CUDA compilation broken by D50845.
Aug 30 2018, 12:04 PM
tra added a comment to D50845: [CUDA/OpenMP] Define only some host macros during device compilation.

In general, it looks like this patch leads to some host macros having to be defined again for the auxiliary triple case. It is not clear to me how to exhaustively identify the missing macros, so far it's been just trial and error.

Well, that's the point of this patch, isn't it? Again, the current approach is to just define all macros which is definitely broken.

Aug 30 2018, 11:45 AM
tra added a comment to D50845: [CUDA/OpenMP] Define only some host macros during device compilation.

In our case the headers come from a relatively old glibc, and the compiler errors out on this:

/* This function is used in the `isfinite' macro.  */
__MATH_INLINE int
__NTH (__finite (double __x))
{
  return (__extension__
	  (((((union { double __d; int __i[2]; }) {__d: __x}).__i[1]
	     | 0x800fffffu) + 1) >> 31));
}
Aug 30 2018, 11:39 AM
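
For readers unfamiliar with the bit trick in the quoted `__finite` body, here is a small Python model of it. It assumes a little-endian layout in which `__i[1]` holds the upper 32 bits of the double (sign, 11 exponent bits, top 20 mantissa bits). The names `high_word` and `finite_bit_trick` are illustrative only. OR-ing with `0x800fffff` sets every bit except the exponent field, so adding 1 wraps to zero only when the exponent is all ones (infinity or NaN).

```python
import math
import struct


def high_word(x: float) -> int:
    """Return the upper 32 bits of the IEEE-754 binary64 encoding of x."""
    (bits,) = struct.unpack("<Q", struct.pack("<d", x))
    return bits >> 32


def finite_bit_trick(x: float) -> int:
    """Model of the quoted expression: 1 if x is finite, 0 for inf/NaN."""
    hi = high_word(x)
    # Set the sign bit and the 20 mantissa bits; only the exponent varies.
    filled = hi | 0x800FFFFF
    # Emulate 32-bit unsigned wraparound of "+ 1", then take the top bit.
    return ((filled + 1) & 0xFFFFFFFF) >> 31


def agrees_with_math(x: float) -> bool:
    """Compare the model against math.isfinite for one value."""
    return finite_bit_trick(x) == int(math.isfinite(x))
```

This models only the arithmetic. It says nothing about the compound-literal union syntax that the compiler rejected.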
tra added a comment to D50845: [CUDA/OpenMP] Define only some host macros during device compilation.

FYI. This breaks our CUDA compilation. I haven't figured out what exactly is wrong yet. I may need to roll back the patch if the fix is not obvious.

Aug 30 2018, 11:20 AM
tra accepted D51465: Revamp test-suite documentation.

Looks great. Thank you for updating the docs.

Aug 30 2018, 9:47 AM

Aug 29 2018

tra added inline comments to D51434: [HIP] Add -fvisibility hidden option to clang.
Aug 29 2018, 2:30 PM
tra accepted D51441: Add predefined macro __gnu_linux__ for proper aux-triple.

While we're here, perhaps Builder.defineMacro("__linux__") should be changed to DefineStd("linux") which defines linux/__linux/__linux__?

Aug 29 2018, 11:40 AM
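
A rough Python model of the macro spellings mentioned in the comment above (linux/__linux/__linux__). The function `define_std` is a hypothetical stand-in, not clang's actual code, and the `gnu_mode` gating of the bare spelling is an assumption about how such a helper would avoid polluting the user's namespace in strict modes.

```python
from typing import List


def define_std(macro_name: str, gnu_mode: bool) -> List[str]:
    """Return the macro spellings derived from one base name.

    Hypothetical model: the bare name is emitted only when gnu_mode is
    set (an assumption); the underscored forms are always emitted.
    """
    # The base name must live in the user's namespace, e.g. "linux".
    assert not macro_name.startswith("_")
    macros = []
    if gnu_mode:
        macros.append(macro_name)
    macros.append(f"__{macro_name}")
    macros.append(f"__{macro_name}__")
    return macros
```

Under this model, a single call with "linux" covers all three spellings, whereas defining only `__linux__` leaves the other two undefined.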
tra added inline comments to D51441: Add predefined macro __gnu_linux__ for proper aux-triple.
Aug 29 2018, 10:52 AM
tra added a comment to D51434: [HIP] Add -fvisibility hidden option to clang.

I could not find anything about PLTs in AMDGPU-ABI, nor could I find anything relevant on google.
I still have no idea why PLTs are required in this case. Without that info, the problem may just as well be due to an unintended requirement for PLTs that this patch would hide.

Aug 29 2018, 10:26 AM
tra added a comment to D51434: [HIP] Add -fvisibility hidden option to clang.

Could you elaborate on what exactly is the problem this patch fixes?
I don't see how internalizing the symbols connects to PLTs. My understanding is that PLTs are used to provide stubs for symbols to be resolved by the dynamic linker at runtime. AFAICT AMD does not use shared libs on the device side. What am I missing?

Aug 29 2018, 9:37 AM

Aug 28 2018

tra accepted D51336: [HIP] Fix output file extension.
Aug 28 2018, 10:41 AM

Aug 27 2018

tra accepted D51256: [test-suite, CUDA] Fix some CMake problems.
Aug 27 2018, 11:35 AM
tra accepted D51306: [NVPTX] Implement isLegalToVectorizeLoadChain.

Nice. LGTM.

Aug 27 2018, 9:57 AM

Aug 24 2018

tra accepted D50845: [CUDA/OpenMP] Define only some host macros during device compilation.

Please keep an eye on CUDA buildbot http://lab.llvm.org:8011/builders/clang-cuda-build.
It runs a fair number of tests with libc++ and a handful of libstdc++ versions and may serve as a canary if these changes break something.

Aug 24 2018, 10:32 AM

Aug 23 2018

tra added a comment to D47757: [Sema] Produce diagnostics when unavailable aligned allocation/deallocation functions are called.

I've confirmed that the patch does not break anything in our CUDA code, so it's good to go as far as CUDA is concerned.

Aug 23 2018, 11:06 AM

Aug 22 2018

tra added a comment to D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls.

__clang_cuda_device_functions.h is not intended to be a device-side math.h, despite having a lot of overlap/similarities. It may change any time we get a new CUDA version.
I would suggest writing an OpenMP-specific replacement for math.h which would map to whatever device-specific function OpenMP needs. For NVPTX that may be libdevice, for which you have declarations in __clang_cuda_libdevice_declares.h. Using part of __clang_cuda_device_functions.h may be a decent starting point for NVPTX, but OpenMP will likely need to provide an equivalent for other back-ends, too.

Aug 22 2018, 3:16 PM

Aug 21 2018

tra accepted D51042: [NVPTX] Remove ftz variants of cvt with rounding mode.
In D51042#1207769, @tra wrote:

This is a surprise. PTX ISA does not mention that .ftz is not applicable to cvt.*.f16.* instructions.
Is it only cvt that does not support .ftz or does it impact other instructions? PTX spec has add/sub/mul/fma/set/setp instructions that support f16 and have .ftz variant.

It's only cvt with an explicit rounding mode. I actually ran the output of f16-instructions.ll with FTZ through ptxas and removed instructions until it compiled it. This might even be a bug in ptxas.

Aug 21 2018, 11:00 AM
tra added a comment to D51042: [NVPTX] Remove ftz variants of cvt with rounding mode.

This is a surprise. PTX ISA does not mention that .ftz is not applicable to cvt.*.f16.* instructions.
Is it only cvt that does not support .ftz or does it impact other instructions? PTX spec has add/sub/mul/fma/set/setp instructions that support f16 and have .ftz variant.

Aug 21 2018, 9:44 AM

Aug 20 2018

tra accepted D50957: Rename -mlink-cuda-bitcode to -mlink-builtin-bitcode.
Aug 20 2018, 10:31 AM

Aug 17 2018

tra added a comment to D47757: [Sema] Produce diagnostics when unavailable aligned allocation/deallocation functions are called.

Talked to @ahatanak over IRC. It appears that this patch may have exposed a preexisting bug.
Apparently delete t; in test/SemaCUDA/call-host-fn-from-device.cu does actually end up calling __host__ operator delete. It should've picked __device__ operator delete, but it does not, so reporting an error here appears to be correct.

Aug 17 2018, 3:27 PM