tra (Artem Belevich)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 8 2015, 1:53 PM (179 w, 4 d)

Recent Activity

Today

tra added inline comments to D48013: TableGen/SearchableTables: Support more generic enums and tables.
Mon, Jun 18, 11:49 AM

Wed, Jun 13

tra accepted D48037: [CUDA] Add tests to ensure that std::min/max can be called from __host__ __device__ functions..
Wed, Jun 13, 3:19 PM
tra accepted D48152: [CUDA] Add tests that, in C++14 mode, min/max are constexpr..
Wed, Jun 13, 3:18 PM
tra accepted D48151: [CUDA] Make __host__/__device__ min/max overloads constexpr in C++14..
Wed, Jun 13, 3:17 PM
tra added a comment to D48036: [CUDA] Make min/max shims host+device..

Ack.

Wed, Jun 13, 9:59 AM

Tue, Jun 12

tra added a comment to D47691: [NVPTX] Ignore target-cpu and -features for inling.

I've discussed this with @echristo. Bottom line is that this patch is OK to proceed, and that I'm on the hook to add an NVPTX-specific pass to enforce that all target-cpu and target-feature attributes in the TU are compatible with the current compilation parameters. I.e., if we're compiling for sm_50, then IR with functions that have target-cpu=sm_60 will result in an error.

Tue, Jun 12, 5:33 PM
tra added a comment to D48013: TableGen/SearchableTables: Support more generic enums and tables.

A few nits. LGTM syntax-wise. I'm not very familiar with the backend, so I'll defer to someone who has a better idea.

Tue, Jun 12, 5:23 PM
tra added a comment to D48036: [CUDA] Make min/max shims host+device..

The last comment in the bug pointed out that those overloads should be constexpr in C++14. Maybe in a separate patch, though.

Tue, Jun 12, 11:22 AM
tra accepted D47958: [CUDA][HIP] Allow CUDA __global__ functions to have amdgpu kernel attributes.

Thank you.

Tue, Jun 12, 10:34 AM
tra added a comment to D47691: [NVPTX] Ignore target-cpu and -features for inling.

Bottom line -- the situation is far from perfect, but IMO the patch does the sensible thing if we're compiling IR with mixed target-cpu and target-features attributes using NVPTX.

In summary, I think that's okay so long as there aren't intrinsics that depend on target features.

Tue, Jun 12, 9:57 AM

Fri, Jun 8

tra added a comment to D47958: [CUDA][HIP] Allow CUDA __global__ functions to have amdgpu kernel attributes.

Drive-by review:

Fri, Jun 8, 1:59 PM

Thu, Jun 7

tra accepted D47902: [CUDA] Fix emission of constant strings in sections.
Thu, Jun 7, 1:28 PM
tra added a comment to D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls.

IMO this goes in the right direction: we should use the fast implementation in libdevice. If LLVM doesn't lower these calls in the NVPTX backend, I think it's OK to use header wrappers as CUDA already does.

Thu, Jun 7, 10:15 AM

Wed, Jun 6

tra committed rC334143: [CUDA] Check initializers of instantiated template variables..
[CUDA] Check initializers of instantiated template variables.
Wed, Jun 6, 3:41 PM
tra committed rL334143: [CUDA] Check initializers of instantiated template variables..
[CUDA] Check initializers of instantiated template variables.
Wed, Jun 6, 3:41 PM
tra created D47845: [CUDA] Removed unused __nvvm_* builtins with non-generic pointers..
Wed, Jun 6, 1:31 PM
tra accepted D47431: TableGen: Allow foreach in multiclass to depend on template args.

A few nits. LGTM overall.

Wed, Jun 6, 1:10 PM
tra committed rL334108: [CUDA] Replace 'nv_weak' attributes in CUDA headers with 'weak'..
[CUDA] Replace 'nv_weak' attributes in CUDA headers with 'weak'.
Wed, Jun 6, 10:57 AM
tra committed rC334108: [CUDA] Replace 'nv_weak' attributes in CUDA headers with 'weak'..
[CUDA] Replace 'nv_weak' attributes in CUDA headers with 'weak'.
Wed, Jun 6, 10:57 AM
tra closed D47804: [CUDA] Replace 'nv_weak' attributes in CUDA headers with 'weak'..
Wed, Jun 6, 10:57 AM
tra accepted D47555: [HIP] Fix unbundling.

A few minor nits/suggestions. LGTM otherwise.

Wed, Jun 6, 10:49 AM
tra added a comment to D47733: [CUDA][HIP] Set kernel calling convention before arrange function.

@rsmith - Richard, can you take a look?

Wed, Jun 6, 10:06 AM
tra added a reviewer for D47733: [CUDA][HIP] Set kernel calling convention before arrange function: rsmith.
Wed, Jun 6, 10:05 AM

Tue, Jun 5

tra added a comment to D47376: [CUDA][HIP] Do not offload for -M.

Just to make it clear, I'm not against making a sensible default choice, but rather want to make sure that it is possible to override it if the user needs to.

Tue, Jun 5, 5:35 PM
tra requested changes to D47376: [CUDA][HIP] Do not offload for -M.

I'm not sure this is the right thing to do. What if the user explicitly wants device-side dependencies and runs clang --cuda-device-only -M? This patch makes it impossible.

Tue, Jun 5, 5:31 PM
tra added a comment to D47804: [CUDA] Replace 'nv_weak' attributes in CUDA headers with 'weak'..

I'll wait to see if that fixes @Hahnfeld's problem.

Tue, Jun 5, 5:13 PM
tra created D47804: [CUDA] Replace 'nv_weak' attributes in CUDA headers with 'weak'..
Tue, Jun 5, 4:51 PM
tra added a comment to D47201: [CUDA] Implement nv_weak attribute for functions.

I've experimented a bit and I think that we may not need this patch at all.
As far as I can tell, nv_weak is only applicable to device functions. It's ignored for global kernels and is apparently forbidden for data.
For device functions, nvcc produces a .weak attribute in PTX.

Tue, Jun 5, 4:15 PM
tra removed a reviewer for D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain: tra.
Tue, Jun 5, 3:49 PM
tra added a comment to D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain.

With the updated patch description + the discussion I'm OK with the approach from the general "how do we compile/use CUDA" point of view. I'll leave the question of whether the approach works for OpenMP to someone more familiar with it.

Tue, Jun 5, 3:48 PM
tra added a comment to D47733: [CUDA][HIP] Set kernel calling convention before arrange function.

Looks OK overall, but I'm not very familiar with CGCall, so you may want to dig through the history and find someone with more expertise.

Tue, Jun 5, 2:28 PM
tra added inline comments to D47691: [NVPTX] Ignore target-cpu and -features for inling.
Tue, Jun 5, 1:07 PM
tra accepted D47691: [NVPTX] Ignore target-cpu and -features for inling.
Tue, Jun 5, 11:51 AM

Fri, Jun 1

tra added a comment to D47070: [CUDA] Upgrade linked bitcode to enable inlining.

IMO overriding TargetTransformInfo::areInlineCompatible to always return true on NVPTX is what we want to do instead of upgrading everything else.
AFAICT, on NVPTX there's no reason to prevent inlining due to those attributes -- we'll never generate code for, nor ever execute on, any GPU other than the one we're currently compiling for.

Fri, Jun 1, 10:42 AM
tra added a comment to D47201: [CUDA] Implement nv_weak attribute for functions.

IIUIC, nv_weak is a synonym for weak (<rant>why, oh why did they need it?</rant>)
You may need to hunt down and change a few other places that deal with the weak attribute.
E.g.: https://github.com/llvm-project/llvm-project-20170507/blob/master/clang/lib/AST/Decl.cpp#L4267
https://github.com/llvm-project/llvm-project-20170507/blob/master/clang/lib/CodeGen/ItaniumCXXABI.cpp#L3045

Fri, Jun 1, 10:25 AM

Thu, May 31

tra added a comment to D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain.

I tried this example (https://devblogs.nvidia.com/separate-compilation-linking-cuda-device-code/). It worked with NVCC but not with clang++. I can produce the main.o particle.o and v.o objects as relocatable (-fcuda-rdc) but the final step fails with a missing reference error.

Thu, May 31, 3:23 PM

Wed, May 30

tra accepted D47432: TableGen/DAGPatterns: Allow bit constants in addition to int constants.
Wed, May 30, 11:51 AM
tra added a comment to D47432: TableGen/DAGPatterns: Allow bit constants in addition to int constants.

I agree on the test case, unfortunately it requires a lot of setup to make the backend do anything useful at all.

Wed, May 30, 11:50 AM
tra accepted D47530: TableGen: some LangRef doc fixes.
Wed, May 30, 11:39 AM
tra accepted D47430: TableGen: Streamline the semantics of NAME.
Wed, May 30, 11:38 AM

Tue, May 29

tra added a comment to D47430: TableGen: Streamline the semantics of NAME.

Perhaps unrelated documentation cleanup can be extracted into a separate patch.

Tue, May 29, 2:01 PM
tra added a comment to D47432: TableGen/DAGPatterns: Allow bit constants in addition to int constants.

LGTM in general.
It would be good to add a test that we can use bits w/o explicit casting now.

Tue, May 29, 11:48 AM
tra accepted D46476: [HIP] Add action builder for HIP.

One nit. LGTM otherwise.

Tue, May 29, 11:45 AM
tra added a comment to D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain.

"Interoperability with other compilers" is probably a statement that's a bit too strong. At best it's kind of compatible with CUDA tools and I don't think it's feasible for other compilers. I.e. it will be useless for AMD GPUs and whatever compiler they use.

Tue, May 29, 11:37 AM

Fri, May 25

tra added a comment to D47396: [LLD] Place .nv_fatbin section at the beginning of the executable..

Does this just not work well with other linkers, do other linkers also have special handling, or does it work better elsewhere for some other reason?

Fri, May 25, 4:49 PM
tra created D47396: [LLD] Place .nv_fatbin section at the beginning of the executable..
Fri, May 25, 3:47 PM

Wed, May 23

tra accepted D45212: Add HIP toolchain.

One small nit. LGTM otherwise.

Wed, May 23, 1:42 PM
tra added inline comments to D45212: Add HIP toolchain.
Wed, May 23, 12:13 PM
tra committed rL333098: [CUDA] Fixed the list of GPUs supported by CUDA-9..
[CUDA] Fixed the list of GPUs supported by CUDA-9.
Wed, May 23, 9:50 AM
tra committed rC333098: [CUDA] Fixed the list of GPUs supported by CUDA-9..
[CUDA] Fixed the list of GPUs supported by CUDA-9.
Wed, May 23, 9:50 AM
tra closed D47268: [CUDA] Fixed the list of GPUs supported by CUDA-9.
Wed, May 23, 9:49 AM
tra created D47268: [CUDA] Fixed the list of GPUs supported by CUDA-9.
Wed, May 23, 9:31 AM

Tue, May 22

tra added a comment to D47154: Try to make builtin address space declarations not useless.

CUDA does not expose explicit AS on the clang side. All pointers are treated as generic, and we infer specific address spaces only in LLVM.
__nvvm_atom_*_[sg]_* builtins should probably be removed as they are indeed useless without pointers with explicit AS and NVCC itself does not have such builtins either. Instead, we should convert the generic AS builtin to address-space specific instruction somewhere in LLVM.

Tue, May 22, 4:20 PM
tra added a comment to D47070: [CUDA] Upgrade linked bitcode to enable inlining.

As a short-term fix we can disable feature-to-function attribute propagation for NVPTX until we fix it.

@echristo -- any other suggestions?

This is some of what I was talking about when I was mentioning how function attributes and the targets work. Ideally you'll have a compatible set of features and it won't really cause an issue. The idea is that if you're compiling for a minimum ptx feature of X, then any "compatible" set of ptx should be able to inline into your code. I think you do want the features to propagate in general, just specific use cases may not care one way or another - that said, for those use cases you're probably just compiling everything with the same feature anyhow.

Tue, May 22, 4:09 PM
tra added a comment to D47139: [NaryReassociate] Detect deleted instr with WeakTrackingVH.

@sanjoy is more familiar with this, so I've asked him to take a look.

Tue, May 22, 11:38 AM
tra edited reviewers for D47139: [NaryReassociate] Detect deleted instr with WeakTrackingVH, added: sanjoy; removed: jingyue.
Tue, May 22, 11:37 AM

May 18 2018

tra accepted D45783: [DEBUGINFO, NVPTX] Render `-no-cuda-debug` LLVM option when required..
May 18 2018, 4:43 PM
tra added inline comments to D46472: [HIP] Support offloading by linker script.
May 18 2018, 4:26 PM
tra added a comment to D45212: Add HIP toolchain.

One more thing -- it would be really good to add some tests to make sure your commands are constructed the way you want.

May 18 2018, 3:52 PM
tra added inline comments to D46476: [HIP] Add action builder for HIP.
May 18 2018, 3:52 PM
tra added a comment to D45212: Add HIP toolchain.

Sorry about the long silence. I'm back to continue the reviews. I'll handle what I can today and will continue with the rest on Tuesday.

May 18 2018, 3:17 PM
tra updated subscribers of D47070: [CUDA] Upgrade linked bitcode to enable inlining.

This was not intended. :-( I was unaware that GetCPUAndFeaturesAttributes() would add any feature that looks like a valid CPU name to the target-cpu attribute.
All I needed was to make the builtins available or not. Setting them as function attributes is not what we need here.

May 18 2018, 11:03 AM

May 17 2018

tra accepted D46995: [test-suite] Enable CUDA complex tests with libc++ now that D25403 is resolved..
May 17 2018, 10:13 AM
tra accepted D46994: [test-suite] Test CUDA in C++14 mode with C++11 stdlibs..
May 17 2018, 10:02 AM
tra added a dependency for D46995: [test-suite] Enable CUDA complex tests with libc++ now that D25403 is resolved.: D46994: [test-suite] Test CUDA in C++14 mode with C++11 stdlibs..
May 17 2018, 9:56 AM
tra added a dependent revision for D46994: [test-suite] Test CUDA in C++14 mode with C++11 stdlibs.: D46995: [test-suite] Enable CUDA complex tests with libc++ now that D25403 is resolved..
May 17 2018, 9:56 AM

May 11 2018

tra added a comment to D46782: [CUDA] Allow "extern __shared__ Foo foo[]" within anon. namespaces..

Rather than suppressing the warning, should we instead give such variables external linkage?

May 11 2018, 4:03 PM
tra accepted D46782: [CUDA] Allow "extern __shared__ Foo foo[]" within anon. namespaces..
May 11 2018, 3:59 PM
tra committed rL332142: [Split GEP] handle trunc() in separate-const-offset-from-gep pass..
[Split GEP] handle trunc() in separate-const-offset-from-gep pass.
May 11 2018, 2:19 PM
tra closed D46732: [Split GEP] handle trunc() in separate-const-offset-from-gep pass..
May 11 2018, 2:19 PM

May 10 2018

tra updated the diff for D46732: [Split GEP] handle trunc() in separate-const-offset-from-gep pass..

Fixed mangled comment.

May 10 2018, 3:50 PM
tra created D46732: [Split GEP] handle trunc() in separate-const-offset-from-gep pass..
May 10 2018, 3:47 PM

May 9 2018

tra committed rL331941: [NVPTX] Added a feature to use short pointers for const/local/shared AS..
[NVPTX] Added a feature to use short pointers for const/local/shared AS.
May 9 2018, 4:50 PM
tra closed D46147: [NVPTX] Added a feature to use short pointers for const/local/shared AS..
May 9 2018, 4:50 PM
tra committed rC331938: [CUDA] Added -f[no-]cuda-short-ptr option.
[CUDA] Added -f[no-]cuda-short-ptr option
May 9 2018, 4:14 PM
tra committed rL331938: [CUDA] Added -f[no-]cuda-short-ptr option.
[CUDA] Added -f[no-]cuda-short-ptr option
May 9 2018, 4:13 PM
tra closed D46148: [CUDA] Added -f[no-]cuda-short-ptr option.
May 9 2018, 4:13 PM

May 8 2018

tra accepted D46471: [HIP] Add hip offload kind.

Small nit. LGTM otherwise.

May 8 2018, 11:59 AM
tra added a comment to D44435: CUDA ctor/dtor Module-Unique Symbol Name.

Great! Let's close this review then.
And good luck with cling.

May 8 2018, 10:29 AM

May 4 2018

tra added a comment to D44435: CUDA ctor/dtor Module-Unique Symbol Name.

Perhaps we should take a step back and consider whether this is the right approach to solve your problem.

May 4 2018, 10:24 AM

May 3 2018

tra added a comment to D46147: [NVPTX] Added a feature to use short pointers for const/local/shared AS..

Would it be possible to get rid of the flag entirely and make this controlled only by the data layout? That's defined on the module level.

Yes, ultimately the switch is just changing the datalayout. If this is just a short term thing for migration I think a global flag is fine. If you want to be able to swap these properly long term, this is a different arch name / TargetMachine.

May 3 2018, 1:47 PM

May 1 2018

tra removed a reviewer for D46119: Add a header file for enum IIT_Info, NFC: tra.
May 1 2018, 1:59 PM

Apr 27 2018

tra updated the diff for D46148: [CUDA] Added -f[no-]cuda-short-ptr option.

Do not use a subtarget feature to control codegen of short pointers.
Instead, we need to add a field to TargetOptions, which is the only way to pass this info to the NVPTXTargetInfo constructor where we calculate the DataLayout.

Apr 27 2018, 4:31 PM
tra added a comment to D46147: [NVPTX] Added a feature to use short pointers for const/local/shared AS..

@arsenm Will this do?

Apr 27 2018, 3:42 PM
tra updated the diff for D46147: [NVPTX] Added a feature to use short pointers for const/local/shared AS..

Moved control of the feature from subtarget feature to --nvptx-short-ptr command-line option.

Apr 27 2018, 3:39 PM
tra added a comment to D46147: [NVPTX] Added a feature to use short pointers for const/local/shared AS..

The subtarget is a function level property. The pointer size / datalayout is a module property and can't be different between functions. If you have multiple functions with different pointer sizes in a module (or no functions at all), then what pointer size ends up used for a global? The global itself has a size, and any ConstantExpr pointers in its initializer will also need to have a size.

Apr 27 2018, 3:08 PM
tra added a comment to D46147: [NVPTX] Added a feature to use short pointers for const/local/shared AS..

NVPTX wasn't already doing this? As this is a module property, a subtarget feature can't be used for this. For example what happens for a global value?

Apr 27 2018, 2:31 PM
tra added inline comments to D46147: [NVPTX] Added a feature to use short pointers for const/local/shared AS..
Apr 27 2018, 1:45 PM
tra updated the diff for D46147: [NVPTX] Added a feature to use short pointers for const/local/shared AS..

Use DataLayout to obtain pointer size.

Apr 27 2018, 1:40 PM

Apr 26 2018

tra added inline comments to D46147: [NVPTX] Added a feature to use short pointers for const/local/shared AS..
Apr 26 2018, 3:42 PM
tra updated the diff for D46147: [NVPTX] Added a feature to use short pointers for const/local/shared AS..

Updated feature description.

Apr 26 2018, 3:41 PM
tra updated the diff for D46147: [NVPTX] Added a feature to use short pointers for const/local/shared AS..

UseShortPointer -> UseShortPointers

Apr 26 2018, 3:37 PM
tra added a dependent revision for D46147: [NVPTX] Added a feature to use short pointers for const/local/shared AS.: D46148: [CUDA] Added -f[no-]cuda-short-ptr option.
Apr 26 2018, 3:25 PM
tra added a dependency for D46148: [CUDA] Added -f[no-]cuda-short-ptr option: D46147: [NVPTX] Added a feature to use short pointers for const/local/shared AS..
Apr 26 2018, 3:25 PM
tra updated the diff for D46148: [CUDA] Added -f[no-]cuda-short-ptr option.

Removed debug printout.

Apr 26 2018, 3:24 PM
tra created D46148: [CUDA] Added -f[no-]cuda-short-ptr option.
Apr 26 2018, 3:20 PM
tra created D46147: [NVPTX] Added a feature to use short pointers for const/local/shared AS..
Apr 26 2018, 3:18 PM
tra accepted D46130: [NVPTX] Turn on Loop/SLP vectorization.
Apr 26 2018, 9:35 AM
tra updated subscribers of D46116: [NVPTX] Make the legalizer expand shufflevector of <2 x half>.

LGTM.

Apr 26 2018, 8:22 AM

Apr 24 2018

tra accepted D44984: [HIP] Add hip input kind and codegen for kernel launching.
Apr 24 2018, 1:37 PM