gtbercea (Gheorghe-Teodor Bercea)
User

Projects

User does not belong to any projects.

User Details

User Since
Dec 29 2016, 12:44 AM (59 w, 4 d)

Recent Activity

Wed, Feb 14

gtbercea updated the diff for D43197: [OpenMP] Add flag for linking runtime bitcode library.

Fix test.

Wed, Feb 14, 12:28 PM
gtbercea updated the diff for D43197: [OpenMP] Add flag for linking runtime bitcode library.

Revert.

Wed, Feb 14, 12:11 PM
gtbercea added a comment to D43197: [OpenMP] Add flag for linking runtime bitcode library.

I'm still not sure we can't run this test on Windows. I think lots of other tests use touch, even some specific to Windows...

Wed, Feb 14, 11:58 AM
gtbercea updated the diff for D43197: [OpenMP] Add flag for linking runtime bitcode library.

Use %T.

Wed, Feb 14, 11:32 AM
gtbercea added inline comments to D43197: [OpenMP] Add flag for linking runtime bitcode library.
Wed, Feb 14, 8:38 AM
gtbercea updated the diff for D43197: [OpenMP] Add flag for linking runtime bitcode library.

Fix tmp folder name.

Wed, Feb 14, 8:23 AM
gtbercea added inline comments to D43197: [OpenMP] Add flag for linking runtime bitcode library.
Wed, Feb 14, 8:21 AM
gtbercea updated the diff for D43197: [OpenMP] Add flag for linking runtime bitcode library.

Move unix specific test to new file.

Wed, Feb 14, 8:19 AM

Mon, Feb 12

gtbercea updated the diff for D43197: [OpenMP] Add flag for linking runtime bitcode library.

Add regression tests.

Mon, Feb 12, 12:54 PM
gtbercea updated the diff for D43197: [OpenMP] Add flag for linking runtime bitcode library.

Fix warning message.

Mon, Feb 12, 9:26 AM
gtbercea created D43197: [OpenMP] Add flag for linking runtime bitcode library.
Mon, Feb 12, 8:59 AM
gtbercea updated the diff for D41724: [OpenMP][libomptarget] Enable the compilation of multiple bc libraries for runtime inlining.

Change name to include full compute capability name not just the number.

Mon, Feb 12, 8:43 AM

Fri, Feb 9

gtbercea added a comment to D41724: [OpenMP][libomptarget] Enable the compilation of multiple bc libraries for runtime inlining.

ping @Hahnfeld

Fri, Feb 9, 2:37 PM

Thu, Feb 8

gtbercea abandoned D39005: [OpenMP] Clean up variable and function names for NVPTX backend.
Thu, Feb 8, 8:07 AM
gtbercea accepted D42841: [docs] Improve help for OpenMP options.

LG

Thu, Feb 8, 8:07 AM
gtbercea updated the diff for D41724: [OpenMP][libomptarget] Enable the compilation of multiple bc libraries for runtime inlining.

Remove LIBOMPTARGET_NVPTX_COMPUTE_CAPABILITY

Thu, Feb 8, 8:00 AM
gtbercea updated the diff for D41724: [OpenMP][libomptarget] Enable the compilation of multiple bc libraries for runtime inlining.

Add deprecated message to documentation.

Thu, Feb 8, 7:53 AM
gtbercea added a comment to D41724: [OpenMP][libomptarget] Enable the compilation of multiple bc libraries for runtime inlining.

Forgot to add @grokos as a reviewer. Fixed now.

Thu, Feb 8, 7:28 AM
gtbercea added a reviewer for D41724: [OpenMP][libomptarget] Enable the compilation of multiple bc libraries for runtime inlining: grokos.
Thu, Feb 8, 7:25 AM
gtbercea added a comment to D41724: [OpenMP][libomptarget] Enable the compilation of multiple bc libraries for runtime inlining.

Good to go @Hahnfeld ?

Thu, Feb 8, 7:18 AM
gtbercea updated the diff for D41724: [OpenMP][libomptarget] Enable the compilation of multiple bc libraries for runtime inlining.

Fix documentation.

Thu, Feb 8, 7:17 AM

Wed, Feb 7

gtbercea updated the diff for D41724: [OpenMP][libomptarget] Enable the compilation of multiple bc libraries for runtime inlining.

Fix documentation.

Wed, Feb 7, 1:23 PM
gtbercea updated the diff for D41724: [OpenMP][libomptarget] Enable the compilation of multiple bc libraries for runtime inlining.

Reinstate cache variable.

Wed, Feb 7, 11:19 AM
gtbercea added inline comments to D41724: [OpenMP][libomptarget] Enable the compilation of multiple bc libraries for runtime inlining.
Wed, Feb 7, 10:53 AM
gtbercea added inline comments to D41724: [OpenMP][libomptarget] Enable the compilation of multiple bc libraries for runtime inlining.
Wed, Feb 7, 10:51 AM
gtbercea added a comment to D41485: [OpenMP][libomptarget] Add data sharing support in libomptarget.

@Hahnfeld any further comments?

Wed, Feb 7, 9:04 AM
gtbercea updated the diff for D41485: [OpenMP][libomptarget] Add data sharing support in libomptarget.

Move definition of data sharing variables.

Wed, Feb 7, 8:12 AM
gtbercea added inline comments to D41485: [OpenMP][libomptarget] Add data sharing support in libomptarget.
Wed, Feb 7, 7:32 AM

Tue, Feb 6

gtbercea updated the diff for D41724: [OpenMP][libomptarget] Enable the compilation of multiple bc libraries for runtime inlining.

Rebase on new master.

Tue, Feb 6, 2:07 PM
gtbercea added a comment to D38978: [OpenMP] Enable the lowering of implicitly shared variables in OpenMP GPU-offloaded target regions to the GPU shared memory.

ping

Tue, Feb 6, 1:15 PM
gtbercea updated the diff for D41485: [OpenMP][libomptarget] Add data sharing support in libomptarget.

Update against latest master branch.

Tue, Feb 6, 12:22 PM

Jan 12 2018

gtbercea added inline comments to D14254: [OpenMP] Initial implementation of OpenMP offloading library - libomptarget device RTLs..
Jan 12 2018, 9:49 AM · Restricted Project

Jan 9 2018

gtbercea updated the diff for D38978: [OpenMP] Enable the lowering of implicitly shared variables in OpenMP GPU-offloaded target regions to the GPU shared memory.

Remove LABEL from tests and add TODO comment for shared memory limit.

Jan 9 2018, 8:29 AM
gtbercea added a comment to D14254: [OpenMP] Initial implementation of OpenMP offloading library - libomptarget device RTLs..

Two global remarks:

  1. I think we agreed on having <thread id> % <warp size> instead of bit operations.

What's wrong with bit-wise operations as long as they are documented? I think we should keep them and then comment what it is that they do.

Please see my discussion with Samuel: The code means to do a modulo and the bit operation is an optimization that the compiler can do.
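
To make the point about modulo versus bit operations concrete, here is a minimal sketch rather than code from the patch. It assumes a warp size of 32; `kWarpSize`, `laneIdModulo` and `laneIdMask` are hypothetical names chosen for illustration. For an unsigned thread id and a power-of-two warp size the two forms agree, which is why the readable modulo can be left for the compiler to lower to a mask.

```cpp
#include <cassert>
#include <cstdint>

// Assumption: warp size of 32. All names below are hypothetical.
constexpr std::uint32_t kWarpSize = 32;

// Readable form: lane index is the thread id modulo the warp size.
constexpr std::uint32_t laneIdModulo(std::uint32_t threadId) {
  return threadId % kWarpSize;
}

// Hand-written mask form: equivalent only when kWarpSize is a power of two.
constexpr std::uint32_t laneIdMask(std::uint32_t threadId) {
  return threadId & (kWarpSize - 1);
}

// The mask trick relies on the warp size being a power of two.
static_assert((kWarpSize & (kWarpSize - 1)) == 0,
              "mask form needs a power-of-two warp size");
// Both forms give the same lane for a sample thread id.
static_assert(laneIdModulo(37) == 5 && laneIdMask(37) == 5,
              "modulo and mask forms must agree");
```

The equivalence holds only for unsigned operands; with a signed thread id the modulo of a negative value would differ from the masked result.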

Jan 9 2018, 8:12 AM · Restricted Project
gtbercea added a comment to D14254: [OpenMP] Initial implementation of OpenMP offloading library - libomptarget device RTLs..

Two global remarks:

  1. I think we agreed on having <thread id> % <warp size> instead of bit operations.
Jan 9 2018, 5:35 AM · Restricted Project

Jan 5 2018

gtbercea added a comment to D38978: [OpenMP] Enable the lowering of implicitly shared variables in OpenMP GPU-offloaded target regions to the GPU shared memory.
In D38978#968565, @tra wrote:

I'm still curious to hear what you plan to do when your depot use grows beyond a certain limit. At the very least there's the physical limit on shared memory size. Shared memory use also affects how many threads can be launched, which has a large impact on performance. IMO having some sort of user-controllable threshold would be very desirable.

When shared memory isn't enough to hold the shared depot, global memory will be used instead. That is a scheme which will be covered by a future patch.

Good luck with that. IMO if your kernel requires all shared memory available per multiprocessor, you are almost guaranteed suboptimal performance because you will not have enough threads running -- neither for peak compute, nor to hide global memory access latency. My bet is that you will eventually end up limiting shared memory use to a fairly small fraction of it.
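
The trade-off between shared memory use and the number of running threads can be sketched with a simplified model. Every figure here is an illustrative assumption, not a value from the patch or from a specific GPU: 48 KiB of shared memory per multiprocessor and a cap of 16 resident blocks. The names `kSharedPerSM`, `kMaxBlocksPerSM` and `residentBlocks` are hypothetical, and the model ignores register and thread-count limits.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

// Illustrative assumptions only: per-multiprocessor shared memory and the
// maximum number of resident blocks vary by GPU generation.
constexpr std::size_t kSharedPerSM = 48 * 1024;
constexpr std::size_t kMaxBlocksPerSM = 16;

// Simplified model: number of blocks that fit on one multiprocessor when each
// block reserves sharedPerBlock bytes of shared memory.
inline std::size_t residentBlocks(std::size_t sharedPerBlock) {
  if (sharedPerBlock == 0)
    return kMaxBlocksPerSM;  // shared memory is not the limiting resource
  return std::min(kMaxBlocksPerSM, kSharedPerSM / sharedPerBlock);
}
```

Under these assumptions a block that reserves 4 KiB leaves room for 12 resident blocks, while a block that reserves the full 48 KiB leaves room for only one, which is the situation the comment above warns about.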

Jan 5 2018, 11:21 AM
gtbercea added a comment to D38978: [OpenMP] Enable the lowering of implicitly shared variables in OpenMP GPU-offloaded target regions to the GPU shared memory.
In D38978#967485, @tra wrote:

Dotting the 'i's on the questions that were not replied to directly.

In D38978#899205, @tra wrote:

Considering that device-side code tends to be heavily inlined, it may be prudent to add an option to control the total size of shared memory we allow to be used for this purpose.

I'm still curious to hear what you plan to do when your depot use grows beyond a certain limit. At the very least there's the physical limit on shared memory size. Shared memory use also affects how many threads can be launched, which has a large impact on performance. IMO having some sort of user-controllable threshold would be very desirable.

Jan 5 2018, 2:57 AM
gtbercea updated the diff for D38978: [OpenMP] Enable the lowering of implicitly shared variables in OpenMP GPU-offloaded target regions to the GPU shared memory.

Address comments.

Jan 5 2018, 2:54 AM

Jan 4 2018

gtbercea added a reviewer for D41724: [OpenMP][libomptarget] Enable the compilation of multiple bc libraries for runtime inlining: ABataev.
Jan 4 2018, 5:02 AM
gtbercea added a dependent revision for D14254: [OpenMP] Initial implementation of OpenMP offloading library - libomptarget device RTLs.: D41724: [OpenMP][libomptarget] Enable the compilation of multiple bc libraries for runtime inlining.
Jan 4 2018, 4:51 AM · Restricted Project
gtbercea updated the summary of D41724: [OpenMP][libomptarget] Enable the compilation of multiple bc libraries for runtime inlining.
Jan 4 2018, 4:51 AM
gtbercea created D41724: [OpenMP][libomptarget] Enable the compilation of multiple bc libraries for runtime inlining.
Jan 4 2018, 4:50 AM
gtbercea added a comment to D41485: [OpenMP][libomptarget] Add data sharing support in libomptarget.

ping

Jan 4 2018, 3:14 AM
gtbercea added a comment to D38978: [OpenMP] Enable the lowering of implicitly shared variables in OpenMP GPU-offloaded target regions to the GPU shared memory.

ping

Jan 4 2018, 3:13 AM

Jan 3 2018

gtbercea abandoned D41486: [OpenMP][Clang] Add missing argument to runtime functions..

Functionality already landed. See previous comment.

Jan 3 2018, 4:00 AM

Dec 22 2017

gtbercea updated the diff for D41485: [OpenMP][libomptarget] Add data sharing support in libomptarget.
Dec 22 2017, 3:29 AM
gtbercea added inline comments to D41485: [OpenMP][libomptarget] Add data sharing support in libomptarget.
Dec 22 2017, 3:19 AM

Dec 21 2017

gtbercea added inline comments to D41485: [OpenMP][libomptarget] Add data sharing support in libomptarget.
Dec 21 2017, 5:11 AM
gtbercea closed D40451: [OpenMP] Add function attribute for triggering shared memory lowering in the LLVM backend.

Committed here D41123

Dec 21 2017, 5:09 AM
gtbercea updated the diff for D41486: [OpenMP][Clang] Add missing argument to runtime functions..

Address comments.

Dec 21 2017, 4:58 AM
gtbercea added a comment to D41486: [OpenMP][Clang] Add missing argument to runtime functions..

D41012? This patch doesn't update the documentation with function signatures.

Dec 21 2017, 4:43 AM
gtbercea created D41486: [OpenMP][Clang] Add missing argument to runtime functions..
Dec 21 2017, 4:35 AM
gtbercea created D41485: [OpenMP][libomptarget] Add data sharing support in libomptarget.
Dec 21 2017, 4:29 AM

Dec 20 2017

gtbercea added a comment to D14254: [OpenMP] Initial implementation of OpenMP offloading library - libomptarget device RTLs..

Thanks @Hahnfeld just making sure this goes ahead! :)

Dec 20 2017, 2:29 AM · Restricted Project
gtbercea accepted D14254: [OpenMP] Initial implementation of OpenMP offloading library - libomptarget device RTLs..
Dec 20 2017, 2:01 AM · Restricted Project
gtbercea added a reviewer for D14254: [OpenMP] Initial implementation of OpenMP offloading library - libomptarget device RTLs.: gtbercea.

Good to go @Hahnfeld ?

Dec 20 2017, 2:00 AM · Restricted Project

Dec 19 2017

gtbercea updated the diff for D38978: [OpenMP] Enable the lowering of implicitly shared variables in OpenMP GPU-offloaded target regions to the GPU shared memory.

Use LLVM function for checking if pointer is stored.

Dec 19 2017, 4:22 AM

Dec 12 2017

gtbercea updated the diff for D41123: [OpenMP] Add function attribute for triggering data sharing..

Fix test.

Dec 12 2017, 11:45 AM
gtbercea created D41123: [OpenMP] Add function attribute for triggering data sharing..
Dec 12 2017, 11:02 AM

Dec 4 2017

gtbercea added a comment to D38978: [OpenMP] Enable the lowering of implicitly shared variables in OpenMP GPU-offloaded target regions to the GPU shared memory.

ping

Dec 4 2017, 10:43 AM

Nov 28 2017

gtbercea added a comment to D38978: [OpenMP] Enable the lowering of implicitly shared variables in OpenMP GPU-offloaded target regions to the GPU shared memory.

@tra @hfinkel

Nov 28 2017, 7:17 AM

Nov 27 2017

gtbercea added inline comments to D14254: [OpenMP] Initial implementation of OpenMP offloading library - libomptarget device RTLs..
Nov 27 2017, 5:42 PM · Restricted Project
gtbercea added a comment to D38978: [OpenMP] Enable the lowering of implicitly shared variables in OpenMP GPU-offloaded target regions to the GPU shared memory.

ping

Nov 27 2017, 7:56 AM

Nov 24 2017

gtbercea updated the diff for D40451: [OpenMP] Add function attribute for triggering shared memory lowering in the LLVM backend.
Nov 24 2017, 4:01 PM
gtbercea created D40451: [OpenMP] Add function attribute for triggering shared memory lowering in the LLVM backend.
Nov 24 2017, 3:58 PM
gtbercea updated the diff for D38978: [OpenMP] Enable the lowering of implicitly shared variables in OpenMP GPU-offloaded target regions to the GPU shared memory.

Add regression tests and allow for shared memory lowering to be disabled at function level.

Nov 24 2017, 3:22 PM

Nov 21 2017

gtbercea closed D38976: [OpenMP] Add implicit data sharing support when offloading to NVIDIA GPUs using OpenMP device offloading.
Nov 21 2017, 7:55 AM

Nov 20 2017

gtbercea accepted D40250: [OpenMP] Consistently use cubin extension for nvlink.

LG

Nov 20 2017, 4:47 PM
gtbercea added inline comments to D40250: [OpenMP] Consistently use cubin extension for nvlink.
Nov 20 2017, 10:33 AM
gtbercea added inline comments to D40250: [OpenMP] Consistently use cubin extension for nvlink.
Nov 20 2017, 9:01 AM
gtbercea added inline comments to D40250: [OpenMP] Consistently use cubin extension for nvlink.
Nov 20 2017, 8:46 AM

Nov 3 2017

gtbercea updated the diff for D38976: [OpenMP] Add implicit data sharing support when offloading to NVIDIA GPUs using OpenMP device offloading.

Remove blocks.

Nov 3 2017, 1:48 PM
gtbercea updated the diff for D38976: [OpenMP] Add implicit data sharing support when offloading to NVIDIA GPUs using OpenMP device offloading.
Nov 3 2017, 1:25 PM

Oct 18 2017

gtbercea updated the diff for D39061: [buildbot] Increase timeout for libomp-clang-ppc64le-linux-debian builder.
Oct 18 2017, 11:26 AM
gtbercea updated the diff for D39061: [buildbot] Increase timeout for libomp-clang-ppc64le-linux-debian builder.
Oct 18 2017, 11:06 AM
gtbercea added a reviewer for D39061: [buildbot] Increase timeout for libomp-clang-ppc64le-linux-debian builder: sfantao.
Oct 18 2017, 10:57 AM
gtbercea created D39061: [buildbot] Increase timeout for libomp-clang-ppc64le-linux-debian builder.
Oct 18 2017, 10:55 AM
gtbercea added a comment to D39005: [OpenMP] Clean up variable and function names for NVPTX backend.

I'd be interested to get the ball rolling in regard to coming up with a fix for this. I see some suggestions in past patches. Some help/clarification would be much appreciated.

Happy to help, but I'm not sure what to offer beyond the link in Art's previous comment.

Oct 18 2017, 8:17 AM

Oct 17 2017

gtbercea added a comment to D39005: [OpenMP] Clean up variable and function names for NVPTX backend.

Hi Artem, Justin,

Oct 17 2017, 2:19 PM
gtbercea updated the diff for D38978: [OpenMP] Enable the lowering of implicitly shared variables in OpenMP GPU-offloaded target regions to the GPU shared memory.

Eliminate variable and function name clean-up. That has been moved into a separate patch: D39005

Oct 17 2017, 8:27 AM
gtbercea created D39005: [OpenMP] Clean up variable and function names for NVPTX backend.
Oct 17 2017, 8:27 AM

Oct 16 2017

gtbercea updated the summary of D38976: [OpenMP] Add implicit data sharing support when offloading to NVIDIA GPUs using OpenMP device offloading.
Oct 16 2017, 2:30 PM
gtbercea created D38978: [OpenMP] Enable the lowering of implicitly shared variables in OpenMP GPU-offloaded target regions to the GPU shared memory.
Oct 16 2017, 2:29 PM
gtbercea created D38976: [OpenMP] Add implicit data sharing support when offloading to NVIDIA GPUs using OpenMP device offloading.
Oct 16 2017, 2:21 PM
gtbercea added a comment to D38883: [CMake][OpenMP] Customize default offloading arch.

LGTM

Oct 16 2017, 11:56 AM

Oct 13 2017

gtbercea added inline comments to D38883: [CMake][OpenMP] Customize default offloading arch.
Oct 13 2017, 11:41 AM
gtbercea added inline comments to D38883: [CMake][OpenMP] Customize default offloading arch.
Oct 13 2017, 11:39 AM
gtbercea added inline comments to D38883: [CMake][OpenMP] Customize default offloading arch.
Oct 13 2017, 11:19 AM
gtbercea added inline comments to D38883: [CMake][OpenMP] Customize default offloading arch.
Oct 13 2017, 11:17 AM
gtbercea added inline comments to D38883: [CMake][OpenMP] Customize default offloading arch.
Oct 13 2017, 11:16 AM
gtbercea added inline comments to D38883: [CMake][OpenMP] Customize default offloading arch.
Oct 13 2017, 11:05 AM
gtbercea added inline comments to D38883: [CMake][OpenMP] Customize default offloading arch.
Oct 13 2017, 11:04 AM
gtbercea added inline comments to D38883: [CMake][OpenMP] Customize default offloading arch.
Oct 13 2017, 11:02 AM

Sep 27 2017

gtbercea accepted D38258: [OpenMP] Fix passing of -m arguments to device toolchain.

LGTM

Sep 27 2017, 7:46 AM
gtbercea accepted D38259: [OpenMP] Fix translation of target args.

LGTM

Sep 27 2017, 7:42 AM
gtbercea added inline comments to D38258: [OpenMP] Fix passing of -m arguments to device toolchain.
Sep 27 2017, 7:40 AM
gtbercea accepted D38257: [OpenMP] Fix memory leak when translating arguments.

LGTM

Sep 27 2017, 7:37 AM
gtbercea closed D38040: [OpenMP] Add an additional test for D34888.
Sep 27 2017, 7:32 AM

Sep 26 2017

gtbercea added a reviewer for D38040: [OpenMP] Add an additional test for D34888: ABataev.
Sep 26 2017, 6:59 PM
gtbercea updated the diff for D38040: [OpenMP] Add an additional test for D34888.

Fix test.

Sep 26 2017, 6:58 PM