gtbercea (Gheorghe-Teodor Bercea)
User

Projects

User does not belong to any projects.

User Details

User Since
Dec 29 2016, 12:44 AM (94 w, 5 d)

Recent Activity

Fri, Oct 19

gtbercea added a dependency for D53448: [OpenMP][NVPTX] Use single loops when generating code for distribute parallel for: D53443: [OpenMP][NVPTX] Enable default scheduling for parallel for in non-SPMD cases..
Fri, Oct 19, 1:06 PM
gtbercea added a dependent revision for D53443: [OpenMP][NVPTX] Enable default scheduling for parallel for in non-SPMD cases.: D53448: [OpenMP][NVPTX] Use single loops when generating code for distribute parallel for.
Fri, Oct 19, 1:06 PM
gtbercea updated the diff for D53448: [OpenMP][NVPTX] Use single loops when generating code for distribute parallel for.
Rebase.
Fri, Oct 19, 12:34 PM
gtbercea created D53448: [OpenMP][NVPTX] Use single loops when generating code for distribute parallel for.
Fri, Oct 19, 12:33 PM
gtbercea updated the diff for D53141: [OpenMP][libomptarget] Add runtime function for pushing coalesced global records.
Refactor.
Fri, Oct 19, 12:21 PM
gtbercea updated the diff for D53443: [OpenMP][NVPTX] Enable default scheduling for parallel for in non-SPMD cases..
Refactor.
Fri, Oct 19, 12:19 PM
gtbercea updated the diff for D53141: [OpenMP][libomptarget] Add runtime function for pushing coalesced global records.
Refactor.
Fri, Oct 19, 12:16 PM
gtbercea updated the diff for D53141: [OpenMP][libomptarget] Add runtime function for pushing coalesced global records.
Refactor.
Fri, Oct 19, 10:54 AM
gtbercea created D53443: [OpenMP][NVPTX] Enable default scheduling for parallel for in non-SPMD cases..
Fri, Oct 19, 10:39 AM

Thu, Oct 11

gtbercea updated the diff for D53141: [OpenMP][libomptarget] Add runtime function for pushing coalesced global records.

Simply call the common push function.

Thu, Oct 11, 12:32 PM
gtbercea updated the diff for D53141: [OpenMP][libomptarget] Add runtime function for pushing coalesced global records.

Refactor.

Thu, Oct 11, 11:10 AM
gtbercea updated the diff for D53141: [OpenMP][libomptarget] Add runtime function for pushing coalesced global records.

Ensure PushSize is multiple of 8 bytes.
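
For illustration only, here is a minimal sketch of one common way to round a byte count up to the next multiple of 8. The helper name `round_up_to_8` is hypothetical, and the actual change in D53141 may enforce the alignment of PushSize differently.

```c
#include <stddef.h>

/* Hypothetical helper (not taken from D53141): round a size in bytes up to
   the next multiple of 8. Because 8 is a power of two, adding 7 and clearing
   the low three bits gives the smallest multiple of 8 that is >= size. */
size_t round_up_to_8(size_t size) {
  return (size + 7) & ~(size_t)7;
}
```

Sizes that are already a multiple of 8 are returned unchanged, so applying the helper more than once is harmless.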

Thu, Oct 11, 10:56 AM
gtbercea created D53141: [OpenMP][libomptarget] Add runtime function for pushing coalesced global records.
Thu, Oct 11, 8:21 AM

Mon, Oct 1

gtbercea abandoned D29660: [OpenMP] Add flag for overwriting default PTX version for OpenMP targets.

Going through my list of reviews, this patch was reverted because of memory leaks in other changes. However, I don't think we need this anymore because Clang is raising the PTX level as needed for that CUDA version. Can we abandon this flag?

Mon, Oct 1, 7:21 AM

Fri, Sep 28

gtbercea added a comment to D52434: [OpenMP] Make default distribute schedule for NVPTX target regions in SPMD mode achieve coalescing.

You report a slow down which I am not able to reproduce actually. Do you use any additional clauses not present in your previous post?

No, only dist_schedule(static) which is faster. Tested on a Tesla P100 with today's trunk version:

#pragma omp target teams distribute parallel for (new defaults): 190 - 250 GB/s
adding clauses for old defaults, schedule(static) dist_schedule(static): 30 - 50 GB/s
same directive with only dist_schedule(static) added (fewer registers): 320 - 400 GB/s
Fri, Sep 28, 10:36 AM
gtbercea added a comment to D52434: [OpenMP] Make default distribute schedule for NVPTX target regions in SPMD mode achieve coalescing.

One big problem your code has is that the trip count is incredibly small, especially for STREAM and especially on GPUs. You need a much larger loop size otherwise the timings will be dominated by OpenMP setup costs.

Sure, I'm not that dumb. The real code has larger loops, this was just for demonstration purposes. I don't expect the register count to change based on loop size - is that too optimistic?

Fri, Sep 28, 7:54 AM
gtbercea added a comment to D52434: [OpenMP] Make default distribute schedule for NVPTX target regions in SPMD mode achieve coalescing.

One big problem your code has is that the trip count is incredibly small, especially for STREAM and especially on GPUs. You need a much larger loop size otherwise the timings will be dominated by OpenMP setup costs.

Sure, I'm not that dumb. The real code has larger loops, this was just for demonstration purposes. I don't expect the register count to change based on loop size - is that too optimistic?

Fri, Sep 28, 5:40 AM
gtbercea added a comment to D52434: [OpenMP] Make default distribute schedule for NVPTX target regions in SPMD mode achieve coalescing.

Just tested this and got very weird results for register usage:

void func(double *a) {
  #pragma omp target teams distribute parallel for map(a[0:100]) // dist_schedule(static)
  for (int i = 0; i < 100; i++) {
    a[i]++;
  }
}

Compiling with current trunk for sm_60 (Pascal): 29 registers
Adding dist_schedule(static) (the previous default): 19 registers
For reference: dist_schedule(static, 128) also uses 29 registers

Any ideas? This significantly slows down STREAM...

Fri, Sep 28, 5:27 AM

Thu, Sep 27

gtbercea updated the diff for D52629: [OpenMP] Make default parallel for schedule in NVPTX target regions in SPMD mode achieve coalescing.

Address comment.

Thu, Sep 27, 1:25 PM
gtbercea created D52629: [OpenMP] Make default parallel for schedule in NVPTX target regions in SPMD mode achieve coalescing.
Thu, Sep 27, 1:18 PM
gtbercea retitled D52434: [OpenMP] Make default distribute schedule for NVPTX target regions in SPMD mode achieve coalescing from [OpenMP] Make default schedules for NVPTX target regions in SPMD mode achieve coalescing to [OpenMP] Make default distribute schedule for NVPTX target regions in SPMD mode achieve coalescing.
Thu, Sep 27, 12:25 PM
gtbercea added a comment to D52434: [OpenMP] Make default distribute schedule for NVPTX target regions in SPMD mode achieve coalescing.

Should we also change the default schedule to static, 1? I know that's not really needed for teams distribute parallel for (because the new default dist_schedule only leaves one iteration per thread), but this doesn't happen for target parallel for. Additionally it would make the intent more explicit and LLVM doesn't need to look through divisions needed to implement static without chunk. Just thinking aloud, not sure if that's worth it.
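
To make the difference between the two schedules concrete, here is a small sketch of which thread owns iteration i in each case. Both function names are hypothetical. The block size used for `schedule(static)` without a chunk (the ceiling of n / num_threads) is an assumption about one common implementation, not something the OpenMP specification or this review fixes.

```c
/* Hypothetical model of schedule(static, 1): iterations are dealt out
   round-robin, so consecutive iterations go to consecutive threads. */
int owner_static_chunk1(int i, int num_threads) {
  return i % num_threads;
}

/* Hypothetical model of schedule(static) without a chunk size, assuming
   each thread gets one contiguous block of ceil(n / num_threads)
   iterations. Note the divisions needed to compute the block. */
int owner_static_block(int i, int n, int num_threads) {
  int block = (n + num_threads - 1) / num_threads;
  return i / block;
}
```

With 8 iterations and 4 threads, the round-robin model assigns iterations 0 to 3 to threads 0 to 3, so neighbouring threads touch neighbouring elements in the same step, which is the access pattern coalescing favours. The block model gives thread 0 iterations 0 and 1, thread 1 iterations 2 and 3, and so on, so neighbouring threads start one block apart.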

Thu, Sep 27, 8:20 AM
gtbercea updated the diff for D52434: [OpenMP] Make default distribute schedule for NVPTX target regions in SPMD mode achieve coalescing.

Fix type of chunk size.

Thu, Sep 27, 7:55 AM

Wed, Sep 26

gtbercea abandoned D52436: [OpenMP][libomptarget] Add runtime functions for default schedule for distribute.

Due to most recent proposed changes to Clang in D52434, changes to the runtime are no longer required.

Wed, Sep 26, 12:18 PM
gtbercea updated the diff for D52434: [OpenMP] Make default distribute schedule for NVPTX target regions in SPMD mode achieve coalescing.

Only change default schedule for distribute directive.

Wed, Sep 26, 12:16 PM

Mon, Sep 24

gtbercea created D52436: [OpenMP][libomptarget] Add runtime functions for default schedule for distribute.
Mon, Sep 24, 2:16 PM
gtbercea created D52434: [OpenMP] Make default distribute schedule for NVPTX target regions in SPMD mode achieve coalescing.
Mon, Sep 24, 1:44 PM

Sep 21 2018

gtbercea accepted D51875: [OPENMP][NVPTX] Add support for lastprivates/reductions handling in SPMD constructs with lightweight runtime..

LGTM

Sep 21 2018, 6:34 AM

Sep 14 2018

gtbercea created D52122: [OpenMP][libomptarget] Set the frame pointer then test empty slot condition.
Sep 14 2018, 1:43 PM

Sep 11 2018

gtbercea accepted D51937: [OPENMP]Increment iterator when the loop is continued..

LG

Sep 11 2018, 10:06 AM
gtbercea added a comment to D51687: [libomptarget-nvptx] Add testing infrastructure.

Considering your comment in the description about requiring latest Clang perhaps you should revisit this patch: D46842

Sep 11 2018, 6:06 AM

Aug 30 2018

gtbercea added a comment to D50845: [CUDA/OpenMP] Define only some host macros during device compilation.

removing InitializePredefinedAuxMacros and the new test completely should do.

Aug 30 2018, 1:19 PM
gtbercea added a comment to D50845: [CUDA/OpenMP] Define only some host macros during device compilation.
In D50845#1219746, @tra wrote:

Also, whatever macros we generate do not prevent headers from using x86 inline assembly. I see quite a bit of inline asm code in preprocessed output. The headers are from libc ~2.19.

Aug 30 2018, 11:49 AM
gtbercea added a comment to D50845: [CUDA/OpenMP] Define only some host macros during device compilation.
In D50845#1219709, @tra wrote:

FYI. This breaks our CUDA compilation. I haven't figured out what exactly is wrong yet. I may need to unroll the patch if the fix is not obvious.

Aug 30 2018, 11:34 AM
gtbercea abandoned D39061: [buildbot] Increase timeout for libomp-clang-ppc64le-linux-debian builder.
Aug 30 2018, 11:19 AM
gtbercea updated the diff for D51446: [OpenMP][bugfix] Add missing macros for Power.
Add test.
Aug 30 2018, 11:10 AM

Aug 29 2018

gtbercea created D51446: [OpenMP][bugfix] Add missing macros for Power.
Aug 29 2018, 11:09 AM

Aug 27 2018

gtbercea updated the diff for D51303: [OpenMP][Fix] Conditional compilation leaves variables unused.
Add implicit cast.
Aug 27 2018, 11:07 AM
gtbercea updated the diff for D51312: [OpenMP][NVPTX] Use appropriate _CALL_ELF macro when offloading.

Add test.

Aug 27 2018, 10:44 AM
gtbercea created D51312: [OpenMP][NVPTX] Use appropriate _CALL_ELF macro when offloading.
Aug 27 2018, 9:08 AM
gtbercea updated the diff for D51301: [OpenMP][Fix] Ensure comparison between unsigned values..
Remove cast.
Aug 27 2018, 7:39 AM
gtbercea created D51303: [OpenMP][Fix] Conditional compilation leaves variables unused.
Aug 27 2018, 7:08 AM
gtbercea created D51301: [OpenMP][Fix] Ensure comparison between unsigned values..
Aug 27 2018, 7:07 AM

Aug 24 2018

gtbercea accepted D51226: [OpenMP][libomptarget] rework of fatal error reporting.

LG unless other reviewers have objections.

Aug 24 2018, 10:38 AM
gtbercea accepted D51222: [OPENMP][NVPTX] Lightweight runtime support for SPMD mode..

LGTM

Aug 24 2018, 10:26 AM

Aug 23 2018

gtbercea added inline comments to D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls.
Aug 23 2018, 8:16 AM

Aug 16 2018

gtbercea added a comment to D50845: [CUDA/OpenMP] Define only some host macros during device compilation.

As a result, we should really have a separate header that has those actually-available functions. When targeting NVPTX, why don't we have the included math.h be CUDA's math.h? In the end, those are the functions we need to call when we generate code. Right?

That's what D47849 deals with.

Yes, but it doesn't get CUDA's math.h. Maybe I misunderstand how this works (and I very well might, because it's not clear that CUDA has a math.h by that name), but that patch tries to avoid problems with the host's math.h and then also injects __clang_cuda_device_functions.h into the device compilation. How does this compare to when you include math.h in Clang's CUDA mode? It seems to me that we want to somehow map standard includes, where applicable, to include files in CUDA's include/crt directory (e.g., crt/math_functions.h and crt/common_functions.h for stdio.h for printf), and nothing else ends up being available (because it is, in fact, not available).

Aug 16 2018, 1:12 PM
gtbercea added a comment to D50845: [CUDA/OpenMP] Define only some host macros during device compilation.

If I understand it correctly, the root cause of this exercise is that we want to compile for GPU using plain C. CUDA avoids this issue by separating device and host code via target attributes and clang has few special cases to ignore inline assembly errors in the host code if we're compiling for device. For OpenMP there's no such separation, not in the system headers, at least.

Yes, that's one of the nice properties of CUDA (for the compiler). There used to be the same restriction for OpenMP where all functions used in target regions needed to be put in declare target. However that was relaxed in favor of implicitly marking all called functions in that TU to be declare target.
So ideally I think Clang should determine which functions are really declare target (either explicit or implicit) and only run semantic analysis on them. If a function is then found to be "broken" it's perfectly desirable to error back to the user.

It is not possible for OpenMP because we support implicit declare target functions. Clang cannot identify whether the function is going to be used on the device or not during sema analysis.

Aug 16 2018, 12:35 PM

Aug 14 2018

gtbercea updated the diff for D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls.

Add __NO_MATH_INLINES macro for the NVPTX toolchain to prevent any host assembly from seeping onto the device.

Aug 14 2018, 8:36 AM
gtbercea added a comment to D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls.

Just to address any generality concerns:

Aug 14 2018, 8:03 AM
gtbercea added a comment to D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls.

Thanks @Hahnfeld for your suggestions.

Aug 14 2018, 7:18 AM

Aug 10 2018

gtbercea added a comment to D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls.
I don't want to use a fast `pow(a, 2)`, I don't want to call a library function for that at all.
Aug 10 2018, 6:56 AM
gtbercea added a comment to D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls.

The downside of this approach is that LLVM doesn't recognize these function calls and doesn't perform optimizations to fold libcalls. For example pow(a, 2) is transformed into a multiplication but __nv_pow(a, 2) is not.

Aug 10 2018, 6:08 AM

Aug 8 2018

gtbercea added a comment to D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls.

@Hahnfeld do you get the same error if you compile with clang++ instead of clang?

Aug 8 2018, 9:00 AM
gtbercea added a comment to D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls.

I do not get that error.

In the beginning you said that you were facing the same error. Did that go away in the meantime?
Are you testing on x86 or Power? With optimizations enabled?

Aug 8 2018, 8:21 AM
gtbercea added a comment to D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls.

IIRC you started to work on this to fix the problem with inline assembly (see https://reviews.llvm.org/D47849#1125019). AFAICS this patch fixes declarations of math functions but you still cannot include math.h which most "correct" codes do.

I'm not sure what you mean by this. This patch enables me to include math.h.

math.c:

#include <math.h>

executed commands:

 $ clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -c math.c -O2
In file included from math.c:1:
In file included from /usr/include/math.h:413:
/usr/include/bits/mathinline.h:131:43: error: invalid input constraint 'x' in asm
  __asm ("pmovmskb %1, %0" : "=r" (__m) : "x" (__x));
                                          ^
/usr/include/bits/mathinline.h:143:43: error: invalid input constraint 'x' in asm
  __asm ("pmovmskb %1, %0" : "=r" (__m) : "x" (__x));
                                          ^
2 errors generated.
Aug 8 2018, 8:07 AM
gtbercea added a comment to D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls.

IIRC you started to work on this to fix the problem with inline assembly (see https://reviews.llvm.org/D47849#1125019). AFAICS this patch fixes declarations of math functions but you still cannot include math.h which most "correct" codes do.

I'm not sure what you mean by this. This patch enables me to include math.h.

math.c:

#include <math.h>

executed commands:

 $ clang -fopenmp -fopenmp-targets=nvptx64-nvidia-cuda -c math.c -O2
In file included from math.c:1:
In file included from /usr/include/math.h:413:
/usr/include/bits/mathinline.h:131:43: error: invalid input constraint 'x' in asm
  __asm ("pmovmskb %1, %0" : "=r" (__m) : "x" (__x));
                                          ^
/usr/include/bits/mathinline.h:143:43: error: invalid input constraint 'x' in asm
  __asm ("pmovmskb %1, %0" : "=r" (__m) : "x" (__x));
                                          ^
2 errors generated.
Aug 8 2018, 7:48 AM
gtbercea added a comment to D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls.

IIRC you started to work on this to fix the problem with inline assembly (see https://reviews.llvm.org/D47849#1125019). AFAICS this patch fixes declarations of math functions but you still cannot include math.h which most "correct" codes do.

Aug 8 2018, 7:19 AM
gtbercea added a comment to D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls.

This patch is concerned with calling device functions when you're on the device. The correctness issues you mention are orthogonal to this and should be handled by another patch. I don't think this patch should be held up any longer.

I'm confused by now, could you please highlight the point that I'm missing?

Aug 8 2018, 7:16 AM
gtbercea added a comment to D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls.

Ok, so you are already talking about performance. I think we should fix correctness first, in particular the compiler shouldn't complain whenever <math.h> is included.

Aug 8 2018, 5:44 AM

Aug 7 2018

gtbercea updated the diff for D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls.

Prevent math builtins from being used for nvptx toolchain.

Aug 7 2018, 12:40 PM
gtbercea added inline comments to D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain.
Aug 7 2018, 10:08 AM
gtbercea updated the diff for D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain.
  • Address comments.
Aug 7 2018, 10:08 AM
gtbercea added a comment to D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls.

Do we still need this? I think what we really need to solve is the problem of (host) inline assembly in the header files...

Aug 7 2018, 9:19 AM

Aug 6 2018

gtbercea updated the diff for D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls.
Fix function call.
Aug 6 2018, 10:36 AM

Aug 2 2018

gtbercea abandoned D50158: [OpenMP] Add placeholder functions for the depend and nowait depend clauses for target data directives..

No longer needed for trunk.

Aug 2 2018, 12:15 PM
gtbercea added a reviewer for D50188: [OpenMP][libomptarget] Simplify warp master selection for data sharing: grokos.
Aug 2 2018, 11:53 AM
gtbercea added a comment to D50158: [OpenMP] Add placeholder functions for the depend and nowait depend clauses for target data directives..

In D50158#1185865, @Hahnfeld wrote:
Is there any benefit of having more functions? The interface of libomp is already there...

Aug 2 2018, 10:16 AM
gtbercea added a comment to D50158: [OpenMP] Add placeholder functions for the depend and nowait depend clauses for target data directives..

These calls are here because the interface of the libomp library needs to include these functions. A patch for Clang is in the works which calls these functions so they need to have some basic, correct implementation that works when used with libomp.
The implementation can/should be improved in the future. In our proprietary OpenMP library implementation we already do something more elaborate which is why we need the placeholders here.

Aug 2 2018, 10:14 AM
gtbercea added a reviewer for D50158: [OpenMP] Add placeholder functions for the depend and nowait depend clauses for target data directives.: grokos.
Aug 2 2018, 8:36 AM
gtbercea retitled D50188: [OpenMP][libomptarget] Simplify warp master selection for data sharing from [OpenMP] Simplify warp master selection for data sharing to [OpenMP][libomptarget] Simplify warp master selection for data sharing.
Aug 2 2018, 7:58 AM
gtbercea created D50188: [OpenMP][libomptarget] Simplify warp master selection for data sharing.
Aug 2 2018, 7:57 AM

Aug 1 2018

gtbercea created D50158: [OpenMP] Add placeholder functions for the depend and nowait depend clauses for target data directives..
Aug 1 2018, 1:39 PM

Jul 31 2018

gtbercea added inline comments to D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain.
Jul 31 2018, 2:04 PM
gtbercea added a comment to D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain.

Answers to comments.

Jul 31 2018, 6:19 AM

Jul 30 2018

gtbercea updated the summary of D50001: [OpenMP] Fix new task creation.
Jul 30 2018, 12:51 PM
gtbercea added a reviewer for D50001: [OpenMP] Fix new task creation: jlpeyton.
Jul 30 2018, 12:36 PM
gtbercea updated the summary of D50001: [OpenMP] Fix new task creation.
Jul 30 2018, 12:27 PM
gtbercea updated the summary of D50001: [OpenMP] Fix new task creation.
Jul 30 2018, 12:26 PM
gtbercea created D50001: [OpenMP] Fix new task creation.
Jul 30 2018, 12:21 PM

Jul 20 2018

gtbercea accepted D49564: [OPNEMP, NVPTX] Fixed sychronization construct + code cleanup..

LGTM

Jul 20 2018, 1:37 PM

Jul 13 2018

gtbercea updated the diff for D49188: [OpenMP] Initialize data sharing stack for SPMD case.

Fix tests.

Jul 13 2018, 8:12 AM
gtbercea updated the diff for D49204: [OpenMP][libomptarget] Fix data sharing and globalization infrastructure to work in SPMD mode.

SafeFree.

Jul 13 2018, 7:40 AM

Jul 12 2018

gtbercea updated the diff for D49204: [OpenMP][libomptarget] Fix data sharing and globalization infrastructure to work in SPMD mode.

Clean-up.

Jul 12 2018, 1:17 PM
gtbercea updated the diff for D49204: [OpenMP][libomptarget] Fix data sharing and globalization infrastructure to work in SPMD mode.

Address comments and fix formatting.

Jul 12 2018, 12:31 PM
gtbercea added inline comments to D49204: [OpenMP][libomptarget] Fix data sharing and globalization infrastructure to work in SPMD mode.
Jul 12 2018, 10:02 AM
gtbercea accepted D49241: [OPENMP, NVPTX] Fix loop boundaries calculation for dynamic loops..

LG

Jul 12 2018, 8:22 AM

Jul 11 2018

gtbercea added a dependency for D49188: [OpenMP] Initialize data sharing stack for SPMD case: D49204: [OpenMP][libomptarget] Fix data sharing and globalization infrastructure to work in SPMD mode.
Jul 11 2018, 1:52 PM
gtbercea added a dependent revision for D49204: [OpenMP][libomptarget] Fix data sharing and globalization infrastructure to work in SPMD mode: D49188: [OpenMP] Initialize data sharing stack for SPMD case.
Jul 11 2018, 1:52 PM
gtbercea updated the diff for D49204: [OpenMP][libomptarget] Fix data sharing and globalization infrastructure to work in SPMD mode.

Reset StackP correctly.

Jul 11 2018, 1:31 PM
gtbercea retitled D49204: [OpenMP][libomptarget] Fix data sharing and globalization infrastructure to work in SPMD mode from [OpenMP] Fix data sharing and globalization infrastructure to work in SPMD mode to [OpenMP][libomptarget] Fix data sharing and globalization infrastructure to work in SPMD mode.
Jul 11 2018, 1:27 PM
gtbercea edited reviewers for D49204: [OpenMP][libomptarget] Fix data sharing and globalization infrastructure to work in SPMD mode, added: caomhin; removed: KevinBuist.
Jul 11 2018, 1:25 PM
gtbercea created D49204: [OpenMP][libomptarget] Fix data sharing and globalization infrastructure to work in SPMD mode.
Jul 11 2018, 1:25 PM
gtbercea added a comment to D46842: [OpenMP][libomptarget] Make bitcode library building depend on clang and llvm-linker being available .

Is this good to go?

Jul 11 2018, 11:52 AM
gtbercea added inline comments to D49188: [OpenMP] Initialize data sharing stack for SPMD case.
Jul 11 2018, 11:44 AM
gtbercea updated the diff for D49188: [OpenMP] Initialize data sharing stack for SPMD case.

Add test for spmd stack init function.

Jul 11 2018, 11:26 AM
gtbercea updated the diff for D49188: [OpenMP] Initialize data sharing stack for SPMD case.

Fix test.

Jul 11 2018, 8:06 AM
gtbercea retitled D49188: [OpenMP] Initialize data sharing stack for SPMD case from [OpenMP] Initialize data sharing for SPMD case to [OpenMP] Initialize data sharing stack for SPMD case.
Jul 11 2018, 7:55 AM
gtbercea created D49188: [OpenMP] Initialize data sharing stack for SPMD case.
Jul 11 2018, 7:54 AM

Jun 26 2018

gtbercea added a reviewer for D47394: [OpenMP][Clang][NVPTX] Replace bundling with partial linking for the OpenMP NVPTX device offloading toolchain: kkwli0.
Jun 26 2018, 11:41 AM