Page MenuHomePhabricator

grokos (George Rokos)
User

Projects

User does not belong to any projects.

User Details

User Since
Feb 16 2016, 1:22 PM (240 w, 4 d)

Recent Activity

Fri, Sep 25

grokos accepted D88149: [OpenMP][libomptarget] make omp_get_initial_device 5.1 compliant.

LGTM as well.

Fri, Sep 25, 2:53 PM · Restricted Project

Wed, Sep 23

grokos added inline comments to D88149: [OpenMP][libomptarget] make omp_get_initial_device 5.1 compliant.
Wed, Sep 23, 6:19 AM · Restricted Project
grokos added a reviewer for D87946: [OpenMP] Add Location Fields to Libomptarget Runtime for Debugging: grokos.
Wed, Sep 23, 6:13 AM · Restricted Project, Restricted Project, Restricted Project
grokos added a comment to D87946: [OpenMP] Add Location Fields to Libomptarget Runtime for Debugging.

Seems like a hacky solution to just keep adding suffixed whenever we want a new interface though.

Wed, Sep 23, 6:13 AM · Restricted Project, Restricted Project, Restricted Project

Mon, Sep 21

grokos added a comment to D87946: [OpenMP] Add Location Fields to Libomptarget Runtime for Debugging.

I wasn't aware they were explicitly deprecated. If we're keeping around old interfaces for backwards compatibility I should also add in the old mapper functions without the ident_t pointer and call into the new functions with a nullptr.

Mon, Sep 21, 4:25 PM · Restricted Project, Restricted Project, Restricted Project
grokos added a comment to D87946: [OpenMP] Add Location Fields to Libomptarget Runtime for Debugging.

Added ident_t structs to additional runtime functions.

Mon, Sep 21, 3:56 PM · Restricted Project, Restricted Project, Restricted Project

Fri, Sep 18

grokos added a comment to D87841: [openmp][libomptarget] Include header from LLVM source tree.

As openmp and llvm are now developed in the same monorepo, the only uses this
dependency can break are those that download the monorepo, then delete the llvm
directory before calling cmake. That is not a compelling use case.

That is factually not accurate.
There are standalone tarballs, including one for omp:
https://releases.llvm.org/download.html#10.0.1 OpenMP Source code (.sig)

Right. You can download the source for the sub-projects.
Though, others, e.g., clang, are already not compileable in isolation.

I see three options here:

  1. Continue to copy code from llvm-core to openmp and back. This is "good" for people that compile libomp in isolation and "bad" for developers. Arguably, it is also "bad" for people that use libomp with clang because there is an increased risk we have an unnoticed mismatch between the two. This is not theoretical but already happened.
  2. Copy the code from llvm-core to openmp during packaging time. This is a (small?) burden on the package maintainers but has all the advantages for now. However, it will get more complicated the more llvm-core features we want to use, e.g., tablegen.
  3. Require users to download llvm-core and point to it. This is a small burden on the users but has all the advantages now and in the future.

This patch, on its own, picks 3). I'm happy with that.

Fri, Sep 18, 4:23 AM · Restricted Project, Restricted Project

Tue, Sep 1

grokos added inline comments to D86804: [OpenMP] Consolidate error handling and debug messages in Libomptarget.
Tue, Sep 1, 8:26 AM · Restricted Project
grokos added a comment to D86307: [OpenMP] Pack first-private arguments to improve efficiency of data transfer.

Minor comments (about typos) to be taken into account in case of a future patch.

Tue, Sep 1, 7:40 AM · Restricted Project

Aug 28 2020

grokos accepted D86781: [LIBOMPTARGET]Do not try to optimize bases for the next parameters..

Looks good!

Aug 28 2020, 7:12 AM · Restricted Project

Aug 17 2020

grokos committed rG32ebdc70f3af: [libomptarget][NFC] Sort list of plugins in chronological order (authored by grokos).
[libomptarget][NFC] Sort list of plugins in chronological order
Aug 17 2020, 8:37 AM
grokos closed D86082: [libomptarget][NFC] Sort list of plugins in chronological order.
Aug 17 2020, 8:36 AM · Restricted Project
grokos requested review of D86082: [libomptarget][NFC] Sort list of plugins in chronological order.
Aug 17 2020, 8:29 AM · Restricted Project

Aug 6 2020

grokos accepted D85369: [OpenMP] Fix ref count dec for implicit map of partial data.

Nice catch! Looks good.

Aug 6 2020, 12:24 AM · Restricted Project

Aug 5 2020

grokos committed rG40470eb27a5c: [libomptarget][NFC] Replace `%ld` with PRId64 for data of type int64_t. (authored by grokos).
[libomptarget][NFC] Replace `%ld` with PRId64 for data of type int64_t.
Aug 5 2020, 1:32 PM
grokos closed D85353: [libomptarget][NFC] Replace `%ld` with PRId64 for data of type int64_t..
Aug 5 2020, 1:31 PM · Restricted Project
grokos accepted D85320: [OpenMP] Fix `present` diagnostic for array extension.

LGTM. Thanks!

Aug 5 2020, 1:19 PM · Restricted Project
grokos requested review of D85353: [libomptarget][NFC] Replace `%ld` with PRId64 for data of type int64_t..
Aug 5 2020, 1:17 PM · Restricted Project
grokos accepted D85342: [OpenMP] Fix `target data` exit for array extension.

Looks good.

Aug 5 2020, 12:51 PM · Restricted Project
grokos added inline comments to D85320: [OpenMP] Fix `present` diagnostic for array extension.
Aug 5 2020, 12:10 PM · Restricted Project
grokos added inline comments to D85320: [OpenMP] Fix `present` diagnostic for array extension.
Aug 5 2020, 11:19 AM · Restricted Project

Aug 4 2020

grokos accepted D85216: [LIBOMPTARGET]Fix order of mapper data for targetDataEnd function..

OK, now it makes sense. LGTM

Aug 4 2020, 3:42 PM · Restricted Project
grokos accepted D85246: [OpenMP] Fix `omp target update` for array extension.

I suppose the same must be applied to all cases where present is used. Without a modification like this, present just confirms that the starting address of the object is already mapped, although a size may be specified in the map clause.

#pragma omp target map(present: p[0:10]) // present will succeed even if only p[0] is mapped and p[1]-p[9] are not

So the question here is: should present apply to the size as well as the begin address? I would say yes.

Aug 4 2020, 3:06 PM · Restricted Project
grokos added inline comments to D82245: [libomptarget] Add support for target update non-contiguous.
Aug 4 2020, 12:19 PM · Restricted Project, Restricted Project
grokos added a comment to D85216: [LIBOMPTARGET]Fix order of mapper data for targetDataEnd function..

So does the mapper function emit entries in reverse order upon exiting a target/target data region?

Aug 4 2020, 10:35 AM · Restricted Project

Aug 3 2020

grokos added a comment to D68100: [OpenMP 5.0] declare mapper runtime implementation.

This is an optimization brought up by Deepak. I guess you were in that meeting too but forgot. It could be quite useful when you map an array of struct/class. Assume you map 1000 of this structure, with this optimization most memory allocation can be done in a single allocation, instead of allocation 12 bytes memory 1000 times.

Thinking about it, it's actually important for correctness too. Assume you map C a[2]. If you map separately, a[0] and a[1] could be mapped to not contiguous locations, and it will cause error/segfault when GPU kernel access this array. If you allocate the whole array a[2] together, such problem won't happen.

Aug 3 2020, 6:15 PM · Restricted Project

Jul 29 2020

grokos added inline comments to D84767: [OPENMP]Fix PR46824: Global declare target pointer cannot be accessed in target region..
Jul 29 2020, 11:27 AM · Restricted Project, Restricted Project

Jul 27 2020

grokos added a comment to D83963: [OpenMP] Use `abort` not `error` for fatal runtime exceptions.

Thanks, it makes sense. Can you submit a patch?

Jul 27 2020, 7:54 PM · Restricted Project
grokos added a comment to D83963: [OpenMP] Use `abort` not `error` for fatal runtime exceptions.

Any idea why this change broke some tests? And how does --crash fix the problem?

Jul 27 2020, 6:42 PM · Restricted Project
grokos added a comment to D84422: [OpenMP] Fix `present` for exit from `omp target data`.

This looks much better now. I don't have any other comments. Since this patch is now essentially a clang-only patch, I'll let @ABataev accept it or post comments.

Jul 27 2020, 6:33 PM · Restricted Project, Restricted Project, Restricted Project
grokos accepted D84712: [OpenMP] Implement TR8 `present` motion modifier in runtime (2/2).
Jul 27 2020, 6:29 PM · Restricted Project
grokos added a comment to D84712: [OpenMP] Implement TR8 `present` motion modifier in runtime (2/2).

(Earlier I posted here an answer for another patch). Looks good, obviously it must be committed after the clang patch.

Jul 27 2020, 6:28 PM · Restricted Project
grokos added a comment to D84422: [OpenMP] Fix `present` for exit from `omp target data`.

I've added a comment to the runtime code that performs the check. As you can see, the check is performed regardless. It's just a question of whether the runtime treats it as an error. I don't think performance is an issue.

My concern here is that it will be hard to justify changes to the runtime if I cannot formulate a use case.

Jul 27 2020, 12:28 PM · Restricted Project, Restricted Project, Restricted Project

Jul 24 2020

grokos added a comment to D68100: [OpenMP 5.0] declare mapper runtime implementation.

The first combined entry comes from mapping the whole structure. I think because of the alignment, the structure is actually 16 bytes. The 2nd combined entry is the real entry emitted to map the structure. Why it looks like there are 2 of them, because at the beginning of a mapper function, it maps the whole structure no matter what, which generate the 1st combined entry you saw here. Then we generate detailed mapping entry, which generates the 2nd combined entry you saw here. They are not necessarily the same. It happens to be similar in this example.

Jul 24 2020, 5:36 PM · Restricted Project
grokos added a comment to D84182: [OPENMP]Fix PR46012: declare target pointer cannot be accessed in target region..

After this patch was committed, I tried to run the following example:

#include <stdio.h>
Jul 24 2020, 4:28 PM · Restricted Project
grokos added a comment to D68100: [OpenMP 5.0] declare mapper runtime implementation.

Thanks George for looking into this, and sorry for the late response.

I believe this is not a bug, it's a design choice we made early. The design choice is we map the whole structure at the beginning for one piece so we don't map the individual parts of them separately, which may cause a lot of memcpy.

For the RefCount, when the runtime check the 2nd component in your example, it will find it's already mapped and will not increase the RefCount, so I think it's not a bug and the behavior is correct.

Jul 24 2020, 4:10 PM · Restricted Project
grokos added a comment to D84422: [OpenMP] Fix `present` for exit from `omp target data`.

So let's proceed with the patch.

Jul 24 2020, 11:49 AM · Restricted Project, Restricted Project, Restricted Project

Jul 23 2020

grokos added a comment to D68100: [OpenMP 5.0] declare mapper runtime implementation.

@lildmh: I think I've found a bug. I used declare_mapper_target.cpp. When we call the mapper function, it generates 3 components. The first 2 are identical and correspond to the parent struct. This is what MapperComponents looks like inside function target_data_mapper:

(gdb) print MapperComponents
$1 = {Components = std::vector of length 3, capacity 4 = {{Base = 0x7fffffffb598, Begin = 0x7fffffffb598, Size = 8, Type = 32}, {Base = 0x7fffffffb598, Begin = 0x7fffffffb598, Size = 8, Type = 32}, {Base = 0x7fffffffb598,
      Begin = 0x62efd0, Size = 4096, Type = 562949953421331}}}

Mapping the parent struct twice is problematic. If we have more struct members and some of them are NOT pointers, then upon target_data_end libomptarget will check the parent struct's reference counter to determine whether the scalar member must be copied back to the host. If the reference counter is greater than 1, then the runtime will skip copying back the scalar. Mapping the parent struct two times in a row results in RefCount=2.

Jul 23 2020, 7:07 PM · Restricted Project
grokos added a comment to D84470: [OpenMP 5.0] Fix user-defined mapper privatization in tasks.

I tried the patch and indeed it fixes the problem with target_depend_nowait.cpp. I'll let someone else do the review though.

Jul 23 2020, 4:33 PM · Restricted Project
grokos added a comment to D84422: [OpenMP] Fix `present` for exit from `omp target data`.

So is the test case that motivated this patch illegal OpenMP code?

#pragma omp target enter data map(alloc:i)
#pragma omp target data map(present, alloc: i)
{
  #pragma omp target exit data map(delete:i) // you cannot delete that object in the scope, illegal code?
} // fails presence check here
Jul 23 2020, 1:49 PM · Restricted Project, Restricted Project, Restricted Project
grokos added a comment to D84422: [OpenMP] Fix `present` for exit from `omp target data`.

What confuses me about this interpretation of the standard is the inconsistency at data exit. So if we have an explicit omp target exit data map(present...) then we should respect the "present" semantics, whereas when we have a scoped data exit:

#pragma omp target data map(present,...)
{
  ...
} // implicit "exit data" here

then "present" should be ignored.

Jul 23 2020, 11:37 AM · Restricted Project, Restricted Project, Restricted Project

Jul 22 2020

grokos added a comment to D68100: [OpenMP 5.0] declare mapper runtime implementation.

@lildmh I've got a question unrelated to the problem we are discussing here. I ran declare_mapper_target.cpp and when libomptarget calls the mapper function it prints the following:

Libomptarget --> __tgt_push_mapper_component(Handle=0x00007ffcebd0fb48) adds an entry (Base=0x00007ffcebd101e0, Begin=0x00007ffcebd101e0, Size=8, Type=0x20).
Libomptarget --> __tgt_push_mapper_component(Handle=0x00007ffcebd0fb48) adds an entry (Base=0x00007ffcebd101e0, Begin=0x000000000231bfe0, Size=4096, Type=0x2000000000013)

Why is the second entry's MEMBER_OF field set to 2? It should be MEMBER_OF 1, since the pointer-pointee pair c.a[0:N] is part of struct c which is the first entry on the list.

Jul 22 2020, 6:13 PM · Restricted Project
grokos added a comment to D68100: [OpenMP 5.0] declare mapper runtime implementation.

OK, I suspect there is a race condition involving the CUDA plugin. If I compile the test on x86_64-pc-linux-gnu then I always get the correct result, no matter whether we print debug output or not.

Jul 22 2020, 1:56 PM · Restricted Project
grokos added a comment to D68100: [OpenMP 5.0] declare mapper runtime implementation.

My system was messed up and used libraries and compilers from different builds. Please ignore my previous message.

Jul 22 2020, 1:21 PM · Restricted Project
grokos added a comment to D68100: [OpenMP 5.0] declare mapper runtime implementation.

I tried to run declare_mapper_target.cpp on a Nvidia GPU. The problem occurs while loading the device image:

Target CUDA RTL --> Error returned from cuModuleLoadDataEx
Target CUDA RTL --> CUDA error is: device kernel image is invalid

Sounds like a clang problem, I don't see why libomptarget could be the culprit here.

Jul 22 2020, 11:47 AM · Restricted Project
grokos added a comment to D83062: [OpenMP] Implement TR8 `present` map type modifier in runtime (2/2).

Sure, go ahead!

Jul 22 2020, 10:37 AM · Restricted Project

Jul 21 2020

grokos accepted D83062: [OpenMP] Implement TR8 `present` map type modifier in runtime (2/2).

OK, let's proceed with this version now and later on we can fine-tune the behavior.

Jul 21 2020, 3:08 PM · Restricted Project

Jul 20 2020

grokos added a comment to D64571: [OPENMP]Fix threadid in __kmpc_omp_taskwait call for dependent target calls..

I'm having a look. The problem actually occurs after D67833 (the clang patch), not D68100. If you see at the backtrace in gdb, control never reaches a libomptarget entry point, things fail once we enter the outlined function for the parallel region via __kmp_invoke_microtask. I doubt it's a libomptarget issue.

Jul 20 2020, 9:16 PM · Restricted Project, Restricted Project

Jul 17 2020

grokos added inline comments to D83062: [OpenMP] Implement TR8 `present` map type modifier in runtime (2/2).
Jul 17 2020, 6:58 PM · Restricted Project
grokos committed rG04713f8aa614: Added missing API call to OpenMP test (authored by grokos).
Added missing API call to OpenMP test
Jul 17 2020, 10:40 AM

Jul 16 2020

grokos added inline comments to D83062: [OpenMP] Implement TR8 `present` map type modifier in runtime (2/2).
Jul 16 2020, 4:54 PM · Restricted Project
grokos added inline comments to D83062: [OpenMP] Implement TR8 `present` map type modifier in runtime (2/2).
Jul 16 2020, 3:17 PM · Restricted Project
grokos accepted D83963: [OpenMP] Use `abort` not `error` for fatal runtime exceptions.
Jul 16 2020, 11:24 AM · Restricted Project
grokos added a comment to D83963: [OpenMP] Use `abort` not `error` for fatal runtime exceptions.

abort() does not flush stdout, but here we are printing to stderr which is not buffered so there is no need to flush.

Jul 16 2020, 11:24 AM · Restricted Project
grokos committed rGfc47c0e0a6a2: [clang] Fix compilation warnings in OpenMP declare mapper codegen. (authored by grokos).
[clang] Fix compilation warnings in OpenMP declare mapper codegen.
Jul 16 2020, 11:04 AM
grokos closed D83959: Fix compiling warnings in OpenMP declare mapper codegen.
Jul 16 2020, 11:04 AM · Restricted Project
grokos accepted D83959: Fix compiling warnings in OpenMP declare mapper codegen.
Jul 16 2020, 11:02 AM · Restricted Project
grokos added a comment to D83959: Fix compiling warnings in OpenMP declare mapper codegen.

I can confirm the warnings are gone with this fix.

Jul 16 2020, 10:51 AM · Restricted Project

Jul 15 2020

grokos committed rG911fcf382f10: Fix lit test related to declare mapper patch D67833. (authored by grokos).
Fix lit test related to declare mapper patch D67833.
Jul 15 2020, 8:32 PM
grokos committed rG537b16e9b8da: [OpenMP 5.0] Codegen support to pass user-defined mapper functions to runtime (authored by grokos).
[OpenMP 5.0] Codegen support to pass user-defined mapper functions to runtime
Jul 15 2020, 7:11 PM
grokos added a comment to D67833: [OpenMP 5.0] Codegen support to pass user-defined mapper functions to runtime.

OK, now it works. Thanks!

Jul 15 2020, 7:11 PM · Restricted Project, Restricted Project, Restricted Project
grokos committed rG140ab574a1c8: [OpenMP][Offload] Declare mapper runtime implementation (authored by grokos).
[OpenMP][Offload] Declare mapper runtime implementation
Jul 15 2020, 7:11 PM
grokos closed D67833: [OpenMP 5.0] Codegen support to pass user-defined mapper functions to runtime.
Jul 15 2020, 7:11 PM · Restricted Project, Restricted Project, Restricted Project
grokos closed D68100: [OpenMP 5.0] declare mapper runtime implementation.
Jul 15 2020, 7:11 PM · Restricted Project
grokos added a comment to D67833: [OpenMP 5.0] Codegen support to pass user-defined mapper functions to runtime.

I tried to build clang with this patch and I get errors like:

CGOpenMPRuntime.cpp:9463:38: error: ‘OMPRTL___tgt_target_teams_nowait_mapper’ was not declared in this scope
                                    ? OMPRTL___tgt_target_teams_nowait_mapper
Jul 15 2020, 2:06 PM · Restricted Project, Restricted Project, Restricted Project

Jul 14 2020

grokos added a comment to D68100: [OpenMP 5.0] declare mapper runtime implementation.

No, it's not needed anymore. This patch bypasses the need to do that refactoring. Can you please abandon that revision?

Jul 14 2020, 12:53 PM · Restricted Project
grokos added a comment to D68100: [OpenMP 5.0] declare mapper runtime implementation.

Ping. If the patch lands toady or tomorrow, then we will meet the clang-11 deadline and include support for declare mapper.

Jul 14 2020, 12:38 PM · Restricted Project

Jul 9 2020

grokos accepted D83062: [OpenMP] Implement TR8 `present` map type modifier in runtime (2/2).

The patch looks good now, thanks for all the work!

Jul 9 2020, 12:40 PM · Restricted Project

Jul 8 2020

grokos added inline comments to D83062: [OpenMP] Implement TR8 `present` map type modifier in runtime (2/2).
Jul 8 2020, 11:45 AM · Restricted Project

Jul 7 2020

grokos added inline comments to D83062: [OpenMP] Implement TR8 `present` map type modifier in runtime (2/2).
Jul 7 2020, 5:00 PM · Restricted Project
grokos added inline comments to D83062: [OpenMP] Implement TR8 `present` map type modifier in runtime (2/2).
Jul 7 2020, 4:52 PM · Restricted Project
grokos added inline comments to D83062: [OpenMP] Implement TR8 `present` map type modifier in runtime (2/2).
Jul 7 2020, 1:51 PM · Restricted Project

Jul 6 2020

grokos updated the diff for D68100: [OpenMP 5.0] declare mapper runtime implementation.

I tried to address our previous complaints about code duplication and came up with a scheme which results in a much shorter and cleaner diff with virtually no code duplication. Instead of refactoring code from taget_data_begin/end/update, I introduced a new internal function target_data_mapper which generates new arrays args_base, args, arg_sizes and arg_types for the custom mapper and calls target_data_begin/end/update again using the new arguments.

Jul 6 2020, 12:24 PM · Restricted Project
grokos accepted D83130: [OPENMP]Fix test for ARM, NFC..

Looks good. Do we know why the test doesn't run without -fPIC?

Jul 6 2020, 10:12 AM · Restricted Project

Jul 2 2020

grokos accepted D83057: [OpenMP][NFC] Remove hard-coded line numbers from more tests.

Like D82224, looks good.

Jul 2 2020, 1:31 PM · Restricted Project
grokos added inline comments to D83062: [OpenMP] Implement TR8 `present` map type modifier in runtime (2/2).
Jul 2 2020, 1:30 PM · Restricted Project

Jul 1 2020

grokos added a comment to D82718: [OpenMP] Use primary context in CUDA plugin.

I did some investigation and finally think BLOCKING_SYNC might be a good option here, but I also would like to hear from others.
Basically we have three options here: SPIN, YIELD and the BLOCKING_SYNC.

  • SPIN: In most cases this is not a good option but it might be better for a really tiny kernel.
  • YIELD: Like mentioned in CUDA documentation, it can increase latency, but can increase the performance of CPU threads performing work in parallel with the GPU. But for OpenMP, this level of yield may not help too much because it is not very common to have thread oversubscription in OpenMP.
  • BLOCKING_SYNC: I guess chances are that it works with interruption. This is kind of a balanced option which will not increase too much latency, and also will not waste the resource, especially considering that we might be going to use unshackled threads for all target tasks.
  • AUTO: In fact this option will be used if there is no flag specified. According to CUDA documentation, the behavior depends on the number of CUDA contexts in current process versus the number of logical processors in the system. Either SPIN or YIELD will be used. BLOCKING_SYNC will only be chosen on Tegra devices.
Jul 1 2020, 12:26 PM · Restricted Project

Jun 23 2020

grokos accepted D82264: [OpenMP] Adopt std::set in HostDataToTargetMap.

Looks good now. Thanks!

Jun 23 2020, 6:53 PM · Restricted Project
grokos added inline comments to D82264: [OpenMP] Adopt std::set in HostDataToTargetMap.
Jun 23 2020, 5:49 PM · Restricted Project

Jun 16 2020

grokos added a comment to D80816: [OpenMP] Add unbundling of archives containing bundled object files into device specific archives..

The patch looks good now. I don't have any other comments. We can revisit the driver flag proposal for setting the file extension in the future.

Jun 16 2020, 12:04 PM · Restricted Project, Restricted Project

Jun 11 2020

grokos added inline comments to D80816: [OpenMP] Add unbundling of archives containing bundled object files into device specific archives..
Jun 11 2020, 1:47 PM · Restricted Project, Restricted Project

Jun 3 2020

grokos commandeered D68100: [OpenMP 5.0] declare mapper runtime implementation.
Jun 3 2020, 1:11 PM · Restricted Project

Jun 2 2020

grokos added a comment to D80649: [OpenMP] Improve D2D memcpy to use more efficient driver API.

I'm happy with the changes, I think the patch looks good now.

Jun 2 2020, 1:42 PM · Restricted Project

May 27 2020

grokos added inline comments to D80649: [OpenMP] Improve D2D memcpy to use more efficient driver API.
May 27 2020, 3:50 PM · Restricted Project

May 6 2020

grokos abandoned D74262: [clang-offload-bundler] Enable handling of partially-linked fat objects.

The partial linking scheme has been found to not work correctly in all cases (it fails when we have libraries with device code only). A new patch will be uploaded which will be based on archive extraction.

May 6 2020, 6:12 PM · Restricted Project

Apr 24 2020

grokos accepted D78744: [libomptarget] Initialize reference parameter IsNew within Device::getOrAllocTgtPtr.

LGTM.

Apr 24 2020, 12:59 PM · Restricted Project

Apr 23 2020

grokos added a comment to D78744: [libomptarget] Initialize reference parameter IsNew within Device::getOrAllocTgtPtr.

I think it would look better if we initialized those variables inside Device::getOrAllocTgtPtr - the caller should not assume any default value. That's also what happens in target_data_end with the IsLast variable, it is initialized inside Device::getTgtPtrBegin.

Apr 23 2020, 1:34 PM · Restricted Project

Apr 20 2020

grokos added a comment to D78334: [libomptarget][nfc][nvptx] Less verbose debug build.

I guess it's a reasonable thought to disable such verbose output by default, but I can also see why these diagnostics could be useful l in some cases. Can we introduce a CMake variable to toggle --ptxas-options=-v on and off? (off by default)

Apr 20 2020, 3:45 PM · Restricted Project

Apr 8 2020

grokos added inline comments to D76843: [Openmp] Libomptarget plugin for NEC SX-Aurora.
Apr 8 2020, 2:08 PM · Restricted Project, Restricted Project

Apr 6 2020

grokos added a comment to D70010: [OpenMP][Offloading] Replaced default stream with an actual per-device unblocking stream in NVPTX implementation.

Superseded by D74145. You can abandon this one.

Apr 6 2020, 8:10 PM · Restricted Project, Restricted Project

Mar 26 2020

grokos added inline comments to D76843: [Openmp] Libomptarget plugin for NEC SX-Aurora.
Mar 26 2020, 1:36 PM · Restricted Project, Restricted Project
grokos accepted D76874: [libomptarget] Add missing elf_end call in elf_common.c.

Nice catch! Thanks!

Mar 26 2020, 12:29 PM · Restricted Project

Mar 20 2020

grokos committed rG0a42c9bfe4ea: Enable CUDA offloading on aarch64 host (authored by grokos).
Enable CUDA offloading on aarch64 host
Mar 20 2020, 3:45 PM
grokos closed D76469: Enabled CUDA offloading on aarch64 host.
Mar 20 2020, 3:45 PM · Restricted Project
grokos added a comment to D76469: Enabled CUDA offloading on aarch64 host.

Done!

Mar 20 2020, 3:45 PM · Restricted Project
grokos accepted D76469: Enabled CUDA offloading on aarch64 host.

Building the CUDA plugin on aarch64 hosts was enabled in the ykt branch ages ago, so the configuration has been tested extensively. LGTM.

Mar 20 2020, 12:28 PM · Restricted Project

Mar 10 2020

grokos accepted D75946: [LIBOMPTARGET]Fix PR45139: Bug in mixing Python and OpenMP target offload..

I, too, can confirm that this patch fixes the problem. It seems we need to think more about how libomptarget is initialized to avoid similar problems in the future.

Mar 10 2020, 3:21 PM · Restricted Project

Mar 3 2020

grokos added inline comments to D75581: [libomptarget][amdgcn] Implement get_wtime.
Mar 3 2020, 7:02 PM · Restricted Project
grokos committed rGfca49fe8e34f: [clang-offload-wrapper] Lower priority of __tgt_register_lib in favor of… (authored by grokos).
[clang-offload-wrapper] Lower priority of __tgt_register_lib in favor of…
Mar 3 2020, 12:37 PM
grokos closed D75223: [clang-offload-wrapper] Lower priority of __tgt_register_lib in favor of __tgt_register_requires.
Mar 3 2020, 12:37 PM · Restricted Project, Restricted Project