
Jul 31 2019

Hahnfeld added a comment to D60972: [OpenMP 5.0] libomptarget interface for declare mapper functions.

You need to declare them yourself in the test. That's not really elegant, but many other runtime tests already do.

Thanks for the tip! I can do that. Since I'm pretty new to the runtime, could you suggest a similar test that I should look into?

Jul 31 2019, 12:06 AM · Restricted Project, Restricted Project

Jul 30 2019

Hahnfeld added a comment to D60972: [OpenMP 5.0] libomptarget interface for declare mapper functions.

Since the mapper is not really implemented in this patch, if I add a test, it will be something like below:

__tgt_push_mapper_component(h, base0, begin0, size0, type0);
__tgt_push_mapper_component(h, base1, begin1, size1, type1);
auto total_size = __tgt_mapper_num_components(h);
printf("size=%d", total_size);
// CHECK: size=2

It seems to me this test is not meaningful. I can add a more meaningful test after all mapper patches are upstreamed.
Do you think we need a meaningless test like this now?

Yes, it's good to have a test, even a very elementary one. When full support for declare mapper is upstreamed we can revisit the test and extend it to check real-use scenarios.

I just realized that these functions (__tgt_push_mapper_component and __tgt_mapper_num_components) are not exposed to users (i.e., defined in omp.h), so the test proposed above is not possible.

I cannot think of a test for this patch right now, so I think we can leave it without one. If you have an idea for a test, please let me know.

Jul 30 2019, 1:32 PM · Restricted Project, Restricted Project
Hahnfeld committed rG52b87ac32f57: [OpenMP] Rename last file to cpp and remove LIBOMP_CFLAGS (authored by Hahnfeld).
[OpenMP] Rename last file to cpp and remove LIBOMP_CFLAGS
Jul 30 2019, 11:39 AM
Hahnfeld committed rL367343: [OpenMP] Rename last file to cpp and remove LIBOMP_CFLAGS.
[OpenMP] Rename last file to cpp and remove LIBOMP_CFLAGS
Jul 30 2019, 11:36 AM
Hahnfeld closed D65285: [OpenMP] Rename last file to cpp and remove LIBOMP_CFLAGS.
Jul 30 2019, 11:36 AM · Restricted Project, Restricted Project
Hahnfeld added inline comments to D65001: [OpenMP][libomptarget] Add support for unified memory for regular maps.
Jul 30 2019, 9:54 AM · Restricted Project, Restricted Project
Hahnfeld added inline comments to D65001: [OpenMP][libomptarget] Add support for unified memory for regular maps.
Jul 30 2019, 12:01 AM · Restricted Project, Restricted Project

Jul 29 2019

Hahnfeld added inline comments to D65001: [OpenMP][libomptarget] Add support for unified memory for regular maps.
Jul 29 2019, 12:23 PM · Restricted Project, Restricted Project
Hahnfeld added inline comments to D65001: [OpenMP][libomptarget] Add support for unified memory for regular maps.
Jul 29 2019, 11:54 AM · Restricted Project, Restricted Project
Hahnfeld added inline comments to D65001: [OpenMP][libomptarget] Add support for unified memory for regular maps.
Jul 29 2019, 11:23 AM · Restricted Project, Restricted Project
Hahnfeld added inline comments to D65001: [OpenMP][libomptarget] Add support for unified memory for regular maps.
Jul 29 2019, 8:48 AM · Restricted Project, Restricted Project
Hahnfeld added inline comments to D65001: [OpenMP][libomptarget] Add support for unified memory for regular maps.
Jul 29 2019, 8:26 AM · Restricted Project, Restricted Project
Hahnfeld requested changes to D65001: [OpenMP][libomptarget] Add support for unified memory for regular maps.

Oh, and there's still no test for use_device_ptr

Jul 29 2019, 7:36 AM · Restricted Project, Restricted Project
Hahnfeld added inline comments to D65001: [OpenMP][libomptarget] Add support for unified memory for regular maps.
Jul 29 2019, 7:36 AM · Restricted Project, Restricted Project
Hahnfeld added a comment to D65340: [OpenMP][libomptarget] Add support for close map modifier.

I think this mostly looks good, but depends on D65001, right?

Jul 29 2019, 7:36 AM · Restricted Project, Restricted Project
Hahnfeld requested changes to D65341: [OpenMP] Add support for close map modifier in Clang.

There's already D55892 with a better set of tests, including target enter data / target exit data.

Jul 29 2019, 7:36 AM · Restricted Project, Restricted Project
Hahnfeld added a comment to D65285: [OpenMP] Rename last file to cpp and remove LIBOMP_CFLAGS.

I will contact the ittnotify owners just in case, so that we have less of a burden with possible future updates of this third-party code.

Jul 29 2019, 7:36 AM · Restricted Project, Restricted Project

Jul 25 2019

Hahnfeld committed rGbaeab1fc442e: [OpenMP] Fix build of stubs library, NFC. (authored by Hahnfeld).
[OpenMP] Fix build of stubs library, NFC.
Jul 25 2019, 10:53 AM
Hahnfeld committed rL367041: [OpenMP] Fix build of stubs library, NFC..
[OpenMP] Fix build of stubs library, NFC.
Jul 25 2019, 10:53 AM
Hahnfeld closed D65284: [OpenMP] Fix build of stubs library.
Jul 25 2019, 10:52 AM · Restricted Project, Restricted Project
Hahnfeld added a comment to D65284: [OpenMP] Fix build of stubs library.

LGTM (I'd treat this as NFC :).

Jul 25 2019, 10:51 AM · Restricted Project, Restricted Project
Hahnfeld created D65285: [OpenMP] Rename last file to cpp and remove LIBOMP_CFLAGS.
Jul 25 2019, 8:55 AM · Restricted Project, Restricted Project
Hahnfeld created D65284: [OpenMP] Fix build of stubs library.
Jul 25 2019, 8:55 AM · Restricted Project, Restricted Project
Hahnfeld committed rG2488ae9df155: [OpenMP] RISCV64 port (authored by Hahnfeld).
[OpenMP] RISCV64 port
Jul 25 2019, 7:38 AM
Hahnfeld committed rL367021: [OpenMP] RISCV64 port.
[OpenMP] RISCV64 port
Jul 25 2019, 7:38 AM
Hahnfeld closed D59880: [OpenMP] RISCV64 port.
Jul 25 2019, 7:38 AM · Restricted Project
Hahnfeld added a comment to D59880: [OpenMP] RISCV64 port.

I've tested this patch with current upstream LLVM and Clang on a HiFive Unleashed board and all tests are passing, including OMPT. I don't have commit access yet, @Hahnfeld would you mind committing this for me? Thank you!

Jul 25 2019, 7:38 AM · Restricted Project
Hahnfeld added a comment to D65001: [OpenMP][libomptarget] Add support for unified memory for regular maps.

@Hahnfeld I added several tests. Because these tests require unified memory to be supported by the underlying system I have added them as a new type of test: check-libomptarget-nvptx-unified (in addition to the check-libomptarget-nvptx one). These tests should only be run on platforms which support unified memory.

Jul 25 2019, 6:32 AM · Restricted Project, Restricted Project

Jul 23 2019

Hahnfeld added a comment to D65112: [OPENMP][NVPTX]Make the test compatible with CUDA9+, NFC..

But simple removal does not help, actually. It still might produce incorrect results. When you removed the barrier, you introduced implicit thread divergence. Since CUDA 9, threads are no longer executed in lock-step (see https://devblogs.nvidia.com/using-cuda-warp-level-primitives/), so the result you get is not stable and is not guaranteed to be reproducible on other platforms by other users.
The runtime relies on the warp-synchronous model, and threads in the warp must be synchronized after the critical region. This means that we should not use barriers here but still need to synchronize threads within the warp. To do so, we must use the __syncwarp(mask) function instead.
Currently, it is pure coincidence that the test passes. It happens only because the second parallel region takes a long time to start and execute its body, so the threads in the else branch have time to execute their code. But this is not guaranteed in CUDA 9+.
To reveal this problem, just enclose the code in the else branch (Count += omp_get_level(); // 6 * 1 = 6) in another #pragma omp critical.

#pragma omp critical
Count += omp_get_level(); // 6 * 1 = 6

It must produce the same result as before but it won't, most probably.

I still get the correct results. Do you have a test that you know to fail?

I get Expected count = 67 with the critical section in this test. It is on Power9 with Cuda9. Did you try to compile it at O3?

At first I didn't, but now the original test case with added critical in the else branch works with full optimization iff I completely remove the specialization CGOpenMPRuntimeNVPTX::emitCriticalRegion.

But again, it just masks the real problem but does not solve it. It is again just a pure coincidence that it returns the expected result.

for (int I = 0; I < 32; ++I) {
  if (omp_get_thread_num() == I) {
    #pragma omp critical
    Count += omp_get_level(); // 6 * 1 = 6
  }
}

Again, it will fail, though it must return the correct result.
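To make "the correct result" concrete, here is a minimal serial reference model of the loop above. It is only a sketch: the function name expected_count is hypothetical, and it assumes every thread in the team observes the same omp_get_level() value. It does not reproduce the NVPTX behavior under discussion; it only states what a conforming execution should add up to.

```cpp
// Serial reference model: each thread whose id matches a loop index performs
// exactly one critical update of size `level`.
int expected_count(int num_threads, int level, int loop_bound) {
  int count = 0;
  for (int tid = 0; tid < num_threads; ++tid) {
    for (int i = 0; i < loop_bound; ++i) {
      if (tid == i) {
        count += level; // the update guarded by "#pragma omp critical"
      }
    }
  }
  return count;
}
```

With six threads at level 1 (as the inline comment "6 * 1 = 6" suggests) and a loop bound of 32, the model yields 6 regardless of the order in which threads enter the critical region.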

Jul 23 2019, 12:29 PM · Restricted Project
Hahnfeld added a comment to D65112: [OPENMP][NVPTX]Make the test compatible with CUDA9+, NFC..

I can reproduce that this test hangs on our Volta GPUs and I debugged it briefly: The problem seems to be how Clang emits critical regions, more precisely that the generated code introduces a barrier. Effectively, this assumes that all threads pass via all critical regions the same number of times, making it a worksharing construct. Obviously, this is not true for the current code, only 10 threads are needed to execute the J loop and all other threads wait at the end of the kernel. If I manually remove the barrier(s) from the generated code, the executable finishes and prints the correct results.

Yep, this is what I'm going to fix in my next patches.

I think we should fix this first instead of relaxing a test that fails for something that is easy to fix.

But simple removal does not help, actually. It still might produce incorrect results. When you removed the barrier, you introduced implicit thread divergence. Since CUDA 9, threads are no longer executed in lock-step (see https://devblogs.nvidia.com/using-cuda-warp-level-primitives/), so the result you get is not stable and is not guaranteed to be reproducible on other platforms by other users.
The runtime relies on the warp-synchronous model, and threads in the warp must be synchronized after the critical region. This means that we should not use barriers here but still need to synchronize threads within the warp. To do so, we must use the __syncwarp(mask) function instead.
Currently, it is pure coincidence that the test passes. It happens only because the second parallel region takes a long time to start and execute its body, so the threads in the else branch have time to execute their code. But this is not guaranteed in CUDA 9+.
To reveal this problem, just enclose the code in the else branch (Count += omp_get_level(); // 6 * 1 = 6) in another #pragma omp critical.

#pragma omp critical
Count += omp_get_level(); // 6 * 1 = 6

It must produce the same result as before but it won't, most probably.

I still get the correct results. Do you have a test that you know to fail?

I get Expected count = 67 with the critical section in this test. It is on Power9 with Cuda9. Did you try to compile it at O3?

Jul 23 2019, 11:11 AM · Restricted Project
Hahnfeld added a comment to D65112: [OPENMP][NVPTX]Make the test compatible with CUDA9+, NFC..

I can reproduce that this test hangs on our Volta GPUs and I debugged it briefly: The problem seems to be how Clang emits critical regions, more precisely that the generated code introduces a barrier. Effectively, this assumes that all threads pass via all critical regions the same number of times, making it a worksharing construct. Obviously, this is not true for the current code, only 10 threads are needed to execute the J loop and all other threads wait at the end of the kernel. If I manually remove the barrier(s) from the generated code, the executable finishes and prints the correct results.

Yep, this is what I'm going to fix in my next patches.

Jul 23 2019, 10:20 AM · Restricted Project
Hahnfeld added a comment to D65112: [OPENMP][NVPTX]Make the test compatible with CUDA9+, NFC..

I can reproduce that this test hangs on our Volta GPUs and I debugged it briefly: The problem seems to be how Clang emits critical regions, more precisely that the generated code introduces a barrier. Effectively, this assumes that all threads pass via all critical regions the same number of times, making it a worksharing construct. Obviously, this is not true for the current code, only 10 threads are needed to execute the J loop and all other threads wait at the end of the kernel. If I manually remove the barrier(s) from the generated code, the executable finishes and prints the correct results.
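The hang described above can be illustrated with a small host-side model. This is a sketch under stated assumptions, not the generated device code: it assumes every critical region ends in a team-wide barrier, and the names barriers_can_complete and passes_for, as well as the team size of 32, are hypothetical choices for illustration.

```cpp
#include <algorithm>
#include <vector>

// passes[t] is how many times thread t enters a critical region. If every
// critical region ends in a team-wide barrier, those barriers can only
// complete when all threads arrive the same number of times.
bool barriers_can_complete(const std::vector<int> &passes) {
  if (passes.empty())
    return true;
  return std::all_of(passes.begin(), passes.end(),
                     [&](int p) { return p == passes.front(); });
}

// Builds the scenario from the comment: `active` threads execute the loop
// body (and its critical region) once, the remaining threads skip it.
std::vector<int> passes_for(int team_size, int active) {
  std::vector<int> passes(team_size, 0);
  std::fill_n(passes.begin(), std::min(active, team_size), 1);
  return passes;
}
```

In this model, barriers_can_complete(passes_for(32, 10)) is false, which mirrors the observation that only 10 threads reach the critical region while the others wait elsewhere.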

Jul 23 2019, 8:11 AM · Restricted Project
Hahnfeld committed rG6e40ae8f3d3f: [libomptarget] Handle offload policy in push_tripcount (authored by Hahnfeld).
[libomptarget] Handle offload policy in push_tripcount
Jul 23 2019, 7:22 AM
Hahnfeld committed rL366810: [libomptarget] Handle offload policy in push_tripcount.
[libomptarget] Handle offload policy in push_tripcount
Jul 23 2019, 7:22 AM
Hahnfeld closed D64626: [libomptarget] Handle offload policy in push_target_tripcount.
Jul 23 2019, 7:22 AM · Restricted Project, Restricted Project
Hahnfeld updated the summary of D64626: [libomptarget] Handle offload policy in push_target_tripcount.
Jul 23 2019, 7:13 AM · Restricted Project, Restricted Project
Hahnfeld added a comment to D65001: [OpenMP][libomptarget] Add support for unified memory for regular maps.

@Hahnfeld can you list the tests you would like to see please? And then I'll add them.

Jul 23 2019, 7:00 AM · Restricted Project, Restricted Project
Hahnfeld requested changes to D65001: [OpenMP][libomptarget] Add support for unified memory for regular maps.

I'd still like to see tests as mentioned in my last comment.

Jul 23 2019, 6:38 AM · Restricted Project, Restricted Project

Jul 22 2019

Hahnfeld added a comment to D65112: [OPENMP][NVPTX]Make the test compatible with CUDA9+, NFC..

So if a user does this, the application will also hang?

Jul 22 2019, 11:52 PM · Restricted Project
Herald added a reviewer for D64943: [Clang][OpenMP offload] Eliminate use of OpenMP linker script: jdoerfert.
Jul 22 2019, 11:47 PM
Herald added a reviewer for D64626: [libomptarget] Handle offload policy in push_target_tripcount: jdoerfert.

I do not understand what caused the error you linked. Which part of the push_tripcount call is problematic? I would have expected get_default_device to return the host id when offload is disabled and the code to not choke on the host device.

According to the spec omp_get_default_device returns default-device-var and I couldn't find a paragraph that implies special handling when offloading is disabled.

Consequently, omp_get_default_device always returns a "real" device and if CheckDeviceAndCtors fails (in the linked case, it was because the user didn't compile for the right architecture), the call to HandleTargetOutcome(false) will result in the error message as seen in the linked post.

So why not change omp_get_default_device to return the host device number if offloading is disabled? My reasoning is: Why should "offload disabled" be different from "no offload", e.g., because there are no devices available.

So you propose to change the default-device-var ICV? Because at the moment, I think omp_get_default_device returns the value of OMP_DEFAULT_DEVICE or 0, regardless of whether the device exists or not. The user can only find out via omp_get_num_devices which will return the correct value 0 in both cases.

(I might not fully grasp the situation here, if so, feel free to present me a bigger picture).

I actually did not try to suggest a change of any ICV value. Let me try to explain:

My thought was that the following conditions should more or less result in the same execution:

  • offloading is disabled, e.g., through an environment variable,
  • offloading is not possible, e.g., there is no device,
  • offloading is not possible, e.g., there was a problem initializing the device.
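
The three conditions listed above can be sketched as a single decision function. This is only an illustration of the reviewer's argument, not libomptarget's implementation: the enum and function names are hypothetical, and the three policy values are modeled on the DISABLED, DEFAULT and MANDATORY settings of the OMP_TARGET_OFFLOAD environment variable.

```cpp
// Hypothetical policy and outcome types for illustration only.
enum class OffloadPolicy { Disabled, Default, Mandatory };
enum class Outcome { RunOnDevice, RunOnHost, FatalError };

// Decides where a target region runs. Under the default policy, "offload
// disabled", "no device" and "device failed to initialize" all end up on the
// host, which is the equivalence argued for in the comment.
Outcome resolve_target_region(OffloadPolicy policy, int num_devices,
                              bool device_init_ok) {
  if (policy == OffloadPolicy::Disabled)
    return Outcome::RunOnHost;

  const bool device_usable = num_devices > 0 && device_init_ok;
  if (device_usable)
    return Outcome::RunOnDevice;

  // No usable device: only a mandatory policy turns this into an error.
  return policy == OffloadPolicy::Mandatory ? Outcome::FatalError
                                            : Outcome::RunOnHost;
}
```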
Jul 22 2019, 11:19 PM · Restricted Project, Restricted Project
Hahnfeld added a comment to D64626: [libomptarget] Handle offload policy in push_target_tripcount.

I do not understand what caused the error you linked. Which part of the push_tripcount call is problematic? I would have expected get_default_device to return the host id when offload is disabled and the code to not choke on the host device.

According to the spec omp_get_default_device returns default-device-var and I couldn't find a paragraph that implies special handling when offloading is disabled.

Consequently, omp_get_default_device always returns a "real" device and if CheckDeviceAndCtors fails (in the linked case, it was because the user didn't compile for the right architecture), the call to HandleTargetOutcome(false) will result in the error message as seen in the linked post.

So why not change omp_get_default_device to return the host device number if offloading is disabled? My reasoning is: Why should "offload disabled" be different from "no offload", e.g., because there are no devices available.

Jul 22 2019, 1:46 PM · Restricted Project, Restricted Project
Hahnfeld added a comment to D64626: [libomptarget] Handle offload policy in push_target_tripcount.

I do not understand what caused the error you linked. Which part of the push_tripcount call is problematic? I would have expected get_default_device to return the host id when offload is disabled and the code to not choke on the host device.

Jul 22 2019, 12:51 PM · Restricted Project, Restricted Project
Hahnfeld committed rGa2748c74d68b: [OMPT] Cleanup reset of exit_frame pointer (authored by Hahnfeld).
[OMPT] Cleanup reset of exit_frame pointer
Jul 22 2019, 11:48 AM
Hahnfeld committed rL366721: [OMPT] Cleanup reset of exit_frame pointer.
[OMPT] Cleanup reset of exit_frame pointer
Jul 22 2019, 11:48 AM
Hahnfeld closed D64442: [OMPT] Cleanup reset of exit_frame pointer.
Jul 22 2019, 11:48 AM · Restricted Project, Restricted Project
Hahnfeld added a comment to D64626: [libomptarget] Handle offload policy in push_target_tripcount.

ping

Jul 22 2019, 11:48 AM · Restricted Project, Restricted Project
Hahnfeld committed rG4138b2f16760: Delete empty file (authored by Hahnfeld).
Delete empty file
Jul 22 2019, 11:12 AM
Hahnfeld committed rL366716: Delete empty file.
Delete empty file
Jul 22 2019, 11:12 AM
Hahnfeld added a comment to D58989: Remove deprecated taskq.

I've deleted the leftover file kmp_taskq.cpp in rL366716.

Jul 22 2019, 11:12 AM · Restricted Project, Restricted Project

Jul 20 2019

Hahnfeld added a comment to D64442: [OMPT] Cleanup reset of exit_frame pointer.

ping

Jul 20 2019, 2:00 AM · Restricted Project, Restricted Project
Hahnfeld requested changes to D65001: [OpenMP][libomptarget] Add support for unified memory for regular maps.

Please add tests at least

  • for the API, i.e., that omp_target_alloc returns distinct memory (I suggest using omp_target_memcpy to copy from host memory to the allocated memory and then to a different host buffer, changing the first host buffer before copying back).
  • that memory regions are indeed shared (in a target data region, the host should see updates from the device without an update and vice versa).
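
The round trip suggested in the first bullet can be sketched on the host alone. This is a stand-in, not the requested test: std::malloc and std::memcpy replace omp_target_alloc and omp_target_memcpy so that the logic of the check is visible without a device, and the function name round_trip_keeps_snapshot is hypothetical.

```cpp
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Copies host -> "device" -> second host buffer, modifying the first host
// buffer in between. If the allocation is distinct memory, the second buffer
// must still hold the original values.
bool round_trip_keeps_snapshot() {
  constexpr std::size_t N = 4;
  int first[N] = {1, 2, 3, 4};
  int second[N] = {0, 0, 0, 0};

  // Stand-in for omp_target_alloc (assumption: a real test would use it).
  void *device = std::malloc(sizeof(first));
  if (device == nullptr)
    return false;

  std::memcpy(device, first, sizeof(first));   // host -> "device"
  first[0] = 42;                               // change the first host buffer
  std::memcpy(second, device, sizeof(second)); // "device" -> other host buffer
  std::free(device);

  // The snapshot taken before the modification must survive the round trip.
  return second[0] == 1 && second[3] == 4 && first[0] == 42;
}
```

If the allocation merely aliased the first host buffer, second[0] would read 42 and the check would fail, which is the property the bullet asks the real test to verify.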
Jul 20 2019, 1:22 AM · Restricted Project, Restricted Project
Hahnfeld added inline comments to D45890: [OMPT] Add implementation and tests of Archer tool.
Jul 20 2019, 1:22 AM · Restricted Project

Jul 19 2019

Hahnfeld added inline comments to D65001: [OpenMP][libomptarget] Add support for unified memory for regular maps.
Jul 19 2019, 1:28 PM · Restricted Project, Restricted Project

Jul 18 2019

Hahnfeld accepted D59880: [OpenMP] RISCV64 port.

The changes look good to me, thanks for your patience!

Jul 18 2019, 9:31 AM · Restricted Project
Hahnfeld added inline comments to D45890: [OMPT] Add implementation and tests of Archer tool.
Jul 18 2019, 9:13 AM · Restricted Project
Hahnfeld added inline comments to D45890: [OMPT] Add implementation and tests of Archer tool.
Jul 18 2019, 9:09 AM · Restricted Project

Jul 16 2019

Hahnfeld added a comment to D64808: [OPENMMP] Resolve lost LoopTripCnt for subsequent loops in same thread..

Can we validate the result differently, without debug messages?

Jul 16 2019, 10:44 AM · Restricted Project, Restricted Project
Hahnfeld closed D64625: [OpenMP] Move header inclusion out of 'extern "C"'.

Committed in rL366229 (but made a typo)

Jul 16 2019, 10:32 AM · Restricted Project
Hahnfeld committed rG1ff553578551: [OpenMP] Move header inclusion out of 'extern "C"' (authored by Hahnfeld).
[OpenMP] Move header inclusion out of 'extern "C"'
Jul 16 2019, 10:22 AM
Hahnfeld committed rL366229: [OpenMP] Move header inclusion out of 'extern "C"'.
[OpenMP] Move header inclusion out of 'extern "C"'
Jul 16 2019, 10:22 AM
Hahnfeld updated the diff for D64625: [OpenMP] Move header inclusion out of 'extern "C"'.

Rebase. (clever git even did the merge for me!)

Jul 16 2019, 8:00 AM · Restricted Project
Hahnfeld added a comment to D45890: [OMPT] Add implementation and tests of Archer tool.

More comments

Jul 16 2019, 7:50 AM · Restricted Project

Jul 12 2019

Hahnfeld added inline comments to D64534: Remove OMP spec versioning.
Jul 12 2019, 7:06 AM · Restricted Project, Restricted Project
Hahnfeld added a comment to D64626: [libomptarget] Handle offload policy in push_target_tripcount.

Could you provide a test?

Jul 12 2019, 6:43 AM · Restricted Project, Restricted Project
Hahnfeld added a comment to D64534: Remove OMP spec versioning.

How does this play with -fopenmp-version= clang switches?
This doesn't remove any public API, only makes all of it available "unconditionally", correct?

Jul 12 2019, 4:08 AM · Restricted Project, Restricted Project
Hahnfeld added a comment to D64534: Remove OMP spec versioning.

@tlwilmar D64625 conflicts with moving the headers. Let me know if you want to land this first, but I'd like to get the fix into the next release which is set to branch next week.

Jul 12 2019, 3:30 AM · Restricted Project, Restricted Project
Hahnfeld created D64626: [libomptarget] Handle offload policy in push_target_tripcount.
Jul 12 2019, 3:30 AM · Restricted Project, Restricted Project
Hahnfeld created D64625: [OpenMP] Move header inclusion out of 'extern "C"'.
Jul 12 2019, 3:25 AM · Restricted Project
Hahnfeld committed rGaca476b29630: [libomptarget] Fix typos and grammar in error messages, NFC. (authored by Hahnfeld).
[libomptarget] Fix typos and grammar in error messages, NFC.
Jul 12 2019, 3:24 AM
Hahnfeld committed rL365890: [libomptarget] Fix typos and grammar in error messages, NFC..
[libomptarget] Fix typos and grammar in error messages, NFC.
Jul 12 2019, 3:22 AM

Jul 11 2019

Hahnfeld committed rG2dfc5179f6a8: [libomptarget-nvptx] Remove dead functions (authored by Hahnfeld).
[libomptarget-nvptx] Remove dead functions
Jul 11 2019, 1:14 PM
Hahnfeld committed rL365817: [libomptarget-nvptx] Remove dead functions.
[libomptarget-nvptx] Remove dead functions
Jul 11 2019, 1:12 PM
Hahnfeld closed D52700: [libomptarget-nvptx] Remove dead functions.
Jul 11 2019, 1:12 PM · Restricted Project
Hahnfeld accepted D55772: [OpenMP][libomptarget] Suppress C++ 11 related warnings when building libomptarget-nvptx bitcode library.

@gtbercea Could you please commit this patch? The warnings are incredibly annoying.

Jul 11 2019, 11:12 AM · Restricted Project, Restricted Project
Hahnfeld added a comment to D64217: [OpenMP][NFCI] Cleanup the target state queue implementation.

Ideally, we can try to get rid of the target state queue by reducing the amount of information and using only shared memory. Without doubt, this would improve performance as Alexey mentioned and lighten the global memory usage (hundreds of MB to GB). Based on my experiments last year, I think that's doable iff we don't support nested parallelism (see thread on openmp-dev) and possibly "tweak" the spec such that we don't need to track some ICVs once the execution reaches an active parallel region (e.g., we don't need to care about nthreads-var if all nested regions are serialized, but currently the user might query the set value via omp_get_max_threads).

Jul 11 2019, 9:34 AM · Restricted Project

Jul 10 2019

Hahnfeld added inline comments to D64375: [OpenMP][Docs] Provide implementation status details.
Jul 10 2019, 11:44 PM · Restricted Project, Restricted Project
Hahnfeld added a comment to D64534: Remove OMP spec versioning.

Great cleanup, one very minor comment inline.

Jul 10 2019, 11:36 PM · Restricted Project, Restricted Project

Jul 9 2019

Hahnfeld added a comment to D64442: [OMPT] Cleanup reset of exit_frame pointer.

I also tried to test that the exit_frame is reset for serialized regions, but we don't have an implicit barrier then :-(

Jul 9 2019, 12:29 PM · Restricted Project, Restricted Project
Hahnfeld created D64442: [OMPT] Cleanup reset of exit_frame pointer.
Jul 9 2019, 12:28 PM · Restricted Project, Restricted Project
Hahnfeld added a comment to D45326: [OpenMP] [CUDA plugin] Add support for teams reduction via scratchpad.

I think reductions are already implemented differently, can we close this?

Jul 9 2019, 12:25 PM · Restricted Project
Hahnfeld edited reviewers for D58951: [compiler-rt][tests] Improve handling with non-default toolchains, added: peter.smith; removed: Hahnfeld.

I don't actually know the use cases well enough to review this patch.

Jul 9 2019, 10:46 AM · Restricted Project, Restricted Project
Hahnfeld added a comment to D59880: [OpenMP] RISCV64 port.

Sorry for the delay...

Jul 9 2019, 9:26 AM · Restricted Project
Hahnfeld requested changes to D62841: [openmp] Use libffi only when LLVM_ENABLE_FFI is on.

After some time has passed, I still don't see a convincing solution here. Maybe abandon the patch?

Jul 9 2019, 8:17 AM · Restricted Project
Hahnfeld updated subscribers of D64375: [OpenMP][Docs] Provide implementation status details.
Jul 9 2019, 3:19 AM · Restricted Project, Restricted Project

Jun 21 2019

Hahnfeld added a comment to D63599: Fixed memory use-after-free problem..

The fixes look good overall, but I'd like approval from somebody more familiar with this part of the code.

Jun 21 2019, 9:09 AM · Restricted Project, Restricted Project

Jun 20 2019

Hahnfeld added a comment to D63607: [clang][driver] Add basic --driver-mode=flang support for fortran.

Even though I know you didn't send the email yet, can you please upload the diff with full context? Thanks :-)

Jun 20 2019, 11:14 PM · Restricted Project
Hahnfeld added inline comments to D63599: Fixed memory use-after-free problem..
Jun 20 2019, 8:47 AM · Restricted Project, Restricted Project

Jun 17 2019

Hahnfeld added inline comments to D63454: [OpenMP] Strengthen regression tests for task allocation under nowait depend clauses NFC.
Jun 17 2019, 1:03 PM · Restricted Project, Restricted Project

Jun 15 2019

Hahnfeld added a comment to D63009: [OpenMP] Add target task alloc function with device ID.

Am I correct that the second to last revision ("- Fix tests.") removed all checks for the actual device_id argument from the tests? From my point of view that's not fixing but weakening the tests! Can you explain why they needed "fixing"?

Jun 15 2019, 12:33 AM · Restricted Project, Restricted Project, Restricted Project

Jun 12 2019

Hahnfeld added a comment to D62393: [OPENMP][NVPTX]Mark parallel level counter as volatile..

Trying to catch up on all the comments, replies to some:

Jun 12 2019, 8:16 AM · Restricted Project
Hahnfeld added reviewers for D63010: [OpenMP] Add task alloc function: jlpeyton, AndreyChurbanov.
Jun 12 2019, 7:41 AM · Restricted Project, Restricted Project, Restricted Project

Jun 10 2019

Hahnfeld added a comment to D63009: [OpenMP] Add target task alloc function with device ID.

As for D63010, the new function isn't used anywhere (and hence cannot be tested). What's the advantage of splitting changes at such granularity?

Jun 10 2019, 4:49 AM · Restricted Project, Restricted Project, Restricted Project
Hahnfeld added a comment to D63010: [OpenMP] Add task alloc function.

Is there a particular reason for this kind of micro-patches? The function is not made available in the linker script and there's no implementation either.

Jun 10 2019, 4:49 AM · Restricted Project, Restricted Project, Restricted Project
Hahnfeld added a comment to D60972: [OpenMP 5.0] libomptarget interface for declare mapper functions.

@Hahnfeld Do you really think it is necessary to pass these two functions as arguments, instead of exporting them? If you do, could you explain why?

Jun 10 2019, 4:41 AM · Restricted Project, Restricted Project

Jun 4 2019

Hahnfeld added a comment to D62841: [openmp] Use libffi only when LLVM_ENABLE_FFI is on.

@Hahnfeld I'm not an OpenMP expert and don't have a strong argument here, but IMHO it might be confusing from a user's perspective for LLVM_ENABLE_FFI and LIBOMPTARGET_ENABLE_FFI to have different default values. I wonder if we can set it off by default and explicitly turn it on for builds with tests?

Jun 4 2019, 11:48 PM · Restricted Project
Hahnfeld added a comment to D60972: [OpenMP 5.0] libomptarget interface for declare mapper functions.

The compiler doesn't generate code related to std::vector. It's only used in the runtime implementation, so it should be okay with Fortran. Again, the IBM and Intel compiler people seem to agree with it.

Maybe I don't understand where rt_mapper_handle comes from. According to the design slides and D59474, it is passed as an argument to the generated omp_mapper[...] function, but how is the runtime system involved in its creation? Will there be additional interface functions / changes that will call this?

The idea is that the runtime will create a MapperComponentsTy (std::vector) before calling the mapper function, for instance in __tgt_target_data_begin (these parts will be implemented in later patches). When the mapper function is called, a pointer to the MapperComponentsTy is passed to it as void *rt_mapper_handle. The mapper function will call __kmpc_push_mapper_component using this rt_mapper_handle, and the runtime can then add the component to the MapperComponentsTy.
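
The handle-passing scheme described above can be sketched as follows. This is a minimal illustration under assumptions, not the interface from the patch: all sketch_* names, the MapComponent fields and the argument types are hypothetical, chosen only to show how an opaque void * handle can carry a std::vector between the runtime and a generated mapper function.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical layout of one mapped component.
struct MapComponent {
  void *Base;
  void *Begin;
  std::size_t Size;
  std::int64_t Type;
};

// The thread describes MapperComponentsTy as a std::vector owned by the runtime.
using MapperComponents = std::vector<MapComponent>;

// Push entry point: the handle is the opaque pointer the runtime passed in.
void sketch_push_mapper_component(void *rt_mapper_handle, void *base,
                                  void *begin, std::size_t size,
                                  std::int64_t type) {
  auto *components = static_cast<MapperComponents *>(rt_mapper_handle);
  components->push_back(MapComponent{base, begin, size, type});
}

// Query entry point, as used in the test proposed earlier in the review.
std::size_t sketch_mapper_num_components(void *rt_mapper_handle) {
  return static_cast<MapperComponents *>(rt_mapper_handle)->size();
}

// Stand-in for a compiler-generated mapper function: it only sees the handle.
void sketch_user_mapper(void *rt_mapper_handle, int *data, std::size_t n) {
  sketch_push_mapper_component(rt_mapper_handle, data, data, n * sizeof(int), 0);
  sketch_push_mapper_component(rt_mapper_handle, data, data + n / 2,
                               (n - n / 2) * sizeof(int), 0);
}

// Stand-in for the runtime side: create the vector, call the mapper, inspect.
std::size_t sketch_run_mapper() {
  MapperComponents components;
  int data[8] = {};
  sketch_user_mapper(&components, data, 8);
  return sketch_mapper_num_components(&components);
}
```

Keeping the vector behind a void * means the generated code never needs to know about std::vector, which matches the point made above that the compiler does not emit any code related to it.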

Jun 4 2019, 11:33 PM · Restricted Project, Restricted Project
Hahnfeld added a comment to D62841: [openmp] Use libffi only when LLVM_ENABLE_FFI is on.

Oh, LLVM_ENABLE_FFI is off by default? That's not good, because we rely on the plugins to test libomptarget. So changing the default will have a negative impact on test coverage. I'm afraid that's not a good idea, even if we are now using libffi when LLVM_ENABLE_FFI = Off. We could still add LIBOMPTARGET_ENABLE_FFI (at the top-level CMakeLists.txt, please) and default to on, but I'm not sure if that's of much help.

Jun 4 2019, 11:20 PM · Restricted Project
Hahnfeld added a comment to D60972: [OpenMP 5.0] libomptarget interface for declare mapper functions.

It would be great to have such things in public...

Sure, there is no secret. Please see it here if you are interested: https://github.com/lingda-li/public-sharing/blob/master/mapper_runtime_design.pptx

From a quick look, I'd say this does not reflect the current design: The types are named differently, have a different layout (SoA vs AoS) and there's no implementation of __tgt_target_mapper in this patch as @grokos mentioned.

Hi Jonas, starting from slide 8 is the current design, __tgt_target_mapper etc. are deprecated. It may not accurately reflect the current code but the framework should be the same.

Moreover, I'd question the following things:

  1. Why are we back to __kmpc_* naming? Most other functions specific to libomptarget are called __tgt_*.

There are other functions in libomptarget starting with __kmpc_, for example, __kmpc_push_target_tripcount.
My understanding is that anything that does not directly call the device starts with __kmpc_. The IBM and Intel compiler people seem to be okay with this naming.

Jun 4 2019, 7:24 AM · Restricted Project, Restricted Project
Hahnfeld added a comment to D60972: [OpenMP 5.0] libomptarget interface for declare mapper functions.

It would be great to have such things in public...

Sure, there is no secret. Please see it here if you are interested: https://github.com/lingda-li/public-sharing/blob/master/mapper_runtime_design.pptx

Jun 4 2019, 6:29 AM · Restricted Project, Restricted Project