- User Since
- Aug 8 2018, 8:02 AM (204 w, 23 h)
Mon, Jun 27
Confirmed that this patch fixes https://github.com/llvm/llvm-project/issues/56251
Mon, Jun 20
LGTM. This allows me to write concise compile lines.
Sun, Jun 19
Use -O3 in DeviceRTL static library build.
Sat, Jun 18
We don't do "hope it works"; we are sure it works, and I don't think we have ever found an undesired clang being picked up. If an undesired clang does get picked up, please report an issue.
Actually, in the case of ENABLE_PROJECTS=openmp, there is no search for a clang binary; the clang CMake target is referenced directly. The just-built clang is guaranteed to be picked up.
In the case of ENABLE_RUNTIMES=openmp, LLVM_DIR is passed in, which I would consider closer to "hope it works". Even when relying on CMAKE_CXX_COMPILER in runtime builds, you are actually hoping that the RUNTIMES setup passes in the desired compiler; it is just different scripting magic managed by ENABLE_RUNTIMES.
May 16 2022
Restored gcc build in my tests.
This is either done with a two-step build, where OpenMP is built with the Clang that was just installed, or through the -DLLVM_ENABLE_RUNTIMES=openmp option. This has always been the case,
This is not true. Even before the D125315 breakage, all the host libraries could be built with GCC and DeviceRTL with the just-built Clang via -DLLVM_ENABLE_PROJECTS=openmp.
Maybe we should figure out why D125315 doesn't pick up clang when building DeviceRTL.
This is different from what I'm looking for. DeviceRTL should still be built as it was. Only the part added by D125315 should be disabled.
Apr 28 2022
@qiongsiwu1 Tested the updated patch. Works fine now.
Apr 23 2022
Got some trouble. See https://github.com/llvm/llvm-project/issues/55002
Apr 22 2022
Apr 19 2022
The debian failure seems unrelated. Let me know if you need me to commit the patch.
Apr 17 2022
Apr 16 2022
Expand the test.
Apr 15 2022
check_include_file is like writing an empty main with one include line and compiling it, so it doesn't pull in any dependencies.
To test whether hwloc.h is a valid header file, you'd need to provide CMAKE_REQUIRED_INCLUDES pointing to the directory that contains hwloc.h.
So it is needed.
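For illustration, the probe that check_include_file compiles has roughly the shape below. This is only a sketch: `<cstddef>` stands in for hwloc.h so that it builds anywhere, and `probeMain`, `probeSeesHwloc` and `PROBE_SEES_HWLOC` are hypothetical names that are not part of CMake or the patch. The `__has_include` check (C++17) is an extra added here to show that the result depends only on the include search path, which is what CMAKE_REQUIRED_INCLUDES extends.

```cpp
// Stand-in for the header under test; the case in this review would be <hwloc.h>.
#include <cstddef>

// Record whether hwloc.h is visible on the current include search path.
#if defined(__has_include)
#if __has_include(<hwloc.h>)
#define PROBE_SEES_HWLOC 1
#endif
#endif
#ifndef PROBE_SEES_HWLOC
#define PROBE_SEES_HWLOC 0
#endif

// In the generated probe this function is main. It calls nothing, so no
// library is linked and no dependency is pulled in.
int probeMain() { return 0; }

// True only if the compiler could find hwloc.h without extra include flags.
constexpr bool probeSeesHwloc() { return PROBE_SEES_HWLOC != 0; }
```

If hwloc.h lives outside the default search path, `probeSeesHwloc()` is false until the matching include directory is added to the compile line.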
One minor change requested. All the rest LGTM.
Apr 14 2022
Apr 13 2022
I'm afraid you are abusing the existing mutex. Please be specific about what part of the Entry you are trying to protect with TPR lock/unlock.
The new scheme stores the shadow pointers inside the entries.
Makes a lot more sense now.
Apr 12 2022
Apr 11 2022
Apr 7 2022
Apr 4 2022
Two tests array_section_use_device_ptr.c and array_section_implicit_capture.c are ported from https://reviews.llvm.org/D117997
Mar 28 2022
using the multiarch directory
If we can cross-compile libomp and libomptarget for the target system, we may have
Compile clang once but compile the runtime libraries for multiple architectures.
Mar 18 2022
Mar 6 2022
Mar 5 2022
Feb 17 2022
Feb 10 2022
I was using gfx902 in the past, but after installing ROCm 5.0.0 today I got a complaint and had to use gfx90c, copying the gfx902 bc files to gfx90c.
However it is better to do the right thing here.
ROCm 5.0.0 requires me to use this target on my 4700u
Jan 28 2022
Jan 24 2022
Jan 23 2022
Jan 21 2022
It sounds like such info will be passed from the host to the device once per kernel, so the performance impact is negligible. Right?
Jan 20 2022
Jan 18 2022
Please update the allowed string values and also resolve the conflict before landing.
Jan 10 2022
Much appreciated. Being able to debug the host code is already a huge plus.
Jan 8 2022
Confirmed that #52938 is fixed by this patch.
Jan 4 2022
Renaming Event to EventH2D, along with the corresponding functions, is my last request. The rest looks good.
I still think the new/old event exchange should be removed.
If T1 records an event for a D2H transfer, and then T2 issues an H2D transfer, creates a new event, and then waits for the old event, the H2D and D2H transfers may happen concurrently when they are on two distinct streams.
This violates atomic data transfer behavior.
The solution is one persistent event per map, whose status is always checked before issuing any transfer.
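A minimal single-threaded sketch of the "one persistent event per map" idea, assuming a simulated event rather than a real device API. `MapEntry`, `issueTransfer`, `waitEvent` and `runTransfers` are hypothetical names, not the libomptarget interface; a real implementation would also need a lock around the entry and would query an actual device event.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of one mapping entry that owns a single persistent event.
class MapEntry {
public:
  explicit MapEntry(std::vector<std::string> &Log) : Log(Log) {}

  // Issue a transfer in either direction. The persistent event is always
  // checked first, so a new transfer never overlaps the previous one.
  void issueTransfer(const std::string &Name) {
    waitEvent();
    Log.push_back(Name + " begin");
    // "Record" the event: remember how the in-flight transfer completes.
    Complete = [this, Name] { Log.push_back(Name + " end"); };
  }

  // Block until the transfer guarded by the event, if any, has finished.
  void waitEvent() {
    if (!Complete)
      return;
    std::function<void()> Finish = std::move(Complete);
    Complete = nullptr;
    Finish();
    assert(!Complete && "event must be idle after waiting");
  }

private:
  std::vector<std::string> &Log;
  std::function<void()> Complete; // empty when no transfer is in flight
};

// Replays the scenario from the comment: T1 starts a D2H transfer, then T2
// starts an H2D transfer on the same mapping.
std::vector<std::string> runTransfers() {
  std::vector<std::string> Log;
  MapEntry Entry(Log);
  Entry.issueTransfer("D2H"); // T1
  Entry.issueTransfer("H2D"); // T2 waits for the D2H transfer to finish first
  Entry.waitEvent();
  return Log;
}
```

Calling `runTransfers()` yields a log in which the D2H transfer ends before the H2D transfer begins. Because there is only one event per entry, no old and new events need to be exchanged between threads.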
Jan 3 2022
Dec 29 2021
This patch caused many test failures in my application on Power9. Although this patch sounds like it affects SLP, adding -fno-slp-vectorize doesn't improve the pass rate, but changing -O3 to -O0 does.
Dec 28 2021
Dec 27 2021
Dec 26 2021
Dec 20 2021
Dec 16 2021
I never liked init-if-passed-nullptr and prefer this explicit approach.