User Details
- User Since
- Aug 8 2018, 8:02 AM (128 w, 4 d)
Mon, Dec 28
Dec 22 2020
It is lower case openmp not OPENMP.
CMake Error at runtimes/CMakeLists.txt:34 (message): LLVM_ENABLE_RUNTIMES requests OPENMP but directory not found: /home/yeluo/opt/llvm-clang/llvm-project/llvm/runtimes/../../OPENMP
Dec 7 2020
Dec 4 2020
Nov 27 2020
This patch caused a severe regression in Clang 11.
https://bugs.llvm.org/show_bug.cgi?id=48177
Nov 18 2020
- Could you split the reordering-related changes into a separate patch?
- Could you mention which line in spec 4.5 states the restriction? Even 5.0/5.1 have some restrictions. It needs to be clear which one you are referring to.
Nov 10 2020
Nov 4 2020
Oct 21 2020
Getting this even when compiling without offload. You can use the reproducer from the original bug report.
clang++: /home/yeluo/opt/llvm-clang/llvm-project/llvm/include/llvm/ADT/APInt.h:1151: bool llvm::APInt::operator==(const llvm::APInt &) const: Assertion `BitWidth == RHS.BitWidth && "Comparison requires equal bit widths"' failed.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace, preprocessed source, and associated run script.
Stack dump:
0. Program arguments: /home/packages/llvm/master-patched/bin/clang++ -DADD_ -DH5_USE_16_API -DHAVE_CONFIG_H -Drestrict=__restrict__ -I/home/yeluo/opt/miniqmc/src -I/home/yeluo/opt/miniqmc/build_clang_offlaod_nowait/src -fopenmp -fomit-frame-pointer -fstrict-aliasing -D__forceinline=inline -march=native -O3 -DNDEBUG -ffast-math -std=c++11 -o CMakeFiles/qmcutil.dir/Utilities/tinyxml/tinyxml2.cpp.o -c /home/yeluo/opt/miniqmc/src/Utilities/tinyxml/tinyxml2.cpp
1. <eof> parser at end of file
2. Per-module optimization passes
3. Running pass 'CallGraph Pass Manager' on module '/home/yeluo/opt/miniqmc/src/Utilities/tinyxml/tinyxml2.cpp'.
4. Running pass 'Combine redundant instructions' on function '@_ZN8tinyxml27XMLUtil10IsNameCharEh'
#0 0x0000000001ecc523 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/packages/llvm/master-patched/bin/clang+++0x1ecc523)
#1 0x0000000001eca25e llvm::sys::RunSignalHandlers() (/home/packages/llvm/master-patched/bin/clang+++0x1eca25e)
#2 0x0000000001ecb8cd llvm::sys::CleanupOnSignal(unsigned long) (/home/packages/llvm/master-patched/bin/clang+++0x1ecb8cd)
#3 0x0000000001e513b3 (anonymous namespace)::CrashRecoveryContextImpl::HandleCrash(int, unsigned long) (/home/packages/llvm/master-patched/bin/clang+++0x1e513b3)
#4 0x0000000001e514ee CrashRecoverySignalHandler(int) (/home/packages/llvm/master-patched/bin/clang+++0x1e514ee)
#5 0x00007f18f56923c0 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x153c0)
#6 0x00007f18f512718b raise /build/glibc-ZN95T4/glibc-2.31/signal/../sysdeps/unix/sysv/linux/raise.c:51:1
#7 0x00007f18f5106859 abort /build/glibc-ZN95T4/glibc-2.31/stdlib/abort.c:81:7
#8 0x00007f18f5106729 get_sysdep_segment_value /build/glibc-ZN95T4/glibc-2.31/intl/loadmsgcat.c:509:8
#9 0x00007f18f5106729 _nl_load_domain /build/glibc-ZN95T4/glibc-2.31/intl/loadmsgcat.c:970:34
#10 0x00007f18f5117f36 (/lib/x86_64-linux-gnu/libc.so.6+0x36f36)
#11 0x00000000019f4c00 llvm::InstCombinerImpl::foldOrOfICmps(llvm::ICmpInst*, llvm::ICmpInst*, llvm::BinaryOperator&) (/home/packages/llvm/master-patched/bin/clang+++0x19f4c00)
#12 0x00000000019fb023 llvm::InstCombinerImpl::visitOr(llvm::BinaryOperator&) (/home/packages/llvm/master-patched/bin/clang+++0x19fb023)
#13 0x00000000019d354c llvm::InstCombinerImpl::run() (/home/packages/llvm/master-patched/bin/clang+++0x19d354c)
#14 0x00000000019d5788 combineInstructionsOverFunction(llvm::Function&, llvm::InstCombineWorklist&, llvm::AAResults*, llvm::AssumptionCache&, llvm::TargetLibraryInfo&, llvm::TargetTransformInfo&, llvm::DominatorTree&, llvm::OptimizationRemarkEmitter&, llvm::BlockFrequencyInfo*, llvm::ProfileSummaryInfo*, unsigned int, llvm::LoopInfo*) (/home/packages/llvm/master-patched/bin/clang+++0x19d5788)
#15 0x00000000019d70b1 llvm::InstructionCombiningPass::runOnFunction(llvm::Function&) (/home/packages/llvm/master-patched/bin/clang+++0x19d70b1)
#16 0x00000000017c7a68 llvm::FPPassManager::runOnFunction(llvm::Function&) (/home/packages/llvm/master-patched/bin/clang+++0x17c7a68)
#17 0x00000000010d0033 (anonymous namespace)::CGPassManager::runOnModule(llvm::Module&) (/home/packages/llvm/master-patched/bin/clang+++0x10d0033)
#18 0x00000000017c8117 llvm::legacy::PassManagerImpl::run(llvm::Module&) (/home/packages/llvm/master-patched/bin/clang+++0x17c8117)
#19 0x00000000020fed4a clang::EmitBackendOutput(clang::DiagnosticsEngine&, clang::HeaderSearchOptions const&, clang::CodeGenOptions const&, clang::TargetOptions const&, clang::LangOptions const&, llvm::DataLayout const&, llvm::Module*, clang::BackendAction, std::unique_ptr<llvm::raw_pwrite_stream, std::default_delete<llvm::raw_pwrite_stream> >) (/home/packages/llvm/master-patched/bin/clang+++0x20fed4a)
#20 0x0000000002d29c9c clang::BackendConsumer::HandleTranslationUnit(clang::ASTContext&) (/home/packages/llvm/master-patched/bin/clang+++0x2d29c9c)
#21 0x00000000037e77e3 clang::ParseAST(clang::Sema&, bool, bool) (/home/packages/llvm/master-patched/bin/clang+++0x37e77e3)
#22 0x00000000026dc383 clang::FrontendAction::Execute() (/home/packages/llvm/master-patched/bin/clang+++0x26dc383)
#23 0x000000000266e4f2 clang::CompilerInstance::ExecuteAction(clang::FrontendAction&) (/home/packages/llvm/master-patched/bin/clang+++0x266e4f2)
#24 0x0000000002789bb2 clang::ExecuteCompilerInvocation(clang::CompilerInstance*) (/home/packages/llvm/master-patched/bin/clang+++0x2789bb2)
#25 0x0000000000a4568c cc1_main(llvm::ArrayRef<char const*>, char const*, void*) (/home/packages/llvm/master-patched/bin/clang+++0xa4568c)
#26 0x0000000000a437ec ExecuteCC1Tool(llvm::SmallVectorImpl<char const*>&) (/home/packages/llvm/master-patched/bin/clang+++0xa437ec)
#27 0x0000000002523de2 void llvm::function_ref<void ()>::callback_fn<clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, bool*) const::$_1>(long) (/home/packages/llvm/master-patched/bin/clang+++0x2523de2)
#28 0x0000000001e512c7 llvm::CrashRecoveryContext::RunSafely(llvm::function_ref<void ()>) (/home/packages/llvm/master-patched/bin/clang+++0x1e512c7)
#29 0x00000000025234f7 clang::driver::CC1Command::Execute(llvm::ArrayRef<llvm::Optional<llvm::StringRef> >, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >*, bool*) const (/home/packages/llvm/master-patched/bin/clang+++0x25234f7)
#30 0x00000000024efd28 clang::driver::Compilation::ExecuteCommand(clang::driver::Command const&, clang::driver::Command const*&) const (/home/packages/llvm/master-patched/bin/clang+++0x24efd28)
#31 0x00000000024f0247 clang::driver::Compilation::ExecuteJobs(clang::driver::JobList const&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*> >&) const (/home/packages/llvm/master-patched/bin/clang+++0x24f0247)
#32 0x0000000002509758 clang::driver::Driver::ExecuteCompilation(clang::driver::Compilation&, llvm::SmallVectorImpl<std::pair<int, clang::driver::Command const*> >&) (/home/packages/llvm/master-patched/bin/clang+++0x2509758)
#33 0x0000000000a43158 main (/home/packages/llvm/master-patched/bin/clang+++0xa43158)
#34 0x00007f18f51080b3 __libc_start_main /build/glibc-ZN95T4/glibc-2.31/csu/../csu/libc-start.c:342:3
#35 0x0000000000a404de _start (/home/packages/llvm/master-patched/bin/clang+++0xa404de)
clang-12: error: clang frontend command failed with exit code 134 (use -v to see invocation)
clang version 12.0.0 (https://github.com/llvm/llvm-project.git ca73dcd8a9ed9cc3ca1c1cc97ab893747791a681)
Target: x86_64-unknown-linux-gnu
Thread model: posix
InstalledDir: /home/packages/llvm/master-patched/bin
clang-12: note: diagnostic msg:
********************
Fails at make install.
Oct 16 2020
Oct 7 2020
LGTM
Oct 6 2020
I just realized that this patch affects clang and libomptarget.
I cannot comment on clang. Regarding libomptarget, could you explain why the detection is not put together with the other CUDA stuff in openmp/libomptarget/cmake/Modules/LibomptargetGetDependencies.cmake?
3.18 introduces CMAKE_CUDA_ARCHITECTURES. Does 3.18 support detection? If we know a new way works since 3.18, I think putting both with if-else makes sense.
The link I posted indicates that the independent feature has been merged since 3.12. It is better to avoid deprecated stuff when introducing new cmake lines, even though some existing lines may still rely on deprecated cmake.
FindCUDA has been deprecated.
Please explore the following feature without directly calling FindCUDA.
https://gitlab.kitware.com/cmake/cmake/-/merge_requests/1856
Sep 28 2020
The minimal reproducer and full app work now.
Sep 23 2020
Sep 21 2020
Should be good to go now.
After a bit more experimenting, the return status of cuGetErrorString can be something other than CUDA_SUCCESS or CUDA_ERROR_INVALID_VALUE.
In this particular case, when CUDA is deinitialized, the error code cannot be translated by cuGetErrorString any more.
So now errStr is only printed when the status is CUDA_SUCCESS.
Treat CUDA_ERROR_INVALID_VALUE differently from the generic != CUDA_SUCCESS case.
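A minimal sketch of the handling described in these comments, so it can be built without the CUDA toolkit. Everything here is an assumption for illustration: `Status` stands in for `CUresult`, `fakeGetErrorString` stands in for `cuGetErrorString`, and the `DriverAlive` flag is a hypothetical way to simulate the deinitialized driver. It is not the code from the patch.

```cpp
#include <string>

// Stand-in status codes (hypothetical). The real driver API uses CUresult
// values such as CUDA_SUCCESS and CUDA_ERROR_INVALID_VALUE.
enum class Status { Success, InvalidValue, Deinitialized };

// Hypothetical stand-in for cuGetErrorString: translates Code into *Str and
// returns its own status. DriverAlive == false simulates a deinitialized
// driver, where the translation itself fails.
inline Status fakeGetErrorString(Status Code, const char **Str,
                                 bool DriverAlive) {
  if (!DriverAlive)
    return Status::Deinitialized;
  switch (Code) {
  case Status::Success:
    *Str = "no error";
    return Status::Success;
  case Status::InvalidValue:
    *Str = "invalid argument";
    return Status::Success;
  default:
    return Status::InvalidValue; // code not recognized by the translator
  }
}

// Only use the error string when the translation call itself succeeded, and
// report an unrecognized code separately from any other translation failure.
inline std::string describeError(Status Code, bool DriverAlive) {
  const char *Str = nullptr;
  Status Translate = fakeGetErrorString(Code, &Str, DriverAlive);
  if (Translate == Status::Success)
    return std::string("error string: ") + Str;
  if (Translate == Status::InvalidValue)
    return "unrecognized error code";
  return "error string unavailable";
}
```

The point of the three-way split is that `Str` is never dereferenced unless the translation call reported success.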
Hold on a second. I'm looking a bit more into the error message.
The root cause is a known issue and I put up a bug report to track the status.
https://bugs.llvm.org/show_bug.cgi?id=47595
Anyway, this patch should be sufficient for users at the moment.
Sep 19 2020
Sep 14 2020
If I remember correctly, you may yield the thread inside a target region after enqueuing kernels and transfers. So even with 1 thread, there is a chance to run other tasks without finishing this target. Isn't that possible?
However, an OpenMP task has the problem that it must be within
a parallel region; otherwise the task will be executed immediately. As a
result, if we directly wrap it in a regular task, the nowait target outside of a
parallel region is still a synchronous version.
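The behavior quoted above can be observed with a small sketch. The function name `runTaskOutsideParallel` is hypothetical. Built with `-fopenmp`, the task is encountered outside any parallel region; built without OpenMP, the pragmas are simply ignored. In both cases one would expect the task body to run before the statement that follows it, which is the synchronous behavior the summary describes, though the exact scheduling is up to the runtime.

```cpp
#include <vector>

// Records the order in which the statements run. There is no enclosing
// "#pragma omp parallel", so the task has no team of threads to be deferred to.
inline std::vector<int> runTaskOutsideParallel() {
  std::vector<int> Order;
  Order.push_back(1); // before the task construct
#pragma omp task shared(Order)
  {
    Order.push_back(2); // task body
  }
  Order.push_back(3); // statement after the task construct
#pragma omp taskwait
  return Order;
}
```

If the task were truly asynchronous, 3 could be recorded before 2. Outside a parallel region the recorded order is typically 1, 2, 3.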
Sep 10 2020
Sep 8 2020
The changes I requested have been added. Removing my block. Other reviewers' comments still need to be addressed.
Sep 4 2020
Sep 1 2020
Aug 31 2020
It seems that the functions are marked static, so they should be OK. However, including the whole Debug.h in a plugin cpp makes it feel OK to use any function/macro from the header file. But actually only some of the macros are for the plugins. Some are only for libomptarget.
I don't feel right having Debug.h shared by libomptarget and the plugins, especially when Debug.h contains not just macros but also functions.
Aug 26 2020
Please document the flags in the patch summary.
Aug 24 2020
I would prefer PrivateArgumentManagerTy to be moved into its own files.
The rest looks good to me.
Aug 23 2020
Down the road, we may need a way to allocate host pinned memory via the plugin for the host buffer to maximize transfer performance.
Aug 20 2020
Only minor things.
Why just the "small" ones? Why not all of them?
Aug 19 2020
LGTM
LGTM
Aug 18 2020
In addition,
- the DeviceTy copy constructor and assignment operator were already imperfect before this patch. I don't think we can fix them in this patch. We should just document the imperfection here.
- Because the memory limit is per allocation, it seems that the MemoryManager can still hold an unbounded amount of memory and we have no way to free it. I'm concerned about having this feature on by default.
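To illustrate the second concern, here is a toy caching pool. It is a hypothetical sketch, not the libomptarget MemoryManager: `ToyCachingPool`, `cachedAfterBurst` and their parameters are made up for this example, and host `malloc` stands in for device allocation. The threshold only limits the size of each cached block, so the total held by the pool grows with the number of blocks released.

```cpp
#include <cstddef>
#include <cstdlib>
#include <map>
#include <vector>

// Hypothetical toy pool: blocks at or below Threshold bytes are kept in a
// free list on release instead of being returned to the system.
class ToyCachingPool {
  std::size_t Threshold;
  std::multimap<std::size_t, void *> FreeList; // size -> cached block
  std::map<void *, std::size_t> Live;          // live block -> size
  std::size_t Cached = 0;                      // total bytes held in FreeList

public:
  explicit ToyCachingPool(std::size_t T) : Threshold(T) {}
  ToyCachingPool(const ToyCachingPool &) = delete;
  ToyCachingPool &operator=(const ToyCachingPool &) = delete;
  ~ToyCachingPool() {
    for (auto &E : FreeList)
      std::free(E.second);
    for (auto &E : Live)
      std::free(E.first);
  }

  void *allocate(std::size_t Size) {
    void *P = nullptr;
    auto It = FreeList.find(Size);
    if (It != FreeList.end()) { // reuse a cached block of the same size
      P = It->second;
      Cached -= Size;
      FreeList.erase(It);
    } else {
      P = std::malloc(Size);
    }
    Live[P] = Size;
    return P;
  }

  void release(void *P) {
    auto It = Live.find(P);
    if (It == Live.end())
      return;
    std::size_t Size = It->second;
    Live.erase(It);
    if (Size <= Threshold) { // the per-allocation limit is the only check
      FreeList.emplace(Size, P);
      Cached += Size;
    } else {
      std::free(P);
    }
  }

  std::size_t cachedBytes() const { return Cached; }
};

// Allocates Count blocks that are live at the same time, releases them all,
// and reports how many bytes the pool is still holding.
inline std::size_t cachedAfterBurst(std::size_t Threshold, std::size_t Size,
                                    int Count) {
  ToyCachingPool Pool(Threshold);
  std::vector<void *> Blocks;
  for (int I = 0; I < Count; ++I)
    Blocks.push_back(Pool.allocate(Size));
  for (void *P : Blocks)
    Pool.release(P);
  return Pool.cachedBytes();
}
```

With a threshold of 1024 bytes, releasing 100 blocks of 1024 bytes leaves 102400 bytes cached, and nothing in the pool caps that total.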
Aug 13 2020
What is the current status of this patch?
@lildmh could you update this patch? I'd like to test it against
https://bugs.llvm.org/show_bug.cgi?id=47122
Aug 12 2020
Blocking the patch temporarily pending answers to my earlier questions.
- Please mention LIBOMPTARGET_MEMORY_MANAGER_THRESHOLD, default value and unit in the patch summary.
- Is it possible to have a unit test testing the manager class behaviors?
- Can we offload to host and run address sanitizer or valgrind?
I'm not sure if I'm asking for too much here.
Jul 31 2020
LGTM
Jul 30 2020
Thanks for fixing the bug. It should be good for the moment.
When I think about the existence of recursive mappers, we may still have more sync than needed. I think recursing through the whole targetDataBegin/targetDataEnd is a convenient but sub-optimal choice.
Recursion should only be done on the map/mapper analysis. Just leaving my thoughts here. It needs a discussion beyond this patch.
LGTM.
LGTM. Please mention renaming variables in the summary.
Jul 29 2020
LGTM. My applications run as expected now. PR46824, PR46012, PR46868 all work fine.
Only minor documentation issues.
Jul 28 2020
This patch
GPU activities: 96.99% 350.05ms 10 35.005ms 1.5680us 350.00ms [CUDA memcpy HtoD]
before the July21 change
GPU activities: 95.33% 20.317ms 4 5.0793ms 1.6000us 20.305ms [CUDA memcpy HtoD]
Still more transfers than there should be.
LGTM
OK. Leave the unrelated renaming to the future.
Only one minor issue. Your initial sophisticated patch made me think you had replaced all the lock/unlock calls. After splitting, the change becomes very clean.
Should be easy to address my comments and let us get this merged ASAP.
Please check the reproducer in https://bugs.llvm.org/show_bug.cgi?id=46868 with LIBOMPTARGET_DEBUG=1.
The reference counting on the base pointer variable has side effects. It is not cleaned up when these variables leave their scope.
This patch needs to be split into three.
- function renaming. In addition, should we update target_data_update as well?
- std::lock_guard change.
- "target" change.
The order of 1 and 2 can be flexible.
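For reference, the kind of change meant by the std::lock_guard item can be sketched as follows. The `Counter` type and `lockGuardDemo` are hypothetical and are not taken from the patch. The guard releases the mutex on every exit path, which a manual lock/unlock pair does not guarantee.

```cpp
#include <mutex>

struct Counter {
  std::mutex Mtx;
  int Value = 0;

  // Manual style: every exit path has to remember to call unlock().
  void incrementManual() {
    Mtx.lock();
    ++Value;
    Mtx.unlock();
  }

  // RAII style: the guard unlocks in its destructor, including on the
  // early return below.
  bool incrementGuarded(bool Fail) {
    std::lock_guard<std::mutex> Guard(Mtx);
    if (Fail)
      return false;
    ++Value;
    return true;
  }
};

// Returns the final counter value, or -1 if the mutex was left locked.
inline int lockGuardDemo() {
  Counter C;
  C.incrementManual();
  C.incrementGuarded(true);  // early return, mutex still released
  C.incrementGuarded(false);
  bool Free = C.Mtx.try_lock(); // succeeds only if all paths unlocked
  if (Free)
    C.Mtx.unlock();
  return Free ? C.Value : -1;
}
```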