The "--fopenmp-ptx=" flag overrides the default PTX version, "+ptx42", used for GPU-offloaded OpenMP target regions.
Details
- Reviewers
arpith-jacob caomhin carlo.bertolli ABataev Hahnfeld jlebar hfinkel tstellar
- Commits
rGa9f6a5292505: Disabling openmp-offload.c on linux until it is stabilized on all local…
rG6b26dcb6d65a: [OpenMP] Add flag for overwriting default PTX version for OpenMP targets
rC310772: Disabling openmp-offload.c on linux until it is stabilized on all local…
rC310489: [OpenMP] Add flag for overwriting default PTX version for OpenMP targets
rL310772: Disabling openmp-offload.c on linux until it is stabilized on all local…
rL310489: [OpenMP] Add flag for overwriting default PTX version for OpenMP targets
Diff Detail
- Repository
- rL LLVM
Event Timeline
lib/Driver/ToolChains/Cuda.cpp | ||
---|---|---|
503–514 | I don't like this kind of code. It would be better to write it like this: if (DeviceOffloadingKind == Action::OFK_OpenMP) CC1Args.push_back(DriverArgs.getLastArgValue(options::OPT_fopenmp_ptx_EQ, "+ptx42")); else CC1Args.push_back("+ptx42"); or something similar. |
lib/Driver/ToolChains/Cuda.cpp | ||
---|---|---|
507 | No, use CC1Args.push_back() here, or CC1Args.emplace_back(), which is even better. |
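For illustration, the shape suggested in the two inline comments above can be sketched outside the driver. This is a minimal standalone sketch under stated assumptions, not the code that landed: `OffloadKind`, `getLastArgValue`, `addPtxFeature`, and the `fopenmp-ptx` map key are hypothetical stand-ins for `Action::OffloadKind`, `DriverArgs.getLastArgValue`, and `options::OPT_fopenmp_ptx_EQ`.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for the driver's offload kinds; not the real clang API.
enum class OffloadKind { None, Cuda, OpenMP };

// Hypothetical helper: returns the value stored for Key, or Default if absent.
std::string getLastArgValue(const std::map<std::string, std::string> &Args,
                            const std::string &Key,
                            const std::string &Default) {
  auto It = Args.find(Key);
  return It == Args.end() ? Default : It->second;
}

// Appends exactly one PTX feature string to the cc1 arguments.
// Only OpenMP offloading consults the user-supplied override; every
// other kind keeps the "+ptx42" default.
void addPtxFeature(OffloadKind Kind,
                   const std::map<std::string, std::string> &Args,
                   std::vector<std::string> &CC1Args) {
  if (Kind == OffloadKind::OpenMP)
    CC1Args.emplace_back(getLastArgValue(Args, "fopenmp-ptx", "+ptx42"));
  else
    CC1Args.emplace_back("+ptx42");
}
```

Keeping the default in one lookup call means the OpenMP branch never needs a separate "was the flag given?" check, which seems to be the point of the first comment.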
Looks like this test is failing on macOS again after this change:
Can you please take a look?
Even after r310505, openmp-offload.c continues to haunt our bots, for example http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/2012. Can you please fix this test?
r310519 did not fix the problem, see http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/7062. I would suggest reverting and fixing it properly; our bots have been broken for a few days already.
I've removed that test. Let's see if the other two tests pass or not. (310537)
I can't reproduce the error locally so it's hard to figure out what's failing.
If you have a machine with that configuration and can run the command I would appreciate seeing the output of the failing command. That way I know what the driver is doing on your machine.
Hi, we are seeing this test fail on our internal linux build bot. I built/ran your latest change r310537 and here is the test result:
/home/dyung/src/upstream/llvm_clean/tools/clang/test/Driver/openmp-offload.c:722:23: error: expected string not found in input
// CHK-PTXAS-VERSION: clang{{.*}}.bc" {{.*}}"-target-feature" "+ptx52"
^
<stdin>:1:1: note: scanning from here
clang version 6.0.0 (trunk 310537)
^
<stdin>:9:114: note: possible intended match here
"nvlink" "-o" "/tmp/lit_tmp_FMSP4Q/openmp-offload-bb8c5f.out" "-arch" "sm_20" "-L/home/dyung/src/upstream/310537-linux/./lib" "-lomptarget-nvptx" "openmp-offload-74c18d.cubin"
Executing the run line from line 719 of the file at r310537 produces the following output:
dyung@Spica:~/src/upstream/llvm_clean/tools/clang/test/Driver$ /home/dyung/src/upstream/310537-linux/./bin/clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda --fopenmp-ptx=+ptx52 ~/src/upstream/llvm_clean/tools/clang/test/Driver/openmp-offload.c 2>&1 clang version 6.0.0 (trunk 310537) Target: x86_64-unknown-linux-gnu Thread model: posix InstalledDir: /home/dyung/src/upstream/310537-linux/./bin clang: error: cannot find libdevice for sm_20. Provide path to different CUDA installation via --cuda-path, or pass -nocudalib to build without linking with libdevice. "/home/dyung/src/upstream/310537-linux/./bin/clang" "-cc1" "-triple" "x86_64-unknown-linux-gnu" "-emit-llvm-bc" "-emit-llvm-uselists" "-disable-free" "-main-file-name" "openmp-offload.c" "-mrelocation-model" "static" "-mthread-model" "posix" "-mdisable-fp-elim" "-fmath-errno" "-masm-verbose" "-mconstructor-aliases" "-munwind-tables" "-fuse-init-array" "-target-cpu" "x86-64" "-dwarf-column-info" "-debugger-tuning=gdb" "-resource-dir" "/home/dyung/src/upstream/310537-linux/./lib/clang/6.0.0" "-internal-isystem" "/usr/local/include" "-internal-isystem" "/home/dyung/src/upstream/310537-linux/./lib/clang/6.0.0/include" "-internal-externc-isystem" "/usr/include/x86_64-linux-gnu" "-internal-externc-isystem" "/include" "-internal-externc-isystem" "/usr/include" "-internal-isystem" "/usr/local/include" "-internal-isystem" "/home/dyung/src/upstream/310537-linux/./lib/clang/6.0.0/include" "-internal-externc-isystem" "/usr/include/x86_64-linux-gnu" "-internal-externc-isystem" "/include" "-internal-externc-isystem" "/usr/include" "-fdebug-compilation-dir" "/home/dyung/src/upstream/llvm_clean/tools/clang/test/Driver" "-ferror-limit" "19" "-fmessage-length" "117" "-fopenmp" "-fobjc-runtime=gcc" "-fdiagnostics-show-option" "-fcolor-diagnostics" "-o" "/tmp/openmp-offload-e58520.bc" "-x" "c" "/home/dyung/src/upstream/llvm_clean/tools/clang/test/Driver/openmp-offload.c" 
"-fopenmp-targets=nvptx64-nvidia-cuda" "/home/dyung/src/upstream/310537-linux/./bin/clang" "-cc1" "-triple" "nvptx64-nvidia-cuda" "-aux-triple" "x86_64-unknown-linux-gnu" "-S" "-disable-free" "-main-file-name" "openmp-offload.c" "-mrelocation-model" "pic" "-pic-level" "2" "-mthread-model" "posix" "-mdisable-fp-elim" "-fmath-errno" "-no-integrated-as" "-fuse-init-array" "-target-cpu" "sm_20" "-dwarf-column-info" "-debugger-tuning=gdb" "-resource-dir" "/home/dyung/src/upstream/310537-linux/./lib/clang/6.0.0" "-internal-isystem" "/usr/local/include" "-internal-isystem" "/home/dyung/src/upstream/310537-linux/./lib/clang/6.0.0/include" "-internal-externc-isystem" "/usr/include/x86_64-linux-gnu" "-internal-externc-isystem" "/include" "-internal-externc-isystem" "/usr/include" "-internal-isystem" "/usr/local/include" "-internal-isystem" "/home/dyung/src/upstream/310537-linux/./lib/clang/6.0.0/include" "-internal-externc-isystem" "/usr/include/x86_64-linux-gnu" "-internal-externc-isystem" "/include" "-internal-externc-isystem" "/usr/include" "-fno-dwarf-directory-asm" "-fdebug-compilation-dir" "/home/dyung/src/upstream/llvm_clean/tools/clang/test/Driver" "-ferror-limit" "19" "-fmessage-length" "117" "-fopenmp" "-fobjc-runtime=gcc" "-fdiagnostics-show-option" "-fcolor-diagnostics" "-o" "/tmp/openmp-offload-7135c4.s" "-x" "c" "/home/dyung/src/upstream/llvm_clean/tools/clang/test/Driver/openmp-offload.c" "-fopenmp-is-device" "-fopenmp-host-ir-file-path" "/tmp/openmp-offload-e58520.bc" "ptxas" "-m64" "-O0" "--gpu-name" "sm_20" "--output-file" "/tmp/openmp-offload-424f8c.cubin" "/tmp/openmp-offload-7135c4.s" "-c" "nvlink" "-o" "/tmp/openmp-offload-041499.out" "-arch" "sm_20" "-L/home/dyung/src/upstream/310537-linux/./lib" "-lomptarget-nvptx" "openmp-offload-424f8c.cubin" "/home/dyung/src/upstream/310537-linux/./bin/clang" "-cc1" "-triple" "x86_64-unknown-linux-gnu" "-emit-obj" "-mrelax-all" "-disable-free" "-main-file-name" "openmp-offload.c" "-mrelocation-model" "static" 
"-mthread-model" "posix" "-mdisable-fp-elim" "-fmath-errno" "-masm-verbose" "-mconstructor-aliases" "-munwind-tables" "-fuse-init-array" "-target-cpu" "x86-64" "-dwarf-column-info" "-debugger-tuning=gdb" "-resource-dir" "/home/dyung/src/upstream/310537-linux/./lib/clang/6.0.0" "-fdebug-compilation-dir" "/home/dyung/src/upstream/llvm_clean/tools/clang/test/Driver" "-ferror-limit" "19" "-fmessage-length" "117" "-fopenmp" "-fobjc-runtime=gcc" "-fdiagnostics-show-option" "-fcolor-diagnostics" "-o" "/tmp/openmp-offload-430e99.o" "-x" "ir" "/tmp/openmp-offload-e58520.bc" "/usr/bin/ld" "-z" "relro" "--hash-style=gnu" "--eh-frame-hdr" "-m" "elf_x86_64" "-dynamic-linker" "/lib64/ld-linux-x86-64.so.2" "-o" "a.out" "/usr/lib/gcc/x86_64-linux-gnu/5.4.1/../../../x86_64-linux-gnu/crt1.o" "/usr/lib/gcc/x86_64-linux-gnu/5.4.1/../../../x86_64-linux-gnu/crti.o" "/usr/lib/gcc/x86_64-linux-gnu/5.4.1/crtbegin.o" "-L/usr/lib/gcc/x86_64-linux-gnu/5.4.1" "-L/usr/lib/gcc/x86_64-linux-gnu/5.4.1/../../../x86_64-linux-gnu" "-L/lib/x86_64-linux-gnu" "-L/lib/../lib64" "-L/usr/lib/x86_64-linux-gnu" "-L/usr/lib/gcc/x86_64-linux-gnu/5.4.1/../../.." "-L/home/dyung/src/upstream/310537-linux/./bin/../lib" "-L/lib" "-L/usr/lib" "/tmp/openmp-offload-430e99.o" "-lomp" "-lomptarget" "-lgcc" "--as-needed" "-lgcc_s" "--no-as-needed" "-lpthread" "-lc" "-lgcc" "--as-needed" "-lgcc_s" "--no-as-needed" "/usr/lib/gcc/x86_64-linux-gnu/5.4.1/crtend.o" "/usr/lib/gcc/x86_64-linux-gnu/5.4.1/../../../x86_64-linux-gnu/crtn.o" "-T" "/tmp/a-d7e5d0.lk"
Thanks for running the test on your machine! This is very useful.
I see what the problem is now:
"clang: error: cannot find libdevice for sm_20. Provide path to different CUDA installation via --cuda-path, or pass -nocudalib to build without linking with libdevice."
Looking into it now.
FWIW, I'm able to reproduce the failure using Docker:
Dockerfile:
FROM ubuntu:xenial
RUN apt-get update
RUN apt-get install -y build-essential ca-certificates subversion python cmake --no-install-recommends
WORKDIR /
RUN svn co -q -r 310537 http://llvm.org/svn/llvm-project/llvm/trunk llvm
RUN svn co -q -r 310537 http://llvm.org/svn/llvm-project/cfe/trunk llvm/tools/clang
RUN mkdir /build
WORKDIR /build
RUN cmake ../llvm -DCMAKE_BUILD_TYPE="Release"
$ docker build -t d29660-test . && docker run -it d29660-test /bin/bash
then inside the container: make check-clang-driver -j8
310549 should solve this problem by using a default architecture that is supported by the underlying device version.
I'm still seeing a failure after r310549: https://gist.github.com/jtbandes/de6118abaadc6c5a5c9b4223a62f596c
- I'm sorry, but I had to revert r310489 and follow-up commits r310505, r310519, r310537 and r310549 since it looks like the failures are accumulating. The revert commit was r310580. The following run lines were failing for me because of various assertion failures and file check errors:
/// ###########################################################################
/// Check cubin file generation and usage by nvlink
// RUN: %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -save-temps -no-canonical-prefixes %s 2>&1 \
// RUN: | FileCheck -check-prefix=CHK-CUBIN %s
// CHK-CUBIN: clang{{.*}}" "-o" "{{.*}}-openmp-nvptx64-nvidia-cuda.s"
// CHK-CUBIN-NEXT: ptxas{{.*}}" "--output-file" "{{.*}}-openmp-nvptx64-nvidia-cuda.cubin" "{{.*}}-openmp-nvptx64-nvidia-cuda.s"
// CHK-CUBIN-NEXT: nvlink" "-o" "{{.*}}-openmp-nvptx64-nvidia-cuda" {{.*}} "openmp-offload-openmp-nvptx64-nvidia-cuda.cubin"
/// ###########################################################################
/// Check cubin file generation and usage by nvlink when toolchain has BindArchAction
// RUN: %clang -### -no-canonical-prefixes -target x86_64-apple-darwin17.0.0 -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -save-temps -no-canonical-prefixes %s 2>&1 \
// RUN: | FileCheck -check-prefix=CHK-CUBIN-DARWIN %s
// CHK-CUBIN-DARWIN: clang{{.*}}" "-o" "{{.*}}-openmp-nvptx64-nvidia-cuda.s"
// CHK-CUBIN-DARWIN-NEXT: ptxas{{.*}}" "--output-file" "{{.*}}-openmp-nvptx64-nvidia-cuda.cubin" "{{.*}}-openmp-nvptx64-nvidia-cuda.s"
// CHK-CUBIN-DARWIN-NEXT: nvlink" "-o" "{{.*}}-openmp-nvptx64-nvidia-cuda" {{.*}} "openmp-offload-openmp-nvptx64-nvidia-cuda.cubin"
/// ###########################################################################
/// Check cubin file generation and usage by nvlink
// RUN: touch %t1.o
// RUN: touch %t2.o
// RUN: %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda %t1.o %t2.o 2>&1 \
// RUN: | FileCheck -check-prefix=CHK-TWOCUBIN %s
// CHK-TWOCUBIN: nvlink"{{.*}}"openmp-offload-{{.*}}.cubin" "openmp-offload-{{.*}}.cubin"
/// ###########################################################################
/// Check cubin file generation and usage by nvlink when toolchain has BindArchAction
// RUN: touch %t1.o
// RUN: touch %t2.o
// RUN: %clang -### -no-canonical-prefixes -target x86_64-apple-darwin17.0.0 -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda %t1.o %t2.o 2>&1 \
// RUN: | FileCheck -check-prefix=CHK-TWOCUBIN-DARWIN %s
// CHK-TWOCUBIN-DARWIN: nvlink"{{.*}}"openmp-offload-{{.*}}.cubin" "openmp-offload-{{.*}}.cubin"
/// ###########################################################################
/// Check PTXAS is passed -c flag when offloading to an NVIDIA device using OpenMP.
// RUN: %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda %s 2>&1 \
// RUN: | FileCheck -check-prefix=CHK-PTXAS-DEFAULT %s
// CHK-PTXAS-DEFAULT: ptxas{{.*}}" "-c"
/// ###########################################################################
/// PTXAS is passed -c flag by default when offloading to an NVIDIA device using OpenMP - disable it.
// RUN: %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -fnoopenmp-relocatable-target %s 2>&1 \
// RUN: | FileCheck -check-prefix=CHK-PTXAS-NORELO %s
// CHK-PTXAS-NORELO-NOT: ptxas{{.*}}" "-c"
/// ###########################################################################
/// PTXAS is passed -c flag by default when offloading to an NVIDIA device using OpenMP
/// Check that the flag is passed when -fopenmp-relocatable-target is used.
// RUN: %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -fopenmp-relocatable-target %s 2>&1 \
// RUN: | FileCheck -check-prefix=CHK-PTXAS-RELO %s
// CHK-PTXAS-RELO: ptxas{{.*}}" "-c"
/// ###########################################################################
/// Check PTXAS is passed the compute capability passed to the driver.
// RUN: %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda --fopenmp-ptx=+ptx52 %s 2>&1 \
// RUN: | FileCheck -check-prefix=CHK-PTXAS-VERSION %s
// CHK-PTXAS-VERSION: clang{{.*}}.bc" {{.*}}"-target-feature" "+ptx52"
/// ###########################################################################
/// Check PTXAS is passed the compute capability passed to the driver.
// RUN: %clang -### -no-canonical-prefixes -target x86_64-apple-darwin17.0.0 -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda --fopenmp-ptx=+ptx52 %s 2>&1 \
// RUN: | FileCheck -check-prefix=CHK-PTXAS-DARWIN-VERSION %s
// CHK-PTXAS-DARWIN-VERSION: clang{{.*}}.bc" {{.*}}"-target-feature" "+ptx52"
- I think this test is getting a bit big, which makes it harder to figure out what exactly is failing. Can you please use new test files in future patches?
- Can you please figure out why your commit emails don't make it to the cfe-commits.llvm.org mailing list? It's easier to follow the situation with the commit emails.
Let me know if you need help figuring out the failures,
Alex
While I do get the time pressure and such, doing it at the expense of others is not cool. Many teams' work processes have been broken for days, folks are busy looking into it, investigating and reverting, other breakages are masked by these failures, etc.
If your test depends on local configuration, you need to be extra careful when pushing such a fragile test. We are lucky, we have bots running continuously, but what about others who run their tests less regularly? What about those who pulled the LLVM code, built it, and are trying to run the tests before contributing a patch? What's your plan to debug those configurations?
Even after all the reverts in r310580, our tests are still failing (http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/7080). Not surprising, but very disruptive. Please have a plan to fix it soon, otherwise I'll have to revert it even further.
First of all, I apologize if I've upset you with my previous post. I am actively working on understanding what is causing these issues. It is not my intention to write tests that work on local configurations only. I am upset to see that these tests keep failing for your configuration and maybe others. Without knowing the actual reason for the failures I can only speculate about what is going wrong with them, hence the flurry of changes.
Should we have a mock CUDA installation directory in the test directory? We have a bunch of these in test/Driver/Inputs for various other things. Then we could point these tests at that directory (or directories, if we have different mocks for different CUDA versions) and remove any dependence on local CUDA configurations.
Thank you, apology accepted. That was exactly my point: not to start a fight, but to emphasize that depending on local configuration is never going to work, because you will never be able to see and test all of them. Please disable the test ASAP until a better way to handle it is determined.
The failures were very widespread, e.g. there's a linux buildbot that was red until the revert: http://bb.pgr.jp/builders/test-clang-i686-linux-RA. If you have access to a linux machine you should be able to reproduce the failures that the bot experienced by using the same cmake arguments (I don't know the exact ones, but judging from the bot you should be able to reproduce them using 32 bit release build with assertions enabled). I don't know what GPU that buildbot has.
I'll try to get the detailed test output for my local machine today as well.
@gtbercea Hi, I just saw your comment on my gist. (Unfortunately github does not send email notifications about gist comments; commenting here is probably better.) If you have Docker installed, it should be easy to get whatever output you like — just change the Dockerfile to use -DCMAKE_BUILD_TYPE=Debug, then run docker build -t llvm-test . and docker run -it llvm-test /bin/bash.
Thanks Alex, I will try to reproduce it locally.
I'll try to get the detailed test output for my local machine today as well.
Oh that would be great! Thanks a lot! :)
I've traced the output across all the reverted commits:
Note that after r310549 the last 9 RUN lines started failing because of the same crash:
clang version 6.0.0 (http://llvm.org/git/llvm.git 00708415fb45c18f9871def78647dd555c253e0b) Target: x86_64-apple-darwin17.0.0 Thread model: posix InstalledDir: /Users/alex/bisect/b/./bin no libdevice exists. UNREACHABLE executed at /Users/alex/bisect/llvm/tools/clang/lib/Driver/ToolChains/Cuda.h:88! 0 clang 0x000000010799795c llvm::sys::PrintStackTrace(llvm::raw_ostream&) + 60 1 clang 0x0000000107997f59 PrintStackTraceSignalHandler(void*) + 25 2 clang 0x0000000107993969 llvm::sys::RunSignalHandlers() + 425 3 clang 0x00000001079982e2 SignalHandler(int) + 354 4 libsystem_platform.dylib 0x00007fffc35cfefa _sigtramp + 26 5 libsystem_platform.dylib 0x00007fff5b10b6a8 _sigtramp + 2545137608 6 libsystem_c.dylib 0x00007fffc341014a abort + 127 7 clang 0x0000000107872cf0 LLVMInstallFatalErrorHandler + 0 8 clang 0x000000010856c51c clang::driver::CudaInstallationDetector::getLowestExistingArch() const + 1644 9 clang 0x000000010856acfb clang::driver::toolchains::CudaToolChain::TranslateArgs(llvm::opt::DerivedArgList const&, llvm::StringRef, clang::driver::Action::OffloadKind) const + 1291 10 clang 0x000000010843ce37 clang::driver::Compilation::getArgsForToolChain(clang::driver::ToolChain const*, llvm::StringRef, clang::driver::Action::OffloadKind) + 295 11 clang 0x00000001084768c0 clang::driver::Driver::BuildJobsForActionNoCache(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > 
>&, clang::driver::Action::OffloadKind) const + 4064 12 clang 0x0000000108475541 clang::driver::Driver::BuildJobsForAction(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 1393 13 clang 0x00000001084765e3 clang::driver::Driver::BuildJobsForActionNoCache(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 3331 14 clang 0x0000000108475541 clang::driver::Driver::BuildJobsForAction(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, 
clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 1393 15 clang 0x00000001084765e3 clang::driver::Driver::BuildJobsForActionNoCache(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 3331 16 clang 0x0000000108475541 clang::driver::Driver::BuildJobsForAction(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) 
const + 1393 17 clang 0x00000001084765e3 clang::driver::Driver::BuildJobsForActionNoCache(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 3331 18 clang 0x0000000108475541 clang::driver::Driver::BuildJobsForAction(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 1393 19 clang 0x00000001084765e3 clang::driver::Driver::BuildJobsForActionNoCache(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, 
std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 3331 20 clang 0x0000000108475541 clang::driver::Driver::BuildJobsForAction(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 1393 21 clang 0x00000001084a7589 clang::driver::Driver::BuildJobsForActionNoCache(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) 
const::$_3::operator()(clang::driver::Action*, clang::driver::ToolChain const*, char const*) const + 409 22 clang 0x00000001084a73de void llvm::function_ref<void (clang::driver::Action*, clang::driver::ToolChain const*, char const*)>::callback_fn<clang::driver::Driver::BuildJobsForActionNoCache(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const::$_3>(long, clang::driver::Action*, clang::driver::ToolChain const*, char const*) + 78 23 clang 0x0000000108439b40 llvm::function_ref<void (clang::driver::Action*, clang::driver::ToolChain const*, char const*)>::operator()(clang::driver::Action*, clang::driver::ToolChain const*, char const*) const + 96 24 clang 0x0000000108439d10 clang::driver::OffloadAction::doOnEachDeviceDependence(llvm::function_ref<void (clang::driver::Action*, clang::driver::ToolChain const*, char const*)> const&) const + 448 25 clang 0x0000000108439dd9 clang::driver::OffloadAction::doOnEachDependence(bool, llvm::function_ref<void (clang::driver::Action*, clang::driver::ToolChain const*, char const*)> const&) const + 73 26 clang 0x0000000108475b8b clang::driver::Driver::BuildJobsForActionNoCache(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, 
std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 683 27 clang 0x0000000108475541 clang::driver::Driver::BuildJobsForAction(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 1393 28 clang 0x000000010847616e clang::driver::Driver::BuildJobsForActionNoCache(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, 
std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 2190 29 clang 0x0000000108475541 clang::driver::Driver::BuildJobsForAction(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 1393 30 clang 0x0000000108462f82 clang::driver::Driver::BuildJobs(clang::driver::Compilation&) const + 1538 31 clang 0x000000010845951a clang::driver::Driver::BuildCompilation(llvm::ArrayRef<char const*>) + 8266 32 clang 0x0000000104ae9303 main + 12275 33 libdyld.dylib 0x00007fffc336c515 start + 1 34 libdyld.dylib 0x0000000000000008 start + 1019820788 Stack dump: 0. Program arguments: /Users/alex/bisect/b/./bin/clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -save-temps -no-canonical-prefixes /Users/alex/bisect/llvm/tools/clang/test/Driver/openmp-offload.c 1. Compilation construction 2. Building compilation jobs 3. Building compilation jobs 4. Building compilation jobs 5. Building compilation jobs 6. Building compilation jobs 7. Building compilation jobs 8. Building compilation jobs 9. Building compilation jobs Abort trap: 6
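The trace above aborts inside `CudaInstallationDetector::getLowestExistingArch()` right after printing "no libdevice exists", that is, on a machine where no libdevice files were found. One way to avoid aborting in that situation is to return a caller-supplied fallback instead of treating the case as unreachable. The sketch below is only an assumption about how that could look, not the fix that was committed; `ArchSet`, `lowestArchOr`, and `toCpuName` are hypothetical names, and the choice of fallback value is left to the caller.

```cpp
#include <cassert>
#include <set>
#include <string>

// Hypothetical stand-in for the compute capabilities (e.g. 35 for sm_35)
// for which a CUDA installation provides libdevice files.
using ArchSet = std::set<unsigned>;

// Returns the lowest available compute capability. When nothing was
// detected (for example, no CUDA installation on the machine), returns
// Fallback instead of aborting.
unsigned lowestArchOr(const ArchSet &Available, unsigned Fallback) {
  // std::set iterates in ascending order, so begin() is the minimum.
  return Available.empty() ? Fallback : *Available.begin();
}

// Formats a compute capability as a "-target-cpu" style name.
std::string toCpuName(unsigned Arch) { return "sm_" + std::to_string(Arch); }
```

With this shape, a driver test run on a machine without CUDA would still produce a command line to check, rather than a crash.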
I have disabled all the offloading tests except those belonging to the patch that precedes the one introducing cubin integration into the host binary.
Please let me know if you see any more failures on your side. If you do, feel free to revert all the patches up to and including D29654.
310625
Our bots still fail after this change: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/7085
I have re-enabled the previous offloading tests and moved the new GPU offloading tests to a new file, which is disabled on Linux (for now).
310718
Alex, thanks so much for the logs; they have been very useful for understanding what's going on.
Aleksey, I have since tried to build a Clang with the address sanitizer enabled, but without much success. Apart from turning on the sanitizer with the -DLLVM_USE_SANITIZER="Address" flag, is there any other flag that I need to pass to cmake?
I am trying to run this on my x86_64 MacBook running OS X 10.11. I am getting the following error when building the compiler:
[2966/4254] Linking CXX shared library lib/libc++abi.1.0.dylib
FAILED: lib/libc++abi.1.0.dylib
Undefined symbols for architecture x86_64:
"___asan_after_dynamic_init", referenced from:
__GLOBAL__sub_I_cxa_default_handlers.cpp in cxa_default_handlers.cpp.o
"___asan_before_dynamic_init", referenced from:
__GLOBAL__sub_I_cxa_default_handlers.cpp in cxa_default_handlers.cpp.o
[...]
ld: symbol(s) not found for architecture x86_64
Actually, you can run our bot yourself. The script is in zorg (http://llvm.org/git/zorg.git) at zorg/buildbot/builders/sanitizers/buildbot_fast.sh (the one I linked the last time).
Create a temp folder and from that folder run:
BUILDBOT_REVISION= BUILDBOT_CLOBBER= $PATH_YOUR_PROJECTS$/zorg/zorg/buildbot/builders/sanitizers/buildbot_fast.sh
http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/7109 failed on r310718, please fix.
I can't seem to run this script since SVN keeps resetting the connection:
svn: E000054: Error running context: Connection reset by peer
I couldn't find or fix the actual error, so for now I am just moving the flag patch's tests to openmp-offload-gpu.c, which is a disabled test.
310765
Bad news, the bot is still red: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/7114
How am I supposed to run the script you suggested?
I keep getting svn errors:
svn: E175002: Unexpected HTTP status 413 'Request Entity Too Large' on '/svn/llvm-project/!svn/vcc/default'
I think I've found the memory leak with cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_USE_SANITIZER=Address -DLLVM_ENABLE_LIBCXX=ON.
I committed a fix; let's see if the bot likes openmp-offload.c in rL310817. There is another leak for openmp-offload-gpu.c, which I will write about in D34784.
Side note: We might want to get rid of 2>&1 in all the tests so that sanitizer errors get through! That would have sped up the search...
While going through my list of reviews, I noticed that this patch was reverted because of memory leaks in other changes. However, I don't think we need it anymore, because Clang now raises the PTX level as needed for the CUDA version in use. Can we abandon this flag?
No, use CC1Args.push_back() here, or CC1Args.emplace_back(), which is even better.
Why don't you want to use the code I proposed in my previous comment?
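For readers following the inline discussion, the shape of the proposed change can be modeled outside the driver. The sketch below is only an illustration: `OffloadKind`, `FakeArgList`, and `addPtxFeature` are hypothetical stand-ins, not the real Clang driver types, and the option is keyed by its spelling rather than by `options::OPT_fopenmp_ptx_EQ`. It shows the one behavior under discussion: for OpenMP offloading, push the value of the flag if it was given and fall back to "+ptx42" otherwise; for every other offloading kind, always push "+ptx42".

```cpp
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for Action::OffloadKind; not the real Clang enum.
enum class OffloadKind { None, Cuda, OpenMP };

// Hypothetical, minimal model of a driver argument list: it maps an option
// spelling to the last value that was passed for it.
struct FakeArgList {
  std::map<std::string, std::string> Values;

  // Models the "last value or default" lookup used in the review comment:
  // returns the stored value for Opt, or Default when Opt was not passed.
  std::string getLastArgValue(const std::string &Opt,
                              const std::string &Default) const {
    auto It = Values.find(Opt);
    return It == Values.end() ? Default : It->second;
  }
};

// Appends the PTX feature string to CC1Args. Only OpenMP offloading consults
// the flag; all other kinds keep the "+ptx42" default.
void addPtxFeature(const FakeArgList &DriverArgs, OffloadKind Kind,
                   std::vector<std::string> &CC1Args) {
  if (Kind == OffloadKind::OpenMP)
    CC1Args.push_back(DriverArgs.getLastArgValue("-fopenmp-ptx=", "+ptx42"));
  else
    CC1Args.push_back("+ptx42");
}
```

Writing it this way keeps the default in the lookup call itself, so there is no separate "was the flag passed?" branch to maintain.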