This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] Add flag for overwriting default PTX version for OpenMP targets
Abandoned Public

Authored by gtbercea on Feb 7 2017, 9:11 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

gtbercea created this revision. Feb 7 2017, 9:11 AM
gtbercea updated this revision to Diff 93244. Mar 28 2017, 8:34 AM

Update patch to reflect latest source code changes.

gtbercea updated this revision to Diff 93901. Apr 3 2017, 11:43 AM

Update test.

ABataev added inline comments. Apr 5 2017, 1:10 PM
lib/Driver/ToolChains/Cuda.cpp
503–514

I don't like this kind of code. It would be better to write it like this:

if (DeviceOffloadingKind == Action::OFK_OpenMP)
  CC1Args.push_back(DriverArgs.getLastArgValue(options::OPT_fopenmp_ptx_EQ, "+ptx42"));
else
  CC1Args.push_back("+ptx42");

or something like this
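
The suggestion above relies on a "last occurrence wins, otherwise use the default" lookup. A minimal standalone sketch of that pattern follows. It assumes a hypothetical `lastArgValue` helper over a plain vector of strings and a hypothetical `ptxFeatureArgs` wrapper; neither is the real `llvm::opt::ArgList` API, and only the flag spelling `--fopenmp-ptx=` and the default `+ptx42` are taken from this thread.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for an option lookup: returns the value of the last
// argument that starts with `prefix`, or `fallback` if no such argument exists.
// This only models the behaviour discussed in the review.
std::string lastArgValue(const std::vector<std::string> &args,
                         const std::string &prefix,
                         const std::string &fallback) {
  std::string result = fallback;
  for (const std::string &arg : args) {
    // compare() on the first prefix.size() characters acts as "starts with".
    if (arg.compare(0, prefix.size(), prefix) == 0)
      result = arg.substr(prefix.size());
  }
  return result;
}

// Hypothetical wrapper: OpenMP offloading honours the override, every other
// offload kind keeps the fixed default.
std::vector<std::string> ptxFeatureArgs(const std::vector<std::string> &driverArgs,
                                        bool isOpenMPOffload) {
  std::vector<std::string> cc1Args;
  cc1Args.push_back("-target-feature");
  if (isOpenMPOffload)
    cc1Args.push_back(lastArgValue(driverArgs, "--fopenmp-ptx=", "+ptx42"));
  else
    cc1Args.push_back("+ptx42");
  return cc1Args;
}
```

Keeping the default in one branch-free lookup call is what removes the need for a separate "was the flag given?" check in the caller.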

gtbercea updated this revision to Diff 94824. Apr 11 2017, 8:18 AM

Integrate review.

gtbercea marked an inline comment as done. Apr 11 2017, 8:18 AM
ABataev added inline comments. Apr 11 2017, 8:32 AM
lib/Driver/ToolChains/Cuda.cpp
507

No, use CC1Args.push_back() here, or CC1Args.emplace_back(), which is even better.
Why don't you want to use the code I proposed in my previous comment?

gtbercea updated this revision to Diff 94841. Apr 11 2017, 9:18 AM

Refactor.

gtbercea updated this revision to Diff 94842. Apr 11 2017, 9:24 AM

Run clang format.

gtbercea marked an inline comment as done. Apr 11 2017, 9:24 AM
This revision is now accepted and ready to land. Apr 11 2017, 11:09 AM
gtbercea closed this revision. Aug 9 2017, 8:57 AM

Looks like this test is failing on macOS again after this change:

http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/39231/testReport/Clang/Driver/openmp_offload_c/

Can you please take a look?

Looking into it now.

Revision 310505 fixes the tests for this patch.

Even after r310505, openmp-offload.c continues to haunt our bots, for example http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/2012. Can you please fix this test?

gtbercea added a comment. Edited Aug 9 2017, 1:51 PM

Preparing a fix now: 310519

r310519 did not fix the problem, see http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/7062. I would suggest reverting and fixing it properly; our bots have been broken for a few days already.

gtbercea added a comment. Edited Aug 9 2017, 4:50 PM

I've removed that test. Let's see if the other two tests pass or not. (310537)

I can't reproduce the error locally so it's hard to figure out what's failing.

If you have a machine with that configuration and can run the command I would appreciate seeing the output of the failing command. That way I know what the driver is doing on your machine.

dyung added a subscriber: dyung. Aug 9 2017, 6:58 PM

Hi, we are seeing this test fail on our internal linux build bot. I built/ran your latest change r310537 and here is the test result:

/home/dyung/src/upstream/llvm_clean/tools/clang/test/Driver/openmp-offload.c:722:23: error: expected string not found in input                                                                                                            
// CHK-PTXAS-VERSION: clang{{.*}}.bc" {{.*}}"-target-feature" "+ptx52"                                               
                      ^                                                                                              
<stdin>:1:1: note: scanning from here                                                                                
clang version 6.0.0 (trunk 310537)                                                                                   
^                                                                                                                    
<stdin>:9:114: note: possible intended match here                                                                    
 "nvlink" "-o" "/tmp/lit_tmp_FMSP4Q/openmp-offload-bb8c5f.out" "-arch" "sm_20" "-L/home/dyung/src/upstream/310537-linux/./lib" "-lomptarget-nvptx" "openmp-offload-74c18d.cubin"

Executing the run line from line 719 of the file at r310537 produces the following output:

dyung@Spica:~/src/upstream/llvm_clean/tools/clang/test/Driver$ /home/dyung/src/upstream/310537-linux/./bin/clang  -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda --fopenmp-ptx=+ptx52 ~/src/upstream/llvm_clean/tools/clang/test/Driver/openmp-offload.c 2>&1                                                                
clang version 6.0.0 (trunk 310537)                                                                                   
Target: x86_64-unknown-linux-gnu                                                                                     
Thread model: posix                                                                                                  
InstalledDir: /home/dyung/src/upstream/310537-linux/./bin                                                            
clang: error: cannot find libdevice for sm_20. Provide path to different CUDA installation via --cuda-path, or pass -nocudalib to build without linking with libdevice.                                                                   

 "/home/dyung/src/upstream/310537-linux/./bin/clang" "-cc1" "-triple" "x86_64-unknown-linux-gnu" "-emit-llvm-bc" "-emit-llvm-uselists" "-disable-free" "-main-file-name" "openmp-offload.c" "-mrelocation-model" "static" "-mthread-model" "posix" "-mdisable-fp-elim" "-fmath-errno" "-masm-verbose" "-mconstructor-aliases" "-munwind-tables" "-fuse-init-array" "-target-cpu" "x86-64" "-dwarf-column-info" "-debugger-tuning=gdb" "-resource-dir" "/home/dyung/src/upstream/310537-linux/./lib/clang/6.0.0" "-internal-isystem" "/usr/local/include" "-internal-isystem" "/home/dyung/src/upstream/310537-linux/./lib/clang/6.0.0/include" "-internal-externc-isystem" "/usr/include/x86_64-linux-gnu" "-internal-externc-isystem" "/include" "-internal-externc-isystem" "/usr/include" "-internal-isystem" "/usr/local/include" "-internal-isystem" "/home/dyung/src/upstream/310537-linux/./lib/clang/6.0.0/include" "-internal-externc-isystem" "/usr/include/x86_64-linux-gnu" "-internal-externc-isystem" "/include" "-internal-externc-isystem" "/usr/include" "-fdebug-compilation-dir" "/home/dyung/src/upstream/llvm_clean/tools/clang/test/Driver" "-ferror-limit" "19" "-fmessage-length" "117" "-fopenmp" "-fobjc-runtime=gcc" "-fdiagnostics-show-option" "-fcolor-diagnostics" "-o" "/tmp/openmp-offload-e58520.bc" "-x" "c" "/home/dyung/src/upstream/llvm_clean/tools/clang/test/Driver/openmp-offload.c" "-fopenmp-targets=nvptx64-nvidia-cuda"                                                                                                            

 "/home/dyung/src/upstream/310537-linux/./bin/clang" "-cc1" "-triple" "nvptx64-nvidia-cuda" "-aux-triple" "x86_64-unknown-linux-gnu" "-S" "-disable-free" "-main-file-name" "openmp-offload.c" "-mrelocation-model" "pic" "-pic-level" "2" "-mthread-model" "posix" "-mdisable-fp-elim" "-fmath-errno" "-no-integrated-as" "-fuse-init-array" "-target-cpu" "sm_20" "-dwarf-column-info" "-debugger-tuning=gdb" "-resource-dir" "/home/dyung/src/upstream/310537-linux/./lib/clang/6.0.0" "-internal-isystem" "/usr/local/include" "-internal-isystem" "/home/dyung/src/upstream/310537-linux/./lib/clang/6.0.0/include" "-internal-externc-isystem" "/usr/include/x86_64-linux-gnu" "-internal-externc-isystem" "/include" "-internal-externc-isystem" "/usr/include" "-internal-isystem" "/usr/local/include" "-internal-isystem" "/home/dyung/src/upstream/310537-linux/./lib/clang/6.0.0/include" "-internal-externc-isystem" "/usr/include/x86_64-linux-gnu" "-internal-externc-isystem" "/include" "-internal-externc-isystem" "/usr/include" "-fno-dwarf-directory-asm" "-fdebug-compilation-dir" "/home/dyung/src/upstream/llvm_clean/tools/clang/test/Driver" "-ferror-limit" "19" "-fmessage-length" "117" "-fopenmp" "-fobjc-runtime=gcc" "-fdiagnostics-show-option" "-fcolor-diagnostics" "-o" "/tmp/openmp-offload-7135c4.s" "-x" "c" "/home/dyung/src/upstream/llvm_clean/tools/clang/test/Driver/openmp-offload.c" "-fopenmp-is-device" "-fopenmp-host-ir-file-path" "/tmp/openmp-offload-e58520.bc"                                                             

 "ptxas" "-m64" "-O0" "--gpu-name" "sm_20" "--output-file" "/tmp/openmp-offload-424f8c.cubin" "/tmp/openmp-offload-7135c4.s" "-c"                                                                                                         

 "nvlink" "-o" "/tmp/openmp-offload-041499.out" "-arch" "sm_20" "-L/home/dyung/src/upstream/310537-linux/./lib" "-lomptarget-nvptx" "openmp-offload-424f8c.cubin"                                                                         

 "/home/dyung/src/upstream/310537-linux/./bin/clang" "-cc1" "-triple" "x86_64-unknown-linux-gnu" "-emit-obj" "-mrelax-all" "-disable-free" "-main-file-name" "openmp-offload.c" "-mrelocation-model" "static" "-mthread-model" "posix" "-mdisable-fp-elim" "-fmath-errno" "-masm-verbose" "-mconstructor-aliases" "-munwind-tables" "-fuse-init-array" "-target-cpu" "x86-64" "-dwarf-column-info" "-debugger-tuning=gdb" "-resource-dir" "/home/dyung/src/upstream/310537-linux/./lib/clang/6.0.0" "-fdebug-compilation-dir" "/home/dyung/src/upstream/llvm_clean/tools/clang/test/Driver" "-ferror-limit" "19" "-fmessage-length" "117" "-fopenmp" "-fobjc-runtime=gcc" "-fdiagnostics-show-option" "-fcolor-diagnostics" "-o" "/tmp/openmp-offload-430e99.o" "-x" "ir" "/tmp/openmp-offload-e58520.bc"                                          

 "/usr/bin/ld" "-z" "relro" "--hash-style=gnu" "--eh-frame-hdr" "-m" "elf_x86_64" "-dynamic-linker" "/lib64/ld-linux-x86-64.so.2" "-o" "a.out" "/usr/lib/gcc/x86_64-linux-gnu/5.4.1/../../../x86_64-linux-gnu/crt1.o" "/usr/lib/gcc/x86_64-linux-gnu/5.4.1/../../../x86_64-linux-gnu/crti.o" "/usr/lib/gcc/x86_64-linux-gnu/5.4.1/crtbegin.o" "-L/usr/lib/gcc/x86_64-linux-gnu/5.4.1" "-L/usr/lib/gcc/x86_64-linux-gnu/5.4.1/../../../x86_64-linux-gnu" "-L/lib/x86_64-linux-gnu" "-L/lib/../lib64" "-L/usr/lib/x86_64-linux-gnu" "-L/usr/lib/gcc/x86_64-linux-gnu/5.4.1/../../.." "-L/home/dyung/src/upstream/310537-linux/./bin/../lib" "-L/lib" "-L/usr/lib" "/tmp/openmp-offload-430e99.o" "-lomp" "-lomptarget" "-lgcc" "--as-needed" "-lgcc_s" "--no-as-needed" "-lpthread" "-lc" "-lgcc" "--as-needed" "-lgcc_s" "--no-as-needed" "/usr/lib/gcc/x86_64-linux-gnu/5.4.1/crtend.o" "/usr/lib/gcc/x86_64-linux-gnu/5.4.1/../../../x86_64-linux-gnu/crtn.o" "-T" "/tmp/a-d7e5d0.lk"

Thanks for running the test on your machine! This is very useful.

I see what the problem is now:

"clang: error: cannot find libdevice for sm_20. Provide path to different CUDA installation via --cuda-path, or pass -nocudalib to build without linking with libdevice."

Looking into it now.

FWIW, I'm able to reproduce the failure using Docker:

Dockerfile:

FROM ubuntu:xenial
RUN apt-get update
RUN apt-get install -y build-essential ca-certificates subversion python cmake --no-install-recommends

WORKDIR /
RUN svn co -q -r 310537 http://llvm.org/svn/llvm-project/llvm/trunk llvm
RUN svn co -q -r 310537 http://llvm.org/svn/llvm-project/cfe/trunk llvm/tools/clang

RUN mkdir /build
WORKDIR /build

RUN cmake ../llvm -DCMAKE_BUILD_TYPE="Release"

$ docker build -t d29660-test . && docker run -it d29660-test /bin/bash

Then, inside the container: make check-clang-driver -j8

gtbercea added a comment. Edited Aug 9 2017, 10:04 PM

310549 should solve this problem by using a default architecture that is supported by the underlying device version.
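
As an illustration of the idea of defaulting to an architecture that the detected installation actually supports, here is a minimal sketch. It is not the code from r310549: `LibDeviceMap` and `lowestExistingArch` are hypothetical names used only for this example, and the map keyed by compute capability is an assumed model of what the installation detector found. The point of the sketch is that an installation with no libdevice files yields an empty result the caller can diagnose.

```cpp
#include <map>
#include <optional>
#include <string>

// Hypothetical model of a detected CUDA installation: compute capability
// (e.g. 35 for sm_35) mapped to the libdevice file found for it.
using LibDeviceMap = std::map<unsigned, std::string>;

// Returns the lowest arch that has a libdevice file, or std::nullopt when the
// installation provides none, so the caller can emit a diagnostic instead of
// aborting.
std::optional<std::string> lowestExistingArch(const LibDeviceMap &libDevices) {
  if (libDevices.empty())
    return std::nullopt;
  // std::map iterates in ascending key order, so begin() is the lowest arch.
  return "sm_" + std::to_string(libDevices.begin()->first);
}
```

A caller would fall back to a fixed default, or report an error, when the returned optional is empty.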

  1. I'm sorry, but I had to revert r310489 and follow-up commits r310505, r310519, r310537 and r310549 since it looks like the failures are accumulating. The revert commit was r310580. The following run lines were failing for me because of various assertion failures and file check errors:
/// ###########################################################################

/// Check cubin file generation and usage by nvlink
// RUN:   %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -save-temps -no-canonical-prefixes %s 2>&1 \
// RUN:   | FileCheck -check-prefix=CHK-CUBIN %s

// CHK-CUBIN: clang{{.*}}" "-o" "{{.*}}-openmp-nvptx64-nvidia-cuda.s"
// CHK-CUBIN-NEXT: ptxas{{.*}}" "--output-file" "{{.*}}-openmp-nvptx64-nvidia-cuda.cubin" "{{.*}}-openmp-nvptx64-nvidia-cuda.s"
// CHK-CUBIN-NEXT: nvlink" "-o" "{{.*}}-openmp-nvptx64-nvidia-cuda" {{.*}} "openmp-offload-openmp-nvptx64-nvidia-cuda.cubin"

/// ###########################################################################

/// Check cubin file generation and usage by nvlink when toolchain has BindArchAction
// RUN:   %clang -### -no-canonical-prefixes -target x86_64-apple-darwin17.0.0 -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -save-temps -no-canonical-prefixes %s 2>&1 \
// RUN:   | FileCheck -check-prefix=CHK-CUBIN-DARWIN %s

// CHK-CUBIN-DARWIN: clang{{.*}}" "-o" "{{.*}}-openmp-nvptx64-nvidia-cuda.s"
// CHK-CUBIN-DARWIN-NEXT: ptxas{{.*}}" "--output-file" "{{.*}}-openmp-nvptx64-nvidia-cuda.cubin" "{{.*}}-openmp-nvptx64-nvidia-cuda.s"
// CHK-CUBIN-DARWIN-NEXT: nvlink" "-o" "{{.*}}-openmp-nvptx64-nvidia-cuda" {{.*}} "openmp-offload-openmp-nvptx64-nvidia-cuda.cubin"

/// ###########################################################################

/// Check cubin file generation and usage by nvlink
// RUN:   touch %t1.o
// RUN:   touch %t2.o
// RUN:   %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda %t1.o %t2.o 2>&1 \
// RUN:   | FileCheck -check-prefix=CHK-TWOCUBIN %s

// CHK-TWOCUBIN: nvlink"{{.*}}"openmp-offload-{{.*}}.cubin" "openmp-offload-{{.*}}.cubin"

/// ###########################################################################

/// Check cubin file generation and usage by nvlink when toolchain has BindArchAction
// RUN:   touch %t1.o
// RUN:   touch %t2.o
// RUN:   %clang -### -no-canonical-prefixes -target x86_64-apple-darwin17.0.0 -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda %t1.o %t2.o 2>&1 \
// RUN:   | FileCheck -check-prefix=CHK-TWOCUBIN-DARWIN %s

// CHK-TWOCUBIN-DARWIN: nvlink"{{.*}}"openmp-offload-{{.*}}.cubin" "openmp-offload-{{.*}}.cubin"

/// ###########################################################################

/// Check PTXAS is passed -c flag when offloading to an NVIDIA device using OpenMP.
// RUN:   %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda %s 2>&1 \
// RUN:   | FileCheck -check-prefix=CHK-PTXAS-DEFAULT %s

// CHK-PTXAS-DEFAULT: ptxas{{.*}}" "-c"

/// ###########################################################################

/// PTXAS is passed -c flag by default when offloading to an NVIDIA device using OpenMP - disable it.
// RUN:   %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -fnoopenmp-relocatable-target %s 2>&1 \
// RUN:   | FileCheck -check-prefix=CHK-PTXAS-NORELO %s

// CHK-PTXAS-NORELO-NOT: ptxas{{.*}}" "-c"

/// ###########################################################################

/// PTXAS is passed -c flag by default when offloading to an NVIDIA device using OpenMP
/// Check that the flag is passed when -fopenmp-relocatable-target is used.
// RUN:   %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -fopenmp-relocatable-target %s 2>&1 \
// RUN:   | FileCheck -check-prefix=CHK-PTXAS-RELO %s

// CHK-PTXAS-RELO: ptxas{{.*}}" "-c"

/// ###########################################################################

/// Check PTXAS is passed the compute capability passed to the driver.
// RUN:   %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda --fopenmp-ptx=+ptx52 %s 2>&1 \
// RUN:   | FileCheck -check-prefix=CHK-PTXAS-VERSION %s

// CHK-PTXAS-VERSION: clang{{.*}}.bc" {{.*}}"-target-feature" "+ptx52"

/// ###########################################################################

/// Check PTXAS is passed the compute capability passed to the driver.
// RUN:   %clang -### -no-canonical-prefixes -target x86_64-apple-darwin17.0.0 -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda --fopenmp-ptx=+ptx52 %s 2>&1 \
// RUN:   | FileCheck -check-prefix=CHK-PTXAS-DARWIN-VERSION %s

// CHK-PTXAS-DARWIN-VERSION: clang{{.*}}.bc" {{.*}}"-target-feature" "+ptx52"
  2. I think that this test is starting to get a bit big, which makes it harder to figure out what exactly is failing. Can you please use new test files in future patches?
  3. Can you please figure out why your commit emails don't make it to the cfe-commits.llvm.org mailing list? It's easier to follow the situation with the commit emails.

Let me know if you need help figuring out the failures,
Alex

This comment was removed by gtbercea.

While I do get the time pressure and such, doing it at the expense of others is not cool. Many teams' work processes have been broken for days, folks are busy looking into it, investigating and reverting, other breakages are masked by these failures, etc.

If your test depends on local configuration, you need to be extra careful when pushing such a fragile test. We are lucky, we have bots running continuously, but what about others who run their tests less regularly? What about those who pulled the LLVM code, built it, and are trying to run the tests before contributing a patch? What's your plan to debug those configurations?

Even after all the reverts in r310580, our tests are still failing (http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/7080). Not surprising, but very disruptive. Please have a plan to fix it soon, otherwise I'll have to revert even further.

First of all, I apologize if I've upset you with my previous post. I am actively working on understanding what is causing these issues. It is not my intention to write tests that work on local configurations only. I am upset to see that these tests keep failing for your configuration and maybe others. Without knowing the actual reason for the failures, I can only speculate about what is going wrong with them, hence the flurry of changes.

hfinkel edited edge metadata. Aug 10 2017, 8:48 AM

Should we have a mock CUDA installation directory in the test directory? We have a bunch of these in test/Driver/Inputs for various other things. Then we could point these tests at that directory (or directories, if we have different mocks for different CUDA versions) and remove any dependence on local CUDA configurations.
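
To make the mock-installation idea concrete, here is a rough standalone sketch. Everything in it is an assumption for illustration: the helper names `createMockCudaTree` and `looksLikeCudaInstall` are hypothetical, and the directory layout (bin, include, nvvm/libdevice with one empty libdevice file) is a guess at what a detector might look for, not a statement of what the Clang driver requires. It shows how a test could depend on a tree it controls rather than on whatever CUDA is installed locally.

```cpp
#include <filesystem>
#include <fstream>
#include <string>

namespace fs = std::filesystem;

// Creates a minimal fake CUDA tree under `root`. The layout is an assumption
// chosen for this sketch.
void createMockCudaTree(const fs::path &root, const std::string &libdeviceName) {
  fs::create_directories(root / "bin");
  fs::create_directories(root / "include");
  fs::create_directories(root / "nvvm" / "libdevice");
  // An empty placeholder file is enough for an existence check.
  std::ofstream placeholder(root / "nvvm" / "libdevice" / libdeviceName);
}

// Hypothetical detector: accepts a directory only if the three subdirectories
// exist and the libdevice directory contains at least one entry.
bool looksLikeCudaInstall(const fs::path &root) {
  const fs::path libdevice = root / "nvvm" / "libdevice";
  if (!fs::is_directory(root / "bin") || !fs::is_directory(root / "include") ||
      !fs::is_directory(libdevice))
    return false;
  return !fs::is_empty(libdevice);
}
```

With a tree like this checked into the test inputs, the same detection result is obtained on every machine, which is the property the failing bots were missing.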

First of all, I apologize if I've upset you with my previous post. I am actively working on understanding what is causing these issues. It is not my intention to write tests that work on local configurations only. I am upset to see that these tests keep failing for your configuration and maybe others. Without knowing the actual reason for the failures, I can only speculate about what is going wrong with them, hence the flurry of changes.

Thank you, apology accepted. That was exactly my point: not to start a fight, but to emphasize that depending on local configuration is never going to work, since you will never be able to see and test all of them. Please disable the test ASAP until a better way to handle it is determined.

The failures were very widespread, e.g. there's a linux buildbot that was red until the revert: http://bb.pgr.jp/builders/test-clang-i686-linux-RA. If you have access to a linux machine you should be able to reproduce the failures that the bot experienced by using the same cmake arguments (I don't know the exact ones, but judging from the bot you should be able to reproduce them using 32 bit release build with assertions enabled). I don't know what GPU that buildbot has.

I'll try to get the detailed test output for my local machine today as well.

@gtbercea Hi, I just saw your comment on my gist. (Unfortunately github does not send email notifications about gist comments; commenting here is probably better.) If you have Docker installed, it should be easy to get whatever output you like — just change the Dockerfile to use -DCMAKE_BUILD_TYPE=Debug, then run docker build -t llvm-test . and docker run -it llvm-test /bin/bash.

The failures were very widespread, e.g. there's a linux buildbot that was red until the revert: http://bb.pgr.jp/builders/test-clang-i686-linux-RA. If you have access to a linux machine you should be able to reproduce the failures that the bot experienced by using the same cmake arguments (I don't know the exact ones, but judging from the bot you should be able to reproduce them using 32 bit release build with assertions enabled). I don't know what GPU that buildbot has.

Thanks Alex, I will try to reproduce it locally.

I'll try to get the detailed test output for my local machine today as well.

Oh that would be great! Thanks a lot! :)

I've traced the output across all the reverted commits:

Note that after r310549 the last 9 RUN lines started failing because of the same crash:

clang version 6.0.0  (http://llvm.org/git/llvm.git 00708415fb45c18f9871def78647dd555c253e0b)
Target: x86_64-apple-darwin17.0.0
Thread model: posix
InstalledDir: /Users/alex/bisect/b/./bin
no libdevice exists.
UNREACHABLE executed at /Users/alex/bisect/llvm/tools/clang/lib/Driver/ToolChains/Cuda.h:88!
0  clang                    0x000000010799795c llvm::sys::PrintStackTrace(llvm::raw_ostream&) + 60
1  clang                    0x0000000107997f59 PrintStackTraceSignalHandler(void*) + 25
2  clang                    0x0000000107993969 llvm::sys::RunSignalHandlers() + 425
3  clang                    0x00000001079982e2 SignalHandler(int) + 354
4  libsystem_platform.dylib 0x00007fffc35cfefa _sigtramp + 26
5  libsystem_platform.dylib 0x00007fff5b10b6a8 _sigtramp + 2545137608
6  libsystem_c.dylib        0x00007fffc341014a abort + 127
7  clang                    0x0000000107872cf0 LLVMInstallFatalErrorHandler + 0
8  clang                    0x000000010856c51c clang::driver::CudaInstallationDetector::getLowestExistingArch() const + 1644
9  clang                    0x000000010856acfb clang::driver::toolchains::CudaToolChain::TranslateArgs(llvm::opt::DerivedArgList const&, llvm::StringRef, clang::driver::Action::OffloadKind) const + 1291
10 clang                    0x000000010843ce37 clang::driver::Compilation::getArgsForToolChain(clang::driver::ToolChain const*, llvm::StringRef, clang::driver::Action::OffloadKind) + 295
11 clang                    0x00000001084768c0 clang::driver::Driver::BuildJobsForActionNoCache(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 4064
12 clang                    0x0000000108475541 clang::driver::Driver::BuildJobsForAction(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 1393
13 clang                    0x00000001084765e3 clang::driver::Driver::BuildJobsForActionNoCache(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 3331
14 clang                    0x0000000108475541 clang::driver::Driver::BuildJobsForAction(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 1393
15 clang                    0x00000001084765e3 clang::driver::Driver::BuildJobsForActionNoCache(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 3331
16 clang                    0x0000000108475541 clang::driver::Driver::BuildJobsForAction(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 1393
17 clang                    0x00000001084765e3 clang::driver::Driver::BuildJobsForActionNoCache(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 3331
18 clang                    0x0000000108475541 clang::driver::Driver::BuildJobsForAction(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 1393
19 clang                    0x00000001084765e3 clang::driver::Driver::BuildJobsForActionNoCache(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 3331
20 clang                    0x0000000108475541 clang::driver::Driver::BuildJobsForAction(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 1393
21 clang                    0x00000001084a7589 clang::driver::Driver::BuildJobsForActionNoCache(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const::$_3::operator()(clang::driver::Action*, clang::driver::ToolChain const*, char const*) const + 409
22 clang                    0x00000001084a73de void llvm::function_ref<void (clang::driver::Action*, clang::driver::ToolChain const*, char const*)>::callback_fn<clang::driver::Driver::BuildJobsForActionNoCache(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const::$_3>(long, clang::driver::Action*, clang::driver::ToolChain const*, char const*) + 78
23 clang                    0x0000000108439b40 llvm::function_ref<void (clang::driver::Action*, clang::driver::ToolChain const*, char const*)>::operator()(clang::driver::Action*, clang::driver::ToolChain const*, char const*) const + 96
24 clang                    0x0000000108439d10 clang::driver::OffloadAction::doOnEachDeviceDependence(llvm::function_ref<void (clang::driver::Action*, clang::driver::ToolChain const*, char const*)> const&) const + 448
25 clang                    0x0000000108439dd9 clang::driver::OffloadAction::doOnEachDependence(bool, llvm::function_ref<void (clang::driver::Action*, clang::driver::ToolChain const*, char const*)> const&) const + 73
26 clang                    0x0000000108475b8b clang::driver::Driver::BuildJobsForActionNoCache(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 683
27 clang                    0x0000000108475541 clang::driver::Driver::BuildJobsForAction(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 1393
28 clang                    0x000000010847616e clang::driver::Driver::BuildJobsForActionNoCache(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 2190
29 clang                    0x0000000108475541 clang::driver::Driver::BuildJobsForAction(clang::driver::Compilation&, clang::driver::Action const*, clang::driver::ToolChain const*, llvm::StringRef, bool, bool, char const*, std::__1::map<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<std::__1::pair<clang::driver::Action const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > const, clang::driver::InputInfo> > >&, clang::driver::Action::OffloadKind) const + 1393
30 clang                    0x0000000108462f82 clang::driver::Driver::BuildJobs(clang::driver::Compilation&) const + 1538
31 clang                    0x000000010845951a clang::driver::Driver::BuildCompilation(llvm::ArrayRef<char const*>) + 8266
32 clang                    0x0000000104ae9303 main + 12275
33 libdyld.dylib            0x00007fffc336c515 start + 1
34 libdyld.dylib            0x0000000000000008 start + 1019820788
Stack dump:
0.    Program arguments: /Users/alex/bisect/b/./bin/clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -save-temps -no-canonical-prefixes /Users/alex/bisect/llvm/tools/clang/test/Driver/openmp-offload.c
1.    Compilation construction
2.    Building compilation jobs
3.    Building compilation jobs
4.    Building compilation jobs
5.    Building compilation jobs
6.    Building compilation jobs
7.    Building compilation jobs
8.    Building compilation jobs
9.    Building compilation jobs
Abort trap: 6

First of all, I apologize if I've upset you with my previous post. I am actively working on understanding what is causing these issues. It is not my intention to write tests that work on local configurations only. I am upset to see that these tests keep failing for your configuration and possibly others. Without knowing the actual reason for the failures, I can only speculate about what is going wrong, hence the flurry of changes.

Thank you, apology accepted. That was exactly my point: not to start a fight, but to emphasize that depending on local configuration is never going to work, because you will never be able to see and test all of them. Please disable the test ASAP until a better way to handle it is determined.

I have disabled all the offloading tests apart from those belonging to the patch preceding the one that introduced Cubin integration into the host binary.
Please let me know if you see any more failures on your side. If you do, feel free to revert all the patches up to and including D29654.

310625

The failures were very widespread; for example, there's a Linux buildbot that was red until the revert: http://bb.pgr.jp/builders/test-clang-i686-linux-RA. If you have access to a Linux machine, you should be able to reproduce the failures that the bot experienced by using the same cmake arguments (I don't know the exact ones, but judging from the bot, a 32-bit release build with assertions enabled should reproduce them). I don't know what GPU that buildbot has.

Thanks Alex, I will try to reproduce it locally.

I'll try to get the detailed test output for my local machine today as well.

Oh that would be great! Thanks a lot! :)

Our bots still fail after this change: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/7085

r310640 disables this test on Linux

I have re-enabled the previous offloading tests and moved the new GPU offloading tests to a new file, which is disabled on Linux (for now).

310718

Alex, thanks so much for the logs; they have been very useful for understanding what's going on.

Aleksey, I have since tried to build Clang with the address sanitizer enabled, but without much success. Apart from turning on the sanitizer in cmake with the -DLLVM_USE_SANITIZER="Address" flag, is there any other flag that I need to pass to cmake?
I am trying to run this on my x86_64 MacBook with OS X 10.11. I am getting the following error when building the compiler:

[2966/4254] Linking CXX shared library lib/libc++abi.1.0.dylib
FAILED: lib/libc++abi.1.0.dylib
Undefined symbols for architecture x86_64:

"___asan_after_dynamic_init", referenced from:
    __GLOBAL__sub_I_cxa_default_handlers.cpp in cxa_default_handlers.cpp.o
"___asan_before_dynamic_init", referenced from:
    __GLOBAL__sub_I_cxa_default_handlers.cpp in cxa_default_handlers.cpp.o

[...]
ld: symbol(s) not found for architecture x86_64

alekseyshl added a comment.EditedAug 11 2017, 10:02 AM

Actually, you can run our bot, it is in zorg (http://llvm.org/git/zorg.git), zorg/buildbot/builders/sanitizers/buildbot_fast.sh (the one I linked the last time).

Create a temp folder and from that folder run:
BUILDBOT_REVISION= BUILDBOT_CLOBBER= $PATH_YOUR_PROJECTS$/zorg/zorg/buildbot/builders/sanitizers/buildbot_fast.sh

I can't seem to run this script since SVN keeps resetting the connection:

svn: E000054: Error running context: Connection reset by peer

gtbercea added a comment.EditedAug 11 2017, 2:19 PM

I couldn't find or fix the actual error, so for now I am just moving the flag patch tests to openmp-offload-gpu.c, which is a disabled test.

310765

Bad news, the bot is still red: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/7114

Disabled openmp-offload.c on Linux again: https://reviews.llvm.org/rL310772

How am I supposed to run the script you suggested?

I keep getting svn errors:

svn: E175002: Unexpected HTTP status 413 'Request Entity Too Large' on '/svn/llvm-project/!svn/vcc/default'

Hahnfeld edited edge metadata.Aug 14 2017, 12:53 AM

I think I've found the memory leak with cmake -DCMAKE_BUILD_TYPE=Release -DLLVM_ENABLE_ASSERTIONS=ON -DLLVM_USE_SANITIZER=Address -DLLVM_ENABLE_LIBCXX=ON.
I committed a fix; let's see if the bot likes openmp-offload.c in rL310817. There is another leak for openmp-offload-gpu.c, which I will write about in D34784.

Side note: We might want to get rid of 2>&1 in all the tests so that sanitizer errors get through! That would have sped up the search...
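
One way to read the side note above: when a test pipes a command into a checker with 2>&1, anything the command writes to stderr becomes checker input and no longer shows up in the test log. The sketch below illustrates only that shell behavior. The names fake_tool and check are hypothetical stand-ins (not clang or FileCheck), and the error line is a mock message, not real sanitizer output.

```shell
#!/bin/sh
# Hypothetical stand-in for a sanitized tool: normal output goes to stdout,
# a mock sanitizer-style report goes to stderr.
fake_tool() {
  echo "expected-driver-output"
  echo "ERROR: LeakSanitizer: detected memory leaks" >&2
}

# Hypothetical stand-in for a checker: reads all of its input and reports
# only through its exit status whether the expected line was present.
check() {
  grep "expected-driver-output" >/dev/null
}

# With 2>&1, the report is fed into the checker and never reaches the log.
merged_log=$( { fake_tool 2>&1 | check; } 2>&1 )

# Without 2>&1, the report bypasses the pipe and lands in the captured log.
split_log=$( { fake_tool | check; } 2>&1 )

printf 'merged: [%s]\n' "$merged_log"
printf 'split:  [%s]\n' "$split_log"
```

In the merged case the captured log is empty, while in the split case it contains the mock report. Tests that check diagnostics on stderr would still need the redirection, so dropping it is a per-test decision.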

This revision is now accepted and ready to land.Sep 26 2017, 12:45 PM
mkuron added a subscriber: mkuron.Nov 3 2017, 9:46 AM
mkuron removed a subscriber: mkuron.

Going through my list of reviews, this patch was reverted because of memory leaks in other changes. However, I don't think we need this anymore because Clang is raising the PTX level as needed for that CUDA version. Can we abandon this flag?

gtbercea abandoned this revision.Oct 1 2018, 7:21 AM

You are correct. I'll close this.