Page MenuHomePhabricator

[OpenMP] Integrate OpenMP target region cubin into host binary
ClosedPublic

Authored by gtbercea on Feb 7 2017, 8:27 AM.

Event Timeline

gtbercea created this revision.Feb 7 2017, 8:27 AM
gtbercea added a subscriber: cfe-commits.
gtbercea updated this revision to Diff 93192.Mar 27 2017, 4:04 PM

Update patch to reflect latest source code changes.

jlebar edited edge metadata.Mar 27 2017, 4:05 PM

With your latest set of updates, I am not sure which, if any, patches you need me to take another look at. Unfortunately I don't have a ton of time for CUDA stuff these days, so where possible I'd prefer to shunt reviews over to hfinkel or chandlerc. But do let me know.

Hahnfeld edited edge metadata.
Hahnfeld added a subscriber: rnk.

Please format all comments as full sentences.

lib/Driver/ToolChains/Cuda.cpp
330

Is this catched somewhere before with a diagnostic?

373

AddOpenMPLinkerScript uses llvm::sys::path::replace_extension which does this magic internally

380–386

Copying files around would be something new in Clang that someone more experienced has to weight. @rnk, @hfinkel?

rnk added inline comments.Mar 31 2017, 1:18 PM
lib/Driver/ToolChains/Cuda.cpp
316

This entire if block constructs a completely unrelated link line from the rest of the function. They should be separate functions. In fact, CudaToolChain::buildLinker() should probably return a different tool depending on whether you're using OpenMP or not, and this logic should be in a different tool. Maybe NVPTX::Linker should be called NVPTX::FatLink as well.

380–386

I don't think copy is an actual binary on Windows, it's a builtin cmd command. At least, that's what where copy suggests. clang-cl can also run on Linux, so you probably want to check #ifdef LLVM_ON_WIN32 for this.

Honestly, I'd prefer it if you instead found a way to get the driver to rename the inputs right before executing nvlink. We have cross-platform utilities in Support for doing renames. You can probably make some kind of custom Command to do this.

test/Driver/openmp-offload.c
596

In this case, it would be nicer if you arranged for ptxas to output to a .cubin file.

603

Can you add a test that links two input ptxas .o files into a binary? You can use touch to make the .o files.

gtbercea updated this revision to Diff 94475.Apr 6 2017, 7:27 PM

Address some of the reviews.

gtbercea marked an inline comment as done.Apr 6 2017, 7:28 PM
gtbercea marked an inline comment as done.
tstellar added inline comments.Apr 7 2017, 6:52 AM
lib/Driver/ToolChains/Cuda.cpp
311

Extra whitespace.

gtbercea updated this revision to Diff 95175.Apr 13 2017, 11:19 AM

Use the rename() utility function of LLVM for renaming the PTXAS output before invoking NVLINK.

gtbercea marked 4 inline comments as done.Apr 13 2017, 11:21 AM
gtbercea added inline comments.
test/Driver/openmp-offload.c
596

@rnk I'm not sure what you mean by this comment, can you please clarify?

gtbercea updated this revision to Diff 95737.Apr 19 2017, 7:22 AM
gtbercea marked an inline comment as done.

Avoid renaming by enabling PTXAS to generate an output file with the appropriate extension, in this case a cubin extension.

gtbercea updated this revision to Diff 97331.May 1 2017, 12:33 PM

Change output of unbundler to produce cubin files when using OpenMP to offload to NVIDIA GPUs.

gtbercea marked an inline comment as done.May 1 2017, 12:34 PM

@rnk any further thought on the changes to this patch? Thanks :)

hfinkel added inline comments.May 9 2017, 7:48 PM
lib/Driver/ToolChains/Cuda.cpp
377

Comment sentences should start with a capital letter (here and below).

396

Add period at the end of the sentence.

398

Remove this blank line.

409

Can you please explain how this works for libraries specified by file name (e.g. /path/to/foo.so or /path/to/foo.a) or why these should be treated differently? Maybe I'm misunderstanding what this check does.

gtbercea updated this revision to Diff 98766.May 12 2017, 7:52 AM

Address comments.

gtbercea marked 3 inline comments as done.May 12 2017, 7:53 AM
sfantao added a subscriber: sfantao.Jun 9 2017, 3:56 AM
hfinkel accepted this revision.Jun 21 2017, 6:09 PM

LGTM

This revision is now accepted and ready to land.Jun 21 2017, 6:09 PM
gtbercea updated this revision to Diff 104535.Jun 28 2017, 4:06 PM

[Update regression tests] Add a test for propagating the compute capability to the OpenMP device offloading toolchain which targets NVIDIA GPUs.
This is a test for patch D34784 which is enabled by this patch.

gtbercea updated this revision to Diff 110056.Aug 7 2017, 12:46 PM

Add -no-canonical-prefixes to tests.

gtbercea closed this revision.Aug 7 2017, 1:02 PM

Hi @gtbercea,
I couldn't reply to the email as cfe-commits didn't even register this commit somehow, so I'm replying here.

Unfortunately I had to revert this commit (r310291), + two others for a clean revert (r310300 and r310332) because it caused a test failure on macOS. This particular run line:

// RUN:   %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -save-temps -no-canonical-prefixes %t1.o %t2.o 2>&1 \
// RUN:   | FileCheck -check-prefix=CHK-TWOCUBIN %s

Causes the following assertion failure:

assert(CachedResults.find(ActionTC) != CachedResults.end() &&
       "Result does not exist??");

Here's a backtrace:

* frame #0: 0x00007fffbf3a2b2e libsystem_kernel.dylib`__pthread_kill + 10
  frame #1: 0x00007fffbf4c72de libsystem_pthread.dylib`pthread_kill + 303
  frame #2: 0x00007fffbf30041f libsystem_c.dylib`abort + 127
  frame #3: 0x00007fffbf2c9f34 libsystem_c.dylib`__assert_rtn + 320
  frame #4: 0x0000000103a1f2d1 clang`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fff5fbfe4e8, C=0x0000000111b11830, A=0x0000000111b11ed0, TC=0x0000000112819000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=false, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3418
  frame #5: 0x0000000103a1cf11 clang`clang::driver::Driver::BuildJobsForAction(this=0x00007fff5fbfe4e8, C=0x0000000111b11830, A=0x0000000111b11ed0, TC=0x0000000112819000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=false, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3210
  frame #6: 0x0000000103a1dfb3 clang`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fff5fbfe4e8, C=0x0000000111b11830, A=0x0000000111b12130, TC=0x0000000112819000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3348
  frame #7: 0x0000000103a1cf11 clang`clang::driver::Driver::BuildJobsForAction(this=0x00007fff5fbfe4e8, C=0x0000000111b11830, A=0x0000000111b121f0, TC=0x0000000112819000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3210
  frame #8: 0x0000000103a1db3e clang`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fff5fbfe4e8, C=0x0000000111b11830, A=0x0000000111b11bf0, TC=0x0000000112819000, BoundArch=(Data = 0x0000000000000000, Length = 0), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3310
  frame #9: 0x0000000103a1cf11 clang`clang::driver::Driver::BuildJobsForAction(this=0x00007fff5fbfe4e8, C=0x0000000111b11830, A=0x0000000111b11bf0, TC=0x0000000112819000, BoundArch=(Data = 0x0000000000000000, Length = 0), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3210
  frame #10: 0x0000000103a0a5c2 clang`clang::driver::Driver::BuildJobs(this=0x00007fff5fbfe4e8, C=0x0000000111b11830) const at Driver.cpp:2843
  frame #11: 0x0000000103a00b5c clang`clang::driver::Driver::BuildCompilation(this=0x00007fff5fbfe4e8, ArgList=ArrayRef<const char *> @ 0x00007fff5fbfc1e8) at Driver.cpp:746
  frame #12: 0x0000000100005a52 clang`main(argc_=9, argv_=0x00007fff5fbff670) at driver.cpp:463
  frame #13: 0x00007fffbf260c05 libdyld.dylib`start + 1

Could you please take a look?

Let me know if you need anything else,
Cheers,
Alex

The revert commit is r310345 btw

Hi @gtbercea,
I couldn't reply to the email as cfe-commits didn't even register this commit somehow, so I'm replying here.

Unfortunately I had to revert this commit (r310291), + two others for a clean revert (r310300 and r310332) because it caused a test failure on macOS. This particular run line:

// RUN:   %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -save-temps -no-canonical-prefixes %t1.o %t2.o 2>&1 \
// RUN:   | FileCheck -check-prefix=CHK-TWOCUBIN %s

Causes the following assertion failure:

assert(CachedResults.find(ActionTC) != CachedResults.end() &&
       "Result does not exist??");

Here's a backtrace:

* frame #0: 0x00007fffbf3a2b2e libsystem_kernel.dylib`__pthread_kill + 10
  frame #1: 0x00007fffbf4c72de libsystem_pthread.dylib`pthread_kill + 303
  frame #2: 0x00007fffbf30041f libsystem_c.dylib`abort + 127
  frame #3: 0x00007fffbf2c9f34 libsystem_c.dylib`__assert_rtn + 320
  frame #4: 0x0000000103a1f2d1 clang`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fff5fbfe4e8, C=0x0000000111b11830, A=0x0000000111b11ed0, TC=0x0000000112819000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=false, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3418
  frame #5: 0x0000000103a1cf11 clang`clang::driver::Driver::BuildJobsForAction(this=0x00007fff5fbfe4e8, C=0x0000000111b11830, A=0x0000000111b11ed0, TC=0x0000000112819000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=false, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3210
  frame #6: 0x0000000103a1dfb3 clang`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fff5fbfe4e8, C=0x0000000111b11830, A=0x0000000111b12130, TC=0x0000000112819000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3348
  frame #7: 0x0000000103a1cf11 clang`clang::driver::Driver::BuildJobsForAction(this=0x00007fff5fbfe4e8, C=0x0000000111b11830, A=0x0000000111b121f0, TC=0x0000000112819000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3210
  frame #8: 0x0000000103a1db3e clang`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fff5fbfe4e8, C=0x0000000111b11830, A=0x0000000111b11bf0, TC=0x0000000112819000, BoundArch=(Data = 0x0000000000000000, Length = 0), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3310
  frame #9: 0x0000000103a1cf11 clang`clang::driver::Driver::BuildJobsForAction(this=0x00007fff5fbfe4e8, C=0x0000000111b11830, A=0x0000000111b11bf0, TC=0x0000000112819000, BoundArch=(Data = 0x0000000000000000, Length = 0), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3210
  frame #10: 0x0000000103a0a5c2 clang`clang::driver::Driver::BuildJobs(this=0x00007fff5fbfe4e8, C=0x0000000111b11830) const at Driver.cpp:2843
  frame #11: 0x0000000103a00b5c clang`clang::driver::Driver::BuildCompilation(this=0x00007fff5fbfe4e8, ArgList=ArrayRef<const char *> @ 0x00007fff5fbfc1e8) at Driver.cpp:746
  frame #12: 0x0000000100005a52 clang`main(argc_=9, argv_=0x00007fff5fbff670) at driver.cpp:463
  frame #13: 0x00007fffbf260c05 libdyld.dylib`start + 1

Could you please take a look?

Let me know if you need anything else,
Cheers,
Alex

Hi Alex,

I have a fix for the failing test. What's the easiest way to do this? Do I have to commit those patches again or can you push them back in and then I also push the fix?

Thanks,

--Doru

--

Great, thanks! I think that you can just revert my revert with the fix applied in one commit

Great, thanks! I think that you can just revert my revert with the fix applied in one commit

Hi Alex,

I just commited the changes again.

Let me know if it still fails for you. I think the issue was actually that the test wasn't quite correct so I cleaned that up.

--Doru

The last RUN line in the new commit triggers the same assertion failure:

Assertion failed: (CachedResults.find(ActionTC) != CachedResults.end() && "Result does not exist??"), function BuildJobsForActionNoCache, file /Users/alex/bisect/llvm/tools/clang/lib/Driver/Driver.cpp, line 3419.

backtrace:

* frame #0: 0x00007fffbf3a2b2e libsystem_kernel.dylib`__pthread_kill + 10
  frame #1: 0x00007fffbf4c72de libsystem_pthread.dylib`pthread_kill + 303
  frame #2: 0x00007fffbf30041f libsystem_c.dylib`abort + 127
  frame #3: 0x00007fffbf2c9f34 libsystem_c.dylib`__assert_rtn + 320
  frame #4: 0x0000000103a1f311 clang`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fff5fbfe518, C=0x0000000111c01130, A=0x0000000111c017d0, TC=0x0000000113000000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=false, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3418
  frame #5: 0x0000000103a1cf51 clang`clang::driver::Driver::BuildJobsForAction(this=0x00007fff5fbfe518, C=0x0000000111c01130, A=0x0000000111c017d0, TC=0x0000000113000000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=false, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3210
  frame #6: 0x0000000103a1dff3 clang`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fff5fbfe518, C=0x0000000111c01130, A=0x0000000111c01a30, TC=0x0000000113000000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3348
  frame #7: 0x0000000103a1cf51 clang`clang::driver::Driver::BuildJobsForAction(this=0x00007fff5fbfe518, C=0x0000000111c01130, A=0x0000000111c01af0, TC=0x0000000113000000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3210
  frame #8: 0x0000000103a1db7e clang`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fff5fbfe518, C=0x0000000111c01130, A=0x0000000111c014f0, TC=0x0000000113000000, BoundArch=(Data = 0x0000000000000000, Length = 0), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3310
  frame #9: 0x0000000103a1cf51 clang`clang::driver::Driver::BuildJobsForAction(this=0x00007fff5fbfe518, C=0x0000000111c01130, A=0x0000000111c014f0, TC=0x0000000113000000, BoundArch=(Data = 0x0000000000000000, Length = 0), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3210
  frame #10: 0x0000000103a0a602 clang`clang::driver::Driver::BuildJobs(this=0x00007fff5fbfe518, C=0x0000000111c01130) const at Driver.cpp:2843
  frame #11: 0x0000000103a00b9c clang`clang::driver::Driver::BuildCompilation(this=0x00007fff5fbfe518, ArgList=ArrayRef<const char *> @ 0x00007fff5fbfc218) at Driver.cpp:746
  frame #12: 0x0000000100005a92 clang`main(argc_=7, argv_=0x00007fff5fbff6a8) at driver.cpp:463
  frame #13: 0x00007fffbf260c05 libdyld.dylib`start + 1
  frame #14: 0x00007fffbf260c05 libdyld.dylib`start + 1

The last RUN line in the new commit triggers the same assertion failure:

Assertion failed: (CachedResults.find(ActionTC) != CachedResults.end() && "Result does not exist??"), function BuildJobsForActionNoCache, file /Users/alex/bisect/llvm/tools/clang/lib/Driver/Driver.cpp, line 3419.

backtrace:

* frame #0: 0x00007fffbf3a2b2e libsystem_kernel.dylib`__pthread_kill + 10
  frame #1: 0x00007fffbf4c72de libsystem_pthread.dylib`pthread_kill + 303
  frame #2: 0x00007fffbf30041f libsystem_c.dylib`abort + 127
  frame #3: 0x00007fffbf2c9f34 libsystem_c.dylib`__assert_rtn + 320
  frame #4: 0x0000000103a1f311 clang`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fff5fbfe518, C=0x0000000111c01130, A=0x0000000111c017d0, TC=0x0000000113000000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=false, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3418
  frame #5: 0x0000000103a1cf51 clang`clang::driver::Driver::BuildJobsForAction(this=0x00007fff5fbfe518, C=0x0000000111c01130, A=0x0000000111c017d0, TC=0x0000000113000000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=false, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3210
  frame #6: 0x0000000103a1dff3 clang`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fff5fbfe518, C=0x0000000111c01130, A=0x0000000111c01a30, TC=0x0000000113000000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3348
  frame #7: 0x0000000103a1cf51 clang`clang::driver::Driver::BuildJobsForAction(this=0x00007fff5fbfe518, C=0x0000000111c01130, A=0x0000000111c01af0, TC=0x0000000113000000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3210
  frame #8: 0x0000000103a1db7e clang`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fff5fbfe518, C=0x0000000111c01130, A=0x0000000111c014f0, TC=0x0000000113000000, BoundArch=(Data = 0x0000000000000000, Length = 0), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3310
  frame #9: 0x0000000103a1cf51 clang`clang::driver::Driver::BuildJobsForAction(this=0x00007fff5fbfe518, C=0x0000000111c01130, A=0x0000000111c014f0, TC=0x0000000113000000, BoundArch=(Data = 0x0000000000000000, Length = 0), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3210
  frame #10: 0x0000000103a0a602 clang`clang::driver::Driver::BuildJobs(this=0x00007fff5fbfe518, C=0x0000000111c01130) const at Driver.cpp:2843
  frame #11: 0x0000000103a00b9c clang`clang::driver::Driver::BuildCompilation(this=0x00007fff5fbfe518, ArgList=ArrayRef<const char *> @ 0x00007fff5fbfc218) at Driver.cpp:746
  frame #12: 0x0000000100005a92 clang`main(argc_=7, argv_=0x00007fff5fbff6a8) at driver.cpp:463
  frame #13: 0x00007fffbf260c05 libdyld.dylib`start + 1
  frame #14: 0x00007fffbf260c05 libdyld.dylib`start + 1

Hi Alex, I'm not sure why it's failing as I can't reproduce the error locally. Do you have access to a machine with the configuration the test uses?

The last RUN line in the new commit triggers the same assertion failure:

Assertion failed: (CachedResults.find(ActionTC) != CachedResults.end() && "Result does not exist??"), function BuildJobsForActionNoCache, file /Users/alex/bisect/llvm/tools/clang/lib/Driver/Driver.cpp, line 3419.

backtrace:

* frame #0: 0x00007fffbf3a2b2e libsystem_kernel.dylib`__pthread_kill + 10
  frame #1: 0x00007fffbf4c72de libsystem_pthread.dylib`pthread_kill + 303
  frame #2: 0x00007fffbf30041f libsystem_c.dylib`abort + 127
  frame #3: 0x00007fffbf2c9f34 libsystem_c.dylib`__assert_rtn + 320
  frame #4: 0x0000000103a1f311 clang`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fff5fbfe518, C=0x0000000111c01130, A=0x0000000111c017d0, TC=0x0000000113000000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=false, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3418
  frame #5: 0x0000000103a1cf51 clang`clang::driver::Driver::BuildJobsForAction(this=0x00007fff5fbfe518, C=0x0000000111c01130, A=0x0000000111c017d0, TC=0x0000000113000000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=false, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3210
  frame #6: 0x0000000103a1dff3 clang`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fff5fbfe518, C=0x0000000111c01130, A=0x0000000111c01a30, TC=0x0000000113000000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3348
  frame #7: 0x0000000103a1cf51 clang`clang::driver::Driver::BuildJobsForAction(this=0x00007fff5fbfe518, C=0x0000000111c01130, A=0x0000000111c01af0, TC=0x0000000113000000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3210
  frame #8: 0x0000000103a1db7e clang`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fff5fbfe518, C=0x0000000111c01130, A=0x0000000111c014f0, TC=0x0000000113000000, BoundArch=(Data = 0x0000000000000000, Length = 0), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3310
  frame #9: 0x0000000103a1cf51 clang`clang::driver::Driver::BuildJobsForAction(this=0x00007fff5fbfe518, C=0x0000000111c01130, A=0x0000000111c014f0, TC=0x0000000113000000, BoundArch=(Data = 0x0000000000000000, Length = 0), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3210
  frame #10: 0x0000000103a0a602 clang`clang::driver::Driver::BuildJobs(this=0x00007fff5fbfe518, C=0x0000000111c01130) const at Driver.cpp:2843
  frame #11: 0x0000000103a00b9c clang`clang::driver::Driver::BuildCompilation(this=0x00007fff5fbfe518, ArgList=ArrayRef<const char *> @ 0x00007fff5fbfc218) at Driver.cpp:746
  frame #12: 0x0000000100005a92 clang`main(argc_=7, argv_=0x00007fff5fbff6a8) at driver.cpp:463
  frame #13: 0x00007fffbf260c05 libdyld.dylib`start + 1
  frame #14: 0x00007fffbf260c05 libdyld.dylib`start + 1

Hi Alex, I'm not sure why it's failing as I can't reproduce the error locally. Do you have access to a machine with the configuration the test uses?

Yes, I can reproduce it on my machine.

The cached results map doesn't have the key:

(lldb) p CachedResults
(std::__1::map<std::__1::pair<const clang::driver::Action *, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo, std::__1::less<std::__1::pair<const clang::driver::Action *, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > > >, std::__1::allocator<std::__1::pair<const std::__1::pair<const clang::driver::Action *, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >, clang::driver::InputInfo> > >) $0 = size=8 {
  [0] = {
    first = {
      first = 0x0000000111c01320
      second = "nvptx64-nvidia-cuda-host"
    }
    second = {
      Data = {
        Filename = 0x00007fff5fbff8f8 "/Volumes/newAPFS/bisect/b/tools/clang/test/Driver/Output/openmp-offload.c.tmp1.o"
        InputArg = 0x00007fff5fbff8f8
      }
      Kind = Filename
      Act = 0x0000000111c01320
      Type = TY_Object
      BaseInput = 0x00007fff5fbff8f8 "/Volumes/newAPFS/bisect/b/tools/clang/test/Driver/Output/openmp-offload.c.tmp1.o"
    }
  }
  [1] = {
    first = {
      first = 0x0000000111c01320
      second = "x86_64-apple-darwin17.0.0-x86_64-host"
    }
    second = {
      Data = {
        Filename = 0x00007fff5fbff8f8 "/Volumes/newAPFS/bisect/b/tools/clang/test/Driver/Output/openmp-offload.c.tmp1.o"
        InputArg = 0x00007fff5fbff8f8
      }
      Kind = Filename
      Act = 0x0000000111c01320
      Type = TY_Object
      BaseInput = 0x00007fff5fbff8f8 "/Volumes/newAPFS/bisect/b/tools/clang/test/Driver/Output/openmp-offload.c.tmp1.o"
    }
  }
  [2] = {
    first = {
      first = 0x0000000111c01420
      second = "nvptx64-nvidia-cuda-host"
    }
    second = {
      Data = {
        Filename = 0x00007fff5fbff949 "/Volumes/newAPFS/bisect/b/tools/clang/test/Driver/Output/openmp-offload.c.tmp2.o"
        InputArg = 0x00007fff5fbff949
      }
      Kind = Filename
      Act = 0x0000000111c01420
      Type = TY_Object
      BaseInput = 0x00007fff5fbff949 "/Volumes/newAPFS/bisect/b/tools/clang/test/Driver/Output/openmp-offload.c.tmp2.o"
    }
  }
  [3] = {
    first = {
      first = 0x0000000111c017d0
      second = "nvptx64-nvidia-cuda-openmp"
    }
    second = {
      Data = {
        Filename = 0x0000000111c048b0 "/var/folders/sh/cpr85hld32qf79m8x7vd31bw0000gn/T/openmp-offload-e30496.o"
        InputArg = 0x0000000111c048b0
      }
      Kind = Filename
      Act = 0x0000000111c017d0
      Type = TY_Object
      BaseInput = 0x00007fff5fbff8f8 "/Volumes/newAPFS/bisect/b/tools/clang/test/Driver/Output/openmp-offload.c.tmp1.o"
    }
  }
  [4] = {
    first = {
      first = 0x0000000111c017d0
      second = "x86_64-apple-darwin17.0.0-host"
    }
    second = {
      Data = {
        Filename = 0x0000000111c04830 "/var/folders/sh/cpr85hld32qf79m8x7vd31bw0000gn/T/openmp-offload-b856ec.o"
        InputArg = 0x0000000111c04830
      }
      Kind = Filename
      Act = 0x0000000111c017d0
      Type = TY_Object
      BaseInput = 0x00007fff5fbff8f8 "/Volumes/newAPFS/bisect/b/tools/clang/test/Driver/Output/openmp-offload.c.tmp1.o"
    }
  }
  [5] = {
    first = {
      first = 0x0000000111c01900
      second = "nvptx64-nvidia-cuda-openmp"
    }
    second = {
      Data = {
        Filename = 0x0000000111c035d0 "/var/folders/sh/cpr85hld32qf79m8x7vd31bw0000gn/T/openmp-offload-be86a1.o"
        InputArg = 0x0000000111c035d0
      }
      Kind = Filename
      Act = 0x0000000111c01900
      Type = TY_Object
      BaseInput = 0x00007fff5fbff949 "/Volumes/newAPFS/bisect/b/tools/clang/test/Driver/Output/openmp-offload.c.tmp2.o"
    }
  }
  [6] = {
    first = {
      first = 0x0000000111c01900
      second = "x86_64-apple-darwin17.0.0-host"
    }
    second = {
      Data = {
        Filename = 0x0000000111c034e0 "/var/folders/sh/cpr85hld32qf79m8x7vd31bw0000gn/T/openmp-offload-92791a.o"
        InputArg = 0x0000000111c034e0
      }
      Kind = Filename
      Act = 0x0000000111c01900
      Type = TY_Object
      BaseInput = 0x00007fff5fbff949 "/Volumes/newAPFS/bisect/b/tools/clang/test/Driver/Output/openmp-offload.c.tmp2.o"
    }
  }
  [7] = {
    first = {
      first = 0x0000000111c01a90
      second = "nvptx64-nvidia-cuda-openmp"
    }
    second = {
      Data = {
        Filename = 0x0000000111c03d90 "/var/folders/sh/cpr85hld32qf79m8x7vd31bw0000gn/T/openmp-offload-8db204.out"
        InputArg = 0x0000000111c03d90
      }
      Kind = Filename
      Act = 0x0000000111c01a90
      Type = TY_Image
      BaseInput = 0x00007fff5fbff8f8 "/Volumes/newAPFS/bisect/b/tools/clang/test/Driver/Output/openmp-offload.c.tmp1.o"
    }
  }
}

Key:

(lldb) p ActionTC
(std::__1::pair<const clang::driver::Action *, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >) $1 = {
  first = 0x0000000111c017d0
  second = "x86_64-apple-darwin17.0.0-x86_64-host"
}

If you look at the map then you can see that it contains very similar keys, but not the exact one:

    first = {
      first = 0x0000000111c017d0
      second = "x86_64-apple-darwin17.0.0-host"
    }

and

    first = {
      first = 0x0000000111c01320
      second = "x86_64-apple-darwin17.0.0-x86_64-host"
    }

The "x86_64-apple-darwin17.0.0-x86_64-host" triple looks suspicious though

The "x86_64-apple-darwin17.0.0-x86_64-host" triple looks suspicious though

It looks like the triple is in the list though:

second = "x86_64-apple-darwin17.0.0-x86_64-host

it is entry [1].

Is that the last access to CachedResults before the error?

Is that the last access to CachedResults before the error?

Is the assertion the last access? Yes.

There must be a discrepancy between

UI.DependentBoundArch in the loop above and BoundArch that's used to compute TargetTC, otherwise GetTriplePlusArchString would return the key that matches the 0x0000000111c017d0 pointer, i.e. without the additional x86_64.

Is that the last access to CachedResults before the error?

Is the assertion the last access? Yes.

There must be a discrepancy between

UI.DependentBoundArch in the loop above and BoundArch that's used to compute TargetTC, otherwise GetTriplePlusArchString would return the key that matches the 0x0000000111c017d0 pointer, i.e. without the additional x86_64.

Maybe I misunderstood the output you pasted but it looks like ActionTC contains the same triple+arch that you can find in entry [1] in the map.

arphaman added a comment.EditedAug 8 2017, 10:19 AM

Yes, but entry 1 has a different pointer (0x0000000111c01320 vs 0x0000000111c017d0) so the keys are different.

Why is there a DependentBoundArch if it's always empty?

The last RUN line in the new commit triggers the same assertion failure:

...

Hi Alex, I'm not sure why it's failing as I can't reproduce the error locally. Do you have access to a machine with the configuration the test uses?

Can you reproduce if you specifically force the host target to x86_64-apple-darwin17.0.0 (e.g., you pass -target x86_64-apple-darwin17.0.0)?

The last RUN line in the new commit triggers the same assertion failure:

...

Hi Alex, I'm not sure why it's failing as I can't reproduce the error locally. Do you have access to a machine with the configuration the test uses?

Can you reproduce if you specifically force the host target to x86_64-apple-darwin17.0.0 (e.g., you pass -target x86_64-apple-darwin17.0.0)?

Yep, I can ! I didn't realize that would actually work thanks! :)

I have just pushed a fix, revision 310433.