gtbercea (Gheorghe-Teodor Bercea)
User

Projects

User does not belong to any projects.

User Details

User Since
Dec 29 2016, 12:44 AM (42 w, 1 d)

Recent Activity

Wed, Oct 18

gtbercea updated the diff for D39061: [buildbot] Increase timeout for libomp-clang-ppc64le-linux-debian builder.
Wed, Oct 18, 11:26 AM
gtbercea updated the diff for D39061: [buildbot] Increase timeout for libomp-clang-ppc64le-linux-debian builder.
Wed, Oct 18, 11:06 AM
gtbercea added a reviewer for D39061: [buildbot] Increase timeout for libomp-clang-ppc64le-linux-debian builder: sfantao.
Wed, Oct 18, 10:57 AM
gtbercea created D39061: [buildbot] Increase timeout for libomp-clang-ppc64le-linux-debian builder.
Wed, Oct 18, 10:55 AM
gtbercea added a comment to D39005: [OpenMP] Clean up variable and function names for NVPTX backend.

I'd be interested to get the ball rolling in regard to coming up with a fix for this. I see some suggestions in past patches. Some help/clarification would be much appreciated.

Happy to help, but I'm not sure what to offer beyond the link in Art's previous comment.

Wed, Oct 18, 8:17 AM

Tue, Oct 17

gtbercea added a comment to D39005: [OpenMP] Clean up variable and function names for NVPTX backend.

Hi Artem, Justin,

Tue, Oct 17, 2:19 PM
gtbercea updated the diff for D38978: [OpenMP] Enable the lowering of implicitly shared variables in OpenMP GPU-offloaded target regions to the GPU shared memory.

Eliminate variable and function name clean-up. That has been moved into a separate patch: D39005

Tue, Oct 17, 8:27 AM
gtbercea created D39005: [OpenMP] Clean up variable and function names for NVPTX backend.
Tue, Oct 17, 8:27 AM

Mon, Oct 16

gtbercea updated the summary of D38976: [OpenMP] Add implicit data sharing support when offloading to NVIDIA GPUs using OpenMP device offloading.
Mon, Oct 16, 2:30 PM
gtbercea created D38978: [OpenMP] Enable the lowering of implicitly shared variables in OpenMP GPU-offloaded target regions to the GPU shared memory.
Mon, Oct 16, 2:29 PM
gtbercea created D38976: [OpenMP] Add implicit data sharing support when offloading to NVIDIA GPUs using OpenMP device offloading.
Mon, Oct 16, 2:21 PM
gtbercea added a comment to D38883: [CMake][OpenMP] Customize default offloading arch.

LGTM

Mon, Oct 16, 11:56 AM

Fri, Oct 13

gtbercea added inline comments to D38883: [CMake][OpenMP] Customize default offloading arch.
Fri, Oct 13, 11:41 AM
gtbercea added inline comments to D38883: [CMake][OpenMP] Customize default offloading arch.
Fri, Oct 13, 11:39 AM
gtbercea added inline comments to D38883: [CMake][OpenMP] Customize default offloading arch.
Fri, Oct 13, 11:19 AM
gtbercea added inline comments to D38883: [CMake][OpenMP] Customize default offloading arch.
Fri, Oct 13, 11:17 AM
gtbercea added inline comments to D38883: [CMake][OpenMP] Customize default offloading arch.
Fri, Oct 13, 11:16 AM
gtbercea added inline comments to D38883: [CMake][OpenMP] Customize default offloading arch.
Fri, Oct 13, 11:05 AM
gtbercea added inline comments to D38883: [CMake][OpenMP] Customize default offloading arch.
Fri, Oct 13, 11:04 AM
gtbercea added inline comments to D38883: [CMake][OpenMP] Customize default offloading arch.
Fri, Oct 13, 11:02 AM

Wed, Sep 27

gtbercea accepted D38258: [OpenMP] Fix passing of -m arguments to device toolchain.

LGTM

Wed, Sep 27, 7:46 AM
gtbercea accepted D38259: [OpenMP] Fix translation of target args.

LGTM

Wed, Sep 27, 7:42 AM
gtbercea added inline comments to D38258: [OpenMP] Fix passing of -m arguments to device toolchain.
Wed, Sep 27, 7:40 AM
gtbercea accepted D38257: [OpenMP] Fix memory leak when translating arguments.

LGTM

Wed, Sep 27, 7:37 AM
gtbercea closed D38040: [OpenMP] Add an additional test for D34888.
Wed, Sep 27, 7:32 AM

Tue, Sep 26

gtbercea added a reviewer for D38040: [OpenMP] Add an additional test for D34888: ABataev.
Tue, Sep 26, 6:59 PM
gtbercea updated the diff for D38040: [OpenMP] Add an additional test for D34888.

Fix test.

Tue, Sep 26, 6:58 PM
gtbercea reopened D38040: [OpenMP] Add an additional test for D34888.

Open

Tue, Sep 26, 5:59 PM
gtbercea reopened D29660: [OpenMP] Add flag for overwriting default PTX version for OpenMP targets.

Open

Tue, Sep 26, 3:30 PM
gtbercea closed D38040: [OpenMP] Add an additional test for D34888.
Tue, Sep 26, 3:30 PM
gtbercea updated the diff for D38040: [OpenMP] Add an additional test for D34888.

Add nocudalib flag.

Tue, Sep 26, 3:30 PM
gtbercea closed D37914: [OpenMP] Don't throw cudalib not found error if only front-end is required..
Tue, Sep 26, 3:30 PM
gtbercea reopened D38040: [OpenMP] Add an additional test for D34888.

Open

Tue, Sep 26, 3:30 PM
gtbercea closed D38040: [OpenMP] Add an additional test for D34888.
Tue, Sep 26, 3:30 PM
gtbercea updated the diff for D38040: [OpenMP] Add an additional test for D34888.

Fix test.

Tue, Sep 26, 3:30 PM

Mon, Sep 25

gtbercea reopened D37914: [OpenMP] Don't throw cudalib not found error if only front-end is required..

Open.

Mon, Sep 25, 2:59 PM
gtbercea closed D37913: [OpenMP] Enable the existing nocudalib flag for OpenMP offloading toolchain..
Mon, Sep 25, 2:58 PM
gtbercea updated the diff for D37913: [OpenMP] Enable the existing nocudalib flag for OpenMP offloading toolchain..

Split line.

Mon, Sep 25, 2:54 PM
gtbercea closed D37912: [OpenMP] Bugfix: output file name drops the absolute path where full path is needed..
Mon, Sep 25, 2:27 PM
gtbercea closed D37914: [OpenMP] Don't throw cudalib not found error if only front-end is required..
Mon, Sep 25, 2:08 PM
gtbercea added a comment to D38040: [OpenMP] Add an additional test for D34888.

The test is verifying whether the parameter is passed to the kernel correctly. I believe it was not passed as a reference before the patch.

Ah, right: This isn't checked anywhere before. Maybe add a comment about what's tested here?
Do we want to check the rest of the codegen with a focus that the variable is passed as a reference?

In addition to that, something that was in my previous patch is related to this code:

DSAStack->checkMappableExprComponentListsForDeclAtLevel(
        D, Level, [&](OMPClauseMappableExprCommon::MappableExprComponentListRef

In particular with the Level variable. Should the Level variable actually be Level + 1 in this case?

I'm not sure, the current public clang-ykt has Level: https://github.com/clang-ykt/clang/blob/d181aed/lib/Sema/SemaOpenMP.cpp#L1361

Mon, Sep 25, 5:02 AM

Thu, Sep 21

gtbercea added a comment to D38040: [OpenMP] Add an additional test for D34888.

Hi Doru,

if I remember correctly I submitted D34888 for a crash when mapping a scalar value with nested regions.
I've marked another test in this file that the codegen for tofrom is correct. So I don't know if this test checks some other conditions?

Jonas

Thu, Sep 21, 6:08 PM

Sep 19 2017

gtbercea updated the diff for D37914: [OpenMP] Don't throw cudalib not found error if only front-end is required..
Sep 19 2017, 6:24 PM
gtbercea updated the diff for D38040: [OpenMP] Add an additional test for D34888.
Sep 19 2017, 6:12 PM
gtbercea updated the diff for D37913: [OpenMP] Enable the existing nocudalib flag for OpenMP offloading toolchain..

Don't take unknown CUDA archs into account, not even for testing purposes.

Sep 19 2017, 6:06 PM
gtbercea updated the diff for D37912: [OpenMP] Bugfix: output file name drops the absolute path where full path is needed..

Address comment.

Sep 19 2017, 5:55 PM
gtbercea created D38040: [OpenMP] Add an additional test for D34888.
Sep 19 2017, 8:51 AM
gtbercea added a reviewer for D37912: [OpenMP] Bugfix: output file name drops the absolute path where full path is needed.: tra.
Sep 19 2017, 8:46 AM
gtbercea added a reviewer for D37913: [OpenMP] Enable the existing nocudalib flag for OpenMP offloading toolchain.: tra.
Sep 19 2017, 8:46 AM

Sep 18 2017

gtbercea added a reviewer for D37913: [OpenMP] Enable the existing nocudalib flag for OpenMP offloading toolchain.: hfinkel.
Sep 18 2017, 12:12 PM
gtbercea added inline comments to D37914: [OpenMP] Don't throw cudalib not found error if only front-end is required..
Sep 18 2017, 11:57 AM
gtbercea updated the diff for D37914: [OpenMP] Don't throw cudalib not found error if only front-end is required..

Only check for -S.

Sep 18 2017, 11:51 AM
gtbercea updated the diff for D37913: [OpenMP] Enable the existing nocudalib flag for OpenMP offloading toolchain..

Add test.

Sep 18 2017, 11:33 AM
gtbercea updated the diff for D37912: [OpenMP] Bugfix: output file name drops the absolute path where full path is needed..
Sep 18 2017, 9:26 AM

Sep 15 2017

gtbercea updated the diff for D37914: [OpenMP] Don't throw cudalib not found error if only front-end is required..

Fix diff.

Sep 15 2017, 2:40 PM
gtbercea updated the diff for D37914: [OpenMP] Don't throw cudalib not found error if only front-end is required..

Add test.

Sep 15 2017, 2:36 PM
gtbercea updated the diff for D37912: [OpenMP] Bugfix: output file name drops the absolute path where full path is needed..

Fix tests.

Sep 15 2017, 2:32 PM
gtbercea updated the diff for D37914: [OpenMP] Don't throw cudalib not found error if only front-end is required..

Fix condition.

Sep 15 2017, 1:45 PM
gtbercea updated the diff for D37914: [OpenMP] Don't throw cudalib not found error if only front-end is required..

Fix parentheses.

Sep 15 2017, 11:58 AM
gtbercea updated the diff for D37914: [OpenMP] Don't throw cudalib not found error if only front-end is required..

Contract check.

Sep 15 2017, 11:55 AM
gtbercea added a comment to D37912: [OpenMP] Bugfix: output file name drops the absolute path where full path is needed..
In D37912#872294, @tra wrote:

Shouldn't this temp .cubin file go into the temporary directory, as opposed to the same directory as the input file?

Sep 15 2017, 11:48 AM
gtbercea created D37914: [OpenMP] Don't throw cudalib not found error if only front-end is required..
Sep 15 2017, 11:43 AM
gtbercea created D37913: [OpenMP] Enable the existing nocudalib flag for OpenMP offloading toolchain..
Sep 15 2017, 11:36 AM
gtbercea created D37912: [OpenMP] Bugfix: output file name drops the absolute path where full path is needed..
Sep 15 2017, 11:29 AM

Aug 12 2017

gtbercea added a comment to D29660: [OpenMP] Add flag for overwriting default PTX version for OpenMP targets.

Couldn't fix/find the actual error so for now, just moving the flag patch tests to openmp-offload-gpu.c which is a disabled test.

310765

Bad news, the bot is still red: http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-fast/builds/7114

Disabled openmp-offload.c on Linux again: https://reviews.llvm.org/rL310772

Aug 12 2017, 11:03 AM

Aug 11 2017

gtbercea added a comment to D29660: [OpenMP] Add flag for overwriting default PTX version for OpenMP targets.
Aug 11 2017, 2:19 PM
gtbercea added a comment to D29660: [OpenMP] Add flag for overwriting default PTX version for OpenMP targets.

I have re-enabled the previous offloading tests and moved the new GPU offloading tests to a new file which is disabled for linux (for now).

310718

Alex thanks so much for the logs, they have been very useful to understand what's going on.

Aleksey, I have since tried to install a Clang version with the address sanitizer enabled, but without much success. Apart from turning on the sanitizer in cmake using the -DLLVM_USE_SANITIZER="Address" flag, is there any other flag that I need to pass to cmake?
I am trying to run this on my x86_64 MacBook with OS X 10.11. I am getting the following error when building the compiler:

[2966/4254] Linking CXX shared library lib/libc++abi.1.0.dylib
FAILED: lib/libc++abi.1.0.dylib
Undefined symbols for architecture x86_64:

"___asan_after_dynamic_init", referenced from:
    __GLOBAL__sub_I_cxa_default_handlers.cpp in cxa_default_handlers.cpp.o
"___asan_before_dynamic_init", referenced from:
    __GLOBAL__sub_I_cxa_default_handlers.cpp in cxa_default_handlers.cpp.o

[...]
ld: symbol(s) not found for architecture x86_64

Actually, you can run our bot, it is in zorg (http://llvm.org/git/zorg.git), zorg/buildbot/builders/sanitizers/buildbot_fast.sh (the one I linked the last time).

Create a temp folder and from that folder run:
BUILDBOT_REVISION= BUILDBOT_CLOBBER= $PATH_YOUR_PROJECTS$/zorg/zorg/buildbot/builders/sanitizers/buildbot_fast.sh

Aug 11 2017, 1:38 PM
gtbercea added a comment to D29660: [OpenMP] Add flag for overwriting default PTX version for OpenMP targets.

I have re-enabled the previous offloading tests and moved the new GPU offloading tests to a new file which is disabled for linux (for now).

Aug 11 2017, 9:08 AM

Aug 10 2017

gtbercea added a comment to D29660: [OpenMP] Add flag for overwriting default PTX version for OpenMP targets.

First of all, I apologize if I've upset you with my previous post. I am actively working on understanding what is causing these issues. It is not my intention to write tests that work on local configurations only. I am upset to see that these tests keep failing for your configuration and maybe others. Without knowing the actual reason for the failures, I can only speculate about what is going wrong with them, hence the flurry of changes.

Thank you, apology accepted. That was exactly my point, not to start a fight, but to emphasize that depending on local configuration is never going to work, you will never be able to see and test all of them. Please disable the test ASAP and until the better way to handle it is determined.

Aug 10 2017, 10:00 AM
gtbercea added a comment to D29660: [OpenMP] Add flag for overwriting default PTX version for OpenMP targets.

The failures were very widespread, e.g. there's a linux buildbot that was red until the revert: http://bb.pgr.jp/builders/test-clang-i686-linux-RA. If you have access to a linux machine you should be able to reproduce the failures that the bot experienced by using the same cmake arguments (I don't know the exact ones, but judging from the bot you should be able to reproduce them using 32 bit release build with assertions enabled). I don't know what GPU that buildbot has.

Aug 10 2017, 9:24 AM
gtbercea added a comment to D29660: [OpenMP] Add flag for overwriting default PTX version for OpenMP targets.

First of all, I apologize if I've upset you with my previous post. I am actively working on understanding what is causing these issues. It is not my intention to write tests that work on local configurations only. I am upset to see that these tests keep failing for your configuration and maybe others. Without knowing the actual reason for the failures, I can only speculate about what is going wrong with them, hence the flurry of changes.

Aug 10 2017, 8:35 AM
gtbercea added a comment to D29660: [OpenMP] Add flag for overwriting default PTX version for OpenMP targets.
Aug 10 2017, 6:44 AM

Aug 9 2017

gtbercea added a comment to D29660: [OpenMP] Add flag for overwriting default PTX version for OpenMP targets.

310549 should solve this problem by using a default architecture that is supported by the underlying device version.

Aug 9 2017, 10:04 PM
gtbercea added a comment to D29660: [OpenMP] Add flag for overwriting default PTX version for OpenMP targets.

Thanks for running the test on your machine! This is very useful.

Aug 9 2017, 8:15 PM
gtbercea added a comment to D29660: [OpenMP] Add flag for overwriting default PTX version for OpenMP targets.

I've removed that test. Let's see if the other two tests pass or not. (310537)

Aug 9 2017, 4:50 PM
gtbercea added a comment to D29660: [OpenMP] Add flag for overwriting default PTX version for OpenMP targets.

Even after r310505, openmp-offload.c continues to haunt our bots, for example http://lab.llvm.org:8011/builders/sanitizer-x86_64-linux-bootstrap/builds/2012. Can you please fix this test?

Aug 9 2017, 1:51 PM
gtbercea closed D36537: [OpenMP] Enable executable lookup into driver directory..
Aug 9 2017, 12:53 PM
gtbercea updated the summary of D36537: [OpenMP] Enable executable lookup into driver directory..
Aug 9 2017, 12:03 PM
gtbercea updated the diff for D36537: [OpenMP] Enable executable lookup into driver directory..

Add comment.

Aug 9 2017, 12:02 PM
gtbercea created D36537: [OpenMP] Enable executable lookup into driver directory..
Aug 9 2017, 11:50 AM
gtbercea added a comment to D29660: [OpenMP] Add flag for overwriting default PTX version for OpenMP targets.

Revision 310505 fixes the tests for this patch.

Aug 9 2017, 11:30 AM
gtbercea added a comment to D29660: [OpenMP] Add flag for overwriting default PTX version for OpenMP targets.

Looks like this test is failing on macOS again after this change:

http://lab.llvm.org:8080/green/job/clang-stage1-cmake-RA-incremental_check/39231/testReport/Clang/Driver/openmp_offload_c/

Can you please take a look?

Aug 9 2017, 10:50 AM
gtbercea closed D29905: [OpenMP] Pass argument to device kernel by reference when map is used. .

Already covered by D34888

Aug 9 2017, 9:18 AM
gtbercea closed D29660: [OpenMP] Add flag for overwriting default PTX version for OpenMP targets.
Aug 9 2017, 8:57 AM
gtbercea closed D29659: [OpenMP] Add flag for disabling the default generation of relocatable OpenMP target code for NVIDIA GPUs..
Aug 9 2017, 8:28 AM

Aug 8 2017

gtbercea added a comment to D29654: [OpenMP] Integrate OpenMP target region cubin into host binary.

I have just pushed a fix, revision 310433.

Aug 8 2017, 6:05 PM
gtbercea added a comment to D29654: [OpenMP] Integrate OpenMP target region cubin into host binary.

The last RUN line in the new commit triggers the same assertion failure:

...

Hi Alex, I'm not sure why it's failing as I can't reproduce the error locally. Do you have access to a machine with the configuration the test uses?

Can you reproduce if you specifically force the host target to x86_64-apple-darwin17.0.0 (e.g., you pass -target x86_64-apple-darwin17.0.0)?

Aug 8 2017, 10:38 AM
gtbercea added a comment to D29654: [OpenMP] Integrate OpenMP target region cubin into host binary.

Is that the last access to CachedResults before the error?

Is the assertion the last access? Yes.

There must be a discrepancy between UI.DependentBoundArch in the loop above and BoundArch that's used to compute TargetTC, otherwise GetTriplePlusArchString would return the key that matches the 0x0000000111c017d0 pointer, i.e. without the additional x86_64.

Aug 8 2017, 10:07 AM
gtbercea added a comment to D29654: [OpenMP] Integrate OpenMP target region cubin into host binary.

Is that the last access to CachedResults before the error?

Aug 8 2017, 9:57 AM
gtbercea added a comment to D29654: [OpenMP] Integrate OpenMP target region cubin into host binary.

The "x86_64-apple-darwin17.0.0-x86_64-host" triple looks suspicious though

Aug 8 2017, 9:49 AM
gtbercea added a comment to D29654: [OpenMP] Integrate OpenMP target region cubin into host binary.

The last RUN line in the new commit triggers the same assertion failure:

Assertion failed: (CachedResults.find(ActionTC) != CachedResults.end() && "Result does not exist??"), function BuildJobsForActionNoCache, file /Users/alex/bisect/llvm/tools/clang/lib/Driver/Driver.cpp, line 3419.

backtrace:

* frame #0: 0x00007fffbf3a2b2e libsystem_kernel.dylib`__pthread_kill + 10
  frame #1: 0x00007fffbf4c72de libsystem_pthread.dylib`pthread_kill + 303
  frame #2: 0x00007fffbf30041f libsystem_c.dylib`abort + 127
  frame #3: 0x00007fffbf2c9f34 libsystem_c.dylib`__assert_rtn + 320
  frame #4: 0x0000000103a1f311 clang`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fff5fbfe518, C=0x0000000111c01130, A=0x0000000111c017d0, TC=0x0000000113000000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=false, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3418
  frame #5: 0x0000000103a1cf51 clang`clang::driver::Driver::BuildJobsForAction(this=0x00007fff5fbfe518, C=0x0000000111c01130, A=0x0000000111c017d0, TC=0x0000000113000000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=false, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3210
  frame #6: 0x0000000103a1dff3 clang`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fff5fbfe518, C=0x0000000111c01130, A=0x0000000111c01a30, TC=0x0000000113000000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3348
  frame #7: 0x0000000103a1cf51 clang`clang::driver::Driver::BuildJobsForAction(this=0x00007fff5fbfe518, C=0x0000000111c01130, A=0x0000000111c01af0, TC=0x0000000113000000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3210
  frame #8: 0x0000000103a1db7e clang`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fff5fbfe518, C=0x0000000111c01130, A=0x0000000111c014f0, TC=0x0000000113000000, BoundArch=(Data = 0x0000000000000000, Length = 0), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3310
  frame #9: 0x0000000103a1cf51 clang`clang::driver::Driver::BuildJobsForAction(this=0x00007fff5fbfe518, C=0x0000000111c01130, A=0x0000000111c014f0, TC=0x0000000113000000, BoundArch=(Data = 0x0000000000000000, Length = 0), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3210
  frame #10: 0x0000000103a0a602 clang`clang::driver::Driver::BuildJobs(this=0x00007fff5fbfe518, C=0x0000000111c01130) const at Driver.cpp:2843
  frame #11: 0x0000000103a00b9c clang`clang::driver::Driver::BuildCompilation(this=0x00007fff5fbfe518, ArgList=ArrayRef<const char *> @ 0x00007fff5fbfc218) at Driver.cpp:746
  frame #12: 0x0000000100005a92 clang`main(argc_=7, argv_=0x00007fff5fbff6a8) at driver.cpp:463
  frame #13: 0x00007fffbf260c05 libdyld.dylib`start + 1
  frame #14: 0x00007fffbf260c05 libdyld.dylib`start + 1
Aug 8 2017, 8:22 AM
gtbercea added a comment to D29654: [OpenMP] Integrate OpenMP target region cubin into host binary.

Great, thanks! I think that you can just revert my revert with the fix applied in one commit

Aug 8 2017, 7:39 AM
gtbercea added a comment to D29654: [OpenMP] Integrate OpenMP target region cubin into host binary.

Hi @gtbercea,
I couldn't reply to the email as cfe-commits didn't even register this commit somehow, so I'm replying here.

Unfortunately I had to revert this commit (r310291), + two others for a clean revert (r310300 and r310332) because it caused a test failure on macOS. This particular run line:

// RUN:   %clang -### -no-canonical-prefixes -fopenmp=libomp -fopenmp-targets=nvptx64-nvidia-cuda -save-temps -no-canonical-prefixes %t1.o %t2.o 2>&1 \
// RUN:   | FileCheck -check-prefix=CHK-TWOCUBIN %s

Causes the following assertion failure:

assert(CachedResults.find(ActionTC) != CachedResults.end() &&
       "Result does not exist??");

Here's a backtrace:

* frame #0: 0x00007fffbf3a2b2e libsystem_kernel.dylib`__pthread_kill + 10
  frame #1: 0x00007fffbf4c72de libsystem_pthread.dylib`pthread_kill + 303
  frame #2: 0x00007fffbf30041f libsystem_c.dylib`abort + 127
  frame #3: 0x00007fffbf2c9f34 libsystem_c.dylib`__assert_rtn + 320
  frame #4: 0x0000000103a1f2d1 clang`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fff5fbfe4e8, C=0x0000000111b11830, A=0x0000000111b11ed0, TC=0x0000000112819000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=false, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3418
  frame #5: 0x0000000103a1cf11 clang`clang::driver::Driver::BuildJobsForAction(this=0x00007fff5fbfe4e8, C=0x0000000111b11830, A=0x0000000111b11ed0, TC=0x0000000112819000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=false, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3210
  frame #6: 0x0000000103a1dfb3 clang`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fff5fbfe4e8, C=0x0000000111b11830, A=0x0000000111b12130, TC=0x0000000112819000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3348
  frame #7: 0x0000000103a1cf11 clang`clang::driver::Driver::BuildJobsForAction(this=0x00007fff5fbfe4e8, C=0x0000000111b11830, A=0x0000000111b121f0, TC=0x0000000112819000, BoundArch=(Data = "x86_64", Length = 6), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3210
  frame #8: 0x0000000103a1db3e clang`clang::driver::Driver::BuildJobsForActionNoCache(this=0x00007fff5fbfe4e8, C=0x0000000111b11830, A=0x0000000111b11bf0, TC=0x0000000112819000, BoundArch=(Data = 0x0000000000000000, Length = 0), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3310
  frame #9: 0x0000000103a1cf11 clang`clang::driver::Driver::BuildJobsForAction(this=0x00007fff5fbfe4e8, C=0x0000000111b11830, A=0x0000000111b11bf0, TC=0x0000000112819000, BoundArch=(Data = 0x0000000000000000, Length = 0), AtTopLevel=true, MultipleArchs=false, LinkingOutput=0x0000000000000000, CachedResults=size=8, TargetDeviceOffloadKind=OFK_None) const at Driver.cpp:3210
  frame #10: 0x0000000103a0a5c2 clang`clang::driver::Driver::BuildJobs(this=0x00007fff5fbfe4e8, C=0x0000000111b11830) const at Driver.cpp:2843
  frame #11: 0x0000000103a00b5c clang`clang::driver::Driver::BuildCompilation(this=0x00007fff5fbfe4e8, ArgList=ArrayRef<const char *> @ 0x00007fff5fbfc1e8) at Driver.cpp:746
  frame #12: 0x0000000100005a52 clang`main(argc_=9, argv_=0x00007fff5fbff670) at driver.cpp:463
  frame #13: 0x00007fffbf260c05 libdyld.dylib`start + 1

Could you please take a look?

Let me know if you need anything else,
Cheers,
Alex

Aug 8 2017, 7:10 AM

Aug 7 2017

gtbercea closed D32035: [OpenMP] Error when trying to offload to an unsupported architecture.
Aug 7 2017, 2:13 PM
gtbercea closed D29904: [OpenMP] Prevent emission of exception handling code when using OpenMP to offload to NVIDIA devices..
Aug 7 2017, 1:59 PM
gtbercea closed D29642: [OpenMP] Make OpenMP generated code for the NVIDIA device relocatable by default.
Aug 7 2017, 1:32 PM
gtbercea closed D29644: [OpenMP] Pass -v to PTXAS if it was passed to the driver..
Aug 7 2017, 1:22 PM
gtbercea closed D29654: [OpenMP] Integrate OpenMP target region cubin into host binary.
Aug 7 2017, 1:02 PM
gtbercea updated the diff for D29654: [OpenMP] Integrate OpenMP target region cubin into host binary.

Add -no-canonical-prefixes to tests.

Aug 7 2017, 12:46 PM
gtbercea closed D34784: [OpenMP] Add flag for specifying the target device architecture for OpenMP device offloading.
Aug 7 2017, 8:39 AM