jhuber6 (Joseph Huber)
User

Projects

User does not belong to any projects.

User Details

User Since
May 4 2020, 11:17 AM (107 w, 4 d)

Recent Activity

Thu, May 26

jhuber6 added inline comments to D124525: [OpenMP][ClangLinkerWrapper] Extending linker wrapper to embed metadata for multi-arch fat binaries.
Thu, May 26, 1:47 PM · Restricted Project, Restricted Project, Restricted Project
jhuber6 committed rG1bae02b77335: [Cuda] Use fallback method to mangle externalized decls if no CUID given (authored by jhuber6).
[Cuda] Use fallback method to mangle externalized decls if no CUID given
Thu, May 26, 6:18 AM · Restricted Project, Restricted Project
jhuber6 closed D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given.
Thu, May 26, 6:18 AM · Restricted Project, Restricted Project

Wed, May 25

jhuber6 updated the diff for D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given.

Add test for #line.

Wed, May 25, 4:22 PM · Restricted Project, Restricted Project
jhuber6 added a comment to D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given.

It would be great to have some compile-time checks for that, if possible. Otherwise it will only manifest at run-time and the end user will have no clue what's going on.

Not sure how we could check it at compile-time, if we knew what it was supposed to be we could just set it properly right?

Wed, May 25, 3:26 PM · Restricted Project, Restricted Project
jhuber6 committed rGb7c8c4d8cf07: [Clang] Introduce `--offload-link` option to perform offload device linking (authored by jhuber6).
[Clang] Introduce `--offload-link` option to perform offload device linking
Wed, May 25, 1:31 PM · Restricted Project, Restricted Project
jhuber6 closed D126398: [Clang] Introduce `--offload-link` option to perform offload device linking.
Wed, May 25, 1:31 PM · Restricted Project, Restricted Project
jhuber6 updated the diff for D126398: [Clang] Introduce `--offload-link` option to perform offload device linking.

Removing -dlink

Wed, May 25, 12:02 PM · Restricted Project, Restricted Project
jhuber6 retitled D126398: [Clang] Introduce `--offload-link` option to perform offload device linking from [Clang] Introduce `-dlink` option to perform offload device linking to [Clang] Introduce `--offload-link` option to perform offload device linking.
Wed, May 25, 12:00 PM · Restricted Project, Restricted Project
jhuber6 added inline comments to D126398: [Clang] Introduce `--offload-link` option to perform offload device linking.
Wed, May 25, 11:59 AM · Restricted Project, Restricted Project
jhuber6 updated the diff for D126398: [Clang] Introduce `--offload-link` option to perform offload device linking.

Changing to use --offload-link and use -dlink as an alias.

Wed, May 25, 11:40 AM · Restricted Project, Restricted Project
jhuber6 retitled D126398: [Clang] Introduce `--offload-link` option to perform offload device linking from [Clang] Introduce `-dl` option to perform offload device linking to [Clang] Introduce `-dlink` option to perform offload device linking.
Wed, May 25, 11:39 AM · Restricted Project, Restricted Project
jhuber6 added a comment to D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given.

How much work would it take to add cuid generation in the new driver, similar to what the old driver does, using the same logic, however imperfect it is? I'd be OK with that as a possibly permanent solution.

Probably wouldn't be too difficult, primarily just setting up the glue since the rest of the infrastructure is in place. I was hoping it would become unnecessary, but it seems like that's not happening. I'm tempted to have OpenMP handle it on its own so we don't need to port this to the OpenMP case; I think we already do something similar there with the kernel names.

Wed, May 25, 11:25 AM · Restricted Project, Restricted Project
jhuber6 added a comment to D126398: [Clang] Introduce `--offload-link` option to perform offload device linking.

Naming, as usual, is hard. I would prefer a more explicit --offload-link which would be in line with other --offload* options we have by now.
-dl is cryptic for the uninitiated and is uncomfortably close to the commonly used -ldl. If it gets mistyped as -ld, it would lead to a legitimate but unrelated error about missing libd. Or it might silently succeed linking with libd without actually doing any device linking.

Yeah, I can see your point, --offload-link definitely works but it would be nice to have something less verbose. Maybe could just use -dlink or something.

Wed, May 25, 11:19 AM · Restricted Project, Restricted Project
jhuber6 added a comment to D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given.

Is this patch in its current form blocking any of your other work? no-cuid approach, even if we figure out how to do it, will likely take some time. Do you need an interim solution until then?

Also, for the OpenMP case, we already pass the host-IR as a dependency for the device compilation. So it would be relatively easy for us to just generate these names on the host and then read them from the IR for the device. The problem is that CUDA / HIP doesn't use this approach so it wouldn't be a great solution to have two different ways to do this. So we would either need to make CUDA / HIP take the host IR and use that, or move OpenMP to use the driver. The benefit of passing the IR is that we can much more stably generate some arbitrary string to mangle these names and we're guaranteed to have them match up because we read them from the host. The downside is that it'd be somewhat of a regression because now we have an extra call to Clang for CUDA / HIP when we previously didn't need to.

Yeah. The different compilation flows are a bit of a problem. So is the closed nature of NVIDIA's binary format, which limits what we can do with them. E.g. we can't currently modify a GPU binary and rename or add new symbols.

I'll need to think about the no-cuid solution. If we can solve it, not deviating from C++ linking would be a valuable benefit and would save us some headaches down the road. Extra clang invocation may be worth it, but it's too early to tell.

Wed, May 25, 10:58 AM · Restricted Project, Restricted Project
jhuber6 committed rG8a1984c25e2c: [Clang][Docs] Document `-Xoffload-linker` flag (authored by jhuber6).
[Clang][Docs] Document `-Xoffload-linker` flag
Wed, May 25, 10:33 AM · Restricted Project, Restricted Project
jhuber6 requested review of D126398: [Clang] Introduce `--offload-link` option to perform offload device linking.
Wed, May 25, 10:31 AM · Restricted Project, Restricted Project
jhuber6 added a comment to D124525: [OpenMP][ClangLinkerWrapper] Extending linker wrapper to embed metadata for multi-arch fat binaries.

Where is the code to change the target registration? We need a new initialization runtime call to use and change the old one to reallocate the __tgt_device_image array. Did you work around that some other way?

Wed, May 25, 9:14 AM · Restricted Project, Restricted Project, Restricted Project
jhuber6 added a comment to D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given.

Also, for the OpenMP case, we already pass the host-IR as a dependency for the device compilation. So it would be relatively easy for us to just generate these names on the host and then read them from the IR for the device. The problem is that CUDA / HIP doesn't use this approach so it wouldn't be a great solution to have two different ways to do this. So we would either need to make CUDA / HIP take the host IR and use that, or move OpenMP to use the driver. The benefit of passing the IR is that we can much more stably generate some arbitrary string to mangle these names and we're guaranteed to have them match up because we read them from the host. The downside is that it'd be somewhat of a regression because now we have an extra call to Clang for CUDA / HIP when we previously didn't need to.

Wed, May 25, 8:20 AM · Restricted Project, Restricted Project

Tue, May 24

jhuber6 added a comment to D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given.

I can't think of a way to generate these new symbols; we'd need to somehow have a list of all the static entries that need new symbols and then modify the object file after it's been made. Not sure if this is possible in general considering the vendor linkers might not behave. I'm definitely open to discussion though, I'd love to have a solution for this.

Tue, May 24, 5:54 PM · Restricted Project, Restricted Project
jhuber6 added a comment to D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given.

I'm still itching to figure out a way to avoid CUID altogether and with the new driver it may be possible.

I would be 100% in favor of working around this if possible, it's proving to be one of the most painful parts of the process.

CUID serves two purposes:
a) avoid name conflicts during device-side linking ("must be globally unique" part)
b) allow host to refer to something in the GPU executable ("stable within TU" part)

My understanding is that we already collect the data about all offloading entities, and that includes those we have to externalize. We also postpone generation of the registration glue to the final linking step.

Yes, we would have all those entries see here. The final linker just gets a pointer to __start_omp_offloading_entries so we can iterate this at runtime.

Let's suppose that we do not externalize those normally-internal symbols. The offloading table would still have entries for them, but there will be no issue with name conflicts during linking, as they do remain internal.

We would also need to make sure that they're used so they don't get optimized out.

During the final linking, if an offloading entity uses a pointer w/o a public symbol, we would be in a position to generate a unique one, using the pointer value in the offload table entry. Linker can just use a free-running counter for the suffix, or could just generate a completely new symbol. It does not matter.

This is the part I'm not sure about, how would we generate new symbols during the linking stage? We can only iterate the offloading entry table after the final linking, which is when we're already supposed to have a fully linked and registered module. We could potentially generate the same kind of table for the device, but I don't think nvlink would perform the same linker magic to merge those entries.

When we generate the host-side registration glue, we'll use the name of that generated symbol.

When we make the registration glue we haven't created the final executable, so I don't think we could modify existing entries, only create new ones.

In the end linking will work exactly as it would for C++ (modulo having offloading tables) and host/device registration will be ensured by telling host side which symbols to use, instead of assuming that we've happened to generate exactly the same unique suffix on both sides.

@yaxunl -- do you see any holes in this approach?

Tue, May 24, 5:12 PM · Restricted Project, Restricted Project
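
The idea quoted in the comment above (generate a fresh symbol at link time for each offloading entry that has no public one, using a free-running counter for the suffix) can be sketched roughly as follows. This is a Python illustration over an assumed in-memory table, not how the linker wrapper or nvlink works. `OffloadEntry`, `assign_symbols`, and the `__offload_entry_` prefix are hypothetical names, and the replies in the thread question whether existing entries can be modified at that stage at all.

```python
from itertools import count


class OffloadEntry:
    """Hypothetical stand-in for one row of the offloading entry table."""

    def __init__(self, address, symbol=None):
        self.address = address  # pointer value stored in the table entry
        self.symbol = symbol    # None when the entity kept internal linkage


def assign_symbols(entries, prefix="__offload_entry_"):
    """Give every entry without a public symbol a unique generated name."""
    taken = {e.symbol for e in entries if e.symbol is not None}
    counter = count()  # free-running counter used for the suffix
    for entry in entries:
        if entry.symbol is not None:
            continue  # already has a public symbol, leave it alone
        name = f"{prefix}{next(counter)}"
        while name in taken:  # skip over names that already exist
            name = f"{prefix}{next(counter)}"
        taken.add(name)
        entry.symbol = name
```

The host-side registration glue would then refer to each entry by `entry.symbol`, rather than assuming both sides derived the same suffix independently.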
jhuber6 added a comment to D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given.

A

Tue, May 24, 3:08 PM · Restricted Project, Restricted Project
jhuber6 added inline comments to D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given.
Tue, May 24, 2:57 PM · Restricted Project, Restricted Project
jhuber6 added inline comments to D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given.
Tue, May 24, 11:29 AM · Restricted Project, Restricted Project
jhuber6 committed rG3723868d9e07: [OpenMP] Fix file arguments for embedding bitcode in the linker wrapper (authored by jhuber6).
[OpenMP] Fix file arguments for embedding bitcode in the linker wrapper
Tue, May 24, 10:46 AM · Restricted Project, Restricted Project
jhuber6 updated the diff for D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given.

Adding an extra comment to mention the hidden requirement that the driver should not define a different -D option for the host and device.

Tue, May 24, 9:47 AM · Restricted Project, Restricted Project
jhuber6 added inline comments to D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given.
Tue, May 24, 9:45 AM · Restricted Project, Restricted Project
jhuber6 added inline comments to D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given.
Tue, May 24, 9:02 AM · Restricted Project, Restricted Project
jhuber6 updated the diff for D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given.

Removing use of the line number, instead replacing it with an 8 character wide hash of the -D options passed to the front-end. This should make it sufficiently unique for users compiling the same file with different options. The format now looks like <var>__<qualifier>__<file-id><device-id>_<hash>.

Tue, May 24, 7:11 AM · Restricted Project, Restricted Project
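
The name format described in the update above can be illustrated with a minimal sketch. It is written in Python for brevity and is not Clang's implementation. The hash function (SHA-256 truncated to 8 hex characters), the way the -D options are joined, and the helper name `externalized_name` are assumptions made for illustration. Only the overall `<var>__<qualifier>__<file-id><device-id>_<hash>` shape comes from the comment.

```python
import hashlib


def externalized_name(var, qualifier, file_id, device_id, defines):
    """Build <var>__<qualifier>__<file-id><device-id>_<hash> (hypothetical helper)."""
    # Assumption: hash the -D options in the order given, joined by NUL bytes.
    joined = "\0".join(defines).encode("utf-8")
    # Assumption: SHA-256 cut to 8 hex characters stands in for the real hash.
    digest = hashlib.sha256(joined).hexdigest()[:8]
    return f"{var}__{qualifier}__{file_id}{device_id}_{digest}"


# Same file, same options: identical names on every run.
a = externalized_name("x", "static", "1a2b", "3c4d", ["-DFOO=1"])
b = externalized_name("x", "static", "1a2b", "3c4d", ["-DFOO=1"])
# Same file, different -D options: only the hash suffix changes.
c = externalized_name("x", "static", "1a2b", "3c4d", ["-DFOO=2"])
```

Under these assumptions `a` and `b` are equal while `c` differs, which matches the stated goal of distinguishing the same file compiled with different options. Compiling the same file twice with identical options would still produce the same name, which is the case where an explicit CUID is the suggested workaround.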
jhuber6 committed rGf37101983fc9: [OpenMP] Add `-Xoffload-linker` to forward input to the device linker (authored by jhuber6).
[OpenMP] Add `-Xoffload-linker` to forward input to the device linker
Tue, May 24, 6:11 AM · Restricted Project, Restricted Project
jhuber6 closed D126226: [OpenMP] Add `-Xoffload-linker` to forward input to the device linker.
Tue, May 24, 6:11 AM · Restricted Project, Restricted Project
jhuber6 added a comment to D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given.

That said, I would consider compiling the same source with different preprocessor options to be a legitimate use case that we should support.
Explicitly passing cuid would work as a workaround in those cases, so it's not a major issue if we can't make it work out of the box without explicit cuid.

Tue, May 24, 5:08 AM · Restricted Project, Restricted Project

Mon, May 23

jhuber6 added inline comments to D126226: [OpenMP] Add `-Xoffload-linker` to forward input to the device linker.
Mon, May 23, 4:53 PM · Restricted Project, Restricted Project
jhuber6 updated the diff for D126226: [OpenMP] Add `-Xoffload-linker` to forward input to the device linker.

Merging into a single argument and checking if the joined arg is empty.

Mon, May 23, 4:01 PM · Restricted Project, Restricted Project
jhuber6 added inline comments to D126226: [OpenMP] Add `-Xoffload-linker` to forward input to the device linker.
Mon, May 23, 3:45 PM · Restricted Project, Restricted Project
jhuber6 committed rT19d530e6aea7: [External][CUDA] Add option to test with the new driver (authored by jhuber6).
[External][CUDA] Add option to test with the new driver
Mon, May 23, 2:59 PM · Restricted Project
jhuber6 closed D126231: [External][CUDA] Add option to test with the new driver.
Mon, May 23, 2:59 PM · Restricted Project
jhuber6 updated the diff for D126226: [OpenMP] Add `-Xoffload-linker` to forward input to the device linker.

Go back to the old joined method and also change the name to remove _EQ.

Mon, May 23, 2:58 PM · Restricted Project, Restricted Project
jhuber6 added a comment to D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given.

This is a moderately serious issue. Some users care about the build reproducibility. Recompiling the same sources and getting different results will trigger all sorts of red flags that would need to be investigated in order to make sure the build is not broken.

I mean this in the context that the following will not work

clang a.c -c -o a-0.o // Has some externalized static variable.
clang a.c -c -o a-1.o
clang a-0.o a-1.o // Redefined symbol error

The build will be perfectly reproducible, the ID we append here is just <var>__static__<file id><device id><line number> which should be the same in a static source tree. Though it might be annoying that the line number may change on white-space changes, so we could do without the line number at the end if that's an issue.

However, this is a very niche use-case and is not supported by Nvidia's CUDA compiler so it is likely to be good enough.

The fact that NVCC didn't always generate the same output was an issue when we were using it for CUDA compilation.
In general, "not supported by NVCC" is not quite applicable here, IMO. The goal here is to make clang work correctly.

I feel like linking a file with itself is pretty uncommon, but in order to support that we'd definitely need the CUID method so we can pass it to both the host and device. I'm personally fine with this and the CUID living together so if for whatever reason there's a symbol clash, the user can specify a CUID to make it go away. We also discussed the problem of non-static source trees which neither this nor the current CUID would solve. As far as I can tell, this method would work fine for like 99.99% of codes, but getting that last 0.01% would require something like generating a UUID for each compilation job, which requires intervention from the driver to set up offloading compilation properly. So I'm not sure if it's the best trade-off.

Mon, May 23, 2:44 PM · Restricted Project, Restricted Project
jhuber6 added a comment to D126226: [OpenMP] Add `-Xoffload-linker` to forward input to the device linker.

We keep running into the same old underlying issue that we do not have a good way to name/reference specific parts of the compilation pipeline. -Xfoo used to work OK for the linear 'standard' compilation pipeline, but these days when compilation grew from a simple linear pipe it's no longer adequate and we need to extend it.

Mon, May 23, 2:16 PM · Restricted Project, Restricted Project
jhuber6 added inline comments to D126231: [External][CUDA] Add option to test with the new driver.
Mon, May 23, 2:11 PM · Restricted Project
jhuber6 updated the diff for D126226: [OpenMP] Add `-Xoffload-linker` to forward input to the device linker.

Updating to use @MaskRay's suggestion.

Mon, May 23, 1:55 PM · Restricted Project, Restricted Project
jhuber6 added a comment to D126226: [OpenMP] Add `-Xoffload-linker` to forward input to the device linker.

It's better to avoid JoinedAndSeparate for new options. It is for --xxx val and --xxxval but not intended for the option this patch will add.

Mon, May 23, 12:56 PM · Restricted Project, Restricted Project
jhuber6 updated the summary of D126226: [OpenMP] Add `-Xoffload-linker` to forward input to the device linker.
Mon, May 23, 12:30 PM · Restricted Project, Restricted Project
jhuber6 updated the diff for D126226: [OpenMP] Add `-Xoffload-linker` to forward input to the device linker.

Changing the -Xoffload-linker= to -Xoffload-linker-.

Mon, May 23, 12:30 PM · Restricted Project, Restricted Project
jhuber6 added a comment to D126226: [OpenMP] Add `-Xoffload-linker` to forward input to the device linker.

You do not need to hardcode it. The idea of JoinedAndSeparate is that an option foo accepts two arguments, one glued to it and another following after a whitespace.
So, when you define an option -Xoffload-linker-, and then pass -Xoffload-linker-nvptx64=foo, you will get OPT_offload-linker__ with two arguments. As an example see implementation of plugin_arg which deals with the same kind of problem of passing arguments to an open-ended set of plugins.

Mon, May 23, 12:29 PM · Restricted Project, Restricted Project
jhuber6 updated the summary of D126231: [External][CUDA] Add option to test with the new driver.
Mon, May 23, 12:05 PM · Restricted Project
jhuber6 requested review of D126231: [External][CUDA] Add option to test with the new driver.
Mon, May 23, 11:53 AM · Restricted Project
jhuber6 added a comment to D126226: [OpenMP] Add `-Xoffload-linker` to forward input to the device linker.

-Xoffload-linker=<triple> <arg>

The syntax is confusing. Normally only triple would be the argument for -Xoffload-linker option.
Having both -Xoffload-linker and -Xoffload-linker= variants also looks odd to me.

In effect you're making -Xoffload-linker=foo a full option (as opposed to it being an option -Xoffload-linker= + argument foo) with a separate argument that follows. I guess that might work, but it's a rather unconventional use of command line parser, IMO.

I think the main issue with this approach is that it makes the command line hard to understand. When one sees -Xsomething=a -b it's impossible to tell whether -b is a regular option or something to be passed to -Xsomething=a. My assumption would be the former as -Xsomething= already got its argument a and should have no business grabbing the next one.

I think it would work better if the option could use a - or _ for the variant that passes the triple. E.g. -Xoffload-linker-nvptx64=-foo or -Xoffload-linker-nvptx64 -foo would be easily interpretable.

Mon, May 23, 11:50 AM · Restricted Project, Restricted Project
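
The two spellings suggested at the end of the comment above could be told apart along the following lines. This is a rough Python sketch of the idea only, not Clang's option parser (which is table-driven). `split_offload_linker_args` is a hypothetical helper, and the handling of the plain `-Xoffload-linker <arg>` form is an assumption based on the revision title.

```python
PREFIX = "-Xoffload-linker"


def split_offload_linker_args(argv):
    """Return (forwarded, rest); forwarded holds (triple_or_None, arg) pairs."""
    forwarded, rest = [], []
    i = 0
    while i < len(argv):
        tok = argv[i]
        if tok == PREFIX or tok.startswith(PREFIX + "-"):
            # Text after "-Xoffload-linker-"; empty for the plain option.
            suffix = tok[len(PREFIX) + 1:]
            if "=" in suffix:
                # Joined form: -Xoffload-linker-<triple>=<arg>
                triple, arg = suffix.split("=", 1)
            else:
                # Separate form: -Xoffload-linker[-<triple>] <arg>
                if i + 1 >= len(argv):
                    raise ValueError(f"missing argument after {tok}")
                triple, arg = suffix, argv[i + 1]
                i += 1  # the next token was consumed as the argument
            forwarded.append((triple or None, arg))
        else:
            rest.append(tok)
        i += 1
    return forwarded, rest
```

Because the triple is part of the option name, a reader can tell from the command line alone that the token following `-Xoffload-linker-nvptx64` belongs to it, which is the readability concern raised in the comment.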
jhuber6 requested review of D126226: [OpenMP] Add `-Xoffload-linker` to forward input to the device linker.
Mon, May 23, 10:44 AM · Restricted Project, Restricted Project
jhuber6 added a comment to D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given.

Ping

Mon, May 23, 7:36 AM · Restricted Project, Restricted Project

Fri, May 20

jhuber6 committed rG20ec4161d7c9: [Libomptarget] Add branch prediction intrinsic to state check (authored by jhuber6).
[Libomptarget] Add branch prediction intrinsic to state check
Fri, May 20, 12:39 PM · Restricted Project, Restricted Project

Thu, May 19

jhuber6 committed rGeda4ef3add4d: [Libomptarget] Add `leaf` attribute to `vprintf` declaration (authored by jhuber6).
[Libomptarget] Add `leaf` attribute to `vprintf` declaration
Thu, May 19, 11:23 AM · Restricted Project, Restricted Project
jhuber6 added inline comments to D125937: [NVVM] Update intrinsic defintions to include the `nocallback` attribute.
Thu, May 19, 11:03 AM · Restricted Project, Restricted Project
jhuber6 committed rGdbffa4073cf8: [NVVM] Update intrinsic defintions to include the `nocallback` attribute (authored by jhuber6).
[NVVM] Update intrinsic defintions to include the `nocallback` attribute
Thu, May 19, 9:30 AM · Restricted Project, Restricted Project
jhuber6 closed D125937: [NVVM] Update intrinsic defintions to include the `nocallback` attribute.
Thu, May 19, 9:30 AM · Restricted Project, Restricted Project

Wed, May 18

jhuber6 requested review of D125937: [NVVM] Update intrinsic defintions to include the `nocallback` attribute.
Wed, May 18, 4:40 PM · Restricted Project, Restricted Project
jhuber6 requested review of D125904: [Cuda] Use fallback method to mangle externalized decls if no CUID given.
Wed, May 18, 9:49 AM · Restricted Project, Restricted Project

Tue, May 17

jhuber6 added a comment to D125705: [OpenMP] Don't build the offloading driver without a source input.

HIP toolchain allows clang driver to compile bundled bitcode or assembly for mixed host/device compilation or device-only multi-GPU compilation.

e.g.

clang --offload-arch=gfx906 --offload-arch=gfx908 a.bc b.s

Can you add a test to make sure this does not break HIP toolchain? Thanks.

Tue, May 17, 9:06 AM · Restricted Project, Restricted Project

Mon, May 16

jhuber6 committed rGb653b409ff44: [OpenMP] Don't build the offloading driver without a source input (authored by jhuber6).
[OpenMP] Don't build the offloading driver without a source input
Mon, May 16, 3:19 PM · Restricted Project, Restricted Project
jhuber6 closed D125705: [OpenMP] Don't build the offloading driver without a source input.
Mon, May 16, 3:19 PM · Restricted Project, Restricted Project
jhuber6 committed rG5ffecd28c9fb: [Libomptarget] Don't build the device runtime without a new Clang (authored by jhuber6).
[Libomptarget] Don't build the device runtime without a new Clang
Mon, May 16, 3:19 PM · Restricted Project, Restricted Project
jhuber6 closed D125698: [Libomptarget] Don't build the device runtime without a new Clang.
Mon, May 16, 3:18 PM · Restricted Project, Restricted Project
jhuber6 added inline comments to D125698: [Libomptarget] Don't build the device runtime without a new Clang.
Mon, May 16, 12:09 PM · Restricted Project, Restricted Project
jhuber6 added a comment to D125698: [Libomptarget] Don't build the device runtime without a new Clang.

@ye-luo Can I land this now?

Mon, May 16, 11:11 AM · Restricted Project, Restricted Project
jhuber6 updated the diff for D125698: [Libomptarget] Don't build the device runtime without a new Clang.

Remove unrelated change.

Mon, May 16, 10:21 AM · Restricted Project, Restricted Project
jhuber6 requested review of D125705: [OpenMP] Don't build the offloading driver without a source input.
Mon, May 16, 10:19 AM · Restricted Project, Restricted Project
jhuber6 added a comment to D125698: [Libomptarget] Don't build the device runtime without a new Clang.

Are the llvm version variables defined for stand-alone builds? So, will stand-alone builds of OpenMP continue to work and include libomptarget code?

Does the compiler check the version of the bc files for compatibility during compilation of an offloading application and fail with a compile time error?

Mon, May 16, 9:55 AM · Restricted Project, Restricted Project
jhuber6 updated the diff for D125698: [Libomptarget] Don't build the device runtime without a new Clang.

Changing the guard to only be for the static library.

Mon, May 16, 9:46 AM · Restricted Project, Restricted Project
jhuber6 added a comment to D125698: [Libomptarget] Don't build the device runtime without a new Clang.

The old method actually works best in the current state. You thought the runtime route is better, more robust and correct. That may be true for release but not main. I encountered clang from main miscompiling libomptarget host code, which resulted in de-referencing nullptr. So right now LLVM_ENABLE_PROJECTS works best because only the DeviceLib is built by just-built Clang. This routine also doesn't require building Clang as a separate step.

Mon, May 16, 9:40 AM · Restricted Project, Restricted Project
jhuber6 added a comment to D125698: [Libomptarget] Don't build the device runtime without a new Clang.

LGTM.

We are using CMake's command to build the device runtime now. There is no way to tell CMake to use another compiler rather than CMake's compilers to build a specific target. As a result, we either disable it, or we are doomed.
Well, accurately speaking, there seems to be a way to do that, see https://stackoverflow.com/questions/27168094/cmake-how-to-change-compiler-for-individual-target, but I don't think manipulating CMake's internal variables is a good practice.

I didn't say to work around CMake. I was saying that it used to work with clang building DeviceRTL and gcc building all the rest, and now it is broken. You can try it out yourself with the commit before D125315.

I'll just guard the new code then. Even though it shouldn't break if you're using the standard builds it will probably make it less noisy for people currently using the other methods.

Eventually we will abandon the old method, no?

Mon, May 16, 9:32 AM · Restricted Project, Restricted Project
jhuber6 added a comment to D125698: [Libomptarget] Don't build the device runtime without a new Clang.

LGTM.

We are using CMake's command to build the device runtime now. There is no way to tell CMake to use another compiler rather than CMake's compilers to build a specific target. As a result, we either disable it, or we are doomed.
Well, accurately speaking, there seems to be a way to do that, see https://stackoverflow.com/questions/27168094/cmake-how-to-change-compiler-for-individual-target, but I don't think manipulating CMake's internal variables is a good practice.

I didn't say to work around CMake. I was saying that it used to work with clang building DeviceRTL and gcc building all the rest, and now it is broken. You can try it out yourself with the commit before D125315.

Mon, May 16, 9:26 AM · Restricted Project, Restricted Project
jhuber6 added a comment to D125698: [Libomptarget] Don't build the device runtime without a new Clang.
This is either done with a two-step build, where OpenMP is built with
the Clang that was just installed, or through the
-DLLVM_ENABLE_RUNTIMES=openmp option. This has always been the case,

This is not true. Even before D125315 breakage, all the host libraries can be built with GCC and DeviceRTL with just-built Clang via -DLLVM_ENABLE_PROJECTS=openmp
Maybe we should figure out why D125315 doesn't pick up clang when building DeviceRTL.

Mon, May 16, 9:14 AM · Restricted Project, Restricted Project
jhuber6 added a comment to D125698: [Libomptarget] Don't build the device runtime without a new Clang.

This is different from what I'm looking for. DeviceRTL should still be built as it was. Only the part added by D125315 should be disabled.

Mon, May 16, 9:03 AM · Restricted Project, Restricted Project
jhuber6 requested review of D125698: [Libomptarget] Don't build the device runtime without a new Clang.
Mon, May 16, 8:55 AM · Restricted Project, Restricted Project
jhuber6 added a comment to D125315: [Libomptarget] Build the device runtime as a static library.

This change has broken the build of libomp with gcc, as it appears to be using clang-specific options:

g++: error: unrecognized command-line option '-Xopenmp-target=nvptx64-nvidia-cuda'
g++: error: unrecognized command-line option '--cuda-feature=+ptx61'
g++: error: unrecognized command-line option '-nocudalib'; did you mean '-nostdlib'?
g++: error: unrecognized command-line option '-nogpulib'
g++: error: unrecognized command-line option '--offload-arch=sm_35'
g++: error: unrecognized command-line option '--offload-arch=sm_37'
g++: error: unrecognized command-line option '--offload-arch=sm_50'
g++: error: unrecognized command-line option '--offload-arch=sm_52'
g++: error: unrecognized command-line option '--offload-arch=sm_53'
g++: error: unrecognized command-line option '--offload-arch=sm_60'; did you mean '--offload-abi=lp64'?
g++: error: unrecognized command-line option '--offload-arch=sm_61'; did you mean '--offload-abi=lp64'?
g++: error: unrecognized command-line option '--offload-arch=sm_62'
g++: error: unrecognized command-line option '--offload-arch=sm_70'
g++: error: unrecognized command-line option '--offload-arch=sm_72'
g++: error: unrecognized command-line option '--offload-arch=sm_75'
g++: error: unrecognized command-line option '--offload-arch=sm_80'
g++: error: unrecognized command-line option '--offload-arch=sm_86'
g++: error: unrecognized command-line option '--offload-arch=gfx700'
g++: error: unrecognized command-line option '--offload-arch=gfx701'
g++: error: unrecognized command-line option '--offload-arch=gfx801'
g++: error: unrecognized command-line option '--offload-arch=gfx803'
g++: error: unrecognized command-line option '--offload-arch=gfx900'
g++: error: unrecognized command-line option '--offload-arch=gfx902'
g++: error: unrecognized command-line option '--offload-arch=gfx906'
g++: error: unrecognized command-line option '--offload-arch=gfx908'
g++: error: unrecognized command-line option '--offload-arch=gfx90a'
g++: error: unrecognized command-line option '--offload-arch=gfx90c'
g++: error: unrecognized command-line option '--offload-arch=gfx940'
g++: error: unrecognized command-line option '--offload-arch=gfx1010'
g++: error: unrecognized command-line option '--offload-arch=gfx1030'
g++: error: unrecognized command-line option '--offload-arch=gfx1031'
g++: error: unrecognized command-line option '--offload-arch=gfx1032'
g++: error: unrecognized command-line option '--offload-arch=gfx1033'
g++: error: unrecognized command-line option '--offload-arch=gfx1034'
g++: error: unrecognized command-line option '--offload-arch=gfx1035'
g++: error: unrecognized command-line option '--offload-arch=gfx1036'

The OpenMP offloading runtime was always intended to be built with a newly-built Clang, either through a two-phase build or with -DLLVM_ENABLE_RUNTIMES=openmp. This patch added some code that uses standard CMake to build the library, rather than locating the clang binary directly. I could add some code to skip building this static library, or the device runtime entirely, if the compiler isn't an up-to-date Clang. I'm not sure what the best solution to this is, since we always required that this was to be built with Clang, this patch just made it a more strict requirement.

Mon, May 16, 4:52 AM · Restricted Project, Restricted Project

Sat, May 14

jhuber6 added a comment to D125256: [OpenMP] Add `__CUDA_ARCH__` definition when offloading with OpenMP.

@jhuber6 I think this or one of your other OpenMP commits has caused the Driver/cuda-openmp-driver.cu test failure here: https://lab.llvm.org/buildbot/#/builders/214/builds/1274/steps/6/logs/stdio

Sat, May 14, 8:48 AM · Restricted Project, Restricted Project

Fri, May 13

jhuber6 committed rG4205f4aba4af: [Cuda] Add the features using the last argument (authored by jhuber6).
Fri, May 13, 3:05 PM · Restricted Project, Restricted Project
jhuber6 committed rG54e02179b33f: [Libomptarget] Build the static library without CUDA installed (authored by jhuber6).
Fri, May 13, 1:31 PM · Restricted Project, Restricted Project
jhuber6 committed rG7dc23abbd3b2: [CUDA] Add a flag to manually specify the target feature to use with CUDA (authored by jhuber6).
Fri, May 13, 1:31 PM · Restricted Project, Restricted Project
jhuber6 committed rG4638ae3a8575: [OpenMP] Use the new OpenMP device static library when doing LTO (authored by jhuber6).
Fri, May 13, 11:40 AM · Restricted Project, Restricted Project
jhuber6 committed rG16b7a0b43b38: [Libomptarget] Build the device runtime as a static library (authored by jhuber6).
Fri, May 13, 11:40 AM · Restricted Project, Restricted Project
jhuber6 committed rG9ffa945c401c: [Libomptarget] Remove global include directory from libomptarget (authored by jhuber6).
Fri, May 13, 11:40 AM · Restricted Project, Restricted Project
jhuber6 committed rGaf757f89806e: [OpenMP] Don't set device runtime debugging flags if using '-nogpulib' (authored by jhuber6).
Fri, May 13, 11:40 AM · Restricted Project, Restricted Project
jhuber6 closed D125333: [OpenMP] Use the new OpenMP device static library when doing LTO.
Fri, May 13, 11:40 AM · Restricted Project, Restricted Project
jhuber6 committed rG5189f634a113: [OpenMP] Don't include the device wrappers if -nostdinc is used (authored by jhuber6).
Fri, May 13, 11:40 AM · Restricted Project, Restricted Project
jhuber6 committed rG002a63f937d9: [OpenMP] Add `__CUDA_ARCH__` definition when offloading with OpenMP (authored by jhuber6).
Fri, May 13, 11:40 AM · Restricted Project, Restricted Project
jhuber6 closed D125315: [Libomptarget] Build the device runtime as a static library.
Fri, May 13, 11:39 AM · Restricted Project, Restricted Project
jhuber6 committed rGce0caf41bdd4: [Libomptarget] Address existing warnings in the device runtime library (authored by jhuber6).
Fri, May 13, 11:39 AM · Restricted Project, Restricted Project
jhuber6 closed D125563: [Libomptarget] Remove global include directory from libomptarget.
Fri, May 13, 11:39 AM · Restricted Project, Restricted Project
jhuber6 closed D125314: [OpenMP] Don't set device runtime debugging flags if using '-nogpulib'.
Fri, May 13, 11:39 AM · Restricted Project, Restricted Project
jhuber6 committed rGb4f8443d97ba: [Libomptarget] Allow the device runtime to be compiled for the host (authored by jhuber6).
Fri, May 13, 11:39 AM · Restricted Project, Restricted Project
jhuber6 closed D125265: [OpenMP] Don't include the device wrappers if -nostdinc is used.
Fri, May 13, 11:39 AM · Restricted Project, Restricted Project
jhuber6 closed D125256: [OpenMP] Add `__CUDA_ARCH__` definition when offloading with OpenMP.
Fri, May 13, 11:39 AM · Restricted Project, Restricted Project
jhuber6 closed D125339: [Libomptarget] Address existing warnings in the device runtime library.
Fri, May 13, 11:39 AM · Restricted Project, Restricted Project
jhuber6 closed D125260: [Libomptarget] Allow the device runtime to be compiled for the host.
Fri, May 13, 11:39 AM · Restricted Project, Restricted Project
jhuber6 updated the diff for D125315: [Libomptarget] Build the device runtime as a static library.

Rebase to not change the include directories.

Fri, May 13, 10:23 AM · Restricted Project, Restricted Project
jhuber6 updated the summary of D125315: [Libomptarget] Build the device runtime as a static library.
Fri, May 13, 10:21 AM · Restricted Project, Restricted Project
jhuber6 updated the diff for D125563: [Libomptarget] Remove global include directory from libomptarget.

Changing to be directly included by each plugin rather than inherited from elf_common.

Fri, May 13, 10:20 AM · Restricted Project, Restricted Project
jhuber6 requested review of D125563: [Libomptarget] Remove global include directory from libomptarget.
Fri, May 13, 10:09 AM · Restricted Project, Restricted Project