This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP][libomptarget] Make bitcode library building depend on clang and llvm-linker being available
Needs RevisionPublic

Authored by gtbercea on May 14 2018, 12:57 PM.

Details

Summary

To build the bitcode library with newly built compiler, insert dependencies on Clang and the llvm linker.

Diff Detail

Repository
rOMP OpenMP

Event Timeline

gtbercea created this revision.May 14 2018, 12:57 PM
Hahnfeld requested changes to this revision.May 14 2018, 1:25 PM
Hahnfeld added a subscriber: Hahnfeld.

Nope, I repeatedly expressed concerns about doing this and it was repeatedly discussed, see D14254 around November 21st:

Currently, this will try to use the just-built compiler when building in-tree with LLVM and Clang. It does so by adding a dependence to the clang target which means that the files are recompiled whenever the compiler changes, ie the clang target has a newer modification timestamp.

This revision now requires changes to proceed.May 14 2018, 1:25 PM
grokos requested changes to this revision.May 14 2018, 1:50 PM

I agree with Jonas, the initial version of the nvptx RTL that we tried to upstream was somewhat like what you try to do here and the consensus was that we do not want to have these dependencies.

If you want to use the just-built clang to compile the BC lib, then the proper (albeit longer) solution is to build clang without the BC lib, then re-run cmake with LIBOMPTARGET_NVPTX_ENABLE_BCLIB enabled and LIBOMPTARGET_NVPTX_SELECTED_CUDA_COMPILER pointing to the just-built clang and re-run ninja.

I agree with Jonas, the initial version of the nvptx RTL that we tried to upstream was somewhat like what you try to do here and the consensus was that we do not want to have these dependencies.

If you want to use the just-built clang to compile the BC lib, then the proper (albeit longer) solution is to build clang without the BC lib, then re-run cmake with LIBOMPTARGET_NVPTX_ENABLE_BCLIB enabled and LIBOMPTARGET_NVPTX_SELECTED_CUDA_COMPILER pointing to the just-built clang and re-run ninja.

George, I am well aware of the long-way and that is exactly what I'm trying to avoid.

The just-built compiler is a reasonable default in my opinion. I don't see a downside to it.

gtbercea added a comment.EditedMay 14 2018, 2:12 PM

There is a very good reason why the current compiler is being used for compiling the bc-lib. Any incompatibility in LLVM version between the bclib compiler and the just-built compiler will result in an error when inlining due to different LLVM versions. So the latest build is always the safest default to use, it can never fail.

The just-built compiler is a reasonable default in my opinion. I don't see a downside to it.

I'll try to summarize some points that I said on other patches and that come to my mind:

  1. The just-built compiler doesn't exist by definition when CMake is invoked. This will prevent all kind of dynamic flag checking (Does ${LIBOMPTARGET_NVPTX_SELECTED_CUDA_COMPILER} support -fflag?). This is how I propose to reliably implement a solution for D44992.
  2. See the quote in my previous comment: This patch will result in rebuilding the entire library whenever there is a change in LLVM or Clang. While this doesn't seem to problematic at first, clang is usually the binary that finishes linking last because it's rather large. So the compilation of libomptarget-nvptx-bc can only start after almost all other steps of the build are done. I've found this really annoying in the past.
  3. There is a somewhat weaker problem when looking in the other direction: Let's say you have a fully setup LLVM build, but only want to build libomptarget-nvptx-bc (for whatever reason; let's just assume you are doing some changes in there that you want to test). This will result in the build system compiling the entire compiler before actually starting to build the library. I acknowledge that this will only happen after changes to files in LLVM or Clang, but if you synchronize to the latest changes once a day you can basically sit there and wait for your build to complete at the start of your working day.

There is a very good reason why the current compiler is being used for compiling the bc-lib. Any incompatibility in LLVM version between the bclib compiler and the just-built compiler will result in an error when inlining due to different LLVM versions.

From what I read in the documentation the LLVM bitcode should be compatible with all versions down to LLVM/Clang 3.0. Is there a particular problem that you have in mind?

George, I am well aware of the long-way and that is exactly what I'm trying to avoid.

For production installations I'd say that's the way to go anyway. If you really want to use the just-built compiler during a single run, you should take a look at LLVM's runtimes/ directory which does all that while taking care of most problems. For example, it will only run CMake once clang has been compiled for the first time and the build system will correctly handle incremental builds. However, I'm not sure if it recompiles the libraries after there was a change to the compiler. But there have been some smart people thinking about this (namely Chris Bieneman and Petr Hosek) so I think the build system now probably does "The Right Thing" (tm).

Can we double check this LLVM IR backward compatible thing? My understanding is that it is not that good, at least for the major versions.

So in my mind, we have two scenarios when the bc file is involved.

  1. when we do in-tree build, the runtime bc file should be compiled by the just-built compiler (the runtime should always as new if not newer than the compiler it intends for.)
  2. when we do out-of-tree build, the runtime bc file will be compiled by the building compiler, and hoping the runtime will work with that version (i.e you are using a newer runtime than the building compiler.)
  1. The just-built compiler doesn't exist by definition when CMake is invoked. This will prevent all kind of dynamic flag checking (Does ${LIBOMPTARGET_NVPTX_SELECTED_CUDA_COMPILER} support -fflag?). This is how I propose to reliably implement a solution for D44992.

The compiler not existing can also happen if the user provides a path to a non-existing compiler. Remember, this is a default value which can always be overwritten by the user.
Flag checking will work in the same way as if the user had provided the path to a different compiler.

  1. See the quote in my previous comment: This patch will result in rebuilding the entire library whenever there is a change in LLVM or Clang. While this doesn't seem to problematic at first, clang is usually the binary that finishes linking last because it's rather large. So the compilation of libomptarget-nvptx-bc can only start after almost all other steps of the build are done. I've found this really annoying in the past.

That's right, when Clang is available that's when the bclib is built. There is no problem if the bclib is built at the end of the build process actually. The BCLIB builds instantly so there's no perceptible overhead. Actually it's what the developer wants, if they make changes that affect the CUDA compilation then they want to see them reflected in the BCLIB.

  1. There is a somewhat weaker problem when looking in the other direction: Let's say you have a fully setup LLVM build, but only want to build libomptarget-nvptx-bc (for whatever reason; let's just assume you are doing some changes in there that you want to test). This will result in the build system compiling the entire compiler before actually starting to build the library. I acknowledge that this will only happen after changes to files in LLVM or Clang, but if you synchronize to the latest changes once a day you can basically sit there and wait for your build to complete at the start of your working day.

If you only do changes to the files that are part of the bclib then the compiler will not be rebuilt. If the user updates the current compiler then he/she will have to wait anyway even if the bclib compiler path points to a different clang compiler.

An overall comment which I think applies to all your argument above: the choice of using the just-built compiler is a reasonable default which in an overwhelming number of cases is "what the user wants". This choice can always be overwritten using cmake flags. Those cmake flags are checked first and if a valid clang compiler is found then the building of the bclib will proceed there and then without depending on the just-built compiler. So for all your arguments above, user can always just provide a different clang and it will work as you want.

From what I read in the documentation the LLVM bitcode should be compatible with all versions down to LLVM/Clang 3.0. Is there a particular problem that you have in mind?

I don't think that works, when trying to inline newer LLVM code into older LLVM code I get an error saying that a different LLVM version is used or something along those lines. Since seeing something like that a few months ago I have always used the just-built compiler to make sure. This is a different matter though.

From what I read in the documentation the LLVM bitcode should be compatible with all versions down to LLVM/Clang 3.0. Is there a particular problem that you have in mind?

I don't think that works, when trying to inline newer LLVM code into older LLVM code I get an error saying that a different LLVM version is used or something along those lines. Since seeing something like that a few months ago I have always used the just-built compiler to make sure. This is a different matter though.

I just want to comment this last point, my understanding is that it will not work even you try to inline older LLVM IR into a newer one.

Can we double check this LLVM IR backward compatible thing? My understanding is that it is not that good, at least for the major versions.

I just want to comment this last point, my understanding is that it will not work even you try to inline older LLVM IR into a newer one.

I did just that: I built libomptarget-nvptx-bc with Clang 3.9.1 (which is actually the only released version that you can currently build with) and inlined the result with LLVM/Clang trunk which will become 7.0. It worked without problems for all simple test cases I tried. Do you have a scenario that fails for you?

I don't think that works, when trying to inline newer LLVM code into older LLVM code I get an error saying that a different LLVM version is used or something along those lines.

Well, that's ... obvious, isn't it? How is an older version of LLVM supposed to know what to do with a newer format and "downgrade" that? The other direction is generally no problem: Newer LLVM versions usually auto-upgrade older bitcode files, see above.

Actually, I think that's an argument for my point of view: Building with an older compiler will result in libraries that are compatible with a larger number (more recent versions) of compilers. And by definition, CMAKE_C_COMPILER is older than the just-built compiler.

  1. The just-built compiler doesn't exist by definition when CMake is invoked. This will prevent all kind of dynamic flag checking (Does ${LIBOMPTARGET_NVPTX_SELECTED_CUDA_COMPILER} support -fflag?). This is how I propose to reliably implement a solution for D44992.

The compiler not existing can also happen if the user provides a path to a non-existing compiler. Remember, this is a default value which can always be overwritten by the user.
Flag checking will work in the same way as if the user had provided the path to a different compiler.

If the user supplies junk, we can just error out. That's what CMake will also do for invalid CMAKE_C_COMPILERs and the like.

  1. See the quote in my previous comment: This patch will result in rebuilding the entire library whenever there is a change in LLVM or Clang. While this doesn't seem to problematic at first, clang is usually the binary that finishes linking last because it's rather large. So the compilation of libomptarget-nvptx-bc can only start after almost all other steps of the build are done. I've found this really annoying in the past.

That's right, when Clang is available that's when the bclib is built. There is no problem if the bclib is built at the end of the build process actually. The BCLIB builds instantly so there's no perceptible overhead. Actually it's what the developer wants, if they make changes that affect the CUDA compilation then they want to see them reflected in the BCLIB.

And it's the least thing developers expect if they are hacking on an entirely different region of the compiler...

An overall comment which I think applies to all your argument above: the choice of using the just-built compiler is a reasonable default which in an overwhelming number of cases is "what the user wants". This choice can always be overwritten using cmake flags. Those cmake flags are checked first and if a valid clang compiler is found then the building of the bclib will proceed there and then without depending on the just-built compiler. So for all your arguments above, user can always just provide a different clang and it will work as you want.

My statement would be that the user should just build the openmp repo with the "new" compiler, even though I don't see an advantage (see my points above).

gtbercea added a comment.EditedMay 17 2018, 2:40 PM

I am not at all convinced. I think there are absolutely no good arguments against allowing the just built compiler to build the BCLIB (as a default, this is simply a default which can be overwritten at any time by the user). The drawbacks you mention have no detrimental impact on the quality of life of the compiler developer. Developers improving other parts of the compiler who do not care about OpenMP and device offloading can just disable the building of the BCLIB because it is optional after all. The BCLIB is something that is required for performance not correctness so it can be disabled at any time. Even when BCLIB building is turned on there is no perceptible overhead to it, compared to the time it takes to compile a cpp file somewhere in the source code, the BCLIB builds instantly.

Your argument that an older version is better is actually wrong. Using an older version to build the BCLIB is always a BAD idea. The reason why you want to build the BCLIB is so that the runtime functions actually get inlined. If you use an older compiler the functions will not be inlined which defeats the purpose of having the BCLIB in the first place. I stumbled again upon this problem today after updating to the latest changes.

I do not understand why you are opposing this, so far I haven't heard a single strong argument against this.

I am not at all convinced. I think there are absolutely no good arguments against allowing the just built compiler to build the BCLIB (as a default, this is simply a default which can be overwritten at any time by the user). The drawbacks you mention have no detrimental impact on the quality of life of the compiler developer. Developers improving other parts of the compiler who do not care about OpenMP and device offloading can just disable the building of the BCLIB because it is optional after all. The BCLIB is something that is required for performance not correctness so it can be disabled at any time. Even when BCLIB building is turned on there is no perceptible overhead to it, compared to the time it takes to compile a cpp file somewhere in the source code, the BCLIB builds instantly.

Your argument that an older version is better is actually wrong. Using an older version to build the BCLIB is always a BAD idea. The reason why you want to build the BCLIB is so that the runtime functions actually get inlined. If you use an older compiler the functions will not be inlined which defeats the purpose of having the BCLIB in the first place. I stumbled again upon this problem today after updating to the latest changes.

I do not understand why you are opposing this, so far I haven't heard a single strong argument against this.

I second this opinion. Besides, I think when the runtime is in-tree built, the compiler to compile the bc file HAS be the just-built clang compiler.

After that said, I am not sure if we consider in-tree runtime built for the moment. It sounds we do not? I may missed some earlier discussions. To me, I am not sure how to exactly set the dependence to make that happen. Our old code seems did that and then get changed?

Can we double check this LLVM IR backward compatible thing? My understanding is that it is not that good, at least for the major versions.

I just want to comment this last point, my understanding is that it will not work even you try to inline older LLVM IR into a newer one.

I did just that: I built libomptarget-nvptx-bc with Clang 3.9.1 (which is actually the only released version that you can currently build with) and inlined the result with LLVM/Clang trunk which will become 7.0. It worked without problems for all simple test cases I tried. Do you have a scenario that fails for you?

My understanding is that if things like the data layout is changed between versions, the old code will simply be wrong from the new version's perspective.

Your argument that an older version is better is actually wrong. Using an older version to build the BCLIB is always a BAD idea. The reason why you want to build the BCLIB is so that the runtime functions actually get inlined. If you use an older compiler the functions will not be inlined which defeats the purpose of having the BCLIB in the first place. I stumbled again upon this problem today after updating to the latest changes.

To be precise, I didn't say that "an older version is better". My statement was that it works and that older versions of the compiler generate bitcode files that are compatible with a larger number of (newer) compilers.

But you are right, inlining of old bitcode files broke last month with rC329829. Luckily, this can be fixed, see D47070.

I do not understand why you are opposing this, so far I haven't heard a single strong argument against this.

"strong" is certainly subjective and may imply different things for you and me. If I wanted to put my point to the extreme, I'd say that there is none of your arguments left because I sat down and fixed all the problems you have been describing so far.

Coming back to my opinion: I don't like using the just-built compiler because that's not how CMake usually works when you build a project. The build system takes a single (version of a) compiler that can't change afterwards, determines its abilities through dynamic feature checking and compiles the source code. The last two step can be repeated as often as the project or its prerequisites change.

I second this opinion. Besides, I think when the runtime is in-tree built, the compiler to compile the bc file HAS be the just-built clang compiler.

I don't see why that has to be the case. As I explained throughout the comments, you can build the bitcode library with an older compiler and still inline it with a newer version.

After that said, I am not sure if we consider in-tree runtime built for the moment. It sounds we do not?

You can build the bclib in-tree, it just won't use the just-built compiler but CMAKE_C_COMPILER by default.

I may missed some earlier discussions. To me, I am not sure how to exactly set the dependence to make that happen. Our old code seems did that and then get changed?

I'll happily link to the previous discussion once more: The "support" for using the just-built compiler was removed during review of D14254. CC'ing Hal who chimed in last time, maybe he can add his opinion and convince one of the sides (I'm also fine with being convinced myself if I do still miss an important point!)

As long as cmake allows a part of the build to depend on another part of the build I think there is no problem with this. It just so happens that we have a part of the build depending on the clang compiler. It could have depended on the building of any other random file. I don't see this as something un-cmake-like.

If older versions of the compiler will re-enable inlining in the future, the user is free to point the exposed CMAKE flags to an older version of the compiler. If the user doesn't do anything then they are guaranteed a compiler that will inline correctly. Any other default will not guarantee inlining.

Your argument that an older version is better is actually wrong. Using an older version to build the BCLIB is always a BAD idea. The reason why you want to build the BCLIB is so that the runtime functions actually get inlined. If you use an older compiler the functions will not be inlined which defeats the purpose of having the BCLIB in the first place. I stumbled again upon this problem today after updating to the latest changes.

To be precise, I didn't say that "an older version is better". My statement was that it works and that older versions of the compiler generate bitcode files that are compatible with a larger number of (newer) compilers.

But you are right, inlining of old bitcode files broke last month with rC329829. Luckily, this can be fixed, see D47070.

I do not understand why you are opposing this, so far I haven't heard a single strong argument against this.

"strong" is certainly subjective and may imply different things for you and me. If I wanted to put my point to the extreme, I'd say that there is none of your arguments left because I sat down and fixed all the problems you have been describing so far.

Coming back to my opinion: I don't like using the just-built compiler because that's not how CMake usually works when you build a project.

You're correct, this is not how CMake normally works. However, CMake is also not normally being used to build a compiler. Compiler builds often self host to some degree or another, and the compiler provided to the build system is generally through of only as something necessary to bootstrap the build process. LLVM/Clang, being both a self-hosting compiler and also a library component, has a split personality in this regard. However, in this case, as I believe we do with compiler-rt components, etc., it makes sense to default to using the just-built clang to build the bitcode library (which is intended to be used by that just-built compiler). Honestly, I think we should carefully consider whether we want to officially support a non-version-locked configuration, because if we do, then we need to set parameters on that and test it on the buildbots. My inclination is to officially support on the version-locked configuration between the compiler building the application code, the compiler used to build the bitcode library, and the version of the bitcode library itself.

The build system takes a single (version of a) compiler that can't change afterwards, determines its abilities through dynamic feature checking and compiles the source code. The last two step can be repeated as often as the project or its prerequisites change.

I second this opinion. Besides, I think when the runtime is in-tree built, the compiler to compile the bc file HAS be the just-built clang compiler.

I don't see why that has to be the case. As I explained throughout the comments, you can build the bitcode library with an older compiler and still inline it with a newer version.

I agree, I don't think it must be the case, technically it should work with other compilers. I think that using the just-built clang is a better default, however.

After that said, I am not sure if we consider in-tree runtime built for the moment. It sounds we do not?

You can build the bclib in-tree, it just won't use the just-built compiler but CMAKE_C_COMPILER by default.

I may missed some earlier discussions. To me, I am not sure how to exactly set the dependence to make that happen. Our old code seems did that and then get changed?

I'll happily link to the previous discussion once more: The "support" for using the just-built compiler was removed during review of D14254. CC'ing Hal who chimed in last time, maybe he can add his opinion and convince one of the sides (I'm also fine with being convinced myself if I do still miss an important point!)

Thanks ;)

Can we double check this LLVM IR backward compatible thing? My understanding is that it is not that good, at least for the major versions.

I just want to comment this last point, my understanding is that it will not work even you try to inline older LLVM IR into a newer one.

I did just that: I built libomptarget-nvptx-bc with Clang 3.9.1 (which is actually the only released version that you can currently build with) and inlined the result with LLVM/Clang trunk which will become 7.0. It worked without problems for all simple test cases I tried. Do you have a scenario that fails for you?

My understanding is that if things like the data layout is changed between versions, the old code will simply be wrong from the new version's perspective.

That's correct. If you change the data layout then you can't, in general, combine the IR with other IR with a different data layout in one module.

Your argument that an older version is better is actually wrong. Using an older version to build the BCLIB is always a BAD idea. The reason why you want to build the BCLIB is so that the runtime functions actually get inlined. If you use an older compiler the functions will not be inlined which defeats the purpose of having the BCLIB in the first place. I stumbled again upon this problem today after updating to the latest changes.

To be precise, I didn't say that "an older version is better". My statement was that it works and that older versions of the compiler generate bitcode files that are compatible with a larger number of (newer) compilers.

But you are right, inlining of old bitcode files broke last month with rC329829. Luckily, this can be fixed, see D47070.

I do not understand why you are opposing this, so far I haven't heard a single strong argument against this.

"strong" is certainly subjective and may imply different things for you and me. If I wanted to put my point to the extreme, I'd say that there is none of your arguments left because I sat down and fixed all the problems you have been describing so far.

Coming back to my opinion: I don't like using the just-built compiler because that's not how CMake usually works when you build a project.

You're correct, this is not how CMake normally works. However, CMake is also not normally being used to build a compiler. Compiler builds often self host to some degree or another, and the compiler provided to the build system is generally through of only as something necessary to bootstrap the build process. LLVM/Clang, being both a self-hosting compiler and also a library component, has a split personality in this regard. However, in this case, as I believe we do with compiler-rt components, etc., it makes sense to default to using the just-built clang to build the bitcode library (which is intended to be used by that just-built compiler).

I just had a quick look and compiler-rt indeed uses the just-built compiler to build instrumented flavors of libc++ that can be used together with the sanitizers. To do so they use the same "trick" as LLVM's runtimes/ directory, namely CMake's support for external projects. If used correctly CMake can be taught to only rebuild the bitcode library if one of the library's source files changed and not whenever the compiler is rebuilt. That would address one of my largest pain points, maybe that's a compromise? "correctly" turned out to be a bit harder here but I think that code path is only useful for in-tree builds where we can maybe reuse LLVM's CMake modules?

Honestly, I think we should carefully consider whether we want to officially support a non-version-locked configuration, because if we do, then we need to set parameters on that and test it on the buildbots. My inclination is to officially support on the version-locked configuration between the compiler building the application code, the compiler used to build the bitcode library, and the version of the bitcode library itself.

Good point, we probably should. I would have opted to also allow building with the previous release of Clang, but maybe I can just try to maintain that property on a best effort base. (My use case is that I usually build the openmp repository multiple times in different configurations and being able to do that with the latest release during development of the next version would simply things...)

gtbercea updated this revision to Diff 148416.May 24 2018, 7:57 AM

Update patch on top of latest changes.

Is this good to go?

Hahnfeld requested changes to this revision.Jul 11 2018, 10:59 PM

I'm still opposed to this, as stated in my many comments. If at all, you need to make sure that this library doesn't rebuild whenever you do an unrelated change to Clang.

That's at least my opinion. I'm not sure if somebody can / wants to override me here.

This revision now requires changes to proceed.Jul 11 2018, 10:59 PM

I'm still opposed to this, as stated in my many comments. If at all, you need to make sure that this library doesn't rebuild whenever you do an unrelated change to Clang.

I think that we should do the same thing here as we do with other things that are built with the just-built Clang (to be consistent with everything). By this metric, how does this compare?

That's at least my opinion. I'm not sure if somebody can / wants to override me here.

I'm still opposed to this, as stated in my many comments. If at all, you need to make sure that this library doesn't rebuild whenever you do an unrelated change to Clang.

I think that we should do the same thing here as we do with other things that are built with the just-built Clang (to be consistent with everything). By this metric, how does this compare?

I totally agree, but AFAIK nothing else depends on the just-built Clang except runtimes/ and some libc++ variants for testing.