Page MenuHomePhabricator

[CMake] Re-order runtimes in the order of dependencies
Needs RevisionPublic

Authored by phosek on Oct 10 2019, 2:00 PM.

Details

Summary

This allows runtimes to have checks like if(TARGET ${runtime}).

Diff Detail

Event Timeline

phosek created this revision.Oct 10 2019, 2:00 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 10 2019, 2:00 PM

This is an alternative to D68791 which is trying to address the issue introduced in r374116.

phosek updated this revision to Diff 224533.Oct 10 2019, 7:47 PM
phosek added a reviewer: smeenai.
smeenai accepted this revision.Oct 10 2019, 7:59 PM

Makes sense to me, particularly since this is generalizing what we're already doing for compiler-rt.

Is compiler-rt's place in the dependency list accurate though? For example, libc++ needs a builtins library, which could be compiler-rt, and I think the compiler-rt to libc++ dependency is limited to things like libFuzzer. I don't think libc++ supports using an in-tree compiler-rt though, so that's probably a moot point.

llvm/runtimes/CMakeLists.txt
113

depend on -> depends on

This revision is now accepted and ready to land.Oct 10 2019, 7:59 PM
beanz added a comment.Oct 10 2019, 8:08 PM

Two thoughts.

(1) Short term I think this is wrong, in general we should avoid using if(TARGET...). Compiler-rt has the most ridiculous magic in it for which targets are enabled or not, which is why it was put first. That allows the other runtimes to check against compiler-rt which was harder to replicate logic for which sanitizers were included in the build. The other runtime libraries should be able to use the HAVE_${runtime} variables to determine the presence of the other runtimes. That should correctly allow libcxx, libcxxabi and libunwind to not be order dependent.
(2) Longer term, we really need to update the runtimes to make their dependencies based on generator expressions so that we can remove at least some of the order dependence.

beanz requested changes to this revision.Oct 10 2019, 8:09 PM
This revision now requires changes to proceed.Oct 10 2019, 8:09 PM

I feel like Im not understanding something. The point of CMake is that it tracks dependencies. We are we manually tracking dependencies?

The MSAN check was needed due to the inverted dependencies; with unified builds, this shouldnt be an issue any longer. I think that propagating the checks this way is the wrong approach.

Can you please explain why you need the if(TARGET runtime) in the first place? Understanding that might help with coming up with a better solution (or possibly an existing solution).

See https://reviews.llvm.org/D68791 for an example where this came up. I'm open to other solutions, but I think what's in that change (which is what we were using until r374116 when it was accidentally removed) isn't great either.

ldionne accepted this revision.Oct 11 2019, 9:36 AM

CMake tracks dependencies between targets, but not between directories. If the CMakeLists.txt in some directory (e.g. <root>/libcxx/) needs a target defined in another directory (e.g. <root>/libcxxabi/), one has to make sure that libcxxabi's CMakeLists.txt is included before libcxx's CMakeLists.txt. This isn't new or vexing, IMO. If that is what this patch ensures (I don't know the runtimes build very well), I think this is good.

[...] in general we should avoid using if(TARGET...).

I'd like to question that affirmation. What is it based on?

The other runtime libraries should be able to use the HAVE_${runtime} variables to determine the presence of the other runtimes

IMO, the HAVE_${runtime} variables are the weird ones here. The normal LLVM monorepo build orders the directories correctly, and we don't run into that issue. I don't see how HAVE_${runtime} can get around things like being able to query properties of e.g. cxxabi_shared inside the libcxx build before cxxabi_shared has been defined. I do support a push towards using generator expressions more, but I don't think generator expressions are a complete solution to this problem.

I'd like to see this patch go in under some form so that we can remove the hacky workaround introduced in https://reviews.llvm.org/D68791.

CMake tracks dependencies between targets, but not between directories. If the CMakeLists.txt in some directory (e.g. <root>/libcxx/) needs a target defined in another directory (e.g. <root>/libcxxabi/), one has to make sure that libcxxabi's CMakeLists.txt is included before libcxx's CMakeLists.txt. This isn't new or vexing, IMO. If that is what this patch ensures (I don't know the runtimes build very well), I think this is good.

The problem is that libcxx depends on compiler-rt, and compiler-rt depends on libcxx. Which means we have circular dependencies. The solution is to migrate to CMake usage patterns that don't require strict ordering. Specifically the use of generator expressions in this (and many other situations) is warranted.

I'd like to question that affirmation. What is it based on?

The fact that we have circular dependencies at the project level (not the target level). The if (TARGET ...) feature in CMake requires that targets are processed in specific orders and cannot deal with projects that have circular dependency relationships.

IMO, the HAVE_${runtime} variables are the weird ones here. The normal LLVM monorepo build orders the directories correctly, and we don't run into that issue.

I'm not sure what you mean by this. What I *think* you are referring to is specifying runtime libraries in LLVM_ENABLE_PROJECTS which uses the llvm/projects subdirectory for building them. This build flow is also part of the monorepo and uses LLVM_ENABLE_RUNTIMES. Fundamentally LLVM_ENABLE_PROJECTS is the incorrect way to build runtime libraries because they are not built with the in-tree compiler. While this works for most development cases it is 100% wrong for building and shipping toolchains, and it is questionable practice to have developers building and testing things that aren't representative of what we ship.

I don't see how HAVE_${runtime} can get around things like being able to query properties of e.g. cxxabi_shared inside the libcxx build before cxxabi_shared has been defined. I do support a push towards using generator expressions more, but I don't think generator expressions are a complete solution to this problem.

What do you think that generator expressions can't do that is relevant to this problem?

I'd like to see this patch go in under some form so that we can remove the hacky workaround introduced in https://reviews.llvm.org/D68791.

You can also remove that hacky workaround using generator expressions. Replacing one hack with another isn't exactly a meaningful transformation of the codebase.

CMake tracks dependencies between targets, but not between directories. If the CMakeLists.txt in some directory (e.g. <root>/libcxx/) needs a target defined in another directory (e.g. <root>/libcxxabi/), one has to make sure that libcxxabi's CMakeLists.txt is included before libcxx's CMakeLists.txt. This isn't new or vexing, IMO. If that is what this patch ensures (I don't know the runtimes build very well), I think this is good.

[...]
I'm not sure what you mean by this. What I *think* you are referring to is specifying runtime libraries in LLVM_ENABLE_PROJECTS which uses the llvm/projects subdirectory for building them.

Yes, precisely. That's also the currently preferred way of building libc++: https://libcxx.llvm.org/docs/BuildingLibcxx.html

This build flow is also part of the monorepo and uses LLVM_ENABLE_RUNTIMES. Fundamentally LLVM_ENABLE_PROJECTS is the incorrect way to build runtime libraries because they are not built with the in-tree compiler. While this works for most development cases it is 100% wrong for building and shipping toolchains, and it is questionable practice to have developers building and testing things that aren't representative of what we ship.

Not everybody ships libc++/libc++abi as part of the toolchain, and for those, building with whatever CMAKE_CXX_COMPILER they specify is really the right thing to do. Don't get me wrong, I'm 100% on board that there's value in having this runtime build, however let's not pretend that it's the only correct way to build libc++.

I don't see how HAVE_${runtime} can get around things like being able to query properties of e.g. cxxabi_shared inside the libcxx build before cxxabi_shared has been defined. I do support a push towards using generator expressions more, but I don't think generator expressions are a complete solution to this problem.

What do you think that generator expressions can't do that is relevant to this problem?

Say I need to generate a file based on the properties of a target. I'll need to call get_target_property on a target that hasn't been defined yet, and there's no way around that because file(GENERATE) does not expand generator expressions.

I'd like to see this patch go in under some form so that we can remove the hacky workaround introduced in https://reviews.llvm.org/D68791.

You can also remove that hacky workaround using generator expressions.

Are you thinking about this?

set(libname "$<IF:$<TARGET_EXISTS:${lib}>,$<TARGET_PROPERTY:${lib},OUTPUT_NAME>,${lib}>")
 list(APPEND link_libraries "${CMAKE_LINK_LIBRARY_FLAG}${libname}")

That is clever, I had not thought about it.

If the above workaround works, I don't care about this patch that much. I still think we need to clarify the status of the Runtimes build and document it, unless that's already done and I've missed it (in which case please point it to me).

beanz added a comment.Oct 11 2019, 1:57 PM

Yes, precisely. That's also the currently preferred way of building libc++: https://libcxx.llvm.org/docs/BuildingLibcxx.html

That documentation is more than a bit lacking, as is much of the LLVM documentation.

Not everybody ships libc++/libc++abi as part of the toolchain, and for those, building with whatever CMAKE_CXX_COMPILER they specify is really the right thing to do.

While this is true, in many instances libc++ even when libc++ isn't shipped with a toolchain it is locked to one. Darwin is a prime example of this. On Darwin libc++ is shipped as part of the OS, but that cycle is closely coordinated with the toolchain updates and the two are usually kept in sync.

Don't get me wrong, I'm 100% on board that there's value in having this runtime build, however let's not pretend that it's the only correct way to build libc++.

I would argue if you don't ship libc++ as part of the toolchain you shouldn't build it as part of the toolchain either. In which case the standalone build configuration is the correct way to build it. My intention isn't to say building libcxx as a runtime is the only correct way to build libc++, my intention is to state that *if* you are building libc++ with the toolchain, building it as a runtime is the only correct way to build it.

Say I need to generate a file based on the properties of a target. I'll need to call get_target_property on a target that hasn't been defined yet, and there's no way around that because file(GENERATE) does not expand generator expressions.

Is this something you need to do? If so I'd question higher-level decisions about how the build is structured.

Are you thinking about this?

set(libname "$<IF:$<TARGET_EXISTS:${lib}>,$<TARGET_PROPERTY:${lib},OUTPUT_NAME>,${lib}>")
 list(APPEND link_libraries "${CMAKE_LINK_LIBRARY_FLAG}${libname}")

That is clever, I had not thought about it.

If the above workaround works, I don't care about this patch that much. I still think we need to clarify the status of the Runtimes build and document it, unless that's already done and I've missed it (in which case please point it to me).

That is a much better approach. CMake 3.11 is when the TARGET_EXISTS generator expression was added, although the documentation wasn't updated until CMake 3.15, so this change would require a CMake version update, which I don't think is unreasonable.

Yes, precisely. That's also the currently preferred way of building libc++: https://libcxx.llvm.org/docs/BuildingLibcxx.html

That documentation is more than a bit lacking, as is much of the LLVM documentation.

Not everybody ships libc++/libc++abi as part of the toolchain, and for those, building with whatever CMAKE_CXX_COMPILER they specify is really the right thing to do.

While this is true, in many instances libc++ even when libc++ isn't shipped with a toolchain it is locked to one. Darwin is a prime example of this. On Darwin libc++ is shipped as part of the OS, but that cycle is closely coordinated with the toolchain updates and the two are usually kept in sync.

Don't get me wrong, I'm 100% on board that there's value in having this runtime build, however let's not pretend that it's the only correct way to build libc++.

I would argue if you don't ship libc++ as part of the toolchain you shouldn't build it as part of the toolchain either. In which case the standalone build configuration is the correct way to build it. My intention isn't to say building libcxx as a runtime is the only correct way to build libc++, my intention is to state that *if* you are building libc++ with the toolchain, building it as a runtime is the only correct way to build it.

Say I need to generate a file based on the properties of a target. I'll need to call get_target_property on a target that hasn't been defined yet, and there's no way around that because file(GENERATE) does not expand generator expressions.

Is this something you need to do? If so I'd question higher-level decisions about how the build is structured.

Are you thinking about this?

set(libname "$<IF:$<TARGET_EXISTS:${lib}>,$<TARGET_PROPERTY:${lib},OUTPUT_NAME>,${lib}>")
 list(APPEND link_libraries "${CMAKE_LINK_LIBRARY_FLAG}${libname}")

That is clever, I had not thought about it.

If the above workaround works, I don't care about this patch that much. I still think we need to clarify the status of the Runtimes build and document it, unless that's already done and I've missed it (in which case please point it to me).

That is a much better approach. CMake 3.11 is when the TARGET_EXISTS generator expression was added, although the documentation wasn't updated until CMake 3.15, so this change would require a CMake version update, which I don't think is unreasonable.

I've tested D68880 which requires newer CMake but it does seem to be working. Shall we start the discussion about bumping the minimum CMake version requirement on llvm-dev?

That is a much better approach. CMake 3.11 is when the TARGET_EXISTS generator expression was added, although the documentation wasn't updated until CMake 3.15, so this change would require a CMake version update, which I don't think is unreasonable.

I've tested D68880 which requires newer CMake but it does seem to be working. Shall we start the discussion about bumping the minimum CMake version requirement on llvm-dev?

Yes, I think that would be great. CMake is one of those programs that's terribly easy to install, and I think it's reasonable to request a recent CMake in order to build LLVM (or at least libc++).

If the rest of LLVM wants/needs to stay stuck on CMake 3.4 (!!!), it might be possible to move the requirements for libc++/libc++abi only.