Page MenuHomePhabricator

[cmake] Add new LLVM_CACHE_VARIABLES variable to contains all variables passed to cmake on the commandline or in cache files.
Needs ReviewPublic

Authored by hintonda on May 23 2019, 4:17 PM.

Details

Summary

This patch gathers all variables passed to cmake, adds them
to a list in the cache, and prints them out. This is helpful for
debugging, especially in the buildbots, but could also be used to
automatically decide what to pass down to cmake when building NATIVE
tools.

Here's an example of what it prints:

-- *** Gathering initial cache variables -- from command line and/or cache files into LLVM_CACHE_VARIABLES ***
-- from commandline or cache files: CLANGD_BUILD_XPC = OFF
-- from commandline or cache files: CMAKE_BUILD_TYPE = Debug
-- from commandline or cache files: CMAKE_CXX_FLAGS = -I/opt/local/include
-- from commandline or cache files: CMAKE_C_FLAGS = -I/opt/local/include
-- from commandline or cache files: CMAKE_GENERATOR = Ninja
-- from commandline or cache files: GOLD = GOLD-NOTFOUND
-- from commandline or cache files: LLVM_APPEND_VC_REV = OFF
-- from commandline or cache files: LLVM_ENABLE_PROJECTS = Scratch;clang;libcxx;libcxxabi
-- from commandline or cache files: LLVM_EXTERNAL_PROJECTS = Scratch
-- from commandline or cache files: LLVM_INCLUDE_BENCHMARKS = OFF
-- from commandline or cache files: LLVM_LIT_ARGS = '-vv'
-- from commandline or cache files: LLVM_OPTIMIZED_TABLEGEN = ON
-- from commandline or cache files: LLVM_TARGETS_TO_BUILD = X86

And the corresponding LLVM_CACHE_VARIABLES variable added to the cache:

LLVM_CACHE_VARIABLES:STRING=CLANGD_BUILD_XPC;CMAKE_BUILD_TYPE;CMAKE_CXX_FLAGS;CMAKE_C_FLAGS;CMAKE_GENERATOR;GOLD;LLVM_APPEND_VC_REV;LLVM_ENABLE_PROJECTS;LLVM_EXTERNAL_PROJECTS;LLVM_INCLUDE_BENCHMARKS;LLVM_LIT_ARGS;LLVM_OPTIMIZED_TABLEGEN;LLVM_TARGETS_TO_BUILD

Event Timeline

hintonda created this revision.May 23 2019, 4:17 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 23 2019, 4:17 PM
hintonda marked an inline comment as done.May 23 2019, 4:21 PM
hintonda added inline comments.
llvm/CMakeLists.txt
28

cmake_minimum_required adds a lot of variables, none of which we are interested in, so putting it first reduces the amount of filtering needed.

hintonda updated this revision to Diff 201087.May 23 2019, 4:24 PM
  • Fix typo.

I'm somewhat worried about the fact that this only works for the very first cmake run. Reconfiguring an existing build tree is something that is normally supported by cmake. I don't know about others, but I do that relatively frequently. gui tools like ccmake cannot even work without making multiple cmake invocations (they do one run to fetch the list of cache variables, let the user change that, and then one run to reconfigure)..

ilya-biryukov added inline comments.May 24 2019, 7:03 AM
llvm/CMakeLists.txt
4

NIT: are we missing a part of the phrase? what should when

hintonda marked an inline comment as done.May 24 2019, 8:11 AM
hintonda added inline comments.
llvm/CMakeLists.txt
4

Yes, thanks: s/what should when/what should be passed when/

However, as @labath mentioned, this method won't be reliable when cmake is rerun with different variables/values. I can test help text in a lot of cases, but that's not 100% either.

So, I'll continue to use this in my local external test project and see if I can add something like this to cmake itself -- assuming people find this useful.

I'm somewhat worried about the fact that this only works for the very first cmake run. Reconfiguring an existing build tree is something that is normally supported by cmake. I don't know about others, but I do that relatively frequently. gui tools like ccmake cannot even work without making multiple cmake invocations (they do one run to fetch the list of cache variables, let the user change that, and then one run to reconfigure)..

Thanks for bringing this up.

I think I might be able to handle it by examining the cache file, which isn't written until the config process is complete. Which means it's available as a history, and only needs to be examined on subsequent runs. So, I should be able to gather variables on each run and grow my list as I go. Even if a subsequent run excludes a variable or changes it back to a default, I'll still consider it changed.

I don't think cmake provides any mechanism for this, so it'll probably require a little python script, but it's essentially just an intelligent diff. I could also add a commandline switch, LLVM_ENABLE_PRINT_COMMANDLINE_ARGS, or something like that so you only have to pay for it you want it.

beanz added a subscriber: sqlbyme.May 24 2019, 9:59 AM

I have the same concerns as @labath, so we should certainly put it behind a flag.

I'm also not sure that this is how I would want to handle bot reproducibility. I've been advocating for years that we should migrate bots to using CMake cache files for configurations so that the bot configurations are stored in-tree. @sqlbyme liked that idea, but we never really got any progress on that front. It would be great if we made cache-based configuration a requirement for all bots on lab.llvm.org.

I have the same concerns as @labath, so we should certainly put it behind a flag.

I'm also not sure that this is how I would want to handle bot reproducibility. I've been advocating for years that we should migrate bots to using CMake cache files for configurations so that the bot configurations are stored in-tree. @sqlbyme liked that idea, but we never really got any progress on that front. It would be great if we made cache-based configuration a requirement for all bots on lab.llvm.org.

That's a good idea. Btw, where would you want to put them? Perhaps if they existed, it might be easier to bot owners to use them.

Btw, where would you want to put them?

Pre-monorepo, it would depend on what the bot is building, but with the mono-repo we could just create a new top-level directory to contain them all. Name each file to match the bot config it runs.

Perhaps if they existed, it might be easier to bot owners to use them.

I think bot owners need to create and maintain them. Today bot owners control what flags they pass to CMake and we shouldn't change that.

Btw, where would you want to put them?

Pre-monorepo, it would depend on what the bot is building, but with the mono-repo we could just create a new top-level directory to contain them all. Name each file to match the bot config it runs.

Sounds good.

Perhaps if they existed, it might be easier to bot owners to use them.

I think bot owners need to create and maintain them. Today bot owners control what flags they pass to CMake and we shouldn't change that.

I don't disagree, but I could create the files based on what's in the build logs as a starting point.

If this were to be just a debugging aid, then it does not matter that much that it does not work on reconfigures. We just shouldn't rely on it to make any decisions reflecting the build result. However, it also sounds to me that for bots specifically, we can come up with a better solution. The bot cache idea sounds great to me, but even without it, it should be possible to see the exact cmake command the bot invokes in the "stdio" link, right? That has worked for me for all bots that I needed to examine.

What I think could be helpful in some situations is to dump the resulting CMakeCache.txt after the configuration step completes. That way one could also see the actual configuration being used, which might be helpful in debugging cmake issues.

I’m not sure printing out the entire CMakeCache.txt file is a good idea — it’s huge and much of it is of no interest.

However, a subset including all the project specific variables would be great. The good, or bad, depending on your perspective, is that the cache file isn’t available until processing is complete. So we’d still need to iterate over all variables as a last step in say llvm/CMakeLists.txt.

Perhaps noting default value and whether or not it’s in the cache. To facilitate that, I’d like to try to wrap option in llvm_option and save the default to the cache as well. This would also make it easy for maintainers to do the right thing and remove a lot of boilerplate from the cmake files.

I’m not sure printing out the entire CMakeCache.txt file is a good idea — it’s huge and much of it is of no interest.

However, a subset including all the project specific variables would be great. The good, or bad, depending on your perspective, is that the cache file isn’t available until processing is complete. So we’d still need to iterate over all variables as a last step in say llvm/CMakeLists.txt.

Perhaps noting default value and whether or not it’s in the cache. To facilitate that, I’d like to try to wrap option in llvm_option and save the default to the cache as well. This would also make it easy for maintainers to do the right thing and remove a lot of boilerplate from the cmake files.

I didn't mean to print that as a part of the normal configure step. That would definitely be too much information. I was thinking doing something on the buildbot side. I'm not sure how, but it seems like it is possible to attach arbitrary text nodes to each buildbot step (that's what our "WarningCountingShellCommand" does to create a link with all compiler warnings). I was thinking we could attach the cache file as a one of these nodes to the "cmake" step. That way people could open it if needed. 99% percent of the time you can just ignore it, but if you need it, the info is there...

hintonda added a comment.EditedMay 27 2019, 7:55 AM

Ah, yes. That would be great and wouldn’t require the bot maintainers to do anything.

Btw, where would you want to put them?

Pre-monorepo, it would depend on what the bot is building, but with the mono-repo we could just create a new top-level directory to contain them all. Name each file to match the bot config it runs.

Beanz if you can send me an example of what you think the directory layout should look like I'll try to implement one of these on the mono-repo and get a test bot running. Then folks could have an example to look at.

Perhaps if they existed, it might be easier to bot owners to use them.

I think bot owners need to create and maintain them. Today bot owners control what flags they pass to CMake and we shouldn't change that.

+1 for having bot owners be responsible for making this change. Though I feel like we could do some work behind the scenes to make it easier for them to move to this convention. Also, we probably
should have a discussion on the list about making this the new standard for setting up bots.