This is an archive of the discontinued LLVM Phabricator instance.

[RFC] Factor out repetitive cmake patterns for llvm-style projects
ClosedPublic

Authored by stephenneuendorffer on Aug 3 2020, 9:46 AM.

Details

Summary

New projects (particularly out of tree) have a tendency to hijack the existing
llvm configuration options and build targets (add_llvm_library,
add_llvm_tool). This can lead to some confusion.

  1. When querying a configuration variable, do we care about how LLVM was

configured, or how these options were configured for the out of tree project?

  1. LLVM has lots of defaults, which are easy to miss

(e.g. LLVM_BUILD_TOOLS=ON). These options all need to be duplicated in the
CMakeLists.txt for the project.

In addition, with LLVM Incubators coming online, we need better ways for these
incubators to do things the "LLVM way" without alot of futzing. Ideally, this
would happen in a way that eases importing into the LLVM monorepo when
projects mature.

This patch creates some generic infrastructure in llvm/cmake/modules and
refactors MLIR to use this infrastructure. This should expand to include
add_xxx_library, which is by far the most complicated bit of building a
project correctly, since it has to deal with lots of shared library
configuration bits. (MLIR currently hijacks the LLVM infrastructure for
building libMLIR.so, so this needs to get refactored anyway.)

Diff Detail

Event Timeline

Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptAug 3 2020, 9:46 AM
stephenneuendorffer requested review of this revision.Aug 3 2020, 9:46 AM

Something you don't make very explicit in the description of this patch is that beyond a "refactoring" this is also changing the behavior by *duplicating* the options for each subproject.
For example building with -DLLVM_INCLUDE_EXAMPLES won't be enough to include the examples in MLIR, the user must also provide -DMLIR_INCLUDE_EXAMPLES. Same for every other options.

I can see how this provide more flexibility to the user configuring the build, but this flexibility comes also with a more complex surface. For example when building LLVM+MLIR, passing -DLLVM_INSTALL_TOOLCHAIN_ONLY=ON isn't enough and you need also -DMLIR_INSTALL_TOOLCHAIN_ONLY=ON.

tstellar added inline comments.
llvm/cmake/modules/LLVMProjectOptions.cmake
52

Are there any of these that we can remove from LLVM, so we don't have to duplicate them for other projects? It seems to be that we could drop all the INCLUDE_* options. I'm guessing they exist, so that things like examples are buildable, but aren't added to the all target? To me it seems better to automatically add everything to all, and users that want to build less then all can just specify the targets that they want to build. Having both BUILD_* and INCLUDE_* options is very confusing.

Something you don't make very explicit in the description of this patch is that beyond a "refactoring" this is also changing the behavior by *duplicating* the options for each subproject.
For example building with -DLLVM_INCLUDE_EXAMPLES won't be enough to include the examples in MLIR, the user must also provide -DMLIR_INCLUDE_EXAMPLES. Same for every other options.

I can see how this provide more flexibility to the user configuring the build, but this flexibility comes also with a more complex surface. For example when building LLVM+MLIR, passing -DLLVM_INSTALL_TOOLCHAIN_ONLY=ON isn't enough and you need also -DMLIR_INSTALL_TOOLCHAIN_ONLY=ON.

That's true. My thinking was that these should get defaults from LLVM, like MLIR_INCLUDE_DOCS already did.

llvm/cmake/modules/LLVMProjectOptions.cmake
52

I'm all for simplification, but I think it's somewhat orthogonal with what I'm trying to achieve here. I agree with the general sentiment that there are just too many configuration options.

That's true. My thinking was that these should get defaults from LLVM, like MLIR_INCLUDE_DOCS already did.

Seems like a good idea!

llvm/cmake/modules/LLVMProjectOptions.cmake
52

I think part of the reason why the INCLUDE configurations exists is to speed up the project generation time. However, I don't know how much of a problem this is in practice. I suspect that for most people compilation still takes longer than configuration.

I am supportive of this cleanup (I have found this awkward in out of tree projects particularly). However, I am probably also not the right person for a detailed review compared to the existing CMake boiler-plate.

Anyone have any concerns with this patch landing? I'd be particularly interested if there are opinions about better ways to structure llvm/cmake/modules/LLVMProjectOptions.cmake or llvm/cmake/modules/LLVMProjectTargets.cmake

This revision was not accepted when it landed; it landed in state Needs Review.Oct 3 2020, 5:51 PM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
phosek added a comment.Oct 4 2020, 1:52 PM

Anyone have any concerns with this patch landing? I'd be particularly interested if there are opinions about better ways to structure llvm/cmake/modules/LLVMProjectOptions.cmake or llvm/cmake/modules/LLVMProjectTargets.cmake

I'm mostly concerned about this change being too MLIR specific. Not everything in add_llvm_project_options and add_llvm_project_targets makes sense for every LLVM project and I'd preferred if this was broken down into smaller pieces that could be used independently rather than having this all-or-nothing functions.

Also a styling nit, this should have been cmake/Modules, not cmake/modules. The former is used only by LLVM and is a historic relic, all other projects use cmake/Modules which is more common in CMake-based builds.

Thanks Petr. Can you point to some example projects where some of these
targets are harmful? Personally I'm willing to accept that some of the
mechanisms here might not be used by all projects in exchange for
simplification and ease of use.

Steve

Thanks Petr. Can you point to some example projects where some of these
targets are harmful? Personally I'm willing to accept that some of the
mechanisms here might not be used by all projects in exchange for
simplification and ease of use.

Steve

I don't think any of these are harmful, they are just not needed in most projects.

  • *_INCLUDE_UTILS, *_BUILD_UTILS and *_INSTALL_UTILS is only relevant in LLVM which is the only project that uses this concept.
  • *_BUILD_TOOLS is used by Clang, Flang, LLD and LLVM, *_INCLUDE_TOOLS is used by LLVM only.
  • *_INSTALL_TOOLCHAIN_ONLY is defined in LLVM and while other projects use this variable, it's used as a global shared variable and it doesn't make sense to define it for each project. Maybe we should rename LLVM_INSTALL_TOOLCHAIN_ONLY to INSTALL_TOOLCHAIN_ONLY?
  • *_INCLUDE_EXAMPLES and *_BUILD_EXAMPLES is only used by Clang, LLVM and MLIR.
  • While *_BUILD_TESTS is only used by LLVM, *_INCLUDE_TESTS is used by all projects. In some cases, *_BUILD_TESTS might make sense even for projects that don't use them currently. For example, Clang uses add_unittest so whether those are built is controlled by LLVM_BUILD_TESTS, but perhaps there should be CLANG_BUILD_TESTS?
  • *_INCLUDE_INTEGRATION_TESTS is only used by MLIR.
  • *_INCLUDE_DOCS is used by all projects.
pawelo12345678 added a subscriber: pawelo12345678.EditedOct 8 2020, 3:08 AM

Does further work on this one have a potential to fix following issues while building MLIR?:

[ 81%] Building CXX object tools/mlir/lib/Conversion/LinalgToLLVM/CMakeFiles/obj.MLIRLinalgToLLVM.dir/LinalgToLLVM.cpp.o
cd /home/pawelo/llvm-project.git/build-shared-release/tools/mlir/lib/Conversion/LinalgToLLVM && /usr/lib/llvm/10/bin/clang++ -DBUILD_EXAMPLES -DGTEST_HAS_RTTI=0 -DMLIR_CUDA_CONVERSIONS_ENABLED=0 -DMLIR_ROCM_CONVERSIONS_ENABLED=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_M
ACROS -I/home/pawelo/llvm-project.git/build-shared-release/tools/mlir/lib/Conversion/LinalgToLLVM -I/home/pawelo/llvm-project.git/mlir/lib/Conversion/LinalgToLLVM -I/home/pawelo/llvm-project.git/build-shared-release/include -I/home/pawelo/llvm-project.git/llvm/include -I/home/pawelo/llvm-project.git/mlir/include -I
/home/pawelo/llvm-project.git/build-shared-release/tools/mlir/include -stdlib=libc++ -fPIC -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wc
overed-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wstring-conversion -ffunction-sections -fdata-sections -Werror=global-constructors -O3   -fno-exceptions -fno-rtti -UNDEBUG -std=c++17 -o CMakeFiles/obj.MLIRLinalgToLLVM.dir/LinalgToLLVM.cpp.o -c /home/pawelo/llvm-project.git/mli
r/lib/Conversion/LinalgToLLVM/LinalgToLLVM.cpp
In file included from /home/pawelo/llvm-project.git/mlir/lib/Conversion/LinalgToLLVM/LinalgToLLVM.cpp:21:
/home/pawelo/llvm-project.git/mlir/include/mlir/Dialect/Linalg/Passes.h:67:10: fatal error: 'mlir/Dialect/Linalg/Passes.h.inc' file not found
#include "mlir/Dialect/Linalg/Passes.h.inc"
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

...or:

[ 70%] Building CXX object tools/mlir/lib/Dialect/LLVMIR/CMakeFiles/obj.MLIRLLVMIR.dir/IR/LLVMDialect.cpp.o
cd /home/pawelo/llvm-project.git/build-shared-release/tools/mlir/lib/Dialect/LLVMIR && /usr/lib/llvm/10/bin/clang++ -DBUILD_EXAMPLES -DGTEST_HAS_RTTI=0 -DMLIR_CUDA_CONVERSIONS_ENABLED=0 -DMLIR_ROCM_CONVERSIONS_ENABLED=0 -D_DEBUG -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/
home/pawelo/llvm-project.git/build-shared-release/tools/mlir/lib/Dialect/LLVMIR -I/home/pawelo/llvm-project.git/mlir/lib/Dialect/LLVMIR -I/home/pawelo/llvm-project.git/build-shared-release/include -I/home/pawelo/llvm-project.git/llvm/include -I/home/pawelo/llvm-project.git/mlir/include -I/home/pawelo/llvm-project.g
it/build-shared-release/tools/mlir/include -stdlib=libc++ -fPIC -fvisibility-inlines-hidden -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wimplicit-fallthrough -Wcovered-switch-default -Wno-
noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wstring-conversion -ffunction-sections -fdata-sections -Werror=global-constructors -O3   -fno-exceptions -fno-rtti -UNDEBUG -std=c++17 -o CMakeFiles/obj.MLIRLLVMIR.dir/IR/LLVMDialect.cpp.o -c /home/pawelo/llvm-project.git/mlir/lib/Dialect/LLVMIR/IR/LLVMDia
lect.cpp
In file included from /home/pawelo/llvm-project.git/mlir/lib/Dialect/LLVMIR/IR/LLVMDialect.cpp:13:
In file included from /home/pawelo/llvm-project.git/mlir/include/mlir/Dialect/LLVMIR/LLVMDialect.h:18:
/home/pawelo/llvm-project.git/mlir/include/mlir/Dialect/OpenMP/OpenMPDialect.h:19:10: fatal error: 'mlir/Dialect/OpenMP/OpenMPOpsDialect.h.inc' file not found
#include "mlir/Dialect/OpenMP/OpenMPOpsDialect.h.inc"
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
1 error generated.

It occurs on every build since June/July. I guess it's because I'm using plain cmake, not ninja.

Does further work on this one have a potential to fix following issues while building MLIR?:
It occurs on every build since June/July. I guess it's because I'm using plain cmake, not ninja.

That's a separate dependence problem. I'll take a look

Also a styling nit, this should have been cmake/Modules, not cmake/modules. The former is used only by LLVM and is a historic relic, all other projects use cmake/Modules which is more common in CMake-based builds.

FWIW, a bunch of other LLVM subprojects (Clang, LLD, MLIR, Flang, etc.) all appear to use modules. Modules appears to be used primarily by the runtimes (compiler-rt, libcxx, etc.)