This is an archive of the discontinued LLVM Phabricator instance.

[bazel] Make labels to third-party dependencies explicit
ClosedPublic

Authored by aaronmondal on Oct 21 2022, 2:00 PM.

Details

Summary

Prefix occurrences of //utils/bazel with an explicit @llvm-raw.

This change lets us reuse code from configure.bzl in future compatibility patches for the bzlmod
module system.

The llvm-project overlay will be made available as an @llvm-project-overlay (name WIP) module in
the Bazel Central Registry. This means that we will have an @llvm-project-overlay workspace in
addition to the @llvm-raw and @llvm-project workspaces currently involved in the build. To keep
future patches to the existing build files as small as possible, the explicit naming proposed in this
change appears to be the simplest way to not confuse the module workspace resolution.

This is not a functional change to the current WORKSPACE build. It is a foundation for future patches.

GitHub Issue in the BCR: https://github.com/bazelbuild/bazel-central-registry/issues/206
GitHub Issue in LLVM: https://github.com/llvm/llvm-project/issues/55924

Diff Detail

Event Timeline

aaronmondal created this revision.Oct 21 2022, 2:00 PM
Herald added a project: Restricted Project. ยท View Herald TranscriptOct 21 2022, 2:00 PM
aaronmondal requested review of this revision.Oct 21 2022, 2:00 PM
Herald added a project: Restricted Project. ยท View Herald TranscriptOct 21 2022, 2:00 PM
Herald added a subscriber: llvm-commits. ยท View Herald Transcript

@fmeum I'm not sure who to tag for this as a reviewer, but I remembered you commenting on both some BCR and LLVM issues. Please refer someone else if you are not the right person for this ๐Ÿ˜…

csigg added a subscriber: csigg.Oct 21 2022, 2:26 PM

I think you can remove the surrounding Label() if the repo is explicitly stated in the target string.

@csigg Ah yes that was an oversight. The Label()s are redundant.

+@aeubanks @GMNGeoffrey Can you review? Will this change affect our usage of LLVM?

The way I am implementing these patches should not change anything for the current WORKSPACE build. The patches I will submit will look something like this:

  1. This patch
  2. A small compatibility patch for clang/BUILD.bazel, changing builtin_headers_gen to also accept paths that are not only external/llvm-project/..., since with bzlmod these paths will look different depending on the context.
  3. A patch adding a bzlmod module extension that reuses llvm_configure from configure.bzl, a MODULE.bazel file and documentation/examples.

If I don't mess something up, the first two patches should be strictly generalizations of the current logic, so that the WORKSPACE build flow is untouched, but made compatible with bzlmod as well. Without an --experimental_enable_bzlmod in the .bazelrc the new extension and the MODULE file should be ignored by Bazel.

One thing to consider is that if we would want this run in CI similar to the premerge-check pipeline, the Bazel version in .bazelversion would need to be bumped to 5.3.2 (maybe 5.3.0 suffices), which should be backwards compatible with 5.0.0, but contains some bugfixes for bzlmod wich are required for all of this to work with in-tree builds.

csigg accepted this revision.Oct 28 2022, 12:34 PM

This looks good to me.

This revision is now accepted and ready to land.Oct 28 2022, 12:34 PM

I'm not familiar with how bzlmod or the central registry are going to work, but just looking at the content of this patch, I think there is a problem. I think this adds a dependency on exactly the name "llvm-raw" being used when this is imported by external projects. As currently written, it's a relative reference, which has no dependence on this name. I took some care when setting all of this up to not make user-defined strings like that load-bearing, so I think this is undesirable. llvm-raw is kind of an implementation detail name that no one should care about too much though and is unlikely to be referenced by any other dependency, so it isn't too bad to bake it in. It was more important for the name of the final llvm-project Bazel repository where there were already multiple names being used in the wild. The fact that dependency names are user-defined and not necessarily consistent between projects has always been a failing in the Bazel dependency management system though, and it looks like perhaps bzlmod and this registry are attempting to establish canonical names for projects, more aligned with how package managers like pip work. If that's the world we're headed for, then it seems like baking this in may be the right thing to do, but I'm unsure.

Ok I now added the rest of the commits which should give a better picture on how/what I'm doing here ๐Ÿ˜…

https://reviews.llvm.org/D137007 is the small fix for the clang BUILD.bazel.
https://reviews.llvm.org/D137008 adds the actual bzlmod support.
https://reviews.llvm.org/D137009 adds an additional example.

This demo repository shows how this will look in a bazel registry (without the 0*_.patch files): https://github.com/eomii/bazel-eomii-registry/commit/04d3d96f0fa35e38b056853db3bae8dd903a4bbd

I'm not familiar with how bzlmod or the central registry are going to work, but just looking at the content of this patch, I think there is a problem. I think this adds a dependency on exactly the name "llvm-raw" being used when this is imported by external projects. As currently written, it's a relative reference, which has no dependence on this name. I took some care when setting all of this up to not make user-defined strings like that load-bearing, so I think this is undesirable. llvm-raw is kind of an implementation detail name that no one should care about too much though and is unlikely to be referenced by any other dependency, so it isn't too bad to bake it in. It was more important for the name of the final llvm-project Bazel repository where there were already multiple names being used in the wild. The fact that dependency names are user-defined and not necessarily consistent between projects has always been a failing in the Bazel dependency management system though, and it looks like perhaps bzlmod and this registry are attempting to establish canonical names for projects, more aligned with how package managers like pip work. If that's the world we're headed for, then it seems like baking this in may be the right thing to do, but I'm unsure.

Thinking about this, I also think that it would be nice if llvm-raw would not be a more-or-less hard required name for the intermediary repository. I'll try again whether I can get this to work with some more "relative" approach. I couldn't get Workspace-relative labels to work, but I file-relative labels (relative to the .bzl files themselves) may work.

csigg added a comment.Oct 29 2022, 1:31 AM

Have you tried to simply use e.g. "//:WORKSPACE" (without Label, without @llvm-raw)?
I think this might work because those rules are provided by @llvm-raw (or whatever the repo is called).

So this change would mainly be about not actively referring to targets in the calling repository, which is what Label does.

Ok I tried a bunch of combinations. Apart from the current one, none seem to work.

I believe what is happening is that when we call configure from a module extension, the repository_ctx effectively becomes a module_ctx. If the labels are relative, e.g. //utils/bazel/deps_impl:zlib_disable.BUILD, this will be interpreted as @llvm-raw//utils/bazeldeps_impl:zlib.BUILD with the WORKSPACE build, which is correct, but when using bzlmod with an in-tree build, it will incorrectly resolve to @llvm-project-overlay//utils/bazel, which is the nonexistent llvm-project/utils/bazel/utils/bazel, since @llvm-project-overlay refers to the utils/bazel subdirectory already. So then using bzlmod we need the explicit @llvm-raw workspace since it is in fact an external workspace in that case.

Regarding the Label("//:WORKSPACE"), this doesn't work for the same reason. With the WORKSPACE build, It resolves to the correct @llvm-raw//:WORKSPACE. With bzlmod we want @llvm-raw//:WORKSPACE as well, but if this is relative we get the incorrect @llvm-project-overlay//:WORKSPACE (aka @llvm-raw//utils/bazel:WORKSPACE).

I don't think there is a way to implement this differently without significant changes to the entire overlay logic. The explicit labels in this case actually do what they are supposed to do - explicitly reference targets as if they are coming from an external workspace.

A case where the opposite is actually desirable is https://reviews.llvm.org/D137007, where we need to be able to use the workspace @llvm-project and the bzlmod @llvm-project-overlay.16.0.0-bcr.0.llvm_project_overlay.llvm-project interchangeably.

There is also the situation where @llvm-project-overlay actually refers to the top level directory in the llvm repo. This is the case if we import the module in an external module via bazel_dep(name="@llvm-project-overlay", version="16.0.0-bcr.0"). If we were to disallow builds from within the utils/bazel repo, this may make it possible to keep the labels relative,

This seems like a bad idea to me though, since it would mean that it would be impossible to build and test the bazel overlay from within utils/bazel. This would add additional complexity to future CI runs that build/test the bzlmod build.

I think this is coming from my complete ignorance of bzlmod, but why exactly do we have a separate llvm-project-overlay repository here? Why isn't it pulling from utils/bazel?

aaronmondal added a comment.EditedNov 4 2022, 1:49 PM

I think this is coming from my complete ignorance of bzlmod, but why exactly do we have a separate llvm-project-overlay repository here? Why isn't it pulling from utils/bazel?

The outline is as follows.

  1. A user imports bazel_dep(name="llvm-project-overlay", version = "...").
  2. bzlmod will now go to whatever bazel registry was specified and pull the sources declared there. While the module name is @llvm-project-overlay, the actual sources are the regular sources of ad github.com/llvm/llvm-project, slightly patched with MODULE.bazel, and empty BUILD.bazel and WORKSPACE.bazel files. This behavior is completly independent of anything implemented in the actually fetched repo.
  3. Now, an extension is called from the @llvm-project-overlay's MODULE.bazel file to configure the llvm project (D137008). The extension implementation has to be located via @llvm-project-overlay//utils/bazel:extensions.bzl, since we currently have no notion of @llvm-raw or @llvm-project. Two things can happen now.
    • 3a. If no commit is specified, the extension re-registers the @llvm-project-overlay archive as @llvm-raw. It then applies the configure logic to create @llvm-project from these sources. The @llvm-project-overlay module makes these workspaces available to importers via use_repo(...,) in the @llvm-project-overlay's MODULE.bazel file.
    • 3b. If a commit (and optional patches) are specified, we download ANOTHER copy of the llvm sources at the specified commit (and optionally patch them) and register those as @llvm-raw. Now the configure logic from @llvm-project-overlay//utils/bazel:configure.bzl is applied, to the sources of @llvm-raw to create @llvm-project.

So basically, we always want configure to run on @llvm-raw, never on @llvm-project-overlay, since that would make patching and custom commit checkouts even more complicated than they already are ๐Ÿ˜…

The 3b part i there so that we do not need to update the version of this module too often in bazel registries. The implementation in D137008 lets us get away with only rarely updating the module in bazel registries, since we can use an older version of llvm to just download a newer version by bumping the commit on the importer side. Without 3b (i.e. without custom commit checkout and patching) we would have to frequently update the registry version to stay at head in downstream projects.


Note that this only covered the kinds of builds downstream users would run. In-tree builds work more or less like the current WORKSPACE in-tree build.

  1. Assume we are in utils/bazel.
  2. Dependencies like zlib and rules_cc are fetched from a bazel registry.
  3. We import the module extension as //:extensions.bzl, since we are already in utils/bazel.
  4. We use essentially the same flow as 3a from above. I believe there may still be some issues with the current D137008 in this case though, since we somehow need to get a WORKSPACE.bazel and BUILD.bazel into the top level of the repo, but we cannot use native local_repository from module extensions. This may resolve itself as bzlmod matures. I'm currently looking into this.

@GMNGeoffrey @MaskRay @rupprecht @csigg To make sure, I'd like to verify again whether you are you ok with me committing this. @GMNGeoffrey had some concerns regarding the "pinning" of the name to llvm-raw. I've set up the current zstd support (https://reviews.llvm.org/D143344) in a way that reduces the dependence on llvm-raw to a single occurance. For zlib we have the same still pending in https://reviews.llvm.org/D143320. At the moment that requires dependents that use custom system overrides to make some changes to their workspaces (see @GMNGeoffrey's comment in https://reviews.llvm.org/D143344). This is temporary until the external deps are available as modules. Then we can completely remove llvm-raw from all external deps and directly use bazel_dep(name = "zlib", version = "1234") without custom imports in LLVM.

This is the last patch we need so that I can start properly getting the bzlmod patch ready (https://reviews.llvm.org/D137008).

When bzlmod support is complete a default LLVM import will look something like this in downstream projects:

bazel_dep(name = "llvm-project-overlay", version = "17-init-bcr.2")

llvm_project_overlay = use_extension(
    "@llvm-project-overlay//utils/bazel:extensions.bzl",
    "llvm_project_overlay",
)

llvm_project_overlay.configure()

use_repo(llvm_project_overlay, "llvm-raw", "llvm-project")

BTW the entire rules_ll build infrastructure has been using these patches for quite some time now. We're currently experimenting with caching and remote execution to see wether the cloud infra for an open-read remote Bazel cache for LLVM would be sustainable. If it is, we'd have a way to reduce the "build" time for LLVM Bazel users to ~2 min on a laptop from an empty local cache and some CI runtimes to potentially less than that.

Rebase and test a final time before committing

chapuni added a subscriber: chapuni.May 7 2023, 6:43 AM
This revision was automatically updated to reflect the committed changes.

We're currently experimenting with caching and remote execution to see wether the cloud infra for an open-read remote Bazel cache for LLVM would be sustainable. If it is, we'd have a way to reduce the "build" time for LLVM Bazel users to ~2 min on a laptop from an empty local cache and some CI runtimes to potentially less than that.

The CI is already using a remote cache and it should be possible to make that remote cache available to anyone. The tricky bit is just that for the caching to be effective it needs to be from a sufficiently similar software environment. Containerization could help with that. @aeubanks are you still making improvements to the LLVM Bazel infra?

We're currently experimenting with caching and remote execution to see wether the cloud infra for an open-read remote Bazel cache for LLVM would be sustainable. If it is, we'd have a way to reduce the "build" time for LLVM Bazel users to ~2 min on a laptop from an empty local cache and some CI runtimes to potentially less than that.

The CI is already using a remote cache and it should be possible to make that remote cache available to anyone. The tricky bit is just that for the caching to be effective it needs to be from a sufficiently similar software environment. Containerization could help with that. @aeubanks are you still making improvements to the LLVM Bazel infra?

I haven't done anything in a while (it's been chugging along pretty well on its own), but yeah I still technically own the bazel bots.
Does bazel allow read-only access to a remote cache? Or does it assume that you can write to it as well? I don't think we'd want non-CI users uploading to the cache.
(the bots also use --config=ci and I believe a custom clang built by Chrome so you'd have to match that)

Who is "we" in this case? If it's a company, it seems doable to have your own GCS (or whatever) remote cache and make everybody + CI use the same configuration. A world-usable one probably requires more thought.

@aeubanks wrote:

I haven't done anything in a while (it's been chugging along pretty well on its own), but yeah I still technically own the bazel bots.
Does bazel allow read-only access to a remote cache? Or does it assume that you can write to it as well?

Yes --noremote_upload_local_results

I don't think we'd want non-CI users uploading to the cache.

For sure, that would be bad

(the bots also use --config=ci and I believe a custom clang built by Chrome so you'd have to match that)

Yeah this is the containerization bit

Who is "we" in this case? If it's a company, it seems doable to have your own GCS (or whatever) remote cache and make everybody + CI use the same configuration. A world-usable one probably requires more thought.

I don't think there's too much issue with a world-readable GCS bucket, especially if you don't allow anonymous access, so people have to authenticate themselves (although that also makes it more annoying to use and requires a gmail account). I think GCS can handle the traffic fine. We've got a few world-readable buckets in IREE for hosting misc binary files and such and we haven't had any issues.

The tricky bit is just that for the caching to be effective it needs to be from a sufficiently similar software environment.

Yes, that's tricky. Since Bazel doesn't track it's external dependencies in the same way that it tracks labels, we wrap the entire Bazel process in a reproducible environment via nix. That environement is fairly straigthforward to convert into a reproducible toolchain container that can then be used by CI runs. It also double-acts as development environment, similar to a docker environment or a python venv. Our current setup works roughly like this:

  • Wrap Bazel with the paths from dependencies outside of Bazel's build graph
  • Create a toolchain container that uses that Bazel
  • Use that toolchain container to generate remote-execution toolchains. Here is an example of a generated toolchain config.
  • *All* build invocations, including local ones then use that RBE-compatible toolchain. If a user's local environment deviates in just a single tool, the build will fail (or at least not use the same cache). This prevents wrong cache use/reuse that would occur when e.g. using just a path like /usr/bin/clang.

Who is "we" in this case? If it's a company, it seems doable to have your own GCS (or whatever) remote cache and make everybody + CI use the same configuration. A world-usable one probably requires more thought.

We use that "everyone has the same configuration" approach internally at the moment, but my intention with this is indeed to create a world-cache that's more or less accessible to everyone. The nixpkgs repository actually does just that, but on a package-level - it's a gigantic cache of reproducible build artifacts that makes it quite simple to customize specific dependencies, but reuse all unaffected ones. Pretty much a coarser-grained version of a Bazel cache. I want the same thing on a fine-grained, incremental-build level.