This is an archive of the discontinued LLVM Phabricator instance.

[bazel] Rework zlib dependency
ClosedPublic

Authored by aaronmondal on Feb 4 2023, 7:13 AM.

Details

Summary

Switches the library to use the zlib-ng implementation since the
original implementation is warning-incompatible with recent versions of clang.

Removes the wrapper logic for zlib in the bazel build and introduces new
logic to handle LLVM_ENABLE_ZLIB.

Removes the BAZEL_LLVM_ZLIB_STRATEGY environment variable and instead
introduces a boolean --@llvm_zlib//:llvm_enable_zlib flag which defaults
to true.

To migrate:

  • The previous "external" strategy is the default. May be explicitly enabled with --@llvm_zlib//:llvm_enable_zlib=true. For custom zlib variants you can use the BUILD file at third_party_build/zlib.BUILD as reference and adjust the @llvm_zlib archive in the WORKSPACE directly.
  • The previous "disable" strategy may be enabled with --@llvm_zlib//:llvm_enable_zlib=false.
  • The previous "system" strategy has been removed since it breaks hermeticity.

Addresses breakages of downstream projects using upstream clang and the
previously "external" zlib strategy (D141553).

Diff Detail

Event Timeline

aaronmondal created this revision.Feb 4 2023, 7:13 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 4 2023, 7:13 AM
aaronmondal requested review of this revision.Feb 4 2023, 7:13 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 4 2023, 7:13 AM
aaronmondal edited the summary of this revision. (Show Details)
  • Fix lld:ELF
  • Run buildifier on zlib.BUILD
MaskRay accepted this revision.Feb 4 2023, 11:43 AM
This revision is now accepted and ready to land.Feb 4 2023, 11:43 AM

@MaskRay It the patch is ok like this, would you mind commiting on my behalf?

Aaron Siddhartha Mondal <aaron@eomii.org>

@MaskRay It the patch is ok like this, would you mind commiting on my behalf?

Aaron Siddhartha Mondal <aaron@eomii.org>

Thanks, I'll do that after giving other reviewers some time to check. (I am unfamiliar with Bazel but I like that this patch reduces complexity!)

@aa

utils/bazel/third_party_build/zlib.BUILD
17

Is llvm_zlib_enabled unneeded?

aaronmondal added inline comments.Feb 5 2023, 12:36 PM
utils/bazel/third_party_build/zlib.BUILD
17

Yes this is not needed at the moment.

aaronmondal updated this revision to Diff 494941.EditedFeb 5 2023, 12:44 PM

Remove redundant llvm_zlib_enabled config setting.

aaronmondal marked an inline comment as done.Feb 6 2023, 7:07 AM

@MaskRay While we're at it, what do you think about changing zlib to zlib-ng? (https://github.com/zlib-ng/zlib-ng) The original zlib still didn't address https://github.com/madler/zlib/issues/633 and I'd rather spend time on writng a buildfile for zlib-ng than to set -Wno-strict-prototypes (or get spammed by a million warnings).

Since this only changes the dep in the Bazel build that should be fine right? Do any other Bazel users have an opinion on this? Note that this wouldn't require codechanges in LLVM. We'd just switch to a different zlib-API-compatible implementation for the Bazel build.

GMNGeoffrey requested changes to this revision.Feb 6 2023, 10:46 AM
GMNGeoffrey added a subscriber: chandlerc.

I like that this fixes this to use a proper flag, but support for using the system version was deliberate and static-linking an external archive had interesting licensing implications. Let me dig up the history. Hermeticity is indeed an important consideration, but building with system dependencies is also key functionality. @chandlerc, who added this originally.

This revision now requires changes to proceed.Feb 6 2023, 10:46 AM

Ok it looks like the conversation about licensing implications was in some ephemeral format or something, as I can't find it. And zlib actually has a permissive license. Maybe Chandler will comment, but my preference would be that we keep the ability to depend on the system zlib version

@GMNGeoffrey I originally had a version with a string_flag, but abandoned that because I couldn't solve the following issues:

  • Where is the system zlib? Some systems use /lib, some use /usr/lib, some use their 64 bit counterparts. Some use completely different layouts, e.g. /nix/store/<some_config_hash>-zlib-<version>/lib and /nix/store/<some_config_hash>-zlib-<version>-dev/include for zlib on NixOS. How could I ensure that the system is flexible enough for something like that without adding significant amounts of logic?
  • How should the difference to the CMake build be handled? The current diff mirrors the behavior of LLVM_ENABLE_ZLIB and the select logic is reasonably friendly to migrate to e.g. the CMake-to-Bazel scripts. The most precise option would be to use two config flags (enable/disable, hermetic/non-hermetic), but that seemed like an awful lot of spaghetti code only to get a single use case working. Since we can't "export" an -lz copt, We'd have to add selects to the LLVM Build file, which seems *very* hacky.

Would patching WORKSPACE in the downstream project be a reasonable approach to get this kind of functionality? The proposed diff makes it very simple to patch the WORKSPACE and zlib.BUILD with local_repository logic.

If you know the systems that use the "system" strategy, you'll also be able to use a module *_overridewith bzlmod in the future. E.g. you could use a custom Bazel registry with a module pointing to specific local paths.

If these migration options and workarounds are not satisfactory I can bring the string_flag back, but please let's only do this if the "system" functionality is actually important enough to justify the additional complexity.

@MaskRay While we're at it, what do you think about changing zlib to zlib-ng? (https://github.com/zlib-ng/zlib-ng) The original zlib still didn't address https://github.com/madler/zlib/issues/633 and I'd rather spend time on writng a buildfile for zlib-ng than to set -Wno-strict-prototypes (or get spammed by a million warnings).

Since this only changes the dep in the Bazel build that should be fine right? Do any other Bazel users have an opinion on this? Note that this wouldn't require codechanges in LLVM. We'd just switch to a different zlib-API-compatible implementation for the Bazel build.

Using zlib-ng seems reasonable. I made another comment on https://github.com/madler/zlib/issues/633 just to wish that madler can change his mind...

Aside: this patch appears to be missing a bunch of context in the diff. Could you try reuploading or something?

Good point about a dependent repo being able to override llvm_zlib if they want. Let's make sure that's actually ergonomic, but I think that should be sufficient to support the system case.

Where is the system zlib? Some systems use /lib, some use /usr/lib, some use their 64 bit counterparts. Some use completely different layouts, e.g. /nix/store/<some_config_hash>-zlib-<version>/lib and /nix/store/<some_config_hash>-zlib-<version>-dev/include for zlib on NixOS. How could I ensure that the system is flexible enough for something like that without adding significant amounts of logic?

You'd add -lz as was here previously and let the system handle it. Definitely don't want to reinvent system package search here :-)

Would patching WORKSPACE in the downstream project be a reasonable approach to get this kind of functionality? The proposed diff makes it very simple to patch the WORKSPACE and zlib.BUILD with local_repository logic.

Dependencies don't use the WORKSPACE file from the project they depend on. They can invoke Bazel macros or repo rules that it defines. It would be good to check that this actually makes that ergonomic. https://github.com/llvm/llvm-project/blob/main/utils/bazel/examples/http_archive/WORKSPACE and https://github.com/llvm/llvm-project/blob/main/utils/bazel/examples/submodule/WORKSPACE are examples of how to use the repository. Currently, it's easy to get things up and running with a couple of macros. I think this would break it and zlib would now need to be explicitly configured in the downstream WORKSPACE, repeating all the logic from LLVM's workspace. Can you factor the configuration into a macro so that we can preserve the easy configuration? I propose that we should offer the following macros: llvm_configure, the main repo only as we have it today; for each external dep, llvm_configure_external_<dep_name>; llvm_configure_external_support_deps, pulls in all of the deps as external repos initialized with maybe; llvm_disable_external_support_deps, initializes them all as empty with maybe and makes sure Bazel analysis doesn't get upset because of missing repos. The use of maybe here means that the downstream project can do its own configuration of select deps and then use a catch-all for the rest. If they want to be judicious about taking on dependencies, they can explicitly enable each one they want, but use the macros provided.

utils/bazel/WORKSPACE
38

Somewhat orthogonal to your change, but last week's GitHub sha256 blowup reminded me that http_archive is horrible and we should probably switch to using git_repository

utils/bazel/third_party_build/zlib.BUILD
17–20

Can we make this a positive option? It seems a bit confusing to have it negated like this. I think this works even if true is the default?

config_setting(
  name = "llvm_zlib_enabled",
  flag_values = {":llvm_enable_zlib": "true"},
)

And then the selects would be "if zlib is enabled add this thing, otherwise add nothing" rather than "if zlib is not enabled then don't add anything, otherwise add this thing"

aaronmondal added inline comments.Feb 6 2023, 4:35 PM
utils/bazel/WORKSPACE
38

It turned out that GitHub actually does guarantee sha stability, but only for release tag links. This *really* screwed all our workflows.

I'd love changing this (and all other http archives), but I think it should be handled in a different commit, since it may break CI environments without git.

utils/bazel/third_party_build/zlib.BUILD
17–20

Yeah I think originally I did it like this so that it's clear that the zlib.BUILD file actually builds zlib source code by default, but looking at it again it indeed seems unnecessarily unintuitive 😆 Will change it.

GMNGeoffrey added inline comments.Feb 6 2023, 4:49 PM
utils/bazel/WORKSPACE
38

Well I think even there it was just that at some point someone said they would be stable maybe, but obviously didn't communicate that to the team who was actually able to control that :-P Anyway, agreed on it being a separate commit.

You'd add -lz as was here previously and let the system handle it. Definitely don't want to reinvent system package search here :-)

This only works if -L<path_to_lib> is set and may cause inconsistencies in some shells and environments (like Nix), where it becomes unclear whether we are linking -L/usr/lib or -L/veryesotericpath. If that path is unknown or disabled things don't work at all. So far I haven't found a way to "query" for these standard library paths please let me know if there is some some config file or something that I'm not aware of. This is causing me so many headaches.

Regarding the WORKSPACE logic, that sounds good. I'll try to get something like that working but that may take me a while. Tbh I'm not really using WORKSPACES myself and instead using the bzlmod patch series that I tried upstreaming a while ago but I think they got kinda dropped and would probably need some rework at this point. (A working bazel module for a registry is here with sample usage here).

Regarding the WORKSPACE logic, that sounds good. I'll try to get something like that working but that may take me a while. Tbh I'm not really using WORKSPACES myself and instead using the bzlmod patch series that I tried upstreaming a while ago but I think they got kinda dropped and would probably need some rework at this point. (A working bazel module for a registry is here with sample usage here).

Ugh that's my bad. I totally forgot about those :-( I think it enforced naming of the llvm-project repo in a way I found unsavory and I never figured out a solution after getting very lost in what Label appears to do vs what I think the documentation says it does...

aaronmondal added a comment.EditedFeb 6 2023, 5:19 PM

Ugh that's my bad. I totally forgot about those :-( I think it enforced naming of the llvm-project repo in a way I found unsavory and I never figured out a solution after getting very lost in what Label appears to do vs what I think the documentation says it does...

Even worse, there is now an additional @@ syntax that i think is supposed to solve issues like that. I *think* it tells bazel to not "resolve" the workspace name or something like that but maybe that's not how it works either 😆

  • Rebase
  • Make selects positive
  • Switch library implementation to zlib-ng

Rename zlib-ng target to zlib

aaronmondal edited the summary of this revision. (Show Details)Apr 1 2023, 8:28 AM

Switch to zip instead of tar.gz, as the latter seems to be broken.

Add strip_include_prefix to allow angled inclusion of zlib.h.

This shows up in "must review" for me, but I think the ball's in your court for preserving the ability to easily set up a dependent project. Please LMK if this (or any of your related patches) are blocked on my review.

utils/bazel/third_party_build/zlib-ng.BUILD
95–97

I don't understand the comment. It looks like it's enabled by default here?

aaronmondal planned changes to this revision.EditedMay 16 2023, 5:31 PM

The "migration" necessitated by this patch is pretty much the same as for the zstd commit https://reviews.llvm.org/D143344. The "system" behavior is still possible, but requires the workaround you proposed in https://reviews.llvm.org/D143344. Is that workaround good enough for your needs?

The current bzlmod patches don't allow customization of external dependencies and we need changes like this one to remove dependencies on maybe (which is incompatible with bzlmod by design). It's not exactly "blocking" the patches per se, but it blocks them from being customizable.

I'd also like to move this forward since it would finally get rid of all the zlib deprecation warnings when building with a recent clang and it fixes layering-check issues.

utils/bazel/third_party_build/zlib-ng.BUILD
95–97

Oh yes this is definitely wrong.

Is that workaround good enough for your needs?

It's not ideal from a user perspective that I have to copy-paste code from the workspace. We migrated with the change to zstd though 🤷 The usage of system zlib is something I'm advocating making possible based on recollections of past conversations with Chandler. This doesn't make people accidentally depend on an external version though, so I think it's not terrible from that perspective. I wish the ergonomics were a bit better, but I'm fine with this going in

GMNGeoffrey accepted this revision.May 16 2023, 6:00 PM

Improve comment that wasn't wrong but written in a way that was easy misread.

This revision is now accepted and ready to land.May 19 2023, 2:11 PM
aaronmondal marked an inline comment as done.May 19 2023, 2:12 PM
This revision was automatically updated to reflect the committed changes.
utils/bazel/third_party_build/zlib.BUILD