This is an archive of the discontinued LLVM Phabricator instance.

[zorg][RISCV] Add a qemu-user based builder
ClosedPublic

Authored by asb on Feb 2 2023, 5:09 AM.

Details

Summary

I spent a bit of time looking at the best way of improving our test coverage for RISC-V. I hope in this coming year more hardware suitable for this kind of workload will become available, but for now qemu-based builders seem the best option. This patch adds a builder using qemu user mode emulation (lower fidelity than full system emulation, but faster and makes better use of the host cores).

Currently, clang, clang-tools-extra, and lld are built and tested. It should be possible to enable libunwind, libcxx, and compiler-rt (as long as -DCOMPILER_RT_BUILD_SANITIZERS=OFF is set - the sanitizer tests won't run under qemu-user) once the following issues are resolved:

The plan for now is just to have it on the staging buildbot and see how it works out. I'd like to later augment or replace with a multi-stage bootstrap build, but this seems the most useful addition for now.

As the description above indicates, some further tweaks are expected - I'm curious whether _any_ change to the buildbot/**.py files in zorg requires a scheduled buildbot master restart to take effect?

Event Timeline

asb created this revision.Feb 2 2023, 5:09 AM
Herald added a project: Restricted Project.Feb 2 2023, 5:09 AM
asb requested review of this revision.Feb 2 2023, 5:09 AM
asb added a comment.Feb 2 2023, 8:55 AM

I forgot to mention - this is a case where the proposed "gate keeper" build bots would really help. There's not a lot that can be done to make this builder faster, but if we gate it only on cases where faster builders haven't already found issues, it should allow the cycles used building + running tests to be used more effectively.

reames added a comment.Feb 2 2023, 2:19 PM

Not a review per se since I'm not super familiar with these parts of zorg, but just wanted to say I'm thrilled to see this!

gkistanova requested changes to this revision.Feb 2 2023, 11:55 PM

Thanks, Alex!

We can use this builder as a pilot of gated builds if you have bandwidth and are willing to babysit it for a while. I'm looking forward to it.

Out of curiosity, what response time do you expect from this builder? Do you think it would benefit from using ccache?

> As the description above indicates, some further tweaks are expected - I'm curious whether _any_ change to the buildbot/**.py files in zorg requires a scheduled buildbot master restart to take effect?

Staging buildbot automatically applies commits to llvm-zorg and reconfigures every 2 hours or so, meaning that you wouldn't need to wait that long. But this also means that other commits could hold you back if something goes wrong: it sanitizes llvm-zorg commits and holds off on applying them if a commit breaks the bot. Staging could also go down without notification for experiments, research, checking dependencies, and so on. It is not a stable environment, with all the pros and cons that entails. If you see issues, please feel free to ping me.

Please see my comments in line.

By the way, I see a worker with the name "rvqemubuilder" trying to connect to the staging. Is this yours? If so, you either need to rename it to "rv64gc-qemu-user" or change the worker name to "rvqemubuilder" in this patch.

buildbot/osuosl/master/config/builders.py
2747

You may skip this, as it is False by default.

2748

The builder would run check-all by default, but if you want to have it set explicitly, that's fine.

2751

You specified only llvm in the depends_on_projects param, but per -DLLVM_ENABLE_PROJECTS=clang;clang-tools-extra;lld you build other projects too.

Effectively this builder would build on commits to llvm only, ignoring commits to clang, clang-tools-extra, and lld, meaning wrong blames if a previous commit to any of these projects breaks the build.

Are you sure this is what you want?

If yes, could you elaborate on what you are after, please? If not, you can just list in the depends_on_projects param all the projects whose commits this builder should listen for and build, and remove the -DLLVM_ENABLE_PROJECTS and -DLLVM_ENABLE_RUNTIMES cmake args. UnifiedTreeBuilder will take care of them for you.
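
To illustrate the suggestion, here is a minimal standalone sketch of keeping the project list in one place and deriving the cmake argument from it. It does not import zorg and is not how UnifiedTreeBuilder is implemented; the helper `enable_projects_arg` and the shape of the `builder` dict are assumptions made for illustration only.

```python
def enable_projects_arg(depends_on_projects):
    """Derive a -DLLVM_ENABLE_PROJECTS argument from a project list.

    Hypothetical helper, not part of zorg. "llvm" is always built,
    so it is left out of the generated list.
    """
    projects = [p for p in depends_on_projects if p != "llvm"]
    return "-DLLVM_ENABLE_PROJECTS=" + ";".join(projects)


# Assumed, simplified stand-in for a builder entry in builders.py.
builder = {
    "name": "clang-rv64gc-qemu-user-single-stage",
    "workernames": ["rv64gc-qemu-user"],
    # Single source of truth: the projects whose commits trigger a
    # build are the same projects that get built.
    "depends_on_projects": ["llvm", "clang", "clang-tools-extra", "lld"],
}

cmake_arg = enable_projects_arg(builder["depends_on_projects"])
print(cmake_arg)
```

With a single list, the set of projects that trigger builds cannot drift from the set that is actually built, which is the source of the wrong blames described above.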

buildbot/osuosl/master/config/workers.py
331

Did you want to have "rv64gc-qemu-user" worker here? The one you configured above to run the "clang-rv64gc-qemu-user-single-stage" builder. Looks like you crossed the worker and the builder names here.
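
A mix-up like this can be caught mechanically before submitting. The following is a rough sketch, assuming builders and workers are reduced to plain Python data; the function `missing_workers` and the data shapes are hypothetical and are not zorg APIs.

```python
def missing_workers(builders, worker_names):
    """Return worker names referenced by builders but never defined.

    Hypothetical check, not part of zorg.
    """
    known = set(worker_names)
    return sorted(
        {w for b in builders for w in b["workernames"] if w not in known}
    )


# Assumed, simplified stand-in for the builder entry in builders.py.
builders = [
    {
        "name": "clang-rv64gc-qemu-user-single-stage",
        "workernames": ["rv64gc-qemu-user"],
    }
]

# workers.py as first submitted: the builder name was used by mistake.
workers_crossed = ["clang-rv64gc-qemu-user-single-stage"]
# workers.py with the worker name, as the review asks for.
workers_fixed = ["rv64gc-qemu-user"]

print(missing_workers(builders, workers_crossed))
print(missing_workers(builders, workers_fixed))
```

An empty result means every builder refers only to workers that are actually defined.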

This revision now requires changes to proceed.Feb 2 2023, 11:55 PM
asb added a comment.EditedFeb 3 2023, 7:25 AM

> Thanks, Alex!
>
> We can use this builder as a pilot of gated builds if you have bandwidth and are willing to babysit it for a while. I'm looking forward to it.

Yes, that would be great.

> Out of curiosity, what response time do you expect from this builder? Do you think it would benefit from using ccache?

Yes, I'll enable that. It's about 3.5-4h for the build stage (a recent build was real 221m35.734s; user 6814m33.973s; sys 59m55.406s). I can't see the figure for check-all in my notes right now as I've done a lot of component-by-component testing to work through various issues.
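
For a sense of scale, the timing figures quoted above can be converted into wall-clock hours and an average number of busy host cores, computed as (user + sys) / real. This is a small sketch of that arithmetic only; the helper `to_seconds` is an illustrative name, and the inputs are copied from the `time` output in this comment.

```python
import re


def to_seconds(stamp):
    """Convert a `time`-style value such as '221m35.734s' to seconds."""
    match = re.fullmatch(r"(\d+)m(\d+(?:\.\d+)?)s", stamp)
    if match is None:
        raise ValueError(f"unrecognised time value: {stamp!r}")
    minutes, seconds = match.groups()
    return int(minutes) * 60 + float(seconds)


real = to_seconds("221m35.734s")
user = to_seconds("6814m33.973s")
sys_time = to_seconds("59m55.406s")

wall_hours = real / 3600
# Average number of cores kept busy over the whole build.
busy_cores = (user + sys_time) / real

print(f"wall clock: {wall_hours:.2f} h")
print(f"average busy cores: {busy_cores:.1f}")
```

The ratio comes out at roughly 31, so the build is keeping the host's cores busy, which matches the point in the summary that qemu-user makes good use of the host cores.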

> As the description above indicates, some further tweaks are expected - I'm curious whether _any_ change to the buildbot/**.py files in zorg requires a scheduled buildbot master restart to take effect?
>
> Staging buildbot automatically applies commits to llvm-zorg and reconfigures every 2 hours or so, meaning that you wouldn't need to wait that long. But this also means that other commits could hold you back if something goes wrong: it sanitizes llvm-zorg commits and holds off on applying them if a commit breaks the bot. Staging could also go down without notification for experiments, research, checking dependencies, and so on. It is not a stable environment, with all the pros and cons that entails. If you see issues, please feel free to ping me.

That's perfect for the time being.

Thanks for the rapid review - I'll loop back soon with an update. I set up the factory by looking at my CMake invocation and, for each option, searching the rest of the file to see if anyone else was using it, assuming that meant I should just add it directly. Clearly that approach has meant I've accumulated all the mistakes from all the other configs!

asb updated this revision to Diff 496915.Feb 13 2023, 3:17 AM
asb marked 4 inline comments as done.

Address review comments (thanks!), and enable ccache using the now-preferred -DCMAKE_{C,CXX}_COMPILER_LAUNCHER options.
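
As a sketch of what the ccache change amounts to, the two launcher options can be generated as plain cmake arguments. CMAKE_C_COMPILER_LAUNCHER and CMAKE_CXX_COMPILER_LAUNCHER are standard CMake variables; the list name `extra_configure_args` and the assumption that `ccache` is on the worker's PATH are illustrative, not taken from the diff.

```python
# Assumption: ccache is installed and on the worker's PATH.
launcher = "ccache"

# One launcher option per language, expanding -DCMAKE_{C,CXX}_COMPILER_LAUNCHER.
extra_configure_args = [
    f"-DCMAKE_{lang}_COMPILER_LAUNCHER={launcher}" for lang in ("C", "CXX")
]

for arg in extra_configure_args:
    print(arg)
```

CMake then prefixes every C and C++ compile command with the launcher, so no changes to the compiler paths themselves are needed.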

I'd wondered if I'd be able to fix libcxx ahead of landing this, but it looks like D143158 is blocked on a request for a Buildkite pre-commit testing solution.

This revision is now accepted and ready to land.Feb 13 2023, 7:20 PM
This revision was landed with ongoing or failed builds.Feb 20 2023, 12:32 PM
This revision was automatically updated to reflect the committed changes.

Now live at https://lab.llvm.org/staging/#/builders/241 - though give it a few days for me to babysit any teething issues (looks like the first build failed due to a mutex assertion in ccache for instance).