This is an archive of the discontinued LLVM Phabricator instance.

[buildbot] Annotated builder tweaks
Closed, Public

Authored by tra on Jul 9 2020, 1:13 PM.

Details

Summary
  • Allow bypassing source code checkouts. Cloning the complete LLVM tree takes 2-3 minutes, and not all bots need it (e.g. some CUDA bots just need to run tests built somewhere else).
  • Allow using out-of-tree annotated scripts. This is useful for tinkering with bot operations without having to update the buildmaster.
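
As a rough illustration of the two options above, here is a minimal Python sketch. It is not the code from this patch: `plan_build_steps`, its parameters, the `llvm-zorg` checkout directory name, and the step tuples are all hypothetical, and only the in-tree directory `zorg/buildbot/builders/annotated` comes from the discussion. The assumption is that an absolute script path means "out-of-tree, managed by the worker owner", while a bare name means "in-tree script from the zorg checkout".

```python
import posixpath

# In-tree location of annotated scripts, as named in the review discussion.
IN_TREE_SCRIPT_DIR = "zorg/buildbot/builders/annotated"


def plan_build_steps(script, checkout_sources=True, zorg_checkout="llvm-zorg"):
    """Return a list of (action, target) steps for one build (hypothetical model).

    script: a bare file name (in-tree script) or an absolute path
            (out-of-tree script maintained by the worker owner).
    checkout_sources: when False, the LLVM source checkout is skipped.
    zorg_checkout: assumed directory name of the zorg checkout on the worker.
    """
    steps = []
    if checkout_sources:
        # The expensive step that some bots do not need.
        steps.append(("checkout", "llvm-project"))
    if posixpath.isabs(script):
        # Out-of-tree script: run it exactly where the worker owner put it.
        script_path = script
    else:
        # In-tree script: resolve it relative to the zorg checkout.
        script_path = posixpath.join(zorg_checkout, IN_TREE_SCRIPT_DIR, script)
    steps.append(("run", script_path))
    return steps
```

For example, `plan_build_steps("/opt/bot/run.py", checkout_sources=False)` would yield a single "run" step for the absolute path, with no checkout step before it.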

Event Timeline

tra created this revision. Jul 9 2020, 1:13 PM
gkistanova requested changes to this revision. Jul 13 2020, 9:12 PM

Hello Artem,

Please make sure all the scripts annotated builder could run are in zorg.

This revision now requires changes to proceed. Jul 13 2020, 9:12 PM
tra added a comment. Jul 14 2020, 12:13 PM

Please make sure all the scripts annotated builder could run are in zorg.

I'm a bit confused. If running external scripts is OK, then such a script would not come from the zorg repo by definition. Are you suggesting that we should not allow external scripts and that only the ones under zorg/buildbot/builders/annotated/ should ever be run?

If external scripts are OK, then what's the point of keeping them in the zorg repo?

I don't mind publishing the scripts. However, I am not convinced that zorg is the right place for all the things build machines will get to run. It's impossible in practice to put everything under zorg's control because the script launched by the buildslave may rely on external infrastructure, pre-staged data, networking setup, etc.

That was the idea of this patch -- to introduce a clean cut-off point between the things controlled centrally by the zorg repo (letting the builder know when to do another build and collecting the results) and the things that would be administered locally by buildslave owners (how to actually build/test a given revision).
While many bots are uniform enough to let the buildmaster micromanage them, that's not always the case, and the separation of responsibilities would be very useful, IMO, at least in cases where things are a bit more complicated than "checkout source, run cmake, run ninja".

If you prefer to have a not-particularly-reusable build script in the repo as some sort of reference, I'm OK with it, even if I don't see much value there.

A script could be "external" as long as it is a part of the LLVM code base, i.e. committed under https://github.com/llvm to be available to others to review, use, and improve according to the "Apache 2.0 License with LLVM exceptions" and the Developer Policy.
llvm-zorg is the right place for the components of CI, but if your script is also used for generic builds, it is fine to have it as a part of CUDA.

Please feel free to ask if you have questions.

tra added a comment. Jul 21 2020, 12:26 PM

A script could be "external" as long as it is a part of the LLVM code base, i.e. committed under https://github.com/llvm to be available to others to review, use, and improve according to the "Apache 2.0 License with LLVM exceptions" and the Developer Policy.

Bot configuration and the build scripts are in D84258.

llvm-zorg is the right place for the components of CI, but if your script is also used for generic builds, it is fine to have it as a part of CUDA.

I'm not sure I understand what you mean. All I need is:

  • The ability to change what my bot is doing *quickly*. If it requires changing the buildmaster configs, it takes too long. That's why I want my workers to run an external script which I can update.
  • Control over what my bots run, w/o surprises. This is why I want to run the external script from an absolute path under my control, not from the freshly checked out zorg repo, which can be updated by many other people.

Is that acceptable?

gkistanova accepted this revision. Jul 22 2020, 4:06 PM

Ok. Let’s see how this will work in reality.

This revision is now accepted and ready to land. Jul 22 2020, 4:06 PM
This revision was automatically updated to reflect the committed changes.
tra added a comment. Jul 22 2020, 5:04 PM

Ok. Let’s see how this will work in reality.

So far the bots have been running in the staging area reasonably well and have completed about 1000 builds each. Round-trip time for the results is within 5-15 minutes, though there are periods when VMs are preempted.

Please let me know when the LLVM buildbot has been updated so I can make sure that my builders come up there.