This is an archive of the discontinued LLVM Phabricator instance.

Add some facilities to work with a git monorepo (experimental setup)
ClosedPublic

Authored by mehdi_amini on Nov 6 2016, 11:35 AM.

Details

Summary

Some changes are made to cmake, especially the addition of a new
LLVM_ENABLE_PROJECTS option that makes the build system aware of
the monorepo directory structure.

Also a new script is added in llvm/utils/git-svn/. When present in
the $PATH, it enables a git llvm command. It is providing at this
point only the ability to push from the git monorepo: git llvm push.
It is intended to evolves with more features, for instance I plan on
features like git llvm show r284955 to help working with sequential
revision numbers.
The push feature is taken from Justin Lebar's script available here:
https://github.com/jlebar/llvm-repo-tools/

Diff Detail

Repository
rL LLVM

Event Timeline

mehdi_amini retitled this revision from to Add some facilities to work with a git monorepo (experimental setup).
mehdi_amini updated this object.
mehdi_amini added a reviewer: jlebar.
mehdi_amini added a subscriber: llvm-commits.
jlebar added inline comments.Nov 6 2016, 12:05 PM
llvm/utils/git-svn/git-llvm
1 ↗(On Diff #76991)

It is important that we retain the warning (somewhere) that this tool does not handle renames properly.

112 ↗(On Diff #76991)

We have a global -q and a per-command --verbose? That does not seem right.

124 ↗(On Diff #76991)

Please be consistent with respect to ' or ". Personally, I prefer single quotes because it's less visually dense.

129 ↗(On Diff #76991)

Please be consistent with whether we write statuses to stderr or stdout.

129 ↗(On Diff #76991)

Please wrap lines to 80 characters.

130 ↗(On Diff #76991)

I'm kind of uncomfortable with all the chdir'ing we're doing. (It was tenuous before, as well, but now we're doing even more.) I think it would be clearer if we did a single chdir (maybe to the git root) and then specified an explicit cwd for all other commands.

133 ↗(On Diff #76991)

We may want to think about whether and how we should change this so that it will work if we later add a directory to GIT_TO_SVN_DIR. I think it may just be a matter of changing the other svn update command, below. Maybe at that point it should be factored out into a separate function.

143 ↗(On Diff #76991)

Spaces around =.

154 ↗(On Diff #76991)

Please rearrange this file applying some sort of logical ordering.

Usually Python programs do a rough topological sort, with main at the bottom.

220 ↗(On Diff #76991)

This prompt may not be necessary now that you're putting the svn checkout inside the .git dir.

jlebar added inline comments.Nov 6 2016, 12:05 PM
llvm/utils/git-svn/git-llvm
1 ↗(On Diff #76991)

Can we mention that this is derived from my work and link to the original repository in the commit message? I would appreciate that, and it may also reduce confusion among users.

76 ↗(On Diff #76991)

Do we need this at all? None of the commands at the moment take positional arguments.

87 ↗(On Diff #76991)

push changes back to the LLVM SVN repository

116 ↗(On Diff #76991)

Remove blank line (or if you want it, please be consistent everywhere).

118 ↗(On Diff #76991)

Spaces after commas

118 ↗(On Diff #76991)

Please use the git helper consistently (or get rid of it).

120 ↗(On Diff #76991)

s/get/find/

124 ↗(On Diff #76991)

Nit, I think we should namespace the directory we create (i.e. "llvm-upstream-svn" or something).

131 ↗(On Diff #76991)

Please use the svn helper consistently, or get rid of it.

137 ↗(On Diff #76991)

Please pull this out into a helper function.

285 ↗(On Diff #76991)

run() and shell() seem to do basically the same thing?

288 ↗(On Diff #76991)

Can we use .get instead?

305 ↗(On Diff #76991)

Can we import print_fn from future instead of using the old-style print syntax? The new thing is IMO easier to understand, and it also eases the py3 conversion.

jlebar edited edge metadata.Nov 6 2016, 12:08 PM

I cannot speak to the cmake changes here.

llvm/docs/GettingStarted.rst
725 ↗(On Diff #76991)

A helper script is provided in llvm/utils/git-svn/git-llvm. After you add it to your path, you can push committed changes upstream with git llvm push.

mehdi_amini marked 22 inline comments as done.Nov 6 2016, 2:59 PM

Thanks Justin!

llvm/utils/git-svn/git-llvm
1 ↗(On Diff #76991)

Can we mention that this is derived from my work and link to the original repository in the commit message? I would appreciate that, and it may also reduce confusion among users.

Of course, I wanted to add this in the commit message but forgot in the end.
Note that my idea is that the script should evolve with other subcommand than push, and persist even after the move to github (for example I mentioned llvm show r12345 to continue interacting with SVN rev).

It is important that we retain the warning (somewhere) that this tool does not handle renames properly.

Sure.

Git does not record rename in the first place anyway, it there more to this than "rename are not shown as rename in SVN"?

76 ↗(On Diff #76991)

I can kill this for now. TBH I just started from git-clang-format.

133 ↗(On Diff #76991)

I didn't get what you mean here?

154 ↗(On Diff #76991)

Done.
(I took this from git-clang-format. Is there some sort of official convention (like pep8) that would mention something?)

285 ↗(On Diff #76991)

Merged!

305 ↗(On Diff #76991)

It was included: this patch would have thrown at runtime. Fixed.

mehdi_amini updated this revision to Diff 76995.Nov 6 2016, 3:00 PM
mehdi_amini marked 4 inline comments as done.
mehdi_amini edited edge metadata.

Address Justin's comments on the python script

mehdi_amini updated this object.Nov 6 2016, 3:01 PM
mehdi_amini added inline comments.
llvm/utils/git-svn/git-llvm
129 ↗(On Diff #76991)

The file passes pep8 now.

Added @beanz and @bogner to make sure I didn't mess up the cmake part.

mehdi_amini updated this revision to Diff 76996.Nov 6 2016, 3:06 PM
mehdi_amini updated this object.

Fix typo

mehdi_amini updated this revision to Diff 76997.Nov 6 2016, 3:13 PM

Fix another typo

jlebar added a comment.Nov 7 2016, 9:35 AM

Sorry for all the nits; I'm sort of picky about Python. :)

llvm/utils/git-svn/git-llvm
12 ↗(On Diff #76997)

'r' isn't necessary here (and docstrings canonically don't have them unless it's necessary).

40 ↗(On Diff #76997)

I'm not sure we need this? The docstring at the top of the file should be sufficient -- you can access it as __doc__ and so on.

96 ↗(On Diff #76997)

Nit, some commands (like svn update) quite reasonably take more than .1s. So I don't think we should beg the question [1] and say "warning".

[1] https://en.wikipedia.org/wiki/Begging_the_question

143 ↗(On Diff #76997)

Don't you just want to strip the output of git instead?

159 ↗(On Diff #76997)

This isn't your change, but I think this will break if you try to touch a file with a space in its name, and we do have such files in the tree. I guess we could fix it later if you wanted to leave a TODO.

160 ↗(On Diff #76997)

Nit, this isn't quite right if a filename starts or ends with " ". Which, I grant, is dumb. But I sort of worry about a file named ../info (that is, a file named info in a directory named " .."), which would cause us to delete your .git/info. Maybe there's a worse attack.

176 ↗(On Diff #76997)

Nit, the usual convention I've seen is to use double-quotes when the string contains an apostrophe, on the grounds that that's cleaner than backslash-escaping it.

176 ↗(On Diff #76997)

No need to put svn_root in parens. I don't object if you want to put them everywhere (% formatting definitely has its quirks, and using parens makes them go away, although at the expense of more typing, so I usually leave them out), but we should be consistent.

179 ↗(On Diff #76997)

The convention I've seen in other similar programs is to call these top-level functions cmd_foo, where foo is the name typed on the command line (i.e. git llvm foo).

I kind of like this convention; see the confusion I had about svn_push vs push below.

182 ↗(On Diff #76997)

I'm going to take down the script since this will be canonical, so this won't be helpful, I think. Although I do appreciate it. :)

186 ↗(On Diff #76997)

Same here with \'.

204 ↗(On Diff #76997)

Maybe svn_push_one_rev, or push_one_rev_to_svn or something? Otherwise it wasn't clear to me how these two functions interacted, but I originally believed that push called svn_push, because push matched the name of the top-level command (i.e. git llvm push) and because it appears below svn_push in the topological sort.

(May be worth fixing the topo sort too.)

240 ↗(On Diff #76997)

This is actually no longer necessary. We needed this when I was using patch, but git apply's man page says it will never leave .rej files around unless you pass --reject, hooray.

252 ↗(On Diff #76997)

Same here with \'.

261 ↗(On Diff #76997)

Should this var be called subcommands?

267 ↗(On Diff #76997)

Why are we doing count as opposed to store_true? VERBOSE and QUIET are nominally bools, per the top of the file, but then below we store an int into them...

287 ↗(On Diff #76997)

Huh, I've never seen that before. That is cool!

293 ↗(On Diff #76997)

Should be

make *up* your mind

But you probably want https://docs.python.org/2.7/library/argparse.html#mutual-exclusion

1 ↗(On Diff #76991)

Git does not record rename in the first place anyway, it there more to this than "rename are not shown as rename in SVN"?

Right, but AIUI git-svn will notice renames using whatever your current git settings are and commits them as renames in SVN.

I think we're on the same page about this? I agree all that's necessary is to warn somewhere that "renames are not shown as renames in SVN."

Since it should be possible to mirror git-svn's behavior, perhaps it should be a TODO?

133 ↗(On Diff #76991)

I mean: Will this script continue to work if we add a new directory to GIT_TO_SVN_DIR?

I think it will not because after doing so we would have to run svn update parallel-libs/trunk, but we only do that when creating the svn directory for the first time.

So I was wondering if the *other* svn update command, which we run every time, should do svn update cfe/trunk llvm/trunk blah/trunk just like this first one does, which (I think?) would let us pick up new GIT_TO_SVN_DIRs.

Then I was wondering, if so, if we should pull out that svn update command into a helper function, because code duplication.

154 ↗(On Diff #76991)

Is there some sort of official convention (like pep8) that would mention something?

Not to my knowledge. But it sort of follows from the fact that

if __name__ == '__main__':
  main()

has to appear at the very bottom of the file (because only at that point have all of our functions been declared). Because the call to main() appears at the bottom, it sort of makes sense for main itself to appear nearby. Then everything else follows from there.

(It also sort of makes sense for constants like VERBOSE to appear at the top of the file, out of parallelism with other languages if nothing else.)

This revision was automatically updated to reflect the committed changes.
mehdi_amini marked 8 inline comments as done.
mehdi_amini marked 22 inline comments as done.Nov 7 2016, 10:17 AM
mehdi_amini added inline comments.
llvm/utils/git-svn/git-llvm
40 ↗(On Diff #76997)

Removed. It was a remainder of git-clang-format (desc was used for argparse, I replaced with __doc__).

96 ↗(On Diff #76997)

The .1 seems arbitrary, since this is behind the -v flag, what about always printing it?

143 ↗(On Diff #76997)

For some reason, I couldn't get it to not have an empty entry after the split, even by adding strip() or rstrip() before.
However replacing split('\n') with splitlines() address the issue.

159 ↗(On Diff #76997)

Fixed.

160 ↗(On Diff #76997)

I think we can wait for someone able to commit this at the top of the SVN repo.

1 ↗(On Diff #76991)

I added it to the doc.

133 ↗(On Diff #76991)

OK, fixed. That's a one liner though, I didn't feel like pulling it in a separate function. See line 156, was: svn(svn_repo, 'update') and is now svn(svn_repo, 'update', *list(GIT_TO_SVN_DIR.values())).

mehdi_amini reopened this revision.Nov 7 2016, 10:17 AM
mehdi_amini marked 5 inline comments as done.

Address Justin's comments.

This looks good to me! My only remaining nontrivial concern is about the error handling in the parser.

Our version of phab has a bug where it sometimes won't delete comments. Sometimes those comments do show up when I hit "submit", and sometimes they don't. I have a comment that I've tried to delete that starts with:

I am afraid that if the only notice we have about this limitation is in the HTML documentation,

If this shows up when I hit "submit", you can ignore it, I decided it's not a big deal and mean to delete it.

llvm/utils/git-svn/git-llvm
22 ↗(On Diff #77054)

Maybe we can check the Python version in addition to or instead of this comment (sys.version >= (2,7))?

238 ↗(On Diff #77054)

Can we move this helper up? Kind of weird that it's here.

245 ↗(On Diff #77054)

Ok, the __doc__ will now be printed out when you run git llvm -h, so maybe it shouldn't redirect people to looking at the output of -h. :)

253 ↗(On Diff #77054)

Maybe a better name than group? verbosity_group?

279 ↗(On Diff #77054)

Because you're using a mutually_exclusive_group, you don't need this anymore, right?

282 ↗(On Diff #77054)

What happens if you just run git llvm? It seems like args.func would be None? Maybe we should print a better error message (via parser.error)?

96 ↗(On Diff #76997)

Always printing the time elapsed with -v sounds fine to me.

1 ↗(On Diff #76991)

I am afraid that if the only notice we have about this limitation is in the HTML documentation,

(a) many people won't see it (because they're just going to run git llvm -h,
(b) maintainers of this code will not be aware of the limitation, and
(c) if we ever fix this problem, we will not necessarily remember to update the HTML documentation.

That's why I was hoping we could put something inside of this file, like a TODO, or a note in the --help / docstring.

I know this is a fine point, and I'm sorry to belabor it, but I have found that the documentation that is most likely to be up to date is the documentation that lives closest to the code. I am concerned about putting important notes in a separate file for the same reasons I would be concerned about duplicating code into two files.

mehdi_amini marked 11 inline comments as done.Nov 7 2016, 11:42 AM
mehdi_amini added inline comments.
llvm/utils/git-svn/git-llvm
282 ↗(On Diff #77054)

It will print:

usage: git llvm [-h] [-q | -v] {push} ...
git llvm: error: too few arguments
1 ↗(On Diff #76991)

I added it to the docstring of cmd_push, so that now git llvm push -h prints:

$ git llvm push -h
usage: git llvm push [-h] [-n] [GIT_REVS]

Push changes back to SVN: this is extracted from Justin Lebar's script
available here: https://github.com/jlebar/llvm-repo-tools/ Note: a current
limitation is that git does not track file renames, so they will show up in SVN
as delete+add.

positional arguments:
  GIT_REVS       revs to push (default: everything not in the branch's
                 upstream, or not in origin/master if the branch lacks an
                 explicit upstream)

optional arguments:
  -h, --help     show this help message and exit
  -n, --dry-run  Do everything other than commit to svn. Leaves junk in the
                 svn repo, so probably will not work well if you try to commit
                 more than one rev.
jlebar accepted this revision.Nov 7 2016, 11:47 AM
jlebar edited edge metadata.

lgtm for python bits.

This revision is now accepted and ready to land.Nov 7 2016, 11:47 AM

Justin B., even if we went with a multirepo, this script might still be useful, because I expect Mehdi and I will maintain tools for a monorepo mirror that points to the new multirepo, just as we do today.

mehdi_amini marked 2 inline comments as done.
mehdi_amini edited edge metadata.

Only include the python script in this rev.

This revision was automatically updated to reflect the committed changes.
pcc added a subscriber: pcc.Nov 7 2016, 12:35 PM
pcc added inline comments.
llvm/trunk/utils/git-svn/git-llvm
216

This doesn't appear to give the correct result for worktrees. It looks like you want to use git rev-parse --git-common-dir here.

mehdi_amini added inline comments.Nov 7 2016, 12:45 PM
llvm/trunk/utils/git-svn/git-llvm
216

Thanks, fixed in r286140