Useful for migrating from one monorepo to another, or from a multirepo
to a monorepo.
If we go with this document moving forward, I would first purge it so that it reflects the current situation. Otherwise we could also just archive it and start a new one with the actual action plan.
I feel adding more stuff here will be a bit confusing moving forward.
llvm/docs/Proposals/GitHubMove.rst | ||
---|---|---|
874–875 | this can be done shorter and faster: | |
906 | Why not use rebase --onto instead of format-patch/am? |
llvm/docs/Proposals/GitHubMove.rst | ||
---|---|---|
906 | Ignore the above, it won't work with the --directory=llvm hacks. |
llvm/docs/Proposals/GitHubMove.rst | ||
---|---|---|
934–935 | I'm afraid this won't work. --since sets a lower bound on the commits shown by git-log, but git-log starts from the top (HEAD), so in most cases it will be equivalent to git log -n1 --author <xxx>, which is absolutely not what's needed here. Perhaps using both --since and --until can do the trick. And if we find the commit times reliable enough, then we can automate it: merge_base=$(git merge-base origin/master my-branch) |
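
To make the --since/--until suggestion above concrete, here is a minimal sketch run against two throwaway repositories. Everything in it is invented for illustration: the repository names, the commit messages and dates, and the `commit_at` helper. It also assumes the conversion keeps the same committer date in both histories, since that is the date git log --since/--until filter on.

```shell
set -e
export GIT_AUTHOR_NAME="Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Dev" GIT_COMMITTER_EMAIL="dev@example.com"
work=$(mktemp -d)

# Hypothetical helper: commit staged changes with a fixed author/committer date.
commit_at() {
    GIT_AUTHOR_DATE="$1" GIT_COMMITTER_DATE="$1" git commit -q -m "$2"
}

# Stand-in for the old split repository, with a local branch forked at r2.
git init -q "$work/old"
cd "$work/old"
git symbolic-ref HEAD refs/heads/master
echo one > a.txt;   git add a.txt; commit_at "2018-10-01 12:00:00 +0000" "r1: first"
echo two > a.txt;   git add a.txt; commit_at "2018-10-02 12:00:00 +0000" "r2: second"
git checkout -q -b my-branch
echo local > b.txt; git add b.txt; commit_at "2018-10-05 12:00:00 +0000" "local work"
git checkout -q master
echo three > a.txt; git add a.txt; commit_at "2018-10-03 12:00:00 +0000" "r3: third"

# Stand-in for the monorepo: same commits and dates, content under llvm/.
git init -q "$work/mono"
cd "$work/mono"
git symbolic-ref HEAD refs/heads/master
mkdir llvm
echo one > llvm/a.txt;   git add llvm; commit_at "2018-10-01 12:00:00 +0000" "r1: first"
echo two > llvm/a.txt;   git add llvm; commit_at "2018-10-02 12:00:00 +0000" "r2: second"
echo three > llvm/a.txt; git add llvm; commit_at "2018-10-03 12:00:00 +0000" "r3: third"

# In the old repository: where did my-branch fork, and who committed that when?
cd "$work/old"
merge_base=$(git merge-base master my-branch)
when=$(git log -n1 --pretty=format:%ci "$merge_base")
who=$(git log -n1 --pretty=format:%ae "$merge_base")

# In the monorepo: bound the search on both sides so only that instant matches.
cd "$work/mono"
match=$(git log --author="$who" --since="$when" --until="$when" \
        --pretty=format:%s master)
echo "$match"
```

With only --since, the query would also return r3, which is the problem raised in the comment. A human should still check the result, since several commits can share a timestamp.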
Thanks a lot for proofreading my totally untested code. :)
I agree, but my goal is to avoid bikeshedding on this as much as possible, and I'm sort of afraid that as soon as I create a new page I'm going to get to enjoy exactly that... At some point someone is going to have to write the canonical documentation, and they can move this there. And until that canonical documentation exists, does it hurt too much to have this here?
llvm/docs/Proposals/GitHubMove.rst | ||
---|---|---|
874–875 | Hm... I *think* you're right. I was worried about weird edge cases, but now I can't articulate any. Done, thanks. | |
906 | These steps have two separate repositories; git rebase only operates within one repository. I did it this way because if I suggested that pulling both histories into one repository was a good way to do it, people would complain that they can't switch to the new monorepo without making their .git directory larger. :) | |
934–935 | Added --until. I decided not to script it because this is not 100% sound, and so I'd rather encourage a human to be in the loop. |
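
As a rough illustration of the two-repository format-patch/am flow described above, the following sketch replays a local branch from a split repository into a monorepo, using --directory=llvm to prefix the paths. The repositories, file names and commit messages are hypothetical stand-ins, not the real LLVM repositories.

```shell
set -e
export GIT_AUTHOR_NAME="Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Dev" GIT_COMMITTER_EMAIL="dev@example.com"
work=$(mktemp -d)

# Hypothetical split repository: one upstream commit plus a local branch.
git init -q "$work/llvm-split"
cd "$work/llvm-split"
git symbolic-ref HEAD refs/heads/master
echo one > a.txt
git add a.txt
git commit -q -m "upstream change"
git checkout -q -b my-branch
echo local > b.txt
git add b.txt
git commit -q -m "local work"

# Hypothetical monorepo: the same upstream content, but under llvm/.
git init -q "$work/mono"
cd "$work/mono"
git symbolic-ref HEAD refs/heads/master
mkdir llvm
echo one > llvm/a.txt
git add llvm/a.txt
git commit -q -m "upstream change"

# Export the local commits from the split repository as one mbox file...
cd "$work/llvm-split"
git format-patch --stdout master..my-branch > "$work/my-branch.patch"

# ...and replay them in the monorepo, prefixing every path with llvm/.
cd "$work/mono"
git checkout -q -b my-branch master
git am -q --directory=llvm "$work/my-branch.patch"
replayed=$(git log -n1 --pretty=format:%s)
echo "$replayed"
```

Neither repository ever fetches the other's objects, which is the point made above about not growing the .git directory. The replayed commits get new hashes.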
llvm/docs/Proposals/GitHubMove.rst | ||
---|---|---|
934–935 | How would that deal with multiple merges (not necessarily only with origin/master) with many conflicts? Would that essentially require to manually redo all merges? |
llvm/docs/Proposals/GitHubMove.rst | ||
---|---|---|
934–935 | Then you still need to add %an <%ae> to the merge-base description format, e.g.: git log -n1 --pretty="format:Date-Time: %aD %at%nAuthor: %an <%ae>%nDescription:%s" $(git merge-base origin/master my-branch) so users have something to copy-paste from. |
llvm/docs/Proposals/GitHubMove.rst | ||
---|---|---|
934–935 |
This is not trying to recreate all the merges. Recreating merges without duplicating upstream commits requires substantially more complicated history rewriting à la git filter-branch, which is not for the faint of heart... |
llvm/docs/Proposals/GitHubMove.rst | ||
---|---|---|
927–930 | Perhaps this comment should be rewritten with a bit less git jargon. I'm bad at writing good English prose, so I won't be giving any wording suggestions :) |
llvm/docs/Proposals/GitHubMove.rst | ||
---|---|---|
934–935 | Ah, I think I understand. So it's basically nothing more than "find the commit in the new final repo corresponding to the latest merged commit from some other llvm git mirror and merge with that"? I.e. if you know you are at HEAD you could just do the merge? Then I don't find this very helpful, since IMHO the main problem comes from the actual transition to the new repo and everything it involves, not from finding the point of such a transition. I.e. stuff like rewriting history to use the final repo commits instead (if you don't want to live with duplicated commits), or dealing with the merge conflicts again and again (since I don't think git would understand that they were already resolved by a previous non-final merge). The painful stuff. |
llvm/docs/Proposals/GitHubMove.rst | ||
---|---|---|
934–935 |
Exactly.
It depends on what you want to achieve. If you do not care about past history other than being able to find your own downstream commits, then doing a single merge should be fine.
Yes, rewrites are considerably more complicated.
And here - no, you will not need to perform *any* merge conflict resolution besides the first one described here. |
llvm/docs/Proposals/GitHubMove.rst | ||
---|---|---|
934–935 | The last statement was a bit of an overstatement, but the point is that your conflicts from now on will not be any different from any other merge conflicts you had before. |
llvm/docs/Proposals/GitHubMove.rst | ||
---|---|---|
934–935 | I was talking about existing merge conflicts. I.e. suppose you've just completed a big merge from one of the llvm mirrors and resolved many non-trivial conflicts. Now, even if you didn't introduce any new changes and decided to merge the llvm git prototype right after that (at the same commit), you'd have to go through the whole tedious process again, i.e. you'll have to resolve all the conflicts again, potentially introducing new bugs. You could just as well do something like merge -s ours and hope for the best (that wouldn't solve duplicated commits though). Btw, git will refuse to merge with the prototype branch unless you pass --allow-unrelated-histories. |
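
The --allow-unrelated-histories remark above can be reproduced with a small sketch, assuming Git 2.9 or later. The two repositories are throwaway stand-ins with invented contents, and the files are deliberately placed at different paths so the merge has no conflicts to resolve.

```shell
set -e
export GIT_AUTHOR_NAME="Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Dev" GIT_COMMITTER_EMAIL="dev@example.com"
work=$(mktemp -d)

# Hypothetical downstream repository with its own root commit.
git init -q "$work/downstream"
cd "$work/downstream"
git symbolic-ref HEAD refs/heads/master
echo downstream > a.txt
git add a.txt
git commit -q -m "downstream work"

# Hypothetical monorepo prototype with an unrelated root commit.
git init -q "$work/mono"
cd "$work/mono"
git symbolic-ref HEAD refs/heads/master
mkdir llvm
echo upstream > llvm/a.txt
git add llvm/a.txt
git commit -q -m "monorepo commit"

# Bring the monorepo history into the downstream repository.
cd "$work/downstream"
git fetch -q "$work/mono" master
mono_head=$(git rev-parse FETCH_HEAD)

# A plain merge is rejected because the two histories share no ancestor.
if git merge -q -m "Merge monorepo" "$mono_head" 2>/dev/null; then
    plain_merge=accepted
else
    plain_merge=refused
fi

# Opting in explicitly lets the merge go through.
git merge -q --allow-unrelated-histories -m "Merge monorepo" "$mono_head"
second_parent=$(git rev-parse HEAD^2)
echo "plain merge: $plain_merge"
```

This only shows the mechanics of joining the histories. It does nothing about the duplicated upstream commits or the previously resolved conflicts discussed above.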
llvm/docs/Proposals/GitHubMove.rst | ||
---|---|---|
866–868 | I guess the use case here is for local/in-progress branches, rather than any sort of real downstream consumer. Maybe we should point that out directly rather than having to infer it from the structure of the repos? I guess this one could be called "Local branches from one of the prototype monorepos". Also it's a bit odd that this is first, before the "multirepo to monorepo" local branches case below ("Local branches from the official LLVM git mirrors"). I expect most developers have that case, since the monorepo prototypes are pretty new and aren't documented as the official way to do things. | |
872 | It's not clear what origin/master is meant to be here, since you don't say what origin is at all. Having the sections based on use cases rather than structure of the repos as I suggest above would probably help. | |
933–935 | I guess the second command here should point at monorepo/master so it doesn't find the exact same commit as git merge-base did. In any case, this is an extremely tedious way to deal with this, especially when you have multiple branches. | |
938–943 | I expect this is the common case for anyone with any significant out of tree changes. We probably need to pay a bit more attention to it. |
llvm/docs/Proposals/GitHubMove.rst | ||
---|---|---|
938–943 | Agreed. It is definitely the situation for us. I suspect the vast majority of cases will be people using the existing project repositories, with their own long-lived branches that have periodic merges from master. Essentially, downstream looks like a set of "downstream master" branches that periodically merge from the "upstream master" branches to sync up. In our situation we have multiple such "downstream master" branches, one for each project repository we're using (llvm, compiler-rt, etc.). It's not at all clear to me how we will transition those multiple branches to a single "downstream master" branch on the monorepo. In the worst case we'll replay all of our downstream commits on top of the monorepo master, but I am hoping there is a better way. |
llvm/docs/Proposals/GitHubMove.rst | ||
---|---|---|
938–943 | I hear that you and others are in this position. Do you want to rewrite history so it's like you always had a monorepo? That's tantamount to "replaying all of our downstream commits on top of the monorepo master". It's going to be painful, but it sounds like you have a rough idea of how to do it? I'm not sure what a better way would be, in part because I'm not sure what you're trying to accomplish. Anyway, writing such a tool was not what I meant to volunteer for when I volunteered to write this patch. Do you want me to abandon it? |
llvm/docs/Proposals/GitHubMove.rst | ||
---|---|---|
938–943 | I believe this patch is still very useful, as it at least enumerates the options if not providing full solutions. |
llvm/docs/Proposals/GitHubMove.rst | ||
---|---|---|
938–943 | I have to experiment a bit, but it's possible that some unpushed changes to git-subtree that I have might help. If that turns out to be the case, I'd make it a high priority to make those changes available (probably as a fork on GitHub) and we could point people there if they want to try it. I know that writing such tools is difficult. I did something similar in git-subtree and it took a very long time to get everything working. The document is certainly useful. You shouldn't abandon it! |
llvm/docs/Proposals/GitHubMove.rst | ||
---|---|---|
938–943 | I've sent llvm-dev some thoughts about an approach that makes this a lot easier to deal with. See http://lists.llvm.org/pipermail/llvm-dev/2018-October/127334.html |
Let's submit this soon, even if it will be further edited.
llvm/docs/Proposals/GitHubMove.rst | ||
---|---|---|
860–861 | Probably worth explicitly mentioning that this is intended for when you're OK with changing your commit hashes as part of the migration. | |
881 | Don't think you need merge-base here. git log origin/master..my-branch should be sufficient? | |
883–885 | I found it easier to output to a single file, using git format-patch --stdout origin/master..my-branch > foo.patch. (but that doesn't really matter, maybe not worth mentioning) | |
886–889 | Easier to just go by svn revision number of the upstream commit, probably? I'd just go directly: git checkout -b my-branch 'origin/master^{/llvm-svn=1234\W}' (\W, aka end-of-word, is only necessary if your revision number is fewer digits than current.) | |
938 | Can now also suggest the migrate-downstream-fork.py tool I wrote about on the list. (Could just point to the mailing list archive initially, before adding better instructions). |
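
The revision-search and range suggestions in the comments above can be tried out on a toy history. This is a sketch under stated assumptions: the repository, the revision numbers and the "llvm-svn=N" trailer format in the commit messages are made up for the example, and `[^0-9]` is used as a portable stand-in for the `\W` terminator mentioned above.

```shell
set -e
export GIT_AUTHOR_NAME="Dev" GIT_AUTHOR_EMAIL="dev@example.com"
export GIT_COMMITTER_NAME="Dev" GIT_COMMITTER_EMAIL="dev@example.com"
work=$(mktemp -d)

# Hypothetical monorepo whose commit messages carry an svn revision trailer.
git init -q "$work/mono"
cd "$work/mono"
git symbolic-ref HEAD refs/heads/master
for rev in 123 1234 1235; do
    echo "$rev" > file.txt
    git add file.txt
    git commit -q -m "Change $rev" -m "llvm-svn=$rev"
done

# <rev>^{/regex} picks the youngest commit reachable from <rev> whose message
# matches. The trailing [^0-9] keeps "123" from also matching 1234 and 1235.
base=$(git rev-parse "master^{/llvm-svn=123[^0-9]}")
base_subject=$(git log -n1 --pretty=format:%s "$base")

# Start the local branch from that upstream revision and add some work.
git checkout -q -b my-branch "$base"
echo local > local.txt
git add local.txt
git commit -q -m "local work"

# A plain range lists the local-only commits; no explicit merge-base is needed.
pending=$(git log --pretty=format:%s master..my-branch)
echo "base: $base_subject"
echo "pending: $pending"
```

Dropping the `[^0-9]` would make the search land on the newest commit instead, because "llvm-svn=123" is a prefix of the later revision numbers.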
llvm/docs/Proposals/GitHubMove.rst | ||
---|---|---|
938 | Yes, we should absolutely mention migrate-downstream-fork.py. It worked great for our downstream forks. Here's the link to James' post about it: http://lists.llvm.org/pipermail/llvm-dev/2018-November/127496.html I also added a tool to zip downstream forks based on a submodule update history: http://lists.llvm.org/pipermail/llvm-dev/2018-November/127704.html I think both of these tools will be useful for folks living downstream. |
I ended up finding these instructions accidentally when searching for other monorepo related stuff, and I followed these instructions, and they worked. Thanks for writing them up!
What's the status of this? Since the monorepo prototype seems very likely to be blessed soon, I was thinking of writing up some recipes for common downstream migrations. This document would seem to be the right place to put them but it seems from all the comments that this is very much in flux.
Would it make more sense to create a separate "downstream migration process" document? This one has a lot of content about the whats and whys of the migration while I think most users will want to cut to the chase and know what they should do to start using the monorepo.
What's the status of this?
Mostly I'm just kind of a little apprehensive about taking another stab at this patch since getting it into a state that everyone is happy with seems pretty challenging.
As @bogner mentioned, the last section (Multirepo to Monorepo, With Merges) is almost certainly the most common downstream situation. I may write up some recipes under this header and move it to a more prominent position in a separate patch. The discussion of the other situations is orthogonal to the kinds of things I found I needed to do downstream.