This is an archive of the discontinued LLVM Phabricator instance.


Differential D24167

Moving to GitHub - Unified Proposal
ClosedPublic

Authored by mehdi_amini on Sep 1 2016, 4:19 PM.


Details

Reviewers

dexonsmith

Commits

rG647deb8f1a2d: Moving to GitHub - Unified Proposal
rL284077: Moving to GitHub - Unified Proposal

Summary

This document describes the proposal to move to GitHub, and includes the two proposals side-by-side with a comparison between the two.
It also goes through various workflow examples, presenting the current set of commands followed by the ones involved in each of the two proposals.

It is intended to supersede the previous "submodule proposal" document entirely, and drive the design of a survey addressed to the community.

Diff Detail

Repository: rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden.

rengolin added inline comments.Sep 2 2016, 1:26 AM

docs/Proposals/GitHubMove.rst
161	IMO, we don't need to keep the same format, but that's a good point. Though, it would be better to outline the two options in one quick phrase than leave the other implied.

mehdi_amini marked 4 inline comments as done.Sep 2 2016, 1:46 AM

mehdi_amini added inline comments.

docs/Proposals/GitHubMove.rst

298

I added "with the sequence" following your comment to make it more clear.

425

I'm not sure I follow: AFAIK recursive is for nested submodules, which is not part of the proposal. So to be clear I expect --recursive to be a no-op. I can be wrong, but I'll need some more explanation if I missed something obvious here.

If your point is about cloning *all* the sub-projects and not only just a selected list, then --recursive is not the right option, just doing git submodule update without any other flag will do it. I'll spell it out.

618

I'm sorry, I don't follow. You mention a change in the flow for commit. Here is what's mentioned in the section I referred to; can you clarify where the inaccuracy is?

Workflow today:

# direct SVN checkout
svn co https://user@llvm.org/svn/llvm-project/llvm/trunk llvm
# or using the read-only Git view, with git-svn
git clone http://llvm.org/git/llvm.git
cd llvm
git svn init https://llvm.org/svn/llvm-project/llvm/trunk --username=<username>
git config svn-remote.svn.fetch :refs/remotes/origin/master
git svn rebase -l  # -l avoids fetching ahead of the git mirror.

Workflow after (copy/paste):

A second option is to use svn via the GitHub svn native bridge::

  svn co https://github.com/llvm/llvm-projects/trunk/compiler-rt compiler-rt --username=...

This checks out only compiler-rt and provides commit access using "svn commit",
in the same way as it would do today.

Finally, you could use *git-svn* and one of the sub-project mirrors::

  # Clone from the single read-only Git repo
  git clone http://llvm.org/git/llvm.git
  cd llvm
  # Configure the SVN remote and initialize the svn metadata
  git svn init https://github.com/joker-eph/llvm-project/trunk/llvm --username=...
  git config svn-remote.svn.fetch :refs/remotes/origin/master
  git svn rebase -l


In this case the repository contains only a single sub-project, and commits can
be made using `git svn dcommit`, again **exactly as we do today**.

philip.pfaffe added a subscriber: philip.pfaffe.Sep 2 2016, 2:20 AM

kparzysz added a subscriber: kparzysz.Sep 2 2016, 9:46 AM

kparzysz added inline comments.

docs/Proposals/GitHubMove.rst
199	and preserve the history.

In D24167#532327, @jlebar wrote:

Wait a second. We're choosing between two proposals. The three of us here are among the experts.

Assuming that we're somehow experts on workflows above other contributors seems a bit presumptuous to me. Keep in mind the difference in the proposals isn't Git, so being a Git expert (which I'm certainly not) isn't really relevant. I'm not an expert on other people's workflows, so I would prefer if we approach this from a perspective of providing information and allowing people to form their own opinions.

We absolutely should be comparing the two proposals explicitly, to draw users' attentions to the differences that we think are important. Because we actually know something that others don't!

Really? No offense, but this comes off as incredibly condescending. You're asserting that you know better than everyone else. I'm not saying we shouldn't help people compare the proposals. I'm saying we shouldn't draw conclusions for them. Our community is filled with a lot of really smart people and I strongly believe they are capable of forming their own informed decisions.

I don't want a one-sided fight, where only the monorepo or multirepo cadre gets to have its say. But if you believe that debate leads to better outcomes, then absolutely we should compare and advocate, which is just another way of explaining why, in our view, one proposal is better than the other.

Debate is great. The LLVM.org documentation on the proposal isn't the place for it. Debate it at the social, debate it in your office, debate it on the lists and on IRC, debate it on your livejournal (I think that's still a thing right?). I believe the proposals should be neutral. I believe the documentation on LLVM.org should be position agnostic geared toward informing people without trying to directly influence them.

Can we just have these as separate sections in one document? That's almost what we have here.

Four documents is a lot to ask people to read and understand. At least with one, they can skip around, etc... And it will also be easier to review and edit.

I disagree. There are a number of advantages to multiple focused documents, and one of them is that they will be smaller and easier to review. Also shorter documents are generally easier to digest because you can pick one up and read it in a few minutes, and they provide break points.

I think we could agree on a section that lays out the monorepo and multirepo proposals in a dry way, explaining what each looks like. Then we could have the "why monorepo" and "why multirepo" sections. Finally we could have the workflow comparison.

I would have no objection to this assuming that the two "why" sections and workflow comparisons are also neutrally written and avoid direct references to the other proposals.

In D24167#532395, @jlebar wrote:

Could we just add a similar section advocating for the multirepo and be done here, move on with our lives?

No. I don't believe advocacy documents of this nature have any place on llvm.org. I know it isn't your intention, but your document as written is little better than propaganda. It is filled with half-truths, assertions that aren't backed by fact, and slanted language designed to influence the reader to your opinion. To provide some examples:

SVN users would be more disrupted in the multirepo approach.

I disagree. I think that the mono-repo is more disruptive. But, that is my opinion, not backed by fact. You present this as a fact, and it is clearly a subjective opinion.

Because all of our subprojects would live in a single repository, we could move code between them easily and without losing history.

This can actually be done with a multi-repo approach too using sub-tree merging. While it may not be as easy, it doesn't lose history.

With the multirepo, moving clang-tools-extra into clang would be much more complicated, and might end up losing history.

Same as above. You don't need to lose history to do this. Yes, it is more complicated, but it is a one-time cost.

Look, at the end of the day I care *way* more about how our community arrives at a decision rather than the specifics of the decision. I believe that our community is filled with intelligent people that are capable of drawing their own opinions, and that those opinions are equally or more valuable than my own. As a result I believe that the information we should be providing to the community for consideration of these proposals should be the best possible effort at being non-biased and impartial.

You may disagree with some or all of that, which is certainly your prerogative. What I'm trying to push for here is a framework for how to construct documents for these proposals that strive to be non-biased. Is it possible to write non-biased documents that refer to each other? Sure. Is it possible to write biased documents that *don't* directly refer to each other? Sure. My point is that it is *easier* to write non-biased documents if they don't directly relate themselves against their opposition.

-Chris

probinson added inline comments.Sep 2 2016, 11:03 AM

docs/Proposals/GitHubMove.rst
521	Very nicely succinct. One typo: scripts -> script

Chris, I am really happy to work with you to make sure that you're happy that the parts of this document that are explicitly not advocating a position come across as dry and factual. I agree that parts of this document don't come across that way, and, where we're not explicitly advocating for a position, they should be changed. This was actually my explicit feedback to Mehdi when I reviewed his document, and also when he sent me this review yesterday.

But I admit to being flummoxed by what I read as rage here against the idea that we would allow advocates to explain their reasoned positions in a document posted to llvm.org. (Do you burn your California voter's guide for this reason?) Indeed, I am even more confused because the rage seems to be only aimed at the parts of this document that compare the monorepo and multirepo, but not at the parts that compare git and svn while clearly advocating for one side.

I also am really confused by the idea that somehow explaining why I think X is true prevents someone else from making up their own mind. It seems to me that *not* explaining my arguments would actually prevent people from making an informed decision.

But you're clearly very upset by this, and I don't think it is worth arguing further. Frankly I'm already feeling fight-or-flight here, and if the rhetoric escalates further, I'm afraid I won't feel welcome in this discussion at all.

Would you be amenable to a compromise, Chris? How about we link to advocacy statements from this document? Would that be acceptable to you?

Justin,

I apologize if my frustration comes off as rage. I believe that healthy debate is valuable and should be welcomed, but I also believe that debate has a place, and that this isn't debate. While I don't burn my CA voter's guide, it at least is a bit more honest about the fact that it publishes paid opinions (which I generally don't read). If you would prefer to follow the format of the CA voter's guide I'd feel way more comfortable with this. The idea of providing a dedicated space for arguments and rebuttals that are clearly labeled as the opinions of a person or group of people is not objectionable to me. I object to the intermingling of opinion and fact that blurs the lines between the two.

Comparing the proposals isn't the problem. The problem is that this document draws conclusions from the comparisons. All of that is fixable, and I spoke with Mehdi before he posted this review and voiced my concerns and willingness to help improve the document.

The problem is also not you thinking something is true, the problem is the document doesn't say "@jlebar thinks the mono repo is better because ...", it says "the mono repo is better because ...". These documents will be consumed by many people who are not following the review or dev list threads on the topic, which means that separating the opinion and fact will be very difficult for most of the audience.

I apologize for any comments I've made that have made you feel unwelcome in this conversation. It is not my intent. The discussions around moving to Git have caused a lot of passionate discourse which has already been way too unwelcoming to many members of our community. In the future I will strive to keep my responses less impassioned.

-Chris

In D24167#533142, @beanz wrote:

In D24167#532327, @jlebar wrote:

Wait a second. We're choosing between two proposals. The three of us here are among the experts.

Assuming that we're somehow experts on workflows above other contributors seems a bit presumptuous to me. Keep in mind the difference in the proposals isn't Git, so being a Git expert (which I'm certainly not) isn't really relevant. I'm not an expert on other people's workflows, so I would prefer if we approach this from a perspective of providing information and allowing people to form their own opinions.

I think this is more about addressing people's needs: we can't know everyone else's workflow, but we (well, not me) can certainly provide the answers to "how do I do this in git". In that sense being a git expert definitely helps.

We absolutely should be comparing the two proposals explicitly, to draw users' attentions to the differences that we think are important. Because we actually know something that others don't!

Really? No offense, but this comes off as incredibly condescending. You're asserting that you know better than everyone else. I'm not saying we shouldn't help people compare the proposals. I'm saying we shouldn't draw conclusions for them. Our community is filled with a lot of really smart people and I strongly believe they are capable of forming their own informed decisions.

I am not an expert in either git or svn, nor do I want to become one for the sole purpose of making an informed decision here. I cannot predict all possible consequences of either approach that could affect my work, and I do welcome a summary of the differences with their potential impact. Moreover, it may very well be that the differences in the proposed solutions will not be significant to me and my workflow. In that case, I would give preference to the solution that helps the rest of the "core" developers.

Thanks a lot for the apology, Chris.

If you would prefer to follow the format of the CA voter's guide I'd feel way more comfortable with this. The idea of providing a dedicated space for arguments and rebuttals that are clearly labeled as the opinions of a person or group of people is not objectionable to me. I object to the intermingling of opinion and fact that blurs the lines between the two.

I am 100% onboard with this. In fact, it's what I was trying to get at in the first place when I wrote:

I think we could agree on a section that lays out the monorepo and multirepo proposals in a dry way, explaining what each looks like. Then we could have the "why monorepo" and "why multirepo" sections. Finally we could have the workflow comparison.

I also don't like the half-advocacy position taken in parts of this document, particularly "one or many repositories?" When I revised Mehdi's proposal, I structured it more like you suggested, being explicit when we're advocating for one side or the other.

If you're amenable, I'm actually kind of interested in doing a postmortem (offline) to figure out how I could have communicated that better. Would have saved us both some grief, I think. In any case I'm really sorry for getting you worked up over something we agree on. And I greatly appreciate your apology. I too will try to stay a bit more calm. :)

Hi folks,

Deviating a bit from the conflict, I'd like to show how all of us agree on the same things basically...

In D24167#533142, @beanz wrote:

I believe the proposals should be neutral. I believe the documentation on LLVM.org should be position agnostic geared toward informing people without trying to directly influence them.

I agree with you, and I think that's what Justin was going for. I also see Mehdi's text as an attempt to do that, but it's hard to do it (I certainly can't) while we have an opinion (and we all have some).

So, the way I see it, we're arguing over specifics. There's no need.

I can see two ways this will go down without exploding again:

1. We "sanitise" the text from our biases the best we can and expect people to understand it. I'm not saying the intention was to be biased, just that humans are biased. By having more points of view (this review is a good start), we "clean it up" a bit. I.e., we just continue doing this review as is, until everyone is happy "enough". (The quotes are important, as those words can have multiple meanings, and I mean the best of them.) This will have the cost of re-reviewing what was agreed before (the sub-mod proposal), but we can make it simpler by having a clear workflow table (as I proposed earlier) and a very short list of pros and cons (as was proposed on the list).

2. We do the original plan, to have two completely separate sections (in the same document). The sub-mod section copied (+ some workflow) from the other text, the mono-repo as a filter from this text. It'd still be good to have a table at the end, though. The additional cost is redoing a lot of what Mehdi has done, but the benefit is that people that spent time discussing the other proposal won't have to do the whole thing again here.

I don't mind either way, but would be good to pick one and stick with it.

I would have no objection to this assuming that the two "why" sections and workflow comparisons are also neutrally written and avoid direct references to the other proposals.

I agree this has to be avoided, no matter how we organise the document. Justin's point that "this is a comparison" is very fair, but IIGIR, Chris' point was about "biased comparison" (e.g. "A is more complicated than B").

Workflows are all different, and people find different things complicated. Let's just lay out the independent facts about each one, even if the text becomes a bit brittle, and let people do their own comparisons.

I don't believe advocacy documents of this nature have any place on llvm.org.

So, removing all feelings around the phrase and the moment, I think there's a lot of meaning in this statement.

The OSS LLVM community has prided itself on not pushing an agenda towards anyone. The individual contributions (company or personal) do have agendas, and they're synced upstream, and the discussions are massively technical in nature. There is obviously a power play, as in any other community (OSS or not), but here, we have always valued technical arguments over everything else.

The GitHub problem is a technical one, but also a personal one. Technical problems, with technical solutions, but very rooted on how companies and individuals develop, validate and deliver their products. It's very easy to let our biases of "this is much easier than that" unintentionally blind us, and we have to be careful.

The first rule of thumb is to not assume, for any reason, that we're experts. Most people in our community could have been doing the job of investigating this issue, but they're not. That's not to say that whatever opinion *we* reach will be the best, just that whatever we present has to be clear, concise and technically accurate, so *they* can take their own decisions.

We're not here to decide what's best, but to digest and present the information in the simplest form possible, so everyone can decide what's best for themselves.

SVN users would be more disrupted in the multirepo approach.

I disagree. I think that the mono-repo is more disruptive. But, that is my opinion, not backed by fact. You present this as a fact, and it is clearly a subjective opinion.

This is a very good example why we can't use biased comparison like "A would be more disruptive than B". This is a personal opinion, not an invariant fact to all members of our community. We should not have any of that on this document, or people will not take it seriously and we won't have the effect we want.

I'm not pointing fingers or trying to start a fight, I'm just making clear that this discussion cannot happen in the text.

We need simplicity and clarity. Despite our best efforts, this text is not there yet, and it's no one's fault. I think Mehdi, Justin and Chris have done a remarkable job at collating and discussing all the facts, and I think a lot of people in the community *really* appreciate it. I certainly do.

But now it's time to stop discussing what the *best* workflow looks like, and start clearing up the proposal.

In my personal view, Mehdi's description and workflow examples are great. We just need to validate the sub-module part with the previous proposal, and work on the formatting.

cheers,
-renato

docs/Proposals/GitHubMove.rst
618	This is how it would work on a multi-repo, but this section is talking about the mono-repo. IIGIR, on a mono-repo, developers of a single component will have to commit back on the mono-repo, which will then be propagated to the individual (read-only) repos, no?

Address most minor comments

mehdi_amini added inline comments.Sep 3 2016, 2:09 PM

docs/Proposals/GitHubMove.rst
425	I added a comment mentioning that the list is optional. Let me know if I misunderstood something about --recursive above.
618	"This is how it would work on a multi-repo": I'm not totally sure what "This" is referring to. Assuming it is about my previous paste, then no, it describes the monorepo. "IIGIR, on a mono-repo, developers of a single component will have to commit back on the mono-repo, which will then be propagated to the individual (read-only) repos, no?": Right, and this is the same thing as what a git-svn developer does today: (1) git clone the individual repo; (2) configure git svn to point to the SVN repo (the one from the monorepo in the future); (3) commit through SVN; (4) the commits are propagated to the individual repo.

Lots of inline comments.

docs/Proposals/GitHubMove.rst
64	The language here is also misleading. Maybe change to something like: Many new coders nowadays start with Git, and a lot of people have never used SVN, CVS, or anything else.
69	I would remove the "(most?)" bit here because it doesn't really add any value. We have no data to support an assertion of "most", and it could be misleading to suggest it.
80	Can we also add this as a point: Maintain remote forks and branches on Git hosting services and easily integrate back to the main repository. In particular for people that maintain out-of-tree code or forks, the ability to seamlessly merge between repositories is a big win for Git.
138	Can we also add something about the more traditional Git approaches to this? Maybe something like: Additionally, there are simple Git commands that can also be used to determine the order of commits. For example, the question "is a bug fixed in <hash-a> also fixed in a compiler built at <hash-b>?" can be answered with the command `git rev-list <hash-a>..<hash-b> --count`. If this prints a number greater than 0, the fix is contained in <hash-b>. Additionally, if we were to use Git tags similarly to how we use SVN tags today, you would be able to identify which releases contained a fix by running `git describe --contains <hash>`.
193	This is a completely subjective statement, and should not be present.
203	With Git, history could be preserved even across repositories. Git subtree merges support this, and while it isn't as simple, it is a one-time cost.
206	As in my other comment, losing history is not an issue.
209	Actually, there were also concerns about the increased burden for contributors not just downstream users. In general I think this entire section is designed to point out supporting arguments for the mono-repo with no recognition of the merits of the multi-repo proposal.
231	This is very slanted wording. From a user perspective the multi-repo solution to this problem is not much more complicated than the mono-repo solution.
422	Nit: "enters the dance" implies complexity.
453	Not sure I agree this is easy for svn users. To my knowledge llvm.org doesn't even document how to checkout the SVN repositories in a way to make this possible.
519	Additionally users of the umbrella repo can use `git submodule foreach` to have single command workflows that nearly match the mono-repo proposal.
571	This is inaccurate. Even though my rough prototype of the git umbrella repo doesn't have each submodule update being a single commit that was the stated plan for how the umbrella would be updated. That means each umbrella repo commit would represent a single commit to a single subproject, so your bisection granularity is comparable.
582	Better to say "both proposals will allow you to continue to use SVN". The wording here makes it seem like only the mono-repo has GitHub's SVN support, even though that is later contradicted.
621	This is a subjective statement that I don't believe is factually accurate. We could easily teach the build system to checkout subprojects so that building a full toolchain could be `git clone ... && configure && build` regardless of the repository layout.
628	I'm confused by this. The sub-project mirrors are read-only, so the workflow is either checkout the full mono-repo or use Git-SVN. That doesn't sound unchanged.
636	It is worth noting (as I did when I sent this out) that this was a very rough prototype, and it doesn't solve all the problems that we would expect a more permanent solution to solve. For example, the submodule update is periodic, not on a push-based notification, and the scripting around it doesn't do a single commit per update, which was the intended solution.
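The ancestry check suggested in the comment on line 138 above can be tried in a throwaway repository. This is only a sketch under stated assumptions: the repository, commit messages, and the variables `fix` and `build` (standing in for `<hash-a>` and `<hash-b>`) are hypothetical, and a linear history is assumed. `git merge-base --is-ancestor` is shown alongside the count as a stricter check that is not part of the original suggestion; on diverged histories the count alone can be nonzero even when the fix is absent.

```shell
set -e
# Throwaway repository; every name here is hypothetical.
repo=$(mktemp -d)
git init -q "$repo"
# Helper that runs git inside the throwaway repo with a fixed identity.
g() { git -C "$repo" -c user.name=Example -c user.email=example@example.invalid "$@"; }

g commit -q --allow-empty -m "base"
g commit -q --allow-empty -m "bug fix"
fix=$(g rev-parse HEAD)      # plays the role of <hash-a>
g commit -q --allow-empty -m "later work"
build=$(g rev-parse HEAD)    # plays the role of <hash-b>

# Count the commits reachable from the build but not from the fix.
ahead=$(g rev-list --count "$fix..$build")
echo "commits after the fix: $ahead"

# Stricter check: succeeds only if the fix is an ancestor of the build.
if g merge-base --is-ancestor "$fix" "$build"; then
  contains_fix=yes
else
  contains_fix=no
fi
echo "build contains fix: $contains_fix"
```

Swapping the two hashes in the `merge-base` call makes the check fail, which is the case the count alone cannot distinguish from a diverged branch.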

emaste added a subscriber: emaste.Sep 9 2016, 12:49 PM

emaste added inline comments.Sep 9 2016, 12:58 PM

docs/Proposals/GitHubMove.rst
11	A little point, but I think we should say "why we're proposing such a move" or similar. "why we need such a move" in the first paragraph of the document implies the decision is already made, and might discourage those against change from even responding.

Replace "why we need to move" with "why we are proposing to move".

Address beanz' comments.

mehdi_amini added inline comments.Sep 9 2016, 4:07 PM

docs/Proposals/GitHubMove.rst
139	I'm not against mentioning this somewhere, but the "traditional" Git approach of hashes does not address at all the concerns mentioned right above.
194	Rewrote, but I suspect we'll need some other rounds. Suggestions welcome.
204	Can you provide an example where the history of a single file's contents can be preserved without pulling in the entire source repository? I'd like to try it and see how git log/git blame deals with that.
210	You're welcome to suggest merits of the multi-repo proposal to balance.
232	Please provide a replacement for this sentence.
454	Do you have an alternative to suggest?
572	If you have a way to guarantee it, I'm willing to hear about it. Right now, I don't believe it is possible without implementing it on the git hosting itself.
583	I did a minor rewording (we're on a different support level here between the two solutions, which need to be conveyed somehow).
622	Removed the paragraph
629	We're talking about libcxx in the monorepo proposal? Assuming yes, can you give an example of workflow that would be changed compared to today?
637	(Already addressed above)

I'll work up some suggested alternate phrases this weekend or early next week, but I have some responses inline.

docs/Proposals/GitHubMove.rst
205	Google is our friend -> http://stackoverflow.com/questions/1365541/how-to-move-files-from-one-git-repo-to-another-not-a-clone-preserving-history
211	I don't think that our proposals should be constructed as convoluted arguments between contributing authors. Adding pro multi-repo statements will only make this more difficult to grok. I actually think there is very little in this section that shouldn't be part of an "arguments/rebuttals" section.
573	You can absolutely guarantee the same granularity. You can't guarantee the same ordering, but generally speaking that is significantly less important than granularity. To get the same granularity you allow the script that updates submodules to produce more than one commit to the submodule repo at a time. If there are multiple you can sort them by committer date. While committer date isn't a great thing to use since our proposals both depend on maintaining a linear history it should be good enough for the common cases because committer date gets reset on rebase.
630	Ah. I think the confusing phrasing is that monorepo is being used in two contexts. Maybe rephrase this to something like: With this variant of the monorepo proposal developers who only work on excluded sub-projects will continue to use the single-project repositories. The workflow is still changed from today, because today we're using SVN.
638	I'd like to see that mentioned here as well. This document is quite large and people may jump around reading it. It is worth having the note directly next to the link.
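The committer-date ordering described in the comment on line 573 above can be illustrated with two throwaway repositories. This is a rough sketch, not the update script discussed in the review: the repository names, timestamps, and the `make_commit` helper are all made up, and the committer dates are set by hand so the outcome does not depend on when the commands run.

```shell
set -e
work=$(mktemp -d)

# Hypothetical helper: one empty commit with a fixed committer timestamp.
# usage: make_commit <repo-dir> <unix-timestamp> <message>
make_commit() {
  GIT_AUTHOR_DATE="$2 +0000" GIT_COMMITTER_DATE="$2 +0000" \
    git -C "$1" -c user.name=Example -c user.email=example@example.invalid \
    commit -q --allow-empty -m "$3"
}

# Two stand-in sub-project repositories.
git init -q "$work/llvm"
git init -q "$work/clang"
make_commit "$work/llvm"  1473000100 "llvm change 1"
make_commit "$work/clang" 1473000200 "clang change 1"
make_commit "$work/llvm"  1473000300 "llvm change 2"

# Emit "<committer-timestamp> <project> <subject>" for every commit,
# then sort numerically to get one interleaved timeline.
ordered=$(
  for project in llvm clang; do
    git -C "$work/$project" log --format="%ct $project %s"
  done | sort -n
)
echo "$ordered"
```

The result is one line per sub-project commit, interleaved by committer timestamp. As the comment notes, this gives per-commit granularity but not a guaranteed ordering.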

mehdi_amini marked 4 inline comments as done.Sep 9 2016, 4:58 PM

mehdi_amini added inline comments.

docs/Proposals/GitHubMove.rst
205	I don't see `git subtree` at work on this link, just `filter-branch` + `git mv` + merge. That flow tracks the history of a file, not its content AFAIK (i.e. if a function was moved from another file into the current one, the history of when/why this function was added/modified won't be included). Also, what would be the effect of moving a file from a repo to another, and later back to the original repo?
211	There is nothing convoluted here. Adding fact-based pro multi-repo statement will make it easier to understand. I disagree, I think most of this section should stay here. So we'll have to go in the specifics, piece by piece.
573	"but generally speaking that is significantly less important than granularity." No, sorry, I can't agree; this is critical: correctness goes before usability. It seems to me that you're willing to trade correctness to bring a guarantee of usability here. I'm willing to believe that "in practice" the granularity should be small enough; it just has to be worded carefully. Right now it is a parenthesis at the end: `(it is possible that one commit in the umbrella repository includes multiple commits in the sub-projects)`. We can reword this as `(it is possible that one commit in the umbrella repository includes multiple commits in the sub-projects, though it should be occasional in practice)`. (One may bikeshed on what exactly "occasional" is though, but we don't have any data to bikeshed efficiently anyway.)
630	Sorry, the sentence is really about the monorepo: leaving libcxx within the monorepo should not be a regression compared to today.

mehdi_amini marked 22 inline comments as done.Sep 26 2016, 2:29 PM

mehdi_amini added inline comments.

docs/Proposals/GitHubMove.rst
213	Can you clarify what you're referring to exactly and/or suggest some editing?
233	(Tried to make it more explicit that complexity is handled by the infrastructure)

Address the remaining outstanding comments (but 1 or 2 maybe)

Minor fix.

mehdi_amini updated this revision to Diff 73090.Sep 30 2016, 10:25 AM

Any other comments? Otherwise we should move forward.

Lots of comments inline, and one meta-comment.

Looking at the details of the mono-repo proposal "use the GitHub SVN" interface is the answer to a lot of workflows. How would the Git-SVN workflows be impacted by moving to a PR-based workflow? I assume it works fine, you just create SVN branches and commit to them then make the PR via the web UI. Is that correct?

I know we're not actually considering a PR-based workflow, but it is something to consider.

docs/Proposals/GitHubMove.rst
223	What about the concerns about active community members having this burden?
250	Remove "(with some granularity)". The multi-repo proposal can have the same 1:1 mapping of commits in per-project repos to umbrella commits that the mono-repo would have. When the update job runs with a list of more than one commit we can sort them by committer timestamp (which is updated after rebase). It will provide a roughly linear timeline for the commits to be sorted across the repositories. It won't be perfect, but it should be good enough for sorting commits in close proximity because the pushed commits will either be rebased (which updates the committer timestamp) or they will be merge commits which will have a committer timestamp generated when the merge commit was generated.
254	I would say 'continuously' rather than periodically here. You describe in more detail below how the notifications would be configured and 'periodically' isn't a full picture.
269	s/interacts/would interact/
335	You've lost me here. Checking out all the projects in SVN today involves multiple svn co commands. Unless there is some magic in SVN I'm unaware of. If there is such magic we should document it somewhere on LLVM.org (maybe on the getting started page?) and link to it here.
351	Can you please add actual size numbers for each project and the mono-repo? Just saying '2x' isn't super meaningful without knowing the size of 1x.
388	the emphasis on 'exactly as we do today' is unnecessary.
446	Alternatively since our intention is to enforce a linear history in the repositories doing a checkout by timestamp using the format below should also work in the majority of cases. git checkout 'master@{...}'
462	Again, I don't follow how this is easy. There is no documentation on LLVM.org explaining how to do this and my limited knowledge of SVN leaves me with no idea how to do it.
467	I would phrase as "It would be possible...", because it most certainly is possible.
472	Please remove "and makes this use case ...", it is a value judgement.
585	If we go with the multi-repo approach we can ensure that each umbrella repo commit will be only one submodule update. This is relatively straight forward tooling to add. The only situation where we could potentially allow multiple updates in a single umbrella commit would be if we wanted to do cross-repository correlating of revlocked changes.
590	The granularity is not finer.
602	Better to say both proposals allow you to continue using SVN the same way, but that each solution will have minor impacts. In the monorepo there will be a one-time change in revision numbers, and in the multi-repo each project will have its own revision numbers out of sync from each other.
608	s/any of the proposal/both of the proposals/
612	Reword from the second sentence on. You're making a value assessment. A better phrasing might be: If your fork touches multiple LLVM projects, migrating your fork into the mono repo would enable you to make commits that touch multiple projects at the same time the same way LLVM contributors would be able to do so.
623	I would phrase the downside as "rewriting the fork's history and changing its commit hashes", because that is what happens.
631	This is a little unclear to me. Do you mean applying the patches via "git apply" from a patch file? Might be worth clarification about how that would work.
642	This makes it sound like the git mirrors are read-write. Might be worth adding a "via Git-SVN" comment to clarify.

beanz added inline comments.Sep 30 2016, 2:30 PM

docs/Proposals/GitHubMove.rst
208	How does the mono-repo do this? It might make it easier, but since it is likely that even with a mono-repo most people won't build all projects I don't think it actually encourages updates across all sub-projects.
215	You still haven't addressed the feedback here. Saying the multi-repo would lose history is still inaccurate. For starters, you're not actually deleting the history from the repository you're moving code from. Also with a multi-repo you can easily preserve the file history by using git filter-branch. Using filter-branch will not follow history across renames that are outside the filter, but will follow them within the filter. For example if you were to use filter-branch on lib/Support to break it out into its own repository, filter-branch would preserve history of files under lib/Support that are renamed as long as they remain under lib/Support. It would not preserve the history of a file being renamed and moved under lib/Support. Even with that the history before that point is traceable because the history would still exist in the old repository, so you are not losing history, you just aren't moving it with the file.

Address review.

docs/Proposals/GitHubMove.rst
208	I was thinking about the fact that if I change the API `createTargetMachineFromTriple()`, and `git grep` to find the uses, then all the uses in sub-projects will show up.
215	Fair enough: replaced "losing history" with "the history of the refactored code won't be available from the new place".
223	Can you clarify what you're referring to exactly? (No regression compared to now I believe)
250	It seems to me that at the beginning the idea was that the submodules would be updated every few minutes, so that we'd be able to have rev-locked commits pushed to multiple projects at the same time and have them appear as a single umbrella update (with somehow a heuristic like "update the submodules when there hasn't been a push for 2 min"). Apparently your idea is rather that we should update it with single commits, but what's the story for rev-locked? How would the tooling not have a race condition? Example:
	1. I commit to LLVM
	2. I commit to Clang
	3. the script runs, pulls LLVM, no change
	4. I push to LLVM
	5. I push to Clang
	6. the script pulls Clang, sees my commit
	7. the script is done with pulling and updates the submodule with the clang change, before the LLVM change, even though the commit date would be reversed.
	I don't see a principled solution to implement the umbrella without server-side (i.e. native git hook) support. Sure you can craft it, and it'll work fine most of the time, but that does not make it bulletproof.
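The sequence above can be replayed as a small simulation. This is only a sketch under stated assumptions: the `run_poll` helper, the in-memory `remote` dictionary, and the `L1`/`C1` commit names are hypothetical stand-ins for the polling script and the GitHub repositories, not part of either proposal's tooling.

```python
def run_poll(remote, poll_order, pushes_after):
    """Poll each repository once, in order, and return the commits observed.

    remote       -- dict: repo name -> list of commit ids currently pushed
    poll_order   -- order in which the (hypothetical) script pulls the repos
    pushes_after -- dict: repo name -> list of (repo, commit) pushes that land
                    right after the script has finished pulling that repo
    """
    seen = []
    for repo in poll_order:
        # The script pulls this repository and records what it finds.
        seen.extend((repo, commit) for commit in list(remote[repo]))
        # Meanwhile the developer pushes; the script has already moved on.
        for pushed_repo, commit in pushes_after.get(repo, []):
            remote[pushed_repo].append(commit)
    return seen


remote = {"llvm": [], "clang": []}
# Steps 4 and 5 happen after the script pulled LLVM but before it pulls Clang.
pushes_after = {"llvm": [("llvm", "L1"), ("clang", "C1")]}
observed = run_poll(remote, ["llvm", "clang"], pushes_after)
print(observed)  # only the Clang commit is visible to this run
```

In this run the umbrella would record the Clang change first, and the LLVM change would only be picked up by a later run.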
335	I was referring to:
		svn co http://llvm.org/svn/llvm-project/ --depth=immediates
		cd llvm-project/
		svn up llvm/trunk clang/trunk libcxx/trunk
	You can then have a build with only LLVM configured like:
		mkdir ../build-llvm && cd ../build-llvm
		cmake ../llvm-project/llvm/trunk
	And a build dir with llvm+clang:
		mkdir ../build-clang && cd ../build-clang
		cmake ../llvm-project/llvm/trunk -DLLVM_EXTERNAL_CLANG_DIR=../llvm-project/clang/trunk/
	So that a single `svn up $projects` in the source directory updates all the sources and you can still build a subset of the projects from these sources. This is also how I'd synchronize if I was integrating downstream from SVN.
446	This applies to both proposals right? Where do you want me to add this?
462	(Copy/pasted commands above)
462	Copy/pasted above (I'm not sure I really want to document it on llvm.org now).
472	I don't believe so, but if you insist...
571	(I'm waiting for the story to support this above)
585	(I'm waiting for the story to support this above)
590	(I'm waiting for the story to support this above)
602	"The same way" implies "a single SVN revision number" to me. One could even say "a single SVN checkout" (cf. the commands I copy/pasted above). I don't see how it'd work with the multi-repo. How would someone downstream integrating from SVN be able to correlate revisions across repositories?
623	The paragraph starts with " Using a script that rewrites history" and end with "changes the fork's commit hashes", it seems to me that this makes explicit that the downside of rewriting history is that the hashes change. (I'm not sure how "rewriting history" is a downside by itself otherwise)

beanz added inline comments.Sep 30 2016, 4:57 PM

docs/Proposals/GitHubMove.rst
208	That is 'making easier' not 'encouraging'. Personally I fall to 'grep' way before I fall to 'git grep' for things like this, and I don't think the monorepo has any enforcement of this.
215	In your example of moving clang-tools-extra there would be no need for loss of history at all. There is no need for filter-branch. You can literally reformat clang-tools-extra to be under tools/extra/ and merge the whole tree into the clang master branch. The only point where you would lose any history at all is if you were trimming one part of a repository into another repository, and even in that situation you can minimize the losses pretty well using filter-branch and index scripts. It is complicated but possible.
223	Ah. I misread. I see what you are saying. This is fine.
250	The automation will run. It will collect a list of commits that have been pushed to each repository since the last time the script ran. It will then sort them by committer timestamp order, and commit one at a time to the umbrella repo as submodule updates. We can setup the automation to run based on GitHub WebHooks, and periodically in case a WebHook gets dropped. There is no race condition that I see. If we need to support revlocked changes, (and I'm not convinced this is the case since they are by far a minority of commits) we can support them via annotations on the commit messages. We can teach the automation to look for markers in the commit message denoting that it is revlocked to other changes, and we can have it group revlocked changes together. There is no need for server-side hooks, and this solution would work as well as any mirroring system. I don't believe there is any need for this solution to be bulletproof, but I see no reason why it cannot be as robust as the single-project mirrors that the mono-repo proposal includes.
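A minimal sketch of the ordering step described in this comment, assuming the automation has already collected the pending commits from each repository. The `Commit` record, the `revlock` field (standing in for a marker parsed from the commit message), and `umbrella_updates` are hypothetical names for illustration; this is not the code from the llvm-submodules automation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Commit:
    repo: str                      # sub-project name, e.g. "llvm" or "clang"
    sha: str                       # commit id in that sub-project
    committer_ts: int              # committer timestamp, seconds since epoch
    revlock: Optional[str] = None  # hypothetical marker from the commit message


def umbrella_updates(pending):
    """Group pending sub-project commits into umbrella commits.

    Commits are sorted by committer timestamp and emitted one per umbrella
    commit, except that commits sharing a revlock marker are grouped into
    the umbrella commit of the first one seen.
    """
    ordered = sorted(pending, key=lambda c: (c.committer_ts, c.repo, c.sha))
    updates = []
    group_index = {}  # revlock marker -> position in `updates`
    for commit in ordered:
        if commit.revlock is not None and commit.revlock in group_index:
            updates[group_index[commit.revlock]].append(commit)
            continue
        if commit.revlock is not None:
            group_index[commit.revlock] = len(updates)
        updates.append([commit])
    return updates


pending = [
    Commit("clang", "c1", 200),
    Commit("llvm", "l1", 100),
    Commit("llvm", "l2", 300, revlock="api-change"),
    Commit("clang", "c2", 310, revlock="api-change"),
]
plan = [[c.sha for c in update] for update in umbrella_updates(pending)]
print(plan)
```

Sorting only orders the commits the script has already seen, so it does not by itself settle the question of pushes that land while a run is in progress.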
335	I can't imagine that is a common workflow. It certainly isn't the documented recommended workflow on llvm.org, so I'm not sure there is value in bringing it into the discussion.
351	Can you add per-project sizes?
446	I think it is worth noting under the multi-repo proposal something along the lines of: Because we will be maintaining a linear history you can perform a timestamp based checkout of each project repository with the following command: git checkout 'master@{...}' Additionally you can use the umbrella repository... If you want to also add the timestamp checkout to the mono-repo proposal, that makes sense too. I just think it is worth noting under the multi-repo proposal that timestamp based checkouts are expected to work due to the linear history requirement, which means you don't need the submodule repo.
462	Fine if you don't want to document it, but I certainly would not describe that as "easy". Especially because if you ever mix up and type "svn up" in the root it starts updating everything. I think this is an incredibly fragile workflow, which is probably why it is also incredibly uncommon.
571	See above.
585	Again, above.
602	Maybe rather than "the same way" "with similar workflows to today"?
623	Fine.

mehdi_amini marked 2 inline comments as done.Sep 30 2016, 5:36 PM

mehdi_amini added inline comments.

docs/Proposals/GitHubMove.rst
208	> That is 'making easier' not 'encouraging'.
	"All the source is there by default" + "making it easier" => why I wrote "encouraging".
	> Personally I fall to 'grep' way before I fall to 'git grep' for things like this, and I don't think the monorepo has any enforcement of this.
	Not sure why "enforcement" comes into play here?
215	So do you have anything concrete that could be added here, be practical (something we'd be willing to encourage in the future), be understandable by any dev, and not take > 20 lines to describe?
250	> The automation will run. It will collect a list of commits that have been pushed to each repository since the last time the script ran.
	Atomically?
	> There is no race condition that I see.
	Did you read my sequence 1-7 that describes an example of race?
	> but I see no reason why it cannot be as robust as the single-project mirrors that the mono-repo proposal includes.
	Define "robust". The single-project mirrors have a very well deterministic algorithm to construct, and reconstruct them at will, you don't have one for the multi-repo. That's not "robust" to me.
351	That'd make a long list, how should it be presented?
446	Are you sure that this command does what you think it does? If I read the doc correctly, it is looking at your reflog, not the history. The right one should be something like:
		git checkout `git rev-list -n 1 --before="2009-07-27 13:37" master`
	> I just think it is worth noting under the multi-repo proposal that timestamp based checkouts are expected to work due to the linear history requirement, which means you don't need the submodule repo.
	OK that wasn't clear to me the first time.
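The `rev-list` based checkout above could be wrapped in a small helper. This is a sketch only: `rev_list_args` and `checkout_at` are hypothetical names, and it assumes `git` is on the PATH and that the branch has a linear history, as both proposals intend.

```python
import subprocess


def rev_list_args(when, branch="master"):
    """Build the git command that finds the last commit on `branch` before `when`."""
    return ["git", "rev-list", "-n", "1", f"--before={when}", branch]


def checkout_at(repo_dir, when, branch="master"):
    """Check out the newest commit on `branch` committed before `when`.

    Returns the commit hash that was checked out.
    """
    result = subprocess.run(
        rev_list_args(when, branch),
        cwd=repo_dir, check=True, capture_output=True, text=True,
    )
    sha = result.stdout.strip()
    if not sha:
        raise ValueError(f"no commit on {branch} before {when}")
    subprocess.run(["git", "checkout", sha], cwd=repo_dir, check=True)
    return sha


print(rev_list_args("2009-07-27 13:37"))
```

With the multi-repo layout the helper would be called once per project checkout with the same timestamp; with a single repository one call is enough.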
602	I'm still missing what would be similar for someone integrating multiple projects from SVN today (assuming such downstream integrator exists) with the multi-repo?

Add mention of the ability to check out the individual repos according to a timestamp

Ping?

beanz added inline comments.Oct 3 2016, 10:05 AM

docs/Proposals/GitHubMove.rst
208	"All the source is there by default" This is what makes it easier. Your math is double counting it. I disagree with your wording here. I've told you I disagree. You can continue to disregard my feedback or you can fix it. The choice is yours.
215	You gave an example that is factually incorrect. I'm asking you to fix it. That is concrete. In my earlier comment I told you why your example was incorrect. You can remove the example, or come up with an alternative. That is your choice. What you cannot do, is use this factually inaccurate example.
250	I've updated my automation (https://github.com/llvm-beanz/llvm-submodules) to make one umbrella commit per commit to sub-project repository. This has a single commit granularity. That was the original point I was arguing. It works. It is done. Is it perfect? No. There are a number of situations where the order of the commits to the submodule can be impacted by the order and proximity of commits to the project repositories. That is irrelevant to the point I was making. I'm more than happy to debate with you about whether or not that matters, but that is a separate issue from what I was pointing out. Do we need to belabor this further, or will you update the document based on my feedback?
351	However you think it is best presented. A table would seem fitting. You could put it below and have a link down to it. I think that if you're bringing size into the discussion you need to provide sufficient data.
446	You are correct, you need to use `rev-list` to get the commit hash.
602	I strongly suspect that very few users are using a single SVN checkout that contains more than one sub-project. If you discount that workflow, the workflow for interfacing using the GitHub SVN bridge is very similar whether you are using one repo or many. Additionally, with the mono repo the combined SVN workflow is actually a lot better than with SVN today. It is way less fragile since you aren't doing sub-directory checkouts. This means you don't run the risk of inadvertently running `svn up` and pulling down way more than you wanted.

mehdi_amini added inline comments.Oct 3 2016, 11:43 AM

docs/Proposals/GitHubMove.rst
208	> "All the source is there by default" This is what makes it easier.
	Sorry, but I mentioned `git grep` earlier and you answered `That is 'making easier'`. All the source being present by default is more than making it easier.
	> I disagree with your wording here. I've told you I disagree.
	I strongly disagree with your disagreement here.
215	The current spelling (Friday, 3:51pm) is: "With the multirepo, moving clang-tools-extra into clang would be more complicated than a simple `git mv` command, and the history of the refactored code won't be available from the new place." I can change the example to: "Refactoring some functions from clang to make it a utility in one of the llvm/lib/Support files to share it across sub-projects wouldn't carry the history of the code in the llvm repo." That said, I asked you on 9/9 (over 3 weeks ago) "Can you provide an example where the history of a single file contents can be preserved without pulling all the source repository entirely? I'd like to try it and see how git log/git blame deals with that." You haven't been able to provide me with this. So you can claim whatever you want about "factual inaccuracy", you still failed to provide counter facts to support your claim.
250	You're moving goal posts. Your previous message said that there is no race, while now you're eluding it with "There are a number of situations where...". Also you're changing the definition of the multi-repo as I was foreseeing it. I think it is worse, and if we were to adopt the multi-repo proposal, I would be totally against this. Now, just to please you, because again I don't think it does any good to this proposal, I'll re-formulate, making clear that:
	- updates in the multi-repo are single-commit based.
	- commits can be in different orders.
	- it does not handle cross-project commits.
602	> If you discount that workflow, the workflow for interfacing using the GitHub SVN bridge is very similar whether you are using one repo or many.
	"Very similar" is subjective, to me it can't be similar as long as there is no longer a single revision number.
	> Additionally, with the mono repo the combined SVN workflow is actually a lot better than with SVN today. It is way less fragile since you aren't doing sub-directory checkouts. This means you don't run the risk of inadvertently running svn up and pulling down way more than you wanted.
	I don't understand what you mean here.

kparzysz added inline comments.Oct 3 2016, 12:41 PM

docs/Proposals/GitHubMove.rst
356	Even with sparse checkout? Am I going to see new files in projects that were not originally included in the sparse checkout?
367	A conflicting change would have to affect the same file. This is regardless of whether it's monorepo or multirepo. Am I missing something here? Rebasing is always a good practice, but it's not strictly required. If there are no conflicts, the system will just add the change on top of the current ToT, even if they have not been fetched to the local repo.

jlebar added inline comments.Oct 3 2016, 1:41 PM

docs/Proposals/GitHubMove.rst
356	What do you mean by "see"? In order to push a commit without `-f`, the commit's parent commit must be the current remote head. The commits in git are unaffected by sparse checkout. So, if you have a commit you want to push, you will need to rebase it atop current remote HEAD -- you'll have to do this rebase even if you're using sparse checkouts and all of the changes between your current base revision and current remote HEAD are to subprojects that you don't have checked out. If you don't like this, you can continue to use the single-subproject mirrors exactly as you currently do (with git-svn and everything), by changing the configs as explained elsewhere in this document. But I've been using a monorepo (http://github.com/llvm-project/llvm-project) for months now. I've pushed maybe 30 commits using my custom script (https://github.com/jlebar/llvm-repo-tools) and this necessity to rebase hasn't once been an annoyance for me.
367	> Rebasing is always a good practice, but it's not strictly required. If there are no conflicts, the system will just add the change on top of the current ToT, even if they have not been fetched to the local repo.
	That is what git-svn will do, yes. But that's not pure git's behavior.

mehdi_amini added inline comments.Oct 3 2016, 1:52 PM

docs/Proposals/GitHubMove.rst
356	> Even with sparse checkout? Am I going to see new files in projects that were not originally included in the sparse checkout?
	If you mean are you seeing them when typing `ls` in your terminal, then no you don't. I can add "unless you're using a sparse checkout" to make it more clear.
367	> A conflicting change would have to affect the same file. This is regardless of whether it's monorepo or multirepo. Am I missing something here?
	The point was that when you run `git pull --rebase`, you have new changes, and even without an explicit "diff conflict" your changes that you're about to push may use an API that has changed upstream. Note that today this is not addressed: SVN will blindly accept the push and break the build.
	> Rebasing is always a good practice, but it's not strictly required. If there are no conflicts, the system will just add the change on top of the current ToT, even if they have not been fetched to the local repo.
	As Justin mentions, this is not true with `git push` AFAIK. You have to `pull` (merge or rebase) before being able to push.

After this round of feedback I'm removing myself from this discussion.

docs/Proposals/GitHubMove.rst
208	You asked for feedback. If you want to disregard it that is your decision.
215	"Can you provide an example where the history of a single file contents can be preserved without pulling all the source repository entirely? I'd like to try it and see how git log/git blame deals with that." git-filter-branch can preserve the history of a single file. It does not follow renames, however if you know a file was renamed, you can use git-filter-branch's --tree-filter or --index-filter flags to perform more complicated slicing of the repository to preserve that history. If you're unfamiliar with the types of things you can do with filter branch, this article gives a good overview (https://devsector.wordpress.com/2014/10/05/advanced-git-branch-filtering/).
250	From the beginning I said:
	> It won't be perfect, but it should be good enough for sorting commits in close proximity...
	If you want to debate that statement we can do so, but I would prefer not to in this thread.
	> Also you're changing the definition of the multi-repo as I was foreseeing it. I think it is worse, and if we were to adopt the multi-repo proposal, I would be totally against this.
	You don't get to dictate how the proposal in opposition to your preferred approach is written. I think you've been pretty clear about being against the multi-repo proposal, so I don't see how your opinion factors in to the final document, which shouldn't be opinion based.
602	Saying the workflows are "similar" is not a subjective wording. Today someone who writes:
		svn co http://llvm.org/svn/llvm-project/llvm/trunk
	Under the mono-repo could write something like:
		svn co http://github.com/llvm/llvm-project/master/llvm
	Under the multi-repo could write something like:
		svn co http://github.com/llvm/llvm/master/
	The workflow of `svn co` -> `svn add` -> `svn commit` is similar in all cases.

beanz removed a subscriber: beanz.Oct 3 2016, 2:02 PM

kparzysz added inline comments.Oct 3 2016, 2:02 PM

docs/Proposals/GitHubMove.rst
356	> What do you mean by "see"?
	I'm referring to this (and the rest of this paragraph): "However when you fetch you'll likely pull in changes to sub-projects you don't care about." The intent wasn't clear---I wasn't aware of the requirement about parent commit (I use SVN for upstreaming changes). But that brings another question: what is the anticipated frequency of commits to the monorepo? My concern is that the "rebuild and retest" approach may take long enough to require another rebase...

remove reference to size.
change the model for multi-repos: single commit based, losing cross-project commits.
clarify that sparse-checkout don't see changes from other projects

mehdi_amini added inline comments.Oct 3 2016, 2:16 PM

docs/Proposals/GitHubMove.rst
356	When you commit to SVN, you add a "patch" on top of the existing codebase. Unless there is a conflict your patch will be committed. It does not mean it will build, since someone else may just have changed an API you're using in your patch. The new monorepo won't be different from SVN on this aspect: you have the same frequency of commits, and you can run `git pull && git push` which is roughly equivalent to `git svn dcommit` today. The thing is that between the `git pull` and the `git push`, you can also inspect what changed since your last build/check, and decide if you need to rebuild or not.

mehdi_amini added inline comments.Oct 3 2016, 2:18 PM

docs/Proposals/GitHubMove.rst
356	Maybe we should just remove all this paragraph, it is confusing...

Remove confusing paragraph.

jlebar added inline comments.Oct 3 2016, 2:21 PM

docs/Proposals/GitHubMove.rst
356	My concern is that the "rebuild and retest" approach may take long enough to require another rebase... This isn't a function of the monorepo. You choose when to rebuild/retest, and that's orthogonal to the repository structure. If "rebuild/retest only when there were changes to files I changed" is what you want to do, you can still do that. You can ask that question of git before pushing. Or you could ask "have any of the projects I care about changed?" Or you could ask a different question. And you could ask those questions of the monorepo, or the multirepo (although it might be a bit more work in the multirepo -- I say "might" so beanz doesn't jump on me). In this sense it's safer than SVN, which assumes that you only care about retesting if there were modifications to files you also changed.
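One way to ask "have any of the projects I care about changed?" before pushing, sketched under assumptions: the monorepo keeps one top-level directory per sub-project, and the list of changed paths comes from something like `git diff --name-only HEAD...origin/master` after a `git fetch`. The helper names `touched_projects` and `needs_retest` are hypothetical.

```python
from pathlib import PurePosixPath


def touched_projects(changed_paths):
    """Return the set of top-level sub-project directories touched by the paths."""
    return {PurePosixPath(path).parts[0] for path in changed_paths if path}


def needs_retest(changed_paths, my_projects):
    """True if any incoming change touches a sub-project the developer builds."""
    return bool(touched_projects(changed_paths) & set(my_projects))


incoming = ["llvm/lib/Support/APInt.cpp", "clang/lib/Sema/Sema.cpp"]
print(sorted(touched_projects(incoming)))
print(needs_retest(["libcxx/include/vector"], ["llvm", "clang"]))
```

The same check could be made stricter (only files I changed) or looser (any change at all); the repository layout does not force one policy.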

jlebar added inline comments.Oct 3 2016, 2:23 PM

docs/Proposals/GitHubMove.rst
356	I really think you want this paragraph, btw. This is a very common question -- it's been asked many times before. "I don't want the monorepo because it will mean I have to rebuild/retest a lot more than I do today." False, but we need to explain why.

mehdi_amini marked 45 inline comments as done.Oct 3 2016, 2:37 PM

mehdi_amini added inline comments.

docs/Proposals/GitHubMove.rst
250	> You don't get to dictate how ...
	Sorry, you're mischaracterizing my position and what I wrote, I don't appreciate this.
356	OK, I'll try to rephrase it then. The main point is that `git pull && git push` is not different from today's SVN.

mehdi_amini marked an inline comment as done.Oct 3 2016, 2:38 PM

Restore the paragraph.

mehdi_amini added a reviewer: dexonsmith.Oct 5 2016, 9:36 PM

dtzWill added a subscriber: dtzWill.Oct 6 2016, 7:43 AM

Address Duncan's inline comments.

New layout attempt

I believe what Duncan is asking for is basically the same thing I (and others) have also been asking for: An explicit "not dryly-factual" section where the experts explain their various positions.

I am saddened that we won't have these sections in the document -- I think not having them does a disservice to the readers who, like you, want this material. (We've had at least one other person comment in this thread, and we've had nobody say they don't want it.) But the disagreement seems to be based on Mehdi having fundamentally different conceptions of concourse than certainly I have, and I'm not prepared to litigate the philosophy of argument just to get a section added to this document.

On the other hand, in light of the amount of abuse it seems that whoever drives this process will receive no matter what they say, I'm certainly not willing to switch places with Mehdi, and I think he deserves a heaping ton of credit for frankly superhuman positivity here (in addition to credit for doing the work itself). So if he continues to oppose this idea, I respect his decision. In that case, I think we're just going to have to write it up separately, and hope that it gets the visibility it deserves. Maybe we'll be able to get a link in this document, although if that turns into a fight, like so much else here has, I hope I'll have the self-control to turn and run in the other direction.

Try another layout: add first a description of the multirepo, then one for the monorepo, then the interleaved comparison.

ioeric added a subscriber: ioeric.Oct 12 2016, 2:30 AM

ioeric added inline comments.Oct 12 2016, 4:19 AM

docs/Proposals/GitHubMove.rst
181	I am wondering where we are in the process now. Specifically, when would we get to this step (2.5)? Phabricator is seeing frequent connection errors from the svn server (might be due to the increased number of svn connections after the recent phabricator upgrade):
		svn: OPTIONS of 'http://llvm.org/svn-robots/llvm-project': could not connect to server (http://llvm.org)
	This blocks syncing svn commits from time to time. I'd expect GitHub to be more stable.

Address Duncan's feedback

(Remove duplicated section)

mehdi_amini updated this revision to Diff 74445.Oct 12 2016, 2:55 PM

Split the bullet about the overhead of the monorepo for users that care only about a single subproject.

Closed by commit rL284077: Moving to GitHub - Unified Proposal (authored by mehdi_amini). Oct 12 2016, 4:54 PM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                              Size
docs/Proposals/GitHubMove.rst     685 lines
docs/index.rst                    1 line

Diff 70090

docs/Proposals/GitHubMove.rst

This file was added.

				==============================
				Moving LLVM Projects to GitHub
				==============================

				Introduction
				============

				This is a proposal to move our current revision control system from our own
				hosted Subversion to GitHub. Below are the financial and technical arguments as
				to why we need such a move and how will people (and validation infrastructure)
				continue to work with a Git-based LLVM.
				emaste: A little point, but I think we should say "why we're proposing such a move" or similar. "why we need such a move" in the first paragraph of the document implies the decision is already made, and might discourage those against change from even responding.

				There will be a survey pointing at this document which we'll use to gague the
				community's reaction and, if we collectively decide to move, the time-frame. Be
				probinson: s/gague/gauge/
				sure to make your view count.

				This proposal is divided into the following parts:

				* Outline of the reasons to move to Git and GitHub
				* Description on the options
				* What some examples of workflow will look like (compared to currently)
				* The proposed migration plan

				What This Proposal is Not About
				=================================

				Changing the development policy: the development of LLVM will continue as it
				exists now.

				This proposal relates only to moving the hosting of our source-code repository
				from SVN hosted on our own servers to Git hosted on GitHub. We are not proposing
				other workflow changes here. That is, it should not be assumed that moving to
				GitHub implies using GitHub's issue tracking, or using the GitHub UI for
				pull-requests and/or code-review.

				Every existing contributors will get commit access on demand in the same
				condition as currently. Those who don't have an existing GitHub account will
				have to create one in order to continue having commit access.
				probinson: contributors -> contributor; 'in the same condition' -> 'under the same conditions'

				Why Git, and Why GitHub?
				========================

				Why Move At All?
				----------------

				One of the reasons for the move, and why this discussion started in the first
				place, is that we currently host our own Subversion server and Git mirror in a
				voluntary basis. The LLVM Foundation sponsors the server and provides limited
				support, but there is only so much it can do.

				Volunteers are not sysadmins themselves, but compiler engineers that happen
				to know a thing or two about hosting servers. We also don't have 24/7 support,
				and we sometimes wake up to see that continuous integration is broken because
				the SVN server is either down or unresponsive.

				On the other hand, there are multiple services out there (GitHub, GitLab,
				BitBucket among others) that offer that same service (24/7 stability, disk
				space, Git server, code browsing, forking facilities, etc) for free.

				Why Git?
				--------

				It seems that Git is new coders first choice nowadays . A lot of them have never
				Eugene.Zelenko: Please remove space before dot.
				used SVN, CVS, or anything else. Websites like GitHub have changed the landscape
				beanz: The language here is also misleading. Maybe change to something like: "Many new coders nowadays start with Git, and a lot of people have never used SVN, CVS, or anything else."
				of open source contributions, reducing the cost of first contribution and
				fostering collaboration.

				Git is also the version control many (most?) LLVM developers use. Despite the
				sources being stored in a SVN server, these developers are already using Git
				beanz (Done): I would remove the "(most?)" bit here because it doesn't really add any value. We have no data to support an assertion of "most", and it could be misleading to suggest it.
				through the Git-SVN integration.

				Git allows you to:

				* Commit, squash, merge, and fork locally without touching the remote server.
				* Maintain as many local branches as you like, letting you maintain multiple
				threads of development.
				* Collaborate on these branches (e.g. through your own fork of llvm on GitHub).
				* Inspect the repository history (blame, log, bisect) without Internet access.

				In addition, because Git seems to be replacing many OSS projects' version
				beanz (Done): Can we also add this as a point:
				> * Maintain remote forks and branches on Git hosting services and easily integrate back to the main repository.
				In particular for people that maintain out-of-tree code or forks, the ability to seamlessly merge between repositories is a big win for Git.
				control systems, there are many tools that are built over Git. Future tooling is
				much more likely to support Git first (if not only).

				Why GitHub?
				-----------

				GitHub, like GitLab and BitBucket, provides free code hosting for open source
				projects. Any of these could replace the code-hosting infrastructure that we
				have today.

				These services also have a dedicated team to monitor, migrate, improve and
				distribute the contents of the repositories depending on region and load.

				All things being equal, GitHub has one important advantage over GitLab and
				BitBucket: It offers read-write SVN access to the repository
				(https://github.com/blog/626-announcing-svn-support).
				This would enable people to continue working post-migration as though our code
				were still canonically in an SVN repository.

				In addition, there are already multiple LLVM mirrors on GitHub, indicating that
				part of our community has already settled there.

				On Managing Revision Numbers with Git
				-------------------------------------

				The current SVN repository hosts all the LLVM sub-projects alongside each other.
				A single revision number (e.g. r123456) thus identifies a consistent version of
				all LLVM sub-projects.

				Git does not use a sequential integer revision number but instead uses a hash to
				identify each commit. (Linus mentioned that the lack of such a revision number
				is "the only real design mistake" in Git [TorvaldRevNum]_.)

				The loss of a sequential integer revision number has been a sticking point in
				past discussions about Git:

				- "The 'branch' I most care about is mainline, and losing the ability to say
				'fixed in r1234' (with some sort of monotonically increasing number) would
				be a tragic loss." [LattnerRevNum]_
				- "I like those results sorted by time and the chronology should be obvious, but
				timestamps are incredibly cumbersome and make it difficult to verify that a
				given checkout matches a given set of results." [TrickRevNum]_
				- "There is still the major regression with unreadable version numbers.
				Given the amount of Bugzilla traffic with 'Fixed in...', that's a
				non-trivial issue." [JSonnRevNum]_
				- "Sequential IDs are important for LNT and llvmlab bisection tool." [MatthewsRevNum]_.

				However, Git can emulate this increasing revision number:
				`git rev-list --count <commit-hash>`. This identifier is unique only within a
				single branch, but this means the tuple `(num, branch-name)` uniquely identifies
				a commit.

				We can thus use this revision number to ensure that e.g. `clang -v` reports a
				user-friendly revision number (e.g. `master-12345` or `4.0-5321`). This should
				be enough to address the objections raised above with respect to this aspect of
				Git.
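
To make the counting concrete, here is a minimal Python sketch of what `git rev-list --count <commit-hash>` reports: the number of commits reachable from a given commit, including the commit itself. The toy history, the hashes, and the helper names `rev_count` and `friendly_revision` are hypothetical and only serve the illustration; this is not part of the proposal's tooling.

```python
from typing import Dict, List

# Hypothetical toy history: each commit hash maps to the hashes of its parents.
# With linear history every commit has at most one parent.
PARENTS: Dict[str, List[str]] = {
    "a1": [],
    "b2": ["a1"],
    "c3": ["b2"],
    "d4": ["c3"],
}


def rev_count(commit: str, parents: Dict[str, List[str]]) -> int:
    """Count the commits reachable from `commit`, including `commit` itself.

    This mirrors the number printed by `git rev-list --count <commit>`.
    """
    seen = set()
    stack = [commit]
    while stack:
        current = stack.pop()
        if current in seen:
            continue
        seen.add(current)
        stack.extend(parents[current])
    return len(seen)


def friendly_revision(branch: str, commit: str, parents: Dict[str, List[str]]) -> str:
    """Build a (branch-name, num) identifier in the `master-12345` style."""
    return f"{branch}-{rev_count(commit, parents)}"
```

On a linear branch the count grows by one with every commit, which is why the pair of branch name and count can stand in for an SVN-style revision number.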

				What About Branches and Merges?
				beanz (Done): Can we also add something about the more traditional Git approaches to this? Maybe something like:
				> Additionally, there are simple Git commands that can also be used to determine the order of commits. For example to answer the question is a bug fixed in <hash-a> fixed in a compiler built at <hash-b> can be answered with the command `git rev-list <hash-a>..<hash-b> --count`. If this prints a number greater than 0, the fix is contained in <hash-b>.
				> Additionally if we were to use Git tags similarly to how we use SVN tags today you would be able to identify which releases contained a fix by running `git describe --contains <hash>`.
				-------------------------------
				mehdi_amini (Author, Done): I'm not against mentioning this somewhere, but the "traditional" Git approach of hashes does not address at all the concerns mentioned right above.

				In contrast to SVN, Git makes branching easy. Git's commit history is represented
				as a DAG, a departure from SVN's linear history.

				However, we propose to enforce linear history in our canonical Git
				repository. (This is not uncommon amongst many large users of Git.)

				..
				TODO: Is this going to work when people push via the SVN bridge?
				jlebar (Done): What's the resolution here?

				rengolin (Done): This is a good question. If it works at all, two things can happen:
				1. SVN reports rev 123, I commit, get rev 124. Git rev-count gets 124.
				2. SVN reports rev 123, I commit, get rev 125 because someone committed at the same time and git sorted the other commit first.
				If 2 happens, I don't think it'll be a big deal, so, we should be fine, as long as the SVN bridge can work with the linearity enforcement of restricted branches.
				We'll do this with a combination of client-side and server-side hooks. GitHub
				mehdi_amini (Author, Done): Right now it is one or the other, I asked the github support to consider adding the option to have SVN commits bypass the status-check, they are considering it (no promises).
				offers a feature called `Status Checks`: a branch protected by `status checks`
				requires commits to be whitelisted before the push can happen. A supplied
				pre-push hook on the client side will run and check the history, before
				whitelisting the commit being pushed [statuschecks]_.
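
As an illustration of the kind of check such a pre-push hook could run, the following Python sketch tests whether the history behind a commit is linear, meaning it contains no merge commits. The parent map and the function name `is_linear` are hypothetical; an actual hook would read the history from Git and would only need to inspect the commits being pushed.

```python
from typing import Dict, List


def is_linear(tip: str, parents: Dict[str, List[str]]) -> bool:
    """Return True if no commit reachable from `tip` is a merge commit."""
    current = tip
    while True:
        commit_parents = parents[current]
        if len(commit_parents) > 1:
            # A commit with several parents is a merge: history is not linear.
            return False
        if not commit_parents:
            # Reached the root commit without meeting a merge.
            return True
        current = commit_parents[0]
```

A hook built on this idea would refuse the push when the check fails and otherwise proceed to whitelist the commit.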

				What About Commit Emails?
				-------------------------

				An extra bot will need to be set up to continue to send emails for every commit.
				We'll keep the exact same email format as we currently have (a change is possible
				rengolin (Done): No need, GitHub has email hooks: https://help.github.com/articles/managing-notifications-for-pushes-to-a-repository/
				mehdi_amini (Author, Done): This does not line up with "We'll keep the exact same email format".
				later, but beyond the scope of the current discussion), the only difference
				rengolin (Done): IMO, we don't need to keep the same format, but that's a good point. Though, it would be better to outline the two options in one quick phrase than leave the other implied.
				being a change of the URL from `http://llvm.org/viewvc/...` to
				`http://github.com/llvm/...`.


				One or Multiple Repositories?
				=============================

				There are two major proposals for how to structure our Git repository: The
				"multirepo" and the "monorepo".

				1. Multirepo - Moving each SVN sub-project into its own separate Git repository.
				2. Monorepo - Moving all the LLVM sub-projects into a single Git repository.

				The first proposal would mimic the existing official separate read-only Git
				repositories (e.g. http://llvm.org/git/compiler-rt.git), while the second one
				would mimic an export of the SVN repository (i.e. it would look similar to
				https://github.com/llvm-project/llvm-project, where each sub-project has its own
				top-level directory).

				With the Monorepo, the existing read-only repositories (i.e. for example
				jlebar (Done): Maybe we should call them "single-subproject mirrors" instead of "read-only repositories".
				ioeric (Not Done): I am wondering where we are in the process now. Specifically, when would we get to this step (2.5)? Phabricator is seeing frequent connection errors from svn server (might due to the increased number of svn connections after the recent phabricator upgrade): svn: OPTIONS of 'http://llvm.org/svn-robots/llvm-project': could not connect to server (http://llvm.org) This blocks syncing svn commits from time to time. I'd expect Github to be more stable.
				http://llvm.org/git/compiler-rt.git) with git-svn read-write access would be
				jlebar (Done): would continue to be maintained
				maintained
				jlebar (Not Done): I think we need to explain what this means, because this is critical for understanding the monorepo. Developers will continue to be able to use the existing single-subproject git repositories as they do today, with no changes to workflow beyond a one-time git-svn config change. Everything (git fetch, git svn dcommit, etc.) would continue to work identically to how it works today.
				jlebar (Done): Missing period at end of sentence.

				rengolin (Done): Full stop.
				There are other impacts that are less immediates and less technicals: the first
				jlebar (Done): This segue does not make sense in context.
				proposal of keeping the repository separate implies that the sub-projects are
				probinson (Done): immediates ... technicals -> immediate ... technical
				more independent from each other, while the second proposal
				encourages better code sharing and refactoring across projects, for example
				reusing a data structure initially in LLDB by moving it into libSupport. It
				would also be easier to decide to extract some pieces of libSupport and/or
				ADT to a new top-level independent library that can be reused in libcxxabi for
				instance. Finally, it also encourages updating all the sub-projects when
				changing API or refactoring code ("git grep" works across sub-projects for
				beanz (Done): This is a completely subjective statement, and should not be present.
				instance).
				mehdi_amini (Author, Done): Rewrote, but I suspect we'll need some other rounds. Suggestion welcome.

				As another example, some developers think that the division between e.g. clang
				and clang-tools-extra is not useful. With the monorepo, we can move code around
				as we wish. With the multirepo, moving clang-tools-extra into clang would be
				much more complicated, and might end up loosing history.
				jlebar (Done): losing
				kparzysz (Done): and preserve the history.

				rengolin (Done): Better to avoid "much more" for the reasons we have discussed before. Either say how it's worse, or don't compare.
				Some concerns have been raised that having a single repository would be a burden
				for downstream users that have interest in only a single repository, however
				this is addressed by keeping a read-only Git repo for each project just as we
				beanz (Done): With git history could be preserved even across repositories. Git subtree merges support this, and while it isn't as simple, it is a one-time cost.
				do today. Also the GitHub SVN bridge allows to contribute to a single
				mehdi_amini (Author, Done): Can you provide an example where the history of a single file contents can be preserved without pulling all the source repository entirely? I'd like to try it and see how git log/git blame deals with that.
				sub-project the same way it is possible today (see below before/after section
				beanz (Done): Google is our friend -> http://stackoverflow.com/questions/1365541/how-to-move-files-from-one-git-repo-to-another-not-a-clone-preserving-history
				mehdi_amini (Author, Done): I don't see `git subtree` at work on this link, just `filter-branch` + `git mv` + merge. That flow tracks the history of a file, not its content AFAIK (i.e. if a function was moved from another file into the current one, the history of when/why this function added/modified won't be included). Also, what would be the effect of moving a file from a repo to another, and later back to the original repo?
				for more details).
				beanz (Done): As in my other comment, losing history is not an issue.

				Finally, nobody will be forced to compile projects they don't want to build.
				beanz (Done): How does the mono-repo do this? It might make it easier, but since it is likely that even with a mono-repo most people won't build all projects I don't think it actually encourages updates across all sub-projects.
				mehdi_amini (Author, Done): I was thinking about the fact that if I change the API `createTargetMachineFromTriple()`, and `git grep` to find the uses, then all the uses in sub-projects will show up.
				beanz (Done): That is 'making easier' not 'encouraging'. Personally I fall to 'grep' way before I fall to 'git grep' for things like this, and I don't think the monorepo has any enforcement of this.
				mehdi_amini (Author, Done):
				> That is 'making easier' not 'encouraging'.
				"All the source is there by default" + "making it easier" => why I wrote "encouraging".
				> Personally I fall to 'grep' way before I fall to 'git grep' for things like this, and I don't think the monorepo has any enforcement of this.
				Not sure why "enforcement" comes into play here?
				beanz (Done):
				> "All the source is there by default"
				This is what makes it easier. Your math is double counting it. I disagree with your wording here. I've told you I disagree. You can continue to disregard my feedback or you can fix it. The choice is yours.
				mehdi_amini (Author, Done):
				>> "All the source is there by default"
				> This is what makes it easier.
				Sorry, but I mentioned earlier `git grep` and you answered `That is 'making easier'`. All the source presents by default is more than making it easier.
				> I disagree with your wording here. I've told you I disagree.
				I strongly disagree with your disagreement here.
				beanz (Done): You asked for feedback. If you want to disregard it that is your decision.
				The exact structure is TBD, but even if you use the monorepo directly, we'll
				beanz (Done): Actually, there were also concerns about the increased burden for contributors not just downstream users. In general I think this entire section is designed to point out supporting arguments for the mono-repo with no recognition of the merits of the multi-repo proposal.
				ensure that it's easy to set up your build to compile only a few particular
				mehdi_amini (Author, Done): You're welcome to suggest merits of the multi-repo proposal to balance.
				sub-projects.
				beanz (Done): I don't think that our proposals should be constructed as convoluted arguments between contributing authors. Adding pro multi-repo statements will only make this more difficult to grok. I actually think there is very little in this section that shouldn't be part of an "arguments/rebuttals" section.
				mehdi_amini (Author, Done):
				1) There is nothing convoluted here.
				2) Adding fact-based pro multi-repo statement will make it easier to understand.
				3) I disagree, I think most of this section should stay here. So we'll have to go in the specifics, piece by piece.

				rengolin (Done): I think this won't address the fears of people that don't know enough to not panic. This is why getting the technical parts correct and accurate is so important (and I confess I didn't do enough due diligence on my part of the text either).
				How Do We Handle A Single Revision Number Across Multiple Repositories?
				mehdi_amini (Author, Done): Can you clarify what you're referring to exactly and/or suggest some editing?
				-----------------------------------------------------------------------

				beanz (Done): You still haven't addressed the feedback here. Saying the multi-repo would lose history is still inaccurate. For starters, you're not actually deleting the history from the repository you're moving code from. Also with a multi-repo you can easily preserve the file history by using git filter-branch. Using filter-branch will not follow history across renames that are outside the filter, but will follow them within the filter. For example if you were to use filter branch on lib/Support to break it out into its own repository, filter branch would preserve history of files under lib/Support that are renamed as long as they remain under libSupport. It would not preserve the history of a file being renamed and moved under libSupport. Even with that the history before that point is traceable because the history would still exist in the old repository, so you are not losing history, you just aren't moving it with the file.
				mehdi_amini (Author, Done): Fair enough: replaced "losing history" with "the history of the refactored code won't be available from the new place".
				beanz (Done): In your example of moving clang-tools-extra there would be no need for loss of history at all. There is no need for filter-branch. You can literally reformat clang-tools-extra to be under tools/extra/ and merge the whole tree into the clang master branch. The only point where you would lose any history at all is if you were trimming one part of a repository into another repository, and even in that situation you can minimize the losses pretty well using filter-branch and index scripts. It is complicated but possible.
				mehdi_amini (Author, Done): So do you have anything concrete that could be added here, be practical (something we'd be willing to encourage in the future), be understandable by any dev, and not take > 20 lines to describe?
				beanz (Done): You gave an example that is factually incorrect. I'm asking you to fix it. That is concrete. In my earlier comment I told you why your example was incorrect. You can remove the example, or come up with an alternative. That is your choice. What you cannot do, is use this factually inaccurate example.
				mehdi_amini (Author, Done): The current spelling (Friday, 3:51pm) is: "With the multirepo, moving clang-tools-extra into clang would be more complicated than a simple `git mv` command, and the history of the refactored code won't be available from the new place." I can change the example to: "Refactoring some functions from clang to make it a utility in one of the llvm/lib/Support file to share it across sub-projects wouldn't carry the history of the code in the llvm repo." That said, I asked you on 9/9 (over 3 weeks ago) "Can you provide an example where the history of a single file contents can be preserved without pulling all the source repository entirely? I'd like to try it and see how git log/git blame deals with that." You haven't been able to provide me with this. So you can claim whatever you want about "factual inaccuracy", you still failed to provide counter facts to support your claim.
				beanz (Done):
				> "Can you provide an example where the history of a single file contents can be preserved without pulling all the source repository entirely? I'd like to try it and see how git log/git blame deals with that."
				git-filter-branch can preserve the history of a single file. It does not follow renames, however if you know a file was renamed, you can use git-filter-branch's --tree-filter or --index-filter flags to perform more complicated slicing of the repository to preserve that history. If you're unfamiliar with the types of things you can do with filter branch, this article gives a good overview (https://devsector.wordpress.com/2014/10/05/advanced-git-branch-filtering/).
				A key need is to be able to check out multiple projects (i.e. lldb+llvm or
				clang+llvm+libcxx for example) at a specific revision.

				Under the monorepo, this is a non-issue. That proposal maintains property of
				the existing SVN repository that the sub-projects move synchronously, and a
				probinson (Done): maintains the property
				single revision number (or commit hash) identifies the state of the development
				across all projects.

				beanz (Done): What about the concerns about active community members having this burden?
				mehdi_amini (Author, Done): Can you clarify what you're referring to exactly? (No regression compared to now I believe)
				beanz (Done): Ah. I misread. I see what you are saying. This is fine.
				Under the multirepo, things are more involved. We describe here the proposed
				solution.

				rengolin (Done): This is a nit, please don't take it personal... When I read "things are more involved", I had a negative feeling that "it's complicated". Down there, when explaining the "involved" way of checking out a singular repo (compiler-rt), instead, you say "there are a number of options", and I had a positive feeling of "choice". Even though they mean the same thing, it felt different. I don't particularly mind either way, but to avoid backlash, I'd try to be consistent and use the same (preferably neutral) phrases for all cases.
				mehdi_amini (Author, Not Done): They don't mean the same thing. Here it is more complicated. The other one exposes multiple options with various tradeoff.
				Fundamentally, separated Git repositories imply that a tuple of revisions
				(one entry per repository) is needed to describe the state across
				repositories/sub-projects.
				For example, a given version of clang would be
				<LLVM-12345, clang-5432, libcxx-123, etc.>.
				beanz (Done): This is very slanted wording. From a user perspective the multi-repo solution to this problem is not much more complicate than the mono-repo solution.

				mehdi_amini (Author, Done): Please provide a replacement for this sentence.
				To make this more convenient, a separate umbrella repository would be
				mehdi_amini (Author, Done): (Tried to make it more explicit that complexity is handled by the infrastructure)
				provided. This repository would be used for the sole purpose of understanding
				the sequence (with some granularity) in which commits were added across
				repositories and to provide a single revision number.

				This umbrella repository will be read-only and periodically updated
				to record the above tuple. The proposed form to record this is to use Git
				[submodules]_, possibly along with a set of scripts to help check out a
				specific revision of the LLVM distribution.

				A regular LLVM developer does not need to interact with the umbrella repository
				-- the individual repositories can be checked out independently -- but you would
				need to use the umbrella repository to bisect or to check out old revisions of
				llvm plus another sub-project at a consistent version.
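
The following Python sketch shows one way to picture the umbrella repository: every umbrella revision records the tuple of sub-project commits that was current at that point, so a single number can be mapped back to a consistent set of checkouts. It assumes one umbrella revision per sub-project commit, ordered by committer timestamp, which is one variant discussed in this review rather than a settled design. The `Push` type, the function names, and all values are hypothetical.

```python
from typing import Dict, List, NamedTuple


class Push(NamedTuple):
    committer_time: int  # seconds since the epoch (made-up values)
    project: str         # sub-project name, e.g. "llvm" or "clang"
    commit: str          # commit hash in that sub-project


def build_umbrella(pushes: List[Push]) -> List[Dict[str, str]]:
    """Return one snapshot per push, ordered by committer timestamp.

    Each snapshot maps every sub-project seen so far to its current commit,
    which is the tuple a submodule-based umbrella commit would record.
    """
    snapshots: List[Dict[str, str]] = []
    state: Dict[str, str] = {}
    for push in sorted(pushes, key=lambda p: p.committer_time):
        # Copy the previous state so earlier snapshots stay unchanged.
        state = {**state, push.project: push.commit}
        snapshots.append(state)
    return snapshots


def checkout(snapshots: List[Dict[str, str]], revision: int) -> Dict[str, str]:
    """Look up the sub-project commits recorded at a 1-based umbrella revision."""
    return snapshots[revision - 1]
```

Bisecting across sub-projects then amounts to bisecting over the umbrella revisions and checking out the recorded commit of each sub-project. The sketch does not model commits that must land in several sub-projects together, nor the ordering races raised in the comments.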

				One example of such a repository is Takumi's llvm-project-submodule
				(https://github.com/chapuni/llvm-project-submodule). You can use
				`git submodule init` to check out only the sub-projects you're interested in, and
				rengolin (Done): Can you add a link of the monorepo as well? I think you had one, right?
				beanz (Done): Remove "(with some granularity)". The multi-repo proposal can have the same 1:1 mapping of commits in per-project repos to umbrella commits that the mono-repo would have. When the update job runs with a list of more than one commit we can sort them by committer timestamp (which is updated after rebase). It will provide a roughly linear timeline for the commits to be sorted across the repositories. It won't be perfect, but it should be good enough for sorting commits in close proximity because the pushed commits will either be rebased (which updates the committer timestamp) or they will be merge commits which will have a committer timestamp generated when the merge commit was generated.
				mehdi_amini (Author, Done): It seems to me that at the beginning the idea was that the submodules would be updated every few minutes, so that we'd be able to have rev-locked commits pushed to multiple projects at the same time and have them appear a single umbrella update (with somehow a heuristic like "update the submodules when there hasn't been a push for 2 min"). Apparently your idea is rather than we should update it with single commits, but what's the story for rev-locked? How would the tooling not have a race condition? Example:
				1. I commit to LLVM
				2. I commit to Clang
				3. the script runs, pull LLVM, no change
				4. I push to LLVM
				5. I push to Clang
				6. the script pulls Clang, see my commit
				7. the script is done with pulling and update the submodule with the clang change, before the LLVM change, even though the commit date would be reversed.
				I don't see a principled solution to implement the umbrella without server-side (i.e. native git hook) support. Sure you can craft it, and it'll work fine most of the time, but that does not make it bulletproof.
				beanz (Done): The automation will run. It will collect a list of commits that have been pushed to each repository since the last time the script ran. It will then sort them by committer timestamp order, and commit one at a time to the umbrella repo as submodule updates. We can setup the automation to run based on GitHub WebHooks, and periodically in case a WebHook gets dropped. There is no race condition that I see. If we need to support revlocked changes, (and I'm not convinced this is the case since they are by far a minority of commits) we can support them via annotations on the commit messages. We can teach the automation to look for markers in the commit message denoting that it is revlocked to other changes, and we can have it group revlocked changes together. There is no need for server-side hooks, and this solution would work as well as any mirroring system. I don't believe there is any need for this solution to be bulletproof, but I see no reason why it cannot be as robust as the single-project mirrors that the mono-repo proposal includes.
				mehdi_amini (Author, Done):
				> The automation will run. It will collect a list of commits that have been pushed to each repository since the last time the script ran.
				Atomically?
				> There is no race condition that I see.
				Did you read my sequence 1-7 that describes an example of race?
				> but I see no reason why it cannot be as robust as the single-project mirrors that the mono-repo proposal includes.
				Define "robust". The single-project mirrors have a very well deterministic algorithm to construct, and reconstruct them at will, you don't have one for the multi-repo. That's not "robust" to me.
				beanz (Done): I've updated my automation (https://github.com/llvm-beanz/llvm-submodules) to make one umbrella commit per commit to sub-project repository. This has a single commit granularity. That was the original point I was arguing. It works. It is done. Is it perfect? No. There are a number of situations where the order of the commits to the submodule can be impacted by the order and proximity of commits to the project repositories. That is irrelevant to the point I was making. I'm more than happy to debate with you about whether or not that matters, but that is a separate issue from what I was pointing out. Do we need to belabor this further, or will you update the document based on my feedback?
				mehdi_amini (Author, Done): You're moving goal posts. Your previous message said that there is no race, while now you're eluding it with "There are a number of situations where...". Also you're changing the definition of the multi-repo as I was foreseeing it. I think it is worse, and if we were to adopt the multi-repo proposal, I would be totally against this. Now, just to please you, because again I don't think it does any good to this proposal, I'll re-formulate making clear that:
				1. update in the multi-repo are single commits based.
				2. commits can be in different orders.
				3. it does not handle cross-project commits.
				beanzUnsubmitted Not Done Reply Inline Actions From the beginning I said: It won't be perfect, but it should be good enough for sorting commits in close proximity... If you want to debate that statement we can do so, but I would prefer not to in this thread. Also you're changing the definition of the multi-repo as I was foreseeing it. I think it is worse, and if we were to adopt the multi-repo proposal, I would be totally against this. You don't get to dictate how the proposal in opposition to your preferred approach is written. I think you've been pretty clear about being against the multi-repo proposal, so I don't see how your opinion factors in to the final document, which shouldn't be opinion based. beanz: From the beginning I said: > It won't be perfect, but it should be good enough for sorting…
				mehdi_aminiAuthorUnsubmitted Not Done Reply Inline Actions You don't get to dictate how ... Sorry, you mischaracterizing my position and what I wrote, I don't appreciate this. mehdi_amini: > You don't get to dictate how ... Sorry, you mischaracterizing my position and what I wrote…
				other submodule commands to e.g. update all submodules to an older revision.

				This umbrella repository will be updated automatically by a bot (running on
				notice from a webhook on every push, and periodically). Note that commits in
				beanz [Done]: I would say 'continuously' rather than periodically here. You describe in more detail below how the notifications would be configured and 'periodically' isn't a full picture.
				different repositories pushed within the same time frame may be visible together
				or in undefined order in the umbrella repository.
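
As a rough illustration of the ordering step (a sketch only, not the bot's actual implementation), assume the bot has collected one line per new commit in a hypothetical `<committer-epoch> <repo> <sha>` format. Sorting on the timestamp alone gives the order in which the umbrella would be updated. Commits that share a timestamp simply keep their arrival order, which is one way the "undefined order" mentioned above can appear.

```shell
#!/bin/sh
# Sketch: order newly seen commits by committer timestamp.
# Input format (an assumption for this example): "<epoch> <repo> <sha>" per line.
order_commits() {
    # -n: numeric compare, -k1,1: the key is the first field only,
    # -s: stable, so equal timestamps keep their input order
    # (-s is not in POSIX, but GNU and BSD sort both provide it).
    sort -s -n -k1,1
}

# Hypothetical data: the clang commit was seen second but is the oldest.
printf '%s\n' \
    '1472800020 llvm aaa111' \
    '1472800010 clang bbb222' \
    '1472800020 lld ccc333' | order_commits
```

In a real bot the input lines could come from something like `git log --format='%ct %H'` in each repository; `%ct` is the committer date as a UNIX timestamp.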

				Workflow Before/After
				=====================

				This section goes through a few examples of workflows.

				Checkout/Clone a Single Project, without Commit Access
				------------------------------------------------------

				Except for the URL, nothing changes. The possibilities today are::

				svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm
				# or with Git
				beanz [Done]: s/interacts/would interact/
				git clone http://llvm.org/git/llvm.git

				After the move to GitHub, you would do either::

				git clone https://github.com/llvm-project/llvm.git
				# or using the GitHub svn native bridge
				svn co https://github.com/llvm-project/llvm/trunk

				rengolin [Done]: Better to keep the same order svn/git. And don't need to specify that it's a bridge, since you mention above.
				mehdi_amini (Author) [Done]: The "canonical way" changes. The bridge isn't mentioned in this section, I rather have it clear (and it doesn't hurt).
				The above works for both the monorepo and the multirepo, as we'll maintain the
				existing read-only views of the individual sub-projects.

				Checkout/Clone a Single Project, with Commit Access
				---------------------------------------------------

				Currently
				::

				# direct SVN checkout
				svn co https://user@llvm.org/svn/llvm-project/llvm/trunk llvm
				# or using the read-only Git view, with git-svn
				git clone http://llvm.org/git/llvm.git
				cd llvm
				git svn init https://llvm.org/svn/llvm-project/llvm/trunk --username=<username>
				git config svn-remote.svn.fetch :refs/remotes/origin/master
				git svn rebase -l # -l avoids fetching ahead of the git mirror.

				Commits are performed using `svn commit` or `git commit` and `git svn dcommit`.

				rengolin [Done]: Can you commit via git directly?
				mehdi_amini (Author) [Done]: How do you use git svn?
				Multirepo Proposal
				rengolin [Done]: D'oh, "with the sequence...". Ignore me.
				mehdi_amini (Author) [Done]: I added "with the sequence" following your comment to make it more clear.

				With the multirepo proposal, nothing changes but the URL, and commits can be
				performed using `svn commit` or `git commit` and `git push`::

				git clone https://github.com/llvm/llvm.git llvm
				# or using the GitHub svn native bridge
				svn co https://github.com/llvm/llvm/trunk/ llvm

				Monorepo Proposal

				With the monorepo, there are multiple possibilities to achieve this. First,
				you could just clone the full repository::

				git clone https://github.com/llvm/llvm-projects.git llvm
				# or using the GitHub svn native bridge
				svn co https://github.com/llvm/llvm-projects/trunk/ llvm

				At this point you have every sub-project (llvm, clang, lld, lldb, ...), which
				doesn't imply you have to build all of them. You can still build only
				compiler-rt for instance. In this way it's not different from someone who would
				check out all the projects with SVN today.

				You can commit as normal using `git commit` and `git push` or `svn commit`, and
				read the history for a single project (`git log libcxx` for example).

				rengolin [Done]: Is this a flat tree (like today) or the checked-out tree (tools/clang, etc)?
				If you don't want to have the sources for all the sub-projects checked out,
				there are again a few options.

				First, you could hide the other directories using a Git sparse checkout::

				git config core.sparseCheckout true
				echo /compiler-rt > .git/info/sparse-checkout
				git read-tree -mu HEAD

				The data for all sub-projects is still in your `.git` directory, but in your
				checkout, you only see `compiler-rt`. Git compresses its history, so
				a clone of everything is only about 2x as much data as a clone of llvm only (and
				beanz [Done]: You've lost me here. Checking out all the projects in SVN today involves multiple svn co commands. Unless there is some magic in SVN I'm unaware of. If there is such magic we should document it somewhere on LLVM.org (maybe on the getting started page?) and link to it here.
				mehdi_amini (Author) [Not Done]: I was referring to:
				svn co http://llvm.org/svn/llvm-project/ --depth=immediates
				cd llvm-project/
				svn up llvm/trunk clang/trunk libcxx/trunk
				You can then have a build with only LLVM configured like:
				mkdir ../build-llvm && cd ../build-llvm
				cmake ../llvm-project/llvm/trunk
				And a build dir with llvm+clang:
				mkdir ../build-clang && cd ../build-clang
				cmake ../llvm-project/llvm/trunk -DLLVM_EXTERNAL_CLANG_DIR=../llvm-project/clang/trunk/
				So that a single `svn up $projects` in the source directory update all the sources and you can still build a subset of the projects from these sources. This is also how I'd synchronize if I was integrating downstream from SVN.
				beanz [Done]: I can't imagine that is a common workflow. It certainly isn't the documented recommended workflow on llvm.org, so I'm not sure there is value in bringing it into the discussion.
				in any case this is dwarfed by the size of e.g. a llvm build dir).

				Before you push, you'll need to fetch and rebase as normal. However when you
				fetch you'll likely pull in changes to sub-projects you don't care about. You
				may need to rebuild and retest, but only if the fetch included changes to a
				sub-project that your change depends on. You can check this by running::

				git log origin/master@{1}..origin/master libcxx

				This shows you all of the changes to `libcxx` since you last fetched. This
				command can be hidden in a script so that `git llvmpush` would perform all these
				steps, fail only if such a dependent change exists, and show immediately the
				change that prevented the push. An immediate repeat of the command would
				(almost) certainly result in a successful push. (This is
				an extra step that you don't need in the multirepo, but for those of us who
				rengolin [Done]: successful?
				probinson [Done]: successed -> successful
				work on a sub-project that depends on llvm, it has the advantage that we can
				beanz [Done]: Can you please add actual size numbers for each project and the mono-repo? Just saying '2x' isn't super meaningful without knowing the size of 1x.
				beanz [Done]: Can you add per-project sizes?
				mehdi_amini (Author) [Done]: That'd make a long list, how should it be presented?
				beanz [Done]: However you think it is best presented. A table would seem fitting. You could put it below and have a link down to it. I think that if you're bringing size into the discussion you need to provide sufficient data.
				check whether we pulled in any changes to say clang or llvm.)
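
The check described above could be scripted roughly as follows. This is only a sketch of what a `git llvmpush` style helper might do, not an existing tool: `changed_projects` and `needs_retest` are hypothetical names, and the sketch assumes that every sub-project lives in its own top-level directory of the monorepo.

```shell
#!/bin/sh
# Sketch of the check a hypothetical `git llvmpush` helper could run.

# changed_projects: read changed file paths on stdin and print each
# top-level directory (sub-project) once.
changed_projects() {
    # Only paths containing a '/' live inside a sub-project directory.
    grep '/' | cut -d/ -f1 | sort -u
}

# needs_retest: succeed (exit 0) if any project named in "$@" has
# changed files listed on stdin, fail (exit 1) otherwise.
needs_retest() {
    changed=$(changed_projects)
    for project in "$@"; do
        # -x: whole-line match, -F: fixed string, -q: quiet.
        if printf '%s\n' "$changed" | grep -qxF "$project"; then
            return 0
        fi
    done
    return 1
}

# Intended use inside a monorepo clone (not run here):
#   git fetch origin
#   git diff --name-only 'origin/master@{1}' origin/master \
#       | needs_retest llvm libcxx \
#       && echo "a dependency changed: rebuild and retest before pushing"
```

The list passed to `needs_retest` is the set of sub-projects your change depends on. If none of them changed since the last fetch, the rebase cannot have affected your build and the push can proceed without retesting.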

				A second option is to use svn via the GitHub svn native bridge::

				svn co https://github.com/llvm/llvm-projects/trunk/compiler-rt compiler-rt --username=...
				kparzysz [Done]: Even with sparse checkout? Am I going to see new files in projects that were not originally included in the sparse checkout?
				jlebar [Done]: What do you mean by "see"? In order to push a commit without `-f`, the commit's parent commit must be the current remote head. The commits in git are unaffected by sparse checkout. So, if you have a commit you want to push, you will need to rebase it atop current remote HEAD -- you'll have to do this rebase even if you're using sparse checkouts and all of the changes between your current base revision and current remote HEAD are to subprojects that you don't have checked out. If you don't like this, you can continue to use the single-subproject mirrors exactly as you currently do (with git-svn and everything), by changing the configs as explained elsewhere in this document. But I've been using a monorepo (http://github.com/llvm-project/llvm-project) for months now. I've pushed maybe 30 commits using my custom script (https://github.com/jlebar/llvm-repo-tools) and this necessity to rebase hasn't once been an annoyance for me.
				kparzysz [Done]:
				> What do you mean by "see"?
				I'm referring to this (and the rest of this paragraph): "However when you fetch you'll likely pull in changes to sub-projects you don't care about." The intent wasn't clear---I wasn't aware of the requirement about parent commit (I use SVN for upstreaming changes). But that brings another question: what is the anticipated frequency of commits to the monorepo? My concern is that the "rebuild and retest" approach may take long enough to require another rebase...
				mehdi_amini (Author) [Done]: When you commit to SVN, you add a "patch" on top of the existing codebase. Unless there is a conflict your patch will be committed. It does not mean it will build, since someone else may just have changed an API you're using in your patch. The new monorepo won't be different from SVN on this aspect: you have the same frequency of commits, and you can run `git pull && git push` which is roughly equivalent to `git svn dcommit` today. The thing is that between the `git pull` and the `git push`, you can also inspect what changed since your last build/check, and decide if you need to rebuild or not.
				mehdi_amini (Author) [Done]: Maybe we should just remove all this paragraph, it is confusing...
				jlebar [Done]: I really think you want this paragraph, btw. This is a very common question -- it's been asked many times before. "I don't want the monorepo because it will mean I have to rebuild/retest a lot more than I do today." False, but we need to explain why.
				mehdi_amini (Author) [Not Done]: OK, I'll try to rephrase it then. The main point is that `git pull && git push` is not different from today SVN.
				jlebar [Done]:
				> My concern is that the "rebuild and retest" approach may take long enough to require another rebase...
				This isn't a function of the monorepo. You choose when to rebuild/retest, and that's orthogonal to the repository structure. If "rebuild/retest only when there were changes to files I changed" is what you want to do, you can still do that. You can ask that question of git before pushing. Or you could ask "have any of the projects I care about changed?" Or you could ask a different question. And you could ask those questions of the monorepo, or the multirepo (although it might be a bit more work in the multirepo -- I say "might" so beanz doesn't jump on me). In this sense it's safer than SVN, which assumes that you only care about retesting if there were modifications to files you also changed.
				mehdi_amini (Author) [Done]:
				> Even with sparse checkout? Am I going to see new files in projects that were not originally included in the sparse checkout?
				If you mean are you seeing them when typing `ls` in your terminal, then no you don't. I can add "unless you're using a sparse checkout" to make it more clear.

				This checks out only compiler-rt and provides commit access using "svn commit",
				in the same way as it would do today.

				Finally, you could use git-svn and one of the sub-project mirrors::

				# Clone from the single read-only Git repo
				git clone http://llvm.org/git/llvm.git
				cd llvm
				# Configure the SVN remote and initialize the svn metadata
				git svn init https://github.com/joker-eph/llvm-project/trunk/llvm --username=...
				kparzysz [Done]: A conflicting change would have to affect the same file. This is regardless of whether it's monorepo or multirepo. Am I missing something here? Rebasing is always a good practice, but it's not strictly required. If there are no conflicts, the system will just add the change on top of the current ToT, even if they have not been fetched to the local repo.
				jlebar [Done]:
				> Rebasing is always a good practice, but it's not strictly required. If there are no conflicts, the system will just add the change on top of the current ToT, even if they have not been fetched to the local repo.
				That is what git-svn will do, yes. But that's not pure git's behavior.
				mehdi_amini (Author) [Done]:
				> A conflicting change would have to affect the same file. This is regardless of whether it's monorepo or multirepo. Am I missing something here?
				The point was that when you run `git pull --rebase`, you have new changes, and even without an explicit "diff conflict" your changes that you're about to push may use an API that have changed upstream. Note today this is not addressed: SVN will blindly accept the push and break the build.
				> Rebasing is always a good practice, but it's not strictly required. If there are no conflicts, the system will just add the change on top of the current ToT, even if they have not been fetched to the local repo.
				As Justin mentions, this is not true with `git push` AFAIK. You have to `pull` (merge or rebase) before being able to push.
				git config svn-remote.svn.fetch :refs/remotes/origin/master
				git svn rebase -l

				In this case the repository contains only a single sub-project, and commits can
				be made using `git svn dcommit`, again exactly as we do today.

				Checkout/Clone Multiple Projects, with Commit Access
				----------------------------------------------------

				Let's look at how to assemble llvm+clang+libcxx at a given revision.

				Currently
				::

				svn co http://llvm.org/svn/llvm-project/llvm/trunk llvm -r $REVISION
				cd llvm/tools
				svn co http://llvm.org/svn/llvm-project/clang/trunk clang -r $REVISION
				cd ../projects
				svn co http://llvm.org/svn/llvm-project/libcxx/trunk libcxx -r $REVISION

				Or using git-svn::
				beanz [Done]: the emphasis on 'exactly as we do today' is unnecessary.

				git clone http://llvm.org/git/llvm.git
				cd llvm/
				git svn init https://llvm.org/svn/llvm-project/llvm/trunk --username=<username>
				git config svn-remote.svn.fetch :refs/remotes/origin/master
				git svn rebase -l
				git checkout `git svn find-rev -B r258109`
				cd tools
				git clone http://llvm.org/git/clang.git
				cd clang/
				git svn init https://llvm.org/svn/llvm-project/clang/trunk --username=<username>
				git config svn-remote.svn.fetch :refs/remotes/origin/master
				git svn rebase -l
				git checkout `git svn find-rev -B r258109`
				cd ../../projects/
				git clone http://llvm.org/git/libcxx.git
				cd libcxx
				git svn init https://llvm.org/svn/llvm-project/libcxx/trunk --username=<username>
				git config svn-remote.svn.fetch :refs/remotes/origin/master
				git svn rebase -l
				git checkout `git svn find-rev -B r258109`

				Note that the list would be longer with more sub-projects.

				Multirepo Proposal

				With the multirepo proposal, the umbrella repository enters the dance. This is
				where the mapping from a single revision number to the individual repositories'
				revisions is stored::

				git clone https://github.com/llvm-beanz/llvm-submodules
				cd llvm-submodules
				git checkout $REVISION
				git submodule init
				beanz [Done]: Nit: "enters the dance" implies complexity.
				git submodule update clang llvm libcxx

				rengolin [Done]: Maybe mention --recursive, too?
				mehdi_amini (Author) [Done]: What's the point?
				At this point the clang, llvm, and libcxx individual repositories are cloned
				rengolin [Done]: "Update" would take one per project and is more cumbersome when you don't know beforehand which or how many projects you'll build (we have that problem). Conceptually the same, but recursive gives a better "impression" of simplicity. It's about the bias issue that Chris was talking about, even if totally unintended.
				mehdi_amini (Author) [Done]: I'm not sure I follow: AFAIK recursive is for nested submodules, which is not part of the proposal. So to be clear I expect `--recursive` to be a no-op. I can be wrong, but I'll need some more explanation if I missed something obvious here. If your point is about cloning all the sub-projects and not only just a selected list, then `--recursive` is not the right option, just doing `git submodule update` without any other flag will do it. I'll spell it out.
				mehdi_amini (Author) [Done]: I added a comment mentioning that the list if optional. Let me know if I misunderstood something about --recursive above.
				and stored alongside each other. There exist flags you can use to inform CMake
				of your directory structure, and alternatively you can just symlink `clang` to
				`llvm/tools/clang`, etc.
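
The symlink alternative could look like the sketch below. `link_subprojects` is a hypothetical helper name, and the sketch assumes the llvm, clang, and libcxx checkouts sit side by side in one directory, as they do after the `git submodule update` above.

```shell
#!/bin/sh
# Sketch: link sibling checkouts into the in-tree locations used by the
# LLVM build. $1 is the directory holding llvm, clang and libcxx.
link_subprojects() {
    root=$1
    # Relative targets are resolved from the directory containing the
    # link, so ../../clang seen from llvm/tools is "$root/clang".
    ln -s ../../clang  "$root/llvm/tools/clang"
    ln -s ../../libcxx "$root/llvm/projects/libcxx"
}

# Intended use from the umbrella checkout (not run here):
#   link_subprojects .
```

Relative links keep working if the whole umbrella directory is moved or copied.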

				Monorepo Proposal

				The repository natively contains the source for every sub-project at the right
				revision, which makes this straightforward::

				git clone https://github.com/llvm/llvm-projects.git llvm
				cd llvm
				git checkout $REVISION

				As before, at this point clang, llvm, and libcxx are stored in directories
				alongside each other.

				Commit an API Change in LLVM and Update the Sub-projects
				--------------------------------------------------------

				Today this is easy for subversion users, and possible but not straightforward for
				git-svn users. Few Git users try to e.g. update LLD or Clang in the same commit
				beanz [Done]: Alternatively since our intention is to enforce a linear history in the repositories doing a checkout by timestamp using the format below should also work in the majority of cases.
				git checkout 'master@{...}'
				mehdi_amini (Author) [Done]: This applies to both proposals right? Where do you want me to add this?
				beanz [Done]: I think it is worth noting under the multi-repo proposal something along the lines of:
				> Because we will be maintaining a linear history you can perform a timestamp based checkout of each project repository with the following command:
				> git checkout 'master@{...}'
				> Additionally you can use the umbrella repository...
				If you want to also add the timestamp checkout to the mono-repo proposal, that makes sense too. I just think it is worth noting under the multi-repo proposal that timestamp based checkouts are expected to work due to the linear history requirement, which means you don't need the submodule repo.
				mehdi_amini (Author) [Done]: Are you sure that this command does what you think it does? If I read correctly the doc, it is looking at your reflog, not the history. The right one should be something like:
				git checkout `git rev-list -n 1 --before="2009-07-27 13:37" master`
				> I just think it is worth noting under the multi-repo proposal that timestamp based checkouts are expected to work due to the linear history requirement, which means you don't need the submodule repo.
				OK that wasn't clear to me the first time.
				beanz [Done]: You are correct, you need to use `rev-list` to get the commit hash.
				as they change an LLVM API.

				The multirepo proposal does not address this: one would have to commit and push
				separately in every individual repository. It might be possible to establish a
				protocol whereby users add a special token to their commit messages that causes
				the umbrella repo's updater bot to group all of them into a single revision.

				beanz [Not Done]: Not sure I agree this is easy for svn users. To my knowledge llvm.org doesn't even document how to checkout the SVN repositories in a way to make this possible.
				The monorepo proposal handles this natively and makes this use case
				mehdi_amini (Author) [Not Done]: Do you have an alternative to suggest?
				trivial.
				probinson [Done]: single repository -> monorepo

				Branching/Stashing/Updating for Local Development or Experiments
				----------------------------------------------------------------

				Currently

				SVN does not allow this use case, but developers that are currently using
				beanz [Done]: Again, I don't follow how this is easy. There is no documentation on LLVM.org explaining how to do this and my limited knowledge of SVN leaves me with no idea how to do it.
				mehdi_amini (Author) [Done]: (Copy/pasted commands above)
				mehdi_amini (Author) [Done]: Copy/pasted above (I'm not sure I really want to document it on llvm.org now).
				beanz [Done]: Fine if you don't want to document it, but I certainly would not describe that as "easy". Especially because if you ever mix up and type "svn up" in the root it starts updating everything. I think this is an incredibly fragile workflow, which is probably why it is also incredibly uncommon.
				git-svn can do it. Let's look at what this means in practice when dealing with
				multiple sub-projects.

				To update the repository to tip of trunk::

				beanz [Done]: I would phrase as "It would be possible...", because it most certainly is possible.
				git pull
				cd tools/clang
				git pull
				cd ../../projects/libcxx
				git pull
				beanz [Done]: Please remove "and makes this use case ...", it is a value judgement.
				mehdi_amini (Author) [Done]: I don't believe so, but if you insist...

				To create a new branch::

				git checkout -b MyBranch
				cd tools/clang
				git checkout -b MyBranch
				cd ../../projects/libcxx
				git checkout -b MyBranch

				To switch branches::

				git checkout AnotherBranch
				cd tools/clang
				git checkout AnotherBranch
				cd ../../projects/libcxx
				git checkout AnotherBranch

				probinson [Done]: These checkouts should not have -b on them.
				Multirepo Proposal

				The multirepo works the same as the current Git workflow: every command needs
				to be applied to each of the individual repositories.
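
The repetition can be wrapped in a small loop. The sketch below uses a hypothetical `for_each_repo` helper and assumes the in-tree layout from the "Currently" example (clang under `tools/`, libcxx under `projects/`). Users of the umbrella repository could get a similar effect with git's own `git submodule foreach`.

```shell
#!/bin/sh
# Sketch: run one command in several repository directories, stopping at
# the first failure. for_each_repo is a hypothetical helper name.
for_each_repo() {
    cmd=$1
    shift
    for repo in "$@"; do
        # The subshell keeps the caller's working directory unchanged.
        (cd "$repo" && sh -c "$cmd") || return 1
    done
}

# Intended use from the llvm checkout (not run here):
#   for_each_repo 'git pull' . tools/clang projects/libcxx
#   for_each_repo 'git checkout -b MyBranch' . tools/clang projects/libcxx
```

Stopping at the first failure avoids leaving some repositories on the new branch and others on the old one without noticing.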

				Monorepo Proposal

				Regular Git commands are sufficient, because everything is in a single
				repository:

				To update the repository to tip of trunk::

				git pull

				To create a new branch::

				git checkout -b MyBranch

				To switch branches::

				git checkout AnotherBranch

				Bisecting
				---------

				Assume a developer is looking for a bug in clang (or lld, or lldb, ...).

				Currently

				SVN does not have builtin bisection support. Using the existing Git read-only
				beanz [Done]: Additionally users of the umbrella repo can use `git submodule foreach` to have single command workflows that nearly match the mono-repo proposal.
				view of the repositories, it is possible to use the native Git bisection script
				probinson [Done]: SVN bisection is not built-in but it is easy to do manually (or scripted) because you can do `svn update -r $REVISION` to an arbitrary revision. Because revisions are integers, do `(BAD - GOOD)/2` to pick the next revision. So, it is not materially harder than bisecting on the multirepo.
				mehdi_amini (Author) [Not Done]: Thanks Paul, I tried to clarify, can you double-check?
				over the llvm repository, and use some scripting to synchronize the clang
				probinson (Done): Very nicely succinct. One typo: scripts -> script
				repository to match the llvm revision.
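The manual bisection over integer SVN revisions mentioned in the inline comment above can be sketched as follows. This is only an illustration: `next_revision`, `bisect_revisions`, and the checker passed as the third argument are hypothetical names. A real checker would typically run `svn update -r "$1"`, rebuild, and run the reproducer.

```shell
#!/bin/sh
# Midpoint between a known-good (lower) and a known-bad (higher) revision.
next_revision() {
    lo=$1
    hi=$2
    echo $(( lo + (hi - lo) / 2 ))
}

# Narrow the [good, bad] range until the two revisions are adjacent,
# then print the first bad revision.
# $3 names a command or function (hypothetical) that exits 0 when the
# revision passed as its argument exhibits the bug.
bisect_revisions() {
    good=$1
    bad=$2
    check=$3
    while [ $(( bad - good )) -gt 1 ]; do
        mid=$(next_revision "$good" "$bad")
        if "$check" "$mid"; then
            bad=$mid
        else
            good=$mid
        fi
    done
    echo "$bad"
}
```

The midpoint is computed as `good + (bad - good) / 2`, so the candidate always lies between the two known revisions.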

				Multirepo Proposal

				With the multi-repositories proposal, the cross-repository synchronization is
				achieved using the umbrella repository. This repository contains only
				submodules for the other sub-projects. The native Git bisection can be used on
				the umbrella repository directly. A subtlety is that the bisect script itself
				needs to make sure the submodules are updated accordingly.

				For example, to find which commit introduces a regression where clang-3.9
				crashes but clang-3.8 passes, one should be able to simply do::

				git bisect start release_39 release_38
				git bisect run ./bisect_script.sh

				With the `bisect_script.sh` script being::

				#!/bin/sh
				cd $UMBRELLA_DIRECTORY
				git submodule update llvm clang libcxx #....
				cd $BUILD_DIR

				ninja clang || exit 125 # an exit code of 125 asks "git bisect"
				# to "skip" the current commit

				./bin/clang some_crash_test.cpp

				When the `git bisect run` command returns, the umbrella repository is set to
				the state where the regression is introduced. One can then inspect the history
				of every sub-project compared to the previous revision in the umbrella (it is
				possible that one commit in the umbrella repository includes multiple commits
				probinson (Done): sub-projects -> sub-project
				in the sub-projects).

				Monorepo Proposal

				Bisecting on the monorepo is straightforward and almost identical to the
				multirepo situation explained above. The granularity is finer since each
				individual commit in every sub-project participates in the bisection. The
				bisection script does not need to include the `git submodule update` step.
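A minimal sketch of such a monorepo bisection helper is shown below, assuming the same build and reproducer commands as in the multirepo example. The `bisect_step` function name is hypothetical. Besides dropping the submodule step, it maps any reproducer failure to exit code 1, since `git bisect run` treats codes from 1 to 127 (except 125) as "bad" and aborts on other codes.

```shell
#!/bin/sh
# Sketch of one "git bisect run" step for the monorepo: no submodule update.
# $1 is the build command and $2 the reproducer, both passed as strings.
bisect_step() {
    build_cmd=$1
    test_cmd=$2
    # Exit code 125 asks "git bisect" to skip a commit that cannot be built.
    sh -c "$build_cmd" || return 125
    # Exit code 0 marks the commit as good.
    sh -c "$test_cmd" && return 0
    # Any failure of the reproducer marks the commit as bad.
    return 1
}

# Example usage inside bisect_script.sh (paths are assumptions):
#   cd "$BUILD_DIR" || exit 125
#   bisect_step "ninja clang" "./bin/clang some_crash_test.cpp"
#   exit $?
```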

				Living Downstream
				-----------------

				Depending on which of the multirepo or the monorepo proposal gets accepted,
				and depending on the integration scheme, downstream projects may be differently
				impacted and have different options.

				* If you were pulling from the SVN repo before the switch to Git, the monorepo
				will allow you to continue to use SVN. The main caveat is that you'll need to
				beanz (Done): This is inaccurate. Even though my rough prototype of the git umbrella repo doesn't have each submodule update being a single commit that was the stated plan for how the umbrella would be updated. That means each umbrella repo commit would represent a single commit to a single subproject, so your bisection granularity is comparable.
				mehdi_amini (Author, Done): (I'm waiting for the story to support this above)
				beanz (Done): See above.
				be prepared for a one-time change to the revision numbers. The multirepo
				mehdi_amini (Author, Done): If you have a way to guarantee it, I'm willing to hear about it. Right now, I don't believe it is possible without implementing it on the git hosting itself.
				proposal does not provide a great solution for this.
				beanz (Done): You can absolutely guarantee the same granularity. You can't guarantee the same ordering, but generally speaking that is significantly less important than granularity. To get the same granularity you allow the script that updates submodules to produce more than one commit to the submodule repo at a time. If there are multiple you can sort them by committer date. While committer date isn't a great thing to use since our proposals both depend on maintaining a linear history it should be good enough for the common cases because committer date gets reset on rebase.
				mehdi_amini (Author, Done):
				> but generally speaking that is significantly less important than granularity.
				No sorry, I can't agree, this is critical: correctness goes before usability. It seems to me that you're willing to trade correctness to bring a guarantee of usability here. I'm willing to believe that "in practice" the granularity should be small enough, it just has to be worded carefully. Right now it is a parenthesis at the end: `(it is possible that one commit in the umbrella repository includes multiple commits in the sub-projects)`, we can reword this `(it is possible that one commit in the umbrella repository includes multiple commits in the sub-projects, though it should be occasional in practice)` (One may bikeshed on what exactly "occasional" is though, but we don't have any data to bikeshed efficiently anyway).

				rengolin (Done): This last phrase is odd... It's not clear what "this" is, but I think you mean "a single repo in build structure". In a multi-repo, people will continue to checkout independent projects and commit directly to them, there's no difference for them.
				probinson (Done): Wouldn't the multirepo still have SVN views on each subproject? Seems like the SVN views would basically be the same for either multirepo or monorepo.
				mehdi_amini (Author, Done): Tried to clarify: the multirepo breaks the cross-project synchronization with SVN.
				* If you were pulling from one of the existing read-only Git repos, this also
				rengolin (Done): Looks better, thanks!
				will continue to work as before, as they will continue to exist in both of the
				proposals.

				Under the monorepo proposal, you have a third option: migrating your fork to
				the monorepo. This can be particularly beneficial if your fork touches
				multiple sub-projects (e.g. llvm and clang), because now you can commingle
				commits to llvm and clang in a single repository.
				beanz (Done): Better to say "both proposals will allow you to continue to use SVN". The wording here makes it seem like only the mono-repo has GitHub's SVN support, even though that is later contradicted.

				mehdi_amini (Author, Done): I did a minor rewording (we're on a different support level here between the two solutions, which need to be conveyed somehow).
				As a demonstration, we've migrated the "CHERI" fork to the monorepo in two ways:

				rengolin (Done): "CHERI"
				beanz (Done): If we go with the multi-repo approach we can ensure that each umbrella repo commit will be only one submodule update. This is relatively straight forward tooling to add. The only situation where we could potentially allow multiple updates in a single umbrella commit would be if we wanted to do cross-repository correlating of revlocked changes.
				mehdi_amini (Author, Done): (I'm waiting for the story to support this above)
				beanz (Done): Again, above.
				* Using a script that rewrites history (including merges) so that it looks like
				the fork always lived in the monorepo [LebarCherry]_. The upside of this is
				when you check out an old revision, you get a copy of all llvm sub-projects at
				a consistent revision. (For instance, if it's a clang fork, when you check
				out an old revision you'll get a consistent version of llvm proper.) The
				beanz (Done): The granularity is not finer.
				mehdi_amini (Author, Done): (I'm waiting for the story to support this above)
				downside is that this changes the fork's commit hashes.

				* Merging the fork into the monorepo [AminiCherry]_. This preserves the fork's
				commit hashes, but when you check out an old commit you only get the one
				sub-project.

				If you keep a split-repository solution downstream, upstreaming patches is
				always possible: you can apply the patches in the appropriate subdirectory of
				the monorepo.
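One possible way to do this, sketched under the assumption that the downstream work lives as commits in a split Git checkout, is to export the commits with `git format-patch` and apply them with `git am --directory=<subdir>` in a monorepo checkout. The `upstream_patches` function name and its arguments are hypothetical.

```shell
#!/bin/sh
# Export commits from a split (single-project) checkout and apply them
# under the matching subdirectory of a monorepo checkout.
upstream_patches() {
    split_repo=$1   # e.g. a downstream clang checkout
    rev_range=$2    # e.g. origin/master..HEAD
    monorepo=$3     # path to the monorepo checkout
    subdir=$4       # e.g. clang
    patch_dir=$(mktemp -d) || return 1
    # Write one patch file per commit into the temporary directory.
    git -C "$split_repo" format-patch --quiet -o "$patch_dir" "$rev_range" || return 1
    # --directory prepends the sub-project path to every file name in the patches.
    git -C "$monorepo" am --quiet --directory="$subdir" "$patch_dir"/*.patch
}

# Example usage:
#   upstream_patches ~/src/clang origin/master..HEAD ~/src/llvm-project clang
```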

				rengolin (Done): ... or you can apply directly on the multi-repo solution. It's good to repeat to make clear that both solutions are covered.
				Monorepo Variant
				================
				beanz (Done): Better to say both proposals allow you to continue using SVN the same way, but that each solution will have minor impacts. In the monorepo there will be a one-time change in revision numbers, and in the multi-repo each project will have its own revision numbers out of sync from each other.
				mehdi_amini (Author, Done): "The same way" implies "a single SVN revision number to me". One could even say "a single SVN checkout" (cf the command I copy/pasted above). I don't see how it'd work with the multi-repo? How would someone downstream integrating from SVN be able to correlate revision across repositories?
				beanz (Done): Maybe rather than "the same way" "with similar workflows to today"?
				mehdi_amini (Author, Done): I'm still missing what would be similar for someone integrating multiple projects from SVN today (assuming such downstream integrator exists) with the multi-repo?
				beanz (Done): I strongly suspect that very few users are using a single SVN checkout that contains more than one sub-project. If you discount that workflow, the workflow for interfacing using the GitHub SVN bridge is very similar whether you are using one repo or many. Additionally, with the mono repo the combined SVN workflow is actually a lot better than with SVN today. It is way less fragile since you aren't doing sub-directory checkouts. This means you don't run the risk of inadvertently running `svn up` and pulling down way more than you wanted.
				mehdi_amini (Author, Done):
				> If you discount that workflow, the workflow for interfacing using the GitHub SVN bridge is very similar whether you are using one repo or many.
				"Very similar" is subjective, to me it can't be similar as long as there is no longer a single revision number.
				> Additionally, with the mono repo the combined SVN workflow is actually a lot better than with SVN today. It is way less fragile since you aren't doing sub-directory checkouts. This means you don't run the risk of inadvertently running svn up and pulling down way more than you wanted.
				I don't understand what you mean here.
				beanz (Done): Saying the workflows is "similar" is not a subjective wording. Today someone who writes:
					`svn co http://llvm.org/svn/llvm-project/llvm/trunk`
				Under the mono-repo could write something like:
					`svn co http://github.com/llvm/llvm-project/master/llvm`
				Under the multi-repo could write something like:
					`svn co http://github.com/llvm/llvm/master/`
				The workflow of `svn co` -> `svn add` -> `svn commit` is similar in all cases.

				A variant of the monorepo proposal is to group together in a single repository
				only the projects that are rev-locked to LLVM (clang, lld, lldb, ...) and
				leave projects like libcxx and compiler-rt in their own individual and separate
				repositories.

				rengolin (Done): I'd add a small paragraph explaining the problems that will come from having "two worlds", neither here, nor there. If it's too complicated, than lets not even propose that, as it'll end up as a third proposal.
				probinson (Done): repository -> repositories.
				beanz (Done): s/any of the proposal/both of the proposals/
				Note however that many users of the monorepo would benefit from having all of
				the pieces needed for a full toolchain present in one repository. And for
				newcomers, getting and building a toolchain is easier.

				beanz (Done): Reword from the second sentence on. You're making a value assessment. A better phrasing might be: If your fork touches multiple LLVM projects, migrating your fork into the mono repo would enable you to make commits that touch multiple projects at the same time the same way LLVM contributors would be able to do so.
				Also, developers who hack only on one of these sub-projects can continue to use
				the single sub-project Git mirrors, so their workflow is unchanged. (That is,
				they aren't forced to download or check out all of llvm, clang, etc. just to
				make a change to libcxx.)

				rengolin (Done): Can they checkout the read-only libc++ and commit without checking out the entire monorepo?
				mehdi_amini (Author, Done): This is covered in the section `Checkout/Clone a Single Project, with Commit Access` (or I don't understand the question)
				Previews
				rengolin (Done): That work flow example shows a changed flow for commits, so the statement that "their workflow is unchanged" is not accurate. The parentheses comment helps, but doesn't address the issue completely. A better way would be "the workflow is as described in [link] pointing above.
				mehdi_amini (Author, Done): I'm sorry I don't follow. You mention a changed in the flow for commit. Here is what's mentioned in the section I referred to, can you clarify where is the inaccuracy?
				Workflow today:
					# direct SVN checkout
					svn co https://user@llvm.org/svn/llvm-project/llvm/trunk llvm
					# or using the read-only Git view, with git-svn
					git clone http://llvm.org/git/llvm.git
					cd llvm
					git svn init https://llvm.org/svn/llvm-project/llvm/trunk --username=<username>
					git config svn-remote.svn.fetch :refs/remotes/origin/master
					git svn rebase -l # -l avoids fetching ahead of the git mirror.
				Workflow after (copy/paste):
				A second option is to use svn via the GitHub svn native bridge::
					svn co https://github.com/llvm/llvm-projects/trunk/compiler-rt compiler-rt --username=...
				This checks out only compiler-rt and provides commit access using "svn commit", in the same way as it would do today.
				Finally, you could use git-svn and one of the sub-project mirrors::
					# Clone from the single read-only Git repo
					git clone http://llvm.org/git/llvm.git
					cd llvm
					# Configure the SVN remote and initialize the svn metadata
					git svn init https://github.com/joker-eph/llvm-project/trunk/llvm --username=...
					git config svn-remote.svn.fetch :refs/remotes/origin/master
					git svn rebase -l
				In this case the repository contains only a single sub-project, and commits can be made using `git svn dcommit`, again exactly as we do today.
				rengolin (Done): This is how it would work on a multi-repo, but this section is talking about the mono-repo. IIGIR, on a mono-repo, developers of a single component will have to commit back on the mono-repo, which will then be propagated to the individual (read-only) repos, no?
				mehdi_amini (Author, Not Done):
				> This is how it would work on a multi-repo
				I'm not totally sure what is "This" referring to? Assuming it is about my previous paste, then no it describes the monorepo.
				> IIGIR, on a mono-repo, developers of a single component will have to commit back on the mono-repo, which will then be propagated to the individual (read-only) repos, no?
				Right, and this is the same thing as what a git-svn developer do today:
				- git clone the individual repo
				- configure git svn to point to the SVN repo (the one from the monorepo in the future).
				- commit through SVN
				- the commits are propagated to the individual repo.
				========

				FIXME: make something more official/testable and update all the URLs in the
				beanz (Done): This is a subjective statement that I don't believe is factually accurate. We could easily teach the build system to checkout subprojects so that building a full toolchain could be `git clone ... && configure && build` regardless of the repository layout.
				examples above.
				mehdi_amini (Author, Done): Removed the paragraph

				beanz (Done): I would phrase the downside as "rewriting the fork's history and changing its commit hashes", because that is what happens.
				mehdi_amini (Author, Done): The paragraph starts with " Using a script that rewrites history" and end with "changes the fork's commit hashes", it seems to me that this makes explicit that the downside of rewriting history is that the hashes change. (I'm not sure how "rewriting history" is a downside by itself otherwise)
				beanz (Done): Fine.
				Example of a working version:

				* Repository: https://github.com/llvm-beanz/llvm-submodules
				* Update bot: http://beanz-bot.com:8180/jenkins/job/submodule-update/
				rengolin (Done): This is confusing... I thought you were going to list all of them, mono and multi.
				mehdi_amini (Author, Done): This is the intention, should be updated later.

				rengolin (Done): Ok
				beanz (Done): I'm confused by this. The sub-project mirrors are read-only, so the workflow is either checkout the full mono-repo or use Git-SVN. That doesn't sound unchanged.

				mehdi_amini (Author, Done): We're talking about libcxx in the monorepo proposal? Assuming yes, can you give an example of workflow that would be changed compared to today?
				Remaining Issues
				beanz (Done): Ah. I think the confusing phrasing is that monorepo is being used in two contexts. Maybe rephrase this to something like: With this variant of the monorepo proposal developers who only work on excluded sub-projects will continue to use the single-project repositories. The workflow is still changed from today, because today we're using SVN.
				mehdi_amini (Author, Done): Sorry, the sentence is really about the monorepo: leaving libcxx within the monorepo should not be a regression compared to today.
				================
				beanz (Done): This is a little unclear to me. Do you mean applying the patches via "git apply" from a patch file? Might be worth clarification about how that would work.

				LNT and llvmlab will need to be updated: they rely on a unique, monotonically
				increasing integer across branches [MatthewsRevNum]_.

				Straw Man Migration Plan
				beanz (Done): It is worth noting (as I did when I sent this out) that this was a very rough prototype, and it doesn't solve all the problems that we would expect a more permanent solution to solve. For example, the submodule update is periodic, not on a push-based notification, and the scripting around it doesn't do a single commit per update, which was the intended solution.
				========================
				mehdi_amini (Author, Done): (Already addressed above)

				beanz (Done): I'd like to see that mentioned here as well. This document is quite large and people may jump around reading it. It is worth having the note directly next to the link.
				STEP #1 : Before The Move

				1. Update docs to mention the move, so people are aware of what is going on.
				2. Set up a read-only version of the GitHub project, mirroring our current SVN
				beanz (Done): This makes it sound like the git mirrors are read-write. Might be worth adding a "via Git-SVN" comment to clarify.
				repository.
				3. Add the required bots to implement the commit emails, as well as the
				umbrella repository update (if the multirepo is selected) or the read-only
				Git views for the sub-projects (if the monorepo is selected).

				STEP #2 : Git Move

				4. Update the buildbots to pick up updates and commits from the GitHub
				repository. Not all bots have to migrate at this point, but it'll help
				provide infrastructure testing.
				5. Update Phabricator to pick up commits from the GitHub repository.
				6. Instruct downstream integrators to pick up commits from the GitHub
				repository.
				7. Review and prepare an update for the LLVM documentation.

				Until this point nothing has changed for developers; it will just
				boil down to a lot of work for buildbot and other infrastructure
				owners.

				Once all dependencies are cleared, and all problems have been solved:

				STEP #3: Write Access Move

				8. Collect developers' GitHub account information, and add them to the project.
				9. Switch the SVN repository to read-only and allow pushes to the GitHub repository.
				10. Update the documentation.
				11. Mirror Git to SVN.

				STEP #4 : Post Move

				12. Archive the SVN repository.
				13. Update links on the LLVM website pointing to viewvc/klaus/phab etc. to
				point to GitHub instead.

				.. [LattnerRevNum] Chris Lattner, http://lists.llvm.org/pipermail/llvm-dev/2011-July/041739.html
				.. [TrickRevNum] Andrew Trick, http://lists.llvm.org/pipermail/llvm-dev/2011-July/041721.html
				.. [JSonnRevNum] Joerg Sonnenberger, http://lists.llvm.org/pipermail/llvm-dev/2011-July/041688.html
				.. [TorvaldRevNum] Linus Torvalds, http://git.661346.n2.nabble.com/Git-commit-generation-numbers-td6584414.html
				.. [MatthewsRevNum] Chris Matthews, http://lists.llvm.org/pipermail/cfe-dev/2016-July/049886.html
				.. [submodules] Git submodules, https://git-scm.com/book/en/v2/Git-Tools-Submodules
				.. [statuschecks] GitHub status-checks, https://help.github.com/articles/about-required-status-checks/
				.. [LebarCherry] Port CHERI to a single repository rewriting history, http://lists.llvm.org/pipermail/llvm-dev/2016-July/102787.html
				.. [AminiCherry] Port CHERI to a single repository preserving history, http://lists.llvm.org/pipermail/llvm-dev/2016-July/102804.html

docs/index.rst

Context not available.

	CodeOfConduct	CodeOfConduct
	Proposals/GitHubSubMod	Proposals/GitHubSubMod
		Proposals/GitHubMove

	:doc:`CodeOfConduct`	:doc:`CodeOfConduct`
	Proposal to adopt a code of conduct on the LLVM social spaces (lists, events,	Proposal to adopt a code of conduct on the LLVM social spaces (lists, events,
Context not available.

Diff 70090