This is an archive of the discontinued LLVM Phabricator instance.

[RFC] Moving to GitHub Proposal: NOT DECISION!
ClosedPublic

Authored by rengolin on Jul 18 2016, 7:01 AM.

Details

Reviewers
rengolin
Summary

This is a draft of the proposal to move to GitHub. Once we agree that this is a good proposal, we'll create the survey to get the final and complete intent of the community and act on it.

*THIS IS NOT TO AGREE ON THE DECISION*, but only to agree on the proposal, which will then fuel the decision survey. Please, avoid arguments against Git or GitHub in this review. You'll have the opportunity to do so in the survey.

Dates and times can be agreed later, but we're probably looking at a 6-month period anyway.

I tried to include everyone I found in Phab who was on one of the discussions, but I may have missed a few. Feel free to add more people to it, but let's try to get a reasonable agreement on the proposal (not the move) in under two weeks.

Diff Detail

Event Timeline

rengolin updated this revision to Diff 64313.Jul 18 2016, 7:01 AM
rengolin retitled this revision from to [RFC] Moving to GitHub Proposal: NOT DECISION!.
rengolin updated this object.
rengolin set the repository for this revision to rL LLVM.
jroelofs added inline comments.Jul 18 2016, 7:26 AM
docs/Proposals/GitHub.rst
128

Do you mean s/SVN RW access/SVN RO access/ here?

137

Need to clarify here whether *write* access through SVN will be going away. If I understand the proposal correctly, it will go away, but this section makes it sound like it's staying.

rengolin added inline comments.Jul 18 2016, 8:04 AM
docs/Proposals/GitHub.rst
128

No, I actually mean SVN RW access. GitHub's SVN view does allow write access to the Git repos via "svn commit".

137

Our SVN server will die, SVN access will continue via GitHub.

compnerd added inline comments.Jul 18 2016, 8:07 AM
docs/Proposals/GitHub.rst
128

I believe @rengolin is referring to the final state here. I agree that the current phrasing makes it hard to follow.

130

"Which will continue to have SVN access" is redundant given the previous statement.

137

The way that I read the nutshell is that it would potentially continue to exist, just at a different address.

156

GitHub does have HTTPS based connections. It seems highly unlikely that this is a real concern. Companies would have to go out of their way to block access specifically to github over SSH and HTTPS.

168

I don't fully understand how this is any different from today. We have a core set of developers with commit access. Others are encouraged to provide patches via email (or may use phabricator depending on the reviewer). Once reviewed and accepted, one of the core developers still commits the change. I just see this as a process change.

The person forks the repository on github, and creates a branch, and then a PR. The PR is reviewed and once accepted, merged by one of the core developers. It even implicitly handles authorship tracking which has currently been done in an adhoc fashion via the commit message.

223

Giving permissions to only the LLVM "project" is sufficient. People can be added to the LLVM "project" as collaborators and get access that way. This is similar to how Apple is managing swift for comparison.

rengolin added inline comments.Jul 18 2016, 8:32 AM
docs/Proposals/GitHub.rst
130

good point. I'll try and re-write those points to be more clear.

156

I have had this problem in China... Though no one has raised this issue, so I'll just remove and let people complain about this in the survey.

168

Today we all commit to SVN, which is linear. In GitHub, we'll be committing to git. If we can have hooks forbidding merges, it'll remain linear, but then pull requests will be blocked. Additional hooks will need to be in place (please suggest all of them here and I'll update the doc).

223

That's what I meant but I will change the wording.

compnerd added inline comments.Jul 18 2016, 8:35 AM
docs/Proposals/GitHub.rst
168

I think that we should aim to preserve the linearity of history. This would mean that we block non-fastforward commits (i.e. no merges, no force pushes).
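The check described here (fast-forward only, no merge commits) can be sketched in shell. This is only an illustration under assumptions: the function name `is_linear_update` is hypothetical, it expects to run inside a Git repository that already contains both commits, and the thread does not say where such a check would actually be enforced.

```shell
#!/bin/sh
# Sketch: decide whether moving a branch from OLD to NEW keeps history linear.
# Hypothetical helper; assumes both commits exist in the current repository.
is_linear_update() {
    old=$1
    new=$2
    # Fast-forward only: OLD must be an ancestor of NEW (rules out force pushes).
    git merge-base --is-ancestor "$old" "$new" || return 1
    # No merges: none of the newly introduced commits may have two or more parents.
    merges=$(git rev-list --merges --count "$old..$new") || return 1
    [ "$merges" -eq 0 ]
}
```

A hook or external service could call `is_linear_update "$old" "$new"` and reject the update when it returns non-zero.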

jroelofs added inline comments.
docs/Proposals/GitHub.rst
128

Ah, I didn't catch that part. Cool.

137

Ah, ok.

rengolin updated this revision to Diff 64334.Jul 18 2016, 9:02 AM
rengolin removed rL LLVM as the repository for this revision.

First round of changes reflecting reviews.

filcab added a subscriber: filcab.Jul 18 2016, 10:42 AM

What about branches? I'm guessing we should expect the usual release branches. But will any person be able to create a branch? Will there be a policy, if this is the case? Is the policy enforceable?

filcab added inline comments.Jul 18 2016, 10:42 AM
docs/Proposals/GitHub.rst
123

How easy will it be to clone the "aggregated" repo, and then get some (but not all) of the submodules?

131

I wouldn't call it broken.
Won't it have the same end result as having a checkout per project and simply updating them close to each other?

Basically, it won't be "any more broken" than using this method for updating:

#!/bin/bash
# Update each checkout in turn; the brace expansion lists llvm/ and its nested projects.
for dir in llvm/{,tools/{clang,lld},projects/{libcxx,libcxxabi,compiler-rt}}; do
  # (cd "$dir" && svn up) # for SVN
  (cd "$dir" && git checkout master && git pull) # for git
done
probinson added inline comments.
docs/Proposals/GitHub.rst
142

This reads a little bit like "we will create a GitHub account for you if you don't have one" but I suspect people actually need to create their own GitHub accounts first. (We're not all on GitHub already!)

emaste added a subscriber: emaste.Jul 18 2016, 11:16 AM
emaste added inline comments.
docs/Proposals/GitHub.rst
9–10

It seems pedantic, but I think we should try hard to avoid conflating Git and GitHub. What about: "move our revision control system from self-hosted Subversion to Git hosted by GitHub."

94

Replace "stuck" with something neutral. "stuck" implies that everyone will want to move but some may not be able to for technical or other reasons, but some people actually prefer SVN.

181

This presents it as if the decision is already made, which somewhat defeats the purpose of writing a proposal for the LLVM community to vote on.

Maybe "If we decide to move"?

winksaville added inline comments.
docs/Proposals/GitHub.rst
133

This could be worded a little better, may I suggest something like:

"There is no need for additional tags, flags, properties, or external ..."

mehdi_amini added inline comments.
docs/Proposals/GitHub.rst
123

I expect the umbrella repo to come with scripts for that.

130

This sentence is not clear: "Individual projects' history will be broken", I don't see what's "broken".

169

I think that's covered line 136-137.

174

I don't think so: LNT is not dependent on SVN history. It is dependent on a single revision number cross-repository and cross-branches. This is something that the umbrella projects "could" provide.

199

Uh, this point is not clear: there will be a need for us to maintain a "non-trivial" hook on our server to handle this. This is not fleshed out in this document.

rengolin added inline comments.Jul 18 2016, 1:15 PM
docs/Proposals/GitHub.rst
9–10

This is a summary of the whole proposal, which specifically dictates GitHub.

We're not proposing to move to some Git hosting anymore, but exactly GitHub, due to the constraints that we outline below. If we do not move to GitHub specifically, a lot of the assumptions below will be wrong, and this proposal can't be accepted.

There is a paragraph on why git, and another on why GitHub, and both are key points of this proposal.

I'll change "from Subversion" to "from our own hosted Subversion" to make that even more clear.

94

good point.

123

Checking out this project:

https://github.com/chapuni/llvm-project-submodule

Will return the references, not the sub modules. You have to "init" each sub-module independently, which achieves the same task as only checking out the SVN repos you need, with the correct numbering.

131

No, it won't.

Checking out LLVM only skips all commits from all other repos. So, for example:

LNT 123
Clang 124
RT 125
LLVM 126

Then, "svn checkout 126" will be:

In LNT, 123 as 126
In Clang, 124 as 126
In RT, 125 as 126
In LLVM, 126 as 126

With the new SVN interface, each one of those commits will be referred to, locally, as 123, and 126 will not exist, because the "git rev-list --count" won't get as high as 126.

However, on the umbrella submodule project, because the sequence of the commits is guaranteed, the rev-list count will bring the correct numbering, the same as via the SVN interface, and thus be "just like SVN was".
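The difference between the two numbering schemes can be illustrated with a toy calculation. This is a sketch under assumptions: `number_commits` is a hypothetical helper, the input is an invented interleaved commit order, and the per-project counts start from zero here rather than from each project's real history.

```shell
#!/bin/sh
# Sketch: read one project name per line, in commit order, and print the
# shared SVN-style revision next to the per-project commit count
# (what "git rev-list --count" would report inside that project alone).
# $1 is the first shared revision number to assign (defaults to 1).
number_commits() {
    awk -v rev="${1:-1}" '{
        local_count[$1]++
        printf "%s svn=%d local=%d\n", $1, rev++, local_count[$1]
    }'
}
```

For example, `printf 'LNT\nClang\nRT\nLLVM\n' | number_commits 123` assigns the shared revisions 123 to 126 in order, while every project's own count is 1. The shared sequence only survives where all commits are counted together, which is what the umbrella project provides.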

133

Yup, changing on next round. Thanks!

142

well, "you need to provide the GitHub user" should take care of that, but I'll try to re-write this to make it more clear.

(We're not all on GitHub already!)

Are we not? Egregious! :D

181

I tried very hard to not mean that, but this one may be interpreted in the wrong way... I'll change that.

rengolin updated this revision to Diff 64371.Jul 18 2016, 1:16 PM

Second round of suggestions applied.

rengolin added inline comments.Jul 18 2016, 1:21 PM
docs/Proposals/GitHub.rst
175

So, LNT migration plan could be:

  1. Use GitHub's SVN view on llvm-proj-submods
  2. Move to understand submods
  3. Migrate all instances

Looks fairly orthogonal to me...

199

Good point. I added just a description to this topic, since this is covered elsewhere.

rengolin updated this revision to Diff 64373.Jul 18 2016, 1:22 PM

Removing "broken" to describe the history, just explaining it'll be local.

Expanding to mention that hooks will need to be implemented in step 3.

mehdi_amini added inline comments.Jul 18 2016, 1:25 PM
docs/Proposals/GitHub.rst
200

Annoyingly, my comment no longer shows up next to the point it was referring to. It was about your third point:

Make sure we have an llvm-project (with submodules) setup in the official account.

I don't think how this project will be updated is explained (or even mentioned) anywhere.

rengolin added inline comments.Jul 18 2016, 1:31 PM
docs/Proposals/GitHub.rst
201

You can click on the "<<" button and it will show where it was first inserted. That's how I found out. :)

The hooks, AFAICS, will be added to the project initially, and won't need to be updated. Takumi's current repository is automatically updated, and IIRC, it's just a hook.

Assuming it's atomic and quick enough, would this process make much of a difference to the users?

mehdi_amini added inline comments.Jul 18 2016, 1:31 PM
docs/Proposals/GitHub.rst
209

I'd like to see clearly mentioned, somewhere other than in the plan, that on top of "hooks" hosted by GitHub, we will need to maintain our own webservice on our own server to answer updates from these hooks and update the umbrella.
That's a non-negligible drawback in the face of the motivation set out in the "Why move at all?" section.

rengolin added inline comments.Jul 18 2016, 1:58 PM
docs/Proposals/GitHub.rst
209

There are two types of hooks:

  1. Pre-commit hooks that will stop anyone trying to merge/force push commits to the projects, in order to keep the history clean. These are install once, use forever. Zero maintenance after the initial period.
  2. Post-commit hooks on the other projects / OR / external webservice/buildbot that will update the umbrella project like any existing Git mirror. Maintenance of this is either free (hooks) or very cheap (buildbot/cron jobs).

In both cases, the history is preserved at least within the update cycle, which shouldn't be more than 5 minutes, and is the same cycle on which buildbots will pick up the commits anyway.

rengolin updated this revision to Diff 64383.Jul 18 2016, 2:18 PM

Expand step 2 to make sure we don't forget about the safety hooks on each project as well as the webhook to update the umbrella project. This could turn out to be a buildbot, but makes no difference at this stage.

mehdi_amini added inline comments.Jul 18 2016, 2:41 PM
docs/Proposals/GitHub.rst
210

Pre-commit hooks

Won't handle the update of the umbrella.

Post-commit hooks

Can't handle the update of the umbrella *because of GitHub*, this could be possible with our own hosting of git for instance.

rengolin added inline comments.Jul 18 2016, 3:55 PM
docs/Proposals/GitHub.rst
210

Pre-commit hooks are not designed to update the umbrella. Webhooks will be able to update the umbrella with a small external service, as proposed in the IRC.

lattner resigned from this revision.Jul 18 2016, 4:05 PM
lattner removed a reviewer: lattner.
lattner added a subscriber: lattner.

Please send this to llvm-dev for discussion when it converges, thanks!

delcypher added inline comments.
docs/Proposals/GitHub.rst
103

s/How will/What will/

137

@rengolin : I know GitHub enterprise has a "protected branch" feature to prevent force pushes ( https://github.com/blog/2051-protected-branches-and-required-status-checks ). You might want to speak to them to see if they can offer us that feature. I think there's limited support for doing a merge as a squash and rebase too ( https://github.com/blog/2141-squash-your-commits )

234

GitHub organizations support the notion of teams which can each have different permissions (for example you'll want to make sure only the right people have admin access and give the rest write/read access). You could also make a team per project so that write access in one project does not give write access to another. I think it would be good to decide on how teams will be organized and state this in the document.

mehdi_amini added inline comments.Jul 18 2016, 6:02 PM
docs/Proposals/GitHub.rst
210

That's why I asked for it to be clearly mentioned somewhere else that, on top of "hooks" hosted by GitHub, we will need to maintain our own webservice on our own server to answer updates from these hooks and update the umbrella, because that's a non-negligible drawback in the face of the motivation set out in the "Why move at all?" section.
Right now the document does not acknowledge that AFAICT.

jyknight added inline comments.
docs/Proposals/GitHub.rst
210

The maintenance of that service will be negligible compared to running a subversion installation.

I expect that we could set up the webhook as an AppEngine app, using Github's "Git Data" API to generate the new commit, and then to the first approximation never have to touch it again.
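As a rough idea of what such a service would send, the sketch below builds the request body for the "create a tree" call of GitHub's Git Data API, repointing one submodule of the umbrella. It is an assumption-laden illustration, not the proposed service: `submodule_tree_payload` is a hypothetical helper, the values are assumed to need no JSON escaping, and authentication and error handling are left out.

```shell
#!/bin/sh
# Sketch: print the JSON body for POST /repos/OWNER/REPO/git/trees that
# repoints one submodule entry of the umbrella repository.
# Hypothetical helper; arguments are assumed to be plain SHAs and paths.
submodule_tree_payload() {
    base_tree=$1   # SHA of the umbrella's current tree
    path=$2        # submodule path inside the umbrella, e.g. "llvm"
    commit=$3      # new commit SHA in the sub-project
    # Mode 160000 with type "commit" is how a submodule entry is expressed.
    printf '{"base_tree":"%s","tree":[{"path":"%s","mode":"160000","type":"commit","sha":"%s"}]}\n' \
        "$base_tree" "$path" "$commit"
}

# Remaining steps, not shown:
#   1. POST  .../git/commits          with the new tree SHA and the current head as parent
#   2. PATCH .../git/refs/heads/master with the new commit SHA
```

Because each update touches only one tree entry and one ref, the service itself needs no working copy of any repository.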

lattner removed a subscriber: lattner.Jul 18 2016, 7:51 PM

Mostly wording comments, thank you for writing this up!

docs/Proposals/GitHub.rst
79

nit: I see you use FREE in caps but this instance isn't *FREE* (as opposed to the first mention above) -- consider making it consistent? Either remove the emphasis (just "free") or emphasise consistently?

87

Did you mean "diversity and equality" instead of "diversity and quality" here?

111–113

Consider rewording this sentence -- it's a little too long and is trying to say too many things.

Perhaps something like:

"Each LLVM project will continue to be hosted as separate GitHub repositories under a single GitHub organisation. Users can continue to choose to use either SVN or Git to access the repositories to suit their current workflow."

186

Probably worth mentioning how Phabricator will need to be updated to integrate with the GitHub repository once the canonical repo is changed.

vsk added subscribers: friss, vsk.Jul 18 2016, 8:21 PM

@rengolin thanks for putting this together! I chimed in with some comments in-line.

docs/Proposals/GitHub.rst
69

What do you mean by multiple concurrent builds?

186

Yes, the llvmlab bisect functionality is affected. IMO it is invaluable for bug triage. Could you add some kind of reassurance that initially, updating it for the new VC model will just require pointing it to github's SVN view?

221

This is tricky. CC'ing @friss to comment on how we'd need to update our internal auto-merger.

234

I think this is an important discussion to have once the move is under-way. I don't think finer-grained write privileges should be a part of this proposal since it's (1) a separate issue and (2) not *just* an artifact of llvm-project's svn structure (i.e there are good reasons to keep the current permissions model in place).

friss added inline comments.Jul 18 2016, 8:49 PM
docs/Proposals/GitHub.rst
221

This is not an issue. We were already consuming the llvm.org git repos, so we'd just need to point to the new git repos (given the git history is preserved, I don't think I've read this anywhere in this document, but I think it's safe to make this assumption)

Thanks a lot for working on this!

Filipe

rengolin added inline comments.Jul 19 2016, 4:05 AM
docs/Proposals/GitHub.rst
69

With git worktree you can work on source code and build different things at the same time, but I guess this is a specific use case which is only made "easier" in git. I'll remove this extra comment.

79

good point.

87

indeed, fixed.

103

ok

111–113

Much better, thanks!

137

I'm asking about protected branches (it was proposed earlier, but I can't find any info on it).

But we don't want to squash people's commits. They can do that on their own before commit.

186

Adding llvmlab bisect and Phab to the list. Both should be trivial.

210

I'm adding a list of needed hooks to the "What changes" section.

234

Indeed. We want a flat permission model, in the same way we have today, with the exception of the umbrella project, which no one will have write access to.

If we decide to change things in the future, it'll be a completely different discussion.

rengolin updated this revision to Diff 64467.Jul 19 2016, 4:06 AM

More updates, following recent comments.

rengolin updated this revision to Diff 64468.Jul 19 2016, 4:08 AM

Formatting issues (bullet points)

rengolin updated this revision to Diff 64469.Jul 19 2016, 4:09 AM
jlebar added a subscriber: jlebar.EditedJul 19 2016, 5:37 PM

I'm sure you all have thought about this more than I have, and I apologize if this has been brought up before because I haven't been following the thread closely. But I am not convinced by this document that using subrepositories beats using a single git repo.

I see two reasons here for using subrepos as opposed to one big repository.

  1. Subrepos mirror our current scheme.
  2. Subrepos let people check out only the bits of llvm that they want.

I don't find either of these particularly compelling, compared to the advantages of one-big-repo (discussed below). Taking them in turn:

  1. Although subrepos would mirror our current scheme, it's going to be different *enough* that existing tools are going to have to change either way. In particular, the svn view of the master repository is not going to be useful for anything. I tried svn checkout https://github.com/chapuni/llvm-project-submodule, and the result was a 504 error. I have no idea how this is supposed to work, but I would be very surprised if you got a checkout that recursively contained the git submodules.
  2. It's true that subrepos let people check out only the bits that they want. But disk space and bandwidth are very cheap today, and LLVM is not as large as one might think. My copy of https://github.com/llvm-project/llvm-project, which includes *everything*, is 2.5G, whereas my copy of just llvm is 626M.

    Given that a release build of llvm and clang is ~3.5G, a 2.5G source checkout doesn't seem at all unreasonable to me.

    If it's really problematic, you can do a shallow checkout, which would take the contains-everything repo from 2.5G to 1.3G. Moreover if it's *really* a problem, you can mirror the subdir of llvm that you care about. Maybe the LLVM project could maintain said mirrors for some of the small subrepos that are often used independently.

So what's the advantage of using one big repository? The simple answer is: Have you ever *tried* using git submodules? :)

Submodules make everything more complicated. Here's an example that I hope proves the point. Suppose you want to commit your current work and switch to a new clean branch off head. You make some changes there, then come back to your current work. And let's assume that all of your changes are to clang only.

# Commit current work, switch to a clean branch off head, then switch back.

# One big repo: 
$ git commit  # on old-branch
$ git fetch
$ git checkout -b new-branch origin/master
# Hack hack hack...
$ git commit
$ git checkout old-branch

# Submodules, attempt 1:
$ cd clang
$ git commit  # on old-branch
$ git fetch
$ git checkout -b new-branch origin/master
# Also have to update llvm...
$ cd ../llvm
$ git fetch
$ git checkout origin/master
$ cd ../clang
# Hack hack hack
$ git commit

# Now we're ready to switch back to old-branch, but...it's not going to work.
# When we committed our old branch, we didn't save the state of our llvm
# checkout.  So in particular we don't know which revision to roll it back to.

# Let's try again.
# Submodules, attempt 2:
$ cd clang
$ git commit  # on old-branch
$ cd ..
$ git checkout -b old-branch # in master repo
$ git commit

# Now we have two branches called "old-branch": One in the master repo, and one
# in the clang submodule.  Now let's fetch head.

$ git fetch  # in master repo
$ git checkout -b new-branch origin/master
$ git submodule update
$ cd clang
$ git checkout -b new-branch
# Hack hack hack
$ git commit  # in submodule
$ cd ..
$ git commit  # in master repo

# Now we're ready to switch back.

$ git checkout old-branch  # in master repo
$ git submodule update

For those keeping track at home, this is 5 git commands with the big repo, and 15 commands (11 git commands) in the submodules world.

Above we assumed that all of our changes were only to clang. If we're making changes to both llvm and clang (say), the one-big-repo workflow remains identical, but the submodules workflow becomes even more complicated.

I'm sure people who are better at git than I can golf the above commands, but I'll suggest that I'm an above-average git user, so this is probably a lower-than-average estimate for the number of git commands (particularly git help :). git is hard enough as-is; using submodules like this is asking a lot.

Similarly, I'm sure much of this can be scripted, but...seriously? :)

Sorry for the wall of text. tl;dr: One big repo doesn't actually cost that much, and that cost is dwarfed by the cost to humans of using submodules as proposed.

FYI after talking to Chandler, I'm going to write up a separate proposal for the one-repository thing and send it to the list tomorrow. The suggestion was that this phabricator thread isn't the right place to have this discussion.

You will not be required to use submodules at all, as we'll all use the individual projects, like we have always been. I don't understand why people keep going back to it.

Having a single repository was part of the original proposal for years, and every time it was shot down as impractical.

Indeed, this is not the place for such discussion, but unless you can finish the discussion this week, I suggest you make your point clear in the survey instead of delaying the process.

I'm not pushing for *my* solution. This *has* been discussed already to exhaustion. The current agreement was that we'd do a survey on the proposal, and that's what we need to do. Anything else will just send us back to square one and I seriously don't have the stamina to keep going round in circles.

I.e., please try to be considerate.

You will not be required to use submodules at all, as we'll all use the individual projects, like we have always been. I don't understand why people keep going back to it.

There is a key use case that is not supported by the current setup. This is something that I -- and basically anyone who works on an llvm project other than llvm itself -- do every day. It will be supported with submodules, but badly. It is not supported at all today, and it will not be supported at all in the proposed world if I don't use submodules. Maybe the people who keep coming back to this have not explained this use-case clearly, so let me try.

The use-case is: Maintaining a branch which contains changes to clang (or your favorite subproject) and also is locked to a specific revision of llvm. That is, I can check out branch A or branch B of my changes to clang, and it automatically checks out a corresponding revision of llvm that is tied to the branch.

Again, we can make this work with submodules, but it's a giant pain, see my earlier comment. It works with zero effort if we have a monolithic repository. This would be 'uge for anyone who works on clang (or any other subproject that's not llvm itself) and uses branches.

Having a single repository was part of the original proposal for years, and every time it was shot down as impractical.

I've read as many of these as I can find in the past few hours, and every argument I have found is, in my evaluation, very likely overblown or incorrect. There is strong evidence for this, in the single git repository that already exists (and includes the test-suite, so is much larger than what I propose), and also in the fact that adding everything to a single git repository will not more than ~double the size of the llvm git repo. (I'll have better numbers tomorrow, don't quote me on that just yet.)

Moreover, the current setup of unrelated git repos can be *exactly duplicated* by making sparse checkouts of the monolithic repository. You can clone the big repo and then check out only the directories you want (so it's like the others never existed, beyond their presence in your .git packfiles). Or if you want to be able to check out different revisions of (say) clang and llvm, you can do that too: Clone the big repo and make two shallow working copies, one for llvm and the other for clang. You can even place the clang working copy in tools/clang of the llvm working copy, so it functions identically to the current setup in almost every way.
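The "check out only the directories you want" step can be sketched with Git's sparse checkout support. This is a minimal sketch under assumptions: `sparse_clone` is a hypothetical helper, the directory names passed to it are examples, and it uses the configuration-file mechanism; newer Git releases also provide a `git sparse-checkout` command for the same purpose.

```shell
#!/bin/sh
# Sketch: clone REPO into DIR, but populate only the listed top-level
# directories in the working tree. Hypothetical helper.
sparse_clone() {
    repo=$1
    dir=$2
    shift 2
    git clone -q --no-checkout "$repo" "$dir" || return 1
    (
        cd "$dir" || exit 1
        git config core.sparseCheckout true
        mkdir -p .git/info
        # One pattern per line; "/llvm/" matches the top-level llvm directory.
        for path in "$@"; do
            printf '/%s/\n' "$path"
        done > .git/info/sparse-checkout
        # Fill the index and working tree, honouring the patterns above.
        git read-tree -mu HEAD
    )
}
```

Calling `sparse_clone URL llvm-only llvm` would still download the full history into `.git`, but only `llvm/` appears on disk, which is the trade-off described in the paragraph above.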

The critical point is that it's trivial to use sparse checkouts to make the monolithic repository behave identically to separate repos. But it is impossible, as far as I'm aware, to make separate repos behave like a monolithic repository. So the monolithic repository is strictly more powerful.

Indeed, this is not the place for such discussion, but unless you can finish the discussion this week, I suggest you make your point clear in the survey instead of delaying the process.

The e-mail you sent out two days ago said two weeks. Can you give me a bit more than three days?

I'm not pushing for *my* solution. This *has* been discussed already to exhaustion. The current agreement was that we'd do a survey on the proposal, and that's what we need to do. Anything else will just send us back to square one and I seriously don't have the stamina to keep going round in circles.

I.e., please try to be considerate.

I am very grateful for the work that you're doing here. I have participated in efforts very similar to this one in the past, and I appreciate how difficult and taxing they can be, and also how frustrating it can be to see the perfect be made the enemy of the good. In fact I quit my last job in part over friction created by a botched move to git.

But. I would ask you to please give me a few days to work with the community to dig in to this specific question. If I am right, it will be a boon for all of us every time we type a command that starts with "git". And if I'm wrong, I'll buy you a well-deserved beer or three, and we'll forget it and move on.

Does that sound agreeable to you?

Again, we can make this work with submodules, but it's a giant pain, see my earlier comment.

(...)

I've read as many of these as I can find in the past few hours, and every argument I have found is, in my evaluation, very likely overblown or incorrect.

We've heard both sides making equal claims. People work differently.

The critical point is that it's trivial to use sparse checkouts to make the monolithic repository behave identically to separate repos. But it is impossible, as far as I'm aware, to make separate repos behave like a monolithic repository. So the monolithic repository is strictly more powerful.

LLVM is *not* a single project, but a large selection of smaller ones that *are* used independently by the *majority* of users. It may tax you more than others, but it will tax the majority less than today's solution.

This is not about finding the best possible way for everyone, since that's clearly impossible. This is about finding the least horrible solution for the majority.

The e-mail you sent out two days ago said two weeks. Can you give me a bit more than three days?

Moving to Git has been in discussion for at least 2 years.

This time round, my first email with a concrete proposal to migrate was on 2nd June. We have had 320+ emails on the subject since, and the overwhelming majority is in favour of moving to Git and a large part is *content* with sub-modules. Counter proposals were presented (including a monolithic repository) and were all shot down by the community (not me).

This is not the time to be second guessing ourselves. I'll be finishing this proposal this week and asking the foundation to put up a survey as soon as possible.

But. I would ask you to please give me a few days to work with the community to dig in to this specific question. If I am right, it will be a boon for all of us every time we type a command that starts with "git". And if I'm wrong, I'll buy you a well-deserved beer or three, and we'll forget it and move on.

A monolithic repository was proposed and discredited by the community. I can't vouch for it myself (in the interest of progress), but we *will* allow people to add comments on the survey. If there is sound opposition to sub-modules in the survey, and a solid proposal to use a monolithic repo instead, we'll go to the next cycle, in which case I'll politely step down and let other people take charge (whoever wants it).

All in all, we (for any definition of "we") are not going to force anyone to do anything they don't want. But as a community, we really should be thinking about the whole process, not just a single use case.

rengolin accepted this revision.Jul 20 2016, 6:15 AM
rengolin added a reviewer: rengolin.

I'm auto accepting this proposal, as it seems to have run its course.

The commit is r276097.

If anyone has any additional comment/suggestion, please submit a new review.

This revision is now accepted and ready to land.Jul 20 2016, 6:15 AM
rengolin closed this revision.Jul 20 2016, 6:15 AM

Hi, Renato.

Just to explain why I'm going to go forward with this RFC about a monolithic repository: From speaking with some top contributors on IRC, I have heard that they feel that the discussion of whether to move to git has been conflated with the discussion of how the git repository should be set up. So there is a sizable set of important individuals who don't feel that this question has been considered sufficiently.

I would love not to alienate you by asking this question, but I understand if I already have. If so, I sincerely apologize.

Have the alternatives to sub-modules and a monolithic repository been discussed?

Sub-modules have their disadvantages as described in the following blog post: https://codingkilledthecat.wordpress.com/2012/04/28/why-your-company-shouldnt-use-git-submodules/

In our org we're using the git-repo framework (https://code.google.com/p/git-repo/) to manage our project's multiple repositories (which include both llvm and clang) and it works very well. This is the same framework Google uses to maintain the Android project (200+ repos) internally, AFAIK.

IMHO, using git-repo is less restrictive than git sub-modules and also allows multiple, independent development flows.

Have the alternatives to sub-modules and a monolithic repository been discussed?

Hi, Vlad.

Please see the ongoing thread in llvm-dev, entitled "[RFC] One or many git repositories?".

Tools such as git-repo have been discussed briefly. I think the general feeling is that most of us (myself included) would rather not learn a new tool if there's a simpler alternative, such as a vanilla git workflow. But if you would like to discuss more, that thread is the right place.

-Justin

beanz added a subscriber: beanz.Jul 25 2016, 10:08 AM

@rengolin, thank you for putting this all together. It is very well thought out, and I really like the shape it took. I have a few minor nitpick comments inline.

Thanks,
-Chris

docs/Proposals/GitHub.rst
58

Not to be pedantic here, but do we actually /know/ that most LLVM developers use Git? I suspect that most do, but I don't think we actually have any metrics, so we should avoid using declarative language here. Maybe change 'most' to 'many' here and below.

87

In lists you should generally keep the same verb conjugations. Maybe rephrase this list to something more like:

... hosting developer meetings, sponsoring disadvantaged people... and fostering diversity and equality in our community.

I think the general feeling is that most of us (myself included) would rather not learn a new tool if there's a simpler alternative, such as a vanilla git workflow.

Generally you're right, however learning how to use git-repo is much simpler than managing the intricacies of git sub-modules (and Google's experience with Android is a clear example of it). This is IMHO, of course.

+ you get a good integration with Gerrit enabled services like Gerrithub (http://gerrithub.io/) and other CI systems (like Jenkins for example)

I'll try to bring this topic back in llvm-dev - thanks for the suggestion.

+ you get a good integration with Gerrit enabled services like Gerrithub (http://gerrithub.io/) and other CI systems (like Jenkins for example)

BTW, Buildbot also supports git-repo: http://docs.buildbot.net/latest/manual/cfg-buildsteps.html#repo

I think the general feeling is that most of us (myself included) would rather not learn a new tool if there's a simpler alternative, such as a vanilla git workflow.

Generally you're right, however learning how to use git-repo is much simpler than managing the intricacies of git sub-modules (and Google's experience with Android is a clear example of it). This is IMHO, of course.

Just to be clear, since it sounds like you haven't been following the llvm-dev discussion -- the alternative to git-repo is *not* submodules. I agree that submodules are awful. The alternative is a monolithic repository (monorepo) that contains, as a single git repository, the full history of all llvm subprojects. Similar to https://github.com/llvm-project/llvm-project.