This is an archive of the discontinued LLVM Phabricator instance.

Add commit message guidelines to developer policy
AbandonedPublic

Authored by rengolin on Mar 10 2015, 4:36 AM.

Details

Summary

These are just guidelines, but we need something on the web so we can point people when they deviate too much from our expected behaviour.

These seem to be the most generic guidelines I could find from all comments on all threads.

Diff Detail

Event Timeline

rengolin updated this revision to Diff 21564.Mar 10 2015, 4:36 AM
rengolin retitled this revision from to Add commit message guidelines to developer policy.
rengolin updated this object.
rengolin edited the test plan for this revision. (Show Details)
rengolin added reviewers: hfinkel, rnk, chandlerc, silvas.
rengolin set the repository for this revision to rL LLVM.
rengolin added subscribers: Unknown Object (MLST), Unknown Object (MLST), dblaikie and 6 others.
hfinkel added inline comments.Mar 10 2015, 6:09 AM
docs/DeveloperPolicy.rst
283

Good commit messages help in reconstructing the rationale for the state of the code, that's their most important purpose.

Saying 'git repositories' is a bit confusing, since we still primarily use svn. Maybe say 'giv/svn'?

294

Please don't say this. IMHO, we don't have a problem with commit messages being too long; and small code examples explaining what the commit is addressing can be really informative after the fact. We could say that, if the commit message is long, to please provide a summary paragraph at the top.

300

use that -> rely on this format

306

I'd remove this paragraph. I don't think we should commit to this. ;)

jroelofs added inline comments.Mar 10 2015, 7:22 AM
docs/DeveloperPolicy.rst
283

Should we put in a couple of pointers to "good" commit messages?

285

s/the optional author/optionally, the author/ ?

306

I do think it's worth mentioning some guideline on what to do about commits with the wrong message vs commits with messages that don't quite fit this style. I think there are cases where it does make sense to revert & re-apply, and others where it's not worth it.

hfinkel added inline comments.Mar 10 2015, 7:46 AM
docs/DeveloperPolicy.rst
306

Fair enough, but if someone were to commit something with a message of "stuff" or "working on it", etc. (which has happened in the past by accident), reverting might certainly be appropriate. We have also, IIRC, reverted in the past because of lack of proper attribution (although normally a follow-up to the commits list is sent to correct omissions). Maybe some statement such as:

For minor violations of this policy, the community normally favors reminding the contributor of this policy over reverting. Minor corrections and omissions can be handled by sending a reply to the commits mailing list.
rnk added inline comments.Mar 10 2015, 8:56 AM
docs/DeveloperPolicy.rst
294

Agreed, I think this is a representative good commit message of mine with an example:

Don't diagnose no-prototype callee-cleanup function definitions

We already have a warning on the call sites of code like this:
  void f() { }
  void g() { f(1, 2, 3); }
t.c:2:21: warning: too many arguments in call to 'f'

We can limit ourselves to diagnosing unprototyped forward declarations
of f to cut down on noise.
probinson added inline comments.Mar 10 2015, 11:54 AM
docs/DeveloperPolicy.rst
285

s/the optional author/(if different from the committer) the author./

Author should not be optional if it wasn't the committer. (And is superfluous if it was the committer.)

silvas requested changes to this revision.Mar 12 2015, 4:38 PM
silvas edited edge metadata.

I don't think there's consensus about any of this. In particular, it seems like Chandler's feedback in the thread was completely ignored (or not seen). I tend to agree with his points too.

docs/DeveloperPolicy.rst
299

Please use an exclamation mark like "Patch by John Doe!". For whatever reason this seems to be the way it is usually done. I also kind of like it because I think it expresses our excitement about committing the patch for a new contributor (new contributors are vital for community growth).

This revision now requires changes to proceed.Mar 12 2015, 4:38 PM
sanjoy added a subscriber: sanjoy.Mar 12 2015, 4:57 PM

I tend to add a prefix like [IRCE] or [SCEV] to the subject of my commit messages to quickly establish context. I've seen others do it too. Does it make sense to codify that practice as a suggestion in this document?

rengolin updated this revision to Diff 21912.Mar 13 2015, 3:39 AM
rengolin edited edge metadata.

Changing in response to comments.

docs/DeveloperPolicy.rst
299

I never used the exclamation mark and I personally find it childish. But if everyone agrees we should keep it, I'll change it to it.

I tend to add a prefix like [IRCE] or [SCEV] to the subject of my commit messages to quickly establish context. I've seen others do it too. Does it make sense to codify that practice as a suggestion in this document?

This is a good point, I'll add a comment to that effect.

rengolin updated this revision to Diff 21913.Mar 13 2015, 3:46 AM
rengolin edited edge metadata.
rengolin removed rL LLVM as the repository for this revision.

Adding paragraph about headline tag

dsanders added inline comments.Mar 13 2015, 4:46 AM
docs/DeveloperPolicy.rst
299

I've never used it either, it's not a natural way for me to write. I think this is a individual personality thing and I don't think we should be codifying that.

hfinkel added inline comments.Mar 13 2015, 6:09 AM
docs/DeveloperPolicy.rst
299

I agree. I normally put something else afterward. I think just saying:

Patch by <Name><Punctuation>.*

is the most we should do.

rsmith added a subscriber: rsmith.Mar 13 2015, 1:11 PM

The attribution requirement, complete with exclamation mark, is already our policy: http://llvm.org/docs/DeveloperPolicy.html#attribution-of-changes

I've heard of people grepping through commit logs looking for such commits based on this, so there are non-obvious costs to changing it.

MatzeB added a subscriber: MatzeB.Mar 13 2015, 5:50 PM

There's room for bikeshedding of the title tag: I personally prefer "Foobar:" instead of [Foobar] as it has less distracting special chars. Both styles are used often currently.

Can we please take this back to LLVMDev? There is clearly no consensus.

rengolin updated this revision to Diff 22003.Mar 15 2015, 1:33 PM

Changing the attribution section to not duplicate information. Following last round of consensus on the mailing list.

rengolin updated this revision to Diff 22006.Mar 15 2015, 2:20 PM

final version

rengolin accepted this revision.Mar 15 2015, 2:20 PM
rengolin added a reviewer: rengolin.
  • What is the commit message policy when reverting a change ?

Is it worth adding a mention of "Differential Revision: <URL>" as per http://llvm.org/docs/Phabricator.html#committing-a-change

hfinkel edited edge metadata.Mar 16 2015, 10:01 AM
  • What is the commit message policy when reverting a change ?

There seem to be two important issues that have come up repeatedly that are likely worth mentioning:

  1. Include a rationale.
  2. Include the svn revision number(s) [and, specifically, not the git-svn hash ids]
rnk accepted this revision.Mar 16 2015, 10:47 AM
rnk edited edge metadata.

lgtm

rsmith added inline comments.Mar 16 2015, 11:52 AM
docs/DeveloperPolicy.rst
320–322

If we want to officially support such processes, we should be very explicit as to exactly what format they're looking for. Presumably, it's a case-insensitive match for the string "patch by "?

rengolin added inline comments.Mar 16 2015, 12:22 PM
docs/DeveloperPolicy.rst
320–322

I'm not being specific about the process of identifying it for a very special reason: we don't need to.

Regular expressions can be quite powerful in detecting patters, and any current script can already deal with all the variations we already have, so there's no reason to be explicit on the format. However, names can be very hard to parse, especially international names or abbreviations, and trying to get a consensus on that would be impossible.

So we rely on common sense. The most used "automated" process right now is to "grep for someone's name on the git log". So what "patch by" string looks like is irrelevant, including capitalization, exclamation marks, or others.

rsmith added inline comments.Mar 16 2015, 1:01 PM
docs/DeveloperPolicy.rst
320–322

Yours is not the only use case. "grep for someone's name on the git log" doesn't let us find all patches authored by anyone who is not the committer, which is sometimes a useful thing to do. (It's also liable to fail if there are variations in how the name is spelled.)

If we want to allow automated processes to identify information in commit messages, we should precisely specify how such information should be formatted / detected. What's the point in having a policy on this if we don't actually make it precise enough to be useful?

rengolin added inline comments.Mar 16 2015, 1:18 PM
docs/DeveloperPolicy.rst
320–322

This is not a policy, it's a guideline, and the consensus was to make it vague on purpose. If people are really going to nit pick on every single detail, than maybe we should just scrap the whole thing and go back to what it was before, ie, no guideline.

We wrote this to help people write better commit messages, not to force them through a git pre-commit hook or anything of the sort.

Maybe I should just remove the note about automated process...

rsmith added inline comments.Mar 16 2015, 1:26 PM
docs/DeveloperPolicy.rst
320–322

The attribution part of this used to be policy rather than guidelines, and should remain that way. Perhaps it shouldn't have been moved to this list of guidelines?

rengolin added inline comments.Mar 16 2015, 1:36 PM
docs/DeveloperPolicy.rst
320–322

Sorry Richard, you're a bit late on the discussion. The "policy" was very vague and didn't reflect reality. You can see how I deleted the "policy" from below in this patch, because that was the consensus we reached.

The reason, as you can read back on this review or on the mailing list, was that there were already many variations of the attribution pattern and favouring one over others didn't make sense. Furthermore, the most used "automation" was to grep for a user's name, which makes any discussion on how we write "patch by" irrelevant.

The final consensus was that writing "patch by foo" would give people the meaning (since this is a guideline), but the first line means that you *should* attribute to a third party if you didn't write the patch. The existence is needed, the format is loose.

rengolin abandoned this revision.Mar 16 2015, 1:37 PM

Enough has been discussed and the patch has landed already. I'll abandon this review in order to get work done.