This is an archive of the discontinued LLVM Phabricator instance.

[clang-format] Keep comments aligned to macros
Needs ReviewPublic

Authored by mzeren-vmw on Jan 14 2018, 7:47 AM.

Details

Reviewers
krasimir
klimek
Summary

r312125, which introduced preprocessor indentation, shipped with a known
issue where "indentation of comments immediately before indented
preprocessor lines is toggled on each run". For example these two forms
toggle:

#ifndef HEADER_H
#define HEADER_H
#if 1
// comment
#   define A 0
#endif
#endif

#ifndef HEADER_H
#define HEADER_H
#if 1
   // comment
#   define A 0
#endif
#endif

This happens because we check vertical alignment against the "#" yet
indent to the level of the "define". This patch resolves this issue by
checking vertical alignment against the "define", and by tracking a
"LevelOffset" (0 or 1) in each AnnotatedLine to account for the
off-by-one indentation of preprocessor lines.

Diff Detail

Event Timeline

mzeren-vmw created this revision.Jan 14 2018, 7:47 AM

Just from a formatting point of view, why not:

//.   Comment
#.    define X

Just from a formatting point of view, why not:

//.   Comment
#.    define X

(assuming the '.'s are unintentional)
There is some logic in placing // in column 0, since we place # in column 0. However, we do not have examples of that style in our code base. We do have examples of aligning // above define:

   // Comment
#  define

Are you suggesting that c-f manage the indent after the // (or ///, etc.)? This seems more complex than managing the space between # and <directive>. I do want c-f to be able to re-indent aligned comments if an #if is inserted above.

krasimir added inline comments.Jan 18 2018, 7:24 AM
lib/Format/TokenAnnotator.cpp
1710

Please comment these.

1756

This feels a bit awkward: we're adding code that implicitly assumes the exact style the preprocessor directives and comments around them are handled. Maybe if this could become part of the level itself, it would feel less awkward.

lib/Format/TokenAnnotator.h
41

Is there a way to not introduce LevelOffset, but have it part of Level?

unittests/Format/FormatTest.cpp
2619

I would like to see test including multiline //-comment sections before, inside and after preprocessor directives as well as /**/-style comments.

Documented CommentAlignment enumerators. Documenting them suggested better
enumerator names.

Added tests for multi-line comments, block comments and trailing comments.

mzeren-vmw marked 2 inline comments as done.
mzeren-vmw added inline comments.
lib/Format/TokenAnnotator.cpp
1756

I agree that the "long distance coupling" is awkward. Perhaps the new enumerator names make this a bit more palatable?

lib/Format/TokenAnnotator.h
41

Level is an abstract indentation level, while LevelOffset is "columns". They have different units. Maybe it would be feasible to change the units of "Level" to columns in order to merge these two variables, but doing so would throw away information. It also seems like a much larger change. We could create a composite type class AnnotatedLevel { private: unsigned Level, unsigned Offset public: <strongly typed operations> } but that seems over-engineered. Any other ideas?

krasimir added inline comments.Jan 22 2018, 9:11 AM
lib/Format/TokenAnnotator.cpp
1725

Why are we checking NextNonComment.Level > 0 here? We could be aligned with the next preprocessor directive even at level 0.

1729

I think this should be enabled only if preprocessor indentation has been enabled.

1756

Could you add a comment that this 1 comes from the # in the preprocessor directive?

unittests/Format/FormatTest.cpp
2629

I would expect this comment to be aligned with the #endif on the next line.

2657

Same here.

While I agree that there is probably a bug to fix, I don't (yet) agree with what is proposed in this patch. I think a comment in between preprocessor directives should always either:

  • Be considered part of the code in between the #-lines
  • Be considered to be commenting on the subsequent #-line

In the former case, we need to indent with the regular IndentWidth, completely irrespective of anything inside the preprocessor lines. In the latter case, we should align with the # in column 0. To me, aligning with the define seems fundamentally wrong.

mzeren-vmw added a comment.EditedJan 22 2018, 4:12 PM

To me, aligning with the define seems fundamentally wrong.

we definitely have code that does that internally. It can also be seen in the wild e.g.:
https://github.com/boostorg/config/blob/develop/include/boost/config/detail/posix_features.hpp
However, it seems reasonable that clang-format's "default" be alignment with #. That will be a simpler patch, and it will resolve the toggling behavior. Let me work that up in a separate review.