This is an archive of the discontinued LLVM Phabricator instance.

clang-format: Do not binpack initialization lists
ClosedPublic

Authored by Typz on Jun 15 2017, 7:15 AM.

Details

Summary

This patch tries to avoid binpacking when initializing lists/arrays, to allow things like:

static int types[] = {
    registerType1(),
    registerType2(),
    registerType3(),
};
std::map<int, std::string> x = {
    { 0, "foo fjakfjaklf kljj" },
    { 1, "bar fjakfjaklf kljj" },
    { 2, "stuff fjakfjaklf kljj" },
};

This is similar to how dictionnaries are formatted, and actually corresponds to the same conditions: when initializing a container (and not just 'calling' a constructor).

Such formatting involves 2 things:

  • Line breaks around the content of the block. This can be forced by adding a comma or comment after the last element
  • Elements should not be binpacked

This patch considers the block is an initializer list if it either ends with a comma, or follows an assignment, which seems to provide a sensible approximation.

Diff Detail

Repository
rL LLVM

Event Timeline

Typz created this revision.Jun 15 2017, 7:15 AM
Typz added a comment.EditedJun 15 2017, 7:30 AM

This patch is probably not complete, though it works fine in all situations I could think of: nested initializers, "short" statement (properly merged), column layout is still performed when needed...

static int types[] = {
    SourcePrivate::registerTypes(),
    registerEnum<MediaItemType>(),
    registerEnum<MediaItemFlag>(),
};
static int types[] = {registerTypes(),
                      registerEnum<MediaItemType>(),
                      registerEnum<MediaItemFlag>()};
static int types[]{
    SourcePrivate::registerTypes(),
    registerEnum<MediaItemType>(),
    registerEnum<MediaItemFlag>(),
};
static int types[]{SourcePrivate::registerTypes(),
                   registerEnum<MediaItemType>(),
                   registerEnum<MediaItemFlag>()};
static int types[]{0, 1, 2};
static int types[] = {0, 1, 2};
static int types[] = {aaaaaaaaaaa,  bbbbbbbbbb,   cccccccccccc,
                      dddddddddddd, eeeeeeeeeee,  fffffffffff,
                      gggggggggg,   hhhhhhhhhhhh, iiiiiiiiii};
static int types[] = {
    aaaaaaaaaaa, bbbbbbbbbb, cccccccccccc, dddddddddddd, eeeeeeeeeee,
    fffffffffff, gggggggggg, hhhhhhhhhhhh, iiiiiiiiii,
};
std::map<int, std::string> x = {
    {0, "foo fjakfjaklf kljj"},
    {1, "bar fjakfjaklf kljj"},
    {2, "stuff fjakfjaklf kljj"},
};

I have seen 2 problems so far, but they do not seem to be related:

  • indent is performed using continuationIndentWidth, while I think IndentWidth may be more appropriate
  • When force-wrapping due to comma, all the content may be merged afterwards, leading to some strange layout. In this case, I think it should either not merge the content at all, or merge the braces as well:
static int types[] = {
    0, 1, 2
};

But I would need some feedback:

  • Would this be an interesting improvement?
  • Should this come with an option, or should this be the new behavior?
  • Any case in particular you think will break?
  • Technically, is this the right approach, or should I introduce a new TokenType and try to determine it in TokenAnnotator? This may help factorize some code from TokenAnnotator (which adds the forced line breaks) and ContinuationIndenter (which prevents BinPacking) and possibly UnwrappedLineFormatter (to fix the issue where the items get merged but not the braces)
  • Should I also update TokenAnalyser to force wrap around the braces after assignment, like what happens when there is a comma at the end?
lib/Format/ContinuationIndenter.cpp
1007 ↗(On Diff #102670)

I don't really understand what this "TT_ArrayInitializerLSquare" case is...

Typz edited the summary of this revision. (Show Details)Jun 15 2017, 7:31 AM

Some people write

auto x = std::map<int, std::string>{
    { 0, "foo fjakfjaklf kljj" },
    { 1, "bar fjakfjaklf kljj" },
    { 2, "stuff fjakfjaklf kljj" },
};
Typz added a comment.Jun 15 2017, 7:44 AM

Some people write

auto x = std::map<int, std::string>{
    { 0, "foo fjakfjaklf kljj" },
    { 1, "bar fjakfjaklf kljj" },
    { 2, "stuff fjakfjaklf kljj" },
};

This case (and other way std::map<> x =) actually already worked, as long it ends with a comma: I suppose the braces at the beginning of each item already caused a line-break.

djasper edited edge metadata.Jun 15 2017, 8:12 AM

I am fine not bin-packing when the last element has a trailing comma. But lets not special case assignments.

Typz updated this revision to Diff 102672.Jun 15 2017, 8:13 AM

fix unit tests

Typz updated this revision to Diff 102808.Jun 16 2017, 5:26 AM
Typz marked an inline comment as done.

remove special case after assignment

Typz updated this revision to Diff 102809.Jun 16 2017, 5:27 AM

fix indent

Typz updated this revision to Diff 103056.EditedJun 19 2017, 8:51 AM

Fix case where the content fits on a line, by wrapping after each comma, like this:

static int types[] = {
        0,
        1,
        2,
};
Typz updated this revision to Diff 103057.Jun 19 2017, 9:04 AM

Fix indentation

djasper accepted this revision.Jun 25 2017, 11:09 PM

Looks good. Thank you!

This revision is now accepted and ready to land.Jun 25 2017, 11:09 PM
This revision was automatically updated to reflect the committed changes.