Page MenuHomePhabricator

[clang-format] Fix ignoring JavaScriptWrapImport when ColumnWidth: 0
Needs ReviewPublic

Authored by andmis on Jan 4 2022, 6:19 PM.

Details

Summary

Prior to this patch, when ColumnWidth: 0, all JS imports of the form import {aaa, bbb, ccc} from "def"; would be formatted as multi-line, and the JavaScriptWrapImport option had no effect (in most cases). This patch makes it so that when ColumnWidth: 0, the JavaScriptWrapImport option causes multi-line imports, for example

import {
  aaa,
  bbb,
  ccc
} from "abc";

to be left alone. If JavaScriptWrapImport: false, then imports of this form are unwrapped to a single line in all cases, regardless of ColumnWidth. The behavior for ColumnWidth > 0 is unchanged.

Fixes: https://github.com/llvm/llvm-project/issues/52935.

Diff Detail

Unit TestsFailed

TimeTest
80 msx64 debian > LLVM.Bindings/Go::go.test
Script: -- : 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/llvm-go go=/usr/bin/go test llvm.org/llvm/bindings/go/llvm

Event Timeline

andmis requested review of this revision.Jan 4 2022, 6:19 PM
andmis created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptJan 4 2022, 6:19 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
andmis edited the summary of this revision. (Show Details)Jan 4 2022, 6:20 PM

Thank you for the patch, could you just handle the documentation, but otherwise pretty good

clang/docs/ClangFormatStyleOptions.rst
2820

This file is generated from Format.h you need to make the changes there and regenerate using clang/docs/tools/dump_format_style.py

clang/unittests/Format/FormatTestJS.cpp
1946

you are no longer testing the LLVM Case, please don't remove that, but feel free to ensure they are doing the same for google!

MyDeveloperDay retitled this revision from [ClangFormat] Fix ignoring JavaScriptWrapImport when ColumnWidth: 0 to [clang-format] Fix ignoring JavaScriptWrapImport when ColumnWidth: 0.
curdeius requested changes to this revision.Jan 5 2022, 12:37 AM

Thanks for working on this!
Apart from other reviewer's comments, it looks pretty much ok, but the tests are a bit messy IMO.
I'd like to see something that tests 4 cases (false + no column limit, false + column limit, true + no column limit, true + column limit). In each case you should have the same imports formatted (possibly) differently.
It would be the best to have at least: single short import, single long import, multiple short import, multiple long import. And please keep what's currently tested.

clang/docs/ClangFormatStyleOptions.rst
2842–2856

What does "Original" mean here? It's unclear whether the option is true or false nor what the ColumnWidth is.

I think you want to have 3 examples:

  1. JavaScriptWrapImports: false, independent of ColumnWidth
  2. JavaScriptWrapImports: true, ColumnWidth: 0
  3. JavaScriptWrapImports: true, ColumnWidth: non-zero

In each case you want to have short and long import lines.

clang/unittests/Format/FormatTestJS.cpp
2027

It's already true, cf. line 1977. Remove this line.

This revision now requires changes to proceed.Jan 5 2022, 12:37 AM
MyDeveloperDay added inline comments.Jan 5 2022, 12:48 AM
clang/unittests/Format/FormatTestJS.cpp
1946

Its kind of our golden rule, don't change existing tests, subtle changes can have huge implications on large code bases

mprobst added inline comments.Jan 5 2022, 4:18 AM
clang/docs/ClangFormatStyleOptions.rst
2823–2838

this seems odd to me: my understanding is that clang-format always reflows the entire document, there's no logic to ever "keep" whitespace.

Are you sure you are seeing this behaviour? The logic change below sounds more as if the ColumnWidth: 0, import lines might not break (mustBreak returns false), but might still break?

clang/unittests/Format/FormatTestJS.cpp
2026

I'm not sure this test case really repros the doc changes you made (keeps any line wraps with ColW: 0).

I think if you wanted to demonstrate that, you'd need to add a test case where clang-format would normally make a change, and show that it does not with ColW: 0.

However I think that's not feasible: by design, clang-format cannot not make a change, it always reformats all code. You might need to rethink the intent here.

2027

FWIW, I think the tests might be more readable if each configuration ({true, false} x {col: 0, col: 40}) explicitly set all the options, even if it's a bit redundant.

clang/docs/ClangFormatStyleOptions.rst
2823–2838

From what I can tell with ColumnLimit (and not ColumnWidth) 0, then clang-format does many things differently. If by design or by accident I'm not sure. I can see that often, since I format my code with a limit, but my tests without.

andmis updated this revision to Diff 397579.Jan 5 2022, 7:54 AM
  • Move source code for option documentation to Format.h, from which the RST is autogenerated.
  • Clean up tests in response to review feedback.
andmis added a comment.Jan 5 2022, 7:54 AM

Thanks all for the reviews. I've updated the patch and added responses to your comments.

clang/docs/ClangFormatStyleOptions.rst
2820

Alright. I've added the source to Format.h and run dump_format_style.py.

2823–2838

I just double-checked and yes, as-implemented, with ColumnLimit: 0 and JavaScriptWrapImports: true, import statements retain the number of lines they started with.

It is explicit in the code that with ColumnLimit: 0, we decide whether to break on a token based on whether there was a preexisting line break. (Check out NoColumnLimitLineFormatter.) I think you're right that usually clang-format reflows everything and I was also confused about this point.

There are other options that do similar things, for example EmptyLineAfterAccessModifier.

2842–2856

"Original" means "Before running clang-format", I've changed the comment to the latter.

The reason this is here is that with ColumnLimit: 0, the output format depends on the input, and I don't see how to make that clear without having a before and after view.

clang/unittests/Format/FormatTestJS.cpp
1946

Thanks, this was an oversight on my part.

2026

I think the tests are doing what's claimed. I've rearranged and rewritten them, PTAL.

2027

I've reformatted them in two big blocks: one where ColumnLimit > 0 and one where ColumnLimit: 0. So there are no more redundant option set statements.

This is the most-readable way to format the tests IMO because when ColumnLimit > 0, we can use the simple verifyFormat (where you just pass the expected result and it messes it up, formats, and checks) because clang-format will do a full reflow. But for the ColumnLimit: 0 tests, we need to use the other verifyFormat (where we have input and expected output explicitly), and it's convenient to put a comment explaining what's happening at the top of the ColumnLimit: 0 block.

mprobst added inline comments.Jan 5 2022, 7:57 AM
clang/docs/ClangFormatStyleOptions.rst
2823–2838

Can you check with code that would normally cause reformatting? E.g.

import {
  A, B,
  C,
} from 'url';
andmis added inline comments.Jan 5 2022, 8:50 AM
clang/docs/ClangFormatStyleOptions.rst
2823–2838

Sigh, you are right: your snippet gets formatted to

import {
  A, B, C,
} from 'url';

I had tried

import {
  A,
  B, C,
} from 'url';

which does indeed get left alone.

It seems that the implementation logic of the NoColumnLimitLineFormatter just doesn't play well with certain other clang-format features. In this case, the NoColumnLimitLineFormatter decides whether to allow line breaks based on whether a given token had a newline preceding it in the original input, but tokens also get reordered by SortIncludes so being preceded by a newline doesn't mean what NoColumnLimitLineFormatter thinks it does. For example, in my implementation, these two get formatted differently:

import {
  A,
  B, C,
} from 'url';

import {
  A,
  C, B,
} from 'url';

This example also shows that it is totally unclear how to make sense of both (1) keeping import lines "in the original order" and at the same time (2) sorting the actual imported symbols. So I guess we might need to rethink the intent.

It seems there are two clean options if ColumnLimit: 0:

  • Make JavaScriptWrapImports: true *always* wrap imports to multiple lines. This will be noisy and ugly.
  • Make JavaScriptWrapImports have no effect if ColumnLimit: 0, and *always* unwrap import lines to a single line if ColumnLimit: 0.

In either case, a user who wants ColumnLimit: 0 and "leave my import line wrapping as I have it" (likely to be a common case for people with ColumnLimit: 0, IMO) will not be able to achieve what they want. In the much-more-common ColumnLimit > 0 case, we use the ColumnLimit to decide whether to wrap, but we can't do that here.

Note: before this patch, the option JavaScriptWrapImports is ignored with ColumnLimit: 0, and imports, even short ones, are formatted as:

import {aaa,
        bbb,
        ccc} from "def";

To summarize, it seems like JavaScript formatting with ColumnLimit: 0 is broken for JS ATM, because the flag JavaScriptWrapImports is basically ignored by the code in this case, and imports are always wrapped to multiple lines, even short ones, which is pretty ugly and noisy. This, by the way, suggests that very few people use ColumnLimit: 0 for JS. The two simple resolution options I quoted above are straightforward, but completely miss what I expect would be the most-common use case with ColumnLimit: 0: a user who wants to manually control their line wrapping. The current patch handles what's likely to be the most-common version of this use case (all imports on one line, or every import on its own line), but is broken for some edge cases in a way that is conceptually difficult to see a solution to (due to interaction with SortIncludes).

So I'm a bit unsure how to proceed. This patch seems like an improvement since it fixes the very-common case ColumnLimit: 0 and JavaScriptWrapImports: false. So IMO it would be ok to merge it.

What should we do in the case ColumnLimit: 0 and JavaScriptWrapImports: true? For a given import line, we could check whether it started out multi-line, and if so, we break at every imported symbol, otherwise we leave the import line as a single line. But this cascades into further issues: is the first symbol on the same line as the opening curly? Do we try to infer this from the original layout? Or add another flag? Or just impose a policy?

Make JavaScriptWrapImports: true *always* wrap imports to multiple lines. This will be noisy and ugly.

Isn't this what prettier does when effectively the ColumnLimit is exceeded.

i.e.

import { Controller, Get, Post, Req } from '@nestjs/common';

becomes as I hit the 80 column mark

`
import {
  Controller,
  Get,
  Post,
  Req,
  Request,
  Param,
  Query,
  StreamableFile,
  Body,
} from '@nestjs/common';

So if ColumnLimit is 0 it should wrap them shouldn't it if JavaScriptWrapImports: true

That's what happens when you hit the column limit, when there is a column
limit. But do we really want every one-symbol import to wrap to 3 lines
when ColumnLimit: 0? Slash to force the user to unwrap every import, even
20-symbol 300-column imports, to a single line?

I would say for the ColumnLimit:0 case, we don't have to wrap a single import like this: for JavaScriptWrapImports :true

import {
   Get
}  from '@nestjs/common';

For more than one import then I'd say it should do:

import {
   Get,
   Req
}  from '@nestjs/common';

What tells me that is the "s" at the end of "Imports" of JavaScriptWrapImports meaning more than one. i.e. we don't wrap on 1 but we do on 2 and above

For ColumnLimit: 0 and JavaScriptWrapImports: false then I think that means we shouldn't touch the imports at all.

This is becuase in this case we don't really want "true/false" we want "Leave/Always/Never", I don't think we should assume false means Never and hence its time to disambiguate.

This is part of our natural lifecycle of all options: (which tend to follow this patter)

  1. they start as booleans
  2. they become enums
  3. they become structs

It probably means we've got to the point where they should be changed to an enum so we don't have to guess what all our 100,000's of users will want but give them the power to do what they need.

andmis added a comment.EditedJan 6 2022, 4:57 AM

Thanks for the feedback. Two things:

  1. Force-breaking at >= 2 imports and not breaking at 1 import has the advantage of being simple to state, implement, document, and test, but I don't think it's actually the behavior people will want. For example, the original bug reporter apparently thought about it too and apparently came up with the same semantics I tried to implement as his idea of "the right thing": https://github.com/llvm/llvm-project/issues/52935.
  2. The issue isn't whether we use an enum or a boolean. The issue is that it is not clear what the semantics should be of "keep input file's breaking decisions". If SortIncludes: true, what should "keep" do here?
import {
  A,
  B, C,
} from 'url';

import {
  A,
  C, B,
} from 'url';

In my opinion, this could just be merged, pending a larger revisit: (1) very few people use ColumnLimit: 0, at least with JS code, else we would hear more complaints, since import wrapping is currently really not good in this case, (2) this patch applies the correct fix in the case ColumnLimit: 0, JavaScriptWrapImports: false, (3) this patch admittedly does not completely solve ColumnLimit: 0, JavaScriptWrapImports: true, but is not a regression there, either.

(1) very few people use... I don't think it's actually the behavior people will want....

Subjective or Objective opinion?

https://github.com/search?o=desc&q=%22ColumnLimit%3A+0%22&s=indexed&type=Code

95,000+ occurrences of "ColumnLimit" in github YAML files and multiple ColumnLimit: 0 even on the first page and https://www.hyrumslaw.com/ suggests we shouldn't make that kind of assumption

From my recollection @mprobst is the original author of this and they are on this review as a reviewer so I tend to bow to their better judgement. its possible we never thought about the ColumnLimit: 0 case originally.

curdeius added inline comments.Jan 6 2022, 6:21 AM
clang/include/clang/Format/Format.h
2620

Please change all occurrences of ColumnWidth (old name) to ColumnLimit.

clang/unittests/Format/FormatTestJS.cpp
2027

It's ok for me too, but I'd like to either put all the involved options each time, or minimalistic changes, but not a mix of the two.

andmis added a comment.Jan 6 2022, 6:28 AM

My guess that ColumnLimit: 0 is rarely used for JS is based on the objective fact that JS import formatting is (IMO very) buggy with the column limit set that way, and it took several years for us to hear a bug report about it. And "we should not make assumptions about what people will want" applies just as well to the proposal of force-wrapping at >= 2 imports.

I agree that AlwaysWrap/NeverWrap/LeaveExistingWraps is clearer and better than true/false. But we still have no answer for the semantics of LeaveExistingWraps in edge cases. Are we saying that we should not merge this bugfix without figuring out the answer to the semantics question and/or changing the option to an enum?

applies just as well to the proposal of force-wrapping at >= 2 imports

Absolutely.. but it justifies that this option has got to the point that its no longer one thing or the other based on our personal subjective opinions... now we need to break down what we need and make this a better feature

otherwise we just flip-flop between different groups of users calling it a regression. The ideal would be that whatever options we add, the default would give us exactly as we have today, almost warts and all. but the options would allow us to give all the possibilities we think people might want, or leave us room to expand into those options (moving to an enum normally helps that)

i.e.

JavaScriptWrapImports: Never (false)
JavaScriptWrapImports: Multiple (true)  // meaning more than 1
JavaScriptWrapImports: Always

Therefore wouldn't the current behaviour for ColumnLimit: 0 be JavaScriptWrapImports: Always? (or Never if you want them all on one line)

andmis added a comment.Jan 7 2022, 8:24 AM

The current behavior when ColumnLimit: 0 and JavaScriptWrapImports: false formats this:

import {aaa} from "abc";
import {aaa, bbb, ccc} from "def";
import {aaa, bbb} from "defghi";
import {aaa, looooooooooooooooooooong, ccc,} from "ghi";
import {
  aaa,
  ccc,
  bbb
} from "jkl";

to this:

import {aaa} from "abc";
import {aaa,
        bbb,
        ccc} from "def";
import {aaa,
        bbb} from "defghi";
import {aaa,
        ccc,
        looooooooooooooooooooong,} from "ghi";
import {
  aaa,
  bbb,
  ccc
} from "jkl";
stasm added a subscriber: stasm.Jan 8 2022, 8:50 AM

The reporter of issue 52935 here. Thanks, @andmis, for your work. Thinking about the ColumnLimit: 0 and JavaScriptWrapImports: false case, it seems that there are two issues in the current implementation that could be solved separately.

  1. Single-line imports get force-wrapped despite JavaScriptWrapImports: false. This seems to be a clear bug in clang-format. The expected behavior in this case should be to not touch the import line at all. Instead, the current behavior is the following:
import {aaa, bbb, ccc} from "def";
import {aaa,
        bbb,
        ccc} from "def";
  1. It's not clear what JavaScriptWrapImports: false should do to multiline imports when ColumnLimit: 0. Should it
    • force-unwrap to a single line, or
    • leave the import as-is (i.e. not force-wrap it)?

      Since the expected behavior is not clear there might indeed be different groups of users expecting one behavior or the other. To reduce the ambiguity an enum option like the one proposed by @MyDeveloperDay would be helpful.

I'd personally would love to see both of these issues addressed (and I'd be a happy user of JavaScriptWrapImports: Never if it's available), but just fixing the first bug would go a long way in making ColumnLimit: 0 a viable setting for JavaScript for me.

please mark your review comments as done when done so we know its a complete review