Page MenuHomePhabricator

[clang-format] Prevent Clang-Format from editing leading whitespace on lines outside of the format range
AcceptedPublic

Authored by russellmcc on Nov 25 2018, 2:32 PM.

Details

Summary

When clang-format is set to always use tabs (with tab width 4), when asked to format line 3 (using the -lines=3:3 flag):

int f() {
    int a;
    foobar();
    int b;
}

We would expect clang-format to not modify the leading whitespace on line 4. However, it does - see the new test. This is because the whitespace manager is called to replace the whitespace before the first token of line 4. This is necessary to edit the number of new lines before line 4, and to edit the trailing whitespace on line 3. I've added a flag to replaceWhitespace that allows it to not edit the leading whitespace, and only edit the whitespace up to the last newline.

We ran into this when trying to integrate clang-format with a line filter into our CI to ensure no new formatting diffs were introduced in a patch. With the buggy behavior without this patch, the number of affected lines grew each time clang-format was run with a line filter, so running clang-format locally on the changed lines was not enough to ensure that CI would pass.

Diff Detail

Event Timeline

russellmcc created this revision.Nov 25 2018, 2:32 PM

Bump! Thanks for your consideration.

JonasToth retitled this revision from Prevent Clang-Format from editing leading whitespace on lines outside of the format range to [clang-format] Prevent Clang-Format from editing leading whitespace on lines outside of the format range.Dec 3 2018, 10:17 AM
JonasToth added reviewers: klimek, krasimir, ioeric.
JonasToth set the repository for this revision to rC Clang.
JonasToth added a project: Restricted Project.

Hi russell

thank you for the patch! As I am not a clang-format reviewers these are only general things, and Nits anyway ;)
Hope the reviewers I added will evaluate better.

lib/Format/WhitespaceManager.cpp
54–65

is this a SourceRange? Please spell out the type, as there are different kind of ranges.

unittests/Format/FormatTest.cpp
9159

Spurious change.

Respond to feedback

russellmcc marked 2 inline comments as done.Jan 14 2019, 8:14 AM

Thanks for the feedback!

Thanks for the feedback! This actually isn't a new formatting option, rather it's fixing a bug where clang-format would change lines outside of the line range asked for by the user. This was preventing us from using clang-format in an automated setting.

Bump! Thanks for the consideration

Bump! Thanks for the consideration! Again, this is preventing us from rolling out automated style checking on diffs at work.

Bump! Still waiting feedback on this.

MyDeveloperDay accepted this revision.Mar 15 2019, 12:59 AM
MyDeveloperDay added a subscriber: MyDeveloperDay.

From what I can tell this looks OK as you seem to be mainly only passing down the boolean, my guess is that while most people don't use the -lines directly that often but it probably gets called by some of the formatting tools that work over selections, maybe something like git-clang-format

Assuming you have tested some of those usescase and run the FormatTests, then we should be able to apply the "Beyonce rule" for everything else. (I've actually run the tests with your patch and they are fine)

If that is the case then this LGTM (but I'm not the code owner, just someone wanting to help with these stranded reviews)

I've landed this patch in my fork (which contains list of desired patches which seem to never get landed here..)

https://github.com/mydeveloperday/clang-experimental

IMHO @russellmcc clang-format needs people like you who understand the internals, to help move along some of the reviews that just get stuck in no mans land.

This revision is now accepted and ready to land.Mar 15 2019, 12:59 AM

I don't understand yet why running clang-format locally would not do the same change?

Sorry for the ambiguity. I meant, "running clang-format on the changed lines locally was not enough". I'll edit the description to clarify.

To go in more detail, let's imagine you changed line 3. You run clang-format locally on line 3, but because of the bug fixed in the patch, this affects lines 3 and 4. Then, when you commit and push to CI, CI will see diffs on lines 3 and 4. CI will then try to ensure that running clang-format on line range 3-4 will produce no diffs. However, due to the same bug will produce a diff on line 5, so your code will be rejected by the CI.

The reason our CI works like this is we want to ensure that no _new_ formatting errors are introduced into our code. We're extensive users of git blame and we don't want to do a full format pass on our whole codebase. Clang-format documents its line filter option, so in my opinion, it should work correctly.

russellmcc edited the summary of this revision. (Show Details)Mar 21 2019, 3:01 PM

@MyDeveloperDay Thanks for the approve! Yes, this patch has been working for us as we've been using it on an internal fork since I opened the patch. As I mentioned, without this patch clang-format is useless for us in CI.

I just verified why this doesn't break for our CI:
The trick is that if the indent is actually off (as opposed to being correct, but tab vs spaces), we do re-indent until we find an indent that's correct.
The fix that would make me happier, I think, is to do the same thing here:
Basically, when we expand the range until nothing changes, do not only check whether the indent is correct, but also whether we'd change the spaces in the indent.
WDYT?

klimek: I'm sorry, I don't fully understand your proposed fix. Could you explain it in more detail?

Without being able to respond to your proposal in detail, I strongly believe if you pass in a line range to clang-format, it should not change lines outside that range OR we should modify the documentation to clearly explain what the line filter option does.

klimek added a comment.Apr 9 2019, 8:36 AM

klimek: I'm sorry, I don't fully understand your proposed fix. Could you explain it in more detail?

Without being able to respond to your proposal in detail, I strongly believe if you pass in a line range to clang-format, it should not change lines outside that range OR we should modify the documentation to clearly explain what the line filter option does.

That is a good point - generally, the problem is that one use case for clang-format is to give it some point in an expression, and have it fix that expression and currently "things around it that are off". I'm not saying that's the right solution, but if we want to change that it's a significant change to clang-format's workflow, so I think we'll need to carefully think about the effect this has, or provide reasons why we think this will not change those use cases.

For example, if I put {} around a range I typically use clang-format in some position in that range to fix up the whole range for me. This is a feature. I'm open to hearing more opinions, though.

russellmcc added a comment.EditedApr 11 2019, 8:25 AM

Thanks for the explanation! I do understand your philosophy on this, and agree with the desired behavior case you brought up where you have put in new braces.

After thinking about this more, the thing I really care about is that clang-format is idempotent with a line filter - i.e., running it twice should always have the same effect as running it once, with a line-filter including the new changes.

So, either this fix, or your proposed fix of fixing all lines until the next correct indentation would meet that idempotence criteria.

However, I think in this particular case I still prefer my fix - to me, line filter is meant to limit the effect of clang-format to just fix a particular change and the fallout from that. However, if the lines _after_ a change were wrong before, this feels very unrelated to the change that was made - why is now the time to fix these unrelated lines?

Thanks for the explanation! I do understand your philosophy on this, and agree with the desired behavior case you brought up where you have put in new braces.

After thinking about this more, the thing I really care about is that clang-format is idempotent with a line filter - i.e., running it twice should always have the same effect as running it once.

So, either this fix, or your proposed fix of fixing all lines until the next correct indentation would meet that idempotence criteria.

However, I think in this particular case I still prefer my fix - to me, line filter is meant to limit the effect of clang-format to just fix a particular change and the fallout from that. However, if the lines _after_ a change were wrong before, this feels very unrelated to the change that was made - why is now the time to fix these unrelated lines?

I agree and would be happy with the change if it would only change the line-filtered workflow, but this afaict (unless I'm missing something :) will also affect the workflow where the provided range is 0-length range, which has an implicit "format stuff around this" request from the user inside it. I'd be happy with a patch that differentiates these two sides.

I agree and would be happy with the change if it would only change the line-filtered workflow, but this afaict (unless I'm missing something :) will also affect the workflow where the provided range is 0-length range, which has an implicit "format stuff around this" request from the user inside it. I'd be happy with a patch that differentiates these two sides.

This use case needs capturing in a unit test..it isn't obvious

I agree and would be happy with the change if it would only change the line-filtered workflow, but this afaict (unless I'm missing something :) will also affect the workflow where the provided range is 0-length range, which has an implicit "format stuff around this" request from the user inside it. I'd be happy with a patch that differentiates these two sides.

This use case needs capturing in a unit test..it isn't obvious

Agreed!

Sorry, coming back to this - line-filters are _inclusive_, so how do you indicate a 0-length range from the command line?

Hey, still motivated to land this, but definitely don't want to break any existing workflows.

In particular,

but this afaict (unless I'm missing something :) will also affect the workflow where the provided range is 0-length range

I'm curious how to do this - as I mentioned, I know how to trigger a format on a 1-length range, but I don't understand how to trigger a 0-length range.