Page MenuHomePhabricator

[Clang-Format] New option BeforeLambdaBody to manage lambda line break inside function parameter call (in Allman style)
Needs ReviewPublic

Authored by Wawha on Mar 18 2018, 10:41 AM.

Details

Reviewers
djasper
klimek
Summary

This is a patch to fix the problem identify by this bug: https://bugs.llvm.org/show_bug.cgi?id=27640.

When formatting this code with option "BreakBeforeBraces: Allman", the lambda body are not put entirely on new lines if the lambda is inside a function call.
Here an example of the current formatting in "allman":

connect([]() {
  foo();
  bar();
});

We the new option, the formatting is like that:

connect(
  []()
  {
    foo();
    bar();
  });

It's my first patch for this project, so if I made some mistake, do not hesitate to explain me that is wrong.
I write few test to check that the new option work and check that there is no regression.

This patch should also meet the request of this bug/request: https://bugs.llvm.org//show_bug.cgi?id=32151

Diff Detail

Event Timeline

Wawha created this revision.Mar 18 2018, 10:41 AM

Please always upload all patches with full context (-U999999)

Wawha updated this revision to Diff 138859.Mar 18 2018, 10:53 AM

I upload the full context for these files.

Also, tests?

Sorry, the test files was missing with the first patch. I make mistake using svn... I create the new patch with git, it's easier for me.
There is 4 small tests inside unittests/Format/FormatTest.cpp. The first 3 are testing the new options, and the last one is here to check that there is no regression with initialiser list which contain lambda.

Wawha added a comment.Mar 28 2018, 2:15 AM

I do not know which reviewer to add for this review. Is it possible to give the name of the person that will be able to review this code ?

Please read:

https://clang.llvm.org/docs/ClangFormatStyleOptions.html#adding-additional-style-options

In this case in particular, I would be very interested in a style guide that defines how Allman brace style and lambdas work together. IMO, it has so many corner cases that it mostly doesn't work there. If we find a good and consistent way to do this (and your proposal to always wrap before the [] isn't a bad idea there), then I think we should make that the default behavior for lambdas in Allman style.

Wawha added a comment.Apr 6 2018, 6:51 AM

I'm not working on a public project with a public style guide, so not sure it will fit all the requirement inside. But perhaps the request of Rian for bareflank (https://bugs.llvm.org//show_bug.cgi?id=32151#c4) could help to fit the needs.

The current patch do not exactly do what he ask, but like you say, it's not easy to find all options or one options to match lambda formatting.

In my case, the main requirement is to be able avoid a lambda in one line. It's not easy to manage, when you put a breakpoint on a such line. It could break when the function taking the lambda in arg is call, or when the lambda body is executed. Which very annoying for debugging.
As example:

return something([] ()  { somethingelse(); }); // Breakpoint one this line will break when calling something() AND somethingelse()

For the moment, there is no way to format is code in another way. We could have this possibilities:

return something(
     [] ()
     {
           somethingelse();
     });

or

return something([] ()  {
           somethingelse();
     });

or

return something(
     [] ()  {
           somethingelse();
     });

The current patch are able to manage the first case, and I'm agree it make sense in allman option. So I'm ok to have it by default for allman.

Wawha added a comment.EditedApr 23 2018, 3:23 PM

Hi djasper,
Here a project where there is lambda and allman style for them :
https://github.com/boostorg/hof/blob/develop/example/in.cpp
https://github.com/boostorg/hof/blob/develop/example/sequence.cpp

Example of code:

// Function to find an iterator using a containers built-in find if available
BOOST_HOF_STATIC_LAMBDA_FUNCTION(find_iterator) = first_of(
    [](const std::string& s, const auto& x)
    {
        auto index = s.find(x);
        if (index == std::string::npos) return s.end();
        else return s.begin() + index;
    },
#ifdef _MSC_VER
    // On MSVC, trailing decltype doesn't work with generic lambdas, so a
    // seperate function can be used instead.
    BOOST_HOF_LIFT(member_find),
#else
    [](const auto& r, const auto& x) BOOST_HOF_RETURNS(r.find(x)),
#endif
    [](const auto& r, const auto& x)
    {
        using std::begin;
        using std::end;
        return std::find(begin(r), end(r), x);
    }
);

Here a complete C++ Style Guide with lambda which use allman: http://docs.steinwurf.com/coding_style.html#lambda-functions
As example:

auto callback = [function](const std::string& zone,
                           const std::string& message)
{
    boost::python::call<void>(function, zone, message);
};

in.fetch_data_ready([&](std::vector<uint8_t>& cb)
{
    fetch_data_ready_stub(cb);
});

What could be the next to move forward on this topic?
Do you think that some modification should be make on this patch? Change option name? Make it default behavior for allman?

klimek added inline comments.Apr 26 2018, 5:59 AM
include/clang/Format/Format.h
17

I'd just make that default for Allman style.

lib/Format/TokenAnnotator.cpp
0–1

Instead of parsing the whole line again, which can lead to false positives, you can add a TokenType for LambdaLBrace while parsing and use that here.

Wawha added a comment.Apr 27 2018, 2:21 AM
Hi klimek,

many thank for your comments. I will made the modifications you propose and then update this patch.

Wawha updated this revision to Diff 144460.Apr 28 2018, 12:32 PM
Wawha added a reviewer: klimek.

Hi klimek,
I upload a new patch with the modifications you proposed.
The option is now enable by default in Allman style. So I move the option to the BraceWrappingFlags struct, which make more sense.

I also make the second modification you propose to avoid parsing the complete line.
Tell me if it's fit your remarks.

mephi42 added a subscriber: mephi42.Jun 3 2018, 2:33 AM

This change is very useful for me (I even rebuilt my clang-format with it), but there were no updates for quite some time - do you still intend to integrate it?

Apologies in advance if Phabricator is not the right place for such discussions.

Wawha marked 2 inline comments as done.Jun 26 2018, 12:07 PM
Hello,

after my last modification (require by previous comment), I do not see any feedback or validation for this patch.
Is their something special to do in order to go forward on this patch?

Lambda are more an more used in modern C++, and it's very annoying to not have a way to format them in allman style.

Hello,

after my last modification (require by previous comment), I do not see any feedback or validation for this patch.
Is their something special to do in order to go forward on this patch?

Lambda are more an more used in modern C++, and it's very annoying to not have a way to format them in allman style.

Sorry for the delay, pinging it once a week until somebody replies is the right approach.

lib/Format/ContinuationIndenter.cpp
1417

Why don't lambdas already count as nested blocks? Generally, I'd rather fix BlockParameterCount than introduce another linear scan over the line.

Wawha updated this revision to Diff 153813.Jul 2 2018, 3:34 PM

Here the third version to manage break with lambda in Allman.
I implement the modification propose by klimek.
If a lambda is detected, the BlockParameterCount and ParameterCount are increment one more time in order to avoid extra line parsing after. It's lead to less code and less computation time, so I prefer this modification too.
Thank to klimek for this great remark :).

I hope that the patch is now ok for a submission. If there is other needs or remarks, tell me.

klimek added inline comments.Jul 3 2018, 12:09 AM
lib/Format/TokenAnnotator.cpp
1

Why do we want to increase the parameter count here? Specifically, why is it not enough to || this condition to the first is in the function?

Wawha added inline comments.Jul 3 2018, 1:46 AM
lib/Format/TokenAnnotator.cpp
1

I'm not sure to understand. You want to move this test in the first "if" of updateParameterCount()?
Like :

if ((Current->is(tok::l_brace) && Current->BlockKind == BK_Block) ||
 Style.BraceWrapping.BeforeLambdaBody && Current->is(TT_LambdaLSquare))
++Left->BlockParameterCount;

If it's the case, doing that won't work, because updateParameterCount() is call only for brace, not square, so the condition while be true only once and the BlockParameterCount while be equal to 1.
And ContinuationIndenter::moveStatePastScopeOpener() while require a value greater than 1 to flag true 'HasMultipleNestedBlocks'.
Same for ParameterCount.

I had this code because I was thinking that you want me to have this BlockParameterCount equal to 2 in case a lambda is alone inside a function.

klimek added inline comments.Jul 3 2018, 2:28 AM
lib/Format/TokenAnnotator.cpp
1

Ok, sorry, I'll need to look at this on more detail again. If we only have one nested block Blockparametercount should probably be one. I'm unfortunately out for the next two weeks. If Sam or Daniel or somebody else can look at that it might work earlier, otherwise we'll need to wait.

Wawha updated this revision to Diff 155254.EditedJul 12 2018, 1:06 PM
Wawha retitled this revision from [Clang-Format] New option BreakBeforeLambdaBody to manage lambda line break inside function parameter call to [Clang-Format] New option BeforeLambdaBody to manage lambda line break inside function parameter call (in Allman style).

Here a new patch, which is very similar to Diff 3. I just remove the extra line parsing which was no necessary since a new tag TT_LambdaLSquare was added.
I'm agree with klimek, path Diff 4 was quite weird and after some test contain an error in the code and unittest (inside the first test I added...)

I hope that this new version while be validated :)

klimek added inline comments.Jul 13 2018, 2:28 AM
lib/Format/ContinuationIndenter.cpp
1417

I think I misunderstood some of this the first time around (and thanks for bearing with me :) - iiuc we want to break for Allman even when there's only one nested block. I think I'll need to take another look when I'm back from vacation, unless Daniel or somebody else finds time before then (will be back in 1.5 weeks)

Hi klimek,

do you have time to take a look again to this patch?
Is my last patch ok for you?

best regards,
François

klimek added inline comments.Aug 27 2018, 12:44 AM
lib/Format/ContinuationIndenter.cpp
1417

So, HasMultipleNestedBlocks should only be true if there are multiple nested blocks.
What tests break if you remove this change to HasMultipleNestedBlocks?

Wawha marked 2 inline comments as done.Sep 1 2018, 6:23 AM
Wawha added inline comments.
lib/Format/ContinuationIndenter.cpp
1417

All cases with only one lambda in parameter are break. The Lambda body with be put on the same line as the function and aligned with [] instead of putting the body [] on another line with just one simple indentation.

So if restore the line to :

State.Stack.back().HasMultipleNestedBlocks = Current.BlockParameterCount > 1;

Here some cases.
FormatTest.cpp, ligne 11412:

// With my modification
FunctionWithOneParam(
    []()
    {
      // A cool function...
      return 43;
    });

// Without my modification
FunctionWithOneParam([]()
                     {
                       // A cool function...
                       return 43;
                     });

The "{" block of the lambda will be aligned on the "[" depending of the function name length.

You have the same case with multiple lambdas (FormatTest.cpp, ligne 11433):

// With my modification
TwoNestedLambdas(
    []()
    {
      return Call(
          []()
          {
            return 17;
          });
    });

// Without my modification
TwoNestedLambdas([]()
                 {
                   return Call([]()
                               {
                                 return 17;
                               });
                 });
klimek added inline comments.Sep 3 2018, 1:04 AM
lib/Format/ContinuationIndenter.cpp
1417

Do we want to always break before lambdas in Allman style?

iburyl added a subscriber: iburyl.Sep 4 2018, 1:46 AM
Wawha marked an inline comment as done.Sep 4 2018, 3:02 PM
Wawha added inline comments.
lib/Format/ContinuationIndenter.cpp
1417

Perhaps not always. I make that current implementation, because, I have some difficulty to make the other. I alreay tried to modified the code to have that :

TwoNestedLambdas([]()
    {
      return Call(
          []()
          {
            return 17;
          });
    });

With just a tabulation after the line return. But with no success for the moment.

I could try again to implement it, and why not, add an option to tell if we break or not before the lambda. And set it to false by default.

In my case, the main requirement is to be able avoid a lambda in one line.

That doesn't work for me. I would like short lambdas still be formatted in one line, when AllowShortFunctionsOnASingleLine or a new option is on.

christophe-calmejane added a comment.EditedOct 4 2018, 12:20 AM

Hi,

consider the following sample code (formatted without your patch)

void func()
{
  myFuncCall(8, [](int x) {
    if (x)
      call();
  });
  myOtherCall([](int x) {
    if (x)
      call();
  });
  nestedLambda([](int x) {
    return [](float f) {
      if (f)
        return f * 2;
    };
  });
}

When activating your BeforeLambdaBody option, it formats it as follow

void func()
{
  myFuncCall(8, [](int x)
    {
      if (x)
        call();
    });
  myOtherCall(
    [](int x)
    {
      if (x)
        call();
    });
  nestedLambda(
    [](int x)
    {
      return [](float f)
      {
        if (f)
          return f * 2;
      };
    });
}

This is not consistent at all and I'd rather see this patch only breaking line for the body (like the new configuration option name implies) and using ContinuationIndentWidth to indent the body.

But the main issue I'm having with this patch is if I declare the lambda as noexcept, clang-format changes the code to this (looks like it's not detecting and applying your option):

myFunc([](int x) noexcept {
	if (x)
		call();
});

Actually, without your change to HasMultipleNestedBlocks, I'm almost getting the expected result: Lambda body is correctly indented (and not by function name's length).
The only thing not working (and it's not either way, with or without your change to HasMultipleNestedBlocks), is nested lambdas. The body is not properly indented (but it's not that bad).

About the "noexcept" issue, it might not be related to your patch, I'm not sure. Although the "mutable" tag is working as expected.

Ok I found the issue with noexcept, it's not related to your patch.
There is a missing

case tok::kw_noexcept:

in

UnwrappedLineParser::tryToParseLambda

I fixed it like this (not sure it's 100% correct though!!)

  State.Stack.back().HasMultipleNestedBlocks = Current.BlockParameterCount > 1;
  if (Style.BraceWrapping.BeforeLambdaBody && Current.Next != nullptr &&
      Current.Tok.getKind() == tok::TokenKind::l_paren &&
      Current.BlockParameterCount >= 1) {
		// Search for any parameter that is a lambda
    auto const *nextTok = Current.Next;
    while (nextTok != nullptr) {
      if (nextTok->is(TT_LambdaLSquare)) {
        State.Stack.back().HasMultipleNestedBlocks = true;
        break;
      }
      nextTok = nextTok->Next;
    }
  }

It works for all cases I'm using to test:

noOtherParams([](int x){call();});
paramBeforeLambda(8,[](int x){call();});
paramAfterLambda([](int x){call();},5);
paramBeforeAndAfterLambda(8,[](int x){call();},5);
doubleLambda(8,[](int x){call();},6,[](float f){return f*2;},5);
nestedLambda1([](int x){return [](float f){return f*2;};});
nestedLambda2([](int x){ call([](float f){return f*2;});});
noExceptCall([](int x) noexcept {call();});
mutableCall([](int x) mutable{call();});

funcCallFunc(call(),[](int x){call();});

auto const l=[](char v){if(v)call();};
void f()
{
	return [](char v){ if(v) return v++;};
}

I still think it's a hack (changing "HasMultipleNestedBlocks"), but it seems to work.

Herald added a project: Restricted Project. · View Herald TranscriptFeb 19 2019, 11:57 PM
Wawha updated this revision to Diff 216061.Tue, Aug 20, 12:50 AM

I update the patch for the last version of clang format, especially since option "AllowShortLambdasOnASingleLine" has been added (which require more change for this patch).
Indeed, for AllowShortLambdasOnASingleLine patch (https://reviews.llvm.org/D57687), the isAllmanBrace() function has been changed (adding TT_LambdaLBrace) which require more test to handle break for lambda body.

Unfortunately, there is one case I'm not able to handle correctly for the moment.
With option "BeforeLambdaBody : true", "AllowShortLambdasOnASingleLine : SLS_All;" and "BinPackParameters : false", the expected code below:

FctWithLonglineInLambda(
  param,
  []()
  {
    return HereAVeryLongLineThatWillBeFormattedOnMultipleLineAndShouldNotBeConsiderAsInline;
  });

Is formatted like this with the current patch:

FctWithLonglineInLambda(param, []() {
    return HereAVeryLongLineThatWillBeFormattedOnMultipleLineAndShouldNotBeConsiderAsInline;
  });

Can someone help me and tell where to look to set break for parameters (and lambda) when the whole parameters line (included the lambda body, which is not the case actually) do not fit the inside one line.
Before the AllowShortLambdasOnASingleLine option, there is no need for that, due to the fact that the lambda was always break. But with this new option, we need to handle that too.

cvenegas added a subscriber: cvenegas.EditedFri, Aug 23, 10:18 PM

I'm testing this patch on our codebase and it is working pretty well. We use the Allman style and the lambda problem has been an issue for many years. One thing to note in this patch is that some of the files have CRLF line endings but should be LF endings, which is why they're showing so many edits. I'm also seeing a clang tidy test failing with this patch. The readability-braces-around-statements tests seem to fail because the indent width is appending double of what it should.

void test() {
      if (cond("if0") /*comment*/) {
         do_something("same-line");
   }

Hope this helps as this patch is the only reason why we still need to build clang-format instead of using the prebuilt binaries. Thanks!