[clang-format] SortIncludes should support "@import" lines in Objective-C
Fixes #38995
This is an attempt to modify the regular expression to identify
@import and import alongside the regular #include. The challenging
part was not to support @ in addition to # but how to handle
everything that comes after the include|import keywords. Previously
everything that wasn't " or < was consumed. But as you can see in
this example from issue #38995, there is no " or < following the
keyword:
@import Foundation;
I experimented with a lot of fancy and useful expressions in an
online regex tool, only to find out that some things are simply not
supported by the regex implementation in LLVM.
- For example, the beginning [\t\ ]* should be replaceable by the horizontal whitespace character \h*, but this breaks the SortIncludesTest.LeadingWhitespace test.
That's why I've chosen to come back to the basic building blocks.
The essential change in this patch is the change from this regular
expression:
^[\t\ ]*#[\t\ ]*(import|include)[^"<]*(["<][^">]*[">])
        ~                              ~~~~~~~~~~~~~~
- #: only support # prefix not @.
- (["<][^">]*[">]): only support "" and <> as delimiters, no support for C++ modules and ; ending. Also this allows for "> or <" or "" or <>, which all seem either off or wrong.
to this:
^[\t\ ]*[@#][\t\ ]*(import|include)([^"/]*("[^"]+")|[^</]*(<[^>]+>)|[\t/\ ]*([^;]+;))
        ~~~~                        ~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~ ~~~~~~~~~~~~~~~~
- [@#]: Now support @ and #.
- The three alternatives: Clearly support "" and <> as well as an include name without enclosing characters. Allows for no mixture of "> or <" or empty include names.
- Slashes: Trailing comments on include lines as shown in D124513 are ignored now by ignoring slashes in the characters that lead up to a match group.
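To make the difference between the two expressions concrete, here is a rough sketch that runs both against a few sample lines. It uses Python's re module, which is an assumption made for illustration only: it is a different engine than the regex implementation in LLVM, so it approximates rather than reproduces what clang-format does. The sample lines other than @import Foundation; are made up.

```python
import re

# Patterns copied from the description above. Python's re is NOT the engine
# clang-format uses, so this only approximates the real behavior.
OLD = re.compile(r'^[\t\ ]*#[\t\ ]*(import|include)[^"<]*(["<][^">]*[">])')
NEW = re.compile(
    r'^[\t\ ]*[@#][\t\ ]*(import|include)'
    r'([^"/]*("[^"]+")|[^</]*(<[^>]+>)|[\t/\ ]*([^;]+;))'
)

# Hypothetical sample lines; only the @import one comes from the issue.
samples = [
    '#include "a.h"',
    '#include <vector>',
    '@import Foundation;',
    '#include <a.h"',   # mixed delimiters
    '#include ""',      # empty include name
]

for line in samples:
    old_ok = OLD.match(line) is not None
    new_ok = NEW.match(line) is not None
    print(f'{line!r:24} old={old_ok} new={new_ok}')
```

In this sketch the old expression accepts the mixed-delimiter and empty-name lines and rejects the @import line, while the new expression does the opposite.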
Here is how I've tested this patch:
ninja clang-format
ninja FormatTests
./tools/clang/unittests/Format/FormatTests --gtest_filter=SortIncludesTest*
And if that worked, I double-checked that nothing else broke by running
all format checks:
./tools/clang/unittests/Format/FormatTests
One side effect of this change is that it should partially support
C++20 module import lines without the optional export in front. Adding
this can be a change on its own that shouldn't be too hard. I say
partially because the @ or # are currently *NOT* optional in the
regular expression.
I see an opportunity to optimize the matching to exclude @include for
example. But eventually these should be caught by the compiler, so...
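The two remarks above (the mandatory @ or # prefix, and @include being accepted) can be checked with a small sketch. As before, Python's re module stands in for the regex implementation in LLVM purely as an assumption for illustration, and the module name foo and header a.h are hypothetical.

```python
import re

# New pattern from this patch; Python's re only approximates LLVM's engine.
NEW = re.compile(
    r'^[\t\ ]*[@#][\t\ ]*(import|include)'
    r'([^"/]*("[^"]+")|[^</]*(<[^>]+>)|[\t/\ ]*([^;]+;))'
)

def is_include_line(line: str) -> bool:
    """Return True if the line is recognized by the new pattern."""
    return NEW.match(line) is not None

# Hypothetical lines: 'foo' and 'a.h' are placeholders.
lines = [
    'import foo;',          # C++20 module import, no @ or # prefix
    'export import foo;',   # with the optional export in front
    '@import foo;',         # Objective-C module import
    '#import foo;',
    '@include "a.h"',       # not valid code, but the pattern accepts it
]

for line in lines:
    print(f'{line!r:24} recognized={is_include_line(line)}')
```

Under these assumptions the first two lines are not recognized because the prefix is required, and @include is recognized even though a compiler would reject it.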
With my change, the matching group is not at a fixed position any
longer. I decided to choose the last match (group) that is not empty.
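A minimal sketch of the "last non-empty group" idea, again using Python's re module as a stand-in (an assumption; the actual patch is C++ and uses the regex implementation in LLVM). The helper name include_name and the sample lines are hypothetical.

```python
import re
from typing import Optional

# New pattern from this patch; Python's re only approximates LLVM's engine.
NEW = re.compile(
    r'^[\t\ ]*[@#][\t\ ]*(import|include)'
    r'([^"/]*("[^"]+")|[^</]*(<[^>]+>)|[\t/\ ]*([^;]+;))'
)

def include_name(line: str) -> Optional[str]:
    """Hypothetical helper: return the last non-empty match group, if any."""
    m = NEW.match(line)
    if m is None:
        return None
    # Only one of the three alternatives matches, so the include name sits in
    # a different group each time. Walk the groups from the back and take the
    # first one that is neither unmatched (None) nor empty.
    for group in reversed(m.groups()):
        if group:
            return group
    return None

for line in ['#include "a.h" // trailing',
             '#include <a.h> // "b.h"',
             '@import Foundation;']:
    print(f'{line!r:30} -> {include_name(line)!r}')
```

Note that in this sketch the name taken from an @import line still carries its trailing semicolon, because the last group is ([^;]+;).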
I think you meant @import bar;, which is valid in Objective-C only. So is #import btw.