This is an archive of the discontinued LLVM Phabricator instance.

clang-format: [JS] sort ES6 imports.
ClosedPublic

Authored by mprobst on May 12 2016, 2:58 AM.

Details

Summary

This change automatically sorts ES6 imports and exports into four groups:
absolute imports, parent imports, relative imports, and then exports. Exports
are sorted in the same order, but not grouped further.

Diff Detail

Event Timeline

mprobst updated this revision to Diff 57009.May 12 2016, 2:58 AM
mprobst retitled this revision from to clang-format: [JS] sort ES6 imports..
mprobst updated this object.
mprobst added a reviewer: djasper.
mprobst added a subscriber: cfe-commits.
djasper edited edge metadata.May 12 2016, 3:29 AM

Just two high-level comments. Will review in more depth later.

include/clang/Format/Format.h
770

Why is this a separate function as opposed to looking at Style.Language?

lib/Format/Format.cpp
1958

I know this file is already a huge mess, but can you move all of the JS import sorting stuff to a separate file?

mprobst updated this revision to Diff 57037.May 12 2016, 7:28 AM
mprobst edited edge metadata.
  • extract TokenAnalyzer.h and SortJavaScriptImports.h/cpp
  • clean up imports
mprobst marked 2 inline comments as done.May 12 2016, 7:30 AM
mprobst updated this revision to Diff 57039.May 12 2016, 7:35 AM
  • extract TokenAnalyzer.h and SortJavaScriptImports.h/cpp
  • clean up imports/
  • includes
djasper added inline comments.May 12 2016, 10:05 AM
lib/Format/Format.cpp
20

Use clang-format to fix the order :-)

lib/Format/FormatTokenLexer.h
28 ↗(On Diff #57039)

I believe it is a bit harmful to have that much code in a header. It now gets pulled into multiple translation units and is thus compiled multiple times only for the linker to deduplicate it afterwards. We should move the implementation in FormatTokenLexer.cpp. I am happy to do that as a follow up, though.

lib/Format/SortJavaScriptImports.cpp
54 ↗(On Diff #57039)

This comment doesn't really explain what it is and it is not obvious to me.

112 ↗(On Diff #57039)

No braces.

150 ↗(On Diff #57039)

Add:

// FIXME: Pull this into a common function.
162 ↗(On Diff #57039)

Is there any chance you are changing the total length of an import block?

226 ↗(On Diff #57039)

No braces.

291 ↗(On Diff #57039)

s/TODO(martinprobst)/FIXME/

Also, I'd probably remove the parameter until this is supported. This FIXME is quite hidden but if there is no Cursor in the interface, there is no confusion.

lib/Format/TokenAnalyzer.h
1 ↗(On Diff #57039)

This isn't quite as bad as the other, but we probably want a dedicated .cpp file, too.

mprobst updated this revision to Diff 57073.May 12 2016, 10:57 AM
mprobst marked 9 inline comments as done.
  • extract TokenAnalyzer.h and SortJavaScriptImports.h/cpp
  • clean up imports/
  • includes
  • address review comments
  • pull out implementations from header files.

PTAL.

lib/Format/SortJavaScriptImports.cpp
162 ↗(On Diff #57039)

No, this only ever monotonously increases the length. I'm aware things fall apart if it doesn't ;-)

226 ↗(On Diff #57039)

Just can't get used to that, it feels so wrong...

mprobst updated this revision to Diff 57138.May 13 2016, 1:24 AM
  • extract TokenAnalyzer.h and SortJavaScriptImports.h/cpp
  • clean up imports/
  • includes
  • address review comments
  • pull out implementations from header files.
  • support side effect imports, keep in relative order at top.
djasper accepted this revision.May 16 2016, 11:08 PM
djasper added a reviewer: klimek.
djasper edited edge metadata.
djasper removed a subscriber: klimek.

I think, this looks good.

Manuel: Are you fine with this, design-wise?

lib/Format/SortJavaScriptImports.cpp
77 ↗(On Diff #57138)

NB?

255 ↗(On Diff #57138)

We probably should really have a variant of AnnotatedLine::startsWith() that can start at an arbitrary token. But we can try that in a follow-up.

This revision is now accepted and ready to land.May 16 2016, 11:09 PM
mprobst updated this revision to Diff 57438.May 17 2016, 12:10 AM
mprobst marked an inline comment as done.
mprobst edited edge metadata.
  • extract TokenAnalyzer.h and SortJavaScriptImports.h/cpp
  • clean up imports/
  • includes
  • address review comments
  • pull out implementations from header files.
  • support side effect imports, keep in relative order at top.
  • Move getLanguageName to Format.h
mprobst marked an inline comment as done.May 17 2016, 12:10 AM
mprobst added inline comments.
lib/Format/SortJavaScriptImports.cpp
77 ↗(On Diff #57138)

"Nota bene". Seems to be uncommon in this code base, so removed.

255 ↗(On Diff #57138)

Maybe that should be a function on FormatToken itself, that searches the linked list rooted at it?

Current->startsSequence(tok::star, Keywords.kw_as, tok::identifier)?

I'll take a stab at that.

mprobst updated this revision to Diff 57440.May 17 2016, 12:15 AM
mprobst marked an inline comment as done.
  • review comments
klimek added inline comments.May 17 2016, 7:21 AM
include/clang/Format/Format.h
863

s/for debugging/.
Or do you want to claim that we'll arbitrarily change the strings here?

lib/Format/Format.cpp
2285–2293

I'd make this symmetric by pulling out 2 functions instead of falling through into the function.

lib/Format/SortJavaScriptImports.cpp
45 ↗(On Diff #57440)

As not all of the members will be default initialized I'd prefer to have constructors.

45–46 ↗(On Diff #57440)

This is really weird - a class called ImportExport that then has a bool IsExport. Please explain in a comment why you're doing that instead of any of:
a) managing a set of imports and exports on their own
b) calling this something else
c) having a class hierarchy

115 ↗(On Diff #57440)

When I see ImpExp I think "ImplicitExpression". I'd name this ImportExport.
Also, it seems like we now have an uninitialized bool and enum in the struct.

163 ↗(On Diff #57440)

If that's the case, please add a comment to that end.

207 ↗(On Diff #57440)

Usually we call Lexer::getSourceText for that (don't we have a Lexer?)

252–253 ↗(On Diff #57440)

This function is really long - can we perhaps break out smaller syntactic options.

258 ↗(On Diff #57440)

Please don't put side-effects into an if (). This might be generally easier to read if nextToken() can't fail, and instead creates an invalid token.

mprobst updated this revision to Diff 57497.May 17 2016, 10:04 AM
mprobst marked 8 inline comments as done.
  • address review comments

PTAL.

lib/Format/SortJavaScriptImports.cpp
45–46 ↗(On Diff #57440)

Discussed offline. I stand by this code. Yes, it has an odd boolean flag, but imports and exports in ES6 are essentially the same syntactical construct, with exports being syntactic sugar for import followed by export.

I've updated the comment on this struct substantially to explain that (I can tell how it's non-obvious).

Note that the boolean is never dispatched on in this code, except for the "empty line before exports" part. Exports are the same as imports wrt sorting them here.

115 ↗(On Diff #57440)

Now called Reference.

207 ↗(On Diff #57440)

We don't, FormatTokenLexer doesn't expose the Lexer, and TokenAnalyzer doesn't export FormatTokenLexer.

258 ↗(On Diff #57440)

Done, much cleaner indeed. LLVM style guide says not to introduce new static globals with constructors, so I'm just using a field. Please advise if there's a better way.

klimek added inline comments.May 18 2016, 6:37 AM
lib/Format/SortJavaScriptImports.cpp
46–47 ↗(On Diff #57497)

Thanks, this is much better.

84–85 ↗(On Diff #57497)

Any reason you're not using a SourceRange?

93–96 ↗(On Diff #57497)

Doesn't that break strict weak ordering requirements?

x = non-side-effect (url: a)
y = side-effect (url: b)
z = non-side-effect (url: c)
now x and y are incomparable:
!(x < y) because x.url < y.url
&& !(y < x)
because of y == side-effect
but x and z are comparable:
x < z
breaks transitivity of incomparability (https://en.wikipedia.org/wiki/Weak_ordering#Strict_weak_orderings)

To keep them equal, wouldn't you need:

if (LHS == SIDE_EFFECT && RHS == SIDE_EFFECT)
  return false;

(that would at least seem symmetric)

128 ↗(On Diff #57497)

It seems like you can pull the loop below into its own function parseModuleReferences or something.

129 ↗(On Diff #57497)

LastStart is just the start of the currently processed module reference, right? I'd just call it Start then.

135 ↗(On Diff #57497)

Why are you defining this here, far before its use?

I think this code might be slightly simpler (for me to understand) if you just pulled Reference out of the loop, and got rid of LastStart.

mprobst marked 6 inline comments as done.May 18 2016, 7:33 AM

PTAL.

lib/Format/SortJavaScriptImports.cpp
93–96 ↗(On Diff #57497)

As discussed offline, the comparison before makes sure that RHS == SIDE_EFFECT here.

Added a comment.

135 ↗(On Diff #57497)

As discussed offline, moved the declaration a bit down.

I think exposing just a SourceLocation Start makes the algorithm easier to read, otherwise it'd look like a whole JsModuleReference could leave the loop, which is never the case. It explicitly points out we only have state outside of the loop to track the start.

mprobst updated this revision to Diff 57621.May 18 2016, 7:33 AM
mprobst marked 2 inline comments as done.
  • address review comments
  • - extract parseModuleReferences
klimek edited edge metadata.May 18 2016, 7:56 AM

We're getting there. Couple of nits left.

lib/Format/SortJavaScriptImports.cpp
94–97 ↗(On Diff #57621)

Yea, completely missed that the != above.

128 ↗(On Diff #57621)

Return by value.

216–217 ↗(On Diff #57621)

Both of these are used only once, perhaps inline?

229 ↗(On Diff #57621)

I'd put that down after References.push_back so calculating the Reference is at least a single flow.

mprobst marked 4 inline comments as done.May 18 2016, 8:52 AM
mprobst added inline comments.
lib/Format/SortJavaScriptImports.cpp
216–217 ↗(On Diff #57621)

These are setting fields for the parse state (note: this is not a variable declaration).

mprobst updated this revision to Diff 57637.May 18 2016, 8:52 AM
mprobst marked an inline comment as done.
mprobst edited edge metadata.
  • address review comments
  • - extract parseModuleReferences
  • more review comments
klimek accepted this revision.May 19 2016, 6:00 AM
klimek edited edge metadata.

lg

mprobst updated this revision to Diff 57794.May 19 2016, 8:06 AM
mprobst edited edge metadata.
  • correctly insert breaks after import block
mprobst updated this revision to Diff 57795.May 19 2016, 8:12 AM
  • ranges
klimek added inline comments.May 19 2016, 8:17 AM
lib/Format/SortJavaScriptImports.cpp
160 ↗(On Diff #57795)

Between categories of imports and imports and exports, right?

170–171 ↗(On Diff #57795)

Shouldn't we add 2 if NewlinesBefore is 0? Or is that syntactically impossible?

mprobst marked 2 inline comments as done.May 19 2016, 8:22 AM

Thanks for the review, appreciated!

lib/Format/SortJavaScriptImports.cpp
160 ↗(On Diff #57795)

Added a comment.

170–171 ↗(On Diff #57795)

The next token is on a new line, so it gets a wrap just from that already. I've added a test though.

This revision was automatically updated to reflect the committed changes.
mprobst marked 2 inline comments as done.
Via Conduit