This is an archive of the discontinued LLVM Phabricator instance.

clang-format: [JS] re-quote single or double quoted strings.
ClosedPublic

Authored by mprobst on Feb 18 2016, 8:50 AM.

Details

Reviewers
djasper
Summary

Automatically turns "foo" into 'foo' (or vice versa, depending on configuration).
This makes it more convenient to follow the Google JavaScript style guide:
https://google.github.io/styleguide/javascriptguide.xml?showone=Strings#Strings

This functionality is behind the option "Quotes", which can be:

  • "leave" (no re-quoting)
  • "single" (change to single quotes)
  • or "double" (change to double quotes)

This also changes single quoted JavaScript string literals to be treated as
tok::string_literal, not tok::char_literal, which fixes two unrelated tests.

Diff Detail

Event Timeline

mprobst updated this revision to Diff 48334.Feb 18 2016, 8:50 AM
mprobst retitled this revision from to clang-format: [JS] single quote double quoted strings..
mprobst updated this object.
mprobst added a reviewer: djasper.
mprobst added a subscriber: cfe-commits.
mprobst updated this revision to Diff 48338.Feb 18 2016, 9:09 AM

Fixed typo.

djasper added inline comments.Feb 24 2016, 3:01 AM
lib/Format/Format.cpp
1622

What's the reason not to do this in FormatTokenLexer, ideally at the same time when increasing the length?

1628

LLVM comes with its own string stream implementation. Use that: raw_string_ostream.

mprobst updated this revision to Diff 48945.Feb 24 2016, 7:50 AM
mprobst marked an inline comment as done.
  • Move code closer together by piping Replacements into the FormatTokenLexer.
mprobst marked an inline comment as done.Feb 24 2016, 7:51 AM
mprobst added inline comments.
lib/Format/Format.cpp
1622

Done. I initially wanted to avoid having FormatTokenLexer generate Replacements, but this seems cleaner indeed.

mprobst updated this revision to Diff 48946.Feb 24 2016, 7:52 AM
mprobst marked an inline comment as done.
  • Move code closer together by piping Replacements into the FormatTokenLexer.
  • remove sstream import
djasper edited edge metadata.Feb 26 2016, 1:11 AM

Nice, the behavior now seems nicely isolated. Is this a purely stylistic kind of change or do you think every JS developer working on any kind of codebase will want it? Generally this feels like something we should carefully enable with a dedicated style option. Where can I find more information about it?

Nice, the behavior now seems nicely isolated. Is this a purely stylistic
kind of change or do you think every JS developer working on any kind of
codebase will want it? Generally this feels like something we should
carefully enable with a dedicated style option. Where can I find more
information about it?

According to Google JavaScript style, you should always use single quotes
for string literals. Outside of that, people usually prefer one over the
other (I think), some people just use both indiscriminately. We could make
it an option, and only turn it on for a specific Google style.

Martin

Sounds like a good plan. Should also mention the corresponding style guide reference in the commit message:

https://google.github.io/styleguide/javascriptguide.xml?showone=Strings#Strings

mprobst updated this revision to Diff 49214.Feb 26 2016, 11:09 AM
mprobst edited edge metadata.
  • Move code closer together by piping Replacements into the FormatTokenLexer.
  • remove sstream import
  • * Treat JavaScript single quoted string literals as tok::string_literal, not
mprobst retitled this revision from clang-format: [JS] single quote double quoted strings. to clang-format: [JS] re-quote single or double quoted strings..Feb 26 2016, 11:13 AM
mprobst updated this object.
mprobst updated this object.
djasper added inline comments.Mar 1 2016, 3:37 PM
include/clang/Format/Format.h
606

As this only affects JavaScript and doesn't make any sense for Java or C++, I'd like to have JS/JavaScript in the option and enum name.

lib/Format/Format.cpp
76

For almost all the other enums we use upper case.

1087

Are there more than the two options? Otherwise, I would find:

Style.Quotes == FormatStyle::QS_Single && FormatTok->TokenText.startswith("\'")

more intuitive.

1094

Move this up so you can use it in the if-statement?

unittests/Format/FormatTestJS.cpp
638

Why change this test? I think this was meant to test that we always add a line break between two consecutive string literals as otherwise the user could just have written this without the "' + '".

878

I don't think this is intended.

mprobst updated this revision to Diff 49562.Mar 1 2016, 4:05 PM
mprobst marked 6 inline comments as done.
  • Move code closer together by piping Replacements into the FormatTokenLexer.
  • remove sstream import
  • * Treat JavaScript single quoted string literals as tok::string_literal, not
  • Address review comments, mostly renaming things.
lib/Format/Format.cpp
1087

I would too, but think about template strings. Added a comment. I think it's safer to specifically check for the thing that we do handle/fix below, instead of just accepting everything we think is not correct.

unittests/Format/FormatTestJS.cpp
638

This was exposed by the char_literal/string_literal test, it aligns the behaviour with what would always happen for a double quoted string. It fits the no-alignment rule.

I've changed the test to make sure it actually wraps on two lines.

878

It is. This was incorrectly handled before, the line is longer than 20 chars. It's because I switched single quoted strings from char_literal to string_literal.

mprobst updated this revision to Diff 49569.Mar 1 2016, 5:10 PM
mprobst marked an inline comment as done.
  • Move code closer together by piping Replacements into the FormatTokenLexer.
  • remove sstream import
  • * Treat JavaScript single quoted string literals as tok::string_literal, not
  • Address review comments, mostly renaming things.
  • Fix mustBreakBefore to handle tok::string_literal
djasper added inline comments.Mar 1 2016, 9:03 PM
lib/Format/Format.cpp
1131

Maybe add a FIXME saying that we also need to modify the TokenText if we want to be able to correctly split a long string literal.

1134

Thinking about this some more, I think it would be better to have replacements just changing each of the replaced quotes. Otherwise, this can get really tricky once we want to start to automatically split up long string literals. Should not make a difference performance-wise.

mprobst updated this revision to Diff 49579.Mar 1 2016, 9:32 PM
  • Move code closer together by piping Replacements into the FormatTokenLexer.
  • remove sstream import
  • * Treat JavaScript single quoted string literals as tok::string_literal, not
  • Address review comments, mostly renaming things.
  • Fix mustBreakBefore to handle tok::string_literal
  • Insert individual replacements for each character change.
mprobst marked 2 inline comments as done.Mar 1 2016, 9:36 PM
mprobst added inline comments.
unittests/Format/FormatTestJS.cpp
638–641

Reverted after some more discussion, this is the "must break before otherwise pointless string wrapping" feature.

djasper accepted this revision.Mar 2 2016, 2:48 PM
djasper edited edge metadata.

Looks good.

lib/Format/Format.cpp
1140

I think this function doesn't carry its own weight.

This revision is now accepted and ready to land.Mar 2 2016, 2:48 PM
djasper closed this revision.Mar 2 2016, 2:49 PM

Submitted as r262534.