Page MenuHomePhabricator

[clang-format] PR49960 clang-format doesn't handle ASI after "return" on JavaScript
Changes PlannedPublic

Authored by MyDeveloperDay on Jun 29 2021, 1:09 AM.

Details

Summary

https://bugs.llvm.org/show_bug.cgi?id=49960

clang-format can mutate legal javascript code due to missing semicolons, but clang format pulls the next line into the first causing incorrect grammar.

function t() {
  if (true) return
  const v = 42
}

clang-format result:

function t() {
  if (true)
    return const v = 42
}

This was already somewhat addressed by the work of @mprobst in https://reviews.llvm.org/rG1dcbbcfc5cf06d2eacc68fbe9b6fc1fb12168d6f back in 2016, but this revision aims to identify other possible places so clang-format wouldn't be quite so destructive as people develop.

Diff Detail

Unit TestsFailed

TimeTest
40 msx64 debian > Clang-Unit.Format/_/FormatTests::FormatTestJS.ASIAfterReturn
Script: -- /var/lib/buildkite-agent/builds/llvm-project/build/tools/clang/unittests/Format/./FormatTests --gtest_filter=FormatTestJS.ASIAfterReturn
70 msx64 windows > Clang-Unit.Format/_/FormatTests_exe::FormatTestJS.ASIAfterReturn
Script: -- C:\ws\w7\llvm-project\premerge-checks\build\tools\clang\unittests\Format\.\FormatTests.exe --gtest_filter=FormatTestJS.ASIAfterReturn

Event Timeline

MyDeveloperDay requested review of this revision.Jun 29 2021, 1:09 AM
MyDeveloperDay created this revision.

It's a strange language :)

This revision is now accepted and ready to land.Jun 29 2021, 11:19 AM
mprobst added inline comments.Jun 29 2021, 11:27 AM
clang/lib/Format/UnwrappedLineParser.cpp
1021

can you add comments explaining what syntax is being detected here?

1025

parsePPEndIf uses a different method to peek - any opinion on why you're doing it differently here?

Move Peek next token into a function, add a comment for clarity

MyDeveloperDay marked 2 inline comments as done.Jul 1 2021, 3:28 AM
MyDeveloperDay added inline comments.
clang/lib/Format/UnwrappedLineParser.cpp
1025

I'm not entirely convinced parsePPEndIf IS looking at the next token

shouldn't

unsigned TokenPosition = Tokens->getPosition();
FormatToken *PeekNext = AllTokens[TokenPosition];

be

unsigned TokenPosition = Tokens->getPosition();
FormatToken *PeekNext = AllTokens[TokenPosition+1];

I followed the pattern in parseStructuralElement() but I think I should avoid that repitition by adding a function

also from elsewhere we see this:

FormatToken *Previous = AllTokens[Tokens->getPosition() - 1];

and so by definition, I would think it was.

FormatToken *CurrentToken= AllTokens[Tokens->getPosition()];
FormatToken *NextToken= AllTokens[Tokens->getPosition() + 1];

MyDeveloperDay marked an inline comment as done.Jul 1 2021, 3:29 AM
mprobst accepted this revision.Jul 1 2021, 4:36 AM
mprobst added inline comments.
clang/lib/Format/UnwrappedLineParser.cpp
1001

nit: just return AllTokens[TokenPosition]?

1020

do you still need this, given peeknext protects against running past the doc end?

clang/lib/Format/UnwrappedLineParser.h
188

cam you document what this does to the "current token" pointer?

MyDeveloperDay marked 3 inline comments as done.
MyDeveloperDay set the repository for this revision to rG LLVM Github Monorepo.

Updating based on revision comments, but found that the indentation can be incorrect. (will put to planned changes)

MyDeveloperDay planned changes to this revision.Jul 1 2021, 5:44 AM