This is an archive of the discontinued LLVM Phabricator instance.

[clang-format] New API guessLanguage()
ClosedPublic

Authored by benhamilton on Feb 20 2018, 12:20 PM.

Details

Summary

For clients which don't have a filesystem, calling getStyle() doesn't
make much sense (there's no .clang-format files to search for).

In this diff, I hoist out the language-guessing logic from getStyle()
and move it into a new API guessLanguage().

I also added support for guessing the language of files which have no
extension (they could be C++ or ObjC).

Test Plan: New tests added. Ran tests with:

% make -j12 FormatTests && ./tools/clang/unittests/Format/FormatTests

Diff Detail

Repository
rL LLVM

Event Timeline

benhamilton created this revision.Feb 20 2018, 12:20 PM
krasimir accepted this revision.Feb 21 2018, 1:18 AM
This revision is now accepted and ready to land.Feb 21 2018, 1:18 AM
jolesiak accepted this revision.Feb 21 2018, 2:18 AM
This revision was automatically updated to reflect the committed changes.
This revision was automatically updated to reflect the committed changes.
djasper added inline comments.
cfe/trunk/lib/Format/Format.cpp
2298

Here and in several other places: Variables should be named with upper camel case (https://llvm.org/docs/CodingStandards.html#name-types-functions-variables-and-enumerators-properly).

2308

In LLVM, we generally don't add braces for single statement ifs.

2309

Why not just return here?

cfe/trunk/unittests/Format/FormatTest.cpp
11955

IMO, this is a bit over-engineered. IIUC, this only calls a single free standing function with two different parameters and expects a certain result. You don't need this struct, a test fixture or parameterized tests for that. Just write:

TEST(GuessLanguageTest, FileAndCode) {
  EXPECT_EQ(guessLanguage("foo.cc", ""), FormatStyle::LK_Cpp);
  EXPECT_EQ(guessLanguage("foo.m", ""), FormatStyle::LK_ObjC);
  ...
}

Yes, you'd be duplicating the "EXPECT_EQ" and "guessLanguage", but I think that's actually good here. It makes the tests intuitively readable.

benhamilton added inline comments.Feb 21 2018, 10:09 AM
cfe/trunk/lib/Format/Format.cpp
2298

Thanks, will send a follow-up.

2308

Mmmm.. is this a hard requirement? I've personally been bitten so many times by adding statements after missing braces, I'd rather add them unless they're required to not be there by the style guide.

2309

I don't like early returns in case an else sneaks in later:

https://llvm.org/docs/CodingStandards.html#don-t-use-else-after-a-return

cfe/trunk/unittests/Format/FormatTest.cpp
11955

I hear you. I much prefer it when a single test tests a single issue, so failures are isolated to the actual change:

https://elgaard.blog/2011/02/06/multiple-asserts-in-a-single-unit-test-method/

If this isn't a hard requirement, I'd like to keep this the way it is.

djasper added inline comments.Feb 21 2018, 3:20 PM
cfe/trunk/lib/Format/Format.cpp
2308

Yes. This is done as consistently as possible throughout LLVM and Clang and I'd like to keep clang-format's codebase consistent.

clang-format itself makes it really hard to get bitten by missing braces :).

2309

But I would argue exactly the opposite. At this point, you have pretty uniquely determined that this is ObjC (from this originally being viewed as LK_Cpp). Now suppose you add additional logic before the return statement in line 2313: That would make the state space that this function can have quite complex.

I would even go one step further and completely eliminate the variable "result". That would be slightly less efficient, so maybe I'd be ok with:

const auto GuessFromFilename = getLanguageByFileName(FileName);

And then you can return this at the end or have early exits / other code paths if you come up with different languages. Having both, a complex control flow and state in local variables seems unnecessarily complex here.

cfe/trunk/unittests/Format/FormatTest.cpp
11955

It certainly makes sense for asserts, as a tests stops upon finding the first assert. But these are expectations. Each failing expectation will be reported individually, with a direct reference to the line in question and an easily understandable error message.

I understand what you are saying but I think my proposal will actually make test failures easier to diagnose and understand. Please do change it.

benhamilton added inline comments.Feb 21 2018, 3:28 PM
cfe/trunk/lib/Format/Format.cpp
2308

OK, will remove the braces for consistency. Maybe we should make clang-format do this automatically?

2309

OK, happy to change it.

cfe/trunk/unittests/Format/FormatTest.cpp
11955

Thanks, I appreciate the feedback and will make the change.

Follow-ups (aside from the case, which I already fixed) in D43598.