This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Refactor IncludeStructure: use File (unsigned) for most computations
ClosedPublic

Authored by kbobyrev on Sep 24 2021, 12:09 AM.

Diff Detail

Event Timeline

kbobyrev created this revision.Sep 24 2021, 12:09 AM
kbobyrev requested review of this revision.Sep 24 2021, 12:09 AM
sammccall added inline comments.Sep 24 2021, 12:32 AM
clang-tools-extra/clangd/Headers.h
116

this concept needs documentation.

116

there's a naming issue here:

  • "File" is an overloaded enough concept that you still end up referring to these as "file indexes" below, which isn't clear.
  • obviously this integer isn't either the file itself nor our record of it, so the metaphor is a bit confusing

What do you think about just ID, which would be spelled from the outside as IncludeStructure::ID?
Maybe HeaderID is better. I don't think it's a big problem that we have one for the main file too...

116–127

this public member is now sandwiched between methods

118

"Name" certainly needs to be clarified here.
Do you have any use cases in mind where you wouldn't have a FileEntry on hand? That would make this typesafe (and in particular avoid passing absolute filenames, or strings like "utility" or "<utility>)

125

it's unclear from this patch whether you want to expose the whole graph or just allow querying per-file - I doubt we need both.

125

tests?

135–140

The public interface now uses the File abstraction, so this method sticks out. (It has an awkward interface in the first place due to the need to use strings)

Suggest replacing it with DenseMap<File, unsigned> includeDepth(File Root). We'd need to add StringRef RealPath(File) for the caller, but that seems to make sense anyway.

144

this isn't abstract :-)

also the public nature of it isn't used, in this patch at least.

145

File should be capitalized.

FWIW, for a private method I prefer the old name here

kbobyrev updated this revision to Diff 374806.Sep 24 2021, 4:42 AM
kbobyrev marked 9 inline comments as done.

Resolve review comments.

kbobyrev updated this revision to Diff 374807.Sep 24 2021, 4:45 AM

s/ToVector/Flatten/g

kbobyrev updated this revision to Diff 374841.Sep 24 2021, 7:00 AM

Update non-working test

I resolved the comments we discussed offline, there's a weird SIGSEGV (out of boundary error) somewhere in the tests but it looks fine otherwise.

kbobyrev updated this revision to Diff 374853.Sep 24 2021, 8:12 AM

Fix the last failing test. All tests are green now.

kbobyrev updated this revision to Diff 374863.Sep 24 2021, 8:39 AM

Clean tests.

Do you want to wrap the unsigned in a struct to make it slightly safer, but more complicated to use?

Just nits left really.

Do you want to wrap the unsigned in a struct to make it slightly safer, but more complicated to use?

I like this idea, though it's probably not a huge deal either way.
(I marginally prefer enum class HeaderID : unsigned {}; over a struct, but it's basically the same thing)

clang-tools-extra/clangd/CodeComplete.cpp
1392

this isn't the ID of the include structure, it's the ID of the main file

1393

IncludeStructure is mutable here, I think it'd be safe to getOrCreate here.

Alternatively, make this infallible having IncludeStructure record the HeaderID for the main file in its constructor, and just use Includes.Root here.

Either way, I'd rather not deal with Expected<> error handling at runtime if we can define this error away.

clang-tools-extra/clangd/Headers.h
116–127

As mentioned offline, this is an implementation comment, but it doesn't say *what the HeaderID is*.

Suggest something like:

HeaderID identifies file in the include graph.
It corresponds to a FileEntry rather than a FileID, but stays stable across preamble & main file builds.

I'd move this impl comment back down to the private StringMap, but it could also follow the description.

119

comment still refers to "file index" as if it were introducing a term we use elsewhere

125

I think this can be Optional rather than Expected - much more lightweight.

129

this should be an assertion (or just leave that up to the vector) - passing an out-of-range HeaderID is invalid

134–135

comment is out of date

136

getName() is not right here.
(Hopefully the example will fit on the "Usually" line when fixed?)

143

we don't need this method if IncludeChildren is exposed, right?

kbobyrev updated this revision to Diff 375149.Sep 26 2021, 11:23 PM
kbobyrev marked 9 inline comments as done.

Resolve all review comments.

kbobyrev updated this revision to Diff 375151.Sep 26 2021, 11:35 PM

Rebase on top of main.

sammccall accepted this revision.Sep 27 2021, 3:47 AM
sammccall added inline comments.
clang-tools-extra/clangd/Headers.cpp
174

no auto here. I really hope this isn't std::string :-)

Actually I think you can avoid extracting this variable at all, because the !RealPathName.empty() check is redundant. The case where both sides are empty is rare enough that it's fine to assign as long as the *stored* name is empty.

176

extract std::string &RealPathName = RealPathNames[...]?

186

This is a "can't happen" condition - remove or replace with an assert?

199–200

This is no longer true, we don't need the check.

clang-tools-extra/clangd/Headers.h
116–127

This doesn't seem to be done.
You've added a line to the *bottom* of the implementation comment defining HeaderID in terms of the implementation.

A reader rather needs complete self-contained description of the concept at the *top*, optionally followed by an explanation about the implementation (separated by a blank line).

clang-tools-extra/clangd/unittests/HeadersTests.cpp
71

reparsing the code several times to obtain header IDs doesn't seem reasonable. It's surprising behavior from a test helper, and it requires that header IDs are stable across parses!

Can we just pass in the IncludeStructure instead?

241

Why are we asserting on every element of the map one at a time, instead of the whole map at once? Seems like it would be more regular and easier to read.

I'd probably just write:

DenseMap<HeaderID, SmallVector<HeaderID>> Expected = { ... };
EXPECT_EQ(Expected, Includes.IncludeChildren);
243

EXPECT_THAT(foo.empty()) -> EXPECT_THAT(foo, IsEmpty()).
(Much better error messages)

clang-tools-extra/clangd/unittests/ParsedASTTests.cpp
511–512

The ratio of setup to punchline of the rest of this test seems like it's gotten out of hand (it wasn't great before, but it's doubled).

What about just

auto &FM = PatchedAST->getSourceManager().getFileManager();
auto &Includes = PatchedAST->getIncludeStructure();
auto MainID = Includes.getID(FM.getFile(testPath("foo.cpp")));
auto AuxID = Includes.getID(FM.getFile(testPath("sub/aux.h")));
EXPECT_THAT(Includes.IncludeChildren.at(MainID), Contains(AuxID));

It tests a bit less I guess, but seems like enough

554–555

and here

This revision is now accepted and ready to land.Sep 27 2021, 3:47 AM
kbobyrev updated this revision to Diff 375237.Sep 27 2021, 6:11 AM
kbobyrev marked 7 inline comments as done.

Resolve all but two comments, leave the clarification questions for these two.

clang-tools-extra/clangd/Headers.cpp
199–200

Wait, why not? We still have unresolved includes, e.g. preamble patches are like that, their RealPathNames stay empty.

clang-tools-extra/clangd/unittests/HeadersTests.cpp
241

This would expect the elements in the map to be in a particular order, isn't this something we don't want?

kbobyrev marked an inline comment as done.Sep 27 2021, 6:11 AM
sammccall added inline comments.Sep 27 2021, 6:27 AM
clang-tools-extra/clangd/Headers.cpp
199–200

Right, but why do we need to filter them out here?
Can't we just drop them when we use them to seed the proximity sources?

clang-tools-extra/clangd/unittests/HeadersTests.cpp
241

== on DenseMap doesn't consider order.

If you mean the order within map values (i.e. lists of child edges): these are in the order the #includes appear in the file, which seems fine to depend on to me

kbobyrev updated this revision to Diff 375253.Sep 27 2021, 7:34 AM
kbobyrev marked 2 inline comments as done.

Switch to full DenseMap comparison in the tests.

clang-tools-extra/clangd/Headers.cpp
199–200

I feel like producing them in the first place is kind of redundant, what signal does it give to the user through such API?

Getting some "hidden"/"unresolved" includes certainly feels rather confusing to me, post-filtering outside probably requires understanding of the internals which does not look like a good idea. WDYT?

clang-tools-extra/clangd/unittests/HeadersTests.cpp
241

Yes, I initially meant the SmallVector order, but yeah, this seems reasonable, thank you!

kbobyrev marked an inline comment as not done.Sep 27 2021, 7:52 AM
kbobyrev updated this revision to Diff 375277.Sep 27 2021, 8:22 AM
kbobyrev marked 2 inline comments as done.

Resolve the last comment

This revision was landed with ongoing or failed builds.Sep 27 2021, 8:51 AM
This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.Sep 27 2021, 9:48 AM

Looks like this breaks tests on Windows: http://45.33.8.238/win/45971/step_9.txt

PTAL!