This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Fixes for #include insertion.
ClosedPublic

Authored by ioeric on Feb 19 2018, 6:37 AM.

Details

Summary

o Avoid inserting a header include into the header itself.
o Avoid inserting non-header files.
o Canonicalize include paths for symbols in dynamic index.

Diff Detail

Repository
rL LLVM

Event Timeline

ioeric created this revision.Feb 19 2018, 6:37 AM
ilya-biryukov added inline comments.Feb 19 2018, 7:41 AM
clangd/Headers.cpp
45 ↗(On Diff #134904)

There is another list of header extensions in ClangdServer::switchSourceHeader, let's reuse the list of extensions used here and there.

48 ↗(On Diff #134904)

Maybe use llvm::sys::path::extension and search it in HeaderExtensions using StringRef::equals_lower?
Will also handle upper- and mixed-case.

60 ↗(On Diff #134904)

Maybe replace !hasHeaderExtension(Header) to hasSourceExtension(Header)?
Knowing about all header extensions in advance is hard, knowing a subset of source extensions seems totally true.

clangd/index/FileIndex.cpp
18 ↗(On Diff #134904)

Maybe store a static inside the function body (to properly free memory) and return a const reference (to avoid mutating shared data)?

const CanonicalIncludes &CanonicalIncludesForSystemHeaders() {
  static CanonicalInclude Includes = []() -> CanonicalHeader {
    CanonicalInclude Includes;
    addSystemHeadersMapping(Includes);
    return Includes;
  };
  return Includes;
}

Also Sam mentioned that CanonicalIncludes is not thread-safe, as the Regex's match() mutates.

unittests/clangd/FileIndexTests.cpp
173 ↗(On Diff #134904)

Why not on Windows?

sammccall added inline comments.Feb 19 2018, 7:42 AM
clangd/Headers.cpp
42–43 ↗(On Diff #134904)

As discussed offline, this seems dubious.

  • this is probably the generalization of '.inc' handling, and so probably wants the include stack and belongs at indexing time (we're determining "what's the public header for this symbol, or is the symbol private")?
    • as discussed offline, it seems like we can sidestep this whole problem by not indexing symbols only visible in main files, for now

(later we'll want a flag on the symbol for whether it's public, so we can support find-symbol-by-name for mainfile-declared symbols)

ioeric updated this revision to Diff 134932.Feb 19 2018, 8:52 AM
ioeric marked 6 inline comments as done.
  • Stop indexing main files in dynamic index; addressed review comments.
clangd/Headers.cpp
45 ↗(On Diff #134904)

Acked. (The change is not relevant anymore.)

clangd/index/FileIndex.cpp
18 ↗(On Diff #134904)

As discussed offline, we want to avoid destructing static objects because ...

Made CanonicalIncludes thread-safe by adding a mutex around matches.

unittests/clangd/FileIndexTests.cpp
173 ↗(On Diff #134904)

because our system header map doesn't really support windows...

sammccall added inline comments.Feb 19 2018, 10:07 AM
clangd/Headers.h
26 ↗(On Diff #134932)

File also needs to be absolute.
(May want to add asserts for this at the start of the impl - up to you)

clangd/index/FileIndex.cpp
36 ↗(On Diff #134932)

if this is always true now, we could remove the option?

37 ↗(On Diff #134932)

This is not going to handle IWYU right?
It seems like at this point the right thing to do is totally hide the CanonicalIncludes inside SymbolCollector, which would init it (to system headers), write IWYU mappings using the PP, and consume the mappings (when emitting symbols).
Churn :(

ilya-biryukov added inline comments.Feb 19 2018, 10:08 AM
clangd/Headers.cpp
60 ↗(On Diff #134932)

Not related to this patch, but just noticed that we don't call FS->setCurrentWorkingDirectory(CompileCommand.Directory) here.
This might break if CompileCommand has non-absolute paths.

clangd/index/CanonicalIncludes.cpp
21 ↗(On Diff #134932)

Technically, if one thread initializes CanonicalIncludes and the other thread reads from it, this is a data race.
Maybe we're ok with our current usage, but I don't have enough confidence in that (i.e. don't know C++'s memory model good enough) and would advice to take the lock in other functions of CanonicalIncludes too.

clangd/index/FileIndex.cpp
18 ↗(On Diff #134932)

NIT: lowerCase the function name.

38 ↗(On Diff #134932)

NIT: remove local var, replace with CollectorOpts.Includes = CanonicalIncludesForSystemHeaders()?

unittests/clangd/FileIndexTests.cpp
173 ↗(On Diff #134904)

Thanks for clarification.
I thought it's standard-library specific, not OS-specific. I would have expected it to work on Windows with libstdc++. (which we're "mocking" here).
Anyway, this not related to this patch.

ilya-biryukov added inline comments.Feb 19 2018, 10:16 AM
clangd/index/CanonicalIncludes.cpp
21 ↗(On Diff #134932)

As Sam pointed out, it's fine the way it is right now. We don't want any thread-safety guarantees from the non-const methods of the class.
But could you please add a comment to the class that non-const methods of it are not thread-safe even though it has a mutex inside?

ioeric marked 5 inline comments as done.Feb 19 2018, 10:25 AM
ioeric added inline comments.
clangd/Headers.cpp
60 ↗(On Diff #134932)

This was done in ClangdServer, but we should probably do it here as you suggested.

clangd/index/FileIndex.cpp
36 ↗(On Diff #134932)

Will do this in a followup patch. Added a FIXME

37 ↗(On Diff #134932)

You are right :(

I'll prepare a followup patch to handle this. Added a FIXME.

ioeric updated this revision to Diff 134943.Feb 19 2018, 10:25 AM
ioeric marked 3 inline comments as done.
  • addressed comments.
ioeric updated this revision to Diff 134944.Feb 19 2018, 10:28 AM
ioeric marked 2 inline comments as done.
  • added a comment about thread safety
ioeric updated this revision to Diff 134947.Feb 19 2018, 10:32 AM
  • assert in the very beginning!
This revision is now accepted and ready to land.Feb 19 2018, 10:35 AM
This revision was automatically updated to reflect the committed changes.
This revision was automatically updated to reflect the committed changes.