Page MenuHomePhabricator

[clangd] Fix crash-bug in preamble indexing when using modules.
AcceptedPublic

Authored by adamcz on May 25 2020, 9:24 AM.

Details

Reviewers
sammccall
Summary

When preamble contains #undef, indexing code finds the matching #define
and uses that during indexing. However, it would only look for local
definitions. If the macro was defined in a module, MacroInfo
would be nullptr and clangd would crash.

This change fixes the crash by looking for definition in module when
that happens. The indexing result is then exactly the same whether
modules are used or not.

The indexing of macros happens for preamble only, so then #undef must be
in the preamble, which is why we need two .h files in a test.

Note that clangd is currently not ready for module support, but this
brings us one step closer.

Diff Detail

Event Timeline

adamcz created this revision.May 25 2020, 9:24 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 25 2020, 9:24 AM
sammccall added inline comments.May 25 2020, 11:16 AM
clang-tools-extra/clangd/test/symbols-modules.test
1 ↗(On Diff #266039)

Generally, we prefer to write tests as gunit tests where feasible, this gathers the inputs into one place, avoids too much messing around with the filesystem, and assertions on something a little simpler than the full JSON output. It also make it easier to factor out common bits across tests.

If I'm understanding this test right it should be fairly easy to port to a gunit test with TestTU (setting AdditionalFiles and ExtraArgs) which would assert on headerSymbols(). This would probably go in SymbolCollectorTests, similar to e.g. TEST_F(SymbolCollectorTest, NonModularHeader).

Obviously this relates to where other tests around indexing modules might live once we have those. If you'd prefer them to be lit tests, it'd be nice to know a bit about that.

clang/include/clang/Lex/Preprocessor.h
1126 ↗(On Diff #266039)

This looks fairly grungy and I don't totally understand it :-(
I guess walking the getPrevious() chain on the macro definition until we hit one with info doesn't work?
If we can't reuse something existing for this, we should probably ask rsmith if this is sensible.

API quibbles:

  • it seems this only differs for modules, but the name doesn't mention modules
  • maybe this should just be a "bool LookIntoModules" in getMacroInfo? it has only 3 callers.
clang/lib/Index/IndexingAction.cpp
171

You don't need to call getMacroInfo() here, getMacroInfoForLatestDefinition() does this.

adamcz updated this revision to Diff 266240.May 26 2020, 9:05 AM
adamcz marked 3 inline comments as done.

Addressed review comments.

So this is late in the game but... maybe we should just not report this case as a reference?

#undef foo is valid if foo was never defined, and doesn't refer to anything. If we similarly don't resolve the reference in this case as we only import macros that were used and undef doesn't count as a use... I think that's defensible as a weird edge case.

In practice, I doubt anyone cares - AIUI the cases we've seen this, it's defensive claiming of the macro namespace, and not a targeted undef of a particular known macro at all.

adamcz updated this revision to Diff 274482.Jun 30 2020, 8:05 AM

Changed to just ignore undefs without matching def

Resurrecting this, with the ignore #undef when #define is in a module approach. Still worth having a test, especially that it's the first test we have for modules.

Note that the test code changed since last version due to introduction of ThreasafeFS since my original change.

sammccall accepted this revision.Jul 2 2020, 7:21 AM
sammccall added inline comments.
clang-tools-extra/clangd/unittests/SymbolCollectorTests.cpp
1509

are the leading slashes here needed, or can we use "bar.h" and have everything be relative to testRoot()?
(relative paths in AdditionalFiles are relative to testRoot().)

1523

I'm a bit torn on this - this is obviously a bit messy (writing modules to the real filesystem and exposing the whole FS to our test).
It's not a huge thing and we could slap a FIXME on it, though you clearly have to plumb it through a few layers.

I think the alternative is an explicit module build:

  • subclass clang::GenerateHeaderModuleActionto override the output so it ends up in a buffer
  • add a method to TestTU like buildHeaderModule() that returns a std::string containing the module content
  • This test would have two TestTU instances, and the second looks something like:
TU2.AdditionalFiles["foo.pcm"] = TU1.buildHeaderModule();
TU2.ExtraArgs = {"-fprebuild-module-path=.", "-std=c++20", "-fmodules"};
TU2.HeaderCode = "import foo;\n#undef X";

I guess whether this is actually better probably depends on whether we're likely to go in this direction (explicitly building and managing a module cache ourselves) for modules. If we're likely to let clang implicitly build modules and maybe just put the physical cache storage behind an abstraction, maybe it's best to overlay the FS for now and rip this out later. If we're going to do explicit module builds and schedule them ourselves, then starting to experiment in that direction is useful, and there won't be an obvious time to rip out the OverlayRealFileSystem feature.

I think all of which is to say feel free to land it in this form, or experiment further.
I think the OverlayRealFileSystem should maybe be ExposeModuleCacheDir or something with a FIXME to come up with a better solution, though. It's not a feature of TesttU we should be using for other purposes (without similar careful consideration)

This revision is now accepted and ready to land.Jul 2 2020, 7:21 AM