This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Improvements to header mapping: more precise parsing of cppreference symbol pages.
ClosedPublic

Authored by sammccall on Apr 30 2019, 8:16 AM.

Details

Summary

Previously we were just jumping from the symbol index to the symbol page, and
grabbing all the headers mentioned there. But the page often lists multiple
symbols, and so we got false positives and thus ambiguities (which were dropped).

Now we look at which declarations are for the symbol we want, and prefer headers
listed above that symbol. If there are none, we fall back to the old behavior.

Diff Detail

Event Timeline

sammccall created this revision.Apr 30 2019, 8:16 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 30 2019, 8:16 AM
sammccall updated this revision to Diff 197350.Apr 30 2019, 9:09 AM

fix std::chrono regression

kadircet added inline comments.May 2 2019, 12:52 AM
clangd/include-mapping/gen_std.py
90

does this also work with functions? For example floor will be written as:

constexpr ToDuration floor(const duration<Rep, Period>& d);

which does not contain floor as a word. https://en.cppreference.com/w/cpp/chrono/duration/floor

95

why not perform clean-up unconditionally?

kadircet added inline comments.May 2 2019, 12:57 AM
clangd/include-mapping/gen_std.py
60

I think this requires some corresponding changes in clangd/include-mapping/test.py

kadircet added inline comments.May 2 2019, 1:06 AM
clangd/include-mapping/gen_std.py
95

ok I see the reason for that one, https://en.cppreference.com/w/cpp/numeric/math/abs, NVM

sammccall updated this revision to Diff 197715.May 2 2019, 1:43 AM
sammccall marked 2 inline comments as done.

Update tests.

sammccall added inline comments.May 2 2019, 1:45 AM
clangd/include-mapping/gen_std.py
90

which does not contain floor as a word

\b matches between a word character and non-word character.
Both and ( are non-word characters, so we should be fine?

95

This isn't cleanup.

header1
header2
declX

declX is associated with header1 and header2

header1
declY
header2
declX

declX is associated with header2 only. We reset current_headers when we see header2 because it was preceded by a decl. I'll add a comment.

sammccall updated this revision to Diff 197716.May 2 2019, 1:45 AM

Add explanatory comment about headers reset.

Harbormaster completed remote builds in B31272: Diff 197716.
Harbormaster completed remote builds in B31273: Diff 197717.
kadircet accepted this revision.May 2 2019, 2:20 AM

Thanks!

This revision is now accepted and ready to land.May 2 2019, 2:20 AM
This revision was automatically updated to reflect the committed changes.