This is an archive of the discontinued LLVM Phabricator instance.

[Tooling/Inclusion] Add symbol mappings for `std::experimental::filesystem`
ClosedPublic

Authored by zyounan on Jan 29 2023, 1:48 AM.

Details

Summary

Clangd maintains a symbol map from standard library, in order to prevent
unexpected header/symbol leaks from internal files. (e.g. files under
bits/ for libstdc++) This symbol map was generated by a python script
that parses pages of offline cppreference archive. The script didn't
handle the case for std::experimental::, where most symbols are from
TS. It works well as symbols are directly laid out in the corresponding
header under experimental directory for most of time.

However, libstdc++'s implementation split symbols of TS FS into a few
header files located in experimental/bits. This would make the code
completion provide internal headers when we simply select the symbols.

There are slightly differences between TS FS and C++17 FS. Some
functions like system_complete was replaced by absolute and
relative-related operations were introduced later by another proposal.
Even so, all mainstream implementation are based on N4100, the final
filesystem TS draft that was published in 2014 and from which symbols
we've added are exported.

This fixes https://github.com/clangd/clangd/issues/1481

Diff Detail

Event Timeline

zyounan created this revision.Jan 29 2023, 1:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 29 2023, 1:48 AM
zyounan published this revision for review.Jan 29 2023, 1:54 AM
zyounan edited the summary of this revision. (Show Details)
zyounan added reviewers: hokein, sammccall, nridge.
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptJan 29 2023, 1:56 AM
zyounan updated this revision to Diff 493081.Jan 29 2023, 2:51 AM

Do not format inc file

thanks a lot for the patch, we're already having some efforts right now to improve our cppreference parsing and get to a more complete set of symbols (@VitaNuo for visibility, who drives these efforts).

Regarding the experimental symbols, first of all I am not sure if it's a good idea to include them in clangd (or other clang-tools) at all. As a clang-tool binary can stay around for years (e.g. people can use 3 years old clangd) and symbols under std::experimental are subject to change throughout that period, hence the information provided is going to be vastly different than what the truth is in that time window (also i don't know if they'll be "same" across libc++ vs libstdc++ implementations).
In addition to that, AFAICT there's no great index page to parse for technical-specs hence we might indeed be forced to manually curate these symbols and that's something we've been trying to avoid due to maintenance costs. In the case of technical-specs this cost will increase, because they're subject to change a lot more frequent than regular std symbols and value is even less as these are not used by majority of the developers.

So I am afraid this is not some extra complexity we can take in.

And to follow up, I definitely see the problem you're facing and it's something we'd like to address as well, just not in the way you proposed.

This falls under the bucket of "we might have symbols missing from our stdlib mappings and should try to compensate for that", e.g. we should probably try disabling include-insertion for such cases where we have a high confidence that the symbol is part of stdlib but we don't have an exact mapping in our symbol database to remedy the issue around inserting "wrong" includes.

This comment was removed by zyounan.
zyounan added a comment.EditedJan 29 2023, 11:13 PM

we're already having some efforts right now to improve our cppreference parsing and get to a more complete set of symbols (@VitaNuo for visibility, who drives these efforts)..

Yes, this is another issue I want to address, but you've already pointed out that the parser is improving.

The cppref archive page is not always up to date. As of now, the official version is still dumped in 2019, that means we're likely missing (partial) symbols from C++20/23, e.g. std::expected.

zyounan abandoned this revision.Jan 29 2023, 11:23 PM
zyounan updated this revision to Diff 496875.Feb 13 2023, 1:56 AM

Revise symbols from N4100

zyounan edited the summary of this revision. (Show Details)Feb 13 2023, 1:58 AM
zyounan added a reviewer: kadircet.

Thanks for updating this, as I mentioned in https://reviews.llvm.org/D143319#4115186, we should put these symbols into their own symbol map. ATM StdSymbolMap.inc is still generated by an automated tool and shouldn't be modified by hand.
So can you rather put these symbols into llvm/llvm-project/clang/lib/Tooling/Inclusions/Stdlib/StdTsSymbolMap.inc ? you also need to rebase your branch, we've moved these mappings from include directory to lib. they're implementation details now, and not public interfaces.
After putting it into a new file, you'll also need to include it in https://github.com/llvm/llvm-project/blob/main/clang/lib/Tooling/Inclusions/Stdlib/StandardLibrary.cpp#L65.

As for tests, rather than using clangd, can you introduce tests into StandardLibrary instead in https://github.com/llvm/llvm-project/blob/main/clang/unittests/Tooling/StandardLibraryTest.cpp ?

Sorry for the late update, I'm rebasing my branch now.

zyounan updated this revision to Diff 496934.Feb 13 2023, 4:51 AM

Rebase to main && Move tests to clang/Tooling

thanks a lot! mostly LG couple more nits

clang/lib/Tooling/Inclusions/Stdlib/StdTsSymbolMap.inc
1

i think comment is a little too verbose. can you just say:

These are derived from N4100[fs.filesystem.synopsis], final draft for experimental filesystem.

There's no need for mentioning that these became the standard in C++17 or being a cornerstone for stdlib implementations. As they won't necessarily be true for other technical specs nor if we were adding this pre-c++17 standardisation. But we'd still like to have these widely adapted symbols included in the mapping to make sure we're not generating false negatives.

6

can you strip clang-format pragmas to be similar to other mapping files.

clang/unittests/Tooling/StandardLibraryTest.cpp
83

another EXPECT_FALSE with Lang::C would also be important to make sure we're not adding these for C mappings by mistake.

91

can you also check for Symbol->headers()

93

i don't think there's much point in asserting these "meta" details about the mapping (same for the test below).

zyounan added inline comments.Feb 13 2023, 6:01 AM
clang/lib/Tooling/Inclusions/Stdlib/StdTsSymbolMap.inc
6

I'd love to if possible, but for some reason clang-format would add extra spaces before and after the slash, <experimental / filesystem>, which looks ugly and I don't expect it right.

kadircet added inline comments.Feb 13 2023, 6:08 AM
clang/lib/Tooling/Inclusions/Stdlib/StdTsSymbolMap.inc
6

i don't follow, why do you need to run clang-format on this file? symbols are already ordered

zyounan updated this revision to Diff 496963.Feb 13 2023, 6:44 AM
zyounan marked 4 inline comments as done.

Revise

clang/lib/Tooling/Inclusions/Stdlib/StdTsSymbolMap.inc
1

Well, I'm not quite sure if I should address these small differences between TS and C++17. But FWIW, the verbosity should go away.

clang/unittests/Tooling/StandardLibraryTest.cpp
93

They're used to emphasize the slight difference. They can be removed if you don't think it necessary :)

zyounan added inline comments.Feb 13 2023, 6:48 AM
clang/lib/Tooling/Inclusions/Stdlib/StdTsSymbolMap.inc
6

Specifically, it is git clang-format. The pipeline would fail if this command does modify some files IIRC.

zyounan marked an inline comment as not done.Feb 13 2023, 6:52 AM
kadircet accepted this revision.Feb 13 2023, 6:56 AM

thanks!

clang/lib/Tooling/Inclusions/Stdlib/StdTsSymbolMap.inc
6

not sure what workflow you're using but clang-format is not a mandatory part of the workflow.
recommended way is updating/uploading patches using arcanist, whose config only contains files with one of the cc|h|cpp extensions, see https://github.com/llvm/llvm-project/blob/main/.arclint.

anyways, happy to land this for you if it's not easy to disable clang-format linting on your workflow for whatever reason.

This revision is now accepted and ready to land.Feb 13 2023, 6:56 AM

Thank you for your detailed explanation and sorry again for my dumb mistake before. :)

clang/lib/Tooling/Inclusions/Stdlib/StdTsSymbolMap.inc
6

I've looked into the script of the running pipeline, it shows that it is running ${SRC}/scripts/premerge_checks.py --check-clang-format, which would [run git clang-format](https://github.com/google/llvm-premerge-checks/blob/abe5c8991b5e81f0182528ff8ce515ba89a66c0a/scripts/premerge_checks.py#L163). I'm using arcanist with default config, and I don't know if I'm missing something to skip this. (Perhaps it is because the rule you mentioned doesn't exclude .inc files, and we didn't write any slashes in such files either.)

zyounan added inline comments.Feb 13 2023, 7:20 AM
clang/lib/Tooling/Inclusions/Stdlib/StdTsSymbolMap.inc
6
zyounan updated this revision to Diff 496971.Feb 13 2023, 7:23 AM

Apply clang-format to StandardLibrary.cpp to fix pipeline failure

zyounan retitled this revision from [clangd] Add symbol mappings for `std::experimental::filesystem` to [Tooling/Inclusion] Add symbol mappings for `std::experimental::filesystem`.Feb 13 2023, 8:48 AM