This is an archive of the discontinued LLVM Phabricator instance.

[clangd] Add <bits/ranges_algo.h> to header map
Needs ReviewPublic

Authored by falbrechtskirchinger on Jun 6 2022, 11:32 PM.

Details

Reviewers
sammccall
Summary

Header insertion inserts the internal <bits/ranges_algo.h> header when
completing, e.g., std::ranges::transform. By adding a mapping to the
<algorithm> header to the system header map, the expected header is
inserted.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptJun 6 2022, 11:32 PM
falbrechtskirchinger requested review of this revision.Jun 6 2022, 11:32 PM
nridge added a subscriber: nridge.Jun 10 2022, 1:48 AM

This change looks fine to me.

I wonder though if we should be a bit more systematic about it, and try to cover other newly added libstdc++ implementation headers?

I checked out gcc's git repository, and ran:

$ git diff --name-only --diff-filter=A releases/gcc-9.1.0 releases/gcc-10.1.0 | grep include/bits

which turned up several additional implementation headers newly added in gcc 10:

libstdc++-v3/include/bits/charconv.h
libstdc++-v3/include/bits/int_limits.h
libstdc++-v3/include/bits/iterator_concepts.h
libstdc++-v3/include/bits/range_cmp.h
libstdc++-v3/include/bits/ranges_algo.h
libstdc++-v3/include/bits/ranges_algobase.h
libstdc++-v3/include/bits/ranges_uninitialized.h

(There are other new ones in gcc 11, and some in older versions that we've missed, but I think handling a single gcc version would be a good scope for this patch.)

This change looks fine to me.

I wonder though if we should be a bit more systematic about it, and try to cover other newly added libstdc++ implementation headers?

Sure thing.

(There are other new ones in gcc 11, and some in older versions that we've missed, but I think handling a single gcc version would be a good scope for this patch.)

I don't mind doing a systematic review including older and newer versions. Give me a few days to find some spare time to do it.

falbrechtskirchinger added a comment.EditedJun 13 2022, 2:19 AM

Here's the preliminary list of mappings I've identified:

"bits/align.h", "<memory>"
"bits/atomic_timed_wait.h", "<semaphore>"
"bits/atomic_wait.h", "<atomic>"
"bits/boost_concept_check.h", "<numeric>"
"bits/charconv.h", "<charconv>"
"bits/chrono.h", "<chrono>"
"bits/cow_string.h", "<string>"
"bits/fs_dir.h", "<filesystem>"
"bits/fs_fwd.h", "<filesystem>"
"bits/fs_ops.h", "<filesystem>"
"bits/fs_path.h", "<filesystem>"
"bits/iterator_concepts.h", "<iterator>"
"bits/max_size_type.h", "<iterator>"
"bits/memory_resource.h", "<memory_resource>"
"bits/move_only_function.h", "<functional>"
"bits/ranges_algo.h", "<algorithm>"
"bits/ranges_algobase.h", "<algorithm>"
"bits/ranges_base.h", "<ranges>"
"bits/ranges_cmp.h", "<functional>"
"bits/ranges_uninitialized.h", "<memory>"
"bits/ranges_util.h", "<ranges>"
"bits/semaphore_base.h", "<semaphore>"
"bits/std_thread.h", "<thread>"
"bits/this_thread_sleep.h", "<thread>"
"bits/unique_lock.h", "<mutex>"
"bits/uses_allocator_args.h", "<memory>"
"bits/utility.h", "<utility>"

Furthermore, the following headers have no mapping:

// Not clearly associated
bits/c++0x_warning.h
bits/enable_special_members.h
bits/erase_if.h
bits/node_handle.h

// Not included anywhere
bits/mofunc_impl.h
bits/new_allocator.h
bits/specfun.h

The search was limited to bits/*.h in GCC trunk.
I've seen *.tcc files being mapped and have identified the following missing files:

bits/regex.tcc
bits/regex_automaton.tcc
bits/regex_compiler.tcc
bits/regex_executor.tcc
bits/regex_scanner.tcc
bits/string_view.tcc

Should these be added as well?

I'll push a new revision after I have a chance to review the list a second time.

This comment was removed by falbrechtskirchinger.

Add more missing headers to the system header map

Thanks for taking the time to do a systematic review!

bits/mofunc_impl.h

I see this included from bits/move_only_function.h, so I think <functional> would make sense for it.

bits/new_allocator.h

I see this included from <experimental/memory_resource>, we could add that. (There are a few more implementation headers in experimental/bits which are included by standard headers in <experimental/...> that we could consider adding.)

bits/specfun.h

I see this included from <cmath>.

I've seen *.tcc files being mapped and have identified the following missing files:

bits/regex.tcc
bits/regex_automaton.tcc
bits/regex_compiler.tcc
bits/regex_executor.tcc
bits/regex_scanner.tcc
bits/string_view.tcc

Should these be added as well?

I think we can skip these as they only contain definitions, and code completion should choose the file containing the declaration.

clang-tools-extra/clangd/index/CanonicalIncludes.cpp
197

The choice of <numeric> is pretty random here, as is already the case for bits/concept_check.h.

Given that these headers don't declare any standard symbols, only symbols which are pure libstdc++ implementation details, maybe we should just omit them from the list?

Apologies for not getting to this before vacation, and thanks Nathan for looking at this. (I'll leave for you to stamp)

This is fine with me as-is, but I think this mapping shouldn't be our *preferred* way to solve this problem, and should eventually go away.

We also have a mapping of symbol names to headers (StdSymbols.inc). This is obviously more portable, both in the sense that it works when editing using e.g. MS STL etc, and that the results don't reflect quirks of the stdlib you're using.
The reason this mapping fails for std::ranges::transform is that the mapping file was extracted from an old C++17 cppreference dump. The cppreference format has changed so to run it on a newer dump it'd need some updates.

clang-tools-extra/clangd/index/CanonicalIncludes.cpp
197

< Given that these headers don't declare any standard symbols, only symbols which are pure libstdc++ implementation details, maybe we should just omit them from the list?

+1

bits/mofunc_impl.h

I see this included from bits/move_only_function.h, so I think <functional> would make sense for it.

Right. I somehow erroneously concluded that bits/mofunc_impl.h was replaced by bits/move_only_function.h.

bits/new_allocator.h

I see this included from <experimental/memory_resource>, we could add that. (There are a few more implementation headers in experimental/bits which are included by standard headers in <experimental/...> that we could consider adding.)

I did see that and excluded <experimental/... initially. I can add those as well.

bits/specfun.h

I see this included from <cmath>.

I missed the C headers in general because I didn't realize that they're kept in a separate directory in the libstdc++ source tree. The difference between source and install tree has bitten me a few times.

I've seen *.tcc files being mapped and have identified the following missing files:
Should these be added as well?

I think we can skip these as they only contain definitions, and code completion should choose the file containing the declaration.

Then maybe we should just remove all *.tcc files instead? I may do so in a separate commit that can be discarded.

About some seemingly random mappings. There are more candidates. I'll handle those in a separate commit for easy review.

Apologies for not getting to this before vacation, and thanks Nathan for looking at this. (I'll leave for you to stamp)

This is fine with me as-is, but I think this mapping shouldn't be our *preferred* way to solve this problem, and should eventually go away.

We also have a mapping of symbol names to headers (StdSymbols.inc). This is obviously more portable, both in the sense that it works when editing using e.g. MS STL etc, and that the results don't reflect quirks of the stdlib you're using.
The reason this mapping fails for std::ranges::transform is that the mapping file was extracted from an old C++17 cppreference dump. The cppreference format has changed so to run it on a newer dump it'd need some updates.

With some additional pointers, I'd be happy to look into that next.

One more thing. What about extension headers (<ext/...>)?
The current mappings are both very incomplete and also very wrong, or so I'd argue.
For example, "ext/new_allocator.h" maps to <string> but should map to itself. It's not an internal header but a GNU extension to the C++ standard library.
I've identified and mapped about 50 headers in that category, but omitted the subdirectory <ext/pb_ds/...>.

$ find ext/pb_ds -type f | wc -l
243

That's a bit much for now.

I'll finish everything up as described within about a week.

What is the status of this patch -- is it still relevant, and if so are there plans to finish/merge it?

we can also handle them through the stdlib symbol mappings, see https://github.com/llvm/llvm-project/issues/61373