This is an archive of the discontinued LLVM Phabricator instance.

[clangd] NFC: Use SmallVector<T> where possible
ClosedPublic

Authored by kbobyrev on Dec 7 2020, 2:18 PM.

Details

Summary

SmallVector<T> with default size is now the recommended version (D92522).

Diff Detail

Event Timeline

kbobyrev created this revision.Dec 7 2020, 2:18 PM
kbobyrev requested review of this revision.Dec 7 2020, 2:18 PM

I've excluded the cases where the size does not seem arbitrary (e.g. 1/ code relying on sizeof(Something) == Value, e.g. dex payloads and CodeCompletion bundles). The rest looks fairly innocent but I don't know all the code around changed lines.

Not sure I'm a huge fan of this, Some of these cases the size is specified because that's the upper limit we typically expect the SmallVector to use.

Thanks for cleaning up!

Not sure I'm a huge fan of this, Some of these cases the size is specified because that's the upper limit we typically expect the SmallVector to use.

The idea behind the new LLVM-wide recommendation is that in principle this is true but in practice it's mostly false (at least, these estimates often aren't chosen carefully).

This seems true to me - we do try to guess a sensible value (when forced to choose something), but we probably shouldn't be specifying these where we wouldn't choose to e.g. reserve() in a std::vector.
There's some readability/maintenance cost and in most cases no performance difference at the level we typically care about.

I've flagged cases where it seems like it might matter - please do the same if you find any!

clang-tools-extra/clangd/CompileCommands.cpp
334

This is significant in size (around half a megabyte per the comment on ArgStripper). Rule is 24 bytes, so we're reducing N from 4 to 2 I believe.
Please leave this alone or check that it's a reduction in memory usage (if memory usage is roughly equal, fewer heap allocations is better).

clang-tools-extra/clangd/Selection.cpp
784

I'd prefer to keep the 2 here for readability, it's a nice hint/reminder that we *always* return 1 or 2 values

clang-tools-extra/clangd/TUScheduler.cpp
1414

This *always* allocates and the previous version *never* allocates, please revert this one.

clang-tools-extra/clangd/index/FileIndex.cpp
330 ↗(On Diff #310021)

this is a reduction from 4 to 2, and this is fairly performance-sensitive code (for the preamble index).
My intuition is that not that many symbols are referenced 1-2 times across all TUs, and when many are, it's a tiny project where performance doesn't matter.
Guessing about performance is probably folly. But I'd probably avoid reducing this without measuring.

kbobyrev updated this revision to Diff 310232.Dec 8 2020, 8:31 AM
kbobyrev marked 4 inline comments as done.

Resolve review comments.

@sammccall, thanks for clarifying the cases you thought should be remaining as
they are!

sammccall accepted this revision.Dec 9 2020, 4:31 AM
This revision is now accepted and ready to land.Dec 9 2020, 4:31 AM
njames93 added inline comments.Dec 9 2020, 8:50 AM
clang-tools-extra/clangd/ConfigProvider.cpp
100

This should be using the same size as the vector below as they are guaranteed to be the same size. There may be an argument for merging them both into the same vector using a pair but obviously that wouldn't be part of this patch.

clang-tools-extra/clangd/SemanticHighlighting.cpp
608

Looks like this is referring to how many bytes are in a line, having 128 seems like a good amount, most coding standards don't like lines longer than that. As a follow up refractor, this could be extracted out the loop to reuse the buffer on the case it does need to allocate.

clang-tools-extra/clangd/unittests/TestTU.cpp
190

A follow up refractor, this doesn't strictly need to be stored in a vector.

sammccall added inline comments.Dec 9 2020, 9:00 AM
clang-tools-extra/clangd/SemanticHighlighting.cpp
608

Nit; This is rather the encoded data for all the highlighted tokens on a line of source code. Which appears to be a constant 8 bytes for token. So previously 16 tokens, now 32 tokens. Either seems fine to me given we don't highlight most punctuation I think.

(Incidentally, this is the old pre-standard semantic highlighting protocol that will go away soon, so I wouldn't worry too much about its performance)

kbobyrev updated this revision to Diff 310650.Dec 9 2020, 1:48 PM
kbobyrev marked an inline comment as done.

Revert the case flagged by @njames93.

clang-tools-extra/clangd/unittests/TestTU.cpp
190

Could you please elaborate? I don't really understand this comment.

kbobyrev added inline comments.Dec 10 2020, 4:36 AM
clang-tools-extra/clangd/unittests/TestTU.cpp
190

(D92986 for context)

This revision was landed with ongoing or failed builds.Dec 10 2020, 4:37 AM
This revision was automatically updated to reflect the committed changes.
kbobyrev added inline comments.Dec 10 2020, 4:37 AM
clang-tools-extra/clangd/unittests/TestTU.cpp
190

sigh, sorry, I forgot to remove it before pushing :(

arphaman added inline comments.Dec 16 2020, 10:31 AM
clang-tools-extra/clangd/Headers.h
139

It looks like the change on this line is failing to compile on Ubuntu 16.04 with the System GCC. (https://ci.swift.org/job/oss-swift-incremental-RA-linux-ubuntu-16_04-next/23655/console).

Can I reinstate the llvm::DenseMap<unsigned, SmallVector<unsigned, 8>> for now until we drop support for Ubuntu 16.04?

kbobyrev added inline comments.Dec 16 2020, 10:57 AM
clang-tools-extra/clangd/Headers.h
139

Uh, didn't see that one, I'm sorry to hear that. Yes, sure, feel free to change it to make your setup work!

sammccall added inline comments.Dec 16 2020, 2:27 PM
clang-tools-extra/clangd/Headers.h
139

From the error message, it looks like the declarations of clang::SmallVector in LLVM.h are missing the default template arguments (not sure why this is different in the swift CI).

Does changing to qualifying as llvm::SmallVector work here, rather than adding the explicit size? (We usually do this anyway as a matter of style)

arphaman added inline comments.Dec 16 2020, 6:44 PM
clang-tools-extra/clangd/Headers.h
139

It looks like the GCC there might have a bug and is picking up the template<typename T, unsigned N> class SmallVector; in clang/include/clang/Basic/LLVM.h without taking the declaration from the LLVM headers into account. I'm not quite sure why it does that, but it seems to be fixed in GCC included in Ubuntu 18.

Should be fixed in 894c4761c67ac850e156a26aa427035a811d7aed, LMK if this workaround doesn't work