Page MenuHomePhabricator

[clangd] NFC: Use SmallVector<T> where possible

Authored by kbobyrev on Dec 7 2020, 2:18 PM.



SmallVector<T> with default size is now the recommended version (D92522).

Diff Detail

Event Timeline

kbobyrev created this revision.Dec 7 2020, 2:18 PM
kbobyrev requested review of this revision.Dec 7 2020, 2:18 PM

I've excluded the cases where the size does not seem arbitrary (e.g. 1/ code relying on sizeof(Something) == Value, e.g. dex payloads and CodeCompletion bundles). The rest looks fairly innocent but I don't know all the code around changed lines.

Not sure I'm a huge fan of this, Some of these cases the size is specified because that's the upper limit we typically expect the SmallVector to use.

Thanks for cleaning up!

Not sure I'm a huge fan of this, Some of these cases the size is specified because that's the upper limit we typically expect the SmallVector to use.

The idea behind the new LLVM-wide recommendation is that in principle this is true but in practice it's mostly false (at least, these estimates often aren't chosen carefully).

This seems true to me - we do try to guess a sensible value (when forced to choose something), but we probably shouldn't be specifying these where we wouldn't choose to e.g. reserve() in a std::vector.
There's some readability/maintenance cost and in most cases no performance difference at the level we typically care about.

I've flagged cases where it seems like it might matter - please do the same if you find any!


This is significant in size (around half a megabyte per the comment on ArgStripper). Rule is 24 bytes, so we're reducing N from 4 to 2 I believe.
Please leave this alone or check that it's a reduction in memory usage (if memory usage is roughly equal, fewer heap allocations is better).


I'd prefer to keep the 2 here for readability, it's a nice hint/reminder that we *always* return 1 or 2 values


This *always* allocates and the previous version *never* allocates, please revert this one.

330 ↗(On Diff #310021)

this is a reduction from 4 to 2, and this is fairly performance-sensitive code (for the preamble index).
My intuition is that not that many symbols are referenced 1-2 times across all TUs, and when many are, it's a tiny project where performance doesn't matter.
Guessing about performance is probably folly. But I'd probably avoid reducing this without measuring.

kbobyrev updated this revision to Diff 310232.Dec 8 2020, 8:31 AM
kbobyrev marked 4 inline comments as done.

Resolve review comments.

@sammccall, thanks for clarifying the cases you thought should be remaining as
they are!

sammccall accepted this revision.Dec 9 2020, 4:31 AM
This revision is now accepted and ready to land.Dec 9 2020, 4:31 AM
njames93 added inline comments.Dec 9 2020, 8:50 AM
100 ↗(On Diff #310232)

This should be using the same size as the vector below as they are guaranteed to be the same size. There may be an argument for merging them both into the same vector using a pair but obviously that wouldn't be part of this patch.


Looks like this is referring to how many bytes are in a line, having 128 seems like a good amount, most coding standards don't like lines longer than that. As a follow up refractor, this could be extracted out the loop to reuse the buffer on the case it does need to allocate.


A follow up refractor, this doesn't strictly need to be stored in a vector.

sammccall added inline comments.Dec 9 2020, 9:00 AM

Nit; This is rather the encoded data for all the highlighted tokens on a line of source code. Which appears to be a constant 8 bytes for token. So previously 16 tokens, now 32 tokens. Either seems fine to me given we don't highlight most punctuation I think.

(Incidentally, this is the old pre-standard semantic highlighting protocol that will go away soon, so I wouldn't worry too much about its performance)

kbobyrev updated this revision to Diff 310650.Dec 9 2020, 1:48 PM
kbobyrev marked an inline comment as done.

Revert the case flagged by @njames93.


Could you please elaborate? I don't really understand this comment.

kbobyrev added inline comments.Dec 10 2020, 4:36 AM

(D92986 for context)

This revision was landed with ongoing or failed builds.Dec 10 2020, 4:37 AM
This revision was automatically updated to reflect the committed changes.
kbobyrev added inline comments.Dec 10 2020, 4:37 AM

sigh, sorry, I forgot to remove it before pushing :(

arphaman added inline comments.Dec 16 2020, 10:31 AM

It looks like the change on this line is failing to compile on Ubuntu 16.04 with the System GCC. (

Can I reinstate the llvm::DenseMap<unsigned, SmallVector<unsigned, 8>> for now until we drop support for Ubuntu 16.04?

kbobyrev added inline comments.Dec 16 2020, 10:57 AM

Uh, didn't see that one, I'm sorry to hear that. Yes, sure, feel free to change it to make your setup work!

sammccall added inline comments.Dec 16 2020, 2:27 PM

From the error message, it looks like the declarations of clang::SmallVector in LLVM.h are missing the default template arguments (not sure why this is different in the swift CI).

Does changing to qualifying as llvm::SmallVector work here, rather than adding the explicit size? (We usually do this anyway as a matter of style)

arphaman added inline comments.Dec 16 2020, 6:44 PM

It looks like the GCC there might have a bug and is picking up the template<typename T, unsigned N> class SmallVector; in clang/include/clang/Basic/LLVM.h without taking the declaration from the LLVM headers into account. I'm not quite sure why it does that, but it seems to be fixed in GCC included in Ubuntu 18.

Should be fixed in 894c4761c67ac850e156a26aa427035a811d7aed, LMK if this workaround doesn't work