Page MenuHomePhabricator

ADT: Fix reference invalidation in SmallVector::emplace_back and the size+value version of SmallVector::assign

Authored by dexonsmith on Jan 14 2021, 6:42 PM.



Fix the final (I think?) reference invalidation in SmallVector that
we need to fix to align with std::vector. (There is still some left in
the range insert / append / assign, but the standard calls that UB for
std::vector so I think we don't care?)

Avoid reference invalidation during SmallVector::emplace_back when one
of the arguments is an internal reference to storage and the vector has
to grow.

For POD-like types, reimplement emplace_back in terms of push_back.

For other types, split the grow operation in three and construct the new
element in the middle.

Also reimplement SmallVector:assign(size_type, const T&) to avoid
invalidating the argument during clear() when it's an internal

There is a minor semantic change when not growing. Previously, assign
would start with clear(), and so the old elements were destructed and
all elements of the new vector were copy-constructed. The new
implementation skips destruction and uses copy-assignment for the prefix
of the new vector less than size(). The new semantics match what
libc++ does for std::vector::assign.

Note that the following is another possible implementation:

void assign(size_type NumElts, ValueParamT Elt) {
  std::fill_n(this->begin(), std::min(NumElts, this->size()), Elt);
  this->resize(NumElts, Elt);

The downside of this is that if the vector has to grow there will be
size() redundant copy operations.

I should probably split this patch up into three for committing:

  1. NFC refactoring to split up grow().
  2. Fix emplace_back().
  3. Fix assign().

... but I wanted to start the review all together since the patches
will be interdependent.

Diff Detail

Event Timeline

dexonsmith created this revision.Jan 14 2021, 6:42 PM
dexonsmith requested review of this revision.Jan 14 2021, 6:42 PM
Herald added a project: Restricted Project. · View Herald TranscriptJan 14 2021, 6:42 PM
dexonsmith planned changes to this revision.Jan 15 2021, 8:18 AM

Note that this is a follow-up to (and the patches split out from it), and an alternative implementation for

I still need to do compile-time measurements; feel free to delay review until then.

The main concern I have (without having run numbers) is POD-like version of emplace_back (which now calls push_back). For large T where TakesParamByValue is false, this is now making a copy and I could imagine a slowdown. The other primary option is for emplace_back to use a split grow operation, but then it won't be able to use realloc()... I suspect the copy is less bad than dropping the realloc.

Hmm... or I could update the POD-like !TakesParamByValue to only copy when growing (only when it's useful). Marking this as "Plan Changes" to try implementing that before collecting compile-time.

Simplified emplace_back and addressed the concern I had with copying large POD-like types too often.

  • Moved it back to SmallVectorImpl, only delegating to SmallVectorTemplateBase for growAndEmplaceBack().
  • Now it only defers to push_back when growing (still just for POD-like types).
  • Also improved the comment to explain why the copy makes sense even for large POD-like types: that it's better than losing the realloc optimization.

See also, which potentially speeds up push_back in a minor way for small POD-like types.

dexonsmith added a subscriber: nikic.

I did a spot check of code size / compile-time, capturing clang's binary size in KiB (78460KiB => 78484KiB) and 5 runs of compiling VirtualFileSystem.cpp.o (looks like it's in the noise):

78460	baseline/build/bin/clang
        3.84 real         3.74 user         0.09 sys
        3.85 real         3.75 user         0.09 sys
        3.83 real         3.73 user         0.09 sys
        3.84 real         3.73 user         0.10 sys
        3.84 real         3.74 user         0.09 sys
78484	invalidation/build/bin/clang
        3.83 real         3.73 user         0.09 sys
        3.87 real         3.77 user         0.09 sys
        3.83 real         3.73 user         0.09 sys
        3.83 real         3.72 user         0.10 sys
        3.90 real         3.80 user         0.09 sys

"baseline" is ceaf0110ff5e0c2de1f03d65d13703d34d0d5737, and "invalidation" has this patch applied on top.

I also checked a big file, X86ISelLowering.cpp.o:

78460	baseline/build/bin/clang
       31.05 real        30.70 user         0.34 sys
       30.83 real        30.46 user         0.36 sys
       30.94 real        30.59 user         0.34 sys
       31.15 real        30.76 user         0.38 sys
       30.80 real        30.43 user         0.36 sys
78484	invalidation/build/bin/clang
       31.67 real        30.71 user         0.37 sys
       30.90 real        30.53 user         0.36 sys
       31.53 real        31.16 user         0.35 sys
       31.30 real        30.93 user         0.36 sys
       31.01 real        30.65 user         0.34 sys

Best of 5 gives 30.43 vs. 30.53, which is less than 0.4%; not sure if that's significant (certainly my environment isn't stabilized, so it's not a high quality result).

@nikic, maybe I can trigger your compile-time tracker myself (I just sent you an email), but if that doesn't work out, any chance you could kick off a perf run for this patch as well?

Remove some unnecessary templates in SmallVector.cpp. I just realized I hadn't uploaded this version of the patch, which is what I ran the perf on.

[...] maybe I can trigger [the] compile-time tracker myself [...]

Done (thanks @nikic for the help!); here are the results:

Looks mostly in the noise to me, with minor improvements and regressions on a few metrics (vaguely more improvements than regressions). IMO, this is good to go from a compile-time perspective.

dblaikie accepted this revision.Jan 19 2021, 5:26 PM

Haven't reviewed the logic in detail - but generally looks about right to me.

This revision is now accepted and ready to land.Jan 19 2021, 5:26 PM

Thanks for the review; I'm holding off committing this for now because of (I must not have run all of check-llvm before).