This is an archive of the discontinued LLVM Phabricator instance.

[libcxx] Almost fix some UB in <map> and <unordered_map>
ClosedPublic

Authored by erik.pilkington on May 31 2018, 3:00 PM.

Details

Summary

<map> and <unordered_map> define __value_type as a union between pair<K, V> and pair<const K, V> so that various operations can move into/from these pairs [1]. This is a pretty blatant strict aliasing violation, and has the potential break code that was optimized by TBAA. We can't remove this, not only would that be a significant pessimization, but we are also required to provide a non-const reference to the key_type to implement splicing maps & sets. This patch fixes __value_type by downgrading it from "accidentally launch the missiles"-level UB to theoretical UB, namely, instead of type-punning between pairs we just const_cast the key. It's still undefined to mutate a const object, but clang doesn't actually optimize on this rule. Just to be paranoid this patch also launder()s to "pick-up" the new value of key whenever we read from it, though this isn't launder's actual use-case. Thanks to @rsmith for the suggestions!

To do this, this patch:

  • removes the __nc_value_type data member of the __value_type union
  • provides access to the pair with pair<key_type&, mapped_type&>, which is used to assign through to the pair
  • changes __value_type.__cc to __value_type.__get_value(), the latter calls std::launder in C++1z mode.

https://reviews.llvm.org/D46845 depends on this patch.

[1]: https://stackoverflow.com/a/31667355

Thanks for taking a look!
Erik

Diff Detail

Repository
rL LLVM

Event Timeline

rsmith added inline comments.May 31 2018, 6:28 PM
libcxx/include/map
617 ↗(On Diff #149358)

This doesn't need to be a union any more; change to class?

633 ↗(On Diff #149358)

Formally this should be return *_VSTD::launder(&this->__cc);, because this must already point to an in-lifetime object or the member function call would have been invalid.

Overall this change seems reasonable to me. Thanks for working on this. Initially I was concerned we would hit issues optimizing const objects,
but I should have read your description first! Thanks for ensuring Clang doesn't optimize on this.

One other concern I have is if there will be any performance impact to using launder all the time, but better safe than sorry.

Other than the inline comments, this LGTM.

libcxx/include/map
633 ↗(On Diff #149358)

return *_VSTD::launder(_VSTD::addressof(this->__cc)); because ADL.

648 ↗(On Diff #149358)

I think __ref can be private.

653 ↗(On Diff #149358)

Add a newline?

708 ↗(On Diff #149358)

_LIBCPP_INLINE_VISIBILITY here and below it.

libcxx/include/unordered_map
586 ↗(On Diff #149358)

union -> class

597 ↗(On Diff #149358)

_LIBCPP_INLINE_VISIBILITY

606 ↗(On Diff #149358)

_LIBCPP_INLINE_VISIBILITY

615 ↗(On Diff #149358)

This can be private.

621 ↗(On Diff #149358)

_LIBCPP_INLINE_VISIBILITY

677 ↗(On Diff #149358)

_LIBCPP_INLINE_VISIBILITY

I should have asked, have we actually seen a midcompile caused by this? Is there a reproducer? Or is this purerly speculative?

erik.pilkington marked 10 inline comments as done.

Address review comments. Thanks!

I should have asked, have we actually seen a midcompile caused by this? Is there a reproducer? Or is this purerly speculative?

Nope, pure speculation. I still think we should still fix this though.

libcxx/include/map
648 ↗(On Diff #149358)

That's true for this patch, but I'm planning on using __ref in __node_handle to implement key() and mapped() so we don't have to duplicate the const cast.

rsmith added a comment.Jun 1 2018, 3:36 PM

One other concern I have is if there will be any performance impact to using launder all the time, but better safe than sorry.

In the short-to-medium term, I think we can tackle this by adding an attribute:

// ...
struct pair {
  T first;
  [[clang::not_actually_const]] U second;
};

... to directly express that the second field of a non-const pair is never a const subobject, even if U is a const-qualified type. I think we'll need this once we head down the path of constexpr containers, otherwise node_handles are not going to work in constant expression evaluation.

Then we can remove the launder calls for implementations that support the attribute.

This revision was not accepted when it landed; it landed in state Needs Review.Jun 4 2018, 1:42 PM
This revision was automatically updated to reflect the committed changes.