This is an archive of the discontinued LLVM Phabricator instance.

[libcxx] Fix for ALL undefined behavior in <list>.
ClosedPublic

Authored by EricWF on Aug 24 2015, 3:09 PM.

Details

Summary

This patch fixes std::list for builtin pointer types in the current ABI version and fixes std::list for all fancy pointer types in the next ABI version. The patch was designed to minimize the amount of code needed to support both ABI configurations. Currently only ~5 lines of code differ.

Diff Detail

Event Timeline

EricWF updated this revision to Diff 33009.Aug 24 2015, 3:09 PM
EricWF retitled this revision from to [libcxx] ABI-Breaking Fix for ALL undefined behavior in <list>..
EricWF updated this object.
EricWF added reviewers: mclow.lists, danalbert, jroelofs.
EricWF added a subscriber: cfe-commits.
awi added a subscriber: awi.Aug 25 2015, 11:40 AM

Hello,
I like your patch very much. I looks similar to my patch from January, so I am confident it solves my problem, but it does so a little more elegantly at some places.
I tried it with my two sources files main.cpp and main2.cpp. While main.cpp works with or without your patch, main2.cpp fails with the current svn head, but succeeds with your patch.
After that I executed the libc++ test suite, using -std=c++1y as the language version. Here, a couple of tests failed to compile. This is not as bad as it may sound, though, because I could fix it by adding some missing calls to your new __list_node_base functions __as_base() and __as_node():

--- include/list	2015-08-25 20:35:00.769897378 +0200
+++ include/list.corr	2015-08-25 20:10:38.000000000 +0200
@@ -1362,9 +1362,9 @@
         __node_alloc_traits::construct(__na, _VSTD::addressof(__hold->__value_), __x);
         ++__ds;
 #if _LIBCPP_DEBUG_LEVEL >= 2
-        __r = iterator(__hold.get(), this);
+        __r = iterator(__hold.get()->__as_base(), this);
 #else
-        __r = iterator(__hold.get());
+        __r = iterator(__hold.get()->__as_base());
 #endif
         __hold.release();
         iterator __e = __r;
@@ -1376,7 +1376,7 @@
             {
                 __hold.reset(__node_alloc_traits::allocate(__na, 1));
                 __node_alloc_traits::construct(__na, _VSTD::addressof(__hold->__value_), __x);
-                __e.__ptr_->__next_ = __hold.get();
+                __e.__ptr_->__next_ = __hold.get()->__as_base();
                 __hold->__prev_ = __e.__ptr_;
                 __hold.release();
             }
@@ -1388,7 +1388,7 @@
             {
                 __node_alloc_traits::destroy(__na, _VSTD::addressof(*__e));
                 __node_base_pointer __prev = __e.__ptr_->__prev_;
-                __node_alloc_traits::deallocate(__na, __e.__ptr_, 1);
+                __node_alloc_traits::deallocate(__na, __e.__ptr_->__as_node(), 1);
                 if (__prev == 0)
                     break;
 #if _LIBCPP_DEBUG_LEVEL >= 2
EricWF updated this revision to Diff 33219.Aug 26 2015, 10:39 AM

Address failing test that @awi pointed out.

mclow.lists edited edge metadata.Aug 26 2015, 1:13 PM

So... what do we need to do to enumerate the "ABI breakage" of this patch?

I get that there's a couple of member variables that are changing type. Can they change (as a result of this patch) from a plain pointer to a fancy pointer (or vice versa)? Can they change size? Is this a header-only change (probably) with no need to ship a new dylib?

I've spent some time testing this patch to see if it causes ABI problem. Here is what I did:

  1. Add external instantiations of std::list into the libc++ dylib for the types used in the tests.
  2. Compile libcxx-old.dylib with the external instantations but without this patch.
  3. Apply this patch and compile libcxx-new.dylib.
  4. For each test, T, in std/containers/sequences/list do the following:
    1. Compile and link T against the patched headers and libcxx-new.dylib. Run T using libcxx-old.dylib.
    2. Compile and link T against the original headers and libcxx-old.dylib`. Run T using libcxx-new.dylib.

I didn't see any issues arise when doing this but that doesn't mean there are none. I'm going to ensure this method actually would catch
a ABI breakage so I can have some more faith in the results.

EricWF added inline comments.Dec 11 2015, 12:33 PM
include/list
198

Possible ABI break here.

EricWF updated this revision to Diff 42557.Dec 11 2015, 5:38 PM
EricWF retitled this revision from [libcxx] ABI-Breaking Fix for ALL undefined behavior in <list>. to [libcxx] Fix for ALL undefined behavior in <list>..
EricWF updated this object.
EricWF edited edge metadata.

Ping.

Originally I considered this change to be ABI breaking because we changed the type of the link pointers from __list_node_pointer to __list_node_base_pointer. However I now believe this change is ABI compatible in 99% of cases. I assert that this change is safe whenever __list_node_pointer and __list_node_base_pointer are builtin pointer types.

Unfortunately std::list supports fancy "class-like" pointers and we cannot guarantee that these fancy pointers will be ABI compatible in the same why builtin pointers are. Fancy pointers are free to have entirely different representations depending on the pointee type. For this reason I don't think we can fix std::list with fancy pointers until the next ABI version.

This patch fixes std::list for builtin pointer types in the current ABI version and fixes std::list for all fancy pointer types in the next ABI version. The patch was designed to minimize the amount of code needed to support both ABI configurations. Currently only ~5 lines of code differ.

The patch works by introducing a __link_pointer typedef that defines the type of pointers used to make the link list. __link_pointer is either __list_node_pointer or __list_node_base_pointer' depending on the ABI version. Instead of working with __node_pointer` or __node_base_pointer types in std::list we now only work with __link_pointer types except when we absolutely need to access the underlying node. When we need to convert between __node_pointer, __node_base_pointer or __link_pointer we use the conversion methods supplied defined in __list_node_base.

One problem with __link_pointer is that its easy to accidentally depend on it being defined as __node_pointer or __node_base_pointer. For example it would be incorrect to access __list_node::__value_ through a pointer declared as __link_pointer even though it will work when __link_pointer is an alias for __node_pointer. This code will compile

EricWF updated this revision to Diff 42612.Dec 11 2015, 5:39 PM

Update with correct patch this time.

mclow.lists accepted this revision.Dec 30 2015, 11:44 AM
mclow.lists edited edge metadata.

Minor changes; other than that, LGTM.

include/__config
43 ↗(On Diff #42612)

I would make this macro less generic than "fix". Maybe something like _LIBCPP_ABI_LIST_REMOVE_UB_NODEPTR

include/list
690

Do you really need __n here?
How about:

__node_pointer __np = __f->__as_node();
__f = __f->__next_;
This revision is now accepted and ready to land.Dec 30 2015, 11:44 AM
EricWF updated this revision to Diff 43802.Dec 30 2015, 1:00 PM
EricWF edited edge metadata.
  • Change ABI macro name to _LIBCPP_ABI_LIST_REMOVE_NODE_POINTER_UB.
  • Remove unnecessary variable pointed out in review.
  • Rename __node_pointer_traits to __list_node_pointer_traits to prevent name collisions.
EricWF closed this revision.Dec 30 2015, 1:01 PM