This is an archive of the discontinued LLVM Phabricator instance.

ADT: Remove UB in ilist (and use a circular linked list)
ClosedPublic

Authored by dexonsmith on Aug 12 2016, 2:07 PM.

Details

Summary

This removes the undefined behaviour (UB) in ilist/ilist_node/etc.,
mainly by removing (gutting) the ilist_sentinel_traits customization
point and canonicalizing on a single, efficient memory layout.

The new ilist is a doubly-linked circular list.

  • ilist_node_base has two ilist_node_base*: Next and Prev. Size-of: two pointers.
  • ilist_node<T> (size-of: two pointers) is a type-safe wrapper around ilist_node_base.
  • ilist_iterator<T> (size-of: two pointers) operates on an ilist_node<T>*, and downcasts to T* on dereference.
  • ilist_sentinel<T> (size-of: two pointers) is a wrapper around ilist_node<T> that has some extra API for list management.
  • ilist<T> (size-of: two pointers) has an ilist_sentinel<T>, whose address is returned for end().

The new memory layout matches ilist_half_embedded_sentinel_traits<T>
exactly. The Head pointer that previously lived in ilist<T> is
effectively glued to the ilist_half_node<T> that lived in
ilist_half_embedded_sentinel_traits<T>, becoming the Next and Prev in
the ilist_sentinel_node<T>, respectively. sizeof(ilist<T>) is now the
size of two pointers, and there is never any additional storage for a
sentinel.

This is a much simpler design for a doubly-linked list, removing most of
the corner cases of list manipulation (add, remove, etc.). In follow-up
commits, I intend to move as many algorithms as possible into a
non-templated base class (ilist_base) to reduce code size.

Moreover, this fixes the UB in ilist_iterator/getNext/getPrev
operations. Previously, ilist_iterator<T> operated on a T*, even when
the sentinel was not of type T (i.e., ilist_embedded_sentinel_traits and
ilist_half_embedded_sentinel_traits). This added UB to all operations
involving end(). Now, ilist_iterator<T> operates on an ilist_node<T>*,
and only downcasts when the full type is guaranteed to be T*.

What did we lose? There used to be a crash (in some configurations) on
++end(). Curiously (via UB), ++end() would return begin() for users of
ilist_half_embedded_sentinel_traits<T>, but otherwise ++end() would
cause a nice dependable nullptr dereference, crashing instead of a
possible infinite loop. Options:

  1. Lose that behaviour.
  2. Keep it, by stealing a bit from Prev in asserts builds.
  3. Crash on dereference instead, using the same technique.

Hans convinced me (because of the number of problems this and r278532
exposed on Windows) that we really need some assertion here, at least in
the short term. I've opted for #3 since I think it catches more bugs.

I added only a couple of unit tests to root out specific bugs I hit
during bring-up, but otherwise this is tested implicitly via the
extensive usage throughout LLVM.

Planned follow-ups:

  • Remove ilist_*sentinel_traits<T>. Here I've just gutted them to prevent build failures in sub-projects. Once I stop referring to them in sub-projects, I'll come back and delete them.
  • Add ilist_base and move algorithms there.

Eventually, there are other interesting directions:

  • Remove ilist_traits::createNode, by deleting the remaining API that creates nodes. Intrusive lists shouldn't be creating nodes themselves.
  • Remove ilist_traits::deleteNode, by (1) asserting that lists are empty on destruction and (2) changing API that calls it to take a Deleter functor (intrusive lists shouldn't be in the memory management business).
  • Reconfigure the remaining callback traits (addNodeToList, etc.) to be higher-level, pulling out a simple_ilist<T> that is much easier to read and understand.
  • Allow tags (e.g., ilist_node<T,tag1> and ilist_node<T,tag2>) so that T can be a member of multiple intrusive lists.

Diff Detail

Event Timeline

dexonsmith updated this revision to Diff 67904.Aug 12 2016, 2:07 PM
dexonsmith retitled this revision from to ADT: Remove UB in ilist (and use a circular linked list).
dexonsmith updated this object.
dexonsmith added a subscriber: llvm-commits.

Another direction I plan to pursue rather quickly is "fixing" reverse iterators, so that rbegin().getNodePtr()==&*rbegin(). This allows much simpler logic when erasing elements during a reverse traversal.

mehdi_amini accepted this revision.Aug 12 2016, 3:36 PM
mehdi_amini edited edge metadata.

LGTM overall, with a few inline comments that you should look at.

include/llvm/ADT/ilist.h
17–18

Stalled comment about createSentinel()?

26–28

It seems to be one of the main change in the model: before it was possible to create a list from something that would not inherit from ilist_node right?
I guess we don't lose much...

236

Stalled assert (at least assert message)?

323–324

Could be removed?

404–405

I'm not sure this is true, ++it will just work fine, but it will be in a "default constructed" state (NodePtr being a nullptr).
Only ilist_iterator &operator--() is asserting right now (incidentally, see previous comment).

470

Are these different from first and last?

This revision is now accepted and ready to land.Aug 12 2016, 3:36 PM
dexonsmith updated this revision to Diff 67964.Aug 13 2016, 2:52 PM
dexonsmith added a reviewer: hans.

Updated to address comments from Justin and Mehdi, but I'll wait to commit until Monday. Hans volunteered to run tests on Windows pre-commit. r278532 (up to r278539) somehow exposed a number of bugs in the backends (see r278572, r278577, and r278587), and I suspect this could expose others.

hans edited edge metadata.Aug 15 2016, 1:06 PM

I tried this out on Windows.

Some tests are currently red with ToT: http://lab.llvm.org:8011/builders/clang-x86-win2008-selfhost/builds/9727
On my machine, only CodeGen/AMDGPU/amdgpu.private-memory.ll fails.

When I apply this patch, CodeGen/AMDGPU/predicate-dp4.ll starts hanging, but that's the only change I get.

dexonsmith updated this revision to Diff 68256.Aug 16 2016, 2:53 PM
dexonsmith updated this object.
dexonsmith edited edge metadata.

Hans ran the tests yesterday on Windows, and an end() dereference manifested as an infinite loop. He has convinced me we really need some assertions to root out long-standing bugs that the optimizers may suddenly be exploiting. Hans suggested asserting in operator*() and that makes sense to me.

  • I'm updating the summary here on Phab to match my working commit message (options #1, #2 and #3 have changed).
  • The updated patch asserts that the sentinel is never dereferenced (it uses a PointerIntPair to pack IsSentinel into Prev).
  • I'm working on fixing the (many, many) crashes that this gives when running 'check'.
  • I'll post back here when 'check' and 'check-clang' are clean. At that point I'll try a bootstrap of 'clang' on Darwin/x86_64, since this is exposing a lot of bugs and I don't entirely trust our lit-test coverage. Volunteers for other configurations welcome! (Otherwise I'll leave it to the bots.)
hans added a comment.Aug 16 2016, 3:37 PM
  • I'll post back here when 'check' and 'check-clang' are clean. At that point I'll try a bootstrap of 'clang' on Darwin/x86_64, since this is exposing a lot of bugs and I don't entirely trust our lit-test coverage. Volunteers for other configurations welcome! (Otherwise I'll leave it to the bots.)

I'll be happy to test a Linux bootstrap and on Chromium.

Thanks Hans!

'check' is clean on Darwin as of r278887. 'check-lld' is also clean. I'm currently running 'check-clang' on one box and optimistically running 'tools/clang/cmake/caches/3-stage.cmake' to run 'stage3-clang' on another in parallel (llvm/clang/compiler-rt/libcxx at r278887 + this patch, using -DCMAKE_BUILD_TYPE=RelWithDebInfo -DLLVM_ENABLE_ASSERTIONS=ON). Assuming everything just works I hope to commit tomorrow, but I'll wait to hear back from Hans.

'check-clang' failed one test. As of r278898, it's also clean.

hans added a comment.Aug 17 2016, 10:35 AM

'check-clang' failed one test. As of r278898, it's also clean.

I applied your patch on Windows and ran llvm and clang's lit tests and a chromium build with no issues :-)

dexonsmith closed this revision.Aug 17 2016, 1:56 PM

Committed in r278974!