This is an archive of the discontinued LLVM Phabricator instance.

ADT: Give ilist<T>::reverse_iterator a handle to the current node
ClosedPublic

Authored by dexonsmith on Aug 23 2016, 10:30 AM.

Details

Reviewers
dexonsmith
Summary

Reverse iterators to doubly-linked lists can be simpler (and cheaper)
than std::reverse_iterator. Make it so.

In particular, change ilist<T>::reverse_iterator so that it is *never*
invalidated unless the node it references is deleted. This matches the
guarantees of ilist<T>::iterator.

(Note: MachineBasicBlock::iterator is *not* an ilist iterator, but a
MachineInstrBundleIterator<MachineInstr>. This commit does not change
MachineBasicBlock::reverse_iterator, but it does update
MachineBasicBlock::reverse_instr_iterator. See note at end of commit
message for details on bundle iterators.)

Given the list (with the Sentinel showing twice for simplicity):

[Sentinel] <-> A <-> B <-> [Sentinel]

the following is now true:

  1. begin() represents A.
  2. begin() holds the pointer for A.
  3. end() represents [Sentinel].
  4. end() holds the poitner for [Sentinel].
  5. rbegin() represents B.
  6. rbegin() holds the pointer for B.
  7. rend() represents [Sentinel].
  8. rend() holds the pointer for [Sentinel].

The changes are #6 and #8. Here are some properties from the old
scheme (which used std::reverse_iterator):

  • rbegin() held the pointer for [Sentinel] and rend() held the pointer for A;
  • operator*() cost two dereferences instead of one;
  • converting from a valid iterator to its valid reverse_iterator involved a confusing increment; and
  • "RI++->erase()" left RI invalid. The unintuitive replacement was "RI->erase(), RE = end()".

With vector-like data structures these properties are hard to avoid
(since past-the-beginning is not a valid pointer), and don't impose a
real cost (since there's still only one dereference, and all iterators
are invalidated on erase). But with lists, this was a poor design.

Specifically, the following code (which obviously works with normal
iterators) now works with ilist::reverse_iterator as well:

for (auto RI = L.rbegin(), RE = L.rend(); RI != RE;)
  fooThatMightRemoveArgFromList(*RI++);

Converting between iterator and reverse_iterator for the same node uses
the getReverse() function.

reverse_iterator iterator::getReverse();
iterator reverse_iterator::getReverse();

Why doesn't iterator <=> reverse_iterator conversion use constructors?

In order to catch and update old code, reverse_iterator does not even
have an explicit conversion from iterator. It wouldn't be safe because
there would be no reasonable way to catch all the bugs from the changed
semantic (see the changes at call sites that are part of this patch).

Old code used this API:

std::reverse_iterator::reverse_iterator(iterator);
iterator std::reverse_iterator::base();

Here's how to update from old code to new (that incorporates the
semantic change), assuming I is an ilist<>::iterator and RI is an
ilist<>::reverse_iterator:

          [Old]         ==>          [New]
  reverse_iterator(I)         ++I.getReverse()
--reverse_iterator(I)           I.getReverse()
  reverse_iterator(++I)         I.getReverse()
        RI.base()            ++RI.getReverse()
      --RI.base()              RI.getReverse()
    (++RI).base()              RI.getReverse()
delete &*RI, RE = end()         delete &*RI++
RI->erase(), RE = end()         RI++->erase()

Note: bundle iterators are out of scope

MachineBasicBlock::iterator, also known as
MachineInstrBundleIterator<MachineInstr>, is a wrapper to represent
MachineInstr bundles. The idea is that each operator++ takes you to the
beginning of the next bundle. Implementing a sane reverse iterator for
this is harder than ilist. Here are the options:

  • Use std::reverse_iterator<MBB::i>. Store a handle to the beginning of the next bundle. A call to operator*() runs a loop (usually operator--() will be called 1 time, for unbundled instructions). Increment/decrement just works. This is the status quo.
  • Store a handle to the final node in the bundle. A call to operator*() still runs a loop, but it iterates one time fewer (usually operator--() will be called 0 times, for unbundled instructions). Increment/decrement just works.
  • Store a handle to the first node in the bundle. A call to operator*() just works. However, reverse_iterator::operator++() (moving the handle back one in the list) requires checking against end() before incrementing (not too hard with MachineInstr::getParent).

I tried implementing this last option (and I may follow up with a patch
that finishes it). Updating iterator/reverse_iterator conversion call
sites was error-prone, though. With ilist::iterator/reverse_iterator,
++end().getReverse() (i.e., ++before_rbegin()) gets you rbegin(),
through the magic of circular lists. Symmetrically,
++rend().getReverse() (i.e., ++before_begin()) gives you begin(). With
MachineBasicBlock::iterator/reverse_iterator, ++before_begin() is not
safe, since we need MachineInstr::getParent to be valid to know whether
the handle is dereferenceable.

Another fix is a fourth option that is out-of-scope for this commit:

  • Make the ilist_sentinel<MachineInstr> *always* store that it's the sentinel (instead of just in asserts mode). Then the bundle iterator can sniff the sentinel bit in operator++() so that ++before_rbegin() and ++before_begin() both work.

Diff Detail

Event Timeline

dexonsmith retitled this revision from to ADT: Give ilist<T>::reverse_iterator a handle to the current node.
dexonsmith updated this object.
dexonsmith added a subscriber: llvm-commits.
dexonsmith updated this object.Aug 23 2016, 4:59 PM

Justin Bogner pointed out offline that my code update table was wrong (backwards) for reverse_iterator=>iterator conversions. I fixed the table both on Phab and in my local commit message to this table:

          [Old]         ==>          [New]
  reverse_iterator(I)         ++I.getReverse()
--reverse_iterator(I)           I.getReverse()
  reverse_iterator(++I)         I.getReverse()
        RI.base()            ++RI.getReverse()
      --RI.base()              RI.getReverse()
    (++RI).base()              RI.getReverse()
delete &*RI, RE = end()         delete &*RI++
RI->erase(), RE = end()         RI++->erase()
dexonsmith updated this revision to Diff 69059.Aug 23 2016, 5:32 PM

Updated to fix four compile errors in Lanai after r279498 turned it non-experimental!

dexonsmith accepted this revision.Aug 29 2016, 5:23 PM
dexonsmith added a reviewer: dexonsmith.

Thanks for the review! Committed in r280032.

This revision is now accepted and ready to land.Aug 29 2016, 5:23 PM
dexonsmith closed this revision.Aug 29 2016, 5:23 PM