This is an archive of the discontinued LLVM Phabricator instance.

[pseudo] WIP: GSS node refcounting (dumb pointers)
Needs ReviewPublic

Authored by sammccall on May 24 2022, 3:31 PM.

Details

Reviewers
hokein
Summary

This adds a refcount to GSS::Node, and uses a freelist to reuse nodes rather
than reallocating.

The freelist works well (on AST.cpp, 35K nodes created but only 74 allocated),
but it's not the only point here. Tracking the lifetime of GSS nodes is going to
be useful for error handling: when a node dies having never been successfully
reduced, we can capture it to run recovery on it.

This version of the patch continues to use Node*, with calls to GSS::ref() and
GSS::unref() to manipulate the count. I don't think this is workable - it
took me a depressingly long time to track down the bugs, and the code is hard
to reason about.
I think we want a smart pointer here to make the code clearer. Unfortunately
this is a bit tricky: the destructor needs the GSS to do the destruction, but
we don't want to pay the cost of storing an extra pointer to the GSS everywhere.
I can't think of a better alternative than (ick) thread-local storage.

Diff Detail

Event Timeline

sammccall created this revision.May 24 2022, 3:31 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 24 2022, 3:31 PM
Herald added a subscriber: mgrang. · View Herald Transcript
sammccall requested review of this revision.May 24 2022, 3:31 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 24 2022, 3:31 PM
hokein added inline comments.May 30 2022, 11:49 AM
clang-tools-extra/pseudo/lib/GLR.cpp
118

The freelist works well (on AST.cpp, 35K nodes created but only 74 allocated),

The number of saving is quite promising! But as discussed, it didn't speed up the parser (mostly we're paying extra cost of updating nodes in dead branches, causing ~10% slowdown).

Wanted to explore an alternative (which we discussed briefly) -- for errory-recovery, the key point is that when there is no active head, we'd like to find a recovery state (retrieved from the most-recent live heads) and continue the parsing from there.

This main parsing for-loop can provide a global view of active heads:

  • we can assume here the NewHeads is not empty (for-loop body always starts with active heads);
  • if all heads die, we can detect it after the glrReduce (on Line130, by checking PendingShift is empty);

So the code is roughly like

for ( ... : Terminals) {
  auto PreviousNewHeads = NewHeads;

  ....
  glrReduce(...);

  if (PendingShift.empty()) {
    // run error-recovery from the PreviousNewHeads
  } else {
    glrShift(...);
  }
}