This is an archive of the discontinued LLVM Phabricator instance.

[pseudo] Perform unconstrained recovery prior to completion.
ClosedPublic

Authored by sammccall on Jul 25 2022, 3:00 PM.

Details

Summary

Our GLR uses lookahead: only perform reductions that might be consumed by the
shift immediately following. However when shift fails and so reduce is followed
by recovery instead, this restriction is incorrect and leads to missing heads.

In turn this means certain recovery strategies can't be made to work. e.g.

ns := NAMESPACE { namespace-body } [recover=Skip]
ns-body := namespace_opt

When namespace { namespace { is parsed, we can recover the inner ns (using
the Skip strategy to ignore the missing }). However this namespace will
not be reduced to a namespace-body as EOF is not in the follow-set, and so we
are unable to recover the outer ns.

This patch fixes this by tracking which heads were produced by constrained
reduce, and discarding and rebuilding them before performing recovery.

This is a prerequisite for the Skip strategy mentioned above, though there are
some other limitations we need to address too.

Diff Detail

Event Timeline

sammccall created this revision.Jul 25 2022, 3:00 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 25 2022, 3:00 PM
sammccall requested review of this revision.Jul 25 2022, 3:00 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 25 2022, 3:00 PM
sammccall updated this revision to Diff 447487.Jul 25 2022, 3:02 PM

restore removed debug

Apart from this patch, the main other things we need to allow missing brackets to be inferred:

  • allow recovery to trigger subsequent recovery, even at EOF. (Simplest way is to address the FIXME at 660, it's pretty involved)
  • allow opaque nodes to represent terminals

I have a prototype of all this working together locally, it seems to work...

hokein accepted this revision.Jul 28 2022, 7:01 AM
hokein added inline comments.
clang-tools-extra/pseudo/include/clang-pseudo/grammar/LRTable.h
109

this if doesn't make sense, and is not needed, I think.

clang-tools-extra/pseudo/lib/GLR.cpp
620

I think we can move the Line634 Heads.resize(HeadsPartition) before the glrShift() as we only do shift on the nearly-created heads, we might gain some performance back.

648

Can we have some comments on GLRReduce::operator() on how does parameter Head get modified (new heads are appended to it)?

This revision is now accepted and ready to land.Jul 28 2022, 7:01 AM
hokein added inline comments.Jul 28 2022, 7:11 AM
clang-tools-extra/pseudo/lib/GLR.cpp
620

oops, my previous comment is incorrect, here we want the second part of the partition; while on recovery, we want the first part of partition.

we can pass llvm::ArrayRef<const GSS::Node *>(Heads).drop_front(HeadsPartition); as the Heads to glrShift.

sammccall marked 2 inline comments as done.Aug 19 2022, 6:14 AM
sammccall added inline comments.
clang-tools-extra/pseudo/lib/GLR.cpp
620

As discussed offline, shifting onto a head that was produced by shift should be allowed.

given grammar

foo := [ ]
bar := [
baz := bar ]

and input [], after [ we have Heads={ [, bar} } with the former shifted and the latter reduced.
If we applied your suggestion here, we would fail to parse foo (but would succeed in parsing baz).