This is an archive of the discontinued LLVM Phabricator instance.

[pseudo] Add first and follow set computation in Grammar.
ClosedPublic

Authored by hokein on Feb 4 2022, 5:58 AM.

Details

Summary

These will be used when building parsing table for LR parsers.

Separate from https://reviews.llvm.org/D118196.

Diff Detail

Event Timeline

hokein requested review of this revision.Feb 4 2022, 5:58 AM
hokein created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 4 2022, 5:58 AM
sammccall accepted this revision.Feb 4 2022, 8:46 AM
sammccall added inline comments.
clang/include/clang/Tooling/Syntax/Pseudo/Grammar.h
148

AFAICS, firstSets are iterated over in followSets, and followSets are iterated over when building SLR.

Seems like exposing vector<vector<SymbolID>> would be enough. Up to you.
(Or the Epochs<SymbolID> data structure from the prototype - there must be a name for that data structure)

clang/lib/Tooling/Syntax/Pseudo/Grammar.cpp
142

nit: X := ... Y Z

(no confusion that the ... is associated with Y or that YZ is one symbol)

156

X := ... Z

166

Seems more natural to catch this problem in the constructor?
Then we'd stash the SymbolID in a member variable and just return it here.

clang/unittests/Tooling/Syntax/Pseudo/GrammarTest.cpp
25

whoops, I missed these in the previous review, these should really be targetID and sequence as they're functions.

108

nit: ToPairs, plural

This revision is now accepted and ready to land.Feb 4 2022, 8:46 AM
hokein updated this revision to Diff 407062.Feb 9 2022, 12:14 AM
hokein marked 4 inline comments as done.

address comments.

hokein added inline comments.Feb 9 2022, 12:15 AM
clang/include/clang/Tooling/Syntax/Pseudo/Grammar.h
148

that's right, vector<vector<...>> is enough, but it adds an extra step in the implementation (we need the set anyway for deduplication, and then convert to a vector), and it introduces some complexity in the followSet computation, we can't use a unified ExpandFollowSet (as the input could be a vector or llvm::Denseset) sigh.

clang/unittests/Tooling/Syntax/Pseudo/GrammarTest.cpp
25

they're public member variables :)

This revision was landed with ongoing or failed builds.Feb 9 2022, 12:16 AM
This revision was automatically updated to reflect the committed changes.
sammccall added inline comments.Feb 9 2022, 1:57 AM
clang/unittests/Tooling/Syntax/Pseudo/GrammarTest.cpp
25

MATCHER_P(TargetID) produces a template function named TargetID. It should be named targetID per the style rules.