This is an archive of the discontinued LLVM Phabricator instance.

[pseudo] Implement guard extension.
ClosedPublic

Authored by hokein on Jun 9 2022, 3:12 PM.

Details

Summary
  • Extend the GLR parser to allow conditional reduction based on the guard functions;
  • Implement two simple guards (contextual-override/final) for cxx.bnf;
  • layering: clangPseudoCXX depends on clangPseudo (as the guard function need to access the TokenStream);

Diff Detail

Event Timeline

hokein created this revision.Jun 9 2022, 3:12 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 9 2022, 3:13 PM
Herald added a subscriber: mgorny. · View Herald Transcript
hokein requested review of this revision.Jun 9 2022, 3:13 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 9 2022, 3:13 PM
hokein updated this revision to Diff 435889.Jun 10 2022, 6:10 AM
  • add comments;
  • add unittest and lit test;
  • misc improvements;

The ParseLang + CLI library seem clearly separate from everything else, I think this patch should be split.

Comments on that part first...

clang-tools-extra/pseudo/benchmarks/Benchmark.cpp
68

getLanguage() should not be inside the loop. This is happening in a few places

Maybe it should stay in setupGrammarAndSource?

clang-tools-extra/pseudo/include/clang-pseudo/Language.h
36

why are these all references with ParseLang (presumably) passed by value, instead of values with ParseLang passed by reference?

clang-tools-extra/pseudo/include/clang-pseudo/cli/CLI.h
25

nit: I don't think a namespace is needed here, compared to just getLanguageFromFlags()

27

either "language" is a clear enough name or it isn't - given naming elsewhere this should be getParseLang I think

clang-tools-extra/pseudo/lib/GLR.cpp
298–299

I'm not sure exactly what the best thing to do about this is, but Params.Lang.G is hard to read - it's a long complicated name that doesn't even name the thing.

I think probably we should split ParseParams up instead of nesting ParseLang in it, e.g.

struct GLRStorage { ForestArena&, GSS }
glrReduce(vector<ParseStep>, const ParseLang&, GLStorage&, NewHeadCallback)

Or even put a forest pointer in the GSS just for convenience.

clang-tools-extra/pseudo/lib/cli/CLI.cpp
53

huh? diagnostics aren't fatal (at least haven't been considered so before)

54

"missing guard function" is probably a grammar diagnostic, maybe we should add some logic to validate extensions?
Probably a FIXME for now

58–59

I don't think the CLI library should be in charge of knowing the fallback logic for each type of extension.
Could the parser handle a missing guard itself instead?

clang-tools-extra/pseudo/tool/ClangPseudo.cpp
94–95

drop if(true)?

clang-tools-extra/pseudo/unittests/GLRTest.cpp
51–59

either do it or don't - this is too small to check in a fixme IMO

57

can this just be a member?

(comments for guard part)

clang-tools-extra/pseudo/include/clang-pseudo/Language.h
23

nit: I think the first sentence should try to describe what this *is* in as simple terms as possible, rather than how it interacts with the rest of the system.

// A guard restricts when a grammar rule can be used
26

I think it's worth including an example here. "e.g. a guard may allow virt-specifier := IDENTIFIER only if the identifier's text is 'override'".

28

I think this sentence adds more confusion than it helps. (Not sure it's true depending on what you mean by "parsing branch" - the guard runs before the reduce not after, so the branch never exists)

clang-tools-extra/pseudo/lib/GLR.cpp
306

I think diagnosing missing guards but then treating them as always-passing is less surprising and more ergonomic while modifying a grammar

hokein updated this revision to Diff 441677.Jul 1 2022, 6:28 AM
hokein marked 8 inline comments as done.

rebase and address comments.

hokein updated this revision to Diff 441680.Jul 1 2022, 6:41 AM
hokein marked 5 inline comments as done.

fix format.

hokein added inline comments.Jul 1 2022, 6:46 AM
clang-tools-extra/pseudo/lib/GLR.cpp
298–299

Changed the signature of glrShift/Reduce/Parse.

306

I wanted to make this as a contract (guard in each rule must refer to a valid extension function) in GLR parser.

I think loosing the restriction is also fine -- the only downside I can see is that if we have a typo guard in the grammar file, which we don't have a corresponding function, the parser will run and emit confusing results. We can fix it by adding "missing guard function" diagnostics.

hokein retitled this revision from [wip][pseudo] Implement guard extension. to [pseudo] Implement guard extension..Jul 1 2022, 6:47 AM
hokein edited the summary of this revision. (Show Details)
sammccall accepted this revision.Jul 5 2022, 5:35 AM

Great!

clang-tools-extra/pseudo/include/clang-pseudo/GLR.h
116–117

delete "being".
Or possibly the whole comment, this seems obvious?

clang-tools-extra/pseudo/lib/GLR.cpp
295

nit: hadn't really intended SequenceRef to be a general vocabulary type, you might just want ArrayRef to match the guard signature, up to you

299

this should be guarded by LLVM_DEBUG

this whole thing might be clearer with lookup

if (auto Guard = Lang.Guards.lookup(GuardID))
  return Guard(RHS.S, Params.Code);

LLVM_DEBUG(...);
return true;
clang-tools-extra/pseudo/unittests/GLRTest.cpp
435

insert(make_pair(...)) => emplace(...)?

438

remove

This revision is now accepted and ready to land.Jul 5 2022, 5:35 AM
hokein updated this revision to Diff 442290.Jul 5 2022, 6:45 AM
hokein marked 5 inline comments as done.

address comments.

This revision was landed with ongoing or failed builds.Jul 5 2022, 6:56 AM
This revision was automatically updated to reflect the committed changes.