This is an archive of the discontinued LLVM Phabricator instance.

[flang][driver] Add documentation for Plugins
ClosedPublic

Authored by stuartellis on Aug 18 2021, 2:41 AM.

Details

Summary

Adding documentation covering the Frontend Driver Plugins

Diff Detail

Event Timeline

stuartellis created this revision.Aug 18 2021, 2:41 AM
stuartellis requested review of this revision.Aug 18 2021, 2:41 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 18 2021, 2:41 AM
stuartellis added a project: Restricted Project.

This is a great first draft, thank you for working on this @stuartellis ! Could you expand a bit and add more code examples? Also, it would be nice to mention Fortran::parser::Walk here. It's quite key to your example in https://reviews.llvm.org/D107089.

More comments inline.

flang/docs/FlangDriver.md
278
279

Both Clang and Flang leverage the API available in LLVM (e.g. LoadLibraryPermanently. Perhaps:

Similarly to Clang, Flang leverges LoadLibraryPermanently from LLVM's llvm::sys::DynamicLibrary to load dynamic objects that implement plugins.

?

280–285

This list mixes two things:

  • implementation ("Creating a Plugin Shared Object library", "Registering the Plugin")
  • usage ("Loading the Plugin library", "Calling the Plugin")

I would simplify as:

The process for using Plugins includes:

  • Creating a Plugin Shared Object library (i.e. implementation)
  • Loading and Running a Plugin Shared Object library (i.e. usage)

I would also restructure the document and make the above your 2 top sections.

289

I would expand this into 3 parts (copying code from https://reviews.llvm.org/D107089, every item is one part):

  • class PrintFunctionNamesAction : public PluginParseTreeAction that wraps everything together and represents the frontend action class corresponding to your plugin
  • PrintFunctionNamesAction::ExecuteAction that implements the actual action (i.e. what it does)
  • static FrontendPluginRegistry::Add<PrintFunctionNamesAction> X that registers the plugin

It would also be worthwhile to mention struct ParseTreeVisitor (and Pre and Post methods). No need to go into too much detail - you can provide links to the relevant docs/implementation.

314

This whole section would be more interesting with a sample invocation line. You can use your example from as reference: https://reviews.llvm.org/D107089.

338–341

Instead of this section, I suggest extracting snippets of code from your example plugin and use those throughout this document. Similarly to https://clang.llvm.org/docs/ClangPlugins.html.

Can you clarify where the plugins should reside? I am guessing that the examples directory is just for examples. Is it in tools? Or can it be anywhere including out of tree?

In Flang the frontend consists of a parse-tree as well as a high-level IR. For clarity should we specify that this is for Parse Tree plugins? Should we further clarify that we are enabling the plugins after semantic checks?

flang/docs/FlangDriver.md
289

As @awarzynski says it will be beneficial to mention the mechanism used to traverse the parse-tree. The clang plugins page calls out the RecursiveAstVisitor.
https://clang.llvm.org/docs/ClangPlugins.html#introduction

Since this is the API (Visitor class with Pre and Post functions) that we are exposing for traversing, it will also be good to get approval from @klausler.

Adding more code examples and slight restructure

Thank you for addressing my comments @stuartellis ! I really like the structure of this document. I've left a few more suggestions inline. I've noticed that you spell certain things using CamelCase (e.g. Plugin, Frontend Action, Frontend Driver, Parse Tree), but that's inconsistent with the convention here and other documents within Flang. Could you update this?

Adding @Meinersbur to the list of reviewers. He's got extensive experience with plugins in LLVM and might have some suggestions for you too.

I will be away for 2 weeks, so won't be able to return to this earlier. If this is approved by other reviewers and my comments are addressed, please go ahead and merge.

Many thanks for working on this!

flang/docs/FlangDriver.md
277

The convention in this document is frontend driver rather than Frontend Driver.

278

I think that this would be clearer:

(...) extra user defined frontend actions, in the form of a specialization of a `PluginParseTreeAction`. These actions are run during compilation, after semantic checks.
281

Why Plugin rather than plugin?

282–283

Apologies, I didn't mean my suggestion literally when making the comment earlier :) I think that now this is a bit too verbose. How about:

* Creating a Plugin
*  Loading and Running a Plugin

Could you make these lit items link to the corresponding sections?

289–292

Also, could make these items link to the corresponding sections in the document? The descriptions that I am deleting here could be moved to the corresponding sections.

292
  1. A line to register the Plugin

In your example it's more than just one line.

295

Could you be more specific here? What do you mean by Parse Tree? The ParseTree API?

Also, function _and_ subroutine names declared in the input file.

301
307

your Plugin class will need to implement

No, that's not required. But if users don't implement this, the default ExectueAction is run and that doesn't do anything. Could you clarify this?

337
372

This way you can refer to <dsopath> in your description and also, IMHO, the option itself becomes clearer.

378

This way you can refer to <name> in your description and also, IMHO, the option itself becomes clearer.

383–384

This whole section is only relevant for in-tree plugins (i.e. plugins that are developed inside llvm-project/flang). Otherwise, these CMake variables don't really exist. It would be good to clarify this and also to mention that it is possible to develop plugins out-of-tree. Feel free to document the process for the latter or just to "leave it for later".

Btw, this is related to the question from @kiranchandramohan on the location of plugins. You could expand this section to answer his question here. Also, I would rename this section as e.g. Enabling In Tree Plugins.

Is PluginParseTreeAction the only supported kind of plugin? It seems clang's equivalent would be PluginASTAction which is abstract, why does PluginParseTreeAction provide a default implementation?

Note that clang/LLVM's newer -fpass-plugin has a different design that avoids global registries and instead calls a function in the loaded library.

flang/docs/FlangDriver.md
289
361

clang also supports -fplugin. Does flang as well?

@Meinersbur, thank you for your comments! Quickly replying before I go away for 2 weeks.

Is PluginParseTreeAction the only supported kind of plugin? It seems clang's equivalent would be PluginASTAction which is abstract, why does PluginParseTreeAction provide a default implementation?

That's an oversight, it should be abstract: https://reviews.llvm.org/D108518. Thanks for pointing this out!

Note that clang/LLVM's newer -fpass-plugin has a different design that avoids global registries and instead calls a function in the loaded library.

Avoiding global registries would be nice, so this is worth looking into, thanks! Isn't -fpass-plugin for middle-end plugins though?

flang/docs/FlangDriver.md
361

IIUC, -fplugin is a clang flag that's translated into -load for clang -cc1 (see here). For now, we have limited Flang plugins to flang-new -fc1. It would be worth clarifying somewhere in this document.

Once we add support for -fplugin (this should be rather straightforward), we can update this document.

Meinersbur added a comment.EditedAug 23 2021, 4:37 PM

Note that clang/LLVM's newer -fpass-plugin has a different design that avoids global registries and instead calls a function in the loaded library.

Avoiding global registries would be nice, so this is worth looking into, thanks! Isn't -fpass-plugin for middle-end plugins though?

Yes, indeed. It was introduced with the new pass manager for new pass manager plugins. It seemed to me that this was the general direction the LLVM infrastructure would want to go for a plugin infrastructure. Clang's current plugin types cannot be changed for compatibility reasons, but new plugin types could also use this approach and potentially deprecate the old infrastructure.

This does not mean we have to re-do flang plugin now; if clang redesigns its frontend plugin infrastructure one day, I think it will be straightforward do apply the same to flang.

Addressing review comments
Some formatting changes

awarzynski accepted this revision.Sep 6 2021, 12:12 PM

This reads really well @stuartellis , thank you! I've left a few minor comments, but these are purely cosmetic, so accepting as is.

From what I can see, all comments have been addressed. Let's wait a day or two before merging this in case other reviewers have more suggestions/questions.

flang/docs/FlangDriver.md
289

Shouldn't this be Creating a Plugin (like in the top section)? I'm not a native speaker and might be wrong here.

301
335–336

[nit] It would be nice to keep these consistent.

This revision is now accepted and ready to land.Sep 6 2021, 12:12 PM

Addressing nit comments

LGTM.

@klausler We are exposing the Parser::Walk framework using a Visitor Class with Pre and Post functions as the mechanism for traversing the parse-tree in flang plugins. I hope this is OK with you.

LGTM.

@klausler We are exposing the Parser::Walk framework using a Visitor Class with Pre and Post functions as the mechanism for traversing the parse-tree in flang plugins. I hope this is OK with you.

Well, I don't like the Pre()/Post() paradigm much, myself, and would prefer an API based on function objects (i.e., classes with overloaded operator()(const parser::xxx &) member functions) if I had to write a traversal client. See Evaluate/traverse.h for how this looks for Expr<>; there's a default traversal template that visits everything, and it's instantiated upon its subclasses ("CRTP") which can override its behavior for targets of interest as it likes. The Pre()/Post() paradigm tends to lead to confusion when both are present and performing stateful actions, and it's too easy to introduce bugs by changing one and not the other. A function-object visitation template for parse tree data structures could be built pretty easily using the existing visitor as a reference, and then used as a new substrate for that visitor. That's more of a project, obviously, and may only be worth doing if you agreed with my opinion on the API (which you're probably regretting asking me for by this point; sorry!).

Second: we just can't commit to the current parse tree classes and enums being a stable API forever, and the documentation should have disclaimers that plug-ins may stop compiling or working as the parse tree data structures evolve over time to accommodate future language standards, bugfixing, and basic clean-up.

Third, be advised: I did not intend this parse tree to be the front-end's output, and it has many of the hallmarks of an "internal" data structure meant only to survive through semantic checking, at which point it was meant to be raised to a more abstract and less messy representation suitable for lowering as well as for exposure to tools and plug-ins as a stable and *mutable* semantic tree. FIR happened instead. This parse tree was designed to be really inexpensive to create and discard as the product of a fast backtracking parser, and trade-offs were made in that direction. Consequently, the current parse tree is really hard to edit in situ (as we do to correct a few syntactically ambiguous parsing cases) and its definition explicitly disables all of its copy constructors. (The copy constructors are intentionally disabled to ensure that the parser combinators always use fast move construction.) If you base the plug-in API on the current parse tree, I think it likely that one of your early tool-writing clients will want to use it to duplicate some code for some transformation, and they will be sad. So manage expectations carefully. It's called the parse tree, not an AST, for good reasons.

Last: besides editing the parse tree to fix up ambiguous parses with symbolic information, Semantics also decorates the parse tree with unowned pointers to symbol table entries and owned pointers to the strongly typed Expr<> representation of expressions. Tool writers will probably have to be aware of the Scope and Symbol classes, and also know that they'll have to re-analyze any expressions if they modify their parse trees (unless they want to modify Expr<>'s in situ and perhaps make them inconsistent with the parse trees to which they're attached).

LGTM.

@klausler We are exposing the Parser::Walk framework using a Visitor Class with Pre and Post functions as the mechanism for traversing the parse-tree in flang plugins. I hope this is OK with you.

Well, I don't like the Pre()/Post() paradigm much, myself, and would prefer an API based on function objects (i.e., classes with overloaded operator()(const parser::xxx &) member functions) if I had to write a traversal client. See Evaluate/traverse.h for how this looks for Expr<>; there's a default traversal template that visits everything, and it's instantiated upon its subclasses ("CRTP") which can override its behavior for targets of interest as it likes. The Pre()/Post() paradigm tends to lead to confusion when both are present and performing stateful actions, and it's too easy to introduce bugs by changing one and not the other. A function-object visitation template for parse tree data structures could be built pretty easily using the existing visitor as a reference, and then used as a new substrate for that visitor. That's more of a project, obviously, and may only be worth doing if you agreed with my opinion on the API (which you're probably regretting asking me for by this point; sorry!).

Second: we just can't commit to the current parse tree classes and enums being a stable API forever, and the documentation should have disclaimers that plug-ins may stop compiling or working as the parse tree data structures evolve over time to accommodate future language standards, bugfixing, and basic clean-up.

Third, be advised: I did not intend this parse tree to be the front-end's output, and it has many of the hallmarks of an "internal" data structure meant only to survive through semantic checking, at which point it was meant to be raised to a more abstract and less messy representation suitable for lowering as well as for exposure to tools and plug-ins as a stable and *mutable* semantic tree. FIR happened instead. This parse tree was designed to be really inexpensive to create and discard as the product of a fast backtracking parser, and trade-offs were made in that direction. Consequently, the current parse tree is really hard to edit in situ (as we do to correct a few syntactically ambiguous parsing cases) and its definition explicitly disables all of its copy constructors. (The copy constructors are intentionally disabled to ensure that the parser combinators always use fast move construction.) If you base the plug-in API on the current parse tree, I think it likely that one of your early tool-writing clients will want to use it to duplicate some code for some transformation, and they will be sad. So manage expectations carefully. It's called the parse tree, not an AST, for good reasons.

Last: besides editing the parse tree to fix up ambiguous parses with symbolic information, Semantics also decorates the parse tree with unowned pointers to symbol table entries and owned pointers to the strongly typed Expr<> representation of expressions. Tool writers will probably have to be aware of the Scope and Symbol classes, and also know that they'll have to re-analyze any expressions if they modify their parse trees (unless they want to modify Expr<>'s in situ and perhaps make them inconsistent with the parse trees to which they're attached).

Thanks, @klausler for the detailed reply. It definitely helps to have your expert opinion.

While I found it quite straightforward to use, I agree that the existence of both Pre and Post can be confusing. If function object-based traversal API is the right way to go then we can try to implement that. At the moment we do not have time to do this. But I am putting this into our list of things to do for consideration next month or early next year (we review every 3 months).

Can it be made to look like the RecursiveASTVisitor of Clang?

class FindNamedClassVisitor
  : public RecursiveASTVisitor<FindNamedClassVisitor> {
public:
  explicit FindNamedClassVisitor(ASTContext *Context) : Context(Context) {}

  bool VisitCXXRecordDecl(CXXRecordDecl *Declaration) {
    return true;
  }

private:
  ASTContext *Context;
};

I guess it is OK to proceed with warnings and caveats either in this patch or an immediate follow-up patch. Something like the following based on @klausler's points.
Plugin writers should note the following,
-> The traversal API will change in the immediate future.
-> The parse-tree classes and enums will change as Flang development progresses.
-> The current parse-tree structure is not suitable for modifications. Also, the copy constructors are not available and hence duplicating code might not be trivial.
-> If parse-tree modifications are performed, then it might be necessary to re-analyze expressions and modify scope or symbols.

Can it be made to look like the RecursiveASTVisitor of Clang?

class FindNamedClassVisitor
  : public RecursiveASTVisitor<FindNamedClassVisitor> {
public:
  explicit FindNamedClassVisitor(ASTContext *Context) : Context(Context) {}

  bool VisitCXXRecordDecl(CXXRecordDecl *Declaration) {
    return true;
  }

private:
  ASTContext *Context;
};

Yes, although I've used the more idiomatic operator() as the member function for better composability with the STL, and recommend it again here, if you choose to replace Pre()/Post().

I guess it is OK to proceed with warnings and caveats either in this patch or an immediate follow-up patch. Something like the following based on @klausler's points.
Plugin writers should note the following,
-> The traversal API will change in the immediate future.
-> The parse-tree classes and enums will change as Flang development progresses.
-> The current parse-tree structure is not suitable for modifications. Also, the copy constructors are not available and hence duplicating code might not be trivial.
-> If parse-tree modifications are performed, then it might be necessary to re-analyze expressions and modify scope or symbols.

Note that it seems straightforward to add some preprocessing shenanigans to parse-tree.h that could be used to not disable copy constructors when parse-tree.h is included in plug-ins.

You could also consider extending Semantics with an API that scrubs the parse tree of all its decorations, so that it could be used as input to a fresh instantiation of Semantics so that the whole source file could be re-analyzed after parse tree modifications.

(It won't be surprising to users of clang plug-ins, but people experienced in Fortran might not be expecting f18 to work on whole source files at once instead of the traditional Fortran approach of compiling each program unit independently.)

@kiranchandramohan , thank you for bringing this up. @klausler , your suggestions are much appreciated!

I concur with everything that's been discussed here. I assume that initially this API will be mostly used for experimenting, but it definitely makes sense to emphasize that this is under active development and is likely to change. I would like us to do this in a separate commit though. IMO, this patch is ready as. I am also being mindful that @stuartellis has recently finished his rotation with us and it might be easier for us to refine this document in a separate PR.

On a related note, the example plugin implemented by @stuartellis is regularly tested by 2 LLVM Flang Buildbot workers:

So at least for the content presented here, we should be able to monitor and maintain its stability.

This revision was automatically updated to reflect the committed changes.