This commit adds a C++ generator to PDLL that generates wrapper PDL patterns
directly usable in C++ code, and also generates the definitions of native constraints/rewrites
that have code bodies specified in PDLL. This generator is effectively the PDLL equivalent of
the current DRR generator, and will allow easy replacement of DRR patterns with PDLL patterns.
A followup will start to utilize this for end-to-end integration testing and showcase how to
use this as a drop-in replacement for DRR tablegen usage.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
mlir/lib/Tools/PDLL/CodeGen/CPPGen.cpp | ||
---|---|---|
97 | As currently implemented, we'd end up with C++ interspersed with PDL string representation. Thinking a bit more, is this actually related to PDLL or just to PDL? This way there would be a single MLIR source of truth one can go look at. OTOH, this would allow behaviors that seem nicer in MLIR rather than PDLL. None of this would require looking at any generated C++. | |
188 | Nothing here touches constParams. |
mlir/lib/Tools/PDLL/CodeGen/CPPGen.cpp | ||
---|---|---|
97 | We should print generic here; I still want to have a build of MLIR that strips the custom assembly support. Ideally we would not even include a string to be parsed at runtime, but emit the C++ that constructs the IR! :) |
mlir/lib/Tools/PDLL/CodeGen/CPPGen.cpp | ||
---|---|---|
97 |
This sounds like a scary amount of duplication to me .. |
mlir/lib/Tools/PDLL/CodeGen/CPPGen.cpp | ||
---|---|---|
8 | File description? | |
71 | OOC should there be a guard against a pattern named GeneratedPDLLPattern2? :) | |
93 | Nit: this form won't trigger the syntax highlighting added to vscode. These for me are just MLIR snippets inside C++ code and the fact that they include PDL ops isn't that interesting at this level (at least not to the level that I'd want to special case detection or that would make me act differently) | |
97 |
That shouldn't really matter: these should only be generated and consumed at the same rev, so this should be a build system artifact. That being said, this could reduce the number of test case updates if the syntax changes.
Having something like #include "<pattern_name>.mlir" here could be a compromise. Or are you proposing something even beyond that: from a single generated pattern file, these patterns merely refer into it? If you had this in bytecode format, would that be too late for pattern optimization? (e.g., we want this in this form C++ side, vs serializing the bytecode variant here and then keeping the "raw" pattern in an external file for dynamic loading etc) |
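On the naming question raised at line 71 above, one way such a guard could work is sketched below. This is a standalone illustration, not the code in CPPGen.cpp: the helper name `assignPatternNames` and the convention that an empty string marks an anonymous pattern are assumptions made for the example. The idea is to reserve every user-provided name first, then bump a counter until a generated name is free.

```cpp
#include <set>
#include <string>
#include <vector>

// Hypothetical helper (not part of MLIR): assign a C++ class name to each
// pattern. An empty entry stands for an anonymous pattern; named patterns
// keep the name the user gave them.
std::vector<std::string>
assignPatternNames(const std::vector<std::string> &declared) {
  // Reserve every user-provided name up front so generated names avoid them.
  std::set<std::string> used;
  for (const std::string &name : declared)
    if (!name.empty())
      used.insert(name);

  std::vector<std::string> result;
  unsigned counter = 0;
  for (const std::string &name : declared) {
    if (!name.empty()) {
      result.push_back(name);
      continue;
    }
    // Keep bumping the counter until the candidate name is not taken.
    std::string candidate;
    do {
      candidate = "GeneratedPDLLPattern" + std::to_string(counter++);
    } while (!used.insert(candidate).second);
    result.push_back(candidate);
  }
  return result;
}
```

With this scheme a user pattern that happens to be called `GeneratedPDLLPattern0` simply causes the generator to skip that index for anonymous patterns.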
mlir/lib/Tools/PDLL/CodeGen/CPPGen.cpp | ||
---|---|---|
97 |
I'm not sure I see why? Talking about this with River I even think we could have a generic backend for mlir-translate that given an op emits the C++ calls to build it (including nested region).
Sure: the thing is that when you say "load all patterns dynamically" it begs the question of what the storage format for the patterns is. Ideally the engine does not really care: what it wants is an "in-memory" representation of a pattern and a way to load it. |
mlir/lib/Tools/PDLL/CodeGen/CPPGen.cpp | ||
---|---|---|
71 | Haha, added. | |
93 | Yeah, I had this written before we added support for that. Updated. | |
97 | There are various different things at play here. There is the generated C++ API that is exposed to users, and the internal storage of the patterns that we use. As Mehdi mentions, the storage representation is generally opaque to the user, whether that be an inlined string/bitcode blob/directly using the builder API/etc.
What behaviors do you have in mind here? The only thing PDLL is doing is generating MLIR and splatting that to the source file, all analysis/optimization/transformation of the PDL happens at runtime.
You can already mix PDLL generated patterns with PDL patterns from other places. Maybe I'm not understanding what kind of mixing you refer to here. The way that PDL works is that when the rewrite pattern set is frozen (i.e. when you create a FrozenRewritePatternSet) all of the PDL patterns are merged together into a single giant module, and then lowered/optimized together. Generating separate patterns or one giant pattern module has no effect on the end result, it's only about how you expose things to the user. Generating separate C++ patterns (one per PDLL pattern) allows for the user to directly reference specific patterns by name in the C++ API; e.g. if you had one .pdll file for canonicalization patterns of a dialect, you'd want to separate out the patterns for specific operations when adding them to the canonicalization pattern list for that op. | |
188 | Yep, also added a comment here. |
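The point about freezing can be pictured with a toy model. The sketch below is deliberately not the MLIR API: `ToyPattern` and `ToyPatternSet` are hypothetical stand-ins, pattern bodies are opaque strings, and "freezing" is reduced to merging everything into one module-like string. It only illustrates that registering patterns one at a time or in bulk yields the same merged result, while separate named patterns still let a caller pick a subset.

```cpp
#include <map>
#include <string>
#include <vector>

// Toy stand-in for a generated pattern: a name plus an opaque textual body.
struct ToyPattern {
  std::string name;
  std::string body;
};

// Toy stand-in for a pattern set. This is NOT mlir::RewritePatternSet.
class ToyPatternSet {
public:
  // Register a single pattern by name.
  void add(const ToyPattern &pattern) { patterns[pattern.name] = pattern.body; }

  // Register a whole group of patterns at once.
  void addAll(const std::vector<ToyPattern> &group) {
    for (const ToyPattern &pattern : group)
      add(pattern);
  }

  // "Freezing" merges every registered pattern into one module-like string.
  // Entries are emitted in name order, so registration order is irrelevant.
  std::string freeze() const {
    std::string module = "module {\n";
    for (const auto &entry : patterns)
      module += "  // " + entry.first + "\n  " + entry.second + "\n";
    module += "}\n";
    return module;
  }

private:
  std::map<std::string, std::string> patterns;
};
```

In this model, adding `FoldAdd` and `FoldMul` individually or through `addAll` produces an identical frozen module; the per-pattern names only matter to the caller deciding which patterns to register.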
Looks good, I think this is a fine starting point and can be refined from here.
mlir/lib/Tools/PDLL/CodeGen/CPPGen.cpp | ||
---|---|---|
107 | Nice touch, I'd have forgotten this. |