There isn't really a good pre-existing syntax highlighter for tablegen, so this
commit adds a textmate version that covers nearly everything in the current
spec.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Added a few reviewers that I've been able to bamboozle into reviewing tablegen related things. Feel free to comment on the spec itself, but the thing I wanted some thoughts on was the currently existing tablegen textmate bundle. It hasn't been touched in ~10 years and is written in xml, should I just delete all of the existing stuff? (https://github.com/llvm/llvm-project/tree/main/llvm/utils/textmate) I don't use the macos textmate editor referenced there, so I'm not sure how to update that README.
Some context for this change is that I'm working on integrating tablegen language features into the MLIR vscode extension, and syntax highlighting is a big part of that.
I'm not a competent reviewer for this patch, but I love tblgen getting better tooling :-)
Yeah, I'm not sure who would be good to review the actual json itself. Mostly just wondering if I should care about the existing xml package or not, happy to land this and not care about the other thing.
Here are some vscode screen grabs comparing the "LLVM TableGen" extension (which is what I think most people use for vscode, and is semi-based on the current grammar in-tree) with this (in the MLIR extension):
Before | After
Very cool, if there is no good reviewer for this, I'd say "land it; advertise it; react to feedback from users" :-)
The highlighting looks nice with this.
I didn’t see anything blatantly wrong or missing, so thumbs up from me.
FWIW, there’s a tree-sitter grammar here: https://github.com/Flakebi/tree-sitter-tablegen
which also lists some interesting edge cases like nested comments or identifiers that start with digits: https://github.com/Flakebi/tree-sitter-tablegen/blob/master/test/corpus/regressions.td
For fun facts, 2var is a valid identifier in tablegen, but in LLVM IR %2var is actually a variable and a keyword (yes, %2 = load i8, i8* %1store i8 %2, i8* %1 is valid IR).
llvm/utils/textmate/tablegen.json | ||
---|---|---|
44–51 | I’m not familiar with textmate parsers, so not sure if this is handled here (or can be handled), but tablegen supports nested block comments. I.e. /* nested /* comment */ still a comment */ (and for the record, only tablegen supports nested comments, LLVM IR or MIR doesn’t) | |
218 | field is also a keyword (although deprecated) |
Ooh nice, I was looking for one for an experiment.
+1, the output looks good, structuring seems good and we can iterate.
llvm/utils/textmate/tablegen.json | ||
---|---|---|
44–51 | Oooh thanks for pointing this out. I just need to add recursion, but this slipped my mind. | |
218 | Yep thanks, I originally dropped it because I wasn't sure how widespread it was actually still being used (despite the "deprecated" tagline). We have never used it in MLIR, but seems like it's still heavily used in older LLVM .td files. |
I’m not familiar with textmate parsers, so not sure if this is handled here (or can be handled), but tablegen supports nested block comments. I.e.
(and for the record, only tablegen supports nested comments, LLVM IR or MIR doesn’t)