This is an archive of the discontinued LLVM Phabricator instance.

[TableGen] Add a new json textmate description for syntax highlighting
ClosedPublic

Authored by rriddle on May 11 2022, 3:51 PM.

Details

Summary

There isn't really a good pre-existing syntax highlighter for tablegen, so this
commit adds a textmate version that covers nearly everything in the current
spec.

Diff Detail

Event Timeline

rriddle created this revision.May 11 2022, 3:51 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 11 2022, 3:51 PM
Herald added a subscriber: bollu. · View Herald Transcript
rriddle requested review of this revision.May 11 2022, 3:51 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 11 2022, 3:51 PM

Added a few reviewers that I've been able to bamboozle into reviewing tablegen related things. Feel free to comment on the spec itself, but the thing I wanted some thoughts on was the currently existing tablegen textmate bundle. It hasn't been touched in ~10 years and is written in xml, should I just delete all of the existing stuff? (https://github.com/llvm/llvm-project/tree/main/llvm/utils/textmate) I don't use the macos textmate editor referenced there, so I'm not sure how to update that README.

Some context for this change is that I'm working on integrating tablegen language features into the MLIR vscode extension, and syntax highlighting is a big part of that.

rriddle updated this revision to Diff 428833.May 11 2022, 6:45 PM

I'm not a competent reviewer for this patch, but I love tblgen getting better tooling :-)

rriddle added a comment.EditedMay 13 2022, 12:21 AM

Yeah, I'm not sure who would be good to review the actual json itself. Mostly just wondering if I should care about the existing xml package or not, happy to land this and not care about the other thing.

Here are some vscode screen grabs comparing the "LLVM TableGen" extension (which is what I think most people use for vscode, and is semi-based on the current grammar in-tree) with this (in the MLIR extension):

Before | After


lattner accepted this revision.May 13 2022, 12:29 AM

Very cool, if there is no good reviewer for this, I'd say "land it; advertise it; react to feedback from users" :-)

This revision is now accepted and ready to land.May 13 2022, 12:29 AM

The highlighting looks nice with this.
I didn’t see anything blatantly wrong or missing, so thumbs up from me.

FWIW, there’s a tree-sitter grammar here: https://github.com/Flakebi/tree-sitter-tablegen
which also lists some interesting edge cases like nested comments or identifiers that start with digits: https://github.com/Flakebi/tree-sitter-tablegen/blob/master/test/corpus/regressions.td

For fun facts, 2var is a valid identifier in tablegen, but in LLVM IR %2var is actually a variable and a keyword (yes, %2 = load i8, i8* %1store i8 %2, i8* %1 is valid IR).

llvm/utils/textmate/tablegen.json
44–51

I’m not familiar with textmate parsers, so not sure if this is handled here (or can be handled), but tablegen supports nested block comments. I.e.

/* nested /*
comment */
still a comment
*/

(and for the record, only tablegen supports nested comments, LLVM IR or MIR doesn’t)

218

field is also a keyword (although deprecated)

jpienaar accepted this revision.May 13 2022, 7:33 AM

FWIW, there’s a tree-sitter grammar here: https://github.com/Flakebi/tree-sitter-tablegen

Ooh nice, I was looking for one for an experiment.

Very cool, if there is no good reviewer for this, I'd say "land it; advertise it; react to feedback from users" :-)

+1, the output looks good, structuring seems good and we can iterate.

rriddle updated this revision to Diff 429326.May 13 2022, 12:22 PM
rriddle marked 2 inline comments as done.
rriddle added inline comments.May 13 2022, 12:23 PM
llvm/utils/textmate/tablegen.json
44–51

Oooh thanks for pointing this out. I just need to add recursion, but this slipped my mind.

218

Yep thanks, I originally dropped it because I wasn't sure how widespread it was actually still being used (despite the "deprecated" tagline). We have never used it in MLIR, but seems like it's still heavily used in older LLVM .td files.

This revision was landed with ongoing or failed builds.May 13 2022, 1:13 PM
This revision was automatically updated to reflect the committed changes.

Thanks all for the review!