This is an archive of the discontinued LLVM Phabricator instance.

[TableGen] Eliminate the 'code' type
ClosedPublic

Authored by Paul-C-Anagnostopoulos on Nov 28 2020, 11:40 AM.

Details

Summary

This extensive revision eliminates the 'code' type in TableGen. The purpose of this revision is to simplify TableGen and make all the string-manipulation bang operators available for code manipulation.

The 'code' keyword remains, as a synonym for 'string'. The [{...}] code brackets remain.

I updated the documentation. I found all the uses of the CodeInit value class in the backends and updated them to use StringInit instead. I updated tests.

Diff Detail

Event Timeline

Herald added projects: Restricted Project, Restricted Project, Restricted Project. · View Herald Transcript
Paul-C-Anagnostopoulos requested review of this revision.Nov 28 2020, 11:40 AM

Ah, I thought TypeOf_xxx was a transitionary thing that we'd drop over time (to reduce complexity). If that's the case, I'd recommend not documenting it.

Losing types and permitting nonsensical operators like !subst on code seems a bit sad. Why not just define a !codeconcat, like how we have a separate !listconcat? I agree that the TypeOf_xxx stuff is really ugly and so having _greater_ expressivity like !codeconcat would go a long way to help that. Also, cast<code>(str) would be the natural way to express the cases where you still really do want to construct code as a string without needing TypeOf_xxx.

The TypeOf_xxx feature of searchable tables already exists. I just co-opted it as the way to tell the searchable table backend to emit the code without quotes. If we go with this revision, we will always need it in certain cases, when the code is constructed as a string and the backend cannot tell that it is code. We could take Jessica's suggestion and allow !cast<code>(string), which would set the string format to 'code'.

I do agree that the TypeOf_xxx feature is funky. But it does let you tell the searchable table emitter that the field is code without worrying how it was created and/or whether !cast<code> was used (if we allow that).

Yes, we could create !codeconcat instead, along with !codeinterleave, !codeeq, etc. We could also just extend the existing bang operators to work on the code type. I thought it was cleaner just to get rid of the distinction.

I agree that !subst is a mess, even without this change. My plan is eventually to implement a saner !replace. But I'm not sure why doing string replacement in code is sad. There is even a utility to do it: GlobalISel\CodeExpander.

Yes, we could create !codeconcat instead, along with !codeinterleave, !codeeq, etc. We could also just extend the existing bang operators to work on the code type. I thought it was cleaner just to get rid of the distinction.

You wouldn't need a new !eq, you can just add another overloading. Really there should probably be an overloaded !concat that works on string, bits, list and code; !strconcat is the unusual case where it's qualified with the type in the name.

I agree that !subst is a mess, even without this change. My plan is eventually to implement a saner !replace. But I'm not sure why doing string replacement in code is sad. There is even a utility to do it: GlobalISel\CodeExpander.

I was confused and thought !subst was substring not substitute, which is normally a meaningless operation to perform on code. Substitution is less weird, thought normally the useful/meaningful representation for code is something like a list of tokens which would make !subst a bit more useful as currently you just get raw string replacement which risks all manner of false positives for matches.

Oh ok, well I have no problem keeping it around in searchable tables, I misread this as making that more general.

jrtc27, we had a long conversation about this. We'd like to eliminate the Code type as a way to simplify the internals of tblgen. It is a distinction without a useful difference for many things, and this is one step towards making tblgen simpler and more consistent.

The TypeOf_xxx field is only used by the searchable tables backend. My trek through all the backends revealed it as the only one that needs to distinguish string and code, because it can emit the C++ initializer for any type of field. All the other backends know what is in each field they process. You can see from the various patches that there is nothing special about code fields in the other backends.

I will fix the lint issues. I will let this stew here for a few days in case there are other comments.

(A new !replace operator, along with !lookup, are on my to-do list.)

llvm/test/TableGen/generic-tables.td
138

I will get rid of this line.

Fixed some of the lint issues.

lattner accepted this revision.Dec 1 2020, 10:03 PM

Woot! Go Paul!

This revision is now accepted and ready to land.Dec 1 2020, 10:03 PM

I'm about to push this revision. I will be surprised if it does not break the build.

This revision was automatically updated to reflect the committed changes.
silvas added a subscriber: silvas.Dec 3 2020, 5:47 PM

Wow, awesome work!