This is an archive of the discontinued LLVM Phabricator instance.

Preprocessing support in tablegen
ClosedPublic

Authored by vzakhari on Oct 29 2018, 3:28 PM.

Details

Summary

This adds preprocessing support to TableGen (requested in http://lists.llvm.org/pipermail/llvm-dev/2017-November/118831.html and making conditional compilations more convenient). The supported directives are #define/#ifdef/#else/#endif. Command line option for defining a macro is '-DNAME'.

Diff Detail

Repository
rL LLVM

Event Timeline

vzakhari created this revision.Oct 29 2018, 3:28 PM

I have to say I'm feeling a bit ambivalent about this. I'd say it would be nicer to have a mechanism that integrates with the rest of the TableGen language, but that's admittedly non-trivial. So I guess this approach is okay.

The handling of end-of-files is a bit wonky. Have you considered just returning EOF from getNextChar at end of file, even if there's a parent file, and changing the EOF case in LexToken to just loop (or tail-recurse, I suppose) if it was the EOF of an included file?

lib/TableGen/TGLexer.cpp
605–608 ↗(On Diff #171575)

Please just inline this function where it's used. Having to jump here increases the cognitive overhead of reading the code.

lib/TableGen/TGLexer.h
169 ↗(On Diff #171575)

Should be an SMLoc.

185 ↗(On Diff #171575)

Coding style: Redundant space between closing angle brackets.

Also, just remove PreprocessorControl and instead refer to PrepIncludeStack.back() for the current one.

test/TableGen/prep-diag1.td
9 ↗(On Diff #171575)

Please keep all RUN lines at the top of the file.

I have to say I'm feeling a bit ambivalent about this. I'd say it would be nicer to have a mechanism that integrates with the rest of the TableGen language, but that's admittedly non-trivial. So I guess this approach is okay.

Thank you for the prompt reply, Nikolai! Yes, I decided to keep the preprocessing aside to minimize changes in the actual lexing.

The handling of end-of-files is a bit wonky. Have you considered just returning EOF from getNextChar at end of file, even if there's a parent file, and changing the EOF case in LexToken to just loop (or tail-recurse, I suppose) if it was the EOF of an included file?

This is possible, though, EOF handling will be required in SkipCComment() and LexBracket(). If you agree with me changing these routines, I can do this. Returning EOF from getNextChar() allows handling cross-file C-style comments and bracket contructs - I can disallow these use-cases or keep supporting them. What do you think I should do?

vzakhari updated this revision to Diff 172018.Oct 31 2018, 2:18 PM
vzakhari updated this revision to Diff 172019.Oct 31 2018, 2:24 PM

I have to say I'm feeling a bit ambivalent about this. I'd say it would be nicer to have a mechanism that integrates with the rest of the TableGen language, but that's admittedly non-trivial. So I guess this approach is okay.

Thank you for the prompt reply, Nikolai! Yes, I decided to keep the preprocessing aside to minimize changes in the actual lexing.

The handling of end-of-files is a bit wonky. Have you considered just returning EOF from getNextChar at end of file, even if there's a parent file, and changing the EOF case in LexToken to just loop (or tail-recurse, I suppose) if it was the EOF of an included file?

This is possible, though, EOF handling will be required in SkipCComment() and LexBracket(). If you agree with me changing these routines, I can do this. Returning EOF from getNextChar() allows handling cross-file C-style comments and bracket contructs - I can disallow these use-cases or keep supporting them. What do you think I should do?

I think adding EOF handling there is fine. C-style comments across files should be forbidden IMO. Same for LexBracket, I think; that code is for [[ code blocks ]], and allowing those across files seems very odd indeed. If there's genuinely a use case for it, it's always easier to allow it later than going in the reserve direction.

vzakhari updated this revision to Diff 172672.EditedNov 5 2018, 3:22 PM

The new EOF handling looks much better to me. I added a couple of negative tests for C-style comments and code blocks crossing the file boundary. I also renamed include files from .td to .inc so that they are not considered as real tablegen tests.

And I did not have to change anything in SkipCComment() and LexBracket(), because they already handle EOF, but an error is only emitted for the final EOF.

Thanks for making those changes. The EOF handling does look better.

docs/TableGen/LangRef.rst
455 ↗(On Diff #172672)

Could you please add something along the lines of: "TableGen's embedded preprocessor is only intended for conditional compilation. It supports the following directives:"

lib/TableGen/TGLexer.cpp
605–608 ↗(On Diff #171575)

What about this one?

vzakhari added inline comments.Nov 6 2018, 9:37 AM
lib/TableGen/TGLexer.cpp
605–608 ↗(On Diff #171575)

Oh, your previous comment referred to prepNewPreprocessorControl(), and I inlined it into prepEnterInclude(). I wanted to keep prepEnterInclude() to have a complete pair of enter/exit include method, but probably it would better be inlined. I will do it shortly.

vzakhari updated this revision to Diff 172782.Nov 6 2018, 9:59 AM
vzakhari marked an inline comment as done.
This revision is now accepted and ready to land.Nov 15 2018, 12:35 PM
This revision was automatically updated to reflect the committed changes.