This is an archive of the discontinued LLVM Phabricator instance.

[TableGen] Add `!dump` and `dump`.
AbandonedPublic

Authored by fpetrogalli on Jul 27 2023, 5:04 AM.

Details

Summary

This allows dumping expressions (Init *) as soon as they become concrete. Both of these take a string as the first operand that prefixes the dump of the expression.

!dump can be used in a expression/functional context. For example:

!foreach(i, lst, !dump("elt = ", i))

dump can be used in a "statement" context:

def c: C<3>;
dump "c = ", c;

dump is implemented as a syntactic sugar. The above dump is parsed into:

assert 1, !cast<string>(!dump("c = ", c));

Original patch from Adam Nemet <anemet@apple.com>

Diff Detail

Unit TestsFailed

Event Timeline

fpetrogalli created this revision.Jul 27 2023, 5:04 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 27 2023, 5:04 AM
Herald added a subscriber: hiraditya. · View Herald Transcript
fpetrogalli requested review of this revision.Jul 27 2023, 5:04 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 27 2023, 5:04 AM
fpetrogalli edited the summary of this revision. (Show Details)Jul 27 2023, 5:06 AM
fpetrogalli added a reviewer: tra.

Hello @tra - this patch is addressing your comment at https://reviews.llvm.org/D148915#4288193

We originally implemented these extra functionalities as print and !print, but I agree that dump and !dump is less likely to clash with names in user-written tablegen files.

Francesco

wangpc added a subscriber: wangpc.

Thanks! The implementation is just what I have imagined before!
One suggestion: can we accept variable-length arguments (at least one) and the first needn't to be a string? The output is just concatenation of all arguments and we make current implementation a special case. So that we can dump multiple values in a single line.

llvm/docs/TableGen/ProgRef.rst
1247

standard error output?

llvm/lib/TableGen/TGParser.cpp
3876
wangpc added inline comments.Jul 27 2023, 9:53 PM
llvm/lib/TableGen/TGLexer.h
161

Put Dump togather with other keywords instead of bang operators?

fpetrogalli marked 3 inline comments as done.

Address code review.

Thanks! The implementation is just what I have imagined before!

Good good! Thank you for the review.

One suggestion: can we accept variable-length arguments (at least one) and the first needn't to be a string? The output is just concatenation of all arguments and we make current implementation a special case. So that we can dump multiple values in a single line.

I'd keep it simple and avoid having to deal with variable-length args, because once D156429 is in, we can achieve the same with !dump("the concat is = ", !format("{0} {1}...", vars...))

Thanks! The implementation is just what I have imagined before!

Good good! Thank you for the review.

One suggestion: can we accept variable-length arguments (at least one) and the first needn't to be a string? The output is just concatenation of all arguments and we make current implementation a special case. So that we can dump multiple values in a single line.

I'd keep it simple and avoid having to deal with variable-length args, because once D156429 is in, we can achieve the same with !dump("the concat is = ", !format("{0} {1}...", vars...))

Or loose the constraint that first operand shoule be a string? for your example, it can be simpler: !dump(!format("the concat is = {0} {1}...", vars...)). XDump can still be a binary operator and set second operand to StringInit("") when only one operand is provided.
By doing this, it won't be imcompatible if we want to extend it to be with variable-length arguments in the future.

llvm/lib/TableGen/Record.cpp
1189

Can we print it in debug mode and mute it in release mode?
A CLI option may be added to control it and set it via cmake build type.

Thanks! The implementation is just what I have imagined before!

Good good! Thank you for the review.

One suggestion: can we accept variable-length arguments (at least one) and the first needn't to be a string? The output is just concatenation of all arguments and we make current implementation a special case. So that we can dump multiple values in a single line.

I'd keep it simple and avoid having to deal with variable-length args, because once D156429 is in, we can achieve the same with !dump("the concat is = ", !format("{0} {1}...", vars...))

Or loose the constraint that first operand shoule be a string? for your example, it can be simpler: !dump(!format("the concat is = {0} {1}...", vars...)). XDump can still be a binary operator and set second operand to StringInit("") when only one operand is provided.
By doing this, it won't be imcompatible if we want to extend it to be with variable-length arguments in the future.

OK, !dump should return a value, but what about dump?

Thanks! The implementation is just what I have imagined before!

Good good! Thank you for the review.

One suggestion: can we accept variable-length arguments (at least one) and the first needn't to be a string? The output is just concatenation of all arguments and we make current implementation a special case. So that we can dump multiple values in a single line.

I'd keep it simple and avoid having to deal with variable-length args, because once D156429 is in, we can achieve the same with !dump("the concat is = ", !format("{0} {1}...", vars...))

Or loose the constraint that first operand shoule be a string? for your example, it can be simpler: !dump(!format("the concat is = {0} {1}...", vars...)). XDump can still be a binary operator and set second operand to StringInit("") when only one operand is provided.
By doing this, it won't be imcompatible if we want to extend it to be with variable-length arguments in the future.

We could do what you suggest, however please notice that dump is just a facility for TD debugging, not for string formatting. You are writing some TD file, you get unexpected results out of it and you want to debug what is going on in your code by checking the content of your variables, one by one. You find the bug, fix the code, remove the dump statements. If someone will need to debug more than a variable at once, they can use !debug + !format. Adding varargs to !debug seems to duplicate the varargs functionality of format at the cost of making the implementation of dump more complicated.

llvm/lib/TableGen/Record.cpp
1189

That would make the TableGen language depending on its build configuration. I am not sure that's the right thing to do.

OK, !dump should return a value, but what about dump?

dump is built on top of assert + !dump, so it returns the same as assert (nothing, I guess?)

Rebase is needed.
The patch looks great to me, but I will let @tra give a final approval. :-)

Rebase on top of main.

Rebase is needed.
The patch looks great to me, but I will let @tra give a final approval. :-)

Thank you,

Francesco

tra added a comment.Aug 7 2023, 12:19 PM

Sorry about the silence -- I was away for the last couple of weeks.

Thank you very much for the patch!

Or loose the constraint that first operand shoule be a string? for your example, it can be simpler: !dump(!format("the concat is = {0} {1}...", vars...)). XDump can still be a binary operator and set second operand to StringInit("") when only one operand is provided.

In general the dump(string, value) API looks a bit too specialized. I would rather have a single-argument !dump(value) and, if a value needs some annotation, just use !dump("annotation string"); !dump(value). Once we have !format() it would be a straightforward to transition to !dump(!format("annotation...", value)).

Sorry about the silence -- I was away for the last couple of weeks.

No worries, it is that time of the year for everybody!

Thank you very much for the patch!

Or loose the constraint that first operand shoule be a string? for your example, it can be simpler: !dump(!format("the concat is = {0} {1}...", vars...)). XDump can still be a binary operator and set second operand to StringInit("") when only one operand is provided.

In general the dump(string, value) API looks a bit too specialized. I would rather have a single-argument !dump(value) and, if a value needs some annotation, just use !dump("annotation string"); !dump(value). Once we have !format() it would be a straightforward to transition to !dump(!format("annotation...", value)).

I see your point, and I have actually created an implementation for it...

However, I have the following considerations:

  1. !dump is a debugging facility that is not supposed to be used in production code
  2. !format is a formatting facility that is supposed to be used in production code for generating strings out of values.

In particular, the use of 1 requires to be able to quickly find in the output of tablegen that prints the variable. It might be just my personal preference, but think it is more efficient to just type dump("a=", myvar) than !dump(!format("a=...", myvar)), especially given the fact that such use case is there as a one-off, just for debugging.

Francesco

@tra - please find the alternative implementation that uses UnaryOp at D157492

To reiterate on my argument for preferring the current implementation, consider the first few lines of the uni test.

In this current patch, we have:

defvar a = 1;
defvar b = !add(!dump("a = ", a), 1);
dump "b = ", b;
// CHECK-LABEL: a = 1
// CHECK-LABEL: b = 2

In the unary op implementation, we have:

defvar a = 1111;
defvar b = !add(!dump( a), 1);
dump  b;
// CHECK-LABEL: 1111
// CHECK-LABEL: 1112

I think that the first version is more useful for debugging: one command allows to label the variable we are debugging in the (potentially thousands) of lines of output of tablegen.

tra added a comment.Aug 9 2023, 10:40 AM

However, I have the following considerations:

  1. !dump is a debugging facility that is not supposed to be used in production code

I don't think it rules out keeping tablegen design clean.

  1. !format is a formatting facility that is supposed to be used in production code for generating strings out of values.

That part I'm fine with.

All I'm saying is that !dump() accepting a description is something that tablegen will be able to do by using a function or lambda, once D148915 lands, which would render the description part superfluous.

So, !format would handle conversion of objects to printable strings,
!dump would be responsible for printing out whatever it gets as an argument.
functions would allow users to compose whatever is needed from the primitive components.

In particular, the use of 1 requires to be able to quickly find in the output of tablegen that prints the variable.
It might be just my personal preference, but think it is more efficient to just type dump("a=", myvar) than !dump(!format("a=...", myvar)), especially given the fact that such use case is there as a one-off, just for debugging.

IMO, it's a moot point, as !dump(!format("a=...", myvar)), can be wrapped in a function, and become exactly the dump("a=", myvar) you're looking for.
The difference is that you propose to hardcode dump(description, value), while I'm suggesting to leave it up to the user to decide what's the best way for them. E.g. they may not want a description, or they may want to accept multiple values.

fpetrogalli abandoned this revision.Aug 10 2023, 12:46 AM

However, I have the following considerations:

  1. !dump is a debugging facility that is not supposed to be used in production code

I don't think it rules out keeping tablegen design clean.

  1. !format is a formatting facility that is supposed to be used in production code for generating strings out of values.

That part I'm fine with.

All I'm saying is that !dump() accepting a description is something that tablegen will be able to do by using a function or lambda, once D148915 lands, which would render the description part superfluous.

So, !format would handle conversion of objects to printable strings,
!dump would be responsible for printing out whatever it gets as an argument.
functions would allow users to compose whatever is needed from the primitive components.

In particular, the use of 1 requires to be able to quickly find in the output of tablegen that prints the variable.
It might be just my personal preference, but think it is more efficient to just type dump("a=", myvar) than !dump(!format("a=...", myvar)), especially given the fact that such use case is there as a one-off, just for debugging.

IMO, it's a moot point, as !dump(!format("a=...", myvar)), can be wrapped in a function, and become exactly the dump("a=", myvar) you're looking for.
The difference is that you propose to hardcode dump(description, value), while I'm suggesting to leave it up to the user to decide what's the best way for them. E.g. they may not want a description, or they may want to accept multiple values.

Let's go with D157492, then :)

I will update it soon.