This is an archive of the discontinued LLVM Phabricator instance.

[clang-query] Add non-exclusive output API
ClosedPublic

Authored by steveire on Oct 3 2018, 6:17 PM.

Details

Summary

Add granular options for AST dumping, text printing and diagnostics.

This makes it possible to

  • Have both diag and dump active at once
  • Extend the output with other queryable content in the future.

Event Timeline

steveire created this revision.Oct 3 2018, 6:17 PM
steveire updated this revision to Diff 168218.Oct 3 2018, 6:52 PM

Commit message

steveire added inline comments.
clang-query/Query.cpp
50

Is the 'print' output useful at all? I'm thinking the updated API shouldn't support it at all. Thoughts?

steveire added a reviewer: pcc.Oct 3 2018, 7:07 PM

Can I convince someone to be interested enough in this to approve it?

I have a talk coming up and I'd like to be able to say that you don't have to switch between dump mode and diag mode all the time.

I also want to enable other relevant outputting features in clang-query that I have in the pipeline.

I'm not all that keen on the idea of deprecating set output like this -- that command has been around since forever, and scripts outside of our control are likely to rely on it. However, it seems like you should be able to make set output accept a comma-delimited list of options. e.g., set output dump, diag This would give you the same level of granularity while still being backwards compatible. Do you think that would work for your needs?

  • The scripts will continue to work at least until set output is removed, which is not going to happen soon.
  • A comma-delimited list of options means that if I have foo, bar, bat enabled and want to add bang, I need to set output foo, bar, bat, bang. Or alternatively if I want to remove bat, I need to write out all the others. I don't think that's suitable.
  • The scripts will continue to work at least until set output is removed, which is not going to happen soon.

Certainly, but given that deprecation implies eventual removal, people are still being asked to update their scripts. What's more, given that clang-output has no real documentation to speak of, how will users even *know* to update their scripts? Rather than invent ways to address these concerns, I'm much more in favor of making the feature work using a backwards compatible syntax. Then no one has to update their scripts to get identical functionality.

  • A comma-delimited list of options means that if I have foo, bar, bat enabled and want to add bang, I need to set output foo, bar, bat, bang. Or alternatively if I want to remove bat, I need to write out all the others. I don't think that's suitable.

Correct -- if you want different output because new options are available, you need to opt into it. I'm not certain what the issue is with that; can you expound on your concerns? From what I'm seeing in the patch, it looks like you want user to write:

set dump-output
set diag-output
match foo(bar(baz()))

and I'm asking for it to instead be:

set output dump, diag
match foo(bar(baz()))

Functionally, I believe these are equivalent. However, I think the latter is more clear for users as it implies a set of output options rather than leaving you to wonder whether set blah-output options are mutually exclusive or not.

What's more, given that clang-output has no real documentation to speak of, how will users even *know* to update their scripts?

The release notes will tell them.

But your comment also raises a different point: If there is so little clang-query documentation, what justification is there for relying on current behavior?

Also, what scripts are you referring to that rely on clang-query? Do we know of any that are expected to work long-term without maintenance (unlikely, as the AST itself is not stable), or is this concern imagined?

can you expound on your concerns?

After this patch, the user writes

set dump-output true

or false.

and after the follow-up (https://reviews.llvm.org/D52859) they write

set matcher-output true

After more features I have in the pipeline they will write other similar commands.

With my command-design, if a user wants to enable dump output in addition to what they have, they just

set dump-output true

In your design, they have to recall/find out what options are currently enabled, and add them in a comma separated list. That is not user-friendly.

wonder whether set blah-output options are mutually exclusive or not.

These options are obviously and clearly toggles. There is definitely no reason to think that they are mutually exclusive any more than there is reason to think set output dump is mutually exclusive with set bind-root false.

And even if someone did wonder that, they would just try it and learn that they are toggles.

What's more, given that clang-output has no real documentation to speak of, how will users even *know* to update their scripts?

The release notes will tell them.

That's not user friendly.

But your comment also raises a different point: If there is so little clang-query documentation, what justification is there for relying on current behavior?

That's pretty much the only reason I think it's reasonable to consider deprecating the command in the first place. The justification for relying on it is: this has been the behavior since Day 1 of clang-query and we do not often remove user-facing options without there being a Very Good Reason to do so. This is not a situation where it's an undocumented feature that's in-flux or there's no indication it's being used.

Also, what scripts are you referring to that rely on clang-query? Do we know of any that are expected to work long-term without maintenance (unlikely, as the AST itself is not stable), or is this concern imagined?

I'm referring to third-party scripts kept by people out of tree. I have several, I know a few other folks internally at my company who do as well.

I think it's reasonable to expect differences in AST output when updating clang-query, but this is not that. It's reasonable to expect command stability.

can you expound on your concerns?

After this patch, the user writes

set dump-output true

or false.

and after the follow-up (https://reviews.llvm.org/D52859) they write

set matcher-output true

After more features I have in the pipeline they will write other similar commands.

With my command-design, if a user wants to enable dump output in addition to what they have, they just

set dump-output true

In your design, they have to recall/find out what options are currently enabled, and add them in a comma separated list. That is not user-friendly.

Yes -- you are required to make a decision that you want to opt in to new functionality. The exact same thing is true with your proposal -- you have to find the right place to stick the set dump-output true in order to enable it.

wonder whether set blah-output options are mutually exclusive or not.

These options are obviously and clearly toggles.

Your notions of obvious and clear do not match mine.

There is definitely no reason to think that they are mutually exclusive any more than there is reason to think set output dump is mutually exclusive with set bind-root false.

Well, except for the fact that each of these commands says "output" in the name and they are replacing a command argument named "output" that was mutually exclusive and there's zero documentation explaining the commands.

And even if someone did wonder that, they would just try it and learn that they are toggles.

I don't see this functionality being so critical that we need to deprecate the existing spelling when there are backwards compatible options available, which is why I'm opposed to this patch going in with the proposed syntax.

you have to find the right place to stick the set dump-output true in order to enable it.

What do you mean "the right place"?

you have to find the right place to stick the set dump-output true in order to enable it.

What do you mean "the right place"?

The point from which they want to enable dump outputting.

I don't see this functionality being so critical that we need to deprecate the existing spelling when there are backwards compatible options available, which is why I'm opposed to this patch going in with the proposed syntax.

I don't think we're going to go anywhere except around in circles :). I don't see your comma separated syntax as viable. Maybe there is a third way that both of us don't see.

I tried to catch you on IRC without success. Thanks for the reviews! I appreciate your time. I'll see if I can find a different reviewer for this in a while.

Thanks,

@steveire and I chatted over IRC and I wanted to capture another alternate proposal idea as part of the review (we did not reach agreement on this proposal, however). I've provided it side-by-side with the other proposals for comparison purposes.

# Original proposal
set dump-output true
match foo(bar(baz())) # Uses one output format
match bar(foo()) # Uses one output format
set diag-output true
set matcher-output true
match foo(bar(baz())) # Uses all three output formats
match bar(foo()) # Uses all three output formats
set diag-output false
set dump-output false
match foo(bar(baz())) # Uses only matcher output
 
# First Alternative
set output dump
match foo(bar(baz())) # Uses one output format
match bar(foo()) # Uses one output format
set output dump, diag, matcher
match foo(bar(baz())) # Uses all three output formats
match bar(foo()) # Uses all three output formats
set output matcher
match foo(bar(baz())) # Uses only matcher output
 
# Second Alternative
set output dump
match foo(bar(baz())) # Uses one output format
match bar(foo()) # Uses one output format
set output diag toggle on
set output matcher toggle on
match foo(bar(baz())) # Uses all three output formats
match bar(foo()) # Uses all three output formats
set output matcher
match foo(bar(baz())) # Uses only matcher output
steveire updated this revision to Diff 170304.Oct 20 2018, 1:46 AM

Rename dump-output to ast-output.

steveire marked an inline comment as done.Oct 20 2018, 1:48 AM
steveire updated this revision to Diff 170305.Oct 20 2018, 1:50 AM

Fix tests

steveire updated this revision to Diff 170308.Oct 20 2018, 2:40 AM

Update test

steveire added a comment.EditedOct 20 2018, 2:49 AM

Perhaps the best solution is to introduce this new API, but not deprecate the existing 'exclusive' API.

What do you think?

Then the only thing I would have to do with this patch is restore the documentation of 'set output foo'

steveire updated this revision to Diff 170310.Oct 20 2018, 4:47 AM

Don't deprecate existing API

steveire retitled this revision from Deprecate 'set output foo' API of clang-query to [clang-query] Add non-exclusive output API .Oct 20 2018, 4:47 AM
steveire edited the summary of this revision. (Show Details)
steveire updated this revision to Diff 170435.Oct 22 2018, 8:56 AM
steveire retitled this revision from [clang-query] Add non-exclusive output API to [clang-query] Add non-exclusive output API.

New design

steveire updated this revision to Diff 170439.Oct 22 2018, 9:03 AM

Update docs

pcc added a comment.Oct 22 2018, 12:01 PM

I'm not really involved in clang-query development these days, so this is more of a drive-by comment. It wasn't really obvious to me what the wording of the help text for set enable-output meant:

Set whether to enable <feature> content non-exclusively.

I figured out from the name of the option and from reading the source code that it just means "Enable output in format <feature>." This seems confusing to me because I would expect set foo bar to set the value of a variable foo to bar, which doesn't match what this command does. What do you think about spelling it as something like enable output <feature>?

This seems confusing to me because I would expect set foo bar to set the value of a variable foo to bar,

This is already not the case with set output diag etc.

That said, I don't mind spelling this <enable|disable> output foo. It's not my decision. It's up to the reviewers.

This is already not the case with set output diag etc.

Oh, I'm wrong here. I misread your post, sorry.

I prefer the API from Peter. I think it's a good additional step from where Aaron and I reached in IRC discussion (this patch currently).

I can change the patch to use that later if you agree Aaron?

I prefer the API from Peter. I think it's a good additional step from where Aaron and I reached in IRC discussion (this patch currently).

I can change the patch to use that later if you agree Aaron?

To be extra clear, we're talking about changing "set enable-output detailed-ast" to be "enable output detailed-ast" instead? If so, I'm okay with that -- it seems like an improvement.

Yep, that's the suggestion. That will result in commands such as:

  • enable output detailed-ast
  • disable output detailed-ast
  • set output detailed-ast
  • enable output diag
  • disable output diag
  • set output diag

etc, which I think addresses all concerns.

Yep, that's the suggestion. That will result in commands such as:

  • enable output detailed-ast
  • disable output detailed-ast
  • set output detailed-ast
  • enable output diag
  • disable output diag
  • set output diag

etc, which I think addresses all concerns.

Agreed -- I think this is the right path forward.

steveire updated this revision to Diff 171046.Oct 25 2018, 1:08 AM

New command design

aaron.ballman accepted this revision.Oct 29 2018, 5:10 AM

LGTM with some minor fixes.

At some point (not necessarily as part of this patch), you should also update docs\ReleaseNotes.rst to broadly list the new features you've been adding to clang-query.

clang-query/QueryParser.cpp
294

This seems incorrect to me; we expect the command identifier output here, don't we?

300–306

Elide braces here.

This revision is now accepted and ready to land.Oct 29 2018, 5:10 AM
This revision was automatically updated to reflect the committed changes.
steveire marked 2 inline comments as done.Oct 29 2018, 12:09 PM
steveire added inline comments.
clang-query/QueryParser.cpp
294

Actually this is the case where the user writes enable with no more output.

I match set command behavior and output here.

I added a test for this, similar to the existing equivalent set test.