Page MenuHomePhabricator

[lit] Improve lit's output with default settings and --verbose.
Needs ReviewPublic

Authored by varungandhi-apple on Jun 29 2020, 10:47 AM.
This revision needs review, but all reviewers have resigned.

Details

Reviewers
yln
Summary
  • Before:
    • default (no flags): no script, no command output
    • -v: full script, no command output
    • -vv: full script, adds 'set +x', shows command output
  • After:
    • default (no flags): no script, show only failing line in command output
    • -v: full script, adds 'set +x', shows command output
    • -vv: alias to -v (preserve backwards compatibility)

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptJun 29 2020, 10:47 AM

Include missing commit separating verbose and showOutput.

Herald added a project: Restricted Project. · View Herald TranscriptJun 29 2020, 2:33 PM
  • [docs] [lit] Add a more helpful description for lit.py's -s flag.
  • [NFC] [lit] Separate verbose and showOutput.
  • [lit] Improve lit's output with default settings and --verbose.
jdenny added a subscriber: jdenny.Jun 29 2020, 3:31 PM

Rebase after changes in D82808.

yln added a comment.Jul 6 2020, 12:00 PM

The changes here are for text printed for failing tests, right?
And showAllOutput is orthogonal and just shows everything even for non-failing tests?

llvm/utils/lit/lit/OutputSettings.py
35

I really like the "communicating intent" part of this infrastructure. However, this is a lot of code and ifs considering we really only have to distinguish two cases. From your commit message:

  • default (no flags): no script, show only failing line in command output
  • -v: full script, adds 'set +x', shows command output

Am I missing something or could everything be keyed off a single verbose (reading from the code below showAllOutput implies verbose) flag. Do we anticipate other combinations being useful? (In general I would argue for striving for the simplest implementation that gives us what we currently want and not try to anticipate future extensions.)

Have you considered (or started out with) just using a single verbose flag to base your decisions in the implementation functions?

llvm/utils/lit/lit/TestRunner.py
1499

I feel like this function is the most complex part of this patch. I don't fully follow it, but you experimented and believe this is the best approach and wrote a whole test suite so I am happy with it.

1503–1508

Why use a complicated parser-like return value? Our only caller below could just receive the potentially found RUN line.

1593

Why are we using two local functions here?

The whole thing could be (already assuming just one verbose flag):

def make_script_output(lit_config, script_lines, exit_code):
    if not lit_config.verbose:
        return ""
    return ...
1614–1617

This block should be pushed into the lit_config.command_output_style == OutputSettings.ONLY_FAILING_COMMAND case, otherwise we are always searching for the last line, even if we don't really use the result of the computation.

Also: if we fail to find the RUN line, we might print the note to use --verbose even if the user already specified it.

1624

Please try to simplify this a bit, maybe the following?

def make_command_output():
    if not lit_config.quiet:
        return ""
    
    verbose = lit_config.verbose
    output = "header {} ...".format(", truncated" if verbose else "")
    if verbose:
        output += cmd_output
    else:
        run_line = locate_last_run_line() # then deal with "not found" case
        output += ...
        output += "Note:  try to use --verbose"

     return output
llvm/utils/lit/lit/cl_arguments.py
56

I think aliases kwarg is something else (seems to be related to subparsers). You could just add "-vv", "--echo-all-commands" after "--verbose" to allow for additional names for the flag, but I think I prefer to have it separate (makes it easer to remove it if we ever decide to drop the alias in the future).

So long comment short: just drop this comment please.

177

Unconditionally overwritten below

194

opts.echo_all_commands = opts.command_output_style in {OutputSettings.UP_TO_AND_INCLUDING_FAILING_COMMAND, ...}

llvm/utils/lit/tests/shtest-run-at-line.py
70

I really like the extension to this test. Thanks!

llvm/utils/lit/tests/unittest-failing-locator.py
123

Thanks for being diligent and adding these tests. This will also make it easy to add additional tests for corner cases if we need them in the future!

varungandhi-apple marked 2 inline comments as done.Jul 6 2020, 12:29 PM

Thanks for the review, it's a big patch. 😅 I'm a bit busy at the moment, I will respond to the other comments later this week or sometime next week.

varungandhi-apple marked 11 inline comments as done.

Address review comments.

Some of the changes probably don't make perfect sense in this revision because I initially started out with the whole change being one revision and then split things into commits, while trying to minimize the churn between revisions. Please also take a look at the follow-up revision D82811 and let me know if that's acceptable (less churn, less self-contained, things make sense when you look at the two together) or if you'd prefer that I make this revision more self-contained (that would create more churn in D82811).

llvm/utils/lit/lit/OutputSettings.py
35

Not sure what you mean by two cases, I think there are three cases:

  1. Quiet -> Corresponds on NO_COMMAND
  2. Default (no quiet, no verbose) -> Corresponds to ONLY_FAILING.
  3. Verbose -> Corresponds to UP_TO_AND_INCLUDING_FAILING.

Since there is a 1-1 correspondence, we _could_ compare quiet vs (!quiet && !verbose) vs verbose in make_command_output. I chose not to do that because I figured this makes the intent clearer by distinguishing between user-specified options and derived options.

Does that make sense? Would you still prefer that I get rid of this?

llvm/utils/lit/lit/TestRunner.py
1503–1508

Sorry, it's not super clear because in this revision, there is only one caller, which could do with just using the index. However, the highlighting patch (which comes right after this logically) ends up making use of the substring. The line of reasoning is:

  1. A substring by itself is not good enough (and hence make_command_output in this patch makes use of the first index return value), because we want to distinguish the case line_start == 0 from the case line_start > 0 by printing Command Output (stderr) instead of Command Output (stderr, truncated) when line_start == 0.
  2. An index by itself could be "good enough", but we try to be smarter in the highlighting patch (https://reviews.llvm.org/D82811) by not highlighting the shell-specific prefix, and starting highlighting from the "RUN", which make_script_output makes use of.

Does that make sense? I decided against splitting changes to this function into two revisions and have the full functionality in the first iteration because I felt like that would create more churn with the tests and the implementation, but I see how it can be confusing as to why two related-but-slightly-different-values are being returned simultaneously when only one is really being used in this patch.

1593

It mimics the structure of the make_command_output function right below. Right now, this function is very simple, so it probably seems overkill. However, make_script_output becomes much more complicated in the highlighting patch after this: https://reviews.llvm.org/D82811, so I figured it would be helpful to have the two functions make_script_output and make_command_output follow a similar pattern.

1614–1617

This block should be pushed into the lit_config.command_output_style == OutputSettings.ONLY_FAILING_COMMAND case, otherwise we are always searching for the last line, even if we don't really use the result of the computation.

This part is actually used by the highlighting patch (https://reviews.llvm.org/D82811), where the last line of this function becomes:

failing_lines = highlight_failure_lines(cmd_output[line_start:])
return make_output(cmd_output[:line_start] + failing_lines,
                   is_truncated=False)

That's why I didn't push it into the conditional branch.

Also: if we fail to find the RUN line, we might print the note to use --verbose even if the user already specified it.

Good catch, I've fixed make_output to check for is_truncated properly.

yln added inline comments.Aug 31 2020, 11:55 AM
llvm/docs/CommandGuide/lit.rst
102

blank line

llvm/utils/lit/lit/OutputSettings.py
35

Okay, I agree now that having an enum-like thing is really nice because we can give it a good explanatory name.

Can we roll the following options into one enum-like OutputStyle.XXX corresponding to our 3 cases: "quiet, default/only_failing, verbose/up_to_first_failing"; since the user's can't configure them individually.

I imagine the following things
opts.show_output_on_failure, opts.script_output_style, opts.command_output_style, and opts.quiet being replaced with opts.output_style

llvm/utils/lit/lit/TestRunner.py
1503–1508

Yes, please implement it in the "simplest way possible" for the current functionality without anticipating future requirements. We can always make it more complex if required for a future feature, but's it's harder to "see" that there is opportunity for simplification once the code is there.

Also, another reviewer might suggest a solution you haven't thought about.

For example, could highlighting be implemented via (without worrying about indexes)?

s = get_important_bits(full_output)
s_highlighted = '<highlight>{}</highlight>'.format(s)
highlighted_output = full_output.replace(s, s_highlighted)
1593

I like symmetry in code, but please simplify this as much as possible.

llvm/utils/lit/lit/cl_arguments.py
194

Or even simpler (the 1 out of 3 cases that doesn't set it to true)?

opts.echo_all_commands = opts.output_style != OutputStyle.QUIET

yln resigned from this revision.Jul 22 2021, 3:59 PM
martong removed a subscriber: martong.Jul 28 2021, 9:18 AM