This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Parse the dotest output to determine the most appropriate result code
ClosedPublic

Authored by JDevlieghere on Jun 7 2022, 3:28 PM.

Details

Summary

Currently we look for keywords in the dotest.py output to determine the result code. By parsing the dotest output and counting the number of tests that passed, failed, were skipped, etc., we can be more structured in determining the appropriate lit result code. We're still mapping multiple tests to one result code, so some loss of information is inevitable, but I think the new approach is much saner.
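The counting approach described above could be sketched roughly like this. This is a hypothetical simplification, not the patch's actual code: the real lldbtest.py uses lit result-code objects rather than strings, and the exact shape of the dotest output lines differs.

```python
import re
from collections import Counter

# Hypothetical per-method result keywords; the real dotest output and
# lit result codes differ in detail.
RESULT_RE = re.compile(r"^(PASS|FAIL|XFAIL|UNSUPPORTED|UNRESOLVED): ", re.M)

def pick_result(dotest_output):
    # Count how many test methods ended with each result keyword.
    counts = Counter(m.group(1) for m in RESULT_RE.finditer(dotest_output))
    if not counts:
        return "UNRESOLVED"
    # Map many per-method results to one lit result code by picking the
    # keyword that occurred most often (so 3 passes + 4 skips would be
    # reported as UNSUPPORTED).
    code, _ = max(counts.items(), key=lambda kv: kv[1])
    return code
```

For example, `pick_result("PASS: test_a\nFAIL: test_b\nFAIL: test_c\n")` returns `"FAIL"` because failures outnumber passes.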

Diff Detail

Event Timeline

JDevlieghere created this revision.Jun 7 2022, 3:28 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 7 2022, 3:28 PM
JDevlieghere requested review of this revision.Jun 7 2022, 3:28 PM
mib accepted this revision.Jun 7 2022, 3:35 PM

Makes more sense. LGTM!

This revision is now accepted and ready to land.Jun 7 2022, 3:35 PM
kastiglione accepted this revision.Jun 7 2022, 3:55 PM

I was in the process of writing a comment saying that we could be even more accurate by parsing the number of tests with a particular result code from the dotest output, but that it seemed like overkill. Then I changed my mind mid-sentence and implemented it instead.

mib accepted this revision.Jun 7 2022, 4:13 PM

Even better! LGTM!

JDevlieghere retitled this revision from [lldb] Mark API tests as XFAIL if they have expected failures and no passing tests to [lldb] Parse the dotest output to determine the most appropriate result code.Jun 7 2022, 4:13 PM
JDevlieghere edited the summary of this revision. (Show Details)

Here are the results:

Before:

  Unsupported: 145
  Passed     : 873
  Failed     :   7

After:

  Unsupported      : 167
  Passed           : 834
  Expectedly Failed:  16
  Unresolved       :   1
  Failed           :   7

Note that this patch does change the behavior: passes and failures are no longer prioritized. That means that if a test has 3 passes and 4 skips, it will be reported as unsupported (while currently it's considered a pass). We could special-case PASS and FAIL so that if there's at least one PASS or FAIL, the test is marked as such. That would be more in line with the current state of things.

That means that if you have a test that has 3 passes and 4 skips, it will be reported as unsupported (while currently it's considered a pass).

My initial reaction to this is that I don't love it. I think N passes and M skips is a pass, even if M>N.
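The special case discussed above might look something like the sketch below. This is a hedged illustration, not the patch's actual code; `counts` is a hypothetical dict of per-keyword tallies.

```python
def pick_result_prioritized(counts):
    # Hypothetical special case: a single failure fails the test, and a
    # single pass passes it, regardless of how many methods were skipped.
    if counts.get("FAIL"):
        return "FAIL"
    if counts.get("PASS"):
        return "PASS"
    # Otherwise fall back to whichever result keyword occurred most often.
    code, _ = max(counts.items(), key=lambda kv: kv[1],
                  default=("UNRESOLVED", 0))
    return code
```

With this, 3 passes and 4 skips would still be reported as a pass, matching the reviewer's intuition that N passes and M skips is a pass even if M > N.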

lldb/test/API/lldbtest.py
91

previously, out was also being searched; I take it that's not actually needed?

115

you could use operator.itemgetter(0) instead of the lambda.
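For reference, the two spellings are equivalent (toy data below, not the actual lit_results structure):

```python
import operator

# Toy (count, code) pairs; not the patch's actual data.
pairs = [(3, "PASS"), (1, "FAIL"), (2, "XFAIL")]

by_lambda = sorted(pairs, key=lambda pair: pair[0])
by_getter = sorted(pairs, key=operator.itemgetter(0))
assert by_lambda == by_getter  # both sort by the count in position 0
```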

instead of reducing and returning a single result, can we return the raw counts and then report the totals of the counts?

Mark the test as FAIL or PASS if there's at least one test that respectively failed or passed.

instead of reducing and returning a single result, can we return the raw counts and then report the totals of the counts?

No, the lit test format does not support that.

kastiglione accepted this revision.Jun 8 2022, 8:36 AM
kastiglione added inline comments.
lldb/test/API/lldbtest.py
122

you could alternatively use max(lit_results, key=...)
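That is, something like the following (illustrative data only):

```python
import operator

# Illustrative (count, code) tuples, not the patch's actual data.
lit_results = [(1, "FAIL"), (4, "UNSUPPORTED"), (3, "PASS")]

# Equivalent to sorted(lit_results, key=operator.itemgetter(0))[-1],
# but picks the maximum in one pass without sorting the whole list.
count, code = max(lit_results, key=operator.itemgetter(0))
```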

Herald added a project: Restricted Project. · View Herald TranscriptJun 8 2022, 8:58 AM