
[lit] Reliable progress indicator and ETA
ClosedPublic

Authored by lebedev.ri on Mar 22 2021, 6:07 AM.

Details

Summary

The quality of lit's progress bar and ETA has always bothered me.

For example, given ./bin/llvm-lit /repositories/llvm-project/clang/test/CodeGen* -sv
at 1%, it says it will take 10 more minutes,
at 25%, it says it will take 1.25 more minutes,
at 50%, it says it will take 30 more seconds,
and in the end it finishes with Testing Time: 39.49s. That's wildly imprecise.

Currently, it assumes that every single test will take the same amount of time to run on average.
This is a somewhat reasonable approximation overall, but it is quite clearly imprecise,
especially in the beginning.

But after D98179, we can do better! We now know how long the tests took to run last time.
So we can build a better ETA predictor by accumulating the time spent already,
the time that will be spent on the remaining tests for which we know the previous time,
and, for the tests for which we don't have a previous time, the average time
over the tests for which we do know a current or previous run time.
It would be better to use the median, but I'm wary of the cost that may incur.
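The scheme described above can be sketched roughly like this (the function and parameter names here are hypothetical, for illustration only; the actual implementation lives in llvm/utils/lit/lit/display.py):

```python
# Rough sketch of the ETA scheme described above. All names are
# hypothetical; this is not the actual lit implementation.

def predict_progress(elapsed, finished_times, remaining_known_times,
                     num_remaining_unknown):
    """Estimate remaining time and completion fraction.

    elapsed: wall time spent so far in this run.
    finished_times: current-run times of tests that already finished.
    remaining_known_times: previous-run times of remaining tests we have data for.
    num_remaining_unknown: count of remaining tests with no previous-run time.
    """
    known = finished_times + remaining_known_times
    # Average over tests with a known current or previous run time.
    avg = sum(known) / len(known) if known else 0.0
    # Remaining = known previous times, plus the average for unknown tests.
    remaining = sum(remaining_known_times) + num_remaining_unknown * avg
    total = elapsed + remaining
    percent = elapsed / total if total > 0.0 else 0.0
    return remaining, percent
```

For example, with 10s elapsed, two finished tests (1s and 3s), two remaining tests with known previous times of 2s each, and two unknown tests, the average is 2s, so the predicted remaining time is 2 + 2 + 2*2 = 8s.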

Now, on first run of ./bin/llvm-lit /repositories/llvm-project/clang/test/CodeGen* -sv
at 10%, it says it will take 30 seconds,
at 25%, it says it will take 50 more seconds,
at 50%, it says it will take 27 more seconds,
and in the end finishes with Testing Time: 41.64s. That's pretty reasonable.

And on second run of ./bin/llvm-lit /repositories/llvm-project/clang/test/CodeGen* -sv
at 1%, it says it will take 1 more minute,
at 25%, it says it will take 30 more seconds,
at 50%, it says it will take 19 more seconds,
and in the end finishes with Testing Time: 39.49s. That's amazing, I think!

I think people will love this :)

Note that I currently have not added any test coverage here.
I guess I could try, but I'm not really sure whether this is testable.

Diff Detail

Event Timeline

lebedev.ri created this revision. Mar 22 2021, 6:07 AM
lebedev.ri requested review of this revision. Mar 22 2021, 6:07 AM
davezarzycki resigned from this revision. Mar 22 2021, 6:43 AM

Interesting. Thanks for looking into this and building on D98179. I'm sure that people will appreciate this change. :-)

(I'm not personally invested in the outcome, nor am I looking to become a regular contributor to lit, so I'll resign and defer to regular lit reviewers.)

No problem. Thank you for D98179!

yln accepted this revision. Mar 22 2021, 11:01 AM

Very cool, thanks! :)

LGTM, with nits.

llvm/utils/lit/lit/display.py
4

tests is now the list of test objects, not just a count, right?

9–10

Could now use string interpolation (because we require Python 3.6).

35

The test here should be if test.previous_elapsed: (and flip the if-else) to make sure None is counted as "zero" (in case that could ever happen) and avoid floating point comparison with 0.0. Same below in update().
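A tiny illustration of the suggested pattern (the helper name is hypothetical; the attribute name `previous_elapsed` comes from the comment above):

```python
def has_previous_time(previous_elapsed):
    # Truthiness treats both None and 0.0 as "no usable previous time",
    # which avoids an exact floating-point comparison against 0.0.
    return bool(previous_elapsed)
```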

73–79
This revision is now accepted and ready to land. Mar 22 2021, 11:01 AM
lebedev.ri marked an inline comment as done. Mar 22 2021, 11:02 AM
lebedev.ri added inline comments.
llvm/utils/lit/lit/display.py
4

Yep.

lebedev.ri marked 4 inline comments as done.

@yln @mehdi_amini thank you for taking a look!
Addressing review notes.

Shall I wait for more review here, or can I commit this?

llvm/utils/lit/lit/display.py
9–10

I think I'd prefer to leave that as-is.
If wanted, an en-masse cleanup could be performed later.

jhenderson added inline comments. Mar 23 2021, 1:36 AM
llvm/utils/lit/lit/display.py
51

Is it a major bit of reworking to confirm whether median is slower than mean? I guess if it's straightforward, you could relatively quickly compare the total time for a significant set of tests before and after to see if it really is slower.

lebedev.ri marked an inline comment as done. Mar 23 2021, 1:43 AM

@jhenderson thank you for taking a look!

llvm/utils/lit/lit/display.py
51

It certainly won't be faster, for obvious reasons. The question is how much overhead it will induce.
For a small number of tests (N=1'000, maybe N=10'000) it should be fine,
but for N=100'000 I suspect it will not be good.
In the end, it would mainly help only on the first run, when we don't have previous times for many of the tests.

Unless you insist, I think that could be left as a follow-up improvement?
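For what it's worth, the overhead question is easy to measure in isolation. A throwaway sketch with synthetic data (not part of the patch), using the standard-library `statistics` and `timeit` modules:

```python
import statistics
import timeit

# Synthetic per-test times, roughly the N=100'000 case discussed above.
times = [float(i % 97) for i in range(100_000)]

# A single pass suffices for the mean; the median needs a (partial) sort,
# so it is expected to be the more expensive of the two.
mean_cost = timeit.timeit(lambda: sum(times) / len(times), number=10)
median_cost = timeit.timeit(lambda: statistics.median(times), number=10)

print(f"mean: {mean_cost:.4f}s, median: {median_cost:.4f}s for 10 iterations")
```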

I've not looked at this area enough to give a full LGTM, but am happy if others are.

llvm/utils/lit/lit/display.py
51

Fair enough. Not something I've thought about too much, and you're right that it could be a future improvement if really needed.

This revision was automatically updated to reflect the committed changes.
lebedev.ri marked an inline comment as done.

CC @goncharov

It looks like this one is causing the Windows premerge checks to time out; see e.g. the run for this particular patch: https://buildkite.com/llvm-project/premerge-checks/builds/30849

When running ninja check-lit in a separate test setup on Windows, it fails (and hangs) with the following message:

Exception in thread Thread-3:
Traceback (most recent call last):
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python39\lib\threading.py", line 954, in _bootstrap_inner
    self.run()
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python39\lib\threading.py", line 892, in run
    self._target(*self._args, **self._kwargs)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python39\lib\multiprocessing\pool.py", line 592, in _handle_results
    cache[job]._set(i, obj)
  File "C:\Users\Administrator\AppData\Local\Programs\Python\Python39\lib\multiprocessing\pool.py", line 776, in _set
    self._callback(self._value)
  File "C:\code\llvm-project\llvm\build\bin\..\..\utils\lit\lit\display.py", line 98, in update
    percent = self.progress_predictor.update(test)
  File "C:\code\llvm-project\llvm\build\bin\..\..\utils\lit\lit\display.py", line 59, in update
    return self.time_elapsed / total_time
ZeroDivisionError: float division by zero

I went ahead and pushed a trivial fix for the division by zero. (As long as the patches the premerge bot tests are based on a version of the monorepo from before the division-by-zero fix, though, the premerge jobs will hang and time out, I guess.)
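The shape of such a guard is presumably something like this (a sketch of the idea, not the exact committed change; the function and parameter names mirror the traceback above but are otherwise hypothetical):

```python
def progress_fraction(time_elapsed, total_time):
    # Guard the division: total_time can be 0.0 if the progress callback
    # fires before any timing data has accumulated.
    if total_time <= 0.0:
        return 0.0
    return time_elapsed / total_time
```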

Thanks!
How can total_time even be 0.0 there?

Not entirely sure. For some reason it runs this print routine a couple of times right away (before any tests have completed) when I invoke it as ninja check-lit, but not when I run it as python bin\llvm-lit.py -sv ..\utils\lit\tests. (On Windows it uses SimpleProgressBar rather than the more elaborate one anyway.)