This is an archive of the discontinued LLVM Phabricator instance.

[lit] Add --ignore-fail
ClosedPublic

Authored by jdenny on Feb 9 2021, 2:20 PM.

Details

Summary

For some build configurations, check-all calls lit multiple times to
run multiple lit test suites. Most recently, I've found this to be
true when configuring openmp as part of LLVM_ENABLE_RUNTIMES, but
this is not the first time I've encountered the problem.

If one test suite fails, none of the remaining test suites run, so you
cannot tell whether your patch has broken them. It can then be
frustrating to work out which check- targets will run the
remaining tests without getting stuck on the failing ones.

When such cases arise, it is probably best to adjust the cmake
configuration for check-all to run all test suites as part of one
lit invocation. Because that fix is unlikely to be implemented and
land immediately, this patch introduces --ignore-fail to serve as a
workaround for developers trying to see test results until it does
land:

$ LIT_OPTS=--ignore-fail ninja check-all

One problem with --ignore-fail is that it makes it challenging to
detect test failures in a script, perhaps in CI. This problem should
serve as motivation to actually fix the cmake configuration instead of
continuing to use --ignore-fail indefinitely.
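
For example, a minimal sketch (not part of this patch) of how a CI script might still catch failures, assuming lit's usual "FAIL: " result lines:

import os
import subprocess
import sys

# Run check-all with --ignore-fail so every lit invocation completes.
env = dict(os.environ, LIT_OPTS="--ignore-fail")
proc = subprocess.run(["ninja", "check-all"], env=env,
                      capture_output=True, text=True)
sys.stdout.write(proc.stdout)
# The exit status is now useless for this purpose, so scan the output
# instead: lit prefixes each failing test's result line with "FAIL: ".
if any(line.startswith("FAIL: ") for line in proc.stdout.splitlines()):
    sys.exit(1)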

Diff Detail

Event Timeline

jdenny created this revision. Feb 9 2021, 2:20 PM
jdenny requested review of this revision. Feb 9 2021, 2:20 PM
Herald added a project: Restricted Project.
Herald added a subscriber: sstefan1.

Personally, I think it should be the default to run all testsuites rather than exiting after the first failure. The exit code should then be the combined one, i.e. whether any of the suites has failed.

Is it not possible to pass the exit code up the stack rather than sys.exit(1) there? The caller (llvm-lit.py?) of the multiple suites would then be responsible for checking all main() calls passed.
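
As a rough illustration of this suggestion (hypothetical names; lit's actual code is organized differently):

import sys

def run_suite(suite):
    # Placeholder: run one suite's tests; return 0 on success, 1 on failure.
    print("running", suite)
    return 0

def main(suites):
    status = 0
    for suite in suites:
        # Keep going after a failure, but remember that one occurred.
        status |= run_suite(suite)
    return status

if __name__ == "__main__":
    # The caller sees a combined exit code instead of an early sys.exit(1).
    sys.exit(main(sys.argv[1:]))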

Personally, I think it should be the default to run all testsuites rather than exiting after the first failure. The exit code should then be the combined one, i.e. whether any of the suites has failed.

Is it not possible to pass the exit code up the stack rather than sys.exit(1) there? The caller (llvm-lit.py?) of the multiple suites would then be responsible for checking all main() calls passed.

Was about to comment the same. Something à la --keep-going instead of --ignore-fail.

Personally, I think it should be the default to run all testsuites rather than exiting after the first failure. The exit code should then be the combined one, i.e. whether any of the suites has failed.

Is it not possible to pass the exit code up the stack rather than sys.exit(1) there? The caller (llvm-lit.py?) of the multiple suites would then be responsible for checking all main() calls passed.

The caller of lit each time is ninja/make. To fix this case, cmake files would likely be redesigned so that lit is called once for all suites together. Surely that's technically possible, and I agree it would be better than using --ignore-fail.

However, this isn't the first time I've seen check-all call lit multiple times, and I expect it will happen again even if this case is fixed. This patch provides an easy workaround every time, and it has a small footprint within lit.

Personally, I think it should be the default to run all testsuites rather than exiting after the first failure. The exit code should then be the combined one, i.e. whether any of the suites has failed.

Is it not possible to pass the exit code up the stack rather than sys.exit(1) there? The caller (llvm-lit.py?) of the multiple suites would then be responsible for checking all main() calls passed.

The caller of lit each time is ninja/make. To fix this case, cmake files would likely be redesigned so that lit is called once for all suites together. Surely that's technically possible, and I agree it would be better than using --ignore-fail.

However, this isn't the first time I've seen check-all call lit multiple times, and I expect it will happen again even if this case is fixed. This patch provides an easy workaround every time, and it has a small footprint within lit.

Surely you can ignore the error code in the lit caller one way or another?

Personally, I think it should be the default to run all testsuites rather than exiting after the first failure. The exit code should then be the combined one, i.e. whether any of the suites has failed.

Is it not possible to pass the exit code up the stack rather than sys.exit(1) there? The caller (llvm-lit.py?) of the multiple suites would then be responsible for checking all main() calls passed.

The caller of lit each time is ninja/make. To fix this case, cmake files would likely be redesigned so that lit is called once for all suites together. Surely that's technically possible, and I agree it would be better than using --ignore-fail.

However, this isn't the first time I've seen check-all call lit multiple times, and I expect it will happen again even if this case is fixed. This patch provides an easy workaround every time, and it has a small footprint within lit.

Surely you can ignore the error code in the lit caller one way or another?

Agreed. But my point is that --ignore-fail can serve as a workaround each time this happens until someone figures out how to fix the cmake scripts.

I'm still not sure; I think a failure is a powerful motivator to fix something, but the patch LGTM otherwise. @jhenderson what do you think?

I'm still not sure; I think a failure is a powerful motivator to fix something, but the patch LGTM otherwise. @jhenderson what do you think?

Sorry, I'm a little bit busy with other reviews, so haven't had a chance to dig into this properly yet.

If I follow what you're saying correctly, rather than check-all executing lit in a single run across multiple directories, in some cases, it is spawning a new lit process to run some test subset. Is that right? If so, I wonder if there's a better solution: getting check-all to actually run all the tests as one big test run.

jdenny edited the summary of this revision. Feb 12 2021, 10:20 AM

I'm still not sure; I think a failure is a powerful motivator to fix something, but the patch LGTM otherwise. @jhenderson what do you think?

Sorry, I'm a little bit busy with other reviews, so haven't had a chance to dig into this properly yet.

If I follow what you're saying correctly, rather than check-all executing lit in a single run across multiple directories, in some cases, it is spawning a new lit process to run some test subset. Is that right?

Yes. In the current case, that subset is for runtimes specified in LLVM_ENABLE_RUNTIMES. I believe the last case I saw was compiler-rt/test/gwp_asan/unit, but I think that one has since been fixed.

If so, I wonder if there's a better solution: getting check-all to actually run all the tests as one big test run.

Agreed.

However, especially after these responses, I see --ignore-fail as a temporary workaround. By "temporary", I mean until the cmake config issue is fixed. Still, I see it as a permanent lit feature because this isn't the first time our cmake config has had this issue, and I'm betting it won't be the last time. Otherwise, I wouldn't have proposed this patch upstream.

I'm still not sure; I think a failure is a powerful motivator to fix something

Good point. Even so, I think there is still motivation to fix such cmake issues: because the exit status is zero when using --ignore-fail, trying to detect failures in a script can be ugly.

I've updated the review summary to account for everyone's feedback, which has honed my view. Thanks.

I'm coming around to the benefits of this, but first have you considered whether D96662 would be sufficient for your needs? It seems that you could use that to ignore/xfail the bad tests and continue, without using --ignore-fail and risking accidentally missing failures to do with your work (possibly in more obscure places).

I'm coming around to the benefits of this, but first have you considered whether D96662 would be sufficient for your needs? It seems that you could use that to ignore/xfail the bad tests and continue,

--xfail could prove frustrating for my current use case. The problem is that openmp test failures are often racy. I'd have to run check-all potentially many times trying to make my --xfail match the failing tests.

--skip avoids that problem because it doesn't care about the specified tests' results. But I want to see those results, so I'd need to run check-all twice, once with the problematic tests, and once skipping them. Seems ok, but...

without using --ignore-fail and risking accidentally missing failures to do with your work (possibly in more obscure places).

This point almost convinced me in the case of --skip. However, one advantage of --ignore-fail is that it makes it possible to simply run everything. With --ignore-fail, I don't have to figure out the right pattern that skips only the tests in the first lit invocation without accidentally skipping other tests I didn't realize matched my pattern. For example, some openmp-related patterns would match in both lit invocations because clang's test suite runs in the second lit invocation. With --ignore-fail, I don't have to run check-all (almost 2 hours on my laptop) many times trying to get the --skip pattern just right. Nor will I get through the second lit invocation, think I'm done, and fail to realize a third lit invocation has been introduced (no, I'm not aware of a third right now). --ignore-fail just runs everything.

The caveat is that I might miss test failures in the scroll, but anyone using --ignore-fail should know they're ignoring failures, so surely they're on the lookout.

To be clear, I think --xfail and --skip will prove useful for other purposes, especially when you're investigating a smaller set of tests or trying to write careful scripts. In contrast, check-all can be huge. I don't want to tinker trying to make it run everything. After waiting two hours, I just want to see the test results... especially if I previously ran it once, came back two hours later, and discovered it skipped entire test suites.

Okay. I'm happy to proceed. I think the lit documentation needs updating to mention this change?

llvm/utils/lit/tests/ignore-fail.py, line 8

What's the point of this line?

jdenny marked an inline comment as done. Feb 17 2021, 9:12 AM

Okay. I'm happy to proceed.

Thanks for considering it. Likewise to @thopre.

I think the lit documentation needs updating to mention this change?

You mean lit.rst? I'm happy to put it there. However, first, what is the threshold in general? Many options are missing there. Is that intentional or by accident? (If it's always by accident, we really ought to consider generating that part from lit --help so we don't have to maintain two versions of the same thing. That should be a different patch of course.)
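
(As a sketch of that idea, purely hypothetical since lit provides no such generator: register the options with argparse and dump the formatted help text for the docs to include.)

import argparse

# Hypothetical: rebuild (or import) lit's option parser.
parser = argparse.ArgumentParser(prog="lit")
parser.add_argument("--ignore-fail", action="store_true",
                    help="Exit with status zero even if some tests fail")
# ... the remaining options, as registered in lit.cl_arguments ...

# Write the help text to a file that lit.rst could incorporate.
with open("lit-options.txt", "w") as f:
    f.write(parser.format_help())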

llvm/utils/lit/tests/ignore-fail.py, line 8

Without it, lit chokes on XFAIL: below. I actually didn't notice it mattered when I wrote it. But I'm developing a habit of using END. in lit's test suite, where lit directives often appear in FileCheck patterns.
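
For illustration, a trimmed, hypothetical version of that pattern:

# RUN: %{lit} --ignore-fail %{inputs}/ignore-fail | FileCheck %s
# END.
# lit stops scanning for test directives at "END.", so the "XFAIL:" text
# in the FileCheck pattern below is not misparsed as a directive for this
# test file itself.
# CHECK: XFAIL: ignore-fail :: xfail.txt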

I think the lit documentation needs updating to mention this change?

You mean lit.rst? I'm happy to put it there. However, first, what is the threshold in general? Many options are missing there. Is that intentional or by accident?

At least in the LLVM binutils, the threshold has been "any option", and the only way to catch it is to make sure reviewers remember to mention it in the review. I think it's likely the missing options are all accidental, but I haven't dug into the history to confirm this one way or the other.

(If it's always by accident, we really ought to consider generating that part from lit --help so we don't have to maintain two versions of the same thing. That should be a different patch of course.)

I am not opposed to someone doing that, but good luck getting it to work nicely (complete with sub-sectioning as is already present). Any solution should be applied more generally across the other tools with documentation in the Command Guide, I think. I also wouldn't say that the help text and the documentation should always match, so they're not always two places for the same thing. In general, the documentation can be more detailed, with e.g. examples, which don't really belong in the help text.

jdenny updated this revision to Diff 325440. Feb 22 2021, 7:14 AM

Addressed @jhenderson's review. Rebased.

I think the lit documentation needs updating to mention this change?

You mean lit.rst? I'm happy to put it there. However, first, what is the threshold in general? Many options are missing there. Is that intentional or by accident?

At least in the LLVM binutils, the threshold has been "any option", and the only way to catch it is to make sure reviewers remember to mention it in the review. I think it's likely the missing options are all accidental, but I haven't dug into the history to confirm this one way or the other.

(If it's always by accident, we really ought to consider generating that part from lit --help so we don't have to maintain two versions of the same thing. That should be a different patch of course.)

I am not opposed to someone doing that, but good luck getting it to work nicely (complete with sub-sectioning as is already present). Any solution should be applied more generally across the other tools with documentation in the Command Guide, I think.

Agreed, but we could start with one tool and extend to others as we figure out the best approach.

I also wouldn't say that the help text and the documentation should always match, so they're not always two places for the same thing. In general, the documentation can be more detailed, with e.g. examples, which don't really belong in the help text.

Perhaps there could be some way to override/extend the help text for specific options.

Well, for now, I've added --ignore-fail to lit.rst, and I'm not going to work on this more general approach. Maybe later. Thanks for presenting your thoughts.

jhenderson accepted this revision. Feb 24 2021, 1:29 AM

LGTM, thanks!

This revision is now accepted and ready to land. Feb 24 2021, 1:29 AM
This revision was automatically updated to reflect the committed changes.