Currently the UNSUPPORTED and XFAIL clauses support specifying
substrings of the target triple; but REQUIRES does not, which can trip
people up or lead to hacking config files to insert substitute feature
names. Consistency across all three lit clauses seems preferable.
Details
Diff Detail
Event Timeline
I'll likely follow up with removing some unused or now-redundant feature keywords.
This does seem like the same direction as D104572 although I haven't actually tested pattern-matching on triples in REQUIRES; hard to see how it would differ from the matching for UNSUPPORTED and XFAIL.
Currently the UNSUPPORTED and XFAIL clauses support specifying
substrings of the target triple; but REQUIRES does not
My belief was that this was intentional but I never questioned that belief: disable tests broadly, but mark them running narrowly. Consistency is definitely better.
@probinson I'm not a huge fan of this change. If I've understood this change correctly then this could have unintended consequences in existing test suites. If an existing test suite uses "features" that are substrings of the target triple then this behaviour change means they'll be a match now where there wasn't before. That doesn't seem desirable at all. I also think this "matching a substring of the target triple" should be removed completely. I believe @ldionne recently added regex support in REQUIRES conditions and that seems like a better solution here.
What we've run into, repeatedly, is that people will naively do something like REQUIRES: linux expecting it to match against the triple; except it doesn't, because REQUIRES doesn't match triples, and you end up with a test that inadvertently doesn't run at all (as in the one I fixed on Friday).
It's true that you can substitute UNSUPPORTED: !linux to get the desired effect, but this is less than obvious, besides needing a double negative.
Eliminating triple substring matching (in favor of explicit regexes) would affect a lot more existing tests than the hypothetical feature-that-looks-like-a-substring case, IMO. I don't claim that my check-all run found everything, but summary of test counts at the end differed by only that one test mentioned previously.
@probinson: Can you confirm my understanding, as follows?
- Without this patch, REQUIRES doesn't match the target triple at all (unless it happens to have been added as a feature to a particular test suite). This patch enables matching the entire target triple or a substring of it.
- In any hypothetical test suite, this patch can only expand the number of tests that run because it enables REQUIRES to be more easily satisfied (by the current target triple). It is impossible for this patch to reduce the number of tests that run.
- In practice so far, only one test is affected.
A few comments, assuming my above understanding is correct:
- Hiding test failures accidentally is bad. Exposing test failures accidentally is not so bad.
- From that perspective, it seems exactly backward that UNSUPPORTED and XFAIL currently match more broadly than REQUIRES. Why was that decision made?
- The existence of a test case and documentation pointing out the existing behavior means this behavior was intentional. That makes me concerned I've misunderstood what's happening.
- I do agree that eventually it would be nice to get rid of substring matching in favor of regexes for XFAIL and UNSUPPORTED, where matching more broadly than intended is dangerous. I'm not sure I buy that argument for REQUIRES.
- If this patch will be landed, it should update the docs: https://llvm.org/docs/TestingGuide.html#constraining-test-execution
llvm/utils/lit/tests/Inputs/xunit-output/lit.cfg | ||
---|---|---|
10 | Why does this need to change? |
Correct.
I'll note that there are a bunch of triple-based (and sometimes triple-like) feature keywords added by lit, based on the target triple (see the block starting if target_triple: at circa line 106 in llvm/utils/lit/lit/llvm/config.py), which of course could be replaced by regexes in UNSUPPORTED/XFAIL already, and also in REQUIRES if this patch goes in.
- In any hypothetical test suite, this patch can only expand the number of tests that run because it enables REQUIRES to be more easily satisfied (by the current target triple). It is impossible for this patch to reduce the number of tests that run.
Correct.
- In practice so far, only one test is affected.
Out of the projects I generally build: llvm;clang;clang-tools-extra;lld;lldb. I can try doing "all" projects and see what happens, although my track record on getting clean builds and tests is imperfect.
A few comments, assuming my above understanding is correct:
- Hiding test failures accidentally is bad. Exposing test failures accidentally is not so bad.
- From that perspective, it seems exactly backward that UNSUPPORTED and XFAIL currently match more broadly than REQUIRES. Why was that decision made?
"It was like that when I found it." Others who were more involved in lit in the past might be better able to explain. Roaming through the history, UNSUPPORTED was added by Eric Fiselier in 2014; it did not support triple-checking at first either, that was added by Peter Collingbourne about a year later. I didn't try to research the XFAIL history.
- The existence of a test case and documentation pointing out the existing behavior means this behavior was intentional. That makes me concerned I've misunderstood what's happening.
I accept that REQUIRES intentionally has not supported target-triple matching; I still think we're better off with it in place, unless someone can come up with a really nice statement of why it's beneficial enough that we should live with the inconsistency and the need for miscellaneous ad-hoc triple substitutes as mentioned above.
- I do agree that eventually it would be nice to get rid of substring matching in favor of regexes for XFAIL and UNSUPPORTED, where matching more broadly than intended is dangerous. I'm not sure I buy that argument for REQUIRES.
- If this patch will be landed, it should update the docs: https://llvm.org/docs/TestingGuide.html#constraining-test-execution
Ah, thanks for pointing to the doc; I will update it. I see that REQUIRES was not documented at all until 2016 but that's no excuse for not keeping it up-to-date now.
llvm/utils/lit/tests/Inputs/xunit-output/lit.cfg | ||
---|---|---|
10 | Because the test execution goes through the REQUIRES path, and the target_triple check gets a Python error if it's None (says something like "not iterable"). |
Thanks for confirming.
I'll note that there are a bunch of triple-based (and sometimes triple-like) feature keywords added by lit, based on the target triple (see the block starting if target_triple: at circa line 106 in llvm/utils/lit/lit/llvm/config.py),
Ah, thanks.
which of course could be replaced by regexes in UNSUPPORTED/XFAIL already, and also in REQUIRES if this patch goes in.
Maybe I missed a patch since D104572, but it did not add support for regexes matching the target triple. It added support for regexes matching features, as described in the documentation I pointed out. Now that I think about it more, I'm not sure I know what @ldionne ultimately wanted to see happen with target triples. It might be good to hear him comment before moving forward.
I can try doing "all" projects and see what happens, although my track record on getting clean builds and tests is imperfect.
Same here. Sometimes we just have to let the bots find things, unfortunately.
- Hiding test failures accidentally is bad. Exposing test failures accidentally is not so bad.
- From that perspective, it seems exactly backward that UNSUPPORTED and XFAIL currently match more broadly than REQUIRES. Why was that decision made?
"It was like that when I found it."
:-)
In any hypothetical test suite, this patch can only expand the number of tests that run because it enables REQUIRES to be more easily satisfied (by the current target triple). It is impossible for this patch to reduce the number of tests that run.
That is a good point and sometime I missed the first time I looked over this patch. Overly broad matching would mostly be a problem for the UNSUPPORTED declaration which we are already doing. So I withdraw my objections to this patch.
I also took a quick look at the regex support in lit. Looks like the current implementation doesn't actually support regexes for target triples based on the tests we have. So it can't be used as alternative right now.
# When matching against the triple, a regex is treated as an identifier and checked # for a literal match. This preserves existing behavior before regexes were introduced. self.assertFalse(BooleanExpression.evaluate('arch-{{vendor}}-os', {}, triple)) self.assertTrue(BooleanExpression.evaluate('arch-{{vendor}}-os', {}, 'arch-{{vendor}}-os'))
I'd still like to see support for substring matching on target triples completely removed but I don't think we should do that until we have a viable replacement for it.
llvm/utils/lit/tests/Inputs/xunit-output/lit.cfg | ||
---|---|---|
10 | To avoid breaking any existing test suite that sets config.target_triple to None we could do this. def getMissingRequiredFeaturesFromList(self, features): triple = getattr(self.suite.config, 'target_triple', "") if triple is None: triple = '' try: However getUnsupportedFeatures() doesn't do this currently so that would be a bit inconsistent. |
Reviewing comments from @yln and @delcypher, it seems we're all mostly in agreement now, and I haven't found a good reason to object.
LGTM with a few remaining caveats:
- It needs the documentation we discussed.
- I'd like to give @ldionne a few more days to comment given that he's already been thinking about the handling of target triples.
- We might have to revert the patch if it undesirably enables tests in test suites that haven't been tried yet.
Thanks for working on this, @probinson.
In order to find other affected tests, I enabled as many projects as possible; this turns out to be all of them except libc and cross-project-tests. (libc won't build with gcc 7.5, apparently, and cross-project-tests appears to have an error that keeps it from building at all.)
The test summary counts show there are two Unsupported tests that change to Pass, and one Fail that changes to Pass (which really doesn't make sense, maybe it's just a flaky test). I'm rerunning with and without the patch, collecting the complete lists of Unsupported tests in order to identify which ones have started passing. Most likely they are in the same category as the one I fixed previously, i.e. some inadvertent mistake made when writing the REQUIRES clause, except they don't happen to fail when actually run.
I'll get the docs fixed up so they can be reviewed before I push this, and also report back on which tests were affected and why.
Thanks all, especially for the correction about which cases allow regexes.
llvm/utils/lit/tests/Inputs/xunit-output/lit.cfg | ||
---|---|---|
10 | In order for an existing test suite to have config.target_triple = None and not get this error today, all tests controlled by that lit.cfg file would have to avoid using UNSUPPORTED and XFAIL. That seems unlikely. |
The two tests that went from Unsupported to Pass are
LLVM :: tools/llvm-readobj/COFF/call-graph-profile-err.s LLVM :: tools/llvm-readobj/COFF/call-graph-profile.s
both of which have REQUIRES: x86 which matches the default triple on my system (x86_64-unknown-linux) but isn't the right way to constrain the test in a pre-patch world; that would be target-x86 or target-x86_64 or both.
Reviewers have expressed some need for caution in applying this patch, and there might well be tests out there that mistakenly have things like REQUIRES: arm or REQUIRES: powerpc and don't happily pass once they are suddenly enabled for the first time. So, I guess that means I won't be pushing this upstream at quitting time on Friday after all. :-)
FTR, I found an old note to myself that D83759 fixed a test that originally expected REQUIRES to support the triple. (The patch did other things too, but that was one of them.) In case anyone wanted more evidence that people somewhat naturally expect all three clauses to behave the same way.
llvm/docs/TestingGuide.rst | ||
---|---|---|
472–473 | I'm not sure this comment about precedence is meaningful. It seems to say that, if you have an UNSUPPORTED directive, any REQUIRES directive is ignored. Actually, in the implementation, REQUIRES is checked first. I think there's no sense of precedence. The test runs if REQUIRES is matched and UNSUPPORTED isn't matched. I think they could be evaluated in either order. |
llvm/docs/TestingGuide.rst | ||
---|---|---|
472–473 | They can be evaluated in either order; but if they are both true, which one wins? The one that has precedence. |
llvm/docs/TestingGuide.rst | ||
---|---|---|
472–473 | You could just as easily ask: If they are both false, which one wins? |
llvm/docs/TestingGuide.rst | ||
---|---|---|
472–473 | Now that I think about it, REQUIRES isn't required :) so by default tests are enabled. That means REQUIRES doesn't actually enable a test, So really we should say: REQUIRES disables the test if any expression is false. UNSUPPORTED disables the test if any expression is true. Then there's no debate about precedence; both clauses act to disable the test. Which is how it actually works. |
llvm/docs/TestingGuide.rst | ||
---|---|---|
472–473 | Yes, I like that better too. And now I get what led to the suggestion of precedence. :-) |
Looks like this breaks check-lld on macOS: http://45.33.8.238/macm1/15677/step_10.txt
Please take a look.
I don't have a mac to try it on, but based on the FileCheck output it looks like there's one extra CHECK line, so I tried removing it in f88ad8d00f970ea937e8d3fcb71f4d4019f2821a
Let me know whether that works.
Looks like adding the not helped. That's a bit unexpected since lld isn't supposed to exit non-0 there. Do you know how the test worked before? Did it just never run?
@int3 too.
Sorry this went under my Radar, I think I was on vacation at the time when this discussion happened and it went straight to the bottom of my Differential email pile.
Anyway, what we do in libc++/libc++abi/libunwind is that we explicitly add a feature that has the name of the target triple, so that we're able to regex-match on it. I've never liked that the target triple is treated specially by Lit (IMO there shouldn't even be a notion of a target triple in Lit, since Lit is more general than just compiler testing). I think it used to be necessary because the triple was handled specially, but it doesn't need to anymore since one can regex-match on arbitrary features.
So one migration path could be:
- Manually add a feature consisting of the target triple in all existing test suites (or have Lit add the triple as a feature automatically)
- Migrate tests that use substring matching of triples to using regex-matching
- Remove the special handling of target triples in Lit (since nobody should be using it)
To avoid confusion between the target triple as defined by Lit and the target triple as a feature, we've named the feature target=<xxxxxx> in libc++ and it works quite well. So for example, I moved a bunch of
XFAIL: apple-macosx10.9 XFAIL: apple-macosx10.10 XFAIL: apple-macosx10.11
to instead become:
XFAIL: target={{.+}}-apple-macosx10.{{9|10|11}}
Having a feature named target=<xxx> is a bit unusual, but it has been working really well and I feel like it clearly expresses the nature of the Lit feature.
As this patch has had to be reverted due to a completely baffling failure on one bot, we can reopen the debate about how to solve the original problem, which is: XFAIL and UNSUPPORTED allow triples (with substring handling), but REQUIRES doesn't know about triples at all.
I think adding things into lit itself is a means to avoid redundantly adding the same feature to many subprojects, which carries the attendant risk of adding *almost* the same feature to many subprojects. I understand that lit can be adapted to other kinds of testing, but the name after all is an acronym for "LLVM Integrated Tester" so having llvm-project-specific stuff in "lit itself" doesn't seem beyond the pale to me.
I'm in the middle of something else but getting back to this is very near the top of my list. I'd want to scope the existing use of triples in XFAIL and UNSUPPORTED clauses before I'd sign up to convert all of them, rather than do what this patch tried to do (make REQUIRES behave exactly the same as XFAIL/UNSUPPORTED already do).
Also, if we're leaning toward doing a bulk syntactic change like you propose, that should be brought up on the dev lists first to get broader exposure and buy-in.
Lit could add the feature itself. The main point is that instead of treating the triple specially during boolean evaluation, we'd add a feature consisting of the triple and then handle all features the same way.
Also, if we're leaning toward doing a bulk syntactic change like you propose, that should be brought up on the dev lists first to get broader exposure and buy-in.
I was just talking about libc++'s experience -- we don't really have to rename everything if people think it causes too much churn. But using target=<xxx> instead of just <xxx> has the benefit that
- it's clear that we're talking about a property of the target, not a property of the host, and
- one can't confuse the triple-as-specially-handled-by-lit and the triple-as-a-normal-lit-feature during the transitional period where both exist
FTR, posted an RFC here: https://discourse.llvm.org/t/rfc-lits-requires-and-triples/66041
Please let's have additional discussion there. I'm hoping to more-or-less resurrect this but a wider discussion felt appropriate.
@yln I would have tagged you there but I'm not finding a corresponding Discourse handle.
I'm not sure this comment about precedence is meaningful. It seems to say that, if you have an UNSUPPORTED directive, any REQUIRES directive is ignored. Actually, in the implementation, REQUIRES is checked first.
I think there's no sense of precedence. The test runs if REQUIRES is matched and UNSUPPORTED isn't matched. I think they could be evaluated in either order.