This is an archive of the discontinued LLVM Phabricator instance.

[UTC] Add --version argument
ClosedPublic

Authored by nikic on Jan 24 2023, 7:18 AM.

Details

Summary

We have a number of pending changes to update_test_checks.py (and friends) that are essentially blocked on test churn: If the output of UTC for an existing flag combination changes, then the next time a test is regenerated, it will contain many spurious changes. This makes changes to UTC default behavior essentially impossible.

Examples of such changes are:

  • D133943/D142373 want --function-signature to also check the return type/attributes.
  • D139006/D140212 want to make --function-signature the default behavior.
  • D142452 wants to add wildcards for block labels.

This patch tries to resolve this issue by adding a --version argument, which works as follows:

  • When regenerating an old test, the default version is 1.
  • When generating a new test, the default version is the newest.
  • When an explicit version is specified, that of course wins.

This means that any currently existing tests will keep using --version 1 format, while any new tests will automatically embed --version 2 (or whatever the latest is), and then keep using that test format from then on.

Diff Detail

Event Timeline

nikic created this revision.Jan 24 2023, 7:18 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 24 2023, 7:18 AM
Herald added a subscriber: StephenFan. · View Herald Transcript
nikic requested review of this revision.Jan 24 2023, 7:18 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 24 2023, 7:18 AM
nikic updated this revision to Diff 491785.Jan 24 2023, 7:21 AM

Add missing test file.

asb added a comment.Jan 24 2023, 7:49 AM

My initial reaction is that this is a nice solution to the problem - thank you! The main practical challenge I see is the the one you suggest in the summary - that it might not be easy to decide when it's "worth" introducing a new version, and there might be pressure in practice to group together multiple changes at once.

I'm interested to hear what others think.

A couple of thoughts, not really having a fully-formed opinion on this yet:

  • This is a nice trick to avoid the problem of test churn.
  • I worry about the complexity of maintaining the different versions.
  • In the past, I have on occasion made commits that only re-run UTC on existing tests, without making any other changes, and without posting them for review.
    • I added update_any_test_checks.py in part to make that sort of change easier (though the primary motivation was to just update tests after code changes)
    • I wonder if / to what extent normalizing and perhaps automating this process is an alternative solution.

I would not be so worried about the complexity of having to maintain multiple version formats -- we can think of the version flag as setting defaults for command-line options that alter the test format.
We already have such options (e.g. mentioned --function-signature), so this is not new in principle, it just provides a convenient means to set these.

On the implementation side, I'm wondering whether it would be better to avoid directly checking version in the implementation, and instead derive some options (similar to args, or even args itself?) from the version and then only use that?
For example, if some old version introduced several changes (to avoid "version churn"), it is later probably easier to later remove support for one of them is there is a dedicated internal flag for each change.

asb added a comment.Jan 24 2023, 8:44 AM
  • In the past, I have on occasion made commits that only re-run UTC on existing tests, without making any other changes, and without posting them for review.
    • I added update_any_test_checks.py in part to make that sort of change easier (though the primary motivation was to just update tests after code changes)
    • I wonder if / to what extent normalizing and perhaps automating this process is an alternative solution.

This is a good point, and worrying less about churn to existing tests is I think not a bad option. It's natural to want to compare such changes to other automated changes like code reformatting, but I think in practice it's very different - as the only changes are to tool-generated lines, it doesn't really cause the same issues with merge conflicts.

nikic added a comment.Jan 24 2023, 9:27 AM

My initial reaction is that this is a nice solution to the problem - thank you! The main practical challenge I see is the the one you suggest in the summary - that it might not be easy to decide when it's "worth" introducing a new version, and there might be pressure in practice to group together multiple changes at once.

I think from a maintenance perspective, it actually makes little difference whether these are in one version or in three. I believe the implementation of each change is essentially orthogonal, and would not become substantially simpler (or simpler at all?) when combined in one version. So this would mostly be a cosmetic question for me.

Maybe worth mentioning that I don't think every change to UTC output needs a version bump: Only those that could cause significant churn. For example, I don't think we should bump the version every time we add support for recognizing some new metadata, or other changes that don't affect many tests.

  • In the past, I have on occasion made commits that only re-run UTC on existing tests, without making any other changes, and without posting them for review.

Yes, this isn't an unusual thing to do. Generally if you either hit a test that hasn't been converted yet, or there have been some substantial changes to instruction names. It's mostly a question of scale. At this point, with most tests (that I care about) using UTC, this is something that one has to do only occasionally. With changes like "enable --function-signature by default" you have to do that for literally every single test you touch.

Marginally increased maintenance costs in UTC are something that affects only few people and few changes, while test churn affects everyone. The "regenerate test checks, see that some need a regenerated baseline, checkout main, rebuild, regenerate baseline, push the change, rebase, find out that somebody 'improved' an ADT header in the meantime, wait 15 minutes for the build to finish, go back to your actual work" cycle adds non-trivial overhead to development, and test management is already a big time-sink as it is.

I like this approach, it's a lot less hacky than my attempt to default function signatures to on. I'll rebase that to check for version>1 instead.

I'm not sure if this should include the change to match full lines though, I think that should be a separate commit (but the implementation looks fine to me).

nikic updated this revision to Diff 492723.Jan 27 2023, 5:57 AM
nikic edited the summary of this revision. (Show Details)

Only add the --version flag, don't implement any behavior change in this revision. As such, the default version stays at 1 and this is a no-op for now.

Future changes should bump the DEFAULT_VERSION constant and add a changelog entry.

jsilvanus added inline comments.Jan 27 2023, 6:05 AM
llvm/utils/UpdateTestChecks/common.py
24

Maybe mention that this is equivalent to no version specified.

148

default=DEFAULT_VERSION?

nikic updated this revision to Diff 492730.Jan 27 2023, 6:18 AM

Additional comments

nikic marked an inline comment as done.Jan 27 2023, 6:20 AM
nikic added inline comments.
llvm/utils/UpdateTestChecks/common.py
148

This one should always stay at 1, added a comment to explain.

jsilvanus added inline comments.Jan 27 2023, 6:21 AM
llvm/utils/UpdateTestChecks/common.py
148

Makes sense, thanks.

arichardson accepted this revision.Jan 27 2023, 12:50 PM
This revision is now accepted and ready to land.Jan 27 2023, 12:50 PM
This revision was landed with ongoing or failed builds.Jan 30 2023, 1:00 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJan 30 2023, 1:00 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
bogner added a subscriber: bogner.Feb 6 2023, 3:35 PM

Multiple people had concerns with the added complexity of having to maintain historical versions of behaviour here (a concern which I share), and absolutely none of them accepted the revision or commented after the comment that you don’t think it’ll be too bad in practice (with little explanation as to why).

I really don’t think it was appropriate to commit this without further discussion here.

nikic added a comment.Feb 14 2023, 2:34 AM

Sorry, I did not interpret those comments as blocking concerns. Happy to revert this if they were intended as such -- this is effectively an NFC patch until DEFAULT_VERSION gets bumped, so we can still backtrack to a different approach (though I'm not aware of any that would have less maintenance burden).