Page MenuHomePhabricator

[UpdateTestChecks] Match unnamed values like "@[0-9]+" and "![0-9]+"
ClosedPublic

Authored by jdoerfert on Aug 2 2020, 1:21 PM.

Details

Summary

With this patch we will match most *uses* of "temporary" named things in
the IR via regular expressions, not their name at creation time. The new
"values" we match are:

  • "unnamed" globals: @[0-9]+
  • debug metadata: !dbg ![0-9]+
  • loop metadata: !loop ![0-9]+
  • tbaa metadata: !tbaa ![0-9]+
  • range metadata: !range ![0-9]+
  • generic metadata: metadata ![0-9]+
  • attributes groups: #[0-9]

We still don't match the declarations but that can be done later. This
patch can introduce churn when existing check lines contain the old
hardcoded versions of the above "values". We can add a flag to opt-out,
or opt-in, if necessary.

Diff Detail

Event Timeline

jdoerfert created this revision.Aug 2 2020, 1:21 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 2 2020, 1:21 PM
Herald added subscribers: bollu, kosarev. · View Herald Transcript
jdoerfert requested review of this revision.Aug 2 2020, 1:21 PM
jdoerfert edited the summary of this revision. (Show Details)Aug 2 2020, 1:24 PM

Will update the clang tests shortly.

llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/various_ir_values.ll
118

Maybe I add matchers for these next.

lgtm, apart from clang tests. Someone else should take a look as well, though.

llvm/utils/UpdateTestChecks/common.py
302

typo: unnamed.

304

typo: group

MaskRay added inline comments.Aug 3 2020, 9:38 PM
llvm/utils/UpdateTestChecks/common.py
319

Is this a capturing group? If not, use `(?:\s+)

The 3 in the range(3, match.lastindex): below looks strange. I hope we can get rid if it.

327

Use non-capturing group

420

Do you have an example that not substituting repeatedly can fail?

jdoerfert added inline comments.Aug 4 2020, 5:18 AM
llvm/utils/UpdateTestChecks/common.py
319

It is, we use in 386 for example. I did not change that part.

The 3 is a bit magic, I agree. It basically is to skip this group, the outer group of all "IR values", which I could avoid and the entire match. We have to have some hardcoded values there though, like before. Though I can make it a variable to explain better.

327

We use this in 386 and other places.

420

Line 8 in llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/various_ir_values.ll.funcsig.expected :)

MaskRay added inline comments.Aug 4 2020, 2:37 PM
llvm/utils/UpdateTestChecks/common.py
307

[[ATTRIBUTES0:#.*]] is very long. I wonder whether we should abbreviate it to ATTR0 or the like.

307

Instead of Structure of arrays, would it be better organizing these constants as Array of structures?

For example, you can write ['TMP', '%', r'[\w.-]+?'], ['GLOBAL', '@', ...], ...

307

Tuples should be better: [('TMP', '%', r'[\w.-]+?'), ('GLOBAL', '@', ...), ...]

jdoerfert added inline comments.Aug 4 2020, 2:48 PM
llvm/utils/UpdateTestChecks/common.py
307

I thought about it afterwards as well. Will make it a class so it's easier to see what is what.

arichardson added inline comments.Aug 6 2020, 2:52 AM
llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/check_attrs.ll.funcattrs.expected
20

Since we are capturing the attribute number instead of using a #{{[0-9]+}} regex, should we be using a non-capturing [[ATTRIBUTES0]] after the first time?

I guess this would require something like global_vars_seen in addition to vars_seen

jdoerfert added inline comments.Aug 6 2020, 8:42 AM
llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/check_attrs.ll.funcattrs.expected
20

Argh, good catch. Will fix this for sure.

jdoerfert updated this revision to Diff 284492.Aug 10 2020, 1:30 PM
jdoerfert marked 9 inline comments as done.

global_seen_vars, better grouping of inormation, addressed comments

llvm/test/tools/UpdateTestChecks/update_test_checks/Inputs/check_attrs.ll.funcattrs.expected
20

I think here it actually made sense to capture the name again, it is a different prefix after all. However, I did introduce the global_vars_seen now changed the test to include two functions. All but %{{.*}} values are considered global and we will reuse the existing mapping from the same prefix. You can see tbaa is reused in the test.

jdoerfert updated this revision to Diff 284537.Aug 10 2020, 5:49 PM

Update clang tests

Herald added a project: Restricted Project. · View Herald TranscriptAug 10 2020, 5:49 PM
Herald added a subscriber: cfe-commits. · View Herald Transcript
jdoerfert edited the summary of this revision. (Show Details)Aug 10 2020, 5:52 PM

Mostly looks good.

llvm/utils/UpdateTestChecks/common.py
309
nameless_values = [
    NamelessValue(r'TMP',  r'%',            r'[\w.-]+?'),
    NamelessValue(r'TMP',  r'%',            r'[\w.-]+?'),
    ...
]
330

Can the suffix use (?:...)?

420

This is indeed strange. If we don't have a good explanation, probably worth adding count=1

jdoerfert updated this revision to Diff 284543.Aug 10 2020, 6:27 PM
jdoerfert marked an inline comment as done.

added @MaskRays's tweaks

(I'm going to sneak in !range ! as well)

llvm/utils/UpdateTestChecks/common.py
330

We use the captured value later, and I guess people want to keep the exact same "spaces". TBH, it won't make much difference to the logic anywhere (we'd still use match.lastindex). All it would do is it would make us "unify" the output, unsure if that is what we want.

MaskRay accepted this revision.Aug 10 2020, 9:37 PM

LGTM. I don't have further comments. Worth waiting for other opinions.

This revision is now accepted and ready to land.Aug 10 2020, 9:37 PM
thakis added inline comments.
llvm/utils/UpdateTestChecks/common.py
390

This is Py3.4+. Per https://llvm.org/docs/GettingStarted.html#id7 we still support 2.7, which is why all my bots run that, which is why this breaks my bots :) (...and everyone still using Python 2.7 locally). Can you do this some other way? Maybe just surround the re with "^..$'?

jdoerfert added inline comments.Aug 12 2020, 3:52 PM
llvm/utils/UpdateTestChecks/common.py
390

Addressed and restored actual functionality in 07448c550457d2afb1e7d69254a58ad44dece3d2