This is an archive of the discontinued LLVM Phabricator instance.

Differential D81667

[FileCheck] Add precision to format specifier
ClosedPublic

Authored by thopre on Jun 11 2020, 8:42 AM.

Details

Reviewers

jhenderson
jdenny
probinson
grimar
arichardson

Commits

rG998709b7d553: [FileCheck] Add precision to format specifier

Summary

Add a printf-style precision specifier to pad numbers to a given number of
digits when matching them, if the value has fewer digits than the given
precision. This works both with an empty numeric expression (e.g. a variable
definition from input) and when matching a numeric expression. The
syntax is as follows:

[[#%.<precision><format specifier>, ...]]

where <format specifier> is optional and ... may or may not contain a
variable definition and may or may not contain an expression. In the absence
of a precision specifier, a variable definition will accept leading zeros.
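
As a point of reference for the printf analogy, the padding behaviour of a precision can be sketched in standalone C++. This is only an illustration using the C standard library, not FileCheck code; `formatHex` and `checkFormatHex` are hypothetical names introduced here.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Hypothetical helper: format Value as lowercase hex with a printf-style
// precision, i.e. a minimum number of digits padded with leading zeros.
std::string formatHex(unsigned long long Value, int Precision) {
  char Buf[32];
  std::snprintf(Buf, sizeof(Buf), "%.*llx", Precision, Value);
  return Buf;
}

// Precision is a minimum, not a maximum: short values are zero-padded,
// long values are printed in full.
void checkFormatHex() {
  assert(formatHex(0x1234, 8) == "00001234");
  assert(formatHex(0x123456789ULL, 8) == "123456789");
}
```

The second assertion is the case that matters for matching: a precision never truncates, so a value with more digits than the precision is still printed in full.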

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

thopre created this revision. Jun 11 2020, 8:42 AM

Herald added a subscriber: hiraditya. Jun 11 2020, 8:42 AM

Harbormaster failed remote builds in B59987: Diff 270159! Jun 11 2020, 10:28 AM

I think I agree with your conclusions. More generally, I think we should be permissive, where permissiveness is not going to be surprising (i.e. no explicit format specifier seems reasonable in the general context), and should follow scanf style format specifiers where reasonable. If I follow it right, it would therefore be possible to specify a 16-digit hex field with %.16x, right? Could you clarify what the motivation of the "with empty expression" bit is for? Is that just because when there is an empty expression, your regex is incorrect, or something else?

The code change in general seems simple enough to support the proposal too, though I haven't reviewed it in detail. I'll wait until you've added documentation/tests etc, so that I can review it all at once.

In D81667#2092260, @jhenderson wrote:

I think I agree with your conclusions. More generally, I think we should be permissive, where permissiveness is not going to be surprising (i.e. no explicit format specifier seems reasonable in the general context), and should follow scanf style format specifiers where reasonable. If I follow it right, it would therefore be possible to specify a 16-digit hex field with %.16x, right? Could you clarify what the motivation of the "with empty expression" bit is for? Is that just because when there is an empty expression, your regex is incorrect, or something else?

When only using a numeric expression, numeric substitution blocks behave like printf: they print a value as text to be matched against the input. When defining a variable with an empty expression (the majority of definition cases), they behave more like scanf. Only printf supports a precision in its syntax; scanf doesn't. This is the main reason why I ask the question. It is also the case that we currently allow leading zeros when matching an unknown numeric value for a numeric variable definition with empty expression.

I think the case of a variable defined from an expression is a special case since you are matching something specific, so in itself it doesn't mandate extending the precision to variable definitions with an empty expression. However, I think allowing a precision when matching an unknown variable is both useful and makes for syntax consistency.

To answer your earlier question, yes, it'll be possible to match a 16-digit hex with #%.16x,VAR1: or an 8-digit hex with #%.8x, VAR2:. However #%.8, VAR1 will print all 16 digits of VAR1, same as printf. Does that seem reasonable or should we deviate from printf and give an error in such a case?

The code change in general seems simple enough to support the proposal too, though I haven't reviewed it in detail. I'll wait until you've added documentation/tests etc, so that I can review it all at once.

jhenderson mentioned this in D81144: [MC] Generate .debug_line in the 64-bit DWARF format [2/7]. Jun 16 2020, 12:15 AM

In D81667#2092827, @thopre wrote:

In D81667#2092260, @jhenderson wrote:

I think I agree with your conclusions. More generally, I think we should be permissive, where permissiveness is not going to be surprising (i.e. no explicit format specifier seems reasonable in the general context), and should follow scanf style format specifiers where reasonable. If I follow it right, it would therefore be possible to specify a 16-digit hex field with %.16x, right? Could you clarify what the motivation of the "with empty expression" bit is for? Is that just because when there is an empty expression, your regex is incorrect, or something else?

When only using a numeric expression, numeric substitution blocks behave like printf: they print a value as text to be matched against the input. When defining a variable with an empty expression (the majority of definition cases), they behave more like scanf. Only printf supports a precision in its syntax; scanf doesn't. This is the main reason why I ask the question. It is also the case that we currently allow leading zeros when matching an unknown numeric value for a numeric variable definition with empty expression.

I think the case of a variable defined from an expression is a special case since you are matching something specific, so in itself it doesn't mandate extending the precision to variable definitions with an empty expression. However, I think allowing a precision when matching an unknown variable is both useful and makes for syntax consistency.

To answer your earlier question, yes, it'll be possible to match a 16-digit hex with #%.16x,VAR1: or an 8-digit hex with #%.8x, VAR2:. However #%.8, VAR1 will print all 16 digits of VAR1, same as printf. Does that seem reasonable or should we deviate from printf and give an error in such a case?

I think that seems reasonable to me overall. Thanks for explaining.

ikudrin added a subscriber: ikudrin. Jun 16 2020, 4:33 AM

MaskRay added a subscriber: MaskRay. Jun 16 2020, 9:53 PM

MaskRay added inline comments.

llvm/lib/Support/FileCheckImpl.h
56	Prefer default member initializer (`unsigned Precision = 0;`)

Should the regex wildcard for a numeric variable definition with empty expression also respect the precision, i.e. #%.5u, VAR2: would be matched by (([1-9][0-9]+)? [0-9]{1,5})

I believe I followed the comments about matching behavior for an empty expression (scanf-like) vs. an expression (printf-like). So the above question is about whether, in the empty-expression case, it's worthwhile to support a precision specified by . even though scanf does not support that. Right?

I don't understand the above regex due to the space character after the ?. Was that intended?

Can you give some example inputs and explain the intended matching behavior for #%.5u, VAR2:? Why is this behavior needed in FileCheck but not in scanf?

thopre edited the summary of this revision. Jun 17 2020, 9:53 AM

In D81667#2098443, @jdenny wrote:

Should the regex wildcard for a numeric variable definition with empty expression also respect the precision, i.e. #%.5u, VAR2: would be matched by (([1-9][0-9]+)? [0-9]{1,5})

I believe I followed the comments about matching behavior for an empty expression (scanf-like) vs. an expression (printf-like). So the above question is about whether, in the empty-expression case, it's worthwhile to support a precision specified by . even though scanf does not support that. Right?

Correct.

I don't understand the above regex due to the space character after the ?. Was that intended?

No, fixed now.

Can you give some example inputs and explain the intended matching behavior for #%.5u, VAR2:? Why is this behavior needed in FileCheck but not in scanf?

Say the directive is:

CHECK: Address #%.8x,ADDR: is aligned

and the input text is:

Address 12345678 is aligned

I'd expect the directive to match and the value in ADDR to be 0x12345678. Now if the input text was:

Address FFFFFFFF12345678

I'd expect the directive to fail. If the directive was #%x, ADDR: the first input would have led to the same outcome but the second input would have led the directive matching and the value in ADDR to be 0xFFFFFFFF12345678.

Besides whether this is a useful feature, it makes for easier parsing and consistency in the syntax (no difference between variables defined from an expression where the precision would be allowed and variables defined from an empty expression where precision would not be allowed).

In D81667#2098537, @thopre wrote:

In D81667#2098443, @jdenny wrote:

Can you give some example inputs and explain the intended matching behavior for #%.5u, VAR2:? Why is this behavior needed in FileCheck but not in scanf?

Besides whether this is a useful feature, it makes for easier parsing and consistency in the syntax (no difference between variables defined from an expression where the precision would be allowed and variables defined from an empty expression where precision would not be allowed).

I forgot to mention that scanf doesn't need this because it's separate from printf (weaker need for consistency) and I guess aims at parsing some value more than checking format.

I don't understand the above regex due to the space character after the ?. Was that intended?

No, fixed now.

It now says #%.5u, VAR2: matches (([1-9][0-9]+)?[0-9]{1,5}), but that matches 123456789. I think that's unintended.

Can you give some example inputs and explain the intended matching behavior for #%.5u, VAR2:? Why is this behavior needed in FileCheck but not in scanf?

Say the directive is:

CHECK: Address #%.8x,ADDR: is aligned

and the input text is:

Address 12345678 is aligned

I'd expect the directive to match and the value in ADDR to be 0x12345678. Now if the input text was:

Address FFFFFFFF12345678

I'd expect the directive to fail.

You mean fail to match and continue searching? Or fail immediately?

So, %.8x is a maximum? For printf, it's a minimum. scanf's %8x (no .) feels more like what you're going for except that it discards additional digits instead of failing to match.

In D81667#2098625, @jdenny wrote:

I don't understand the above regex due to the space character after the ?. Was that intended?

No, fixed now.

It now says #%.5u, VAR2: matches (([1-9][0-9]+)?[0-9]{1,5}), but that matches 123456789. I think that's unintended.

Can you give some example inputs and explain the intended matching behavior for #%.5u, VAR2:? Why is this behavior needed in FileCheck but not in scanf?

Say the directive is:

CHECK: Address #%.8x,ADDR: is aligned

and the input text is:

Address 12345678 is aligned

I'd expect the directive to match and the value in ADDR to be 0x12345678. Now if the input text was:

Address FFFFFFFF12345678

I'd expect the directive to fail.

You mean fail to match and continue searching? Or fail immediately?

So, %.8x is a maximum? For printf, it's a minimum. scanf's %8x (no .) feels more like what you're going for except that it discards additional digits instead of failing to match.

My bad, my example was completely wrong. My personal motivation is consistency in the syntax. New example:

I'd expect 0x[[#%.8x, ADDR:]] to match 00001234 or FFFFFFFF12345678 but not 1234 due to there not being enough digits. I guess it could be useful to check alignment in a tool but as I said my main motivation is keeping a common format specifier syntax for all numeric substitution blocks. Note that my regex was indeed wrong anyway, it should be (([1-9][0-9]+)?[0-9]{5}).

thopre edited the summary of this revision. Jun 17 2020, 10:57 AM

I'd expect 0x[[#%.8x, ADDR:]] to match 00001234 or FFFFFFFF12345678 but not 1234 due to there not being enough digits.

OK, it would expect a value that could have been printed by printf with %.8x.

I guess it could be useful to check alignment in a tool but as I said my main motivation is keeping a common format specifier syntax for all numeric substitution blocks. Note that my regex was indeed wrong anyway, it should be (([1-9][0-9]+)?[0-9]{5}).

I think you want + to be * to permit 123456.

What would happen on 012345? Would it match 01234 and leave 5 for a later directive, or would FileCheck fail immediately?
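
To make the difference between the `+` and `*` forms concrete, here is a small standalone sketch using `std::regex`. It assumes whole-string matching, which is a simplification: it says nothing about how FileCheck searches within a line, so it does not settle the 012345 question. `matchesWhole`, `PlusForm` and `StarForm` are names made up for this example.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Whole-string match of Text against Pattern (ECMAScript syntax).
bool matchesWhole(const std::string &Text, const std::string &Pattern) {
  return std::regex_match(Text, std::regex(Pattern));
}

// The two candidate wildcards for a %.5u capture discussed above.
const char *const PlusForm = "([1-9][0-9]+)?[0-9]{5}";
const char *const StarForm = "([1-9][0-9]*)?[0-9]{5}";

void checkWildcards() {
  assert(matchesWhole("00123", StarForm));   // padded to five digits
  assert(!matchesWhole("1234", StarForm));   // too few digits
  assert(matchesWhole("123456", StarForm));  // six digits, no leading zero
  assert(!matchesWhole("123456", PlusForm)); // rejected by the `+` form
  assert(!matchesWhole("012345", StarForm)); // zero beyond the precision
}
```

With the `+` form the optional group needs at least two digits, so a six-digit value cannot be split into group plus five digits; the `*` form lets the group be a single digit.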

In D81667#2098944, @jdenny wrote:

I'd expect 0x[[#%.8x, ADDR:]] to match 00001234 or FFFFFFFF12345678 but not 1234 due to there not being enough digits.

OK, it would expect a value that could have been printed by printf with %.8x.

FWIW, this is what I'm imagining the overall behaviour to be. If printf could have produced the output for a given format specifier, we should accept it, and conversely if it can't produce the output for a given format specifier, we shouldn't accept it.

I'm not sure whether we should consume all digits before applying the precision check or not though. I can see benefits for either side.
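
One way to read this rule is as a round trip: parse the matched digits, print the value again with the same precision, and accept only if the text is unchanged. The sketch below does that for an uppercase hex field. It is an illustration of the rule, not the approach the patch takes, and it assumes the value fits in 64 bits; `printfCouldProduce` is a hypothetical helper.

```cpp
#include <cassert>
#include <cstdio>
#include <cstdlib>
#include <string>

// Returns true if Text is exactly what printf("%.<Precision>llX", V) would
// print for the value V that Text spells in uppercase hexadecimal.
// Assumption: V fits in an unsigned long long.
bool printfCouldProduce(const std::string &Text, int Precision) {
  if (Text.empty() ||
      Text.find_first_not_of("0123456789ABCDEF") != std::string::npos)
    return false;
  unsigned long long Value = std::strtoull(Text.c_str(), nullptr, 16);
  char Buf[64];
  std::snprintf(Buf, sizeof(Buf), "%.*llX", Precision, Value);
  return Text == Buf;
}

void checkRoundTrip() {
  assert(printfCouldProduce("00001234", 8));         // padded to 8 digits
  assert(printfCouldProduce("FFFFFFFF12345678", 8)); // more digits is fine
  assert(!printfCouldProduce("1234", 8));            // too few digits
  assert(!printfCouldProduce("000001234", 8));       // one zero too many
}
```

Under this reading, leading zeros beyond those required by the precision are rejected, because printf would never emit them.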

In D81667#2100049, @jhenderson wrote:

In D81667#2098944, @jdenny wrote:

I'd expect 0x[[#%.8x, ADDR:]] to match 00001234 or FFFFFFFF12345678 but not 1234 due to there not being enough digits.

OK, it would expect a value that could have been printed by printf with %.8x.

FWIW, this is what I'm imagining the overall behaviour to be. If printf could have produced the output for a given format specifier, we should accept it, and conversely if it can't produce the output for a given format specifier, we shouldn't accept it.

I'm not sure whether we should consume all digits before applying the precision check or not though. I can see benefits for either side.

We currently accept numbers with leading zeroes but printf would not produce those without a precision. Should we start by fixing this then?

In D81667#2100079, @thopre wrote:

In D81667#2100049, @jhenderson wrote:

In D81667#2098944, @jdenny wrote:

I'd expect 0x[[#%.8x, ADDR:]] to match 00001234 or FFFFFFFF12345678 but not 1234 due to there not being enough digits.

OK, it would expect a value that could have been printed by printf with %.8x.

FWIW, this is what I'm imagining the overall behaviour to be. If printf could have produced the output for a given format specifier, we should accept it, and conversely if it can't produce the output for a given format specifier, we shouldn't accept it.

I'm not sure whether we should consume all digits before applying the precision check or not though. I can see benefits for either side.

We currently accept numbers with leading zeroes but printf would not produce those without a precision. Should we start by fixing this then?

I think we need leading zeros to be accepted until we have an alternative in place. Otherwise, there may be existing tests that rely on the current behaviour which we can't migrate. I think that means a rough order of: 1) Add precision support; 2) Migrate existing tests to use it where needed; 3) Stop accepting leading zeros except via precision. 2) and 3) can probably be done at the same time. We should only do them as part of 1) if it's harder to keep them separate, in my opinion.

I want to raise one point. Some people may expect the format specifier to be similar to scanf, instead of printf. scanf uses similar but less powerful format specifiers than printf. For instance, . is not valid in scanf. %.4u should fail (though glibc appears to do weird things; musl is good). In scanf, %4u reads at most 4 digits, not exactly 4 digits. The only way is %4c plus a conversion -> this is certainly not suitable in FileCheck. Anyway %.4u still looks good to me.

If no variable is captured, is the syntax [[#%.4u:]]?
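
The field-width behaviour of scanf described here can be checked with a few lines of standalone C++. This only illustrates `%4u` as a maximum width; `scanWidth4` is a hypothetical wrapper introduced for the example.

```cpp
#include <cassert>
#include <cstdio>

// Reads an unsigned decimal from Text with a scanf field width of 4.
// Returns the number of successful conversions (0 or 1).
int scanWidth4(const char *Text, unsigned &Out) {
  return std::sscanf(Text, "%4u", &Out);
}

void checkScanWidth() {
  unsigned V = 0;
  int Converted = scanWidth4("123456", V);
  assert(Converted == 1 && V == 1234); // stops after 4 digits
  Converted = scanWidth4("12", V);
  assert(Converted == 1 && V == 12);   // fewer digits is accepted too
}
```

Neither input is rejected, which is why a scanf-style width cannot express "at least N digits".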

In D81667#2102317, @MaskRay wrote:

I want to raise one point. Some people may expect the format specifier to be similar to scanf, instead of printf. scanf uses similar but less powerful format specifiers than printf. For instance, . is not valid in scanf. %.4u should fail (though glibc appears to do weird things; musl is good). In scanf, %4u reads at most 4 digits, not exactly 4 digits. The only way is %4c plus a conversion -> this is certainly not suitable in FileCheck. Anyway %.4u still looks good to me.

That's exactly the point of the second question in the description. Capturing a variable feels more like scanf but I think a unified syntax makes more sense. This is where we need to diverge from the printf/scanf analogy. Since the accepted format is defined explicitly in the documentation I don't think it's a big problem.

If no variable is captured, is the syntax [[#%.4u:]]?

It would be #%.4u or simply #%.4 since u is the default format specifier.

In D81667#2103464, @thopre wrote:

In D81667#2102317, @MaskRay wrote:

I want to raise one point. Some people may expect the format specifier to be similar to scanf, instead of printf. scanf uses similar but less powerful format specifiers than printf. For instance, . is not valid in scanf. %.4u should fail (though glibc appears to do weird things; musl is good). In scanf, %4u reads at most 4 digits, not exactly 4 digits. The only way is %4c plus a conversion -> this is certainly not suitable in FileCheck. Anyway %.4u still looks good to me.

That's exactly the point of the second question in the description. Capturing a variable feels more like scanf but I think a unified syntax makes more sense. This is where we need to diverge from the printf/scanf analogy. Since the accepted format is defined explicitly in the documentation I don't think it's a big problem.

If no variable is captured, is the syntax [[#%.4u:]]?

It would be #%.4u or simply #%.4 since u is the default format specifier.

Nice. [[#%.4u]] (non-capturing) and [[#%.4u,ADDR:]] (capturing) looks good to me. Might be worth noting that it is not a scanf-supported specifier.

Finish implementation based on consensus reached on questions raised by the proof of concept version.

Harbormaster completed remote builds in B67601: Diff 284150. Aug 8 2020, 4:19 PM

Add example of precision in documentation

Harbormaster completed remote builds in B67647: Diff 284233. Aug 9 2020, 2:19 PM

Functionality looks reasonable, although I haven't checked the testing yet.

llvm/docs/CommandGuide/FileCheck.rst
754	If we expand this out, the full syntax is apparently `[[#%.<precision><precision><conversion specifier>,<NUMVAR:]]`, which I don't think is what you mean :-)
758–759	Should we say something about leading zeros beyond those required by the precision value?
762	Nit: There's a double space after "to".
770
llvm/lib/Support/FileCheck.cpp
48	`StringRef`?
779	Can you fix the case of `fmtloc` whilst you're modifying this line, please?
llvm/lib/Support/FileCheckImpl.h
82

Address most comments

llvm/docs/CommandGuide/FileCheck.rst
758–759	Is that what you expected?
llvm/lib/Support/FileCheck.cpp
48	ostringstream below does not understand StringRef so I would need to do .str() which can be expensive. Any reason not to keep const char*?

grimar added inline comments. Aug 11 2020, 3:23 AM

llvm/lib/Support/FileCheck.cpp
69	Seems you should be able to do the following instead? return (RegexPrefix + Twine(Precision) + "}").str();
72	Perhaps it might be simpler just to merge the switches and write the logic here as:

  Expected<std::string> ExpressionFormat::getWildcardRegex() const {
    if (Value == Kind::NoFormat)
      return createStringError(std::errc::invalid_argument,
                               "trying to match value with invalid format");
    switch (Value) {
    case Kind::Unsigned:
      if (Precision)
        return ("-?([1-9][0-9]*)?[0-9]{" + Twine(Precision) + "}").str();
      return std::string("[0-9]+");
    case Kind::Signed:
      ...
    default:
      llvm_unreachable("....");
    }
  }
764	Use `trim`? FormatExpr.trim(SpaceChars)
llvm/unittests/Support/FileCheckTest.cpp
164	This will fail if `NumStr` is empty. Is it OK (I guess so), though perhaps a bit cleaner would be to use `StringRef::startswith`.
170	PaddedStr = "-";
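
For illustration, a sign-aware padding helper along the lines discussed for the unit test might look like the sketch below. It is written against `std::string` rather than `StringRef` so that it stands alone, and `padToPrecision` is a hypothetical name, not the helper in FileCheckTest.cpp.

```cpp
#include <cassert>
#include <string>

// Hypothetical test helper: left-pad the digits of NumStr with zeros so that
// it has at least Precision digits, keeping a leading '-' in front.
std::string padToPrecision(const std::string &NumStr, unsigned Precision) {
  // Checking the prefix this way is safe even when NumStr is empty.
  bool Negative = NumStr.rfind('-', 0) == 0;
  std::string Digits = NumStr.substr(Negative ? 1 : 0);
  std::string Padded = Negative ? "-" : "";
  if (Digits.size() < Precision)
    Padded.append(Precision - Digits.size(), '0');
  return Padded + Digits;
}

void checkPadToPrecision() {
  assert(padToPrecision("42", 5) == "00042");
  assert(padToPrecision("-42", 5) == "-00042"); // sign stays in front
  assert(padToPrecision("", 3) == "000");       // empty input does not crash
}
```

The prefix test plays the role that `StringRef::startswith` would play in the real test: it avoids indexing into a possibly empty string.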

Address more review comments

thopre added inline comments. Aug 11 2020, 3:40 AM

llvm/lib/Support/FileCheck.cpp
72	I'm not a big fan of repeating the formatting logic for the Precision case so I've kept that bit as is. What do you think of the result?

grimar added inline comments. Aug 11 2020, 3:56 AM

llvm/lib/Support/FileCheck.cpp
72	I see 2 possible improvements:

1) When you have a dedicated `RegexPrefix` variable, you postpone the return and have to add `break`s everywhere. If you just do not want to repeat the formatting logic, I'd suggest adding a little helper. E.g.:

  auto CreatePrecisionRegex = [](StringRef S) -> std::string {
    return (S + Twine(Precision) + "}").str();
  };
  switch (Value) {
  case Kind::Unsigned:
    if (Precision)
      return CreatePrecisionRegex("-?([1-9][0-9]*)?[0-9]{");
    return std::string("[0-9]+");
  default:
    llvm_unreachable("ddd");
  }

The main benefit is that you can return early and avoid having one more variable.

2) Perhaps it doesn't make much sense to use `createStringError` for the `default` case? It is unreachable now and can't be tested either (I believe). So I'd either remove the `if (Value == Kind::NoFormat)` block and handle the error in the `default`, like you initially did, or keep it and switch to using `llvm_unreachable` in `default`.

grimar added inline comments. Aug 11 2020, 3:58 AM

llvm/lib/Support/FileCheck.cpp
72	Oh, and for `1)` there is no need to use `-> std::string`: auto CreatePrecisionRegex = [](StringRef S) { return (S + Twine(Precision) + "}").str(); };
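
Put together, the helper-plus-early-return shape suggested here could look roughly like this standalone sketch. `FormatSketch` is a hypothetical stand-in that uses `std::string` instead of `StringRef`/`Twine`; the signed and hex patterns are assumptions extrapolated from the unsigned pattern in the thread, not copied from the patch. The lambda captures `this` because it reads the `Precision` member.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the format class discussed in the review.
struct FormatSketch {
  enum class Kind { Unsigned, Signed, HexUpper, HexLower };
  Kind Value;
  unsigned Precision = 0; // 0 means no precision was specified

  std::string getWildcardRegex() const {
    // Reads the Precision member, so the lambda has to capture `this`.
    auto CreatePrecisionRegex = [this](const std::string &Prefix) {
      return Prefix + "{" + std::to_string(Precision) + "}";
    };
    switch (Value) {
    case Kind::Unsigned:
      if (Precision)
        return CreatePrecisionRegex("([1-9][0-9]*)?[0-9]");
      return "[0-9]+";
    case Kind::Signed:
      if (Precision)
        return CreatePrecisionRegex("-?([1-9][0-9]*)?[0-9]");
      return "-?[0-9]+";
    case Kind::HexUpper:
      if (Precision)
        return CreatePrecisionRegex("([1-9A-F][0-9A-F]*)?[0-9A-F]");
      return "[0-9A-F]+";
    case Kind::HexLower:
      if (Precision)
        return CreatePrecisionRegex("([1-9a-f][0-9a-f]*)?[0-9a-f]");
      return "[0-9a-f]+";
    }
    return std::string(); // not reached for valid Kind values
  }
};

void checkWildcardRegex() {
  FormatSketch U{FormatSketch::Kind::Unsigned, 5};
  assert(U.getWildcardRegex() == "([1-9][0-9]*)?[0-9]{5}");
  FormatSketch X{FormatSketch::Kind::HexUpper};
  assert(X.getWildcardRegex() == "[0-9A-F]+");
}
```

Each case returns directly, so no prefix variable or `break` is needed, and the brace formatting lives in one place.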

Harbormaster completed remote builds in B67868: Diff 284632. Aug 11 2020, 4:04 AM

Harbormaster completed remote builds in B67870: Diff 284635. Aug 11 2020, 4:09 AM

Add review comments

thopre added inline comments. Aug 11 2020, 4:23 AM

llvm/lib/Support/FileCheck.cpp
72	Ah yes, I started doing it your way and changed in the middle. I'll remove the top if block

Harbormaster completed remote builds in B67874: Diff 284647. Aug 11 2020, 4:25 AM

jdenny added inline comments. Aug 11 2020, 9:26 AM

llvm/docs/CommandGuide/FileCheck.rst
736–751	"`%<fmtspec>` is an optional" -> "`%<fmtspec>,` is an optional"? That is, you must either have `%<fmtspec>` and `,` or neither, right? "the what" -> "what"
737	"how many leading zeros" -> "how many digits" given that you can directly specify the latter (as a minimum) but not the former?
760–762	`IMM`->`ADDR` The documentation above says 8 is the minimum, but `F0F0` has 4 digits.
764–766	Isn't `:` supposed to be `,`? That's how the tests seem to work, and FileCheck complains when I try this syntax with `:`.
774	"variable" -> "variables,"
792	When `<expr>` is empty (here or in the variable definition syntax), then the precision specifier specifies the minimum number of digits to be matched, right? When `<expr>` is non-empty, then the precision specifier combined with the actual value of the expression specifies an exact number of digits to be matched, right? I understand that the precision is a minimum here too, but I think it's a printing/substitution minimum not a matching/capturing minimum. My point is that this case is a bit hard to follow. It seems to me that the numeric substitution syntax with no `<expr>` is actually more like a variable definition syntax with no variable (and thus no `:`): there's no existing value to match against, so there's nothing to "substitute". Instead you're capturing a new value and either saving it as a variable or discarding it. Can we document it that way? If so, instead of calling the first syntax "The syntax to define a numeric variable", you might call it "The syntax to capture a numeric value". It can optionally define a numeric variable.

jhenderson added inline comments. Aug 13 2020, 2:10 AM

llvm/docs/CommandGuide/FileCheck.rst
758–759	I think that is much simpler.
llvm/lib/Support/FileCheck.cpp
47	This doesn't compile. I don't think you can use `->` in a capture list. You just need to specify `this` and then use appropriately below.
llvm/test/FileCheck/numeric-expression.txt
147	Same goes elsewhere.
llvm/unittests/Support/FileCheckTest.cpp
142	I think you could simplify this code by starting with `std::string ExtendedInput = Input;` and then just using `ExtendedInput` in the checks below.
152–153	It sounds to me like this is really just two completely different functions. I'd recommend splitting.

Address all remaining review comments

llvm/docs/CommandGuide/FileCheck.rst
792	I like the idea of distinguishing between capturing a value and substituting a value. Good call.
llvm/lib/Support/FileCheck.cpp
47	It's what I found out before I submit this diff, I must have forgotten to undo the change. Sorry about that.

Harbormaster completed remote builds in B68258: Diff 285356. Aug 13 2020, 7:28 AM

thopre retitled this revision from "[RFC, FileCheck] Add precision to format specifier" to "[FileCheck] Add precision to format specifier". Aug 19 2020, 8:49 AM

I think this is basically ready now, barring my example comment.

llvm/docs/CommandGuide/FileCheck.rst
758–763	If this example is meant to demonstrate the precision as well as conversion, it probably makes sense to say something like "but would not match `mov r5, 0x00F0F0FEFE`" and/or change the example to `mov r5, 0x0000F0F0`, so that it shows the precision behaviour.

Better demonstrate precision in documentation

llvm/docs/CommandGuide/FileCheck.rst
758–763	Good point.

LGTM, but best wait for someone else to confirm too.

This revision is now accepted and ready to land. Aug 20 2020, 1:43 AM

Harbormaster completed remote builds in B68991: Diff 286738. Aug 20 2020, 2:21 AM

In D81667#2227851, @jhenderson wrote:

LGTM, but best wait for someone else to confirm too.

Ping anyone else?

I've debugged this and it LGTM.
Have a few minor suggestions about the code (up to you).

llvm/lib/Support/FileCheck.cpp
48	Perhaps, a bit cleaner would be to add the "{" right here.
86	You can just use the value you have already.
103	You can combine these cases I think:

  case Kind::HexUpper:
  case Kind::HexLower:
    AbsoluteValueStr = utohexstr(AbsoluteValue, Value == Kind::HexLower);
    break;

Herald added a subscriber: danielkiss. Aug 30 2020, 2:39 AM

Closed by commit rG998709b7d553: [FileCheck] Add precision to format specifier (authored by thopre). Aug 30 2020, 11:40 AM

This revision was automatically updated to reflect the committed changes.

thopre marked 3 inline comments as done.

thopre added a commit: rG998709b7d553: [FileCheck] Add precision to format specifier.

Revision Contents

Path                                          Size
llvm/docs/CommandGuide/FileCheck.rst          76 lines
llvm/lib/Support/FileCheck.cpp                132 lines
llvm/lib/Support/FileCheckImpl.h              20 lines
llvm/test/FileCheck/numeric-expression.txt    93 lines
llvm/unittests/Support/FileCheckTest.cpp      124 lines

Diff 288863

llvm/docs/CommandGuide/FileCheck.rst

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

:program:`FileCheck` also supports numeric substitution blocks that allow
defining numeric variables and checking for numeric values that satisfy a
numeric expression constraint based on those variables via a numeric
substitution. This allows ``CHECK:`` directives to verify a numeric relation
between two numbers, such as the need for consecutive registers to be used.

- The syntax to define a numeric variable is ``[[#%<fmtspec>,<NUMVAR>:]]`` where:
+ The syntax to capture a numeric value is
+ ``[[#%<fmtspec>,<NUMVAR>:]]`` where:

- * ``%<fmtspec>`` is an optional scanf-style matching format specifier to
-   indicate what number format to match (e.g. hex number). Currently accepted
+ * ``%<fmtspec>,`` is an optional format specifier to indicate what number
+   format to match and the minimum number of digits to expect.

jdenny (inline comment, Done):

"how many leading zeros" -> "how many digits" given that you can directly specify the latter (as a minimum) but not the former?

-   format specifiers are ``%u``, ``%d``, ``%x`` and ``%X``. If absent, the
-   format specifier defaults to ``%u``.
+ * ``<NUMVAR>:`` is an optional definition of variable ``<NUMVAR>`` from the
+   captured value.
+
+ The syntax of ``<fmtspec>`` is: ``.<precision><conversion specifier>`` where:
+
+ * ``.<precision>`` is an optional printf-style precision specifier in which
+   ``<precision>`` indicates the minimum number of digits that the value matched
+   must have, expecting leading zeros if needed.
+
+ * ``<conversion specifier>`` is an optional scanf-style conversion specifier
+   to indicate what number format to match (e.g. hex number). Currently
+   accepted format specifiers are ``%u``, ``%d``, ``%x`` and ``%X``. If absent,
+   the format specifier defaults to ``%u``.

jdenny (inline comment, Done):

"%<fmtspec> is an optional" -> "%<fmtspec>, is an optional"? That is, you must either have %<fmtspec> and , or neither, right?

"the what" -> "what"

- * ``<NUMVAR>`` is the name of the numeric variable to define to the matching
-   value.

For example:

jhenderson (inline comment, Done):

If we expand this out, the full syntax is apparently [[#%.<precision><precision><conversion specifier>,<NUMVAR:]], which I don't think is what you mean :-)

.. code-block:: llvm

-    ; CHECK: mov r[[#REG:]], 0x[[#%X,IMM:]]
+    ; CHECK: mov r[[#REG:]], 0x[[#%.8X,ADDR:]]

jhenderson (inline comment, Done):

Should we say something about leading zeros beyond those required by the precision value?

thopre (author, inline comment, Done):

Is that what you expected?

jhenderson (inline comment, Done):

``<precision>`` indicates the minimum number of digits that the value matched
- must have, expecting leading zeros to have that amount of digits should the
- value be too small otherwise.
+ must have, expecting leading zeros if needed.
* ``<conversion specifier>`` is an optional scanf-style conversion specifier

I think that is much simpler.

+ would match ``mov r5, 0x0000FEFE`` and set ``REG`` to the value ``5`` and
+ ``ADDR`` to the value ``0xFEFE``. Note that due to the precision it would fail
+ to match ``mov r5, 0xFEFE``.

jhenderson (inline comment, Done):

Nit: There's a double space after "to".

jdenny (inline comment, Done):

IMM->ADDR

The documentation above says 8 is the minimum, but F0F0 has 4 digits.

jhenderson (inline comment, Done):

If this example is meant to demonstrate the precision as well as conversion, it probably makes sense to say something like "but would not match mov r5, 0x00F0F0FEFE" and/or change the example to mov r5, 0x0000F0F0, so that it shows the precision behaviour.

thopre (author, inline comment, Done):

Good point.

As a result of the numeric variable definition being optional, it is possible

to only check that a numeric value is present in a given format. This can be

useful when the value itself is not useful, for instance:

jdennyUnsubmitted

Done

Isn't : supposed to be ,? That's how the tests seem to work, and FileCheck complains when I try this syntax with :.


.. code-block:: gas

; CHECK-NOT: mov r0, r[[#]]

jhendersonUnsubmitted

Done

in this context indicating how a numeric expression value should be matched

- against. If absent, both component of the format specifier are inferred from

+ against. If absent, both components of the format specifier are inferred from

the matching format of the numeric variable(s) used by the expression


to check that a value is synthesized rather than moved around.

would match ``mov r5, 0xF0F0`` and set ``REG`` to the value ``5`` and ``IMM``

to the value ``0xF0F0``.

jdennyUnsubmitted

Done

"variable" -> "variables,"


The syntax of a numeric substitution is
- ``[[#%<fmtspec>: <constraint> <expr>]]`` where:
+ ``[[#%<fmtspec>, <constraint> <expr>]]`` where:

- * ``%<fmtspec>`` is the same matching format specifier as for defining numeric
-   variables but acting as a printf-style format to indicate how a numeric
-   expression value should be matched against. If absent, the format specifier
-   is inferred from the matching format of the numeric variable(s) used by the
-   expression constraint if any, and defaults to ``%u`` if no numeric variable
-   is used. In case of conflict between matching formats of several numeric
-   variables the format specifier is mandatory.
+ * ``<fmtspec>`` is the same format specifier as for defining a variable but
+   in this context indicating how a numeric expression value should be matched
+   against. If absent, both components of the format specifier are inferred from
+   the matching format of the numeric variable(s) used by the expression
+   constraint if any, and defaults to ``%u`` if no numeric variable is used,
+   denoting that the value should be unsigned with no leading zeros. In case of
+   conflict between format specifiers of several numeric variables, the
+   conversion specifier becomes mandatory but the precision specifier remains
+   optional.

* ``<constraint>`` is the constraint describing how the value to match must
  relate to the value of the numeric expression. The only currently accepted
  constraint is ``==`` for an exact match and is the default if
  ``<constraint>`` is not provided. No matching constraint must be specified
  when the ``<expr>`` is empty.

jdennyUnsubmitted

Done

When <expr> is empty (here or in the variable definition syntax), then the precision specifier specifies the minimum number of digits to be matched, right?

When <expr> is non-empty, then the precision specifier combined with the actual value of the expression specifies an exact number of digits to be matched, right? I understand that the precision is a minimum here too, but I think it's a printing/substitution minimum not a matching/capturing minimum.

My point is that this case is a bit hard to follow. It seems to me that the numeric substitution syntax with no <expr> is actually more like a variable definition syntax with no variable (and thus no :): there's no existing value to match against, so there's nothing to "substitute". Instead you're capturing a new value and either saving it as a variable or discarding it. Can we document it that way?

If so, instead of calling the first syntax "The syntax to define a numeric variable", you might call it "The syntax to capture a numeric value". It can optionally define a numeric variable.


thopreAuthorUnsubmitted

Done

I like the idea of distinguishing between capturing a value and substituting a value. Good call.


* ``<expr>`` is an expression. An expression is in turn recursively defined * ``<expr>`` is an expression. An expression is in turn recursively defined

as: as:

* a numeric operand, or * a numeric operand, or

* an expression followed by an operator and a numeric operand. * an expression followed by an operator and a numeric operand.

A numeric operand is a previously defined numeric variable, an integer A numeric operand is a previously defined numeric variable, an integer

▲ Show 20 Lines • Show All 43 Lines • ▼ Show 20 Lines .. code-block:: gas

load r5, [r0] load r5, [r0]

load r7, [r1] load r7, [r1]

Loading from 0xa0463440 to 0xa0463443 Loading from 0xa0463440 to 0xa0463443

Due to ``7`` being unequal to ``5 + 1`` and ``a0463443`` being unequal to Due to ``7`` being unequal to ``5 + 1`` and ``a0463443`` being unequal to

``a0463440 + 7``. ``a0463440 + 7``.

The syntax also supports an empty expression, equivalent to writing {{[0-9]+}},

for cases where the input must contain a numeric value but the value itself

does not matter:

.. code-block:: gas

; CHECK-NOT: mov r0, r[[#]]

to check that a value is synthesized rather than moved around.

A numeric variable can also be defined to the result of a numeric expression,
in which case the numeric expression constraint is checked and if verified the
- variable is assigned to the value. The unified syntax for both defining numeric
- variables and checking a numeric expression is thus
+ variable is assigned to the value. The unified syntax for both checking a
+ numeric expression and capturing its value into a numeric variable is thus
``[[#%<fmtspec>,<NUMVAR>: <constraint> <expr>]]`` with each element as
described previously. One can use this syntax to make a testcase more
self-describing by using variables instead of values:

.. code-block:: gas

    ; CHECK: mov r[[#REG_OFFSET:]], 0x[[#%X,FIELD_OFFSET:12]]
    ; CHECK-NEXT: load r[[#]], [r[[#REG_BASE:]], r[[#REG_OFFSET]]]


llvm/lib/Support/FileCheck.cpp

Show All 37 Lines StringRef ExpressionFormat::toString() const {

case Kind::HexUpper: case Kind::HexUpper:

return StringRef("%X"); return StringRef("%X");

case Kind::HexLower: case Kind::HexLower:

return StringRef("%x"); return StringRef("%x");

} }

llvm_unreachable("unknown expression format"); llvm_unreachable("unknown expression format");

} }

Expected<StringRef> ExpressionFormat::getWildcardRegex() const { Expected<std::string> ExpressionFormat::getWildcardRegex() const {

auto CreatePrecisionRegex = [this](StringRef S) {

jhendersonUnsubmitted

Done

This doesn't compile. I don't think you can use -> in a capture list. You just need to specify this and then use appropriately below.


thopreAuthorUnsubmitted

Done

That's what I found out before I submitted this diff; I must have forgotten to undo the change. Sorry about that.


return (S + Twine('{') + Twine(Precision) + "}").str();

jhendersonUnsubmitted

Done

StringRef?


thopreAuthorUnsubmitted

Done

ostringstream below does not understand StringRef so I would need to do .str() which can be expensive. Any reason not to keep const char*?


grimarUnsubmitted

Done

auto CreatePrecisionRegex = [this](StringRef S) {

- return (S + Twine(Precision) + "}").str();

+ return (S + "{" + Twine(Precision) + "}").str();

};

switch (Value) {

Perhaps it would be a bit cleaner to add the "{" right here.


};

  switch (Value) {
  case Kind::Unsigned:
-   return StringRef("[0-9]+");
+   if (Precision)
+     return CreatePrecisionRegex("([1-9][0-9]*)?[0-9]");
+   return std::string("[0-9]+");
  case Kind::Signed:
-   return StringRef("-?[0-9]+");
+   if (Precision)
+     return CreatePrecisionRegex("-?([1-9][0-9]*)?[0-9]");
+   return std::string("-?[0-9]+");
  case Kind::HexUpper:
-   return StringRef("[0-9A-F]+");
+   if (Precision)
+     return CreatePrecisionRegex("([1-9A-F][0-9A-F]*)?[0-9A-F]");
+   return std::string("[0-9A-F]+");
  case Kind::HexLower:
-   return StringRef("[0-9a-f]+");
+   if (Precision)
+     return CreatePrecisionRegex("([1-9a-f][0-9a-f]*)?[0-9a-f]");
+   return std::string("[0-9a-f]+");
  default:
    return createStringError(std::errc::invalid_argument,

grimarUnsubmitted

Done

Seems you should be able to do the following instead?

return (RegexPrefix + Twine(Precision) + "}").str();


"trying to match value with invalid format"); "trying to match value with invalid format");

} }

grimarUnsubmitted

Done

Perhaps it might be simpler just to merge the switches and write the logic here as:

Expected<std::string> ExpressionFormat::getWildcardRegex() const {
  if (Value == Kind::NoFormat)
    return createStringError(std::errc::invalid_argument,
                             "trying to match value with invalid format");
  switch (Value) {
  case Kind::Unsigned:
    if (Precision)
      return ("([1-9][0-9]*)?[0-9]{" + Twine(Precision) + "}").str();
    return std::string("[0-9]+");
  case Kind::Signed:
    ...
  default:
    llvm_unreachable("....");
  }
}


thopreAuthorUnsubmitted

Done

I'm not a big fan of repeating the formatting logic for the Precision case so I've kept that bit as is. What do you think of the result?


grimarUnsubmitted

Done

I see 2 possible improvements:

1) When you have a dedicated RegexPrefix variable, you postpone the return and have to add breaks everywhere. If you just do not want to repeat the formatting logic, I'd suggest adding a little helper, e.g.:

auto CreatePrecisionRegex = [this](StringRef S) -> std::string {
  return (S + Twine(Precision) + "}").str();
};

switch (Value) {
case Kind::Unsigned:
  if (Precision)
    return CreatePrecisionRegex("([1-9][0-9]*)?[0-9]{");
  return std::string("[0-9]+");
default:
  llvm_unreachable("ddd");
}

The main benefit is that you can return early and avoid having one more variable.

2) Perhaps it doesn't make much sense to use createStringError for the default case? It is unreachable now and can't be tested either (I believe).

So I'd either remove the if (Value == Kind::NoFormat) block and handle the error in the default, like you initially did, or keep it and switch to using llvm_unreachable in default.


grimarUnsubmitted

Done

Oh, and for 1) there is no need to use -> std::string:

auto CreatePrecisionRegex = [this](StringRef S) {
  return (S + Twine(Precision) + "}").str();
};


thopreAuthorUnsubmitted

Done

Ah yes, I started doing it your way and changed in the middle. I'll remove the top if block
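The wildcard patterns discussed in this thread can be tried outside of FileCheck. Below is a minimal sketch using only the standard library: `Kind`, `wildcardRegex` and `matchesWhole` are hypothetical names for illustration, not LLVM APIs, and `std::regex_match` requires the whole string to match, which is a simplification (FileCheck embeds the pattern in a larger regular expression).

```cpp
#include <regex>
#include <string>

// Hypothetical stand-in for ExpressionFormat::Kind; not the LLVM type.
enum class Kind { Unsigned, Signed, HexUpper, HexLower };

// Returns the wildcard pattern for a format. Precision == 0 means that no
// precision was requested, mirroring the `if (Precision)` checks in the diff.
std::string wildcardRegex(Kind K, unsigned Precision) {
  // Appends the "{N}" repetition count to a pattern prefix.
  auto WithPrecision = [Precision](const std::string &Prefix) {
    return Prefix + "{" + std::to_string(Precision) + "}";
  };
  switch (K) {
  case Kind::Unsigned:
    if (Precision)
      return WithPrecision("([1-9][0-9]*)?[0-9]");
    return "[0-9]+";
  case Kind::Signed:
    if (Precision)
      return WithPrecision("-?([1-9][0-9]*)?[0-9]");
    return "-?[0-9]+";
  case Kind::HexUpper:
    if (Precision)
      return WithPrecision("([1-9A-F][0-9A-F]*)?[0-9A-F]");
    return "[0-9A-F]+";
  case Kind::HexLower:
    if (Precision)
      return WithPrecision("([1-9a-f][0-9a-f]*)?[0-9a-f]");
    return "[0-9a-f]+";
  }
  return ""; // Not reached for valid enumerators.
}

// Assumption: the whole string must match (see the note above).
bool matchesWhole(const std::string &Text, Kind K, unsigned Precision) {
  return std::regex_match(Text, std::regex(wildcardRegex(K, Precision)));
}
```

With a precision of 8, the optional group accepts extra leading digits only if they do not start with a zero, so `00000022` and `323232323` are accepted while `0000022` and `000000010` are rejected, in line with the inputs used in numeric-expression.txt.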


  Expected<std::string>
  ExpressionFormat::getMatchingString(ExpressionValue IntegerValue) const {
+   uint64_t AbsoluteValue;
+   StringRef SignPrefix = IntegerValue.isNegative() ? "-" : "";
    if (Value == Kind::Signed) {
      Expected<int64_t> SignedValue = IntegerValue.getSignedValue();
      if (!SignedValue)
        return SignedValue.takeError();
-     return itostr(*SignedValue);
-   }
+     if (*SignedValue < 0)
+       AbsoluteValue = cantFail(IntegerValue.getAbsolute().getUnsignedValue());
+     else
+       AbsoluteValue = *SignedValue;

grimarUnsubmitted

Done

else

- AbsoluteValue = cantFail(IntegerValue.getSignedValue());

+ AbsoluteValue = *SignedValue;

} else {

You can just use the value you have already.


} else {

Expected<uint64_t> UnsignedValue = IntegerValue.getUnsignedValue(); Expected<uint64_t> UnsignedValue = IntegerValue.getUnsignedValue();

if (!UnsignedValue) if (!UnsignedValue)

return UnsignedValue.takeError(); return UnsignedValue.takeError();

AbsoluteValue = *UnsignedValue;

}

std::string AbsoluteValueStr;

  switch (Value) {
  case Kind::Unsigned:
-   return utostr(*UnsignedValue);
+ case Kind::Signed:
+   AbsoluteValueStr = utostr(AbsoluteValue);
+   break;
  case Kind::HexUpper:
-   return utohexstr(*UnsignedValue, /*LowerCase=*/false);
  case Kind::HexLower:
-   return utohexstr(*UnsignedValue, /*LowerCase=*/true);
+   AbsoluteValueStr = utohexstr(AbsoluteValue, Value == Kind::HexLower);
+   break;

grimarUnsubmitted

Done

You can combine these cases I think:

case Kind::HexUpper:
case Kind::HexLower:
  AbsoluteValueStr = utohexstr(AbsoluteValue, Value == Kind::HexLower);
  break;


default: default:

return createStringError(std::errc::invalid_argument, return createStringError(std::errc::invalid_argument,

"trying to match value with invalid format"); "trying to match value with invalid format");

} }

if (Precision > AbsoluteValueStr.size()) {

unsigned LeadingZeros = Precision - AbsoluteValueStr.size();

return (Twine(SignPrefix) + std::string(LeadingZeros, '0') +

AbsoluteValueStr)

.str();

}

return (Twine(SignPrefix) + AbsoluteValueStr).str();

} }
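The padding step at the end of getMatchingString can be illustrated on its own. The following is a rough sketch under stated assumptions: `Kind`, `toDigits` and `padValue` are hypothetical helpers written for this note, the sign is passed in separately rather than derived from an `ExpressionValue`, and the error handling of the real function is left out.

```cpp
#include <cstdint>
#include <string>

// Hypothetical stand-in for ExpressionFormat::Kind; not the LLVM type.
enum class Kind { Unsigned, Signed, HexUpper, HexLower };

// Converts an absolute value to its digits in the given base (10 or 16).
std::string toDigits(uint64_t V, unsigned Base, bool LowerCase) {
  const char *Digits = LowerCase ? "0123456789abcdef" : "0123456789ABCDEF";
  std::string S;
  do {
    S.insert(S.begin(), Digits[V % Base]);
    V /= Base;
  } while (V != 0);
  return S;
}

// Formats a value so that it has at least Precision digits, adding leading
// zeros after the sign when the digit string is shorter than that.
std::string padValue(uint64_t AbsoluteValue, bool Negative, Kind K,
                     unsigned Precision) {
  bool IsHex = K == Kind::HexUpper || K == Kind::HexLower;
  std::string Str =
      toDigits(AbsoluteValue, IsHex ? 16 : 10, K == Kind::HexLower);
  std::string Sign = Negative ? "-" : "";
  if (Precision > Str.size())
    return Sign + std::string(Precision - Str.size(), '0') + Str;
  return Sign + Str;
}
```

The sign is kept outside the padded digits, which is why `-30` with `%.8d` becomes `-00000030` rather than `00000-30`. A value that already has more digits than the precision is returned unchanged.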

Expected<ExpressionValue> Expected<ExpressionValue>

ExpressionFormat::valueFromStringRepr(StringRef StrVal, ExpressionFormat::valueFromStringRepr(StringRef StrVal,

const SourceMgr &SM) const { const SourceMgr &SM) const {

bool ValueIsSigned = Value == Kind::Signed; bool ValueIsSigned = Value == Kind::Signed;

StringRef OverflowErrorStr = "unable to represent numeric value"; StringRef OverflowErrorStr = "unable to represent numeric value";

if (ValueIsSigned) { if (ValueIsSigned) {


Expected<std::unique_ptr<Expression>> Pattern::parseNumericSubstitutionBlock( Expected<std::unique_ptr<Expression>> Pattern::parseNumericSubstitutionBlock(

StringRef Expr, Optional<NumericVariable *> &DefinedNumericVariable, StringRef Expr, Optional<NumericVariable *> &DefinedNumericVariable,

bool IsLegacyLineExpr, Optional<size_t> LineNumber, bool IsLegacyLineExpr, Optional<size_t> LineNumber,

FileCheckPatternContext *Context, const SourceMgr &SM) { FileCheckPatternContext *Context, const SourceMgr &SM) {

std::unique_ptr<ExpressionAST> ExpressionASTPointer = nullptr; std::unique_ptr<ExpressionAST> ExpressionASTPointer = nullptr;

StringRef DefExpr = StringRef(); StringRef DefExpr = StringRef();

DefinedNumericVariable = None; DefinedNumericVariable = None;

ExpressionFormat ExplicitFormat = ExpressionFormat(); ExpressionFormat ExplicitFormat = ExpressionFormat();

unsigned Precision = 0;

// Parse format specifier (NOTE: ',' is also an argument seperator). // Parse format specifier (NOTE: ',' is also an argument seperator).

size_t FormatSpecEnd = Expr.find(','); size_t FormatSpecEnd = Expr.find(',');

size_t FunctionStart = Expr.find('('); size_t FunctionStart = Expr.find('(');

if (FormatSpecEnd != StringRef::npos && FormatSpecEnd < FunctionStart) { if (FormatSpecEnd != StringRef::npos && FormatSpecEnd < FunctionStart) {

Expr = Expr.ltrim(SpaceChars); StringRef FormatExpr = Expr.take_front(FormatSpecEnd);

if (!Expr.consume_front("%")) Expr = Expr.drop_front(FormatSpecEnd + 1);

FormatExpr = FormatExpr.trim(SpaceChars);

if (!FormatExpr.consume_front("%"))

grimarUnsubmitted

Done

Use trim?

FormatExpr.trim(SpaceChars)


return ErrorDiagnostic::get( return ErrorDiagnostic::get(

SM, Expr, "invalid matching format specification in expression"); SM, FormatExpr,

"invalid matching format specification in expression");

// Parse precision.

if (FormatExpr.consume_front(".")) {

if (FormatExpr.consumeInteger(10, Precision))

return ErrorDiagnostic::get(SM, FormatExpr,

"invalid precision in format specifier");

}

if (!FormatExpr.empty()) {

// Check for unknown matching format specifier and set matching format in // Check for unknown matching format specifier and set matching format in

// class instance representing this expression. // class instance representing this expression.

SMLoc fmtloc = SMLoc::getFromPointer(Expr.data()); SMLoc FmtLoc = SMLoc::getFromPointer(FormatExpr.data());

jhendersonUnsubmitted

Done

Can you fix the case of fmtloc whilst you're modifying this line, please?


switch (popFront(Expr)) { switch (popFront(FormatExpr)) {

case 'u': case 'u':

ExplicitFormat = ExpressionFormat(ExpressionFormat::Kind::Unsigned); ExplicitFormat =

ExpressionFormat(ExpressionFormat::Kind::Unsigned, Precision);

break; break;

case 'd': case 'd':

ExplicitFormat = ExpressionFormat(ExpressionFormat::Kind::Signed); ExplicitFormat =

ExpressionFormat(ExpressionFormat::Kind::Signed, Precision);

break; break;

case 'x': case 'x':

ExplicitFormat = ExpressionFormat(ExpressionFormat::Kind::HexLower); ExplicitFormat =

ExpressionFormat(ExpressionFormat::Kind::HexLower, Precision);

break; break;

case 'X': case 'X':

ExplicitFormat = ExpressionFormat(ExpressionFormat::Kind::HexUpper); ExplicitFormat =

ExpressionFormat(ExpressionFormat::Kind::HexUpper, Precision);

break; break;

default: default:

return ErrorDiagnostic::get(SM, fmtloc, return ErrorDiagnostic::get(SM, FmtLoc,

"invalid format specifier in expression"); "invalid format specifier in expression");

} }

}

Expr = Expr.ltrim(SpaceChars); FormatExpr = FormatExpr.ltrim(SpaceChars);

if (!Expr.consume_front(",")) if (!FormatExpr.empty())

return ErrorDiagnostic::get( return ErrorDiagnostic::get(

SM, Expr, "invalid matching format specification in expression"); SM, FormatExpr,

"invalid matching format specification in expression");

} }
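The parsing order used above (trim, `%`, optional `.<precision>`, optional conversion specifier, nothing else) can be sketched with the standard library alone. `FormatSpec` and `parseFormatSpec` are hypothetical names, a plain `bool` replaces the `ErrorDiagnostic` results, and overflow of the precision value is not checked in this sketch.

```cpp
#include <cctype>
#include <cstddef>
#include <string>

// Hypothetical result type for illustration only.
struct FormatSpec {
  unsigned Precision = 0; // 0 means no precision was given.
  char Conversion = '\0'; // '\0' means no conversion specifier was given.
};

// Parses the text found before the ',' of a numeric substitution block.
// Returns false on malformed input and leaves Out untouched in that case.
bool parseFormatSpec(const std::string &Text, FormatSpec &Out) {
  // Trim surrounding spaces and tabs.
  std::size_t Begin = Text.find_first_not_of(" \t");
  if (Begin == std::string::npos)
    return false;
  std::size_t End = Text.find_last_not_of(" \t");
  std::string Spec = Text.substr(Begin, End - Begin + 1);

  std::size_t Pos = 0;
  if (Spec[Pos] != '%')
    return false;
  ++Pos;

  FormatSpec Result;
  // Optional precision: a '.' followed by at least one decimal digit.
  if (Pos < Spec.size() && Spec[Pos] == '.') {
    ++Pos;
    std::size_t DigitsStart = Pos;
    while (Pos < Spec.size() &&
           std::isdigit(static_cast<unsigned char>(Spec[Pos]))) {
      Result.Precision = Result.Precision * 10 + (Spec[Pos] - '0');
      ++Pos;
    }
    if (Pos == DigitsStart)
      return false; // Corresponds to "invalid precision in format specifier".
  }

  // Optional conversion specifier.
  if (Pos < Spec.size()) {
    char C = Spec[Pos++];
    if (C != 'u' && C != 'd' && C != 'x' && C != 'X')
      return false; // Corresponds to "invalid format specifier".
    Result.Conversion = C;
  }

  // Anything left over is rejected.
  if (Pos != Spec.size())
    return false;
  Out = Result;
  return true;
}
```

Under these assumptions `%.8x` yields precision 8 with conversion `x`, `%.8` yields precision 8 with no conversion specifier, and `%hhd` or `%c` are rejected, matching the INVALID-FMT-SPEC cases in the test file.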

// Save variable definition expression if any. // Save variable definition expression if any.

size_t DefEnd = Expr.find(':'); size_t DefEnd = Expr.find(':');

if (DefEnd != StringRef::npos) { if (DefEnd != StringRef::npos) {

DefExpr = Expr.substr(0, DefEnd); DefExpr = Expr.substr(0, DefEnd);

Expr = Expr.substr(DefEnd + 1); Expr = Expr.substr(DefEnd + 1);

} }

▲ Show 20 Lines • Show All 43 Lines • ▼ Show 20 Lines Expected<std::unique_ptr<Expression>> Pattern::parseNumericSubstitutionBlock(

else if (ExpressionASTPointer) { else if (ExpressionASTPointer) {

Expected<ExpressionFormat> ImplicitFormat = Expected<ExpressionFormat> ImplicitFormat =

ExpressionASTPointer->getImplicitFormat(SM); ExpressionASTPointer->getImplicitFormat(SM);

if (!ImplicitFormat) if (!ImplicitFormat)

return ImplicitFormat.takeError(); return ImplicitFormat.takeError();

Format = *ImplicitFormat; Format = *ImplicitFormat;

} }

if (!Format) if (!Format)

Format = ExpressionFormat(ExpressionFormat::Kind::Unsigned); Format = ExpressionFormat(ExpressionFormat::Kind::Unsigned, Precision);

std::unique_ptr<Expression> ExpressionPointer = std::unique_ptr<Expression> ExpressionPointer =

std::make_unique<Expression>(std::move(ExpressionASTPointer), Format); std::make_unique<Expression>(std::move(ExpressionASTPointer), Format);

// Parse the numeric variable definition. // Parse the numeric variable definition.

if (DefEnd != StringRef::npos) { if (DefEnd != StringRef::npos) {

DefExpr = DefExpr.ltrim(SpaceChars); DefExpr = DefExpr.ltrim(SpaceChars);

Expected<NumericVariable *> ParseResult = parseNumericVariableDefinition( Expected<NumericVariable *> ParseResult = parseNumericVariableDefinition(

▲ Show 20 Lines • Show All 117 Lines • ▼ Show 20 Lines if (PatternStr.startswith("[[")) {

bool IsDefinition = false; bool IsDefinition = false;

bool SubstNeeded = false; bool SubstNeeded = false;

// Whether the substitution block is a legacy use of @LINE with string // Whether the substitution block is a legacy use of @LINE with string

// substitution block syntax. // substitution block syntax.

bool IsLegacyLineExpr = false; bool IsLegacyLineExpr = false;

StringRef DefName; StringRef DefName;

StringRef SubstStr; StringRef SubstStr;

StringRef MatchRegexp; std::string MatchRegexp;

size_t SubstInsertIdx = RegExStr.size(); size_t SubstInsertIdx = RegExStr.size();

// Parse string variable or legacy @LINE expression. // Parse string variable or legacy @LINE expression.

if (!IsNumBlock) { if (!IsNumBlock) {

size_t VarEndIdx = MatchStr.find(":"); size_t VarEndIdx = MatchStr.find(":");

size_t SpacePos = MatchStr.substr(0, VarEndIdx).find_first_of(" \t"); size_t SpacePos = MatchStr.substr(0, VarEndIdx).find_first_of(" \t");

if (SpacePos != StringRef::npos) { if (SpacePos != StringRef::npos) {

SM.PrintMessage(SMLoc::getFromPointer(MatchStr.data() + SpacePos), SM.PrintMessage(SMLoc::getFromPointer(MatchStr.data() + SpacePos),

Show All 27 Lines if (PatternStr.startswith("[[")) {

if (Context->GlobalNumericVariableTable.find(Name) != if (Context->GlobalNumericVariableTable.find(Name) !=

Context->GlobalNumericVariableTable.end()) { Context->GlobalNumericVariableTable.end()) {

SM.PrintMessage( SM.PrintMessage(

SMLoc::getFromPointer(Name.data()), SourceMgr::DK_Error, SMLoc::getFromPointer(Name.data()), SourceMgr::DK_Error,

"numeric variable with name '" + Name + "' already exists"); "numeric variable with name '" + Name + "' already exists");

return true; return true;

} }

DefName = Name; DefName = Name;

MatchRegexp = MatchStr; MatchRegexp = MatchStr.str();

} else { } else {

if (IsPseudo) { if (IsPseudo) {

MatchStr = OrigMatchStr; MatchStr = OrigMatchStr;

IsLegacyLineExpr = IsNumBlock = true; IsLegacyLineExpr = IsNumBlock = true;

} else } else

SubstStr = Name; SubstStr = Name;

} }


llvm/lib/Support/FileCheckImpl.h

Show First 20 Lines • Show All 47 Lines • ▼ Show 20 Lines enum class Kind {

/// Value should be printed as an uppercase hex number. /// Value should be printed as an uppercase hex number.

HexUpper, HexUpper,

/// Value should be printed as a lowercase hex number. /// Value should be printed as a lowercase hex number.

HexLower HexLower

}; };

private: private:

Kind Value; Kind Value;

unsigned Precision = 0;

MaskRayUnsubmitted

Done

Prefer default member initializer (unsigned Precision = 0;)


public: public:

/// Evaluates a format to true if it can be used in a match. /// Evaluates a format to true if it can be used in a match.

explicit operator bool() const { return Value != Kind::NoFormat; } explicit operator bool() const { return Value != Kind::NoFormat; }

  /// Define format equality: formats are equal if neither is NoFormat and
- /// their kinds are the same.
+ /// their kinds and precision are the same.
  bool operator==(const ExpressionFormat &Other) const {
-   return Value != Kind::NoFormat && Value == Other.Value;
+   return Value != Kind::NoFormat && Value == Other.Value &&
+          Precision == Other.Precision;
  }

bool operator!=(const ExpressionFormat &Other) const { bool operator!=(const ExpressionFormat &Other) const {

return !(*this == Other); return !(*this == Other);

} }
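A small sketch of what the updated equality means in practice, using a hypothetical `Format` struct that only mirrors the two fields involved (it is not the LLVM class):

```cpp
// Hypothetical stand-ins for ExpressionFormat and its Kind.
enum class Kind { NoFormat, Unsigned, Signed, HexUpper, HexLower };

struct Format {
  Kind Value;
  unsigned Precision;

  explicit Format(Kind V = Kind::NoFormat, unsigned P = 0)
      : Value(V), Precision(P) {}

  // Equal only if neither side is NoFormat and both kind and precision agree.
  bool operator==(const Format &Other) const {
    return Value != Kind::NoFormat && Value == Other.Value &&
           Precision == Other.Precision;
  }
  bool operator!=(const Format &Other) const { return !(*this == Other); }
};
```

With this definition a `%.8x` format and a plain `%x` format compare unequal, and two NoFormat instances never compare equal to each other.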

bool operator==(Kind OtherValue) const { return Value == OtherValue; } bool operator==(Kind OtherValue) const { return Value == OtherValue; }

bool operator!=(Kind OtherValue) const { return !(*this == OtherValue); } bool operator!=(Kind OtherValue) const { return !(*this == OtherValue); }

/// \returns the format specifier corresponding to this format as a string. /// \returns the format specifier corresponding to this format as a string.

StringRef toString() const; StringRef toString() const;

ExpressionFormat() : Value(Kind::NoFormat){}; ExpressionFormat() : Value(Kind::NoFormat){};

explicit ExpressionFormat(Kind Value) : Value(Value){}; explicit ExpressionFormat(Kind Value) : Value(Value), Precision(0){};

explicit ExpressionFormat(Kind Value, unsigned Precision)

jhendersonUnsubmitted

Done

/// \returns a wildcard regular expression string that matches any value in

- /// the format represented by this instance and none other value, or an error

+ /// the format represented by this instance and no other value, or an error

/// if the format is NoFormat.


/// \returns a wildcard regular expression StringRef that matches any value : Value(Value), Precision(Precision){};

/// in the format represented by this instance, or an error if the format is

/// NoFormat. /// \returns a wildcard regular expression string that matches any value in

Expected<StringRef> getWildcardRegex() const; /// the format represented by this instance and no other value, or an error

/// if the format is NoFormat.

Expected<std::string> getWildcardRegex() const;

/// \returns the string representation of \p Value in the format represented /// \returns the string representation of \p Value in the format represented

/// by this instance, or an error if conversion to this format failed or the /// by this instance, or an error if conversion to this format failed or the

/// format is NoFormat. /// format is NoFormat.

Expected<std::string> getMatchingString(ExpressionValue Value) const; Expected<std::string> getMatchingString(ExpressionValue Value) const;

/// \returns the value corresponding to string representation \p StrVal /// \returns the value corresponding to string representation \p StrVal

/// according to the matching format represented by this instance or an error /// according to the matching format represented by this instance or an error


llvm/test/FileCheck/numeric-expression.txt


INVALID-FMT-SPEC2-NEXT: INVVAR2=[[#%hhd,INVVAR2:]]

INVALID-FMT-SPEC-MSG1: numeric-expression.txt:[[#@LINE-2]]:37: error: invalid format specifier in expression

INVALID-FMT-SPEC-MSG1-NEXT: {{I}}NVALID-FMT-SPEC1-NEXT: INVVAR1={{\[\[#%c,INVVAR1:\]\]}}

INVALID-FMT-SPEC-MSG1-NEXT: {{^}} ^{{$}}

INVALID-FMT-SPEC-MSG2: numeric-expression.txt:[[#@LINE-4]]:37: error: invalid format specifier in expression

INVALID-FMT-SPEC-MSG2-NEXT: {{I}}NVALID-FMT-SPEC2-NEXT: INVVAR2={{\[\[#%hhd,INVVAR2:\]\]}}

INVALID-FMT-SPEC-MSG2-NEXT: {{^}} ^{{$}}

- ; Numeric expressions in explicit matching format and default matching rule using
- ; variables defined on other lines without spaces.
+ ; Numeric variable definition with precision specifier.

DEF PREC FMT // CHECK-LABEL: DEF PREC FMT

00000022 // CHECK-NEXT: {{^}}[[#%.8,PADDED_UNSI:]]

323232323 // CHECK-NEXT: {{^}}[[#%.8,PADDED_UNSI2:]]

00000018 // CHECK-NEXT: {{^}}[[#%.8u,PADDED_UNSI3:]]

181818181 // CHECK-NEXT: {{^}}[[#%.8u,PADDED_UNSI4:]]

0000000f // CHECK-NEXT: {{^}}[[#%.8x,PADDED_LHEX:]]

fffffffff // CHECK-NEXT: {{^}}[[#%.8x,PADDED_LHEX2:]]

0000000E // CHECK-NEXT: {{^}}[[#%.8X,PADDED_UHEX:]]

EEEEEEEEE // CHECK-NEXT: {{^}}[[#%.8X,PADDED_UHEX2:]]

-00000055 // CHECK-NEXT: {{^}}[[#%.8d,PADDED_SIGN:]]

-555555555 // CHECK-NEXT: {{^}}[[#%.8d,PADDED_SIGN2:]]

; Numeric variable definition with precision specifier with value not padded

; enough.

RUN: FileCheck --check-prefix INVALID-PADDING-DEF --input-file %s %s

FAIL DEF PREC FMT // INVALID-PADDING-DEF-LABEL: FAIL DEF PREC FMT

INVALID_PADDED_UNSI: 0000022 // INVALID-PADDING-DEF-NOT: {{^}}INVALID_PADDED_UNSI: [[#%.8,INVALID_PADDED_UNSI:]]

INVALID_PADDED_UNSI2: 0000018 // INVALID-PADDING-DEF-NOT: {{^}}INVALID_PADDED_UNSI2: [[#%.8u,INVALID_PADDED_UNSI2:]]

INVALID_PADDED_LHEX: 000000f // INVALID-PADDING-DEF-NOT: {{^}}INVALID_PADDED_LHEX: [[#%.8x,INVALID_PADDED_LHEX:]]

INVALID_PADDED_UHEX: 000000E // INVALID-PADDING-DEF-NOT: {{^}}INVALID_PADDED_UHEX: [[#%.8X,INVALID_PADDED_UHEX:]]

INVALID_PADDED_SIGN: -0000055 // INVALID-PADDING-DEF-NOT: {{^}}INVALID_PADDED_SIGN: [[#%.8d,INVALID_PADDED_SIGN:]]

; Numeric expressions with explicit matching format and default matching rule

; using variables defined on other lines without spaces.

USE EXPL FMT IMPL MATCH // CHECK-LABEL: USE EXPL FMT IMPL MATCH

11 // CHECK-NEXT: {{^}}[[#%u,UNSI]]

12 // CHECK-NEXT: {{^}}[[#%u,UNSI+1]]

10 // CHECK-NEXT: {{^}}[[#%u,UNSI-1]]

15 // CHECK-NEXT: {{^}}[[#%u,add(UNSI,4)]]

11 // CHECK-NEXT: {{^}}[[#%u,max(UNSI,7)]]

99 // CHECK-NEXT: {{^}}[[#%u,max(UNSI,99)]]

7 // CHECK-NEXT: {{^}}[[#%u,min(UNSI,7)]]


-17 // CHECK-NEXT: {{^}}[[#%d,max(SIGN,-17)]]

-30 // CHECK-NEXT: {{^}}[[#%d,min(SIGN,-17)]]

-31 // CHECK-NEXT: {{^}}[[#%d,sub(SIGN,1)]]

11 // CHECK-NEXT: {{^}}[[#%u,UNSIa]]

11 // CHECK-NEXT: {{^}}[[#%u,UNSIb]]

11 // CHECK-NEXT: {{^}}[[#%u,UNSIc]]

c // CHECK-NEXT: {{^}}[[#%x,LHEXa]]

- ; Numeric expressions in explicit matching format and default matching rule using
- ; variables defined on other lines with different spacing.
+ ; Numeric expressions with explicit matching format and default matching rule
+ ; using variables defined on other lines with different spacing.

USE EXPL FMT IMPL MATCH SPC // CHECK-LABEL: USE EXPL FMT IMPL MATCH SPC

11 // CHECK-NEXT: {{^}}[[#%u, UNSI]]

11 // CHECK-NEXT: {{^}}[[# %u, UNSI]]

11 // CHECK-NEXT: {{^}}[[# %u, UNSI ]]

12 // CHECK-NEXT: {{^}}[[#%u, UNSI+1]]

12 // CHECK-NEXT: {{^}}[[# %u, UNSI+1]]

12 // CHECK-NEXT: {{^}}[[# %u , UNSI+1]]

12 // CHECK-NEXT: {{^}}[[# %u , UNSI +1]]


13 // CHECK-NEXT: {{^}}[[# %u , add(UNSI,2)]]

13 // CHECK-NEXT: {{^}}[[# %u , add(UNSI, 2)]]

13 // CHECK-NEXT: {{^}}[[# %u , add( UNSI, 2)]]

13 // CHECK-NEXT: {{^}}[[# %u , add( UNSI,2)]]

13 // CHECK-NEXT: {{^}}[[# %u , add(UNSI,2) ]]

13 // CHECK-NEXT: {{^}}[[# %u , add (UNSI,2)]]

104 // CHECK-NEXT: {{^}}[[# %u , UNSI + sub( add (100 , UNSI+ 1 ), 20) +1 ]]

; Numeric expressions in implicit matching format and default matching rule using

; Numeric expressions with explicit matching format, precision, and default

jhendersonUnsubmitted

Done

104 // CHECK-NEXT: {{^}}[[# %u , UNSI + sub( add (100 , UNSI+ 1 ), 20) +1 ]]

- ; Numeric expressions in explicit matching format with precision and default

+ ; Numeric expressions with explicit matching format, precision, and default

; matching rule using variables defined on other lines without spaces.

Same goes elsewhere.


; variables defined on other lines.

; matching rule using variables defined on other lines without spaces.

USE EXPL FMT WITH PREC IMPL MATCH // CHECK-LABEL: USE EXPL FMT WITH PREC IMPL MATCH

11 // CHECK-NEXT: {{^}}[[#%.1u,UNSI]]

00000011 // CHECK-NEXT: {{^}}[[#%.8u,UNSI]]

1c // CHECK-NEXT: {{^}}[[#%.1x,LHEX+16]]

0000000c // CHECK-NEXT: {{^}}[[#%.8x,LHEX]]

1D // CHECK-NEXT: {{^}}[[#%.1X,UHEX+16]]

0000000D // CHECK-NEXT: {{^}}[[#%.8X,UHEX]]

-30 // CHECK-NEXT: {{^}}[[#%.1d,SIGN]]

-00000030 // CHECK-NEXT: {{^}}[[#%.8d,SIGN]]
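The expected strings in the block above follow the printf-style precision rule: the digits are left-padded with zeros up to the precision, any sign goes in front of the padding, and a value that already has enough digits is left alone. The following is a minimal standalone sketch of that rule, not FileCheck's implementation. The helper name `formatWithPrecision` and its parameters are hypothetical, and the variable values (UNSI = 11, LHEX = 12, UHEX = 13, SIGN = -30) are inferred from the expected strings in this test.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical helper (not part of FileCheck): renders Value in the given
// base using at least Precision digits, with the sign before the padding.
std::string formatWithPrecision(int64_t Value, unsigned Precision,
                                unsigned Base = 10, bool UpperCase = false) {
  assert((Base == 10 || Base == 16) && "only decimal and hex are sketched");
  const char *Digits = UpperCase ? "0123456789ABCDEF" : "0123456789abcdef";

  // Work on the magnitude as an unsigned value so INT64_MIN does not overflow.
  bool Negative = Value < 0;
  uint64_t Magnitude = Negative ? 0 - static_cast<uint64_t>(Value)
                                : static_cast<uint64_t>(Value);

  // Emit digits least-significant first; the string is reversed at the end.
  std::string Str;
  do {
    Str.push_back(Digits[Magnitude % Base]);
    Magnitude /= Base;
  } while (Magnitude != 0);

  // Pad with zeros only when there are fewer digits than the precision.
  if (Str.size() < Precision)
    Str.append(Precision - Str.size(), '0');
  if (Negative)
    Str.push_back('-');
  std::reverse(Str.begin(), Str.end());
  return Str;
}
```

Under these assumptions, `formatWithPrecision(11, 8)` gives `00000011` and `formatWithPrecision(-30, 8)` gives `-00000030`, matching the `%.8u` and `%.8d` lines above, while a precision of 1 leaves `1c` and `-30` unpadded.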

; Numeric expressions with explicit matching format, precision and wrong

; padding, and default matching rule using variables defined on other lines

; without spaces.

RUN: FileCheck --check-prefixes CHECK,INVALID-PADDING-EXPL-USE --input-file %s %s

FAIL USE IMPL FMT WITH PREC EXPL MATCH // INVALID-PADDING-EXPL-USE-LABEL: FAIL USE IMPL FMT WITH PREC IMPL MATCH

INVALID UNSI+1: 0000012 // INVALID-PADDING-EXPL-USE-NOT: {{^}}INVALID UNSI+1: [[#%.8u,UNSI+1]]

INVALID UNSI-1: 000000010 // INVALID-PADDING-EXPL-USE-NOT: {{^}}INVALID UNSI-1: [[#%.8u,UNSI-1]]

INVALID LHEX+1: 000000d // INVALID-PADDING-EXPL-USE-NOT: {{^}}INVALID LHEX+1: [[#%.8x,LHEX+1]]

INVALID LHEX-1: 00000000b // INVALID-PADDING-EXPL-USE-NOT: {{^}}INVALID LHEX-1: [[#%.8x,LHEX-1]]

INVALID UHEX+1: 000000E // INVALID-PADDING-EXPL-USE-NOT: {{^}}INVALID UHEX+1: [[#%.8X,UHEX+1]]

INVALID UHEX-1: 00000000C // INVALID-PADDING-EXPL-USE-NOT: {{^}}INVALID UHEX-1: [[#%.8X,UHEX-1]]

INVALID SIGN+1: -0000029 // INVALID-PADDING-EXPL-USE-NOT: {{^}}INVALID SIGN+1: [[#%.8d,SIGN+1]]

INVALID SIGN-1: -000000031 // INVALID-PADDING-EXPL-USE-NOT: {{^}}INVALID SIGN-1: [[#%.8d,SIGN-1]]

; Numeric expressions with implicit matching format and default matching rule

; using variables defined on other lines.

USE IMPL FMT IMPL MATCH // CHECK-LABEL: USE IMPL FMT IMPL MATCH

11 // CHECK-NEXT: {{^}}[[#UNSI]]

12 // CHECK-NEXT: {{^}}[[#UNSI+1]]

10 // CHECK-NEXT: {{^}}[[#UNSI-1]]

99 // CHECK-NEXT: {{^}}[[#max(UNSI,99)]]

7 // CHECK-NEXT: {{^}}[[#min(UNSI,7)]]

c // CHECK-NEXT: {{^}}[[#LHEX]]

d // CHECK-NEXT: {{^}}[[#LHEX+1]]

b // CHECK-NEXT: {{^}}[[#LHEX-1]]

1a // CHECK-NEXT: {{^}}[[#LHEX+0xe]]

1a // CHECK-NEXT: {{^}}[[#LHEX+0xE]]

ff // CHECK-NEXT: {{^}}[[#max(LHEX,255)]]

a // CHECK-NEXT: {{^}}[[#min(LHEX,10)]]

D // CHECK-NEXT: {{^}}[[#UHEX]]

E // CHECK-NEXT: {{^}}[[#UHEX+1]]

C // CHECK-NEXT: {{^}}[[#UHEX-1]]

1B // CHECK-NEXT: {{^}}[[#UHEX+0xe]]

1B // CHECK-NEXT: {{^}}[[#UHEX+0xE]]

FF // CHECK-NEXT: {{^}}[[#max(UHEX,255)]]

A // CHECK-NEXT: {{^}}[[#min(UHEX,10)]]

-30 // CHECK-NEXT: {{^}}[[#SIGN]]

-29 // CHECK-NEXT: {{^}}[[#SIGN+1]]

-31 // CHECK-NEXT: {{^}}[[#SIGN-1]]

; Numeric expressions with implicit matching format, precision, and default

; matching rule using variables defined on other lines.

USE IMPL FMT WITH PREC IMPL MATCH // CHECK-LABEL: USE IMPL FMT WITH PREC IMPL MATCH

00000023 // CHECK-NEXT: {{^}}[[#PADDED_UNSI+1]]

323232324 // CHECK-NEXT: {{^}}[[#PADDED_UNSI2+1]]

00000019 // CHECK-NEXT: {{^}}[[#PADDED_UNSI3+1]]

181818182 // CHECK-NEXT: {{^}}[[#PADDED_UNSI4+1]]

00000010 // CHECK-NEXT: {{^}}[[#PADDED_LHEX+1]]

1000000000 // CHECK-NEXT: {{^}}[[#PADDED_LHEX2+1]]

0000000F // CHECK-NEXT: {{^}}[[#PADDED_UHEX+1]]

EEEEEEEEF // CHECK-NEXT: {{^}}[[#PADDED_UHEX2+1]]

-00000054 // CHECK-NEXT: {{^}}[[#PADDED_SIGN+1]]

-555555554 // CHECK-NEXT: {{^}}[[#PADDED_SIGN2+1]]

; Numeric expression with implicit matching format, precision and wrong amount

; of padding, and default matching rule using variables defined on other lines.

RUN: FileCheck --check-prefixes CHECK,INVALID-PADDING-IMPL-USE --input-file %s %s

FAIL USE IMPL FMT WITH PREC IMPL MATCH // INVALID-PADDING-IMPL-USE-LABEL: FAIL USE IMPL FMT WITH PREC IMPL MATCH

INVALID PADDED_UNSI+1: 0000023 // INVALID-PADDING-IMPL-USE-NOT: {{^}}INVALID PADDED_UNSI+1: [[#PADDED_UNSI+1]]

INVALID PADDED_UNSI-1: 000000021 // INVALID-PADDING-IMPL-USE-NOT: {{^}}INVALID PADDED_UNSI-1: [[#PADDED_UNSI-1]]

INVALID PADDED_UNSI3+1: 0000019 // INVALID-PADDING-IMPL-USE-NOT: {{^}}INVALID PADDED_UNSI3+1: [[#PADDED_UNSI3+1]]

INVALID PADDED_UNSI3-1: 000000017 // INVALID-PADDING-IMPL-USE-NOT: {{^}}INVALID PADDED_UNSI3-1: [[#PADDED_UNSI3-1]]

INVALID PADDED_LHEX+1: 0000010 // INVALID-PADDING-IMPL-USE-NOT: {{^}}INVALID PADDED_LHEX+1: [[#PADDED_LHEX+1]]

INVALID PADDED_LHEX-1: 00000000e // INVALID-PADDING-IMPL-USE-NOT: {{^}}INVALID PADDED_LHEX-1: [[#PADDED_LHEX-1]]

INVALID PADDED_UHEX+1: 000000F // INVALID-PADDING-IMPL-USE-NOT: {{^}}INVALID PADDED_UHEX+1: [[#PADDED_UHEX+1]]

INVALID PADDED_UHEX-1: 00000000D // INVALID-PADDING-IMPL-USE-NOT: {{^}}INVALID PADDED_UHEX-1: [[#PADDED_UHEX-1]]

INVALID PADDED_SIGN+1: -0000054 // INVALID-PADDING-IMPL-USE-NOT: {{^}}INVALID PADDED_SIGN+1: [[#PADDED_SIGN+1]]

INVALID PADDED_SIGN-1: -000000056 // INVALID-PADDING-IMPL-USE-NOT: {{^}}INVALID PADDED_SIGN-1: [[#PADDED_SIGN-1]]

; Numeric expressions using variables defined on other lines and an immediate

; interpreted as an unsigned value.

; Note: 9223372036854775819 = 0x8000000000000000 + 11

USE IMPL FMT IMPL MATCH UNSIGNED IMM

9223372036854775819

CHECK-LABEL: USE IMPL FMT IMPL MATCH UNSIGNED IMM

CHECK-NEXT: [[#UNSI+0x8000000000000000]]
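The arithmetic in the note above can be checked at compile time. This is only a sketch of the calculation: the constant names are made up for illustration, and UNSI is assumed to be 11 as the note states.

```cpp
#include <cstdint>
#include <limits>

// 0x8000000000000000 is 2^63, one more than the largest int64_t value.
constexpr uint64_t HighBit = 0x8000000000000000ULL;
// UNSI is 11 in this test file, so this is the value the CHECK-NEXT expects.
constexpr uint64_t Expected = HighBit + 11;

static_assert(Expected == 9223372036854775819ULL,
              "agrees with the note in the test");
static_assert(Expected >
                  static_cast<uint64_t>(std::numeric_limits<int64_t>::max()),
              "not representable as a signed 64-bit value");
```

Because the sum exceeds the signed 64-bit maximum, the check only passes if the immediate is interpreted as an unsigned value, which is what this test case exercises.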


llvm/unittests/Support/FileCheckTest.cpp

constexpr uint64_t MaxUint64 = std::numeric_limits<uint64_t>::max();		constexpr uint64_t MaxUint64 = std::numeric_limits<uint64_t>::max();
constexpr int64_t MaxInt64 = std::numeric_limits<int64_t>::max();		constexpr int64_t MaxInt64 = std::numeric_limits<int64_t>::max();
constexpr int64_t MinInt64 = std::numeric_limits<int64_t>::min();		constexpr int64_t MinInt64 = std::numeric_limits<int64_t>::min();
constexpr uint64_t AbsoluteMinInt64 =		constexpr uint64_t AbsoluteMinInt64 =
static_cast<uint64_t>(-(MinInt64 + 1)) + 1;		static_cast<uint64_t>(-(MinInt64 + 1)) + 1;
constexpr uint64_t AbsoluteMaxInt64 = static_cast<uint64_t>(MaxInt64);		constexpr uint64_t AbsoluteMaxInt64 = static_cast<uint64_t>(MaxInt64);

struct ExpressionFormatParameterisedFixture		struct ExpressionFormatParameterisedFixture
: public ::testing::TestWithParam<ExpressionFormat::Kind> {		: public ::testing::TestWithParam<
		std::pair<ExpressionFormat::Kind, unsigned>> {
		unsigned Precision;
bool Signed;		bool Signed;
bool AllowHex;		bool AllowHex;
bool AllowUpperHex;		bool AllowUpperHex;
ExpressionFormat Format;		ExpressionFormat Format;
Regex WildcardRegex;		Regex WildcardRegex;

StringRef TenStr;		StringRef TenStr;
StringRef FifteenStr;		StringRef FifteenStr;
std::string MaxUint64Str;		std::string MaxUint64Str;
std::string MaxInt64Str;		std::string MaxInt64Str;
std::string MinInt64Str;		std::string MinInt64Str;
StringRef FirstInvalidCharDigits;		StringRef FirstInvalidCharDigits;
StringRef AcceptedHexOnlyDigits;		StringRef AcceptedHexOnlyDigits;
StringRef RefusedHexOnlyDigits;		StringRef RefusedHexOnlyDigits;

SourceMgr SM;		SourceMgr SM;

void SetUp() override {		void SetUp() override {
ExpressionFormat::Kind Kind = GetParam();		ExpressionFormat::Kind Kind;
		std::tie(Kind, Precision) = GetParam();
AllowHex = Kind == ExpressionFormat::Kind::HexLower ||		AllowHex = Kind == ExpressionFormat::Kind::HexLower ||
Kind == ExpressionFormat::Kind::HexUpper;		Kind == ExpressionFormat::Kind::HexUpper;
AllowUpperHex = Kind == ExpressionFormat::Kind::HexUpper;		AllowUpperHex = Kind == ExpressionFormat::Kind::HexUpper;
Signed = Kind == ExpressionFormat::Kind::Signed;		Signed = Kind == ExpressionFormat::Kind::Signed;
Format = ExpressionFormat(Kind);		Format = ExpressionFormat(Kind, Precision);

if (!AllowHex) {		if (!AllowHex) {
MaxUint64Str = std::to_string(MaxUint64);		MaxUint64Str = std::to_string(MaxUint64);
MaxInt64Str = std::to_string(MaxInt64);		MaxInt64Str = std::to_string(MaxInt64);
MinInt64Str = std::to_string(MinInt64);		MinInt64Str = std::to_string(MinInt64);
TenStr = "10";		TenStr = "10";
FifteenStr = "15";		FifteenStr = "15";
FirstInvalidCharDigits = "aA";		FirstInvalidCharDigits = "aA";
AcceptedHexOnlyDigits = RefusedHexOnlyDigits = "N/A";		AcceptedHexOnlyDigits = RefusedHexOnlyDigits = "N/A";
return;		return;
}		}

MaxUint64Str = AllowUpperHex ? "FFFFFFFFFFFFFFFF" : "ffffffffffffffff";		MaxUint64Str = AllowUpperHex ? "FFFFFFFFFFFFFFFF" : "ffffffffffffffff";
MaxInt64Str = AllowUpperHex ? "7FFFFFFFFFFFFFFF" : "7fffffffffffffff";		MaxInt64Str = AllowUpperHex ? "7FFFFFFFFFFFFFFF" : "7fffffffffffffff";
TenStr = AllowUpperHex ? "A" : "a";		TenStr = AllowUpperHex ? "A" : "a";
FifteenStr = AllowUpperHex ? "F" : "f";		FifteenStr = AllowUpperHex ? "F" : "f";
AcceptedHexOnlyDigits = AllowUpperHex ? "ABCDEF" : "abcdef";		AcceptedHexOnlyDigits = AllowUpperHex ? "ABCDEF" : "abcdef";
RefusedHexOnlyDigits = AllowUpperHex ? "abcdef" : "ABCDEF";		RefusedHexOnlyDigits = AllowUpperHex ? "abcdef" : "ABCDEF";
MinInt64Str = "N/A";		MinInt64Str = "N/A";
FirstInvalidCharDigits = "gG";		FirstInvalidCharDigits = "gG";
}		}

void checkWildcardRegexMatch(StringRef Input) {		void checkWildcardRegexMatch(StringRef Input,
		unsigned TrailExtendTo = 0) const {
SmallVector<StringRef, 4> Matches;		SmallVector<StringRef, 4> Matches;
ASSERT_TRUE(WildcardRegex.match(Input, &Matches))		std::string ExtendedInput = Input.str();
		jhenderson (inline comment, Done): I think you could simplify this code by starting with `std::string ExtendedInput = Input;` and then just using `ExtendedInput` in the checks below.
<< "Wildcard regex does not match " << Input;		if (TrailExtendTo > Input.size()) {
EXPECT_EQ(Matches[0], Input);		ExtendedInput.append(TrailExtendTo - Input.size(), Input[0]);
		}
		ASSERT_TRUE(WildcardRegex.match(ExtendedInput, &Matches))
		<< "Wildcard regex does not match " << ExtendedInput;
		EXPECT_EQ(Matches[0], ExtendedInput);
}		}

void checkWildcardRegexMatchFailure(StringRef Input) {		void checkWildcardRegexMatchFailure(StringRef Input) const {
EXPECT_FALSE(WildcardRegex.match(Input));		EXPECT_FALSE(WildcardRegex.match(Input));
}		}
		jhenderson (inline comment, Done): It sounds to me like this is really just two completely different functions. I'd recommend splitting.

		void checkWildcardRegexCharMatchFailure(StringRef Chars) const {
		for (auto C : Chars)
		EXPECT_FALSE(WildcardRegex.match(StringRef(&C, 1)));
		}

		std::string padWithLeadingZeros(StringRef NumStr) const {
		bool Negative = NumStr.startswith("-");
		if (NumStr.size() - unsigned(Negative) >= Precision)
		return NumStr.str();

		grimar (inline comment, Done): This will fail if `NumStr` is empty. Is it OK (I guess so), though perhaps a bit cleaner would be to use `StringRef::startswith`.
		std::string PaddedStr;
		if (Negative) {
		PaddedStr = "-";
		NumStr = NumStr.drop_front();
		}
		PaddedStr.append(Precision - NumStr.size(), '0');
		grimar (inline comment, Done): `PaddedStr = "-";`
		PaddedStr.append(NumStr.str());
		return PaddedStr;
		}

template <class T> void checkMatchingString(T Val, StringRef ExpectedStr) {		template <class T> void checkMatchingString(T Val, StringRef ExpectedStr) {
Expected<std::string> MatchingString =		Expected<std::string> MatchingString =
Format.getMatchingString(ExpressionValue(Val));		Format.getMatchingString(ExpressionValue(Val));
ASSERT_THAT_EXPECTED(MatchingString, Succeeded())		ASSERT_THAT_EXPECTED(MatchingString, Succeeded())
<< "No matching string for " << Val;		<< "No matching string for " << Val;
EXPECT_EQ(*MatchingString, ExpectedStr);		EXPECT_EQ(*MatchingString, ExpectedStr);
}		}

[... 28 lines hidden ...]	void checkValueFromStringReprFailure(StringRef Str) {
StringRef OverflowErrorStr = "unable to represent numeric value";		StringRef OverflowErrorStr = "unable to represent numeric value";
Expected<ExpressionValue> ResultValue = getValueFromStringReprFailure(Str);		Expected<ExpressionValue> ResultValue = getValueFromStringReprFailure(Str);
expectDiagnosticError(OverflowErrorStr, ResultValue.takeError());		expectDiagnosticError(OverflowErrorStr, ResultValue.takeError());
}		}
};		};

TEST_P(ExpressionFormatParameterisedFixture, FormatGetWildcardRegex) {		TEST_P(ExpressionFormatParameterisedFixture, FormatGetWildcardRegex) {
// Wildcard regex is valid.		// Wildcard regex is valid.
Expected<StringRef> WildcardPattern = Format.getWildcardRegex();		Expected<std::string> WildcardPattern = Format.getWildcardRegex();
ASSERT_THAT_EXPECTED(WildcardPattern, Succeeded());		ASSERT_THAT_EXPECTED(WildcardPattern, Succeeded());
WildcardRegex = Regex((Twine("^") + *WildcardPattern).str());		WildcardRegex = Regex((Twine("^") + *WildcardPattern + "$").str());
ASSERT_TRUE(WildcardRegex.isValid());		ASSERT_TRUE(WildcardRegex.isValid());

// Does not match empty string.		// Does not match empty string.
checkWildcardRegexMatchFailure("");		checkWildcardRegexMatchFailure("");

// Matches all decimal digits and matches several of them.		// Matches all decimal digits and matches several of them.
checkWildcardRegexMatch("0123456789");		StringRef LongNumber = "12345678901234567890";
		checkWildcardRegexMatch(LongNumber);

// Matches negative digits.		// Matches negative digits.
		LongNumber = "-12345678901234567890";
if (Signed)		if (Signed)
checkWildcardRegexMatch("-42");		checkWildcardRegexMatch(LongNumber);
else		else
checkWildcardRegexMatchFailure("-42");		checkWildcardRegexMatchFailure(LongNumber);

// Check non digits or digits with wrong casing are not matched.		// Check non digits or digits with wrong casing are not matched.
if (AllowHex) {		if (AllowHex) {
checkWildcardRegexMatch(AcceptedHexOnlyDigits);		checkWildcardRegexMatch(AcceptedHexOnlyDigits, 16);
checkWildcardRegexMatchFailure(RefusedHexOnlyDigits);		checkWildcardRegexCharMatchFailure(RefusedHexOnlyDigits);
}		}
checkWildcardRegexMatchFailure(FirstInvalidCharDigits);		checkWildcardRegexCharMatchFailure(FirstInvalidCharDigits);

		// Check leading zeros are only accepted if number of digits is less than the
		// precision.
		LongNumber = "01234567890123456789";
		if (Precision) {
		checkWildcardRegexMatch(LongNumber.take_front(Precision));
		checkWildcardRegexMatchFailure(LongNumber.take_front(Precision - 1));
		if (Precision < LongNumber.size())
		checkWildcardRegexMatchFailure(LongNumber.take_front(Precision + 1));
		} else
		checkWildcardRegexMatch(LongNumber);
}		}
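The leading-zero checks in the test above pin down what a wildcard match may accept when a precision is given: exactly `Precision` digits, or more digits provided the number does not start with a zero. One way to express that rule as a regular expression is sketched below with `std::regex`. The function name is hypothetical, only the unsigned decimal case is covered, and the pattern is an illustration of the rule rather than a statement of what `getWildcardRegex()` returns.

```cpp
#include <cassert>
#include <regex>
#include <string>

// Hypothetical helper: true if Text is an acceptable unsigned decimal
// rendering under a non-zero precision.
bool matchesUnsignedWithPrecision(const std::string &Text,
                                  unsigned Precision) {
  assert(Precision > 0 && "sketch only covers an explicit precision");
  // An optional run of extra digits that may not start with '0', followed
  // by exactly Precision digits. regex_match requires the whole string to
  // match, like the "^...$" anchoring used in the unit test.
  const std::regex Pattern("([1-9][0-9]*)?[0-9]{" +
                           std::to_string(Precision) + "}");
  return std::regex_match(Text, Pattern);
}
```

With a precision of 8, this accepts `01234567` and `323232324` but rejects `0123456` (too few digits) and `012345678` (more digits than the precision with a leading zero), mirroring the `take_front(Precision)`, `take_front(Precision - 1)` and `take_front(Precision + 1)` cases.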

TEST_P(ExpressionFormatParameterisedFixture, FormatGetMatchingString) {		TEST_P(ExpressionFormatParameterisedFixture, FormatGetMatchingString) {
checkMatchingString(0, "0");		checkMatchingString(0, padWithLeadingZeros("0"));
checkMatchingString(9, "9");		checkMatchingString(9, padWithLeadingZeros("9"));

if (Signed) {		if (Signed) {
checkMatchingString(-5, "-5");		checkMatchingString(-5, padWithLeadingZeros("-5"));
checkMatchingStringFailure(MaxUint64);		checkMatchingStringFailure(MaxUint64);
checkMatchingString(MaxInt64, MaxInt64Str);		checkMatchingString(MaxInt64, padWithLeadingZeros(MaxInt64Str));
checkMatchingString(MinInt64, MinInt64Str);		checkMatchingString(MinInt64, padWithLeadingZeros(MinInt64Str));
} else {		} else {
checkMatchingStringFailure(-5);		checkMatchingStringFailure(-5);
checkMatchingString(MaxUint64, MaxUint64Str);		checkMatchingString(MaxUint64, padWithLeadingZeros(MaxUint64Str));
checkMatchingString(MaxInt64, MaxInt64Str);		checkMatchingString(MaxInt64, padWithLeadingZeros(MaxInt64Str));
checkMatchingStringFailure(MinInt64);		checkMatchingStringFailure(MinInt64);
}		}

checkMatchingString(10, TenStr);		checkMatchingString(10, padWithLeadingZeros(TenStr));
checkMatchingString(15, FifteenStr);		checkMatchingString(15, padWithLeadingZeros(FifteenStr));
}		}

TEST_P(ExpressionFormatParameterisedFixture, FormatValueFromStringRepr) {		TEST_P(ExpressionFormatParameterisedFixture, FormatValueFromStringRepr) {
checkValueFromStringRepr("0", 0);		checkValueFromStringRepr("0", 0);
checkValueFromStringRepr("9", 9);		checkValueFromStringRepr("9", 9);

if (Signed) {		if (Signed) {
checkValueFromStringRepr("-5", -5);		checkValueFromStringRepr("-5", -5);
[... 10 lines hidden ...]	TEST_P(ExpressionFormatParameterisedFixture, FormatValueFromStringRepr) {
// StringRef's getAsInteger() which does not allow to restrict casing.		// StringRef's getAsInteger() which does not allow to restrict casing.
checkValueFromStringReprFailure("G");		checkValueFromStringReprFailure("G");
}		}

TEST_P(ExpressionFormatParameterisedFixture, FormatBoolOperator) {		TEST_P(ExpressionFormatParameterisedFixture, FormatBoolOperator) {
EXPECT_TRUE(bool(Format));		EXPECT_TRUE(bool(Format));
}		}

INSTANTIATE_TEST_CASE_P(AllowedExplicitExpressionFormat,		INSTANTIATE_TEST_CASE_P(
ExpressionFormatParameterisedFixture,		AllowedExplicitExpressionFormat, ExpressionFormatParameterisedFixture,
::testing::Values(ExpressionFormat::Kind::Unsigned,		::testing::Values(std::make_pair(ExpressionFormat::Kind::Unsigned, 0),
ExpressionFormat::Kind::Signed,		std::make_pair(ExpressionFormat::Kind::Signed, 0),
ExpressionFormat::Kind::HexLower,		std::make_pair(ExpressionFormat::Kind::HexLower, 0),
ExpressionFormat::Kind::HexUpper), );		std::make_pair(ExpressionFormat::Kind::HexUpper, 0),

		std::make_pair(ExpressionFormat::Kind::Unsigned, 1),
		std::make_pair(ExpressionFormat::Kind::Signed, 1),
		std::make_pair(ExpressionFormat::Kind::HexLower, 1),
		std::make_pair(ExpressionFormat::Kind::HexUpper, 1),

		std::make_pair(ExpressionFormat::Kind::Unsigned, 16),
		std::make_pair(ExpressionFormat::Kind::Signed, 16),
		std::make_pair(ExpressionFormat::Kind::HexLower, 16),
		std::make_pair(ExpressionFormat::Kind::HexUpper, 16),

		std::make_pair(ExpressionFormat::Kind::Unsigned, 20),
		std::make_pair(ExpressionFormat::Kind::Signed, 20)), );

TEST_F(FileCheckTest, NoFormatProperties) {		TEST_F(FileCheckTest, NoFormatProperties) {
ExpressionFormat NoFormat(ExpressionFormat::Kind::NoFormat);		ExpressionFormat NoFormat(ExpressionFormat::Kind::NoFormat);
expectError<StringError>("trying to match value with invalid format",		expectError<StringError>("trying to match value with invalid format",
NoFormat.getWildcardRegex().takeError());		NoFormat.getWildcardRegex().takeError());
expectError<StringError>(		expectError<StringError>(
"trying to match value with invalid format",		"trying to match value with invalid format",
NoFormat.getMatchingString(ExpressionValue(18u)).takeError());		NoFormat.getMatchingString(ExpressionValue(18u)).takeError());
[... 706 lines hidden ...]	TEST_F(FileCheckTest, ParseNumericSubstitutionBlock) {
EXPECT_THAT_EXPECTED(Tester.parseSubst("VAR3: "), Succeeded());		EXPECT_THAT_EXPECTED(Tester.parseSubst("VAR3: "), Succeeded());

// Acceptable variable definition with format specifier. Use parsePattern for		// Acceptable variable definition with format specifier. Use parsePattern for
// variables whose definition needs to be visible for later checks.		// variables whose definition needs to be visible for later checks.
EXPECT_FALSE(Tester.parsePattern("[[#%u, VAR_UNSIGNED:]]"));		EXPECT_FALSE(Tester.parsePattern("[[#%u, VAR_UNSIGNED:]]"));
EXPECT_FALSE(Tester.parsePattern("[[#%x, VAR_LOWER_HEX:]]"));		EXPECT_FALSE(Tester.parsePattern("[[#%x, VAR_LOWER_HEX:]]"));
EXPECT_THAT_EXPECTED(Tester.parseSubst("%X, VAR_UPPER_HEX:"), Succeeded());		EXPECT_THAT_EXPECTED(Tester.parseSubst("%X, VAR_UPPER_HEX:"), Succeeded());

		// Acceptable variable definition with precision specifier.
		EXPECT_FALSE(Tester.parsePattern("[[#%.8X, PADDED_ADDR:]]"));
		EXPECT_FALSE(Tester.parsePattern("[[#%.8, PADDED_NUM:]]"));

// Acceptable variable definition from a numeric expression.		// Acceptable variable definition from a numeric expression.
EXPECT_THAT_EXPECTED(Tester.parseSubst("FOOBAR: FOO+1"), Succeeded());		EXPECT_THAT_EXPECTED(Tester.parseSubst("FOOBAR: FOO+1"), Succeeded());

// Numeric expression. Switch to next line to make above valid definition		// Numeric expression. Switch to next line to make above valid definition
// available in expressions.		// available in expressions.
Tester.initNextPattern();		Tester.initNextPattern();

// Invalid variable name.		// Invalid variable name.
[... 93 lines hidden ...]	expectDiagnosticError(
Tester.parseSubst("@LINE+0xC", /*LegacyLineExpr=*/true).takeError());		Tester.parseSubst("@LINE+0xC", /*LegacyLineExpr=*/true).takeError());

// Valid expression with format specifier.		// Valid expression with format specifier.
EXPECT_THAT_EXPECTED(Tester.parseSubst("%u, FOO"), Succeeded());		EXPECT_THAT_EXPECTED(Tester.parseSubst("%u, FOO"), Succeeded());
EXPECT_THAT_EXPECTED(Tester.parseSubst("%d, FOO"), Succeeded());		EXPECT_THAT_EXPECTED(Tester.parseSubst("%d, FOO"), Succeeded());
EXPECT_THAT_EXPECTED(Tester.parseSubst("%x, FOO"), Succeeded());		EXPECT_THAT_EXPECTED(Tester.parseSubst("%x, FOO"), Succeeded());
EXPECT_THAT_EXPECTED(Tester.parseSubst("%X, FOO"), Succeeded());		EXPECT_THAT_EXPECTED(Tester.parseSubst("%X, FOO"), Succeeded());

		// Valid expression with precision specifier.
		EXPECT_THAT_EXPECTED(Tester.parseSubst("%.8u, FOO"), Succeeded());
		EXPECT_THAT_EXPECTED(Tester.parseSubst("%.8, FOO"), Succeeded());

// Valid legacy @LINE expression.		// Valid legacy @LINE expression.
EXPECT_THAT_EXPECTED(Tester.parseSubst("@LINE+2", /*IsLegacyNumExpr=*/true),		EXPECT_THAT_EXPECTED(Tester.parseSubst("@LINE+2", /*IsLegacyNumExpr=*/true),
Succeeded());		Succeeded());

// Invalid legacy @LINE expression with more than 2 operands.		// Invalid legacy @LINE expression with more than 2 operands.
expectDiagnosticError(		expectDiagnosticError(
"unexpected characters at end of expression '+@LINE'",		"unexpected characters at end of expression '+@LINE'",
Tester.parseSubst("@LINE+2+@LINE", /*IsLegacyNumExpr=*/true).takeError());		Tester.parseSubst("@LINE+2+@LINE", /*IsLegacyNumExpr=*/true).takeError());


Revision Contents

Diff 288863
