Page MenuHomePhabricator

FileCheck [9/12]: Add support for matching formats
Needs ReviewPublic

Authored by thopre on Apr 7 2019, 4:22 PM.

Details

Summary

This patch is part of a patch series to add support for FileCheck
numeric expressions. This specific patch adds support for selecting a
matching format to match a numeric value against (ie. decimal, hex lower
case letters or hex upper case letters).

This commit allows to select what format a numeric value should be
matched against. The following formats are supported: decimal value,
lower case hex value and upper case hex value. Matching formats impact
both the format of numeric value to be matched as well as the format of
accepted numbers in a definition with empty numeric expression
constraint.

Default for absence of format is decimal value unless the numeric
expression constraint is non null and use a variable in which case the
format is the one used to define that variable. Conclict of format in
case of several variable being used is diagnosed and forces the user to
select a matching format explicitely.

This commit also enables immediates in numeric expressions to be in any
radix known to StringRef's GetAsInteger method, except for legacy
numeric expressions (ie [[@LINE+<offset>]] which only support decimal
immediates.

Copyright:

  • Linaro (changes up to diff 183612 of revision D55940)
  • GraphCore (changes in later versions of revision D55940 and in new revision created off D55940)

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
thopre updated this revision to Diff 199140.May 11 2019, 7:15 AM

Rebase on top of latest changes

thopre updated this revision to Diff 199238.May 13 2019, 3:58 AM

Rebase on top of latest changes

thopre updated this revision to Diff 199398.May 14 2019, 3:58 AM

Rebase on top of latest changes

rnk removed a reviewer: rnk.May 14 2019, 11:29 AM
rnk removed a subscriber: rnk.
thopre updated this revision to Diff 200475.May 21 2019, 5:56 AM

rebase on top of latest changes

thopre updated this revision to Diff 200552.May 21 2019, 11:15 AM

rebase on top of latest changes

thopre updated this revision to Diff 201984.May 29 2019, 10:52 AM

Rebase on top of latest changes in patch series

thopre updated this revision to Diff 203211.Jun 5 2019, 11:21 AM

Rebase on top of latest changes in the series

grimar added inline comments.Jun 17 2019, 5:20 AM
llvm/lib/Support/FileCheck.cpp
40

You can simplify this:

if (!Hex)
  return StringRef("[[:digit:]]+");
if (Cap)
  return StringRef("[[:digit:]A-F]+");
return StringRef("[[:digit:]a-f]+");
48

You don't need else after return:

if (Hex)
  return utohexstr(Value, !Cap);
return utostr(Value);

(and in other methods)

88

return NumExpr ? NumExpr->getEffectiveFmt() : FmtNone;?

198

Just return !S.consume_front(SkipStr) && !Optional?

llvm/unittests/Support/FileCheckTest.cpp
399

There is a good practive to avoid using auto in case the type isn't obvious.

thopre updated this revision to Diff 205112.Jun 17 2019, 10:21 AM
thopre marked 5 inline comments as done.
  • Rebase on top of latest changes in patch series
  • Address outstanding review comments
thopre added inline comments.Jun 17 2019, 2:15 PM
llvm/unittests/Support/FileCheckTest.cpp
399

But isn't it obvious in this case since we see the constructor call?

grimar added inline comments.Jun 18 2019, 1:22 AM
llvm/unittests/Support/FileCheckTest.cpp
399

May be. I think I didn't realize it is a constructor call and not a helper function call when saw this.

MaskRay added inline comments.
llvm/lib/Support/FileCheck.cpp
358

/*Optional*/ is usually written before the argument:

/*Optional=*/false

llvm/unittests/Support/FileCheckTest.cpp
74

FileCheckNumericVariable FoobarVar(...) or FileCheckNumericVariable FoobarVar{...}

MaskRay added inline comments.Jun 18 2019, 1:35 AM
llvm/lib/Support/FileCheck.cpp
61

this->Fmt = Fmt.Valid ? Fmt : FmtUnsigned;

llvm/unittests/Support/FileCheckTest.cpp
99–102

FileCheckNumExpr DefNumExpr(..) or FileCheckNumExpr DefNumExpr{..}

thopre updated this revision to Diff 205539.Jun 19 2019, 3:48 AM
thopre marked 5 inline comments as done.
  • Rebase on top of latest changes in patch series
  • Address comments except initialization format
thopre updated this revision to Diff 205619.Jun 19 2019, 9:20 AM

Add the fix mentioned in previous diff upload

thopre updated this revision to Diff 205627.Jun 19 2019, 10:01 AM

Fix some return -> \returns

thopre added inline comments.Jun 20 2019, 2:43 AM
llvm/unittests/Support/FileCheckTest.cpp
74

Is that the expected usage in LLVM? I've already added quite a lot of occurences in existing code, that would need to be taken care in a separate patch IMO.

thopre updated this revision to Diff 205764.Jun 20 2019, 2:45 AM
  • Remove Optional parameter of stripFront since it's only used from patch 11/12 onwards
  • Rebase on latest changes in patch series
arichardson added inline comments.Jun 20 2019, 3:13 AM
llvm/include/llvm/Support/FileCheck.h
51

I would call this UpperCase or Capitalize. If it is only ever used for hex values (I'm not sure if there are plans for e.g. uppercasing string values later on) storing an enumeration with HexUpper and HexLower, Unsigned values could also make sense.

I you use the enum approach, it should also be possible to omit the Valid bitfield and use an enum value of Implicit or NoFormat instead.

I haven't managed to review the full patch yet, but maybe it is enough to store just a single enumeration value in this class that also contains the conflict value since that seems mutually exclusive with all other boolean flags.

MaskRay added inline comments.Jul 14 2019, 3:13 AM
llvm/unittests/Support/FileCheckTest.cpp
74

Yes. In addition, I think Ctor A = Ctor(arg1, arg2); is a very uncommon pattern in C++.

MaskRay added inline comments.Jul 14 2019, 3:49 AM
llvm/include/llvm/Support/FileCheck.h
51

Is there a specific reason that the 4 members must be bit fields, not regular bools? I don't think that will waste space.

104
virtual FileCheckExpressionFormat getImplicitFormat() const {
  return FormatNone;
}

and delete the unnecessary override below.

172

Why isn't this getFormat() const? Or rename the member to EffectiveFormat;.

I think it is probably clearer to merge the comment with FileCheckExpressionFormat Format; and delete the comment here. For a class with 2 getters, I think the definition is probably too long.

212

This comment is redundant. I know you added some to other getters in previous patches, but I think they can also be deleted.

llvm/lib/Support/FileCheck.cpp
29

Consider defining this inline. The definition is trivial and FileCheck.h is only included in two .cpp files.

54

I think if (StrVal.getAsInteger(Hex ? 16 : 10, Value)) is just as clear. (Or (Hex ? 16 : 10). The empty line above can also be deleted.

131

LeftFormat.Valid ? LeftFormat : RightFormat;

194

Delete the helper and inline it into the call sites.

287

I think if (!Expr.consumeInteger(AO == Literal ? 10 : 0, LiteralValue)) is just as clear as the current one.

arichardson added inline comments.Jul 24 2019, 2:13 PM
llvm/lib/Support/FileCheck.cpp
35

Maybe [0-9]+ instead of "[[:digit:]]+"?

92

if (Expression && Expression->getAST()) is a bit shorter and just as readable.

jhenderson added inline comments.Jul 25 2019, 6:29 AM
llvm/docs/CommandGuide/FileCheck.rst
599

Perhaps worth stating "61680" in hex, rather than decimal.

601

I'm getting confused by the syntax here not matching the syntax above for defining a numeric variable. Aren't the two essentially the same syntax, just with different parameters missing? If I'm not mistaken they can be unified to [[#%<fmtspc>,<NUMVAR>:<expr>]] where <expr> says what this matches against (if specified), <NUMVAR> says what numeric variable to store the result in (if specified) and fmtspec defines the format of the expression (if any) and that stored in the variable (if any).

Aside: does #%X match a hex number, but not store it?

607

"constraint, if any, and defaults to". The later sentence handles the "all the same" bit by talking of conflicts.

llvm/include/llvm/Support/FileCheck.h
51

I would call this UpperCase or Capitalize. If it is only ever used for hex values (I'm not sure if there are plans for e.g. uppercasing string values later on) storing an enumeration with HexUpper and HexLower, Unsigned values could also make sense.

I agree that an enum seems like the right way forward here, possibly with Conflict and Unspecified as other values in that enum.

67

"regexp"? I assume that should be "regular expression pattern"?

84

Is there going to be an ODR violation here, due to these being instantiated in the header?

152

Why shared_ptr? Shouldn't an expression be the sole owner of its AST?

154

i.e -> i.e.

161

ie. -> i.e.

Perhaps Format should take a default argument for an unsigned decimal?

The last clause ("set it to default one"...) then becomes unnecessary.

170

ie -> i.e.

(i) its explicit format, if any, otherwise (ii) its implicit format, if any, otherwise (iii) the default format.

186

If the expression is empty, Expression points to a FileCheckExpression instance...

264

if the implicit

265

conflict, and FormatNone

llvm/lib/Support/FileCheck.cpp
65

I believe you could do this in the initializer list:

: AST(AST)
, Format(Format.Valid ? Format : FormatUnsigned) {}
92

Readability is in the eye of the beholder. I personally find it more readable to compare against nullptr, because it is then clear that Expression is a pointer, not a boolean.

llvm/test/FileCheck/numeric-defines.txt
4

I feel like you're losing coverage here: where are command-line defined numeric variables tested with an implicit format specifier? I'd have a separate test for format specifiers, that explicitly focus on testing those and nothing else.

llvm/test/FileCheck/numeric-expression.txt
39–43

Do you really need every one of these test cases? It feels like the last one would be enough (and one with no spaces, which you have earlier on). Same goes for other instances like this elsewhere.

47

Should this be USE EXPL FMT IMPL MATCH?

124

Nit: missing trailing full stop.

150

Nit: missing trailing full stop.

The grammar of this statement needs significant improvement. How about "Explicitly specified format can override conflicting implicit formats."

This test case should be after the one that shows that explicit format specifiers override non-conflicting implicit ones, and probably also after a test case showing what happens when they conflict.

165

Nit: missing trailing full stop.

What is a "conversion matching format"?

177

Nit: missing trailing full stop.

191

In one of the other changes, you went to some effort to improve the readability of these using --strict-whitespace, so that the '^' line up with the correct thing. Could you replicate these improvements in this patch too, please?

294

Missing trailing full stop.

295–296

You're being inconsistent here with your '-' versus '--' switch usage. Please standardise to one or the other throughout the test.

302

This message is incomplete - which variables? What were their format specifiers?

llvm/test/FileCheck/string-defines-diagnostics.txt
28–29

Please re-add the second dash, to remain consistent with other tests in this file (why did you change it?)

llvm/unittests/Support/FileCheckTest.cpp
47

FileCheckExpression NumVarExpr(nullptr, FormatUnsigned);

Same kind of comment goes throughout changes in this file.

48–49

Why make_shared and not make_unique? In fact, why is this a smart pointer at all? Why not just create this on the stack, i.e. FileCheckNumericVariable FooVar (1, "FOO", &NumVarExpr);

Same comment goes in all sorts of places elsewhere.