Page MenuHomePhabricator

FileCheck [10/12]: Add support for signed numeric values
ClosedPublic

Authored by thopre on Apr 7 2019, 4:23 PM.

Details

Summary

This patch is part of a patch series to add support for FileCheck
numeric expressions. This specific patch adds support signed numeric
values, thus allowing negative numeric values.

As such, the patch adds a new class to represent a signed or unsigned
value and add the logic for type promotion and type conversion in
numeric expression mixing signed and unsigned values. It also adds
the %d format specifier to represent signed value.

Finally, it also adds underflow and overflow detection when performing a
binary operation.

Copyright:

  • Linaro (changes up to diff 183612 of revision D55940)
  • GraphCore (changes in later versions of revision D55940 and in new revision created off D55940)

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
thopre added inline comments.May 13 2020, 6:25 AM
llvm/lib/Support/FileCheckImpl.h
127 ↗(On Diff #263290)

Casting to int64_t is implementation-defined if Value is > std::numeric_limits<int64_t>::max(). That said current code assumes 2-complement so I've changed for memcpy. Hopefully the compiler can elide that call.

llvm/test/FileCheck/numeric-expression.txt
82–84

Yes, that's the VAR2+72 case.

152

Ah yes indeed, I noticed some bug in the testcase. I forgot to split the fix out in its own patch. Now done in https://reviews.llvm.org/D79820

llvm/unittests/Support/FileCheckTest.cpp
104

Those are ExpressionValue since this test is testing ExpressionValue operator+. I've renamed all such variable to add the suffix "Value"

124

Ditto.

thopre marked 3 inline comments as done.May 13 2020, 6:30 AM
thopre added inline comments.
llvm/test/FileCheck/numeric-expression.txt
87–88

The CHECK-NEXT: #VAR2a is trying to match c which is also found in "// VAR1c". The CHECK-NEXT thus fails because it found c after the previous match on the same line.

llvm/unittests/Support/FileCheckTest.cpp
104

I forgot to send the comments I had written. I hope that answers why I didn't do it.

arichardson added inline comments.May 13 2020, 7:19 AM
llvm/lib/Support/FileCheckImpl.h
127 ↗(On Diff #263290)

Yes it definitely can (even at -O1): https://godbolt.org/z/5jcz9L
I'm still not sure that it's possible that casting gives a different value from memcpy on any architecture that LLVM builds on, but I'm also fine with the memcpy.

thopre updated this revision to Diff 263718.May 13 2020, 7:34 AM
thopre marked 2 inline comments as done.

Fix typo

thopre planned changes to this revision.May 13 2020, 7:37 AM

The current promotion rules prevent to compute VAR-10 when VAR is an unsigned variable whose value is 5 because the result would be negative but the expression should be unsigned. I'll address that by accepting all values as long as they fit in an int64_t or uint64_t and only throw an overflow error when trying to match the value if it doesn't fit with the expression format. I should have done that in the first place.

thopre marked 2 inline comments as done.May 13 2020, 7:56 AM
thopre added inline comments.
llvm/lib/Support/FileCheckImpl.h
127 ↗(On Diff #263290)

architecture -> system, surely the compiler can mess things up as well? It's such a shame that there isn't a better way to achieve this while being perfectly well defined.

jhenderson added inline comments.May 14 2020, 12:36 AM
llvm/test/FileCheck/numeric-expression.txt
87–88

I see. I'm beginning to think the names of the variables need improving. Rather than have VAR1, VAR2, VAR1a etc, how about "SIGN", "UNSI" "UHEX", "LHEX", etc, possibly with additional context for their actual values. The CHECKs might also want an additional {{^}} to avoid matching things spuriously.

llvm/unittests/Support/FileCheckTest.cpp
104

I'm not sure it does, if I'm honest. An unsigned 10 plus a signed -11 should trigger the clause at line 174 in FileCheck.cpp, whilst the other way around triggers the clause at 170. They are distinctly interesting from the other cases that trigger those clauses because they should trigger overflows, if I've understood things right, rather than resulting in max uint64_t, but I don't think that's tested directly. The fact that this happens to call into other functions seems somewhat irrelevant from testing a distinct unit perspective - if you refactored the code to no longer use the same code path as something else that might be tested independently, you still want the test coverage.

124

It seems to me like operator- is a distinct unit, and its interface should be fully tested independently (see also my comments above). I believe this ends path ends up calling into checkedSub for which I can see no direct testing (does it handle double-negatives correctly, for example?)

thopre updated this revision to Diff 264003.May 14 2020, 8:25 AM
thopre marked 8 inline comments as done.

Address review comments

llvm/unittests/Support/FileCheckTest.cpp
104

I've reworked the ExpressionValue storage which should simplify the codepath somewhat and added one testcase to cover for an overflow case that was not tested (I was relying on testing of checkedSub but as you pointed out testing code would risk to bitrot in case the operator code is changed).

Please let me know if that doesn't address all your concerns.

124

How about now? Whether the value is signed or not doesn't matter anymore, only whether it's negative which should allow to reduce the amount of overflow codepaths.

jhenderson added inline comments.May 15 2020, 12:55 AM
llvm/docs/CommandGuide/FileCheck.rst
606–608

Is this still true?

llvm/lib/Support/FileCheck.cpp
359

I think there's an overflow issue here, right? If RightValue is bigger than max int64_t, it's possible for the expression to end up outside a representable int64_t, which seems like it should be an error?

llvm/lib/Support/FileCheckImpl.h
127–131 ↗(On Diff #264003)

I'm looking at this and thinking can't this and the above constructor not just be changed to:

template <class T>
explicit ExpressionValue(T Val)
  : Value(Val), Negative(Val < 0) {}
133 ↗(On Diff #264003)

true is -> true if

llvm/unittests/Support/FileCheckTest.cpp
104

I'm starting to get a bit of review fatigue, so I'm starting to miss things, but if I've followed things right, my suggested simplification to the constructors should enable us to get rid of the distinction between SignedTen and UnsignedTen, which in turn should allow the tests to be much simpler. At that point, I'd expect the test cases to be:

  1. Negative + Positive
  2. Positive + Negative
  3. Negative + Negative (no overflow)
  4. Positive + Positive (no overflow)

5-6) Overflowing variations of 3) and 4)

with an equivalent set for operator-

thopre updated this revision to Diff 264263.May 15 2020, 9:17 AM
thopre marked 9 inline comments as done.

Address review comments and fix a few more bugs

llvm/lib/Support/FileCheck.cpp
301–302

Undefined on 2-complement system when Value == min int64_t.

337

This can fail since RightOperand, being positive, is represented by a uint64_t.

359

Indeed, and it made me realize that getAbsolute doesn't work for min int64_t either.

llvm/unittests/Support/FileCheckTest.cpp
104

My apologies for causing this fatigue. I've added comments for each individual checks for each of the function/operator tested. Hope this helps. I've also caught a bug in the -A -B case where B would not fit in an int64_t. There's now an extra underflow here with an extra test.

thopre updated this revision to Diff 264382.May 15 2020, 4:20 PM

Fix format issue

I've been thinking more about test cases, and suggested a number more. If you can reduce the boiler plate for each individual test case, that would be great too. Perhaps even consider a TEST_P solution to specify input and expected output, whilst sharing the same code body.

llvm/unittests/Support/FileCheckTest.cpp
34

I'd test the range edge of -1 instead of -10.

It may be worth showing it fails for int64_t min too.

Also, a 0 check would be good to ensure no off-by-1 errors/incorrect use of < versus <= etc.

46–47

Perhaps worth adding a int64_t max + 1 case.

52

Maybe also int64_t min and max cases?

61

Maybe worth considering test cases for 0, and uint64_t max too?

90

Perhaps use -1 instead of -10.

Perhaps also MinInt64 - MinInt64.

95–99

These checks all have a lot of repeated code. It's probably time to factor out much of this into a function. It might also be a good idea to factor them into separate TESTs to allow the rest of the test to run in the event of a failure with an ASSERT etc, and to provide finer-grained test failures.

107

I'd prefer these slightly to not just always give a result of 0. I think it is more interesting to show that 10 + -11 can produce a negative number, for example, whilst 11 + -10 produces a positive one. Same goes for the case above.

118

Perhaps 1 instead of 10 here.

Perhaps also MaxUint64Value + MaxUint64Value would be a useful case, to maximise surface area.

123

What is meant by "too big" here? Be more precise in this comment.

128

- 1 rather than - 10 would probably be best here.

141

Similar to the + case, it might be nice for this to result in a non-zero value.

156

"or nul"?

157

I'm not sure Value is a useful suffix to all these names. It wasn't confusing to me before either.

166

Maybe Zero instead of Ten here would be good, for maximum possible underflow.
Maybe consider 0 - (std::abs(MinInt64) + 1) for minimum possible underflow.

177

Why not test the exact value?

180

Seems to me like there are two interesting cases here: "result is positive" and "result is negative".

thopre updated this revision to Diff 265083.May 19 2020, 4:31 PM
thopre marked 20 inline comments as done.

Address unittest review comments

thopre updated this revision to Diff 265084.May 19 2020, 4:34 PM

Fix format by using source tree clang-format-diff.py instead of system clang-format-diff

llvm/unittests/Support/FileCheckTest.cpp
95–99

Is that fine grain enough or do you want me to split further?

166

I can't use std::abs(MinInt64) since it's undefined in 2-complement (std::abs returns an int64_t for some reason).

180

Result is positive is already handled above. I meant to handle the negative case where result < -(max int64_t). Fixed now.

The latest changes are a big improvement on readability, thanks. Did you consider using TEST_P to factor things out further? That way you'd only need to specify the input and expected output values for each test case, with only a single TEST itself.

llvm/unittests/Support/FileCheckTest.cpp
19

Don't know how plausible it is, but did you consider passing in a function_ref to the operator+/operator- functions instead of the boolean? That might be useful in the future if support is extended to e.g. multiplication.

86

I think these sorts of comments indicate a need to split the test up. The suggestion I have is one method == one (or more) TESTs.

Thus there'd be tests for getAbsolute, getSignedValue etc etc and I'd call them ExpressionValueGetAbsolute, ExpressionValueGetSignedValue etc.

132

Typo

166

(Just to be clear, I only used std::abs as an intended example, rather than literally what should have been written).

thopre updated this revision to Diff 265602.May 21 2020, 2:24 PM
thopre marked 4 inline comments as done.

Address inline review comments

The latest changes are a big improvement on readability, thanks. Did you consider using TEST_P to factor things out further? That way you'd only need to specify the input and expected output values for each test case, with only a single TEST itself.

To make the result more concise that would require sharing fixture and instantiation among TEST_P but I'm not sure how to do it since the values generated in the instantiation are passed to all TEST_P but even operator+ and operator- have different values. If I need separate instantiation per TEST_P then I need separate fixture as well and that becomes more faff that what we have now I feel.

If there's some clever trick please give me some pointers on how to do it as I cannot see it right now.

jhenderson accepted this revision.May 27 2020, 12:51 AM

LGTM, with nit.

Re. TEST_P, the best suggestion I've got is look at the examples in DWARFDebugLineTest.cpp, where I've used a few different tricks to achieve the goal of code re-use. In particular, AdjustAddressFixtureBase related test cases might be interesting to you. In your case, I'd probably have a set of parameters each for failing and passing combinations, and for addition and subtraction, for a total of 4 combinations, if I'm not mistaken. You'd have thin fixtures for each, but they could probably share some code via a base class.

That all being said, only make those changes if you think they are worthwhile. I'm happy enough with this as-is.

llvm/unittests/Support/FileCheckTest.cpp
143

Here and in various other places, not sure you want the space between - and 10 (same for - 20 and various cases below).

This revision is now accepted and ready to land.May 27 2020, 12:51 AM
thopre marked an inline comment as done.May 27 2020, 11:21 AM
thopre added inline comments.
llvm/unittests/Support/FileCheckTest.cpp
143

I don't like it either but this comes from clang-format-diff.py. Should I go against it?

LGTM, with nit.

Re. TEST_P, the best suggestion I've got is look at the examples in DWARFDebugLineTest.cpp, where I've used a few different tricks to achieve the goal of code re-use. In particular, AdjustAddressFixtureBase related test cases might be interesting to you. In your case, I'd probably have a set of parameters each for failing and passing combinations, and for addition and subtraction, for a total of 4 combinations, if I'm not mistaken. You'd have thin fixtures for each, but they could probably share some code via a base class.

That all being said, only make those changes if you think they are worthwhile. I'm happy enough with this as-is.

So far, I'm getting:

llvm/unittests/Support/FileCheckTest.cpp | 413 +++++++++++++++++++++++++++++++++-------------------
1 file changed, 263 insertions(+), 150 deletions(-)

It's also quite a lot of code change, so even if that is acceptable because the code is deemed more readable, it's better to look at it as a separate patch and land the current code (with the suggested spacing fix)

thopre updated this revision to Diff 266789.May 28 2020, 2:43 AM
thopre marked an inline comment as done.

Rebase and fix spacing issue for negative literals

Argh sorry, forgot to remove the unnecessary phabricator marks when landing.

This revision was automatically updated to reflect the committed changes.

LGTM, with nit.

Re. TEST_P, the best suggestion I've got is look at the examples in DWARFDebugLineTest.cpp, where I've used a few different tricks to achieve the goal of code re-use. In particular, AdjustAddressFixtureBase related test cases might be interesting to you. In your case, I'd probably have a set of parameters each for failing and passing combinations, and for addition and subtraction, for a total of 4 combinations, if I'm not mistaken. You'd have thin fixtures for each, but they could probably share some code via a base class.

That all being said, only make those changes if you think they are worthwhile. I'm happy enough with this as-is.

So far, I'm getting:

llvm/unittests/Support/FileCheckTest.cpp | 413 +++++++++++++++++++++++++++++++++-------------------
1 file changed, 263 insertions(+), 150 deletions(-)

It's also quite a lot of code change, so even if that is acceptable because the code is deemed more readable, it's better to look at it as a separate patch and land the current code (with the suggested spacing fix)

I'll leave it up to you as to whether you want to. This is probably sufficient as is.

llvm/unittests/Support/FileCheckTest.cpp
143

If clang-format-diff.py was trying to format it that way, I'd consider that a bug in clang-format, so should be reported as such. The - there is a unary operator, and should be formatted as such (without a space).