This is an archive of the discontinued LLVM Phabricator instance.

Unit tests to improve code coverage
ClosedPublic

Authored by arindamsharma8 on Jan 26 2022, 4:11 AM.

Details

Summary

Adds a batch of C tests that have been found to cover several hundred
lines of Clang/LLVM that are not covered by the unit and regression
tests of the main LLVM project, nor by the test suite when run with the
-O3 configuration.

The tests were originally generated using our fuzzer, and were then reduced
using C-Reduce and some manual inspection. They have been checked for undefined behaviour-freedom
using Frama-C and CompCert, and manually checked to eliminate
implementation-defined behaviour.

Most of the new coverage achieved by these tests is in:
/llvm-source/lib/Transforms/ (3 testcases)
/llvm-source/lib/IR/ (1 testcase)
/llvm-source/lib/Support/ (2 testcases)
/llvm-source/tools/clang/lib/AST (3 testcases)

Diff Detail

Event Timeline

arindamsharma8 requested review of this revision.Jan 26 2022, 4:11 AM
arindamsharma8 created this revision.
fhahn added a subscriber: fhahn.Jan 27 2022, 2:25 AM

Thanks for sharing those test cases. I am just curious if some or most of them could be converted either to clang unit tests (the CGExpr* ones) and LLVM IR unit tests (e.g. the LoopVectorize one). The latter by extracting the IR before the corresponding LLVM pass.

Hello,

Is there anything that we can provide to assist the review of this PR?

Thanks
Arindam

Herald added a project: Restricted Project. · View Herald TranscriptMar 23 2022, 6:29 AM

Sorry, this dropped off my radar a bit.

Some review comments. I'm not super familiar with SingleSource/UnitTests but I've tried?

SingleSource/UnitTests/testcase-APFloat-1.c
16

I think I looked at this and convinced myself this was UB. Putting aside vector_size(N) being not part of the C spec, how do you know that V2SI fits in a long long, without reference to a specific platform?

I suppose if this is a clang test to improve coverage, then maybe this is OK, as the UB i've found is runtime UB, but I'm not sure?

In short, I raised an eyebrow at this when I saw the assertion "this is free of UB" as I have been caught out by similar assertions in the past (including about portability of code), and so it dropped down my priorities.

27

Why is the reference output empty if you contain an unconditional printf?

SingleSource/UnitTests/testcase-Expr-1.c
17

I'm surprised the tests don't always expect these programs to return 0 from main on success, or have I failed to understand how 0xAE wraps?

The reference output files now contain the correct exit codes such that when these tests were run as a part of the LLVM testsuite, all of them passed.
We left the exitcodes in the
testcases are such because of the fuzzer and we
wanted to keep the testcase as close to the fuzzed file as possible.

Regarding the UB in the first testcase (APFloat-1),
the fuzzed file comes from an existing LLVM testcase
(https://github.com/llvm/llvm-test-suite/blob/main/SingleSource/Regression/C/gcc-c-torture/execute/20050316-1.c)
and it appears to be alright.
Please do let us know if there's anything we might have missed.

Will it help to add the test's original tags?
This is the original test:https://github.com/llvm/llvm-test-suite/blob/main/SingleSource/Regression/C/gcc-c-torture/execute/20050316-1.c

the tags are:
/* PR rtl-optimization/16104 */
/* { dg-require-effective-target int32plus } */
/* { dg-options "-Wno-psabi" } */

Address comments

SingleSource/UnitTests/testcase-APFloat-1.c
16

Regarding the UB in the first testcase (APFloat-1),
the fuzzed file comes from an existing LLVM testcase
(https://github.com/llvm/llvm-test-suite/blob/main/SingleSource/Regression/C/gcc-c-torture/execute/20050316-1.c)
and it appears to be alright.
Please do let us know if there's anything we might have missed.

Will it help to add the test's original tags?
This is the original test:https://github.com/llvm/llvm-test-suite/blob/main/SingleSource/Regression/C/gcc-c-torture/execute/20050316-1.c

the tags are:
/* PR rtl-optimization/16104 */
/* { dg-require-effective-target int32plus } */
/* { dg-options "-Wno-psabi" } */

27

The exitcodes and the structure in the
testcases are such because of the fuzzer and we
wanted to keep the testcase as close to the fuzzed file as possible.

SingleSource/UnitTests/testcase-Expr-1.c
17

We left the exitcodes in the
testcases as such because of the fuzzer and we
wanted to keep the testcase as close to the fuzzed file as possible.

Hey,
Thanks a lot for the careful review. We have now updated the reference output files and also tested the testcases with the LLVM testsuite such that they pass. We have also tried to address the comments. It'd be great if you could have a look at them again. Please do let us know if that's alright.
Thanks
Arindam

afd added a subscriber: afd.Apr 27 2022, 9:08 AM

Please see comments inline.

SingleSource/UnitTests/testcase-CGExprConstant.c
17

Is the use of .a required here to get the coverage?

18

I did not know this was valid syntax - taking the address of a struct declared in this fashion.

If you rewrite this more conventionally, along the lines of:

struct S2 temp = { ....};
struct S2* s = &temp;

then do you still get the coverage?

Or does getting the coverage actually depend on the unusual syntax?

19–21

Is it essential to use the .a, .b and [0] notation here to achieve the desired coverage? Could this struct initializer be written more conventionally as:

{{1, 2}, &gs1, 1, 1 + 1}}

?

Please rewrite to use the most conventional, obvious syntax unless it turns out that less conventional syntax is actually essential to get high coverage.

SingleSource/UnitTests/testcase-CGExprConstant.reference_output
2

Based on my understanding of the test this looks like the correct expectation.

afd added inline comments.Apr 27 2022, 9:08 AM
SingleSource/UnitTests/testcase-Expr-1.c
17

The fuzzer is a means for finding high-coverage test cases.

Once we've found them, there's no value in keeping the test files as close to the fuzzed test case as possible. If we can make a test easier to read whilst still getting the additional coverage that the fuzzed test case achieved we should do that.

If you can change -0xAE to something simpler without spoiling coverage you should do so. And I also agree that making 0 be the return code associated with the test passing would be more intuitive.

SingleSource/UnitTests/testcase-ExprConstant-1.c
10

Remove the parentheses around 66 unless they're actually essential.

16

Is this definitely legitimate - is it definitely legal to do pointer subtraction like this? (Genuine question - I'm not sure.)

17–18

These seem like pretty arbitrary things to return, and then the expectation of 245 isn't at a glance connected to either of them (I suppose it corresponds to the -2 case but you don't see it immediately).

I suggest returning 0 for the expected result and 1 otherwise.

SingleSource/UnitTests/testcase-ExprConstant-2.c
8

This macro isn't used. Unless it's essential for coverage, remove it.

13

I have never heard of this _Generic intrinsic. I think the test needs to be accompanied by some explanation so that a similarly unfamiliar reader can understand what's going on.

SingleSource/UnitTests/testcase-InstCombine-1.c
3

This test is very hard to read.

Can you look into simplifying it further? Consider simplifications that break the huge expressions into multiple lines, introducing temporaries as needed.

Also try to avoid short and int, etc., and favour int16_t and similar unless this actually doesn't work.

6–10

Is this truly needed? Can't you include the header file that declares int32_t and just use that?

SingleSource/UnitTests/testcase-LoopVectorize.c
3

Similar comment to the above: this test needs to be simplified to make it readable, and you should use int32_t etc. unless you actually cannot (i.e., coverage goes away).

SingleSource/UnitTests/testcase-Reassociate-1.c
2

This test is too complex for a human developer to easily work with.

I suggest you inline the uses of the T macro and then work on simplifying the code accordingly after that.

SingleSource/UnitTests/testcase-Value-1.c
8

Can you use int16_t or something instead of short?

8

Avoid redundant parentheses.

10

Avoid the use of hex in favour of easier-to-read integers.

13–34

Similar comments about hex and redundant parentheses.

arindamsharma8 marked 4 inline comments as done.May 7 2022, 7:08 AM

Updated with comments

SingleSource/UnitTests/testcase-CGExprConstant.c
19–21

The aforementioned constructs do seem unconventional. These constructs also appear the the base program from which this fuzzed testcase was produced. The link to that program is here:

https://github.com/c-testsuite/c-testsuite/blob/5c7275656d751de0e68b2d340a95b5681858ed07/tests/single-exec/00150.c

SingleSource/UnitTests/testcase-Expr-1.c
17

I understand! Keeping this comment in mind, I have updated the return codes and constants while avoiding use of such hex values as long as they preserve coverage

SingleSource/UnitTests/testcase-ExprConstant-1.c
17–18

Updated in the revision

SingleSource/UnitTests/testcase-ExprConstant-2.c
13

I agree!! It's rather new I believe: Following provides a good explanation of this construct:

https://docs.microsoft.com/en-us/cpp/c-language/generic-selection?view=msvc-170

SingleSource/UnitTests/testcase-InstCombine-1.c
3

I see. I have tried removing redundant parentheses (might have missed some). I noticed that tampering with these expressions seems to lose the coverage we get with the testcase.

6–10

It seems essential to the coverage, as far as I could try.

SingleSource/UnitTests/testcase-Value-1.c
8

Addressed in the revision

10

Addressed in the revision

arindamsharma8 marked an inline comment as done.

Address comments from reviewers.

Update parentheses in return expressions

Updated InstCombine testcase such that it's smaller and more readable for the developers

Update InstCombine testcase

Update reference output for InstCombine

afd added a comment.Jul 14 2022, 10:12 AM

Some comments - but I stopped as I'm concerned I might not be reviewing the latest changes.

SingleSource/UnitTests/testcase-CGExprConstant.c
3

Remove leading spaces so that "This fuzzed program" is aligned with the rest of the comment.

Please do similarly in all other contributed tests.

SingleSource/UnitTests/testcase-Expr.reference_output
2

This file doesn't match the corresponding source file. There's a "-1" in the source file but not here.

Add missing updates

Addressed more comments

arindamsharma8 marked 8 inline comments as done.Jul 14 2022, 2:06 PM
afd added a comment.Aug 5 2022, 6:27 AM

Thanks for making the changes, Arindam.

There are two tests that I think need to go because their main methods are completely trivial.

Otherwise this looks good to me, except for a few small comments which it would be good if you could address.

SingleSource/UnitTests/testcase-APFloat-1.c
26–28

I suggest you exclude this test case because it is trivial: sizeof(short) == 2 statically evaluates to true, so the test case will be trivially optimized to something that prints 5 and then returns. All the code that is actually achieving interesting coverage is dead.

SingleSource/UnitTests/testcase-ExprConstant-2.c
11

@arindamsharma8 you didn't reply to my comment about this and you also didn't remove the unused macro, so I don't know whether it's needed for coverage, or whether you missed the comment.

Can you either confirm that it is needed (in which case note that in the test with a comment), or remove it?

SingleSource/UnitTests/testcase-InstCombine-1.c
23

I think there is trailing whitespace at the end of this line, which should go (I got a warning about whitespace errors here when I applied your patch).

SingleSource/UnitTests/testcase-Reassociate-1.c
91–94

This is no good: main doesn't invoke any of the above code. Thus even if this does get extra coverage in the compiler, it's barely interesting for regression testing since it could only confirm that compilation does not crash, and not that acceptable results are computed.

I suggest getting rid of this test.

arindamsharma8 marked an inline comment as done.Aug 5 2022, 8:45 AM
arindamsharma8 added inline comments.
SingleSource/UnitTests/testcase-ExprConstant-2.c
11

It is indeed necessary and I will notify this in a comment in the upcoming update to the patch

arindamsharma8 marked an inline comment as done.

Address latest comments

Remove testcases as per previous comments and sanitise testcases

afd added a comment.Aug 9 2022, 3:47 AM

@lenary I have reviewed these test cases carefully and I now think they are in good shape.

Would you mind taking a look, with a view to getting them merged if they also look OK to you?

Thanks!

lenary added a comment.Aug 9 2022, 6:17 AM

These are looking much better, thanks!

I think testcase-InstCombine-1.c needs the most attention (maybe just updating the expected output?), and beyond that I have a portability nit (which I have been burned by in the past).

SingleSource/UnitTests/testcase-CGExprConstant.c
37

it is not portable to use %d with int32_t.

The portable way to do this is in my suggestion - though you may need to include <inttypes.h>. More docs on these are available here: https://en.cppreference.com/w/c/types/integer under "Format macro constants".

SingleSource/UnitTests/testcase-InstCombine-1.c
18

I'm slightly confused as to how this loop even executes:

  • f should be zero initialized, I think (otherwise reading it is Undefined Behaviour)
  • (h is also zero-initialized, so all fields are zero - note that : is not for initializers in C, it's for bitfield widths)
  • So the loop should not be entered - the condition is 0, so falsey.
  • So between line 20 and 21, h should have the values it started with, i.e. all values zero
  • Which would make the if statement false, as 0 != 0 - 0 * 0 should be false, right?

So I'm overall confused about how this case can print 1exit 0.

Indeed I tried the program on my linux machine, with clang-14, and it doesn't print anything and exits 0.

23

This comment is misleading. h.d is 9 (or it's 0, these are the only two values assigned to it). - the : is not an initializer, it declares a bitfield.

On first reading, I did think that the print was reachable, but I'm no now convinced it is not (see my other comment).

WiP (Update testcases and fix failing testcase)

Update the printf for types

Missing testcase printf fix

@lenary

Hi,

Thanks a lot for the help on these testcases. I have updated the PR and the pre-merge checks seem to pass now and I have taken into account the CGExprConstant testcase comment. Please let me know if they are now ready to be merged.

Thanks
Arindam

lenary added inline comments.Aug 12 2022, 12:59 PM
SingleSource/UnitTests/testcase-InstCombine-1.c
23

Given the output doesn't include the "1", this comment should be deleted, as it's evidently *not* reachable.

arindamsharma8 marked an inline comment as done.Aug 12 2022, 1:01 PM
arindamsharma8 added inline comments.
SingleSource/UnitTests/testcase-InstCombine-1.c
23

Apologies! Not sure how this comment stayed in! Will be removed in upcoming patch

arindamsharma8 marked an inline comment as done.

Remove inaccurate comment

Add original source for testcases

SingleSource/UnitTests/testcase-InstCombine-1.c
4

Original source : PR tree-optimization/79737

SingleSource/UnitTests/testcase-Value-1.c
3

Original source : corpus/pr77766.c

arindamsharma8 added inline comments.Aug 16 2022, 2:38 AM
SingleSource/UnitTests/testcase-Expr-1.c
7

Original source :

/* corpus/00147.c */
/* Taken from: https://github.com/c-testsuite/c-testsuite */

lenary accepted this revision.Aug 16 2022, 2:45 AM

I'm happy with this.

I don't know if you'll need to include the source info for these fuzzed/reduced tests into code comments. I'll let @afd decide on that.

This revision is now accepted and ready to land.Aug 16 2022, 2:45 AM
afd added a comment.Aug 30 2022, 12:14 AM

@lenary Thanks for approving - I'm happy with what the test cases currently say about the original source (they give a brief note).

Would you be OK to land these?

Many thanks!

I'm finally getting back to landing this. @arindamsharma8 what name and email would you like me to use in git's Author field?

I'm finally getting back to landing this. @arindamsharma8 what name and email would you like me to use in git's Author field?

Hi Sam,

Thanks for the help!
You can use the following details:
Name: Arindam Sharma
email: arindam.sharma@imperial.ac.uk

Thanks
Arindam

This revision was landed with ongoing or failed builds.Oct 11 2022, 5:27 AM
This revision was automatically updated to reflect the committed changes.