Page MenuHomePhabricator

Unit tests to improve code coverage
Needs ReviewPublic

Authored by arindamsharma8 on Jan 26 2022, 4:11 AM.

Details

Reviewers
lenary
afd
Summary

Adds a batch of C tests that have been found to cover several hundred
lines of Clang/LLVM that are not covered by the unit and regression
tests of the main LLVM project, nor by the test suite when run with the
-O3 configuration.

The tests were originally generated using our fuzzer, and were then reduced
using C-Reduce and some manual inspection. They have been checked for undefined behaviour-freedom
using Frama-C and CompCert, and manually checked to eliminate
implementation-defined behaviour.

Most of the new coverage achieved by these tests is in:
/llvm-source/lib/Transforms/ (3 testcases)
/llvm-source/lib/IR/ (1 testcase)
/llvm-source/lib/Support/ (2 testcases)
/llvm-source/tools/clang/lib/AST (3 testcases)

Diff Detail

Unit TestsFailed

TimeTest
60,090 msx64 debian > AddressSanitizer-x86_64-linux-dynamic.TestCases::scariness_score_test.cpp
Script: -- : 'RUN: at line 4'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang --driver-mode=g++ -fsanitize=address -mno-omit-leaf-frame-pointer -fno-omit-frame-pointer -fno-optimize-sibling-calls -gline-tables-only -m64 -shared-libasan -O0 /var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/asan/TestCases/scariness_score_test.cpp -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/asan/X86_64LinuxDynamicConfig/TestCases/Output/scariness_score_test.cpp.tmp
60,100 msx64 debian > AddressSanitizer-x86_64-linux.TestCases::scariness_score_test.cpp
Script: -- : 'RUN: at line 4'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang --driver-mode=g++ -fsanitize=address -mno-omit-leaf-frame-pointer -fno-omit-frame-pointer -fno-optimize-sibling-calls -gline-tables-only -m64 -O0 /var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/asan/TestCases/scariness_score_test.cpp -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/asan/X86_64LinuxConfig/TestCases/Output/scariness_score_test.cpp.tmp
100 msx64 debian > Clang.CodeGenCUDA::atomics-remarks-gfx90a.cu
Script: -- : 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/bin/clang -cc1 -internal-isystem /var/lib/buildkite-agent/builds/llvm-project/build/lib/clang/15.0.0/include -nostdsysteminc /var/lib/buildkite-agent/builds/llvm-project/clang/test/CodeGenCUDA/atomics-remarks-gfx90a.cu -triple=amdgcn-amd-amdhsa -fcuda-is-device -target-cpu gfx90a -Rpass=atomic-expand -S -o - 2>&1 | /var/lib/buildkite-agent/builds/llvm-project/build/bin/FileCheck /var/lib/buildkite-agent/builds/llvm-project/clang/test/CodeGenCUDA/atomics-remarks-gfx90a.cu --check-prefix=GFX90A-CAS
60,020 msx64 debian > ThreadSanitizer-x86_64.ThreadSanitizer-x86_64::signal_sync.cpp
Script: -- : 'RUN: at line 1'; /var/lib/buildkite-agent/builds/llvm-project/build/./bin/clang -fsanitize=thread -Wall -m64 -msse4.2 -gline-tables-only -I/var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/tsan/../ -O1 /var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/tsan/signal_sync.cpp -o /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/tsan/X86_64Config/Output/signal_sync.cpp.tmp && /var/lib/buildkite-agent/builds/llvm-project/build/projects/compiler-rt/test/tsan/X86_64Config/Output/signal_sync.cpp.tmp 2>&1 | FileCheck /var/lib/buildkite-agent/builds/llvm-project/compiler-rt/test/tsan/signal_sync.cpp

Event Timeline

arindamsharma8 requested review of this revision.Jan 26 2022, 4:11 AM
arindamsharma8 created this revision.
fhahn added a subscriber: fhahn.Jan 27 2022, 2:25 AM

Thanks for sharing those test cases. I am just curious if some or most of them could be converted either to clang unit tests (the CGExpr* ones) and LLVM IR unit tests (e.g. the LoopVectorize one). The latter by extracting the IR before the corresponding LLVM pass.

Hello,

Is there anything that we can provide to assist the review of this PR?

Thanks
Arindam

Herald added a project: Restricted Project. · View Herald TranscriptMar 23 2022, 6:29 AM

Sorry, this dropped off my radar a bit.

Some review comments. I'm not super familiar with SingleSource/UnitTests but I've tried?

SingleSource/UnitTests/testcase-APFloat-1.c
16

I think I looked at this and convinced myself this was UB. Putting aside vector_size(N) being not part of the C spec, how do you know that V2SI fits in a long long, without reference to a specific platform?

I suppose if this is a clang test to improve coverage, then maybe this is OK, as the UB i've found is runtime UB, but I'm not sure?

In short, I raised an eyebrow at this when I saw the assertion "this is free of UB" as I have been caught out by similar assertions in the past (including about portability of code), and so it dropped down my priorities.

27

Why is the reference output empty if you contain an unconditional printf?

SingleSource/UnitTests/testcase-Expr-1.c
17

I'm surprised the tests don't always expect these programs to return 0 from main on success, or have I failed to understand how 0xAE wraps?

The reference output files now contain the correct exit codes such that when these tests were run as a part of the LLVM testsuite, all of them passed.
We left the exitcodes in the
testcases are such because of the fuzzer and we
wanted to keep the testcase as close to the fuzzed file as possible.

Regarding the UB in the first testcase (APFloat-1),
the fuzzed file comes from an existing LLVM testcase
(https://github.com/llvm/llvm-test-suite/blob/main/SingleSource/Regression/C/gcc-c-torture/execute/20050316-1.c)
and it appears to be alright.
Please do let us know if there's anything we might have missed.

Will it help to add the test's original tags?
This is the original test:https://github.com/llvm/llvm-test-suite/blob/main/SingleSource/Regression/C/gcc-c-torture/execute/20050316-1.c

the tags are:
/* PR rtl-optimization/16104 */
/* { dg-require-effective-target int32plus } */
/* { dg-options "-Wno-psabi" } */

Address comments

SingleSource/UnitTests/testcase-APFloat-1.c
16

Regarding the UB in the first testcase (APFloat-1),
the fuzzed file comes from an existing LLVM testcase
(https://github.com/llvm/llvm-test-suite/blob/main/SingleSource/Regression/C/gcc-c-torture/execute/20050316-1.c)
and it appears to be alright.
Please do let us know if there's anything we might have missed.

Will it help to add the test's original tags?
This is the original test:https://github.com/llvm/llvm-test-suite/blob/main/SingleSource/Regression/C/gcc-c-torture/execute/20050316-1.c

the tags are:
/* PR rtl-optimization/16104 */
/* { dg-require-effective-target int32plus } */
/* { dg-options "-Wno-psabi" } */

27

The exitcodes and the structure in the
testcases are such because of the fuzzer and we
wanted to keep the testcase as close to the fuzzed file as possible.

SingleSource/UnitTests/testcase-Expr-1.c
17

We left the exitcodes in the
testcases as such because of the fuzzer and we
wanted to keep the testcase as close to the fuzzed file as possible.

Hey,
Thanks a lot for the careful review. We have now updated the reference output files and also tested the testcases with the LLVM testsuite such that they pass. We have also tried to address the comments. It'd be great if you could have a look at them again. Please do let us know if that's alright.
Thanks
Arindam

afd added a subscriber: afd.Apr 27 2022, 9:08 AM

Please see comments inline.

SingleSource/UnitTests/testcase-CGExprConstant.c
17

Is the use of .a required here to get the coverage?

18

I did not know this was valid syntax - taking the address of a struct declared in this fashion.

If you rewrite this more conventionally, along the lines of:

struct S2 temp = { ....};
struct S2* s = &temp;

then do you still get the coverage?

Or does getting the coverage actually depend on the unusual syntax?

19–21

Is it essential to use the .a, .b and [0] notation here to achieve the desired coverage? Could this struct initializer be written more conventionally as:

{{1, 2}, &gs1, 1, 1 + 1}}

?

Please rewrite to use the most conventional, obvious syntax unless it turns out that less conventional syntax is actually essential to get high coverage.

SingleSource/UnitTests/testcase-CGExprConstant.reference_output
2

Based on my understanding of the test this looks like the correct expectation.

SingleSource/UnitTests/testcase-Expr-1.c
17

The fuzzer is a means for finding high-coverage test cases.

Once we've found them, there's no value in keeping the test files as close to the fuzzed test case as possible. If we can make a test easier to read whilst still getting the additional coverage that the fuzzed test case achieved we should do that.

If you can change -0xAE to something simpler without spoiling coverage you should do so. And I also agree that making 0 be the return code associated with the test passing would be more intuitive.

SingleSource/UnitTests/testcase-ExprConstant-1.c
10

Remove the parentheses around 66 unless they're actually essential.

16

Is this definitely legitimate - is it definitely legal to do pointer subtraction like this? (Genuine question - I'm not sure.)

17–18

These seem like pretty arbitrary things to return, and then the expectation of 245 isn't at a glance connected to either of them (I suppose it corresponds to the -2 case but you don't see it immediately).

I suggest returning 0 for the expected result and 1 otherwise.

SingleSource/UnitTests/testcase-ExprConstant-2.c
8

This macro isn't used. Unless it's essential for coverage, remove it.

13

I have never heard of this _Generic intrinsic. I think the test needs to be accompanied by some explanation so that a similarly unfamiliar reader can understand what's going on.

SingleSource/UnitTests/testcase-InstCombine-1.c
3

This test is very hard to read.

Can you look into simplifying it further? Consider simplifications that break the huge expressions into multiple lines, introducing temporaries as needed.

Also try to avoid short and int, etc., and favour int16_t and similar unless this actually doesn't work.

6–10

Is this truly needed? Can't you include the header file that declares int32_t and just use that?

SingleSource/UnitTests/testcase-LoopVectorize.c
2 ↗(On Diff #423580)

Similar comment to the above: this test needs to be simplified to make it readable, and you should use int32_t etc. unless you actually cannot (i.e., coverage goes away).

SingleSource/UnitTests/testcase-Reassociate-1.c
2

This test is too complex for a human developer to easily work with.

I suggest you inline the uses of the T macro and then work on simplifying the code accordingly after that.

SingleSource/UnitTests/testcase-Value-1.c
8

Can you use int16_t or something instead of short?

8

Avoid redundant parentheses.

10

Avoid the use of hex in favour of easier-to-read integers.

13–34

Similar comments about hex and redundant parentheses.

arindamsharma8 marked 4 inline comments as done.May 7 2022, 7:08 AM

Updated with comments

SingleSource/UnitTests/testcase-CGExprConstant.c
19–21

The aforementioned constructs do seem unconventional. These constructs also appear the the base program from which this fuzzed testcase was produced. The link to that program is here:

https://github.com/c-testsuite/c-testsuite/blob/5c7275656d751de0e68b2d340a95b5681858ed07/tests/single-exec/00150.c

SingleSource/UnitTests/testcase-Expr-1.c
17

I understand! Keeping this comment in mind, I have updated the return codes and constants while avoiding use of such hex values as long as they preserve coverage

SingleSource/UnitTests/testcase-ExprConstant-1.c
17–18

Updated in the revision

SingleSource/UnitTests/testcase-ExprConstant-2.c
13

I agree!! It's rather new I believe: Following provides a good explanation of this construct:

https://docs.microsoft.com/en-us/cpp/c-language/generic-selection?view=msvc-170

SingleSource/UnitTests/testcase-InstCombine-1.c
3

I see. I have tried removing redundant parentheses (might have missed some). I noticed that tampering with these expressions seems to lose the coverage we get with the testcase.

6–10

It seems essential to the coverage, as far as I could try.

SingleSource/UnitTests/testcase-Value-1.c
8

Addressed in the revision

10

Addressed in the revision

arindamsharma8 marked an inline comment as done.

Address comments from reviewers.

Update parentheses in return expressions

Updated InstCombine testcase such that it's smaller and more readable for the developers

Update InstCombine testcase

Update reference output for InstCombine