This is an archive of the discontinued LLVM Phabricator instance.

[flang] Update output format test to use GTest
ClosedPublic

Authored by ashermancinelli on Mar 9 2021, 3:49 PM.

Details

Summary

Better document each test in output formatting tests. Use GTest primitives and infrastructure in same
spirit as D97403. See legacy github issue linked here for additional context. Reorganize long test cases to be more readable.

Diff Detail

Event Timeline

ashermancinelli created this revision.Mar 9 2021, 3:49 PM
ashermancinelli requested review of this revision.Mar 9 2021, 3:49 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 9 2021, 3:49 PM

Has this version of the tests been validated for its ability to still find bugs in a broken runtime or decimal conversion library?

ashermancinelli added a comment.EditedMar 9 2021, 4:36 PM

Has this version of the tests been validated for its ability to still find bugs in a broken runtime or decimal conversion library?

I apologize for my ignorance here - could you elaborate on this? Do you have a recommendation for how to go about intentionally breaking some component to ensure the tests still catch broken libraries?

Use fully qualified auto to address clang-tidy warnings.

Has this version of the tests been validated for its ability to still find bugs in a broken runtime or decimal conversion library?

I apologize for my ignorance here - could you elaborate on this? Do you have a recommendation for how to go about intentionally breaking some component to ensure the tests still catch broken libraries?

The existing version of this test works in the sense that it reliably fails when there's a bug in the Decimal library or in the code in the runtime that processes formatted output. Will your new version fail when there are bugs in those libraries? If you're not sure, then please don't replace the test until you are sure.

@klausler I reproduce all of the same test cases, so the test coverage should be identical and should catch all of the same bugs. I'd like to increase coverage, but the patch as-is should keep it the same.

flang/unittests/RuntimeGTest/NumericalFormatTest.cpp
42–45

I would be interested in any suggestions for how to better handle this case.

This is the first time I'm reading this and the additional comments are incredibly helpful. Thank you for all the effort @ashermancinelli !

Has this version of the tests been validated for its ability to still find bugs in a broken runtime or decimal conversion library?

I agree that we should try to preserve this capability. This should be possible with (I might be missing something):

  • GTest test fixture
  • call to RegisterCrashHandler in setUp

We could re-use CatchCrash from testing.cpp (it's likely that we will need to duplicate it for a short while). I think that we could test crash handling with LenTrimKind (try running it with some invalid arguments). More specifically, we need to trigger a call to terminator.Crash. This could be a implemented in a separate file IMHO (better - a separate patch).

@ashermancinelli , perhaps you could add some negative tests. This would increase the test coverage and improve the overall robustness of the tests. I would hope that for tests like HelloWorldOutputTest this should be rather straightforward. Perhaps trickier elsewhere.

flang/unittests/RuntimeGTest/NumericalFormatTest.cpp
76

ASSERT_FALSE suggests that we are expecting this to fail, but in fact this is testing for success (0 in this context means success, see here. I suggest ASSERT_EQ(status, 0) instead.

@klausler I reproduce all of the same test cases, so the test coverage should be identical and should catch all of the same bugs. I'd like to increase coverage, but the patch as-is should keep it the same.

I would like you to demonstrate that this new test does indeed properly report test failures when they occur.

@klausler I think I better understand your concerns now. How would you feel about this patch if I un-removed the original test, we refine this test, and ten

@klausler I reproduce all of the same test cases, so the test coverage should be identical and should catch all of the same bugs. I'd like to increase coverage, but the patch as-is should keep it the same.

I would like you to demonstrate that this new test does indeed properly report test failures when they occur.

In your opinion, would some death tests be sufficient? I can demonstrate some conditions under which many of these operations result in a runtime crash with the correct message, and in my opinion that would demonstrate proper failure reporting.

flang/unittests/RuntimeGTest/NumericalFormatTest.cpp
76

This is a great point, I'll address all instances of this I can find.

In your opinion, would some death tests be sufficient? I can demonstrate some conditions under which many of these operations result in a runtime crash with the correct message, and in my opinion that would demonstrate proper failure reporting.

I think that it's worthwhile to add a dedicated test for the crash handler API either way: https://reviews.llvm.org/D98601. This patch demonstrates that the message from the crash handler is indeed generated and printed to stdout.

Related to that, @ashermancinelli , do you know whether any of the methods tested here ever crash? This might be tricky to check. It might be easier to install a crash handler with Fortran::runtime::Terminator::RegisterCrashHandler like in the original test file.

flang/unittests/RuntimeGTest/NumericalFormatTest.cpp
99

IMHO, it would be better to dump to stderr so that it is possible to separate the regular output from GTest and other logging like this.

Rebase on recent main branch to incorporate terminator crash tests

@klausler D98652 hopefully demonstrates that runtime crashes are caught by death tests. This revision has been rebased on D98652 - I hope this sufficiently demonstrates that these tests accurately catch crashes. We could also work on another patch that intentionally crashes each of the APIs tested above not already tested in D98652.

@klausler D98652 hopefully demonstrates that runtime crashes are caught by death tests. This revision has been rebased on D98652 - I hope this sufficiently demonstrates that these tests accurately catch crashes. We could also work on another patch that intentionally crashes each of the APIs tested above not already tested in D98652.

I wasn't asking about crashes in the runtime. I was asking about catching incorrect results and reporting failures.

I will try again. If you modify the decimal or runtime library in such a way as to cause f18 to emit incorrectly formatted results, does your rewritten test fail? Let me know when you have confirmed this by observation.

@klausler D98652 hopefully demonstrates that runtime crashes are caught by death tests. This revision has been rebased on D98652 - I hope this sufficiently demonstrates that these tests accurately catch crashes. We could also work on another patch that intentionally crashes each of the APIs tested above not already tested in D98652.

I wasn't asking about crashes in the runtime. I was asking about catching incorrect results and reporting failures.

I will try again. If you modify the decimal or runtime library in such a way as to cause f18 to emit incorrectly formatted results, does your rewritten test fail? Let me know when you have confirmed this by observation.

After applying the breaking patch attached below, all tests fail except for IOApiTests.DescriptorOutputTest because I'm not familiar enough with the descriptor api to know how to break it. Please let me know if this is sufficient, and thanks for being understanding.


ashermancinelli marked an inline comment as done.

Fix clang-format and clang-tidy warnings.

klausler accepted this revision.Mar 19 2021, 11:00 AM
This revision is now accepted and ready to land.Mar 19 2021, 11:00 AM

Fix header include order.

awarzynski accepted this revision.Mar 28 2021, 12:52 PM

LGTM, thank you for working on this! I've left a couple of comments, but these are [nit]s (feel free to ignore!).

flang/unittests/RuntimeGTest/NumericalFormatTest.cpp
42–45

Perhaps ASSERT_EQ(status, 0) would be better?

99

To clarify, I meant whole.Dump(stderr) instead of whole.Dump().

Use GTest EXPECT_EQ over potentially ambiguous return value, redirect SUT IO to stderr.

This revision was landed with ongoing or failed builds.Mar 29 2021, 9:19 AM
This revision was automatically updated to reflect the committed changes.
flang/unittests/Runtime/CMakeLists.txt