Better document each test in output formatting tests. Use GTest primitives and infrastructure in same
spirit as D97403. See legacy github issue linked here for additional context. Reorganize long test cases to be more readable.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Unit Tests
Event Timeline
Has this version of the tests been validated for its ability to still find bugs in a broken runtime or decimal conversion library?
I apologize for my ignorance here - could you elaborate on this? Do you have a recommendation for how to go about intentionally breaking some component to ensure the tests still catch broken libraries?
The existing version of this test works in the sense that it reliably fails when there's a bug in the Decimal library or in the code in the runtime that processes formatted output. Will your new version fail when there are bugs in those libraries? If you're not sure, then please don't replace the test until you are sure.
@klausler I reproduce all of the same test cases, so the test coverage should be identical and should catch all of the same bugs. I'd like to increase coverage, but the patch as-is should keep it the same.
flang/unittests/RuntimeGTest/NumericalFormatTest.cpp | ||
---|---|---|
42–45 | I would be interested in any suggestions for how to better handle this case. |
This is the first time I'm reading this and the additional comments are incredibly helpful. Thank you for all the effort @ashermancinelli !
I agree that we should try to preserve this capability. This should be possible with (I might be missing something):
- GTest test fixture
- call to RegisterCrashHandler in setUp
We could re-use CatchCrash from testing.cpp (it's likely that we will need to duplicate it for a short while). I think that we could test crash handling with LenTrimKind (try running it with some invalid arguments). More specifically, we need to trigger a call to terminator.Crash. This could be a implemented in a separate file IMHO (better - a separate patch).
@ashermancinelli , perhaps you could add some negative tests. This would increase the test coverage and improve the overall robustness of the tests. I would hope that for tests like HelloWorldOutputTest this should be rather straightforward. Perhaps trickier elsewhere.
flang/unittests/RuntimeGTest/NumericalFormatTest.cpp | ||
---|---|---|
76 | ASSERT_FALSE suggests that we are expecting this to fail, but in fact this is testing for success (0 in this context means success, see here. I suggest ASSERT_EQ(status, 0) instead. |
I would like you to demonstrate that this new test does indeed properly report test failures when they occur.
@klausler I think I better understand your concerns now. How would you feel about this patch if I un-removed the original test, we refine this test, and ten
In your opinion, would some death tests be sufficient? I can demonstrate some conditions under which many of these operations result in a runtime crash with the correct message, and in my opinion that would demonstrate proper failure reporting.
flang/unittests/RuntimeGTest/NumericalFormatTest.cpp | ||
---|---|---|
76 | This is a great point, I'll address all instances of this I can find. |
I think that it's worthwhile to add a dedicated test for the crash handler API either way: https://reviews.llvm.org/D98601. This patch demonstrates that the message from the crash handler is indeed generated and printed to stdout.
Related to that, @ashermancinelli , do you know whether any of the methods tested here ever crash? This might be tricky to check. It might be easier to install a crash handler with Fortran::runtime::Terminator::RegisterCrashHandler like in the original test file.
flang/unittests/RuntimeGTest/NumericalFormatTest.cpp | ||
---|---|---|
98 | IMHO, it would be better to dump to stderr so that it is possible to separate the regular output from GTest and other logging like this. |
@klausler D98652 hopefully demonstrates that runtime crashes are caught by death tests. This revision has been rebased on D98652 - I hope this sufficiently demonstrates that these tests accurately catch crashes. We could also work on another patch that intentionally crashes each of the APIs tested above not already tested in D98652.
I wasn't asking about crashes in the runtime. I was asking about catching incorrect results and reporting failures.
I will try again. If you modify the decimal or runtime library in such a way as to cause f18 to emit incorrectly formatted results, does your rewritten test fail? Let me know when you have confirmed this by observation.
After applying the breaking patch attached below, all tests fail except for IOApiTests.DescriptorOutputTest because I'm not familiar enough with the descriptor api to know how to break it. Please let me know if this is sufficient, and thanks for being understanding.
Use GTest EXPECT_EQ over potentially ambiguous return value, redirect SUT IO to stderr.
I would be interested in any suggestions for how to better handle this case.