This is an archive of the discontinued LLVM Phabricator instance.

[clang-tidy] Resubmit hicpp-multiway-paths-covered without breaking test
ClosedPublic

Authored by JonasToth on Dec 1 2017, 10:15 AM.

Details

Summary

The original check did break the green buildbot in the sanitizer build.
It took a while to redroduce and understand the issue.

There occured a stackoverflow while parsing the AST. The testcase with
256 case labels was the problem because each case label added another
stackframe. It seemed that the issue occured only in 'RelWithDebInfo' builds
and not in normal sanitizer builds.

To simplify the matchers the recognition for the different kinds of switch
statements has been moved into a seperate function and will not be done with
ASTMatchers. This is an attempt to reduce recursion and stacksize as well.

The new check removed this big testcase. Covering all possible values is still
implemented for bitfields and works there. The same logic on integer types
will lead to the issue.

Running it over LLVM gives the following results:

Diff Detail

Repository
rL LLVM

Event Timeline

JonasToth created this revision.Dec 1 2017, 10:15 AM
aaron.ballman edited edge metadata.Dec 3 2017, 7:59 AM

I would like to see the large test case added back in -- it was demonstrating a real problem with the code and it would be good to ensure we don't regress that behavior again in the future.

It does still regress! In a sense this is a WIP that i did not clarify.(which i do now)

While debugging the issue it seemed that parsing the AST did result in the stack overflow and not my check specific code.
Looping over all case Labels seems to be done in a recursive manner, which results in a massive stack (about 600 Frames or more while debugging). I am trying to better isolate the issue to be able to specify the problem more precise.

The odd thing to me was, that only the RelWithDebInfo build showed this behaviour. Building clang with MinSizeRel + Sanitizer did not show a stack overflow. I assume that the Debug Info increase the stack frame size. The release builds should overflow as well but with more case labels.

My goals for investigation:

  • Figure out exactly which part of the matcher is the problem
  • Rewrite the matchers and try to figure out a way to not use recursive functions to figure out the properties of the switch statement. Could information like CaseCount and hasDefault be interesting to have as member functions in the frontend? I think that can be added into Sema (i never touched anything there, so noo experience)
  • Figure out when the Release Build will stackoverflow.

Would it be acceptable to allow a Stackoverflow when there are thousands of case labels? If i understand correctly (and AST-dump suggests it as well) each case label is another layer in the tree. If that is true other parts of clang might suffer from a similar issue.

JonasToth retitled this revision from [clang-tidy] Resubmit hicpp-multiway-paths-covered without breaking test to [clang-tidy] WIP Resubmit hicpp-multiway-paths-covered without breaking test .Dec 4 2017, 3:56 AM

@aaron.ballman I further investigated and the issue does not come directly from my code (at least thats what i believe) :)

If i change the matcher simply to

Finder->addMatcher(switchStmt().bind("switch"), this);

The overflow still occurs! I assume that the recursive natur of ASTVisitor causes that issue. How should we go further? Maybe ask klimek for some advice?

JonasToth updated this revision to Diff 125556.Dec 5 2017, 9:33 AM
JonasToth retitled this revision from [clang-tidy] WIP Resubmit hicpp-multiway-paths-covered without breaking test to [clang-tidy] WIP Resubmit hicpp-multiway-paths-covered without breaking test.
  • [Misc] comment out matcher, make compile and runnable

@sbenza and @klimek the issue that occurs here is probably ASTMatchers related, thats why I added you.

The problem:

This patch introduces a stack overflow when building with RelWithDebInfo that is caught by the sanitizers. I isolated the issue in the big switch statements that has 256 case labels. On label 200 the overflow occurs, that is noted with a comment.
What I found while debugging is, that no matter which matcher I use the overflow occurs. The toplevel function is parseAST and then executing the frontend actions.

I think the problem is that every case label adds multiple recursive function calls coming from the nature of how case labels are stored in the AST. That produces the massive stack. Is there a way I could work around this Issue in the check?

@sbenza and/or @klimek did you have time to address the stackoverflow caused from the ASTMatchers?

JonasToth retitled this revision from [clang-tidy] WIP Resubmit hicpp-multiway-paths-covered without breaking test to [clang-tidy] Resubmit hicpp-multiway-paths-covered without breaking test.Jan 7 2018, 4:31 AM

@sbenza and/or @klimek did you have time to address the stackoverflow caused from the ASTMatchers?

I forgot to add the link to the original break: http://green.lab.llvm.org/green/job/clang-stage2-cmake-RgSan/5470/consoleFull#17462642768254eaf0-7326-4999-85b0-388101f2d404

  • get up to date to 7.0

After long inactivity (sorry!) i had a chance to look at it again:

switch(i) {
case 0:;
case 1:;
case 2:;
...
}

does *NOT* lead to the stack overflow. This is most likely an issue in the AST:
https://godbolt.org/g/vZw2BD

Empty case labels do nest, an empty statement prevents this. The nesting leads most likely to the deep recursion. I will file a bug for it.

After long inactivity (sorry!) i had a chance to look at it again:

switch(i) {
case 0:;
case 1:;
case 2:;
...
}

does *NOT* lead to the stack overflow. This is most likely an issue in the AST:
https://godbolt.org/g/vZw2BD

Empty case labels do nest, an empty statement prevents this. The nesting leads most likely to the deep recursion. I will file a bug for it.

FWIW here are my 5 cent: this is a preexisting bug. Your testcase just happened to expose it.
I'd file the bug, and then simply adjust the testcases here not to trigger it and re-land this diff.

I'm not sure what is to be gained by not doing that.
Of course, the bug is a bug, and should be fixed, but it exists regardless of this differential...

After long inactivity (sorry!) i had a chance to look at it again:

switch(i) {
case 0:;
case 1:;
case 2:;
...
}

does *NOT* lead to the stack overflow. This is most likely an issue in the AST:
https://godbolt.org/g/vZw2BD

Empty case labels do nest, an empty statement prevents this. The nesting leads most likely to the deep recursion. I will file a bug for it.

FWIW here are my 5 cent: this is a preexisting bug. Your testcase just happened to expose it.
I'd file the bug, and then simply adjust the testcases here not to trigger it and re-land this diff.

I'm not sure what is to be gained by not doing that.
Of course, the bug is a bug, and should be fixed, but it exists regardless of this differential...

Just because a particular check triggers an issue elsewhere doesn't mean we should gloss over the issue in the check. That just kicks the can farther down the road so that our users find the issue. In this case, the check is likely to push users *towards* discovering that AST bug rather than away from it which is why I'm giving the push-back.

alexfh removed a reviewer: alexfh.Mar 14 2018, 7:48 AM

Good news: The stack overflow seems to be fixed, by the patch r326624 from @rsmith (Thank you very much!).
Bad news: The check does not do its job anymore.... I try to fix the functionality problems.

JonasToth updated this revision to Diff 138849.Mar 18 2018, 7:23 AM
  • get the check working again
  • enable the big test
JonasToth updated this revision to Diff 138850.Mar 18 2018, 7:27 AM
  • reorder check in release notes to get it in alphabetical order

I think the check is ready to land. I did check run it over LLVM and found some interesting code parts that could benefit. I extracted all the warnings and sorted them into the categories missing default/covered codepath(uncovered.txt), better use if/else(better_if.txt) and use if(mandatory_if.txt). Some warnings come from generated files but some live in usercode. In total there are 540 warnings.

Please note, that i identified a common pattern to leave out the default label and add an ending llvm_unreachable that is outside the switch but ensures there is no unwanted fall_through. Most of the warnings for the first category are like this, but i could not check all of them!

JonasToth edited the summary of this revision. (Show Details)Mar 18 2018, 7:32 AM
JonasToth updated this revision to Diff 138851.Mar 18 2018, 7:35 AM
  • remove spurious debug iostream
aaron.ballman added inline comments.Mar 20 2018, 9:25 AM
clang-tidy/hicpp/MultiwayPathsCoveredCheck.cpp
69

Add some vertical whitespace before the comments.

76

Same here.

80

s/then/than

120

I think a better way to phrase this one is: "switch statement without labels has no effect" and perhaps have a fix-it to replace the entire switch construct with its predicate?

docs/clang-tidy/checks/hicpp-multiway-paths-covered.rst
96

Missing whitespace after "missing". May want to clarify that that this is a Boolean option.

JonasToth updated this revision to Diff 139293.Mar 21 2018, 7:28 AM
JonasToth marked 4 inline comments as done.
  • adress review comments
clang-tidy/hicpp/MultiwayPathsCoveredCheck.cpp
120

The fixit would only aim for the sideeffects the predicate has, right? I would consider such a switch as a bug or are there reasonable things to accomplish? Given its most likely unintended code i dont think a fixit is good.

Fixing the code removes the warning and might introduce a bug.

aaron.ballman accepted this revision.Mar 21 2018, 8:18 AM

LGTM!

clang-tidy/hicpp/MultiwayPathsCoveredCheck.cpp
120

This code pattern comes up with machine-generated code with some frequency, so I was thinking it would be nice to automatically fix that code. However, I think you're right that "fixing" the code may introduce bugs because you don't want to keep around non-side effecting operations and that's complicated. e.g.,

switch (i) { // Don't replace this with i;
}

switch (some_function_call()) { // Maybe replace this with some_function_call()?
}

switch (i = 12) { // Should definitely be replaced with i = 12;
}

Perhaps only diagnosing is the best option.

This revision is now accepted and ready to land.Mar 21 2018, 8:18 AM
JonasToth updated this revision to Diff 139304.Mar 21 2018, 8:32 AM
JonasToth marked 3 inline comments as done.
  • add fixme for degenerated switch
JonasToth added inline comments.Mar 21 2018, 8:33 AM
clang-tidy/hicpp/MultiwayPathsCoveredCheck.cpp
120

I will add another FIXME. The if-is-better pattern might be reasonable transformable and doing this allows addressing the issue again later.

JonasToth marked an inline comment as done.Mar 21 2018, 8:33 AM
JonasToth closed this revision.Mar 21 2018, 8:43 AM