Adds a new ASTMatcher condition called 'hasInitStatement()' that matches if, switch and range-for statements with an initializer. Reworked clang-tidy readability-else-after-return to handle variables in the if condition or init statements in c++17 ifs. Also checks if removing the else would affect object lifetimes in the else branch
Details
Diff Detail
Event Timeline
It does support the new range for loop, I'll add in a case for that in the AST matcher
Thank you for this!
There's some missing test coverage for the changes to the clang-tidy check and for the new matcher. Also, the new matcher should be registered in clang\lib\ASTMatchers\Dynamic\Registry.cpp (in alphabetical order). Oh, and you should also run clang\docs\tools\dump_ast_matchers.py to regenerate the AST matcher documentation.
clang/include/clang/ASTMatchers/ASTMatchers.h | ||
---|---|---|
4300 | Missing a full stop at the end of the sentence. |
Thanks for the pointers, I'll amend those changes and update the patch, thanks for pointing it out for me. Probably be after boxing day though
I added test cases for the matchers and the check, wasn't able to add the CXXRangeFor to the matcher as that class uses a different dialect for the initializer statement handling. I also couldn't generate the new documentation as the script always fails on my machine, even without the changes i have made, resulting in the ASTMatcher html being empty
Renamed 'hasInitStorage' to 'hasInitStatement'.
Added support for range for loops in 'hasInitStatement'.
'hasInitStatement' now has a match predicate on the init statement.
ReadabilityElseAfterReturn check now checks the init statement is a decl statement before disregarding it.
Added test cases for check and the matcher.
Regenerated the docs for the AST Matchers.
LGTM aside from a minor style nit.
clang/include/clang/ASTMatchers/ASTMatchers.h | ||
---|---|---|
4325–4326 | Can drop the top-level const qualifier on the pointer (we don't typically do that). And can remove the extra parens around the return expression. e.g., const Stmt *Init = Node.getInit(); return Init && InnerMatcher.matches(*Init, Finder, Builder); |
So I wasn't happy with the vagueness of the else after return check implementation. Almost finished rewriting the check to properly handle the if statements with condition variables based on scope restrictions and where decls are used etc.
int *varInitAndCondition() { if (int *X = g(); X != nullptr) { return X; } else { h(&X); return X; } } int *varInitAndConditionUnusedInElseWithDecl() { int *Y = g(); if (int *X = g(); X != nullptr) { return X; } else { int *Y = g(); h(&Y); } return Y; }
transforms to
int *varInitAndCondition() { int *X = g(); if (X != nullptr) { return X; } h(&X); return X; } int *varInitAndConditionUnusedInElseWithDecl() { int *Y = g(); if (int *X = g(); X != nullptr) { return X; } else { int *Y = g(); h(&Y); } return Y; }
There's a few more test cases but that's the general idea. I'm having a little trouble writing the test cases as I can't run them on my windows machine to verify they report correctly
Think all the test cases are working, but they look like they could do with a clean up from someone who may know the clang-tidy-checks system better than myself
Should also point out that clang-format now has a better time reformatting as i expanded the replacements range to include the entire else block rather than the start and end braces
Hmm, I'm not convinced the new behavior is better in all cases, and I suspect this may boil down to user preference (suggesting we may want an option to control the behavior). There are competing stylistic decisions to be made for varInitAndCondition() -- one is the else after a return, but the other is the scope of the variable X. The behavior now forces the user to prefer removing the else after the return over hiding the variable X in a logical scope. For the contrived example provided, this isn't a huge deal because the function is small. However, in real-world code with larger function bodies, I can see wanting to prefer the name hiding. WDYT?
Sorry I didn't make it obvious with the test cases. The fix will never extend the lifetime of variables either in the condition or declared in the else block. It will only apply if the if is the last statement in its scope. Though I should check back through the scope for any declarations that have the same identifier as those in the if statement which would cause an error if they were brought out of the if scope
Adding checks to see if there are any declarations already in the ifs contained scope is really starting to bloat the code while trying to cover every edge case.
Oh, thank you for the clarification, I hadn't picked up on that.
clang-tools-extra/clang-tidy/readability/ElseAfterReturnCheck.cpp | ||
---|---|---|
21 | Hmm, I just realized we missed co_return for this check. That's a separate patch, though, so don't feel the need to change anything about this here. | |
45 | The use of dyn_cast_or_null is suspicious in these methods -- if Node is null, then we'll get into the else clause and promptly dereference a null pointer. My guess is that these should be dyn_cast calls instead, but if Node can truly be null, that case should be handled. | |
50 | const auto *, no? (Same below.) | |
51 | const auto * for sure. (Same below.) | |
77 | const auto * (this comment applies throughout the patch -- we don't want to deduce qualifiers or pointer/references, so they should be spelled out explicitly.) | |
86 | const auto *, no? | |
123 | Please don't use auto here as the type is not spelled out in the initialization. Also, drop the top-level const qualifier. Same elsewhere. | |
161 | warn -> Warn |
I'm in two minds about issuing a warning when scope restrictions prevent a fix. Do you think creating an option to enable or disable emitting warnings for cases where the scope prevents a fix would be a good idea?
It's not uncommon for fixits to only be generated under specific circumstances, so I'm wondering what your concern is with warning when we can't provide a fixit? The cases I am thinking about all seem reasonable to diagnose (are true positives) without fixing, but maybe you have different circumstances in mind.
Right now an issue is raised for every else after return flag, but not all else after return flags can be fixed due to declaration statements and scope issues. My suggestion is that you can choose to warn about those cases or not. For example a developer may want else after return for when they need to limit the scope and getting a warning for it may be undesirable.
Okay, I can see the value in having an option for that -- especially given that silencing the diagnostic would add // NOLINT noise to the code. Do you think the option should default to diagnosing all cases, even ones without a fixit?
Definitely default to diagnosing everything, that's how my current implementation looks right now. Also is there a way to do options as bool because I don't particularly like putting 1 for true in the config file
SGTM
Also is there a way to do options as bool because I don't particularly like putting 1 for true in the config file
Not to my knowledge.
I just did a quick grep and found that all checks that take bools for config options use a 1 for true, 0 for false, so I guess best stick to convention rather than hack at OptionsView to get and set bools
Added option to disable warning when an automatic fix can't be applied due to scope restrictions of variables, default option is to show all warnings
The new option LGTM but one of the tests can be updated to have a less complex RUN line.
clang-tools-extra/test/clang-tidy/checkers/readability-else-after-return.cpp | ||
---|---|---|
2–3 | You don't need to explicitly set it to 1 here. |
Yes please, I'm still very new to the llvm contributing system so I would have no idea what to do next.
Thank you for the patch, I've committed on your behalf in ec3d8e61b527c6312f77a4dab095ffc34e954927
The use of dyn_cast_or_null is suspicious in these methods -- if Node is null, then we'll get into the else clause and promptly dereference a null pointer. My guess is that these should be dyn_cast calls instead, but if Node can truly be null, that case should be handled.