This is an archive of the discontinued LLVM Phabricator instance.

[analyzer] Create a stub for an std::variant checker
Needs ReviewPublic

Authored by spaits on Jan 23 2023, 5:57 AM.

Details

Summary

As per the discussion on this thread i have started working on an std::variant checker.
https://discourse.llvm.org/t/analyzer-new-checker-for-std-any-as-a-bsc-thesis/65613/2

This patch is mostly a conversation starter. @Szelethus will supervise me on this project, and we shall discuss our next steps here.

Diff Detail

Event Timeline

spaits created this revision.Jan 23 2023, 5:57 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 23 2023, 5:57 AM
spaits requested review of this revision.Jan 23 2023, 5:57 AM

I'm very much interrested. I think you should define a mock standard variant header, that you could include from your tests.

Thank you for your response @steakhal! I am going to define a mock standard variant header.
I was thinking about just copying the definitions from the original variant header
but it had other headers as dependency. I will write my own variant definition one step at the time.
First I will start with a very simple one. Then I am going to add the definitions
that are needed to test the latest changes in the checker.

spaits updated this revision to Diff 491816.Jan 24 2023, 8:25 AM

Adding mock header for std::variant.

NoQ added a comment.Jan 24 2023, 2:32 PM

Interesting, what specific goals do you have here? Are you planning to find specific bugs (eg. force-unwrap to a wrong type) or just to model the semantics? In the latter case, have you explored the possibility of force-inlining the class instead, like I suggested in the thread? Have you found a reasonable amount of code that uses std::variant, to test your checker on?

clang/include/clang/StaticAnalyzer/Checkers/Checkers.td
312

Maybe alpha.cplusplus?

Interesting, what specific goals do you have here? Are you planning to find specific bugs (eg. force-unwrap to a wrong type) or just to model the semantics?

Hi!

Meant to write this comment yesterday, but here we go. My idea was aiming for both of the goals you mentioned:

  1. Emit diagnostics on improper usages of std::variant, which mostly boils down retrieving a field through a wrong type, yes. This will be, in my view, the main showpiece of the thesis.
  2. Model the semantics.

In this or in a followup patch I think we should demonstrate with a few tests what we expect the checker to be capable of.

In the latter case, have you explored the possibility of force-inlining the class instead, like I suggested in the thread?

I have done some tests then, and figured that the analyzer can't really follow what happens in std::variant. I admit, now that I've taken a more thorough look, I was wrong! Here are some examples.

std::variant<int, std::string> v = 0;
int i = std::get<int>(v);
clang_analyzer_eval(i == 0); // expected-warning{{TRUE}}
10 / i; // FIXME: This should warn for division by zero!
std::variant<int, std::string> v = 0;
std::string *sptr = std::get_if<std::string>(&v);
clang_analyzer_eval(sptr == nullptr); // expected-warning{{TRUE}}
*sptr = "Alma!"; // FIXME: Should warn, sptr is a null pointer!
void g(std::variant<int, std::string> v) {
  if (!std::get_if<std::string>(&v))
    return;
  int *iptr = std::get_if<int>(&v);
  clang_analyzer_eval(iptr == nullptr); // expected-warning{{TRUE}}
  *iptr = 5; // FIXME: Should warn, iptr is a null pointer!
}

In neither of these cases was a warning emitted, but that was a result of bug report suppression, because the exploded graph indicates that the errors were found.

We will need to teach these suppression strategies in these cases, the heuristic is too conservative, and I wouldn't mind some NoteTags to tell the user something along the lines of "Assuming this variant holds and std::string value".

Have you found a reasonable amount of code that uses std::variant, to test your checker on?

Acid has a few, csv-parser in one place, there is a couple in config-loader and cista, and a surprising amount in userver. Though its a question how painful it is to set up their dependencies.

I would be interested in some of the free-functions dealing with variants, such as std::visit(). https://godbolt.org/z/bbocrz4dG
I hope that's also on the radar.

Have you found a reasonable amount of code that uses std::variant, to test your checker on?

Acid has a few, csv-parser in one place, there is a couple in config-loader and cista, and a surprising amount in userver. Though its a question how painful it is to set up their dependencies.

I think bitcoin and qtbase also have some uses for variant.

NoQ added a comment.Jan 25 2023, 7:32 PM

In the latter case, have you explored the possibility of force-inlining the class instead, like I suggested in the thread?

I have done some tests then, and figured that the analyzer can't really follow what happens in std::variant. I admit, now that I've taken a more thorough look, I was wrong! Here are some examples.

std::variant<int, std::string> v = 0;
int i = std::get<int>(v);
clang_analyzer_eval(i == 0); // expected-warning{{TRUE}}
10 / i; // FIXME: This should warn for division by zero!
std::variant<int, std::string> v = 0;
std::string *sptr = std::get_if<std::string>(&v);
clang_analyzer_eval(sptr == nullptr); // expected-warning{{TRUE}}
*sptr = "Alma!"; // FIXME: Should warn, sptr is a null pointer!
void g(std::variant<int, std::string> v) {
  if (!std::get_if<std::string>(&v))
    return;
  int *iptr = std::get_if<int>(&v);
  clang_analyzer_eval(iptr == nullptr); // expected-warning{{TRUE}}
  *iptr = 5; // FIXME: Should warn, iptr is a null pointer!
}

In neither of these cases was a warning emitted, but that was a result of bug report suppression, because the exploded graph indicates that the errors were found.

We will need to teach these suppression strategies in these cases, the heuristic is too conservative, and I wouldn't mind some NoteTags to tell the user something along the lines of "Assuming this variant holds and std::string value".

Yeah, right, these are probably inlined defensive check suppressions (type-2, the ones with "let's early-return null"). So you'll need to see how these inlined stack frames look like and what makes them different from the unwanted case. (And it's possible that there are no formal differences, other than the programmer's "intention"; but it could also be something really simple).

So in any case, I'm really curious whether there's any solution besides "let's model everything manually", given that we've spent two summers modeling unique_ptr manually and never finished it.

While we were there, we also dug into std::any, and learned that the analyzer can model it shockingly well. Hopefully we can submit a few patches that demonstrates it in a form of some test files.

I would be interested in some of the free-functions dealing with variants, such as std::visit(). https://godbolt.org/z/bbocrz4dG
I hope that's also on the radar.

Thank you so much!! We definitely intend to work on this issue, and haven't thought of it before your suggestion.