This is an archive of the discontinued LLVM Phabricator instance.

[analyzer][docs] Add Configuration section to user-docs
Needs ReviewPublic

Authored by steakhal on Nov 2 2021, 4:28 AM.

Details

Summary

This change is twofold.

  1. Introduces the Configuration subpage under the user-docs of the analyzer.
  2. Adds a section describing the modeling of flexible array members.

This page would house a more user-friendly digest of the most useful analyzer config options.
It's not aimed to be a complete list of the options, rather be first aid for the users if the analysis has some unintended behavior, such as false positives regarding e.g. FAMs for form 3.
They could use this page to help them decide if there is a known option that they missed and give guidance on how to look for more config options if these don't solve their issue.

Diff Detail

Event Timeline

steakhal created this revision.Nov 2 2021, 4:28 AM
steakhal requested review of this revision.Nov 2 2021, 4:28 AM

Big yes to the idea, the config of choice as the first to document seems fine, but I'm not sure we should lecture the user right in the very first paragraph. How about a short description of what a FAM is in the first place, than examples of the 3 types of it, and why we assume that only the first and the second to be such by default. If you insist, a word of caution would be okay in the last paragraph.

steakhal planned changes to this revision.Nov 2 2021, 8:40 AM

Thanks for the remarks @Szelethus!
I'm gonna update accordingly.

steakhal updated this revision to Diff 384425.Nov 3 2021, 7:18 AM
  • Briefly introduce FAMs by quoting Wikipedia.
  • Use double backticks for inline code.
NoQ added a comment.Nov 3 2021, 8:28 AM

Some -analyzer-config flags indeed deserve better treatment. I appreciate the effort.

clang/docs/analyzer/user-docs/Configuration.rst
12

Most of these options are not intended for end users. Anything with -cc1 or -Xclang is not intended for end users. If end users need this, we should provide user-friendly options (eg., driver flags).

In fact, we already have an -analyzer-config flag in scan-build and scan-build --help already documents one user-facing flag:

-analyzer-config <options>

  Provide options to pass through to the analyzer's -analyzer-config flag.
  Several options are separated with comma: 'key1=val1,key2=val2'

  Available options:
    * stable-report-filename=true or false (default)
      Switch the page naming to:
      report-<filename>-<function/method name>-<id>.html
      instead of report-XXXXXX.html

This might be a better place to put this info. If you really want a more generic documentation page, you should still use a user-facing tool such as scan-build in your examples. Or anyhow redirect to whatever UI the user is using.

73

We recommend not using FAMs at all.

The linux kernel uses them all over the place and, let's face it, we're in no position to call Linus Torvalds a bad programmer and tell him to go fix his code before using our tools.

Thanks for the review @NoQ.

clang/docs/analyzer/user-docs/Configuration.rst
12

We mostly use CodeChecker, but I don't want to advocate any specific tool here. On the other hand, it would be still really useful for the users to get an idea of which options we encourage them to tinker with to see if it solves their issue before filing a ticket that the analyzer can't do something.
I'm not exactly sure how to do this without using the driver or advocating a specific tool.

Aside from this, scan-build doesn't seem to be the perfect place for this IMO.

Should I stick to documenting these in the scan-build perl file?

@NoQ @Szelethus @martong WDYT

73

Sure, I'm going to paraphrase this.

Szelethus added inline comments.Nov 9 2021, 5:20 AM
clang/docs/analyzer/user-docs/Configuration.rst
12

In fact, we already have an -analyzer-config flag in scan-build and scan-build --help already documents one user-facing flag:

We could consider adding all this to -analyzer-config-help, and maybe filter developer only configs to -analyzer-config-help, similar to how we do it for checkers.

Most of these options are not intended for end users. Anything with -cc1 or -Xclang is not intended for end users. If end users need this, we should provide user-friendly options (eg., driver flags).

I agree with this sentiment, but not at the cost of these things remaining undocumented. I personally support this patch, even if its precedes bumping some flags to the driver level. Mind that checkers can only be enabled through frontend flags, but are documented as user facing.

We could however omit those sentences that use -cc1. Just have an intro similar to https://clang.llvm.org/docs/analyzer/checkers.html. Maybe another sentence about "only those configs that are listed on this page are meant for end users!".

Aside from this, scan-build doesn't seem to be the perfect place for this IMO.

Agreed.

martong added inline comments.Nov 18 2021, 2:17 AM
clang/docs/analyzer/user-docs/Configuration.rst
12

I support @Szelethus 's viewpoint. We could have an -analyzer-config-help-developer (and possibly an -analyzer-config-help-alpha). Very similarly to what we already have with -analyzer-checker-option-help-*.
I'd expect 3 follow up patches then:

  1. Document the existing cli options. Currently there are many options undocumented:
...
  -analyzer-checker-help-developer
  -analyzer-checker-help  Display the list of analyzer checkers that are available
  -analyzer-checker-option-help-alpha
  -analyzer-checker-option-help-developer
  -analyzer-checker-option-help
...
  1. Implement -analyzer-config-help-developer, this would include all options that are intended for debugging.
  2. Implement -analyzer-config-help-alpha, this would include all options that are intended for brave and friendly users.

@NoQ What do you think?

martong added inline comments.Nov 18 2021, 2:22 AM
clang/docs/analyzer/user-docs/Configuration.rst
12
  1. Implement -analyzer-config-help-developer, this would include all options that are intended for debugging.
  2. Implement -analyzer-config-help-alpha, this would include all options that are intended for brave and friendly

And once we have this explicit categorization of config options then I think it is perfectly legit to have such a Configuration.rst that includes the user facing (non-alpha or alpha) options.

As a user, I'm not sure if this would be the right way moving forward. This patch conflates a lot of parallel (but related) executions of the idea. (I'm not against introducing points 1 and 2 in the same patch, but do not feel like it's right to do it this way!)

  • If I am having a bunch of false positives related to FAMs, I wouldn't think that looking at a file named Configuration is where I might find the answer.
  • If I want to consider further reading about what options I have to fine-tune the analyser (especially after realising that --analyser-help-whatever or CodeChecker is not explaining stuff to me in a satisfactory way...), then it looks weird to have a file which has a "<h2>" about Flexible array members and shows the config option at the end, after a lengthy introduction to the particular issue.

Document the existing cli options.

As far as I understand we are using TableGen for the flags in CSA too, right? I think TableGen could generate at least the skeleton of the documentation. We sure do it with the supported attributes. I do not advocate that the specific format there is the true solution, but certainly looks better organised than what we are proposing in this patch.

How I would like to enjoy reading about configuration options (without getting into the debate as to what options the user should know about in the first place...).
First, the global (i.e. analyser-wide) configuration options, listed, may or may not categorised, in a sorted way. The "primary key" should be the configuration option (or the conceptual group it belongs to). Maybe not show it like consider-checker-option-as-meaningful but "Consider checker option as meaningful" to make the writing be more human-friendly and less machine-generated. Conversely, that would mean grepping the actual flag will not show up anything unless we kind of repeat ourselves...

For each option, include a simple example, the very striking change that shows how the analysis will change depending on the value of the configuration.
(E.g. in the case of FAMs, make it simple: flag ON -> this warning happens, that warning doesn't. Flag OFF -> this warning does not happen, but that other one does!)
If there needs to be an elaborate documentation for the conceptual background of the configuration, I would prefer if it was its own page -- which is of course cross-referenced from the configuration list.
This page then would contain all the theoretical background and definition of all the various forms developers might end up defining FAMs, the discussion on which is standard, which is not supported or recommended, and why.

As to whether the checker-specific configs should be not listed here / listed but not elaborated here / properly elaborated here, I do not see the right answer. Perhaps the check-specific configuration should live together with the check's documentation (as a sub-heading), but follow the same format: simple example in the list, elaboration on its own page.

For reference, the Tidy convention is that there is a global list of all checks which is just a Table-of-Contents, not even a summary is shown there (which I believe is kind of a drawback of that structure, but let's take it as a given), and each check's documentation has its own elaboration of the options it can take.
Then again, Tidy doesn't really have global options that change a significant part of the behaviour for all, so elaboration per-checker is fine there.

I think we should document all clang analyzer options on this page (which is listed by clang -cc1 -analyzer-config-help"), where the section header should be the analyzer option name (e.g. consider-single-element-arrays-as-flexible-array-members).
I agree with @whisperity, that options, which affect the analysis should be described with code examples highlighting the changes the option causes in the analysis behaviour. (such as in case of consider-single-element-arrays-as-flexible-array-members, or widen-loops, unroll-loops etc.)

These descriptions would enable present and future front-end tool developers (scan-build, CodeChecker etc.) to understand and decide, which options they would like to forward to the users. For example "consider-single-element-arrays-as-flexible-array-members" can be presented to the end-user (who is running the anlaysis through CodeChekcer), but " ctu-invocation-list" may not (as CodeChecker configures up this parameter automatically in CTU analysis mode).

So we can regard all of these parameters as "developer" configuration options, but I think it is much better to document the config options such as "consider-single-element-arrays-as-flexible-array-members" _here_, in clang static analyzer, where it is implemented, instead of in CodeChecker where there is no related code. In CodeChecker docs we would like to give a reference to this config option and the related documentation.

So the primary consumer of this doc are clang front-end tool(scan-build,codechecker etc.) developers and secondarily users that are referred here (from those front-end tools).

I also support the idea of @whisperity of autogenerating the titles of this doc based on the TD file, but that could be done in a later step too.

So I think this patch could go in like this, with the change that the section title should not be "Flexible array member modeling", but "consider-single-element-arrays-as-flexible-array-members" and probably some example of reports that this may change.


How the user finds these pages and false positive related documentation?

When a user suspect a false positive report, (s)he will typically try to look up the doc by the emitting checker name name, such as https://clang.llvm.org/docs/analyzer/checkers.html#deadcode-deadstores-c
where he should find the description of potential false positives and checker options that may alter the behaviour (e.g. deadcode.DeadStores:WarnForDeadNestedAssignments). This how clang tidy does it (e.g. https://clang.llvm.org/extra/clang-tidy/checks/bugprone-signed-char-misuse.html)
Then, we could refer to global analyzer options from here, which may have an effect to the mentioned false positives.
So, I think we should require all checker developers to document all checker options in the (https://clang.llvm.org/docs/analyzer/checkers.html (./clang/docs/analyzer/checkers.rst)).

ASDenysPetrov added inline comments.Apr 14 2022, 3:02 AM
clang/docs/analyzer/user-docs/Configuration.rst
73

we're in no position to call Linus Torvalds a bad programmer

But still there are much more programmers that are not so good in programing than Linus, therefore this recommendation applies for them. That's the fact that int a[1] and int a[] bring different information. Most every book of best practice consider avoiding the code that can be misinterpreted or interpreted ambiguously even if the language syntax allows it.

Herald added a project: Restricted Project. · View Herald TranscriptApr 14 2022, 3:02 AM

Ping

We agreed with @steakhal that we should move this under the "Developer Docs" section. That way, the consider-single-element-arrays-as-flexible-array-members=true option remains hidden from the average users. But for advanced users/developers we can still point to the relevant options.

@steakhal, any update on this?