This is an archive of the discontinued LLVM Phabricator instance.

[-Wunsafe-buffer-usage][PoC][WIP] Add #include <span> to emitted Fix-Its
Needs ReviewPublic

Authored by ziqingluo-90 on Feb 8 2023, 11:15 PM.

Diff Detail

Event Timeline

jkorous created this revision.Feb 8 2023, 11:15 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 8 2023, 11:15 PM
jkorous requested review of this revision.Feb 8 2023, 11:15 PM

Note: The naive strategy doesn't add #include <span> to header files declaring a function which parameter we change.
We might have to track all the files we add a reference to a new type and make sure all of those have the necessary headers included.

ziqingluo-90 commandeered this revision.Apr 3 2023, 5:35 PM
ziqingluo-90 edited reviewers, added: jkorous; removed: ziqingluo-90.
ziqingluo-90 updated this revision to Diff 510651.EditedApr 3 2023, 5:50 PM

To emit an insertion of #include ... to each unique source file of fix-its associated to a variable VD according to VD's strategy.

Add tests:

  1. If the insertion conflicts with other fix-its of VD, give up on fixing VD;
  2. If there is already an equivalent #include ... in a file, do not emit the insertion to the file;
  3. Included files (headers) will be inserted #include ..., if they have unsafe variables or parameters.
t-rasmud added inline comments.May 5 2023, 9:52 AM
clang/lib/Analysis/UnsafeBufferUsage.cpp
2253

Should the code that adds #include <span> be placed after this check that might erase a fixit for VD?

ziqingluo-90 marked an inline comment as done.May 5 2023, 11:26 AM
ziqingluo-90 added inline comments.
clang/lib/Analysis/UnsafeBufferUsage.cpp
2253

One of the reasons that I place "conflict/macro overlap" check after the header insertion is that an inserted header could conflict with other fix-its. In that case, they should all be erased.

I agree that if we do the check earlier, we don't have to insert headers (which is expensive now) for erased fix-its. It is a performance gain for sure but the conflict aforementioned could happen.

I'm working on a performance optimization for this patch. Currently every variable fix-it invokes the construction of tooling::HeaderIncludes, in which the source file is scanned slowly. Even if there is just one single source file, the file is scanned n times, where n is the number of variables being fixed. Optimization will make sure that one source file is scanned at most once.

t-rasmud added inline comments.May 8 2023, 10:39 AM
clang/lib/Analysis/UnsafeBufferUsage.cpp
2253

Thank you for explaining your approach to me. Looking forward to the optimizations!

ziqingluo-90 marked an inline comment as done.

Use a cache for loaded HeaderIncludes for a translation unit (TU). Loading HeaderIncludes is expensive so that we only want to load it for a unique file at most once per TU.

A class HeaderIncludesCache was created to represent the cache. One instance of it is created at AnalysisBasedWarnings::IssueWarnings for the whole TU.

NoQ added a comment.May 9 2023, 5:41 PM

Caching is great!

But IIRC even a single invocation of tooling::HeaderIncludes() is too slow by compiler warning performance standards. We most likely need more optimizations before we can land this patch :(

clang/include/clang/Analysis/Analyses/UnsafeBufferUsage.h
50–56

Interesting, how does this really work when there's a lot of files? Do we add #include <span> in every file in which we introduce a use? This is probably always the correct thing to do, and it probably aligns pretty well with the include-what-you-use paradigm, but a many people don't actually follow that paradigm. In particular, if you include <span> in a header, you no longer need to include it again in the .cpp file. Maybe there's room for more savings there.