This is an archive of the discontinued LLVM Phabricator instance.

Fix uninitialized class members
ClosedPublic

Authored by akshaykhadse on Apr 19 2023, 1:15 AM.

Details

Summary

This patch fixes static code analysis warnings.

Diff Detail

Event Timeline

akshaykhadse created this revision.Apr 19 2023, 1:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 19 2023, 1:15 AM
akshaykhadse requested review of this revision.Apr 19 2023, 1:15 AM
Herald added a project: Restricted Project. · View Herald TranscriptApr 19 2023, 1:15 AM
akshaykhadse retitled this revision from Fix uninitialized class meembers to Fix uninitialized class members.Apr 19 2023, 1:19 AM

Fix failures

LuoYuanke added inline comments.Apr 19 2023, 7:38 AM
llvm/include/llvm/CodeGen/ReachingDefAnalysis.h
90

It seems in ReachingDefAnalysis::enterBasicBlock() CurInstr is reset to 0, so I wonder if it is better to initialized to 0.

llvm/lib/CodeGen/IfConversion.cpp
200–201

In test/CodeGen/AArch64/O3-pipeline.ll, this pass runs before the register allocation pass, so set it true may be better.

llvm/lib/CodeGen/ShrinkWrap.cpp
141
144

Ditto.

akshaykhadse added inline comments.Apr 19 2023, 8:42 AM
llvm/include/llvm/CodeGen/ReachingDefAnalysis.h
90

Setting it to 0 would imply that there is at least one instruction in the basic block. I am not sure if that would be true if this method would run on empty basic block. But, if we cna be sure that there are no empty basic blocks, we can set it to 0. Let me know what you think.

Address comments

akshaykhadse marked 4 inline comments as done.Apr 19 2023, 8:46 AM
This revision is now accepted and ready to land.Apr 19 2023, 4:21 PM

Failures are unrealted to this change. Same are present in previous reviews as well

This revision was landed with ongoing or failed builds.Apr 19 2023, 8:19 PM
This revision was automatically updated to reflect the committed changes.

The description is missing context, are these fixing bugs? How were these issues found?

akshaykhadse edited the summary of this revision. (Show Details)Apr 19 2023, 9:32 PM

The description is missing context, are these fixing bugs? How were these issues found?

I think these are to fix the issues that reported by Coverity/Klocwork tool.

The description is missing context, are these fixing bugs? How were these issues found?

I think these are to fix the issues that reported by Coverity/Klocwork tool.

For the record - these sort of changes may not be desirable.

Adding initializers to variables that don't need it (because through some dynamic path they are initialized before they're used) may hinder tools like msan from finding real bugs.

Adding initializers to variables that don't need it (because through some dynamic path they are initialized before they're used) may hinder tools like msan from finding real bugs.

It's can be useful for local variables. But for class members, especially complex, when half members are already initialized, it's seems cont-productive. We will more often introduce bugs then catch some with msan. Unless it's not perf issue, I would rather see consistent initialized state.
Dynamic checks do not cover all cases in testing, plus optimizations suppress some bugs.

Fopr logical bug I would rather recommend to set default to some invalid value, and then assert() or diagnostics.

Adding initializers to variables that don't need it (because through some dynamic path they are initialized before they're used) may hinder tools like msan from finding real bugs.

It's can be useful for local variables. But for class members, especially complex, when half members are already initialized, it's seems cont-productive.

Actually for class members there are cases where it comes to mind - when a member is only used under some substate of the object, etc. (we do also have compile time warnings to catch some of these cases too, that aren't just "make sure everything's always initialized at the declaration" - checking for private members that are only ever read and never written, which is admittedly a smaller set of cases)

We will more often introduce bugs then catch some with msan.

Happen to have data on this?

Unless it's not perf issue, I would rather see consistent initialized state.
Dynamic checks do not cover all cases in testing, plus optimizations suppress some bugs.

Fopr logical bug I would rather recommend to set default to some invalid value, and then assert() or diagnostics.

Not everything has invalid values.

In any case, I think it'd be a pretty major stylistic requirement for LLVM if we're going to adopt initializing members at declaration in all cases - maybe worth a style discussion on the forums, etc, before we go adding these where they aren't demonstrably needed/only motivated by a stylistic checker.

If we do want to adopt such a policy, it seems like a pretty easy one to enforce/check/etc, rather than these sporadic "someone ran a checker" situations.

Adding initializers to variables that don't need it (because through some dynamic path they are initialized before they're used) may hinder tools like msan from finding real bugs.

It's can be useful for local variables. But for class members, especially complex, when half members are already initialized, it's seems cont-productive.

Actually for class members there are cases where it comes to mind - when a member is only used under some substate of the object, etc. (we do also have compile time warnings to catch some of these cases too, that aren't just "make sure everything's always initialized at the declaration" - checking for private members that are only ever read and never written, which is admittedly a smaller set of cases)

We will more often introduce bugs then catch some with msan.

Happen to have data on this?

Not very scientific, just my experience. I have to do a lot of msan cleanups. 90%+ is pure UBs, like values loaded but not rely used. But it's UB and may trigger miss-compile, so need to be fixed. I've see logical bugs very rarely. So mindless init in most cases is a perfect fix.

Unless it's not perf issue, I would rather see consistent initialized state.
Dynamic checks do not cover all cases in testing, plus optimizations suppress some bugs.

Fopr logical bug I would rather recommend to set default to some invalid value, and then assert() or diagnostics.

Not everything has invalid values.

Yes, the point is usually always a better way to check invariant, than keeping variable uninitialized. Msan should be considered as the last resort.

In any case, I think it'd be a pretty major stylistic requirement for LLVM if we're going to adopt initializing members at declaration in all cases - maybe worth a style discussion on the forums, etc, before we go adding these where they aren't demonstrably needed/only motivated by a stylistic checker.

If we do want to adopt such a policy, it seems like a pretty easy one to enforce/check/etc, rather than these sporadic "someone ran a checker" situations.

Yes, discussed and agreed rule would be nice to have.

Still if we have no reverse rule as well, do we want to push back on such patches? That genuine question, I don't know what should be the answer.
I don't advocate for the patch. I just claim that intentionally keeping value uninitialized, to let dynamic tools to catch a bug, causes more maintenance issues, than "just-in-case" initialization.

BTW. @akshaykhadse In particular I don't like about the patch:
1 it changes many unrelated things in one patch.
2 it mention "static code analysis", without details.

  1. enforcing build bot would be nice, as nothing prevents anyone from reverting the patch. (unlikely possible to make people to respect the bot without formal rule).
barannikov88 added a subscriber: barannikov88.EditedMay 12 2023, 12:46 PM

I think it is more the static analyzer issue that it is not able to figure out that
the entry point of the class is not its constructor, but runOnMachineFunction,
which initializes most (or all) member variables before using them.

Not everything has invalid values.

This is a really good point. Initializing all fields to some default values will make
it more difficult to add a new filed that does not have an obvious default
(because, you know, consistency prevails).
Even in this patch some time has been spent discussing what's the best default
value for a particular variable, and I've seen more of such discussions in other
similar patches. That just wastes people's time without any benefit for human.

Finally, I don't remember any of the recent patches caught a real bug.

LuoYuanke added a comment.EditedMay 12 2023, 11:59 PM

It looks to me that initializing variable is a good practice to avoid security issue. If it doesn’t cause performance degradation, it should be better than uninitialized variable which may cause UB.
I once encountered issue that compiler behavior is not consistent when compiling several times, not sure it is related with uninitialized variable.