This is an archive of the discontinued LLVM Phabricator instance.

[Analyzer] Synthesize function body for call_once
ClosedPublic

Authored by george.karpenkov on Sep 13 2017, 6:03 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

NoQ added inline comments.Sep 14 2017, 8:58 AM
lib/Analysis/BodyFarm.cpp
87 ↗(On Diff #115148)

I guess it's supposed to be the type to which we cast, while the arg's type would be the type from which we cast?

266–268 ↗(On Diff #115148)

This function looks as if it begs to be reused from some other place, but i didn't immediately find a ready-made way to do this, which is surprising.

281 ↗(On Diff #115148)

return None;.

405–407 ↗(On Diff #115148)

No idea, but, as a bit of brainstorming, Objective-C++ sometimes seems to treat blocks and lambdas similarly, eg. CK_CopyAndAutoreleaseBlockObject cast kind casts from lambdas to blocks. So i'm probably curious if passing a block is possible here (which goes in the opposite direction).

421–424 ↗(On Diff #115148)

Just to make sure i understand - the name of the once-token must be std::once_flag, while the name of the field we're looking for is unspecified. So we wouldn't necessarily find the field called __state inside the function in the real-world code. Same for the possible values of this field and their meanings.

Could you comment that we're trying to match popular implementation(s) of once_flag? And if you're seeing potential better approaches.

681 ↗(On Diff #115148)

I think we have to check that this is in namespace std::, otherwise it might be a different function.

lib/Analysis/BodyFarm.cpp
87 ↗(On Diff #115148)

But in most cases we don't actually change the type,
e.g. just changing lvalue to rvalue.

266–268 ↗(On Diff #115148)

yeah we've talked about that with Devin.
It was initially mostly copied from ExprEngineWithCall (or a similar name),
but changed substantially.
That code can be rewritten to reuse this one now.
Maybe we should just expose all of ASTMaker?

lib/Analysis/BodyFarm.cpp
421–424 ↗(On Diff #115148)

So we wouldn't necessarily find the field called __state inside the function

It seems we would, from libc++ header:

struct _LIBCPP_TEMPLATE_VIS once_flag
{
    _LIBCPP_INLINE_VISIBILITY
    _LIBCPP_CONSTEXPR
        once_flag() _NOEXCEPT : __state_(0) {}

private:
    once_flag(const once_flag&); // = delete;
    once_flag& operator=(const once_flag&); // = delete;

    unsigned long __state_;

#ifndef _LIBCPP_CXX03_LANG
    template<class _Callable, class... _Args>
    friend
    void call_once(once_flag&, _Callable&&, _Args&&...);
#else  // _LIBCPP_CXX03_LANG
    template<class _Callable>
    friend
    void call_once(once_flag&, _Callable&);

    template<class _Callable>
    friend
    void call_once(once_flag&, const _Callable&);
#endif  // _LIBCPP_CXX03_LANG
};

Though other implementation might try to put more stuff into the .cpp file.

george.karpenkov edited the summary of this revision. (Show Details)

updated to work with blocks.

george.karpenkov marked 2 inline comments as done.Sep 14 2017, 2:14 PM
george.karpenkov added inline comments.
lib/Analysis/BodyFarm.cpp
405–407 ↗(On Diff #115148)

shockingly, blocks work.

george.karpenkov marked an inline comment as done.

This works with --analyze on an actual file which has #include<mutex>.

Checked for std:: namespace, left a comment that we are assuming libc++ implementation.

george.karpenkov marked an inline comment as done.Sep 14 2017, 3:49 PM
george.karpenkov added inline comments.
lib/Analysis/BodyFarm.cpp
421–424 ↗(On Diff #115148)

Added a comment that we are assuming libc++ implementation.

Add logging statements.
@NoQ that should be good to commit I think.

Removed TODO notes.
@NoQ @dcoughlin any further comments?

zaks.anna accepted this revision.Sep 18 2017, 7:47 PM

Please, commit. I think we can take further comments post-commit.

This revision is now accepted and ready to land.Sep 18 2017, 7:47 PM
NoQ accepted this revision.Sep 19 2017, 10:57 AM

Yeah, sounds good to me as well.

lib/Analysis/BodyFarm.cpp
87 ↗(On Diff #115148)

No, most casts actually change the type, even if implicit. Even CK_NoOp changes the type sometimes.

Actually not sure, do we need the default argument for CastKind?

test/Analysis/call_once.cpp
222 ↗(On Diff #115421)
dcoughlin edited edge metadata.Sep 19 2017, 8:32 PM

Some comments in line. Also, don't forget to run git clang-format on your changes to fix the indention and '*' placement for pointers.

lib/Analysis/BodyFarm.cpp
28 ↗(On Diff #115421)

We should probably follow the convention the rest of the analyzer uses have this be camel case ("BodyFarm").

53 ↗(On Diff #115421)

There a bunch of whitespace-only changes here. (Probably clang-format or your editor removed spaces on empty lines?) We generally try to avoid whitespace-only changes since they obscure the history of a commit.

285 ↗(On Diff #115421)

Why does findMember() return an optional of a pointer type? What is the meaning of 'Some(nullptr)' to the caller? Can the method just return NamedDecl * instead?

301 ↗(On Diff #115421)

Do we need to look up in the bases? There don't seem to be tests for this code. Should it be removed?

360 ↗(On Diff #115421)

It seems quite sketchy to me that create_call_once_lambda_call() mutates one of its reference parameters. It is also inserting at the beginning of a vector, which is expensive. Can you create the CallArgs vector in the caller and pass in an immutable llvm::ArrayRef to it instead? This is more idiomatic and will prevent surprises on the part of future callers from unexpected mutation.

403 ↗(On Diff #115421)

To be idiomatic, CallArgs should probably be an llvm::SmallVector. This will avoid a heap allocation.

266–268 ↗(On Diff #115148)

I'm a bit uncomfortable with returning just the first found member for member functions. Since member functions are often overloaded, anyone who uses this API in the future to find a member function will almost certainly be using it incorrectly. What do you think about restricting it to only find fields?

lib/Analysis/BodyFarm.cpp
360 ↗(On Diff #115421)

It is also inserting at the beginning of a vector, which is expensive

Yes, I have taken that into consideration (though realistically speaking we are talking about an array with a couple of elements, in a function called once per analysis).
The problem arises from the fact that CXXOperatorCallExpr does not autoinsert function reference in the beginning of call arguments, like parent implementation (maybe it is a bug which should be fixed separately).
Thus I haven't seen any other option if we want to reuse the code inserting parameters into the vector, while only inserting the function first parameter for CXX operator calls. Also note I can't easily use deque, as args type is not auto-construcable from it.

Can you create the CallArgs vector in the caller and pass in an immutable llvm::ArrayRef to it instead

Did you mean "callee" instead of "caller"? Yes, but from my understanding that would result in more code, and be less efficient than the current implementation (even though performance does not particularly matter in this case).

and will prevent surprises on the part of future callers from unexpected mutation

Why is it unexpected though? It is a mutable datastructure passed by reference, it is quite idiomatic to write into those.
Moreover, conceptually the mutation is needed if we want callArgs to correspond to the list of arguments passed to the call expression constructor.

Updated using review comments.

george.karpenkov marked 3 inline comments as done.Sep 21 2017, 5:12 PM
george.karpenkov added inline comments.
lib/Analysis/BodyFarm.cpp
28 ↗(On Diff #115421)

I have only found two files with DEBUG_TYPE in the analyzer, however.
Would that be called a convention, especially given that we are inside Clang, and the rest of Clang and LLVM codebase use dashes and lowercase?
Would seem to make more sense to rename DEBUG_TYPE in those files.

53 ↗(On Diff #115421)

Can't see any now?

285 ↗(On Diff #115421)

OK that depends on the style, let's change it to be more consistent with body generation.
If a function returns a pointer, it is not immediately apparent that it can be a nullptr, while it is with an optional.

301 ↗(On Diff #115421)

Good point, I've cargo-culted it from another file without realizing.
If we are not planning on generalizing this function/class (and it seems we are not doing it in this PR) then indeed it makes sense to remove it.

403 ↗(On Diff #115421)

ok

266–268 ↗(On Diff #115148)

OK let's restrict it to fields.

dcoughlin accepted this revision.Sep 29 2017, 4:34 PM

Ok. Looks fine to me. Just make sure to check out the raw diff for whitespace changes. I suspect your viewer is hiding them from you.

lib/Analysis/BodyFarm.cpp
53 ↗(On Diff #115421)

Make sure to look at the raw diff. I still see them.

360 ↗(On Diff #115421)

I'm not convinced the mutation is needed and I think you can rewrite it without it. I won't block this patch on that though.

This revision was automatically updated to reflect the committed changes.
george.karpenkov marked 3 inline comments as done.
zaks.anna added inline comments.Sep 29 2017, 10:36 PM
lib/Analysis/BodyFarm.cpp
360 ↗(On Diff #115421)

Generally, in LLVM codebase it's preferable to use immutable data structures wherever possible. That makes code safer and easier to understand.