This is an archive of the discontinued LLVM Phabricator instance.

[Consumed] Refactor handleCall to take function argument list. NFC.
AcceptedPublic

Authored by comex on Sep 16 2019, 8:00 PM.

Details

Reviewers
dblaikie
Summary

Currently, handleCall takes a CallExpr as an argument and retrieves the arguments from there. This changes it to take the argument list as a separate argument of type ArrayRef.

The main motivation that I want VisitCXXConstructExpr to also be able to invoke handleCall, even though CXXConstructExpr is not a subclass of CallExpr. But actually adding that will wait for a future patch.

However, this also fixes a minor bug in VisitCXXOperatorCallExpr:

if (const auto *MCall = dyn_cast<CXXMemberCallExpr>(Call))
  handleCall(MCall, MCall->getImplicitObjectArgument(), FunDecl);
else
  handleCall(Call, Call->getArg(0), FunDecl);

Call here is a CXXOperatorCallExpr, and CXXMemberCallExpr is a sibling class, so the cast will never succeed. CXXMemberCallExprs go through a separate visitor, VisitCXXMemberCallExpr.

On the other hand, this logic may be intended to reflect the fact that C++ operators can be declared as either methods or free functions. The correct way to differentiate these is shown at the beginning of handleCall:

unsigned Offset = 0;
if (isa<CXXOperatorCallExpr>(Call) && isa<CXXMethodDecl>(FunD))
  Offset = 1;  // first argument is 'this'

For CXXOperatorCallExprs, if the decl is a CXXMethodDecl, the first argument is this; otherwise there is no this.

Going back to the other first: currently, since the dyn_cast always fails, it always passes Call->getArg(0) as ObjArg (i.e. the expression representing this), even for operators declared as free functions. However, this is harmless, because ObjArg is only used if the function is marked as one of set_typestate or test_typestate, or callable_when, yet those attributes are only allowed on methods. Call->getArg(0) will crash if there are zero arguments, but the only kind of non-method operator function allowed to have zero arguments is a user-defined literal, and those do not produce CXXOperatorCallExprs.

The bug could be fixed by changing the first snippet to check for CXXMethodDecl like the second one, but this refactor makes things cleaner by only having to check in one place.

Diff Detail

Event Timeline

comex created this revision.Sep 16 2019, 8:00 PM
dblaikie added inline comments.Sep 17 2019, 1:40 PM
lib/Analysis/Consumed.cpp
752 ↗(On Diff #220421)

probably use "Call->arguments()" here & in the other places that need an range of the arguments? (& if that can be an ArrayRef - updating CallExpr's arguments() and the arg_range and const_arg_range to use/be ArrayRef would be good)

comex added a comment.Sep 17 2019, 5:41 PM

Ugh, it looks like getArgs() is a massive strict aliasing violation. And it's used in enough different places in Clang that fixing the violation may require extensive refactoring... I've reported the issue as https://bugs.llvm.org/show_bug.cgi?id=43344, but I don't really want to fix it :)

However, I can at least update this patch to avoid using getArgs().

Ugh, it looks like getArgs() is a massive strict aliasing violation. And it's used in enough different places in Clang that fixing the violation may require extensive refactoring... I've reported the issue as https://bugs.llvm.org/show_bug.cgi?id=43344, but I don't really want to fix it :)

However, I can at least update this patch to avoid using getArgs().

Yeah, existing technical debt - I don't think your situation is making that any worse or better, so no worries about it.

comex updated this revision to Diff 220771.Sep 18 2019, 5:18 PM

Here's a new version of the patch that uses iterator ranges instead of ArrayRef, to avoid adding new uses of CallExpr::getArgs() in case it has to be removed in the future due to the strict aliasing issue.

To support this, the following additional changes are included:

  • Fix CastIterator's use of iterator_adaptor_base so that operator[] isn't broken.
  • Add an operator[] implementation to iterator_range.
  • Improve the existing llvm::drop_begin helper to assert that it doesn't go out of bounds. (The implementation would be a lot less ugly if we could use C++17 features; oh well.)
Herald added a project: Restricted Project. · View Herald TranscriptSep 18 2019, 5:18 PM

Here's a new version of the patch that uses iterator ranges instead of ArrayRef, to avoid adding new uses of CallExpr::getArgs() in case it has to be removed in the future due to the strict aliasing issue.

Not related to the strict aliasing issue - just seemed like it'd be tidier than reconstructing ranges manually from getArgs and getNumArgs.

To support this, the following additional changes are included:

I wasn't expecting it to involve all this work - but instead to change "arguments()" to return ArrayRef. Though perhaps you didn't go that route to allow future fixes to the strict aliasing (by having an adapting iterator - or perhaps arguments() already uses such an iterator that doesn't hit the strict aliasing issue?) needing to do even more cleaunp work?

  • Fix CastIterator's use of iterator_adaptor_base so that operator[] isn't broken.
  • Add an operator[] implementation to iterator_range.

I'm not sure that's appropriate - not all types with begin() and end() (even those with random access iterators) would have op[], so it doesn't seem appropriate to add it/depend on it?

  • Improve the existing llvm::drop_begin helper to assert that it doesn't go out of bounds. (The implementation would be a lot less ugly if we could use C++17 features; oh well.)
comex added a comment.Sep 18 2019, 6:02 PM

I wasn't expecting it to involve all this work - but instead to change "arguments()" to return ArrayRef. Though perhaps you didn't go that route to allow future fixes to the strict aliasing (by having an adapting iterator - or perhaps arguments() already uses such an iterator that doesn't hit the strict aliasing issue?) needing to do even more cleaunp work?

Yeah, arguments() returns an adapting iterator.

  • Add an operator[] implementation to iterator_range.

I'm not sure that's appropriate - not all types with begin() and end() (even those with random access iterators) would have op[], so it doesn't seem appropriate to add it/depend on it?

Random access iterators are supposed to have operator[], according to:
http://www.cplusplus.com/reference/iterator/RandomAccessIterator/

But the use of a template ensures that it doesn't cause an error with types that don't have operator[], whether they are marked as random-access or not.

I checked the spec for "real" C++20 ranges... it seems that random access ranges don't always have operator[], but "view_interface" (which is a type of range) does:
https://eel.is/c++draft/view.interface

Anyway, I don't think it hurts to have it on this little imitation.

I wasn't expecting it to involve all this work - but instead to change "arguments()" to return ArrayRef. Though perhaps you didn't go that route to allow future fixes to the strict aliasing (by having an adapting iterator - or perhaps arguments() already uses such an iterator that doesn't hit the strict aliasing issue?) needing to do even more cleaunp work?

Yeah, arguments() returns an adapting iterator.

Right - I was suggesting that could be changed. Would it be OK to you to change arguments() to return ArrayRef? Or would you rather avoid that to avoid baking in more dependence on that alias violation?

  • Add an operator[] implementation to iterator_range.

I'm not sure that's appropriate - not all types with begin() and end() (even those with random access iterators) would have op[], so it doesn't seem appropriate to add it/depend on it?

Random access iterators are supposed to have operator[], according to:
http://www.cplusplus.com/reference/iterator/RandomAccessIterator/

Yep, I'm with you there

But the use of a template ensures that it doesn't cause an error with types that don't have operator[], whether they are marked as random-access or not.

I checked the spec for "real" C++20 ranges... it seems that random access ranges don't always have operator[],

Yeah, that's where I've got different feelings - a general purpose utility to take any range (thing with begin/end) over random access iterators and do indexed lookup, I'd be fine with that (implemented as "r.begin()[n]") but adding op[] to the range itself then leads code to depend on that feature which isn't a general purpose range feature - so changing this iterator_range to some other range implementation might break that code - it'd be nice to keep iterator_range's interface minimal to allow implementation flexibility in the APIs that expose it.

Same reason I've pushed back on adding things like "empty()" or "size()", etc. Which are fine (& have been added in STLExtras) as free functions.

but "view_interface" (which is a type of range) does:
https://eel.is/c++draft/view.interface

Fair - guess they've gone in a slightly different direction than we have in LLVM for now. I don't feel super strongly about it - but a bit.

Anyway, I don't think it hurts to have it on this little imitation.

comex updated this revision to Diff 220926.Sep 19 2019, 5:42 PM

Right - I was suggesting that could be changed. Would it be OK to you to change arguments() to return ArrayRef? Or would you rather avoid that to avoid baking in more dependence on that alias violation?

I'd rather avoid that for the reason you said.

Same reason I've pushed back on adding things like "empty()" or "size()", etc. Which are fine (& have been added in STLExtras) as free functions.

Okay, I'll just do that then. Added as index(Range, N), matching range-v3 (though not the actual C++20 draft).

dblaikie accepted this revision.Sep 23 2019, 12:20 PM

Sounds good - thanks!

llvm/include/llvm/ADT/iterator_range.h
27–33

Is this how the C++ standard is doing things these days? Or is it usually done with variable templates?

This revision is now accepted and ready to land.Sep 23 2019, 12:20 PM
comex marked 2 inline comments as done.Sep 26 2019, 2:47 PM
comex added inline comments.
llvm/include/llvm/ADT/iterator_range.h
27–33

"Type trait"-like things have gone through three different evolutions:

  • C++11 struct templates: std::is_integral<T>::value
  • C++17 variable templates: std::is_integral_v<T>
  • C++20 concepts: std::integral<T>

In fact, the C++20 draft has a random_access_iterator<T> concept, though there is no is_random_access_iterator in prior versions.

However, with LLVM still built as C++11, this helper has to be a struct template or a constexpr function. It seemed a bit simpler to use a constexpr function, since template specialization wasn't needed.

dblaikie added inline comments.Sep 26 2019, 2:53 PM
llvm/include/llvm/ADT/iterator_range.h
27–33

Fair enough - thanks for the details :)

comex updated this revision to Diff 225557.Oct 17 2019, 6:48 PM
comex marked 2 inline comments as done.

So, I landed this patch but had to revert it as it broke the build on MSVC (and only MSVC). That was almost a month ago, but I haven't gotten back around to this until now – sorry :|

In this updated patch, I switched is_random_iterator from a constexpr function:

template <typename T>
constexpr bool is_random_iterator() {
  return std::is_same<
    typename std::iterator_traits<T>::iterator_category,
    std::random_access_iterator_tag>::value;
}

to a type alias:

template <typename T>
using is_random_iterator =
  std::is_same<typename std::iterator_traits<T>::iterator_category,
               std::random_access_iterator_tag>;

and changed the uses accordingly. For some reason, MSVC thought the two overloads of drop_begin, one with enable_if<!is_random_iterator<T>()> and one with enable_if<is_random_iterator<T>()>, were duplicate definitions. But with is_random_iterator changed to a type alias, MSVC is fine with them. GCC and Clang think both versions are valid, so I think it's just an MSVC bug.

Simplified example for reference: https://gcc.godbolt.org/z/niXCy4

So, I landed this patch but had to revert it as it broke the build on MSVC (and only MSVC). That was almost a month ago, but I haven't gotten back around to this until now – sorry :|

In this updated patch, I switched is_random_iterator from a constexpr function:

template <typename T>
constexpr bool is_random_iterator() {
  return std::is_same<
    typename std::iterator_traits<T>::iterator_category,
    std::random_access_iterator_tag>::value;
}

to a type alias:

template <typename T>
using is_random_iterator =
  std::is_same<typename std::iterator_traits<T>::iterator_category,
               std::random_access_iterator_tag>;

and changed the uses accordingly. For some reason, MSVC thought the two overloads of drop_begin, one with enable_if<!is_random_iterator<T>()> and one with enable_if<is_random_iterator<T>()>, were duplicate definitions. But with is_random_iterator changed to a type alias, MSVC is fine with them. GCC and Clang think both versions are valid, so I think it's just an MSVC bug.

Simplified example for reference: https://gcc.godbolt.org/z/niXCy4

Sounds good to me. Maybe leave a comment explaining why a constexpr function was not used?