This is an archive of the discontinued LLVM Phabricator instance.

[ADT] Add llvm::unique_function which is like std::function but supporting move-only closures.
ClosedPublic

Authored by chandlerc on Jun 20 2018, 1:13 AM.

Details

Summary

Most of the core optimizations for std::function are here plus
a potentially novel one that detects trivially movable and destroyable
functors and implements those with fewer indirections.

This is especially useful as we start trying to add concurrency
primitives as those often end up with move-only types (futures,
promises, etc) and wanting them to work through lambdas.

As further work, we could add better support for things like const-qualified
operator()s to support more algorithms, and r-value ref qualified operator()s
to model call-once. None of that is here though.

We can also provide our own llvm::function that has some of the optimizations
used in this class, but with copy semantics instead of move semantics.

This is motivated by increasing usage of things like executors and the task
queue where it is useful to embed move-only types like a std::promise within
a type erased function. That isn't possible without this version of a type
erased function.

Diff Detail

Repository
rL LLVM

Event Timeline

chandlerc created this revision.Jun 20 2018, 1:13 AM

BTW, one question I have is what the best name for the file is here. function.h seems kinda lame, but not sure what else to call it. Suggestions very welcome here.

Not to suggest the obvious, but UniqueFunction.h?

Not to suggest the obvious, but UniqueFunction.h?

If people prefer that, totally fine with me.

I went with this because we might want to merge llvm::function_ref into this file. And we might want to add llvm::function rather than relying on std::function given the amount of churn in that part of the standard library.

In that case maybe FunctionExtras.h in keeping with the llvm naming
spirit

timshen added inline comments.Jun 20 2018, 3:38 PM
llvm/include/llvm/ADT/function.h
9 ↗(On Diff #152031)

Can we add a bit of future plan? Is it going to merge with the future-standard unique_function, if exist at all?

54 ↗(On Diff #152031)

teh duh!

80 ↗(On Diff #152031)

What's the rationale of choosing this specific data layout? Currently it's

tagged_union { CallPtrT, struct { CallPtrT, MovePtrT, DestroyPtrT }* }

What makes us want it instead of, for example:

struct { CallPtrT, /* nullable */ struct { MovePtrT, DestroyPtrT}* }

Or even:

// MoveOrDestroyPtrT does either Move or Destroy depending on an extra parameter.
struct { CallPtrT, /* nulltable */ MoveOrDestroyPtrT }

?

I'm not saying the examples are better, but just want more details on the decision.

227 ↗(On Diff #152031)

Do we want to perfect-forward the parameters?

dberris added inline comments.
llvm/include/llvm/ADT/function.h
187 ↗(On Diff #152031)

Do you still need the std::move(...) here?

chandlerc marked 3 inline comments as done.Jun 21 2018, 12:44 AM

Thanks for the comments! Some responses below...

llvm/include/llvm/ADT/function.h
80 ↗(On Diff #152031)

The goal is to minimize the footprint by only using a single pointer within the object. Improved the comment to include this and not reference the prior design of a base class.

187 ↗(On Diff #152031)

Yeah, within the function its an L-value reference and so this will copy otherwise. Same reason you need std::forward w/ a forwarding reference.

Address comments from review, beefing up tests along the way to cover this
stuff much more carefully.

Also fixes several bugs uncovered with better testing and adds a missing
interface (bool conversion).

In that case maybe FunctionExtras.h in keeping with the llvm naming
spirit

I'm fine with the idea of this, but I'm always frustrated by the generic "extras" name component. Feels about as useless at "Utils". But I lack a significantly better idea....

Closest thing to a better idea I have is FunctionTypeErasure.h or CallableTypeErasuer.h. Thoughts? Other folks with naming thoughts here?

chandlerc updated this revision to Diff 152239.Jun 21 2018, 1:47 AM

Actually include some of the new tests and fix minor bugs they found.

timshen added inline comments.Jun 21 2018, 10:20 AM
llvm/include/llvm/ADT/function.h
236 ↗(On Diff #152239)

I think I understand the perfect-forwarding problem better now. Suppose we have the following user code:

unique_function<void(string)> f(...);
string a;
f(a);

Is the string copy-constructed once or twice? I figured that it'd be twice, (1) when calling operator(), (2) when calling CallImpl().

Regardless of the solution, can we add a test case for that?

chandlerc updated this revision to Diff 153607.Jun 29 2018, 6:24 PM

Add even more testing and rework how we forward parameters through the various
level of call wrapping and type erasure. This minimizes the copies and moves
when those do occur. We also use an optimization to keep things pass-by-value
where it will end up especially beneficial and can't be observed.

chandlerc marked an inline comment as done.Jun 29 2018, 6:51 PM

Thanks so much for the careful review Tim!

llvm/include/llvm/ADT/function.h
236 ↗(On Diff #152239)

Ah, there is an interesting case here. I've added tests that much more carefully cover all the variations of both functionality and the number of moves and copies triggered.

There is a fair amount of logic that goes into getting this right however, so please do look through the updated code.

timshen added inline comments.Jul 2 2018, 1:48 PM
llvm/include/llvm/ADT/FunctionExtras.h
161 ↗(On Diff #153607)

The default ctor doesn't initialize CallbackAndInlineFlag to a valid empty state, does it? We probably want to zero-initialize it.

std::function's default ctor is not trivial either.

181 ↗(On Diff #153607)

if (!(bool)RHS) { ... }

chandlerc updated this revision to Diff 153819.Jul 2 2018, 3:57 PM
chandlerc marked 2 inline comments as done.

Minor fix from code review. Also adjusted the order of things in the private section of the class to be more readable by putting the two data members together.

llvm/include/llvm/ADT/FunctionExtras.h
161 ↗(On Diff #153607)

Yep, the underlying value is set to zero by an in-class initializer.

This isn't about making something trivial, its just the simplest implementation.

181 ↗(On Diff #153607)

Not 100% sure what you're going for here, but yes I should bail out early when RHS is null. Let me know if you intended something else.

timshen accepted this revision.Jul 2 2018, 4:28 PM
This revision is now accepted and ready to land.Jul 2 2018, 4:28 PM
This revision was automatically updated to reflect the committed changes.
yroux added a subscriber: yroux.Jul 5 2018, 4:38 AM

Hi Chandler,

The added test cases are broken on ARM bots, logs are available at:

http://lab.llvm.org:8011/builders/clang-cmake-thumbv7-full-sh/builds/464

Thanks
Yvan

I've (finally) landed r336337 which just completely moves away from using a
function pointer packed into a pointerunion. =/

yroux added a comment.Jul 5 2018, 11:37 PM

ARM bots are back, Thanks Chandler.