This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Remove problematic ADL in container implementations.
Needs ReviewPublic

Authored by dlj on Sep 6 2017, 3:40 PM.

Details

Reviewers
rsmith
EricWF
Summary

Some container operations require ADL. For example, std::advance is
required to use specific operators, which will participate in ADL.

However, implementation details which rely on SFINAE should be careful not to
inadvertently invoke ADL. Otherwise, the SFINAE lookup will (incorrectly)
consider namespaces other than std::. This is particularly problematic with
incomplete types, since the set of associated namespaces for a class includes
the body of the class (which may contain inline friend function overloads). If a
type is incomplete, its body cannot be added to the set of associated
namespaces; this results in an error.

The changes in this patch mostly appear to be omissions. Several of the changes
are for SFINAE overloads with internal names (in some cases, there are other
uses which are already correctly qualified). In a few cases, the implementation
details of iterators are directly to avoid invoking ADL (this seems unfortunate;
but on balance, better than failing when type erasure is otherwise plausible).

Event Timeline

dlj created this revision.Sep 6 2017, 3:40 PM
rsmith edited edge metadata.Sep 6 2017, 5:01 PM

Looks good to me, but I'd like EricWF to also review.

include/deque
1167–1169

The other changes all look like obvious improvements to me. This one is a little more subtle, but if we want types like deque<Holder<Incomplete> *> to be destructible, I think we need to do something equivalent to this.

EricWF added inline comments.Sep 12 2017, 5:22 PM
include/__split_buffer
279

The changes adding _VSTD:: LGTM.

include/deque
1167–1169

That's really yucky! I'm OK with this, but I really don't like it.

Fundamentally this can't work, at least not generically. When the allocator produces a fancy pointer type, the operator lookups need to be performed using ADL.

In this specific case, we control the iterator type, but this isn't always true. for example like in __split_buffer, which uses the allocators pointer type as an iterator.

@dlj I updated your test case to demonstrate the fancy-pointer problem: https://gist.github.com/EricWF/b465fc475f55561ba972b6dd87e7e7ea

So I think we have a couple choices:

(1) Do nothing, since we can't universally support the behavior, so we shouldn't attempt to support any form of this behavior.
(2) Fix only the non-fancy pointer case (which is what this patch does).
(3) Attempt to fix *all* cases, including the fancy-pointer ones. This won't be possible, but perhaps certain containers can tolerate incomplete fancy pointers?

Personally I'm leaning towards (1), since we can't claim to support the use case, at least not universally. If we select (1) we should probably encode the restriction formally in standardese.

include/memory
1349–1351

Wouldn't the declval calls also need qualification?

Quuxplusone added inline comments.
include/memory
1349–1351

(Sorry I'm jumping in before I know the full backstory here.) I would expect that the "declval" call doesn't need qualification because its argument-list is empty. ADL doesn't apply to template arguments AFAIK.
(Test case proving it to myself: https://wandbox.org/permlink/i0YarAfjOYKOhLFP )

Now, IMHO, __has_construct_test also doesn't need guarding against ADL, because it contains a double underscore and is thus firmly in the library's namespace. If the user has declared an ADL-findable entity my::__has_construct_test, they're already in undefined behavior land and you don't need to uglify libc++'s code just to coddle such users.

And, IMO, to the extent that ADL *does* work on the old code here, that's a feature, not a bug. As the implementor of some weird fancy pointer or iterator type, I might *like* having the power to hook into libc++'s customization points here. Of course I wouldn't try to hook into __has_construct_test, but __to_raw_pointer feels more likely. (At my current low level of understanding, that is.)

EricWF added inline comments.Sep 12 2017, 6:53 PM
include/memory
1349–1351

If the user has declared an ADL-findable entity my::__has_construct_test, they're already in undefined behavior land

The problem isn't that it will find a my::__has_construct_test, but that the simple act of looking requires completed types, otherwise an error is generated. Example

dlj marked 2 inline comments as done.Sep 13 2017, 10:08 PM
dlj added inline comments.
include/deque
1167–1169

So I think there are at least a couple of issues here, but first I want to get clarification on this point:

Do nothing, since we can't universally support the behavior, so we shouldn't attempt to support any form of this behavior.

To me, this sounds like it could be equivalent to the statement "someone might do a bad job of implementing type-erasure for fancy pointers, so we should never attempt to preserve type erasure of any pointers." Now, that statement seems tenuous at best (and a straw man on a slippery slope otherwise), so I'm guessing that's not quite what you meant. ;-)

That said, it does sort of make sense to ignore deque for now, so I'll drop it from the patch; but for other containers, I don't really see the argument.

As for #3: I'm able to make (most of) your gist pass for everything but deque... calls to _VSTD::__to_raw_pointer are enough to avoid ADL around operators, although a few other changes are needed, too. That probably makes sense to do as a follow-up.

As for deque in particular, I think there may just be some missing functions on the __deque_iterator, but it probably warrants a closer look before adding to the interface.

include/memory
1349–1351

Hmm, no I don't think so... declval<_Alloc> is a "declaration of a function at block scope" (because it's a template specialization), so it does not participate in ADL.

I believe the difference with e.g. __has_construct_test is that the other calls name a function template, which is not specialized (and so isn't a function declaration) until it wins the ADL.

I'm a little bit hazy on exactly the semantics, though, so Richard can correct me if I've gotten it substantially wrong. :-)

1349–1351

Double-underscore names don't actually change the lookup rules, so (as Eric points out) isn't really related to the issue I'm addressing here.

As for extension points... __to_raw_pointer is essentially a selector for the correct expression syntax to get the pointed-to node, somewhat analogous to how std::pointer_traits selects an "unwrapped" pointee type. (In this case, an "extension point" already exists: you can use a custom allocator with fancy pointers... have a look at Eric's comments in include/deque for a good example of how you might write one.)

dlj updated this revision to Diff 115164.Sep 13 2017, 10:11 PM
  • Remove deque from the test for now.
dlj updated this revision to Diff 115165.Sep 13 2017, 10:12 PM
  • Remove deque from the test for now.
EricWF added inline comments.Sep 14 2017, 5:02 PM
include/deque
1167–1169

To me, this sounds like it could be equivalent to the statement "someone might do a bad job of implementing type-erasure for fancy pointers, so we should never attempt to preserve type erasure of any pointers." Now, that statement seems tenuous at best (and a straw man on a slippery slope otherwise), so I'm guessing that's not quite what you meant. ;-)

I'm not sure it's even possible to type-erase fancy pointers in a way your suggesting. Fundamentally I think they need to carry around the pointee type. So this problem isn't so easily explained away or made the fault of the user. Or am I missing something?

The library is allowed, and in this case required, to perform operator lookup via ADL. full stop. That's how the STL is expected to interact with user types. Even after fixing the accidental ADL caused by __foo functions, we haven't and can't "fix" the rest. So now the question becomes "How far should libc++ go to avoid legal ADL lookup when it's possible"?

Whatever we decide, we are agreeing to "support" (or not support) a set of behavior, and I have a complex about claiming to "support" bizarre use cases like this. To support something we need to be able to (1) articulate exactly what behavior is supported, and (2) that the behavior won't regress.

I don't think we can provide guarantee (2) since future changes may require the use of ADL lookup in a way we can't avoid, for example in the same way that __split_buffer does. So that's a strike against the possibility of claiming "support".

One possibility is "The default constructor and destructor of containers X, Y, Z do not perform ADL lookup (Constructors A, B, C do not support this behavior) ". Admittedly it was easier than I thought to fix __hash_table and __tree, but __split_buffer is still a problem, which means that both deque and vector won't work.

Since we can't fix all of the containers, and the ones we can fix may need to break again in the future, I don't think we can arrive at a meaningful statement of support. Therefore, I think a good argument needs to be made as to why it's important to "fix" the particular cases we can.

EricWF edited edge metadata.Sep 14 2017, 5:36 PM

@dlj I went ahead and committed the fixes to std::allocator_traits in r313324, because I think we agree those are bugs, and I didn't want this discussion to hold up that fix. I hope you don't mind.

dlj added a comment.Sep 14 2017, 6:26 PM

@dlj I went ahead and committed the fixes to std::allocator_traits in r313324, because I think we agree those are bugs, and I didn't want this discussion to hold up that fix. I hope you don't mind.

Nope, not a problem.

dlj and I discussed this offline somewhat, and dlj made the interesting point that using ADL *at all* within the implementation of the standard library brings with it the risk that your implementation will not function correctly if a user-declared operator could be at least as good a match as your intended callee.

As an example, consider: https://godbolt.org/g/rgFTME -- as far as I can see, that code follows all the rules for safe and correct use of the C++ standard library. And yet it does not compile with either libc++ or libstdc++, because both internally use ADL to find the operator!= for vector::iterator within their implementation of vector::erase. Any ADL call seems likely to suffer from this problem; moreover, you can observe that vector<Foo>::iterator fails to even satisfy the input iterator requirements, because a != b is not a valid expression for iterators a and b.

I think the above example is actually demonstrating a bug in the standard: such user code is unreasonable and it's not sensible to expect the standard library to cope with it.

Everything in this PR is obsolete and/or fixed at this point, except for the change to <deque>, which as Eric said, fixes the lower-hanging fruit but sadly can't "ADL-proof" it completely.
(Grepping for "ADL" in the git history of libcxx/include/ will turn up a lot of patches from me and from Logan Smith, on the subject of ADL-proofing.)
I suggest that this PR could be closed at this point.