This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Explicitly reject `uniform_int_distribution<bool>` and `<char>`.
ClosedPublic

Authored by Quuxplusone on Dec 1 2021, 5:01 PM.

Details

Summary
`uniform_int_distribution<T>` is UB unless `T` is one of the non-character,
non-boolean integer types (`short` or larger). However, libc++ has never
enforced this. D114129 accidentally made `uniform_int_distribution<bool>`
into an error. Make it now *intentionally* an error; and likewise for the
character types and all user-defined class and enum types; but permit
`__[u]int128_t` to continue working.

Apply the same static_assert to all the integer distributions.

Diff Detail

Event Timeline

Quuxplusone requested review of this revision.Dec 1 2021, 5:01 PM
Quuxplusone created this revision.
Herald added 1 blocking reviewer(s): Restricted Project. · View Herald TranscriptDec 1 2021, 5:01 PM
Quuxplusone added inline comments.Dec 1 2021, 5:06 PM
libcxx/test/std/numerics/rand/rand.dis/rand.dist.uni/rand.dist.uni.int/eval.pass.cpp
84–91 ↗(On Diff #391171)

Before you ask: yeah, these error bounds are crazy ad-hoc. But:

  • the test isn't "flaky": in the sense that the engine g will always produce the same random sequence, so our code for dist should always cook that sequence into something that satisfies these arbitrary error bounds, and if it ever changes so that the error bounds fail, then we can just go edit this test.
  • the old test was way more ad-hoc: like, what even is std::abs((mean - x_mean) / x_mean) < 0.01? Dividing by the mean of the distribution? We're just lucky they never tried a distribution with a mean of zero. 😀
jloser added a subscriber: jloser.Dec 1 2021, 5:12 PM
jloser added inline comments.
libcxx/test/std/numerics/rand/rand.dis/rand.dist.uni/rand.dist.uni.int/eval.pass.cpp
84–91 ↗(On Diff #391171)

One of those "oh boy" moments when looking at the existing trunk code (e.g. mean of zero comment).

ldionne requested changes to this revision.Dec 2 2021, 5:33 AM

Let's just *not* support uniform_int_distribution<bool>, and make it into a static_assert.

Also, the test improvements should be made as a different patch. probably before this one cause it is needed to test with a wide variety of types like you do (which is a good improvement BTW).

This revision now requires changes to proceed.Dec 2 2021, 5:33 AM

Let's just *not* support uniform_int_distribution<bool>, and make it into a static_assert.

Works for me, but (1) I'll feel better once we hear from @leonardchan, and (2) what do you want to do about char, signed char, char16_t, etc.? Also disallow them via static_assert, I'd guess?

Let's just *not* support uniform_int_distribution<bool>, and make it into a static_assert.

Works for me, but (1) I'll feel better once we hear from @leonardchan, and (2) what do you want to do about char, signed char, char16_t, etc.? Also disallow them via static_assert, I'd guess?

Yeah, I'd do the same. If something is library UB and it's easy to diagnose, I would diagnose it.

Quuxplusone retitled this revision from [libc++] Re-enable `uniform_int_distribution<bool>`. to [libc++] Explicitly reject `uniform_int_distribution<bool>`..
Quuxplusone edited the summary of this revision. (Show Details)

Change the intent of this PR from "re-enable" to "explicitly reject".

Quuxplusone added inline comments.Dec 6 2021, 7:53 AM
libcxx/include/__random/is_valid_inttype.h
21–40 ↗(On Diff #392064)

I'm not thrilled about this helper or its placement/naming, but it seemed like the best option.
Incidentally, this also gives people a sneaky libc++ internal detail that they can (unsupportedly) specialize in order to re-enable the removed functionality for specific user-defined types.
Should this helper be named __libcpp_*?

ldionne requested changes to this revision.Dec 6 2021, 10:30 AM

Couple of comments.

Also, I'm going to try this patch out on a significant code base just to confirm it doesn't result in crazy breakage -- requesting changes until that's done.

Ref for myself: 86111407&86111417

libcxx/include/__random/is_valid_inttype.h
21–40 ↗(On Diff #392064)

Yeah, let's use __libcpp_*. I don't like that name, but I think it's worth doing it if it reduces the likelihood of users messing around with our internal details.

37 ↗(On Diff #392064)

This is where we could document that we support this as an extension, in a comment.

libcxx/test/std/numerics/rand/rand.dis/rand.dist.uni/rand.dist.uni.int/eval.pass.cpp
13 ↗(On Diff #392064)

Any appetite for improving tests for other random distributions like this? It's *much* better than what we had before.

This revision now requires changes to proceed.Dec 6 2021, 10:30 AM

Okay, so I have some initial results of my investigation. I found some uses of uniform_int_distribution with things like uint8_t, char, and unsigned char. There was definitely some breakage, however nothing that seemed entirely unmanageable.

I say we go for it, but we definitely need to add a release note. Also, we'll want to sync up with some folks that build their code bases from libc++ trunk (like I think @thakis @rnk @phosek) or they might encounter issues. They should proof their code bases before we land this (or if that can't be done in a reasonable time frame, then they can revert this commit internally).

One thing we could do to make things easier would be to have the static_asserts in a different commit, that way folks could just revert that one locally until they have solved their problems.

@ldionne: Makes perfect sense to me. Would you care to suggest wording for a release note? Perhaps

(API Changes)
The integer distributions ``binomial_distribution``, ``discrete_distribution``,
``geometric_distribution``, ``negative_binomial_distribution``, ``poisson_distribution``,
and ``uniform_int_distribution`` now conform to the Standard by rejecting
template parameter types other than ``short``, ``int``, ``long``, ``long long``,
(as an extension) ``__int128_t``, and the unsigned versions thereof.
In particular, ``uniform_int_distribution<int8_t>`` is no longer supported.

Also, since "everything but the static_asserts" is really just "the test file," shall I go ahead and land the test-file change by itself, soonish?

Quuxplusone added inline comments.Dec 7 2021, 2:27 PM
libcxx/include/__random/is_valid_inttype.h
21–40 ↗(On Diff #392064)

Will rename to __libcpp_random_is_valid_inttype.

Also yuck, libcxx/benchmarks/vector_operations.bench.cpp uses uniform_int_distribution<char> in order to get random example data for its vectors. Maybe we should just support that. It would match libstdc++, too.

uniform_int_distribution<char/int8_t/uint8_t>: GCC accepts, MSVC rejects with static_assert.
uniform_int_distribution<bool>: GCC and MSVC both reject, with ugly template error message.

@ldionne: Makes perfect sense to me. Would you care to suggest wording for a release note? Perhaps

(API Changes)
The integer distributions ``binomial_distribution``, ``discrete_distribution``,
``geometric_distribution``, ``negative_binomial_distribution``, ``poisson_distribution``,
and ``uniform_int_distribution`` now conform to the Standard by rejecting
template parameter types other than ``short``, ``int``, ``long``, ``long long``,
(as an extension) ``__int128_t``, and the unsigned versions thereof.
In particular, ``uniform_int_distribution<int8_t>`` is no longer supported.

Yes, that sounds perfect to me.

Also, since "everything but the static_asserts" is really just "the test file," shall I go ahead and land the test-file change by itself, soonish?

Yes! You can do this as a NFC commit and repurpose this patch to be only the static asserts and the release note, and fixing the benchmarks.

libcxx/include/__random/is_valid_inttype.h
21–40 ↗(On Diff #392064)

So basically we'd only exclude bool? IMO we should strive for strict conformance, however we should also write a half-page paper to support it for char and other small types cause it kind of makes sense.

Quuxplusone retitled this revision from [libc++] Explicitly reject `uniform_int_distribution<bool>`. to [libc++] Explicitly reject `uniform_int_distribution<bool>`; reject URNGs with signed result types.
Quuxplusone edited the summary of this revision. (Show Details)

Add a release note.
Fix the benchmarks.
Scope-creep to [libc++] Explicitly reject URNG types with signed result_types (fixes https://github.com/llvm/llvm-project/issues/48965 ) while I'm here.

ldionne accepted this revision.Jan 17 2022, 10:49 AM

Can we add libc++ specific tests to check that we properly diagnose those? One big .verify.cpp test for all the distributions would be great. LGTM apart from that.

This revision is now accepted and ready to land.Jan 17 2022, 10:49 AM
Quuxplusone retitled this revision from [libc++] Explicitly reject `uniform_int_distribution<bool>`; reject URNGs with signed result types to [libc++] Explicitly reject `uniform_int_distribution<bool>` and `<char>`..
Quuxplusone edited the summary of this revision. (Show Details)

Rebase. Split out "Explicitly reject URNG types with signed result_types" into its own PR.
If CI on this part is green, I'll probably just land it on Monday, since it's been approved since forever.

Add missing pragmas.