Fixes PR#39209. Suppose we have a URBG that generates values in the range [0, 12), and we want a uniform distribution over the range [0, 3). The current implementation uses an independent bits engine to generate two independent bits from the URBG, i.e. a value in the range [0, 4), and then does rejection sampling. But this is not necessary: since the URBG is supposed to generate uniformly distributed values, each of the subranges [0, 4), [4, 8) and [8, 12) is equally probable, so we can get a uniform value in [0, 3) by generating a single value in [0, 12) and dividing it by 4 (= 12/3).
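For illustration only, here is a minimal, self-contained sketch of the reduction described above. The toy Urbg12 engine and all names are made up for this example and are not taken from the patch: because 12 is an exact multiple of 3, one draw from [0, 12) divided by 12/3 = 4 is already uniform over [0, 3), with no rejection sampling needed.

```cpp
#include <cstdint>
#include <iostream>
#include <random>

// Hypothetical toy URBG with range [0, 12), used only to illustrate the idea;
// this is a sketch, not the libc++ implementation.
struct Urbg12 {
  using result_type = std::uint32_t;
  static constexpr result_type min() { return 0; }
  static constexpr result_type max() { return 11; }
  std::mt19937 eng{42};
  result_type operator()() {
    // Rejection sampling so that every value in [0, 12) really is equally likely.
    for (;;) {
      std::uint32_t v = static_cast<std::uint32_t>(eng());
      if (v < (std::mt19937::max() / 12u) * 12u)
        return v % 12u;
    }
  }
};

int main() {
  Urbg12 g;
  constexpr std::uint32_t Rg = Urbg12::max() - Urbg12::min() + 1; // 12
  constexpr std::uint32_t Rp = 3;                                 // target range [0, 3)
  static_assert(Rg % Rp == 0, "the shortcut applies only when Rg is a multiple of Rp");
  std::uint32_t counts[Rp] = {};
  for (int i = 0; i < 12000; ++i)
    ++counts[g() / (Rg / Rp)]; // one URBG call per draw, no entropy wasted
  for (std::uint32_t c : counts)
    std::cout << c << ' '; // each bucket should come out close to 4000
  std::cout << '\n';
}
```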
Event Timeline
My initial reaction is that this change slows down the common case to cater to a truly pathological case; I don't think we should do it (but a benchmark could probably change my mind).
The new test needs to move from libcxx/test/std to libcxx/test/libcxx if it's testing non-mandated, libcxx-specific behavior.
libcxx/include/__random/uniform_int_distribution.h

Lines 244–245: (1) _Rg != 0 is trivially true, right?

libcxx/test/std/numerics/rand/rand.dis/rand.dist.uni/rand.dist.uni.int/no_wasted_entropy.pass.cpp

Line 41: I'm pretty sure this is library UB. If this test case is deliberately doing weird stuff to test a corner case, then OK (but it needs to be more explicit about what corner case it's testing).
OK. This was just a suggestion because there was a bug report about this, and the check definitely gets optimized away for all "common" cases, but I understand your concern.
IMO correctness is more important than performance -- if our implementation is incorrect and we can fix it without introducing a bad performance regression (and that seems to be the case: the branch and the modulo don't seem to cost much compared to the rest of the algorithm, but I could be wrong), then we should do it.
(1) _Rg != 0 is trivially true, right?
(2) Should we be worried that _Rg % _Rp is a division by a non-constant value, on the hot path? I think this change needs at least some benchmark numbers.
(3) In non-pathological cases, _Rg is a power of two, so _Rg % _Rp == 0 is true only if _Rp is also a power of two. But if _Rp is a power of two, don't we already do the optimal number of calls to __g? So this helps only in the pathological cases where the URBG's range is not a power of two? Perhaps we could at least use if constexpr (or an equivalent) to keep the non-pathological case fast.
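Regarding (3), the following is a rough sketch of what such an if constexpr guard could look like. All names (__reduce, __rng_range, __is_power_of_two) are hypothetical and only mirror the double-underscore naming used in the review; this is not the actual libc++ code. The point is that the runtime __rg % __target_range check is only instantiated for engines whose range is not a power of two, so common engines keep their current fast path.

```cpp
#include <cstdint>
#include <iostream>
#include <random>

// Size of the URBG's range; assumes it fits in 64 bits and does not cover all
// 2^64 values (otherwise the +1 would overflow).
template <class _URNG>
constexpr std::uint64_t __rng_range() {
  return static_cast<std::uint64_t>(_URNG::max()) - _URNG::min() + 1;
}

constexpr bool __is_power_of_two(std::uint64_t __x) {
  return __x != 0 && (__x & (__x - 1)) == 0;
}

// Reduce draws from __g to a uniform value in [0, __target_range).
template <class _URNG>
std::uint64_t __reduce(_URNG& __g, std::uint64_t __target_range) {
  constexpr std::uint64_t __rg = __rng_range<_URNG>();
  if constexpr (!__is_power_of_two(__rg)) {
    // Only engines like minstd_rand (2147483646 possible values) ever reach
    // this check, so the non-constant modulo stays off the hot path of
    // power-of-two engines such as mt19937.
    if (__rg % __target_range == 0)
      return (static_cast<std::uint64_t>(__g()) - _URNG::min()) /
             (__rg / __target_range);
  }
  // Fallback: plain rejection sampling stands in here for the existing
  // independent-bits-engine path of the real implementation.
  const std::uint64_t __limit = __rg - __rg % __target_range;
  for (;;) {
    const std::uint64_t __v = static_cast<std::uint64_t>(__g()) - _URNG::min();
    if (__v < __limit)
      return __v % __target_range;
  }
}

int main() {
  std::minstd_rand g(1); // 2147483646 possible values: not a power of two
  // 2147483646 % 3 == 0, so a single draw yields a uniform value in [0, 3).
  std::cout << __reduce(g, 3) << '\n';
}
```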