Download Raw Diff

Details

Reviewers

ldionne
STL_MSFT
philnik
Mordante

Group Reviewers

Restricted Project

Commits

rGb07880454ba3: [libc++] Replace modulus operations in std::seed_seq::generate with conditional…

Summary

Abseil benchmarks suggest that the conditional checks result in faster code (4-5x)
as they are compiled into conditional move instructions (cmov on x86).

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

laramiel created this revision.May 10 2022, 11:09 AM

Herald added a project: Restricted Project. · View Herald TranscriptMay 10 2022, 11:09 AM

Updating D125329: Replace modulus operations in std::seed_seq::generate with conditional checks.

laramiel added reviewers: ldionne, STL_MSFT.May 10 2022, 11:34 AM

laramiel published this revision for review.May 10 2022, 11:38 AM

Herald added a project: Restricted Project. · View Herald TranscriptMay 10 2022, 11:38 AM

Herald added a reviewer: Restricted Project. · View Herald Transcript

Herald added a subscriber: libcxx-commits. · View Herald Transcript

Could you provide some benchmarks in libcxx/benchmarks and their results?

This revision now requires changes to proceed.May 10 2022, 11:43 AM

Add benchmark file.

Updating D125329: Replace modulus operations in std::seed_seq::generate with conditional checks.

This looks pretty good, but I'd like to see some benchmark results before approving. BTW you can ignore clang-format currently. (I assume it's the reason for your last two diffs). If you don't have commit access please provide "Your Name" <your.email@address.com> for attribution.

libcxx/benchmarks/random.bench.cpp
2	This shouldn't be in a `.cpp` file, only in `.h` files.
11–15	Could you order these alphabetically?
libcxx/include/__random/seed_seq.h
119–133	This would be just perfect for a lambda. Why do we have to support C++03?

Updating D125329: Replace modulus operations in std::seed_seq::generate with conditional checks.

Resolve some comments

libcxx/benchmarks/random.bench.cpp
2	I can do that, but it's in a lot of the *.bench.cpp files. Though not all.

philnik added inline comments.May 10 2022, 1:01 PM

libcxx/benchmarks/random.bench.cpp
2	Sorry for the miscommunication. I only meant the `// -- C++ --`.

Ok, I managed to run the benchmarks with the before and after on both my x86 workstation and my mac M1 laptop. For the M1, the new code is about the same, but the new code is dramatically faster on x86.

Mac LAPTOP

CPU Caches:
  L1 Data 64 KiB (x10)
  L1 Instruction 128 KiB (x10)
  L2 Unified 4096 KiB (x5)
Load Average: 3.74, 4.25, 3.59

OLD

Benchmark                           Time             CPU   Iterations
---------------------------------------------------------------------
BM_SeedSeq_Generate/1/1          16.9 ns         16.9 ns     39431288
BM_SeedSeq_Generate/8/1          50.1 ns         50.0 ns     13943668
BM_SeedSeq_Generate/16/1         88.4 ns         87.9 ns      7974663
BM_SeedSeq_Generate/1/8          56.1 ns         56.0 ns     12601941
BM_SeedSeq_Generate/8/8          60.7 ns         60.7 ns     11631385
BM_SeedSeq_Generate/16/8         97.0 ns         93.6 ns      7510568
BM_SeedSeq_Generate/1/64          564 ns          561 ns      1259672
BM_SeedSeq_Generate/8/64          549 ns          549 ns      1241113
BM_SeedSeq_Generate/16/64         541 ns          540 ns      1294139
BM_SeedSeq_Generate/1/256        2299 ns         2296 ns       304877
BM_SeedSeq_Generate/8/256        2298 ns         2297 ns       305722
BM_SeedSeq_Generate/16/256       2300 ns         2299 ns       305262

NEW

Benchmark                           Time             CPU   Iterations
---------------------------------------------------------------------
BM_SeedSeq_Generate/1/1          17.0 ns         17.0 ns     38130515
BM_SeedSeq_Generate/8/1          49.5 ns         49.5 ns     14140414
BM_SeedSeq_Generate/16/1         86.6 ns         86.1 ns      8184933
BM_SeedSeq_Generate/1/8          47.3 ns         47.2 ns     14856161
BM_SeedSeq_Generate/8/8          50.7 ns         50.7 ns     13919268
BM_SeedSeq_Generate/16/8         79.8 ns         79.6 ns      8695436
BM_SeedSeq_Generate/1/64          520 ns          520 ns      1361550
BM_SeedSeq_Generate/8/64          519 ns          519 ns      1355355
BM_SeedSeq_Generate/16/64         525 ns          525 ns      1319261
BM_SeedSeq_Generate/1/256        2198 ns         2194 ns       317959
BM_SeedSeq_Generate/8/256        2196 ns         2194 ns       320623
BM_SeedSeq_Generate/16/256       2215 ns         2201 ns       317943

X86

Intel(R) Xeon(R) CPU E5-1650 v4 @ 3.60GHz
Run on (12 X 4000 MHz CPU s)
CPU Caches:
  L1 Data 32 KiB (x6)
  L1 Instruction 32 KiB (x6)
  L2 Unified 256 KiB (x6)
  L3 Unified 15360 KiB (x1)
Load Average: 0.34, 0.38, 0.36

OLD

Benchmark                           Time             CPU   Iterations
---------------------------------------------------------------------
BM_SeedSeq_Generate/1/1          51.9 ns         51.9 ns     12867935
BM_SeedSeq_Generate/8/1           179 ns          179 ns      3915093
BM_SeedSeq_Generate/16/1          325 ns          325 ns      2158696
BM_SeedSeq_Generate/1/8           361 ns          361 ns      1960768
BM_SeedSeq_Generate/8/8           342 ns          342 ns      1982211
BM_SeedSeq_Generate/16/8          485 ns          484 ns      1449114
BM_SeedSeq_Generate/1/64         3010 ns         3008 ns       234187
BM_SeedSeq_Generate/8/64         2966 ns         2964 ns       237480
BM_SeedSeq_Generate/16/64        2910 ns         2909 ns       240558
BM_SeedSeq_Generate/1/256       12132 ns        12127 ns        58263
BM_SeedSeq_Generate/8/256       12051 ns        12046 ns        58368
BM_SeedSeq_Generate/16/256      12169 ns        12163 ns        58210

NEW

Benchmark                           Time             CPU   Iterations
---------------------------------------------------------------------
BM_SeedSeq_Generate/1/1          26.1 ns         26.1 ns     25494346
BM_SeedSeq_Generate/8/1          45.5 ns         45.5 ns     15384561
BM_SeedSeq_Generate/16/1         78.0 ns         77.9 ns      9019345
BM_SeedSeq_Generate/1/8          65.0 ns         65.0 ns     10775171
BM_SeedSeq_Generate/8/8          69.4 ns         69.4 ns     10152010
BM_SeedSeq_Generate/16/8          100 ns        100.0 ns      7041379
BM_SeedSeq_Generate/1/64          488 ns          488 ns      1426010
BM_SeedSeq_Generate/8/64          488 ns          488 ns      1434429
BM_SeedSeq_Generate/16/64         489 ns          488 ns      1322369
BM_SeedSeq_Generate/1/256        1938 ns         1938 ns       362517
BM_SeedSeq_Generate/8/256        1934 ns         1934 ns       361980
BM_SeedSeq_Generate/16/256       1936 ns         1936 ns       361482

Wow, these numbers look great! LGTM with the license header added back.

Updating D125329: Replace modulus operations in std::seed_seq::generate with conditional checks.

laramiel marked an inline comment as done.May 10 2022, 4:18 PM

laramiel added inline comments.May 10 2022, 4:21 PM

libcxx/benchmarks/random.bench.cpp
21	Does the benchmark need to support C++03? because std::array

philnik added inline comments.May 10 2022, 4:24 PM

libcxx/benchmarks/random.bench.cpp
21	No. We normally don't have any divergence of performance critical code between C++ versions, so we only run the benchmarks for C++20(?).

Harbormaster completed remote builds in B163810: Diff 428529.May 11 2022, 1:19 PM

What next? The AIX 32-bit build fails with an unrelated error, all the rest pass.

FAILED: libcxx/src/CMakeFiles/cxx_shared.dir/atomic.cpp.o
/opt/IBM/openxlC/17.1.0/bin/ibm-clang++_r -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE -D_LIBCPP_BUILDING_LIBRARY -D_LIBCPP_DISABLE_NEW_DELETE_DEFINITIONS -D_LIBCPP_HAS_NO_PRAGMA_SYSTEM_HEADER -DSTDC_CONSTANT_MACROS -DSTDC_FORMAT_MACROS -DSTDC_LIMIT_MACROS -Ilibcxx/include/c++build -I/scratch/powerllvm/cpap8008/llvm-project/libcxx-ci/libcxx/src -I/scratch/powerllvm/cpap8008/llvm-project/libcxx-ci/libcxxabi/include -DLIBC_NO_CPP_MATH_OVERLOADS__ -mcmodel=large -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wno-comment -Wstring-conversion -Wmisleading-indentation -fdiagnostics-color -ffunction-sections -fdata-sections -O2 -g -fPIC -DLIBCXX_BUILDING_LIBCXXABI -faligned-allocation -nostdinc++ -fvisibility-inlines-hidden -Wall -Wextra -W -Wwrite-strings -Wno-unused-parameter -Wno-long-long -Werror=return-type -Wextra-semi -Wundef -Wformat-nonliteral -Wno-user-defined-literals -Wno-covered-switch-default -Wno-suggest-override -Werror -I/scratch/powerllvm/cpap8008/llvm-project/libcxx-ci/build/aix/include/c++/v1 -std=c++2a -MD -MT libcxx/src/CMakeFiles/cxx_shared.dir/atomic.cpp.o -MF libcxx/src/CMakeFiles/cxx_shared.dir/atomic.cpp.o.d -o libcxx/src/CMakeFiles/cxx_shared.dir/atomic.cpp.o -c /scratch/powerllvm/cpap8008/llvm-project/libcxx-ci/libcxx/src/atomic.cpp
In file included from /scratch/powerllvm/cpap8008/llvm-project/libcxx-ci/libcxx/src/atomic.cpp:12:
/scratch/powerllvm/cpap8008/llvm-project/libcxx-ci/build/aix/include/c++/v1/atomic:1005:12: error: large atomic operation may incur significant performance penalty; the access size (8 bytes) exceeds the max lock-free size (4 bytes) [-Werror,-Watomic-alignment]

return __c11_atomic_fetch_add(&__a->__a_value, __delta, static_cast<__memory_order_underlying_t>(__order));
       ^

/scratch/powerllvm/cpap8008/llvm-project/libcxx-ci/build/aix/include/c++/v1/atomic:948:12: error: large atomic operation may incur significant performance penalty; the access size (8 bytes) exceeds the max lock-free size (4 bytes) [-Werror,-Watomic-alignment]

return __c11_atomic_load(const_cast<__ptr_type>(&__a->__a_value), static_cast<__memory_order_underlying_t>(__order));
       ^

/scratch/powerllvm/cpap8008/llvm-project/libcxx-ci/build/aix/include/c++/v1/atomic:1000:12: error: large atomic operation may incur significant performance penalty; the access size (8 bytes) exceeds the max lock-free size (4 bytes) [-Werror,-Watomic-alignment]

return __c11_atomic_fetch_add(&__a->__a_value, __delta, static_cast<__memory_order_underlying_t>(__order));
       ^

/scratch/powerllvm/cpap8008/llvm-project/libcxx-ci/build/aix/include/c++/v1/atomic:1022:12: error: large atomic operation may incur significant performance penalty; the access size (8 bytes) exceeds the max lock-free size (4 bytes) [-Werror,-Watomic-alignment]

return __c11_atomic_fetch_sub(&__a->__a_value, __delta, static_cast<__memory_order_underlying_t>(__order));
       ^

4 errors generated.

@laramiel The AIX build is fixed now, so you can rebase to fix the CI. You have to wait for a second libc++ approver to review and then this can be landed.

Updating D125329: Rebase

Harbormaster completed remote builds in B164359: Diff 429300.May 13 2022, 12:51 PM

Abseil benchmarks suggest that the conditional checks result in faster code (4-5x)
as they can be compiled into conditional move instructions.

Nice improvement! Did state that the cmov is the big win?

In general quite happy with the changes, but I really like a bit more comment explaining the cleverness.

libcxx/benchmarks/random.bench.cpp
21	Correct the benchmarks are indeed only tested with C++20.
libcxx/include/__random/seed_seq.h
88	I saw `libcxx/test/std/numerics/rand/rand.util/rand.util.seedseq/generate.pass.cpp` tests the generated output of this function.
112	This is a kind of clever code which isn't really obvious. Can you add a bit more comment what it exactly does. After you did that the `__k % __n` comment becomes clear.
151	I think here we can add some comment too. The modulo in this loop works when `__k == __m`, which is true when `__s + 1 <= __m`.

laramiel added inline comments.May 16 2022, 9:50 AM

libcxx/include/__random/seed_seq.h
112	The comments that I added worked to make it clear to me while reading [rand.util.seedseq]. What specific comment would make it more clear to you?
151	What comment do you think helps? I think that the modulo pattern is pretty clear, particularly since it mirrors the first loop.

Adjust comments & rename km1modn to k1modn.

Harbormaster completed remote builds in B164675: Diff 429755.May 16 2022, 10:02 AM

laramiel edited the summary of this revision. (Show Details)May 16 2022, 10:02 AM

Herald added a subscriber: pengfei. · View Herald TranscriptMay 16 2022, 10:02 AM

Sort benchmark consts

Harbormaster completed remote builds in B164676: Diff 429756.May 16 2022, 10:05 AM

laramiel marked 2 inline comments as done.May 16 2022, 10:08 AM

Something went wrong with your patch, the changes to libcxx/include/__random/seed_seq.h have disappeared. Can you upload a fixed version?

This revision now requires changes to proceed.May 16 2022, 11:01 AM

Rebase. Does this fix it.

Harbormaster completed remote builds in B164707: Diff 429789.May 16 2022, 11:07 AM

In D125329#3516627, @laramiel wrote:

Rebase. Does this fix it.

Unfortunately no. Do you have multiple commits in your local branch? Arc only uses one commit.
I guess it is related to arc making the new review D125702.

This revision now requires changes to proceed.May 16 2022, 11:14 AM

Try arc diff origin/main, which seems to work after the rebase.

Harbormaster completed remote builds in B164709: Diff 429792.May 16 2022, 12:28 PM

It's all back and the buildbot has completed a green build.

LGTM after adding the comment. Do you need somebody to land to patch for you?
If so please provide your name and e-mail address.

libcxx/include/__random/seed_seq.h
112	I think it would be nice to explain a bit more what happens and more importantly why the code is written in the current way. That would help future readers to understand the code without looking in the git history. I propose something along the lines of: As an optimization the code in the loop doesn't calculate modulo n every iteration. Instead it's initialized here allowing the loop to use an if statement instead of a modulo. If you have a link to the claim of abseil it would be great to add it too.
151	I see the precondition `__s + 1 <= __m` is always true. Might be nice to add a comment to that effect, but not required.

This revision is now accepted and ready to land.May 17 2022, 10:46 AM

The abseil code is internal, and we opted to upstream the main optimization rather than push another std:: compatible type, so I don't have a public reference there.

Updating D125329: Update comment on loop.

Harbormaster completed remote builds in B165546: Diff 431005.May 20 2022, 10:26 AM

laramiel marked an inline comment as done.May 20 2022, 10:27 AM

laramiel added inline comments.

libcxx/include/__random/seed_seq.h
112	Ok, I have a very similar comment to that. Please take another look.

I would prefer help landing a patch; lar@google.com (Laramie Leavitt)

Updating D125329: Rebase

Harbormaster completed remote builds in B165548: Diff 431007.May 20 2022, 12:23 PM

philnik retitled this revision from Replace modulus operations in std::seed_seq::generate with conditional checks. to [libc++] Replace modulus operations in std::seed_seq::generate with conditional checks..May 24 2022, 1:27 AM

Closed by commit rGb07880454ba3: [libc++] Replace modulus operations in std::seed_seq::generate with conditional… (authored by laramiel, committed by philnik). · Explain WhyMay 24 2022, 1:30 AM

This revision was automatically updated to reflect the committed changes.

philnik added a commit: rGb07880454ba3: [libc++] Replace modulus operations in std::seed_seq::generate with conditional….

Diff 428463

libcxx/benchmarks/random.bench.cpp

This file was added.

				// -- C++ --
				//===----------------------------------------------------------------------===//
				philnikUnsubmitted Done Reply Inline Actions This shouldn't be in a `.cpp` file, only in `.h` files. philnik: This shouldn't be in a `.cpp` file, only in `.h` files.
				laramielAuthorUnsubmitted Done Reply Inline Actions I can do that, but it's in a lot of the .bench.cpp files. Though not all. laramiel:* I can do that, but it's in a lot of the *.bench.cpp files. Though not all.
				philnikUnsubmitted Done Reply Inline Actions Sorry for the miscommunication. I only meant the `// -- C++ --`. philnik: Sorry for the miscommunication. I only meant the `// -- C++ --`.
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include <cstddef>
				#include <random>
				#include <algorithm>
				#include <array>
				#include <functional>

				philnikUnsubmitted Done Reply Inline Actions Could you order these alphabetically? philnik: Could you order these alphabetically?
				#include "benchmark/benchmark.h"

				constexpr std::size_t MAX_SEED_LEN = 16;
				constexpr std::size_t MAX_BUFFER_LEN = 256;

				static void BM_SeedSeq_Generate(benchmark::State& state) {
				laramielAuthorUnsubmitted Done Reply Inline Actions Does the benchmark need to support C++03? because std::array laramiel: Does the benchmark need to support C++03? because std::array
				philnikUnsubmitted Done Reply Inline Actions No. We normally don't have any divergence of performance critical code between C++ versions, so we only run the benchmarks for C++20(?). philnik: No. We normally don't have any divergence of performance critical code between C++ versions, so…
				MordanteUnsubmitted Done Reply Inline Actions Correct the benchmarks are indeed only tested with C++20. Mordante: Correct the benchmarks are indeed only tested with C++20.
				std::array<std::uint32_t, MAX_BUFFER_LEN> buffer;
				std::array<std::uint32_t, MAX_SEED_LEN> seeds;
				{
				std::random_device rd;
				std::generate(std::begin(seeds), std::begin(seeds) + state.range(0), std::cref(rd));
				}
				std::seed_seq seed(std::begin(seeds), std::begin(seeds) + state.range(0));
				for (auto _ : state) {
				benchmark::DoNotOptimize(seed.generate(std::begin(buffer), std::begin(buffer) + state.range(1)));
				}
				}
				BENCHMARK(BM_SeedSeq_Generate)->Range(1, MAX_SEED_LEN)->Range(1, MAX_BUFFER_LEN);

				BENCHMARK_MAIN();

libcxx/include/__random/seed_seq.h

	Show First 20 Lines • Show All 79 Lines • ▼ Show 20 Lines
	seed_seq::__init(_InputIterator __first, _InputIterator __last)			seed_seq::__init(_InputIterator __first, _InputIterator __last)
	{			{
	for (_InputIterator __s = __first; __s != __last; ++__s)			for (_InputIterator __s = __first; __s != __last; ++__s)
	__v_.push_back(*__s & 0xFFFFFFFF);			__v_.push_back(*__s & 0xFFFFFFFF);
	}			}

	template<class _RandomAccessIterator>			template<class _RandomAccessIterator>
	void			void
	seed_seq::generate(_RandomAccessIterator __first, _RandomAccessIterator __last)			seed_seq::generate(_RandomAccessIterator __first, _RandomAccessIterator __last)
				MordanteUnsubmitted Done Reply Inline Actions I saw `libcxx/test/std/numerics/rand/rand.util/rand.util.seedseq/generate.pass.cpp` tests the generated output of this function. Mordante: I saw `libcxx/test/std/numerics/rand/rand.util/rand.util.seedseq/generate.pass.cpp` tests the…
	{			{
	if (__first != __last)			if (__first != __last)
	{			{
	_VSTD::fill(__first, __last, 0x8b8b8b8b);			_VSTD::fill(__first, __last, 0x8b8b8b8b);
	const size_t __n = static_cast<size_t>(__last - __first);			const size_t __n = static_cast<size_t>(__last - __first);
	const size_t __s = __v_.size();			const size_t __s = __v_.size();
	const size_t __t = (__n >= 623) ? 11			const size_t __t = (__n >= 623) ? 11
	: (__n >= 68) ? 7			: (__n >= 68) ? 7
	: (__n >= 39) ? 5			: (__n >= 39) ? 5
	: (__n >= 7) ? 3			: (__n >= 7) ? 3
	: (__n - 1) / 2;			: (__n - 1) / 2;
	const size_t __p = (__n - __t) / 2;			const size_t __p = (__n - __t) / 2;
	const size_t __q = __p + __t;			const size_t __q = __p + __t;
	const size_t __m = _VSTD::max(__s + 1, __n);			const size_t __m = _VSTD::max(__s + 1, __n);
	// __k = 0;			// __k = 0;
	{			{
	result_type __r = 1664525 * _Tp(__first[0] ^ __first[__p]			result_type __r = 1664525 * _Tp(__first[0] ^ __first[__p]
	^ __first[__n - 1]);			^ __first[__n - 1]);
	__first[__p] += __r;			__first[__p] += __r;
	__r += __s;			__r += __s;
	__first[__q] += __r;			__first[__q] += __r;
	__first[0] = __r;			__first[0] = __r;
	}			}
				size_t __kmodn = 0; // __k % __n
				MordanteUnsubmitted Done Reply Inline Actions This is a kind of clever code which isn't really obvious. Can you add a bit more comment what it exactly does. After you did that the `__k % __n` comment becomes clear. Mordante: This is a kind of clever code which isn't really obvious. Can you add a bit more comment what…
				laramielAuthorUnsubmitted Done Reply Inline Actions The comments that I added worked to make it clear to me while reading [rand.util.seedseq]. What specific comment would make it more clear to you? laramiel: The comments that I added worked to make it clear to me while reading [rand.util.seedseq]. What…
				MordanteUnsubmitted Done Reply Inline Actions I think it would be nice to explain a bit more what happens and more importantly why the code is written in the current way. That would help future readers to understand the code without looking in the git history. I propose something along the lines of: As an optimization the code in the loop doesn't calculate modulo n every iteration. Instead it's initialized here allowing the loop to use an if statement instead of a modulo. If you have a link to the claim of abseil it would be great to add it too. Mordante: I think it would be nice to explain a bit more what happens and more importantly why the code…
				laramielAuthorUnsubmitted Done Reply Inline Actions Ok, I have a very similar comment to that. Please take another look. laramiel: Ok, I have a very similar comment to that. Please take another look.
				size_t __kpmodn = __p % __n; // (__k + __p) % __n
				size_t __kqmodn = __q % __n; // (__k + __q) % __n
				size_t __km1modn = __n - 1; // (__k-1) % __n

	for (size_t __k = 1; __k <= __s; ++__k)			for (size_t __k = 1; __k <= __s; ++__k)
	{			{
	const size_t __kmodn = __k % __n;			if (++__kmodn == __n)
	const size_t __kpmodn = (__k + __p) % __n;			__kmodn = 0;
	result_type __r = 1664525 * _Tp(__first[__kmodn] ^ __first[__kpmodn]			if (++__km1modn == __n)
	^ __first[(__k - 1) % __n]);			__km1modn = 0;
				if (++__kpmodn == __n)
				__kpmodn = 0;
				if (++__kqmodn == __n)
				__kqmodn = 0;

				result_type __r = 1664525 * _Tp(__first[__kmodn] ^ __first[__kpmodn] ^ __first[__km1modn]);
	__first[__kpmodn] += __r;			__first[__kpmodn] += __r;
	__r += __kmodn + __v_[__k-1];			__r += __kmodn + __v_[__k - 1];
	__first[(__k + __q) % __n] += __r;			__first[__kqmodn] += __r;
	__first[__kmodn] = __r;			__first[__kmodn] = __r;
	}			}
				philnikUnsubmitted Done Reply Inline Actions This would be just perfect for a lambda. Why do we have to support C++03? philnik: This would be just perfect for a lambda. Why do we have to support C++03?
	for (size_t __k = __s + 1; __k < __m; ++__k)			for (size_t __k = __s + 1; __k < __m; ++__k)
	{			{
	const size_t __kmodn = __k % __n;			if (++__kmodn == __n)
	const size_t __kpmodn = (__k + __p) % __n;			__kmodn = 0;
	result_type __r = 1664525 * _Tp(__first[__kmodn] ^ __first[__kpmodn]			if (++__km1modn == __n)
	^ __first[(__k - 1) % __n]);			__km1modn = 0;
				if (++__kpmodn == __n)
				__kpmodn = 0;
				if (++__kqmodn == __n)
				__kqmodn = 0;

				result_type __r = 1664525 * _Tp(__first[__kmodn] ^ __first[__kpmodn] ^ __first[__km1modn]);
	__first[__kpmodn] += __r;			__first[__kpmodn] += __r;
	__r += __kmodn;			__r += __kmodn;
	__first[(__k + __q) % __n] += __r;			__first[__kqmodn] += __r;
	__first[__kmodn] = __r;			__first[__kmodn] = __r;
	}			}
	for (size_t __k = __m; __k < __m + __n; ++__k)			for (size_t __k = __m; __k < __m + __n; ++__k)
				MordanteUnsubmitted Not Done Reply Inline Actions I think here we can add some comment too. The modulo in this loop works when `__k == __m`, which is true when `__s + 1 <= __m`. Mordante: I think here we can add some comment too. The modulo in this loop works when `__k == __m`…
				laramielAuthorUnsubmitted Done Reply Inline Actions What comment do you think helps? I think that the modulo pattern is pretty clear, particularly since it mirrors the first loop. laramiel: What comment do you think helps? I think that the modulo pattern is pretty clear, particularly…
				MordanteUnsubmitted Not Done Reply Inline Actions I see the precondition `__s + 1 <= __m` is always true. Might be nice to add a comment to that effect, but not required. Mordante: I see the precondition `__s + 1 <= __m` is always true. Might be nice to add a comment to that…
	{			{
	const size_t __kmodn = __k % __n;			if (++__kmodn == __n)
	const size_t __kpmodn = (__k + __p) % __n;			__kmodn = 0;
	result_type __r = 1566083941 * _Tp(__first[__kmodn] +			if (++__km1modn == __n)
	__first[__kpmodn] +			__km1modn = 0;
	__first[(__k - 1) % __n]);			if (++__kpmodn == __n)
				__kpmodn = 0;
				if (++__kqmodn == __n)
				__kqmodn = 0;

				result_type __r = 1566083941 * _Tp(__first[__kmodn] + __first[__kpmodn] + __first[__km1modn]);
	__first[__kpmodn] ^= __r;			__first[__kpmodn] ^= __r;
	__r -= __kmodn;			__r -= __kmodn;
	__first[(__k + __q) % __n] ^= __r;			__first[__kqmodn] ^= __r;
	__first[__kmodn] = __r;			__first[__kmodn] = __r;
	}			}
	}			}
	}			}

	_LIBCPP_END_NAMESPACE_STD			_LIBCPP_END_NAMESPACE_STD

	_LIBCPP_POP_MACROS			_LIBCPP_POP_MACROS

	#endif // _LIBCPP___RANDOM_SEED_SEQ_H			#endif // _LIBCPP___RANDOM_SEED_SEQ_H

This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Replace modulus operations in std::seed_seq::generate with conditional checks.
ClosedPublic

Details

Diff Detail

Event Timeline

Mac LAPTOP

OLD

NEW

X86

OLD

NEW

Revision Contents

Diff 428463

libcxx/benchmarks/random.bench.cpp

libcxx/include/__random/seed_seq.h

This is an archive of the discontinued LLVM Phabricator instance.

[libc++] Replace modulus operations in std::seed_seq::generate with conditional checks.ClosedPublic

Details

Diff Detail

Event Timeline

Mac LAPTOP

OLD

NEW

X86

OLD

NEW

Revision Contents

Diff 428463

libcxx/benchmarks/random.bench.cpp

libcxx/include/__random/seed_seq.h

[libc++] Replace modulus operations in std::seed_seq::generate with conditional checks.
ClosedPublic