This is an archive of the discontinued LLVM Phabricator instance.

Differential D61878

[libc++] Optimize unordered_{multiset,multimap} equality comparison
AbandonedPublic

Authored by ldionne on May 13 2019, 4:49 PM.

Details

Reviewers

mclow.lists
EricWF
zoecarver

Group Reviewers

Restricted Project

Summary

This patch simplifies the comparison of unordered multiset and multimap by first
trying to find a common prefix between the two containers, in case they
happen to be ordered the same.

This also adds some tests for corner cases of unordered_{multimap,multiset}
comparison that weren't tested previously.
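
To make the common-prefix idea concrete, here is a minimal sketch of what such a comparison could look like for `unordered_multiset`. It is not the code from this patch: `prefix_skipping_equal` is a hypothetical name, and the slow path is simply the pre-existing `equal_range` / `is_permutation` loop started at the first mismatching position.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <unordered_set>
#include <utility>

// Hypothetical helper (not the libc++ implementation): compare two
// unordered multisets, skipping the leading run of elements that happen
// to be stored in the same order in both containers.
template <class V, class H, class P, class A>
bool prefix_skipping_equal(const std::unordered_multiset<V, H, P, A>& x,
                           const std::unordered_multiset<V, H, P, A>& y) {
    if (x.size() != y.size())
        return false;

    typedef typename std::unordered_multiset<V, H, P, A>::const_iterator iter;

    // Fast path: walk both containers side by side until the first mismatch.
    // Sizes are equal, so the three-iterator overload stays in bounds.
    std::pair<iter, iter> mm = std::mismatch(x.begin(), x.end(), y.begin());
    if (mm.first == x.end())
        return true; // identical order and contents

    // Slow path: from the first mismatch onward, compare group by group.
    // equal_range() looks at the whole container, so a group that straddles
    // the prefix boundary is still counted in full.
    for (iter i = mm.first, ex = x.end(); i != ex;) {
        std::pair<iter, iter> xeq = x.equal_range(*i);
        std::pair<iter, iter> yeq = y.equal_range(*i);
        if (std::distance(xeq.first, xeq.second) !=
                std::distance(yeq.first, yeq.second) ||
            !std::is_permutation(xeq.first, xeq.second, yeq.first))
            return false;
        i = xeq.second;
    }
    return true;
}
```

The fast path only ever proves that a prefix matches; it never concludes inequality on its own, which is why the iteration order of the containers does not affect the result.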

Diff Detail

Event Timeline

zoecarver created this revision. May 13 2019, 4:49 PM

Herald added subscribers: libcxx-commits, ldionne. May 13 2019, 4:49 PM

I am quite surprised by the idea that a naked is_permutation can outperform the existing implementation.
However, the test that you linked to is definitely faster on my machine.

I think that result is something that needs to go into our benchmark suite.

include/unordered_map
1654	This seems odd to me. The old code cached both `__x.end()` and `__y.end()`. The new code caches only `__x.end()`. Why? Either it's worth caching `end()` or it is not.
include/unordered_set
960–963	Same comment as in `unordered_map`

I can add a benchmark for this. I have a few other tests (using map, etc.). I will refer to this, but I might have a few questions :)

include/unordered_map
1654	The old code compares the found value to a stored variable whereas here I compare it to `__y.end()`. I don't think it makes a difference. I think it's just a thing I did when re-writing these. I am happy to use either.

add benchmark

Let me know if there is anything I should add to the benchmark (or if I should write it differently). Here is the output from my machine:

----------------------------------------------------------------
Benchmark                     Time             CPU   Iterations
----------------------------------------------------------------
initialize_data           0.000 ns        0.000 ns            0
test_small_old           166805 ns       166343 ns         4356
test_small_new            64089 ns        63915 ns        10408
test_small_range_old     287885 ns       287437 ns         2648
test_small_range_new      50265 ns        50077 ns        10000
test_big_old           95000392 ns     94921500 ns            6
test_big_new           75199528 ns     75144333 ns            9

Unfortunately nowhere near what I was getting with the other test, but I suspect this is more accurate, and there is a bit of improvement (especially for sets with limited ranges).

What I would like here instead of just a benchmark is a test that ensures that the new implementation is faster than the old one (with some fuzz of course). Is that something reasonable to ask for? @mclow.lists Do you agree this would be better? I'm concerned that we won't be looking at benchmark results and might not notice if the new algorithm is slower under some circumstances.

@ldionne I can add a test that asserts one function takes less time than another, but it might fail if tests are run in parallel or if the computer's resources are being used more heavily while one of the tests is being run. I think it is unlikely that such circumstances arise, but it is possible.
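
A timing assertion of the kind discussed here could be sketched roughly as follows. The helper names (`median_ns`, `not_slower_than`) and the `slack` factor are hypothetical, and taking the median of several runs is only one way to dampen the noise mentioned above; it does not remove the risk of spurious failures on a loaded machine.

```cpp
#include <algorithm>
#include <cassert>
#include <chrono>
#include <vector>

// Hypothetical helper: run `f` `runs` times (runs > 0) and return the
// median wall-clock duration in nanoseconds.
template <class F>
long long median_ns(F f, int runs) {
    std::vector<long long> samples;
    samples.reserve(runs);
    for (int i = 0; i < runs; ++i) {
        auto start = std::chrono::steady_clock::now();
        f();
        auto stop = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)
                .count());
    }
    // nth_element places the median at index runs / 2.
    std::nth_element(samples.begin(), samples.begin() + runs / 2,
                     samples.end());
    return samples[runs / 2];
}

// Returns true when `candidate` takes at most `slack` times as long as
// `baseline` (for example, slack = 1.25 tolerates 25% of noise).
template <class F, class G>
bool not_slower_than(F candidate, G baseline, double slack, int runs = 11) {
    long long c = median_ns(candidate, runs);
    long long b = median_ns(baseline, runs);
    return static_cast<double>(c) <= static_cast<double>(b) * slack;
}
```

A test would then wrap the old and new comparisons in lambdas and assert `not_slower_than(new_lambda, old_lambda, slack)` with a generous `slack`.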

While it is harder for me to show why is_permutation is faster, it stands to reason that it would be at least as fast. Essentially, the old function loops through every element of the set. For each element, it finds all elements with the same value. Then it checks is_permutation on that group of elements (which are already required to have equal values and the same count!). Assuming is_permutation works as specified in the standard, it does the same thing. However, it cuts out unnecessary calls to is_permutation, does not have to loop through every element, and has other optimizations. In other words, the standard specifies that is_permutation has at most O(n^2) complexity, and the current (old) function is more complex than that; therefore, even if is_permutation is updated, it should still be faster.
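
For reference, the "naked is_permutation" approach described above boils down to something like the sketch below. `permutation_equal` is a hypothetical name used only for illustration. Note that `std::is_permutation` performs exactly N comparisons when the two ranges are element-wise equal but up to O(N^2) otherwise, and it never uses the hash function.

```cpp
#include <algorithm>
#include <cassert>
#include <unordered_set>

// Hypothetical helper: treat the two containers as plain sequences and ask
// whether one is a permutation of the other.
template <class V, class H, class P, class A>
bool permutation_equal(const std::unordered_multiset<V, H, P, A>& x,
                       const std::unordered_multiset<V, H, P, A>& y) {
    // The size check makes the three-iterator overload safe to use.
    return x.size() == y.size() &&
           std::is_permutation(x.begin(), x.end(), y.begin());
}
```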

I have found cases where the existing code is significantly faster than this new code.
Zoe is investigating.

This revision now requires changes to proceed. May 15 2019, 6:59 PM

I played around with this a bit and (thanks to @mclow.lists) discovered that there are issues with using is_permutation. I tried to take the best of both of these implementations by adding is_permutation's loop which chops off any identical parts and by removing the unneeded call to is_permutation while keeping the equal_range check. Here is the result:

-------------------------------------------------------------------------------------
Benchmark                                          Time             CPU   Iterations
-------------------------------------------------------------------------------------
initialize_data                                0.000 ns        0.000 ns            0
test_old/OLD_compare_small_set                 62502 ns        62483 ns        11055
test_new/NEW_compare_small_set                 26598 ns        26591 ns        27834
test_old/OLD_compare_similar_set               97684 ns        97492 ns         8014
test_new/NEW_compare_similar_set               22088 ns        22080 ns        33696
test_old/OLD_compare_big_set                79731796 ns     79664100 ns           10
test_new/NEW_compare_big_set                74296876 ns     74245222 ns            9
test_old/OLD_compare_big_set_different      71916394 ns     71878300 ns           10
test_new/NEW_compare_big_set_different     147169839 ns    147085200 ns            5
test_old/OLD_compare_oposite_order_set     306415935 ns    305943000 ns            2
test_new/NEW_compare_oposite_order_set     329123963 ns    329029500 ns            2
test_old/OLD_compare_random_set             12786696 ns     12763356 ns           45
test_new/NEW_compare_random_set              3610982 ns      3605559 ns          211
test_old/OLD_compare_random_set_different       2888 ns         2883 ns       250692
test_new/NEW_compare_random_set_different       2882 ns         2868 ns       234480
test_old/OLD_compare_short_string             241644 ns       241291 ns         2902
test_new/NEW_compare_short_string              61157 ns        60933 ns         8343
test_old/OLD_compare_long_string             1369126 ns      1366185 ns          509
test_new/NEW_compare_long_string              142631 ns       142461 ns         6213
test_old/OLD_compare_different_strings           311 ns          310 ns      2278497
test_new/NEW_compare_different_strings           319 ns          318 ns      2245367

Let's break this down. You will notice that in every one of these benchmarks, NEW_* is at least as fast as OLD_*, with two exceptions: compare_different_strings and compare_big_set_different. The first pair is close enough that I won't worry about it, but the second one worries me. For some reason, the new implementation is twice as slow as the old one. However, when I look at the number of times values are compared, the new implementation compares 4999507 values while the old one compares 4999505 values. Two comparisons should not make that big a difference. Additionally, when I run this outside of the benchmark environment, the comparisons take almost the same amount of time. Any thoughts on why this might be?
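
One way to obtain comparison counts like the ones quoted above is to instrument the element type. The sketch below is an assumption about how that could be done, not the code that produced those numbers; `counted_int`, `counted_int_hash` and `count_comparisons` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <unordered_set>

// Hypothetical element type whose operator== counts how often it is called.
struct counted_int {
    int value;
    static std::size_t comparisons;

    friend bool operator==(const counted_int& a, const counted_int& b) {
        ++comparisons;
        return a.value == b.value;
    }
};
std::size_t counted_int::comparisons = 0;

// Hash that forwards to std::hash<int>, so hashing is not counted.
struct counted_int_hash {
    std::size_t operator()(const counted_int& c) const {
        return std::hash<int>()(c.value);
    }
};

// Returns how many element comparisons `x == y` performed.
template <class Set>
std::size_t count_comparisons(const Set& x, const Set& y) {
    counted_int::comparisons = 0;
    bool equal = (x == y);
    (void)equal; // only the side effect on the counter matters here
    return counted_int::comparisons;
}
```

The default `std::equal_to<counted_int>` key predicate also goes through this `operator==`, so both key lookups and value comparisons are included in the count.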

Note: be careful when running these benchmarks; they take almost 5 GB of memory.

zoecarver retitled this revision from "Use is_permutation when comparing containers" to "Change how containers are compared". May 15 2019, 10:01 PM

zoecarver edited the summary of this revision.

zoecarver set the repository for this revision to rCXX libc++.

ldionne requested changes to this revision. May 16 2019, 11:04 AM

ldionne added inline comments.

benchmarks/unordered_set_comp.bench.cpp
83	This should be extracted into an implementation-detail function and used in both the headers and this benchmark, instead of copy-pasting.
include/unordered_set
1534	That's basically equivalent to: `bool res = std::equal(__first1, __last1, __first2); if (res) return true; /* rest of the code after __not_done label */` Or did I miss something? If they're equivalent, I suggest we remove the raw loop.

This revision now requires changes to proceed. May 16 2019, 11:04 AM

Herald added a subscriber: dexonsmith. May 16 2019, 11:04 AM

zoecarver marked 2 inline comments as done. May 16 2019, 11:25 AM

zoecarver added inline comments.

benchmarks/unordered_set_comp.bench.cpp
83	Actually, I could probably just use `operator==` here.
include/unordered_set
1534	No, that would not be the same. This function does not check if the sets are the same; it removes any parts of the sets that are the same. Importantly, if only three elements are the same, it will "chop off" those elements and continue from the third element below. Does that make sense?
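
The distinction drawn in this reply can be illustrated with plain vectors: `std::equal` only answers yes or no, while `std::mismatch` also reports where the two ranges stop agreeing, which is the information needed to "chop off" the common prefix. `common_prefix_length` is a hypothetical helper, and the second range is assumed to be at least as long as the first.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <iterator>
#include <vector>

// Hypothetical helper: number of leading positions at which the two ranges
// hold equal elements. The second range must be at least as long as the first.
template <class It1, class It2>
std::size_t common_prefix_length(It1 first1, It1 last1, It2 first2) {
    return static_cast<std::size_t>(
        std::distance(first1, std::mismatch(first1, last1, first2).first));
}

// Small demonstration of the difference between the two algorithms.
inline void demo_common_prefix() {
    std::vector<int> a = {1, 2, 3, 9, 8};
    std::vector<int> b = {1, 2, 3, 8, 9};
    // std::equal reports only that the ranges differ somewhere...
    assert(!std::equal(a.begin(), a.end(), b.begin()));
    // ...while std::mismatch shows that the first three elements agree.
    assert(common_prefix_length(a.begin(), a.end(), b.begin()) == 3);
}
```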

zoecarver marked an inline comment as done. May 16 2019, 11:36 AM

zoecarver added inline comments.

include/unordered_set
1532–1544	Just realizing now that this should be `__first1`. That might make it faster :)

I made a mistake and did not update the comparison function correctly, so most of the optimizations were not applied. No wonder it was giving such a weird benchmark score. Now that I have updated the comparison function, it is faster every time (with the exception of the last string test which goes back and forth):

Load Average: 6.94, 4.27, 3.07
-------------------------------------------------------------------------------------
Benchmark                                          Time             CPU   Iterations
-------------------------------------------------------------------------------------
initialize_data                                0.000 ns        0.000 ns            0
test_old/OLD_compare_small_set                 64835 ns        64759 ns         9022
test_new/NEW_compare_small_set                 25215 ns        25209 ns        27516
test_old/OLD_compare_similar_set               85983 ns        85929 ns         7419
test_new/NEW_compare_similar_set               23424 ns        23408 ns        31177
test_old/OLD_compare_big_set                88360371 ns     88124500 ns            6
test_new/NEW_compare_big_set                67877017 ns     67856889 ns            9
test_old/OLD_compare_big_set_different      76333837 ns     76093818 ns           11
test_new/NEW_compare_big_set_different      75138186 ns     74695111 ns            9
test_old/OLD_compare_oposite_order_set     324600832 ns    324167000 ns            2
test_new/NEW_compare_oposite_order_set     276399596 ns    275194333 ns            3
test_old/OLD_compare_random_set             12322272 ns     12318583 ns           72
test_new/NEW_compare_random_set              3413009 ns      3411407 ns          216
test_old/OLD_compare_random_set_different       3004 ns         3002 ns       233761
test_new/NEW_compare_random_set_different       2958 ns         2951 ns       225822
test_old/OLD_compare_short_string             237232 ns       237151 ns         3006
test_new/NEW_compare_short_string              62076 ns        61986 ns        11166
test_old/OLD_compare_long_string             1310500 ns      1310243 ns          519
test_new/NEW_compare_long_string              108740 ns       108724 ns         6146
test_old/OLD_compare_different_strings           345 ns          345 ns      1963061
test_new/NEW_compare_different_strings           364 ns          362 ns      1948509

Sorry about that mistake, now it should work properly.

Any more thoughts on this? Friendly ping if you have time to look over it again.

EricWF added inline comments. Jun 9 2019, 12:12 AM

benchmarks/unordered_set_comp.bench.cpp
217	I would be very surprised if the compiler wasn't optimizing your benchmark to: `bool __cached_val = new_comp(u1, u2); while (st.KeepRunning()) { bool val = __cached_val; DoNotOptimize(val); }`
include/unordered_map
2275	You can't use auto.
2278	Use explicit braces to make the nesting here more visible.
2280	You can write this without goto.
2287	There's no need to qualify pair here.

zoecarver marked an inline comment as done. Jun 10 2019, 8:35 PM

zoecarver added inline comments.

benchmarks/unordered_set_comp.bench.cpp
217	Hopefully passing the function directly to `DoNotOptimize` will fix this. Any other suggestions?
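
The hoisting concern raised here is about the inputs as much as the result: if the compiler can prove that the two containers never change inside the loop, it may compute the comparison once. A common way to rule that out is to make the inputs "escape" on every iteration. The sketch below is self-contained and does not use Google Benchmark; `escape` and `run_comparisons` are hypothetical names, and the inline assembly assumes GCC or Clang. In a real benchmark, `benchmark::DoNotOptimize` and `benchmark::ClobberMemory` serve a similar purpose.

```cpp
#include <cassert>
#include <unordered_set>

#if defined(__GNUC__)
// Assumption: GCC or Clang. The empty asm statement receives the pointer and
// claims to touch memory, so the compiler must assume the pointed-to object
// may have been read or modified.
inline void escape(const void* p) { asm volatile("" : : "g"(p) : "memory"); }
#else
// Weaker fallback for other compilers: publish the pointer through a
// volatile variable. This is only a best-effort barrier.
inline void escape(const void* p) {
    static const void* volatile sink;
    sink = p;
}
#endif

// Runs `a == b` `iterations` times and returns how often it was true.
template <class Set>
int run_comparisons(Set& a, Set& b, int iterations) {
    int equal_count = 0;
    for (int i = 0; i < iterations; ++i) {
        escape(&a); // inputs may have changed: the comparison cannot be hoisted
        escape(&b);
        bool result = (a == b);
        escape(&result); // the result is observed, so it cannot be dropped
        equal_count += result ? 1 : 0;
    }
    return equal_count;
}
```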

updated based on Eric's suggestions

Also, here are some more benchmarks because I technically changed the comparison function:

-------------------------------------------------------------------------------------
Benchmark                                          Time             CPU   Iterations
-------------------------------------------------------------------------------------
initialize_data                                0.000 ns        0.000 ns            0
test_old/OLD_compare_small_set                 72746 ns        72391 ns        10212
test_new/NEW_compare_small_set                 27201 ns        27132 ns        24270
test_old/OLD_compare_similar_set              100853 ns       100354 ns         7101
test_new/NEW_compare_similar_set               22151 ns        22101 ns        28880
test_old/OLD_compare_big_set                85456872 ns     85059167 ns            6
test_new/NEW_compare_big_set                78148398 ns     78032222 ns            9
test_old/OLD_compare_big_set_different      80389034 ns     80301000 ns            9
test_new/NEW_compare_big_set_different      76252136 ns     76169200 ns           10
test_old/OLD_compare_oposite_order_set     338906270 ns    338359500 ns            2
test_new/NEW_compare_oposite_order_set     308761215 ns    308473500 ns            2
test_old/OLD_compare_random_set             11021038 ns     11011500 ns           56
test_new/NEW_compare_random_set              3717716 ns      3710818 ns          192
test_old/OLD_compare_random_set_different       2720 ns         2718 ns       261663
test_new/NEW_compare_random_set_different       2674 ns         2671 ns       260192
test_old/OLD_compare_short_string            2381223 ns      2377644 ns          295
test_new/NEW_compare_short_string             680146 ns       679617 ns          991
test_old/OLD_compare_long_string            19782950 ns     19765727 ns           33
test_new/NEW_compare_long_string             2370050 ns      2367704 ns          301
test_old/OLD_compare_different_strings          2068 ns         2064 ns       343178
test_new/NEW_compare_different_strings          2046 ns         2043 ns       346945

and (to be fair to the times this function was slower):

-------------------------------------------------------------------------------------
Benchmark                                          Time             CPU   Iterations
-------------------------------------------------------------------------------------
initialize_data                                0.000 ns        0.000 ns            0
test_old/OLD_compare_small_set                 72221 ns        71934 ns         8892
test_new/NEW_compare_small_set                 30014 ns        29660 ns        25674
test_old/OLD_compare_similar_set              107336 ns       106814 ns         6611
test_new/NEW_compare_similar_set               23795 ns        23757 ns        29510
test_old/OLD_compare_big_set                87814006 ns     87564750 ns            8
test_new/NEW_compare_big_set                77258477 ns     77124000 ns            9
test_old/OLD_compare_big_set_different      86371487 ns     86087222 ns            9
test_new/NEW_compare_big_set_different      79709931 ns     79506000 ns            8
test_old/OLD_compare_oposite_order_set     351100203 ns    349060500 ns            2
test_new/NEW_compare_oposite_order_set     312774516 ns    312366000 ns            2
test_old/OLD_compare_random_set             11675568 ns     11655017 ns           58
test_new/NEW_compare_random_set              3614458 ns      3611199 ns          196
test_old/OLD_compare_random_set_different       2807 ns         2802 ns       229585
test_new/NEW_compare_random_set_different       3031 ns         3022 ns       241292
test_old/OLD_compare_short_string            2557198 ns      2545343 ns          274
test_new/NEW_compare_short_string             739160 ns       736574 ns          976
test_old/OLD_compare_long_string            17619450 ns     17561594 ns           32
test_new/NEW_compare_long_string             2347907 ns      2343063 ns          284
test_old/OLD_compare_different_strings           134 ns          133 ns      5265256
test_new/NEW_compare_different_strings           147 ns          146 ns      4736707

include/unordered_map
2280	Yep. It requires one more comparison, but I think it is worth it. We should probably also update `is_permutation` so that it can eventually be a `constexpr`.

Would it be better to make a patch with just the benchmarks?

ldionne added a reviewer: Restricted Project. Nov 2 2020, 2:14 PM

[Github PR transition cleanup]

This patch doesn't seem correct to me: unordered containers are not ordered, so I don't think we can do a linear walk of __first1 and __first2, compare elements side by side and draw any conclusion from that. Abandoning.

Herald added a project: Restricted Project. Sep 12 2023, 7:56 AM

ldionne abandoned this revision. Sep 12 2023, 7:56 AM

In D61878#4644316, @ldionne wrote:

[Github PR transition cleanup]

This patch doesn't seem correct to me: unordered containers are not ordered, so I don't think we can do a linear walk of __first1 and __first2, compare elements side by side and draw any conclusion from that. Abandoning.

Looks to me like this patch is correct. It uses the linear walk as a "fast path" to ignore the prefix of the two containers that are both in the same order and equal, then reverts to the pre-existing slow path for the out-of-order (or unequal) suffix.

I think Duncan is right here, actually. There is a potentially useful optimization that we can do by looking at the matching prefix of the container and skipping repeated lookups based on that. Re-opening.

Rebase and update the approach. Some of the previous patch was not correct (e.g. we can't get rid of is_permutation for unordered_map).

I'm unable to get decisive benchmark results yet though.

Herald added a project: Restricted Project. Sep 12 2023, 3:14 PM

ldionne retitled this revision from "[libc++] Optimize unordered_{set,map} equality comparison" to "[libc++] Optimize unordered_{multiset,multimap} equality comparison". Sep 12 2023, 3:17 PM

ldionne edited the summary of this revision.

ldionne added inline comments.

libcxx/include/unordered_map
1989 ↗	(On Diff #556612)	@dexonsmith Do you agree that we should be able to apply the same `mismatch` optimization here and in `std::unordered_set`?

Now I know why I'm unable to get decisive benchmark results: I added benchmarks for unordered_set but I changed unordered_multiset. What a shame. I'll update tomorrow.

Harbormaster completed remote builds in B257108: Diff 556612. Sep 12 2023, 3:51 PM

Apply the optimization to non-multi containers too. Long story short, I am not convinced that these changes are a real improvement.

Indeed, map lookup starts by computing the hash of the key. If computing the hash is cheaper than fully comparing the key for equality, it can happen that using std::mismatch actually pessimizes the algorithm, since it always checks for full key equality.

Consequently, I am going to drop this patch and only cherry-pick some of the changes in this patch (e.g. testing improvements).

Before patch:

----------------------------------------------------------------------------------------------------
Benchmark                                                          Time             CPU   Iterations
----------------------------------------------------------------------------------------------------
BM_Compare_same_container/unordered_set_string/1024           155440 ns       155422 ns         4495
BM_Compare_same_container/unordered_set_int/1024                2501 ns         2501 ns       269041
BM_Compare_different_containers/unordered_set_string/1024       67.7 ns         67.7 ns     10428771
BM_Compare_different_containers/unordered_set_int/1024          1.33 ns         1.33 ns    528206212
BM_Compare_shared_prefix/unordered_set_string/1024              68.1 ns         68.1 ns     10355336
BM_Compare_shared_prefix/unordered_set_int/1024                 1.33 ns         1.33 ns    527314912

After patch:

----------------------------------------------------------------------------------------------------
Benchmark                                                          Time             CPU   Iterations
----------------------------------------------------------------------------------------------------
BM_Compare_same_container/unordered_set_string/1024           153609 ns       153589 ns         4515
BM_Compare_same_container/unordered_set_int/1024                2390 ns         2390 ns       279170
BM_Compare_different_containers/unordered_set_string/1024       70.7 ns         70.7 ns      9918807
BM_Compare_different_containers/unordered_set_int/1024          3.11 ns         3.11 ns    173664520
BM_Compare_shared_prefix/unordered_set_string/1024              69.7 ns         69.7 ns     10042753
BM_Compare_shared_prefix/unordered_set_int/1024                 1.44 ns         1.44 ns    486236047

ldionne abandoned this revision. Sep 18 2023, 12:45 PM

See https://github.com/llvm/llvm-project/pull/66692 for the changes I am keeping.

Harbormaster completed remote builds in B257356: Diff 556969. Sep 18 2023, 6:08 PM

Revision Contents

Path                                      Size
benchmarks/unordered_set_comp.bench.cpp   263 lines
include/unordered_map                     50 lines
include/unordered_set                     43 lines

Diff 199877

benchmarks/unordered_set_comp.bench.cpp

This file was added.

				// This benchmarks the difference between the old comparison (old_comp)
				// and new comparison (is_permutation) methods for unordered multisets.

				#include <algorithm>
				#include <chrono>
				#include <random>
				#include <string>
				#include <unordered_set>

				#include "benchmark/benchmark.h"

				using namespace std;

				// util

				template <typename T>
				struct simple_rand // thanks @mclow for this :)
				{
				static const size_t seed_from_clock = (size_t) -1;
				simple_rand(T min, T max, size_t seed = seed_from_clock) :
				gen_(seed != seed_from_clock
				? seed
				: std::chrono::system_clock::now().time_since_epoch().count()),
				dist_(min, max) {}

				T operator ()() { return dist_(gen_); }

				std::default_random_engine gen_;
				std::uniform_int_distribution<T> dist_;
				};

				std::string random_string(size_t size)
				{
				auto random_char = []() -> char
				{
				const char char_set[] =
				"0123456789"
				"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
				"abcdefghijklmnopqrstuvwxyz";
				// sizeof counts the trailing '\0'; subtract 2 so the last valid index is 'z'.
				int max_index = sizeof(char_set) - 2;
				simple_rand r(0, max_index);
				return char_set[r()];
				};

				std::string str(size, 0);
				std::generate_n(str.begin(), size, random_char);
				return str;
				}

				// setup

				unordered_multiset<int> u1;
				unordered_multiset<int> u2;

				unordered_multiset<int> u1_diff;
				unordered_multiset<int> u2_diff;

				unordered_multiset<int> u1_small;
				unordered_multiset<int> u2_small;

				unordered_multiset<int> u1_small_range;
				unordered_multiset<int> u2_small_range;

				unordered_multiset<int> u1_back;
				unordered_multiset<int> u2_back;

				unordered_multiset<int> u1_rand;
				unordered_multiset<int> u2_rand;

				unordered_multiset<int> u1_rand_diff;
				unordered_multiset<int> u2_rand_diff;

				unordered_multiset<std::string> u1_str_diff;
				unordered_multiset<std::string> u2_str_diff;

				unordered_multiset<std::string> u1_str_long;
				unordered_multiset<std::string> u2_str_long;

				unordered_multiset<std::string> u1_str_short;
				unordered_multiset<std::string> u2_str_short;

				template <class V, class H, class P, class A>
				bool
				new_comp(const unordered_multiset<V, H, P, A>& x,
				const unordered_multiset<V, H, P, A>& y)
				{
				return x == y;
				}

				template <class V, class H, class P, class A>
				bool
				old_comp(const unordered_multiset<V, H, P, A>& x,
				const unordered_multiset<V, H, P, A>& y)
				{
				if (x.size() != y.size())
				return false;

				typedef typename unordered_multiset<V, H, P, A>::const_iterator const_iterator;
				typedef pair<const_iterator, const_iterator> pair_t;
				for (const_iterator i = x.begin(), __ex = x.end(); i != __ex;)
				{
				pair_t xeq = x.equal_range(*i);
				pair_t yeq = y.equal_range(*i);
				if (std::distance(xeq.first, xeq.second) !=
				std::distance(yeq.first, yeq.second) ||
				!std::is_permutation(xeq.first, xeq.second, yeq.first))
				return false;
				i = xeq.second;
				}
				return true;
				}

				void initialize_data(benchmark::State&)
				{
				// large set
				for (unsigned i = 0; i < 1000 * 1000; ++i)
				{
				u1.insert(i);
				u2.insert(i);
				}
				for (unsigned i = 0; i < 1000; ++i)
				{
				u1.insert(i % 10);
				u2.insert(i % 10);
				}

				u1_diff = u1;
				u2_diff = u2;
				for (unsigned i = 0; i < 100; ++i)
				{
				u1_diff.insert(i);
				u2_diff.insert(0);
				}

				// small set
				for (unsigned i = 0; i < 500; ++i)
				{
				u1_small.insert(i);
				u2_small.insert(i);
				}
				for (unsigned i = 0; i < 500; ++i)
				{
				u1_small.insert(i % 10);
				u2_small.insert(i % 10);
				}

				// small range
				for (unsigned i = 0; i < 1000; ++i)
				{
				u1_small_range.insert(i % 2);
				u2_small_range.insert(i % 2);
				}

				// back / front
				for (unsigned i = 0; i < 1000 * 1000; ++i)
				{
				u1_back.insert(i);
				u2_back.insert((1000 * 1000) - i);
				}
				for (unsigned i = 1000 * 1000; i > 0; --i)
				{
				u1_back.insert(i);
				u2_back.insert((1000 * 1000) - i);
				}

				// random fill
				simple_rand r(1, 1000);
				for (unsigned i = 0; i < 1000 * 100; ++i)
				{
				int value = r();
				u1_rand.insert(value);
				u2_rand.insert(value);
				}

				for (unsigned i = 0; i < 1000 * 50; ++i)
				{
				u1_rand_diff.insert(r());
				u2_rand_diff.insert(r());
				}

				// string set
				for (unsigned i = 0; i < 1000 * 10; ++i)
				{
				std::string str = random_string(5);
				u1_str_short.insert(str);
				u2_str_short.insert(str);
				}

				for (unsigned i = 0; i < 1000 * 10; ++i)
				{
				std::string str = random_string(100);
				u1_str_long.insert(str);
				u2_str_long.insert(str);
				}

				for (unsigned i = 0; i < 1000 * 10; ++i)
				{
				u1_str_diff.insert(random_string(12));
				u2_str_diff.insert(random_string(12));
				}
				}

				// benchmarks

				template<class V>
				void test_old(benchmark::State& st, unordered_multiset<V> u1, unordered_multiset<V> u2)
				{
				while (st.KeepRunning()) {
				bool val = old_comp(u1, u2);
				benchmark::DoNotOptimize(val);
				}
				}

				template<class V>
				void test_new(benchmark::State& st, unordered_multiset<V> u1, unordered_multiset<V> u2)
				{
				while (st.KeepRunning()) {
				bool val = new_comp(u1, u2);
				benchmark::DoNotOptimize(val);
				}
				}

				BENCHMARK(initialize_data); // TODO: how do I initialize this?

				// tests a small set (1000 items, 50 items overlap)
				BENCHMARK_CAPTURE(test_old, OLD_compare_small_set, u1_small, u2_small);
				BENCHMARK_CAPTURE(test_new, NEW_compare_small_set, u1_small, u2_small);

				// tests a small set (1000 items, 500 items overlap)
				BENCHMARK_CAPTURE(test_old, OLD_compare_similar_set, u1_small_range, u2_small_range);
				BENCHMARK_CAPTURE(test_new, NEW_compare_similar_set, u1_small_range, u2_small_range);

				// tests a big set (1,000,000 items, 100 items overlap)
				BENCHMARK_CAPTURE(test_old, OLD_compare_big_set, u1, u2);
				BENCHMARK_CAPTURE(test_new, NEW_compare_big_set, u1, u2);

				// tests a big set (1,000,000 items, 100 items overlap, 100 items different)
				BENCHMARK_CAPTURE(test_old, OLD_compare_big_set_different, u1_diff, u2_diff);
				BENCHMARK_CAPTURE(test_new, NEW_compare_big_set_different, u1_diff, u2_diff);

				// tests a big set (2,000,000 items, items are inserted in reverse order, 1,000,000 items overlap)
				BENCHMARK_CAPTURE(test_old, OLD_compare_oposite_order_set, u1_back, u2_back);
				BENCHMARK_CAPTURE(test_new, NEW_compare_oposite_order_set, u1_back, u2_back);

				// tests a random set (100,000 items, range from 0 to 1000)
				BENCHMARK_CAPTURE(test_old, OLD_compare_random_set, u1_rand, u2_rand);
				BENCHMARK_CAPTURE(test_new, NEW_compare_random_set, u1_rand, u2_rand);

				// tests a random set (50,000 items, range from 0 to 1000, each different)
				BENCHMARK_CAPTURE(test_old, OLD_compare_random_set_different, u1_rand_diff, u2_rand_diff);
				BENCHMARK_CAPTURE(test_new, NEW_compare_random_set_different, u1_rand_diff, u2_rand_diff);

				// tests a random set of strings (10,000 items, random strings, 5 chars)
				BENCHMARK_CAPTURE(test_old, OLD_compare_short_string, u1_str_short, u2_str_short);
				BENCHMARK_CAPTURE(test_new, NEW_compare_short_string, u1_str_short, u2_str_short);

				// tests a random set of strings (10,000 items, random strings, 100 chars)
				BENCHMARK_CAPTURE(test_old, OLD_compare_long_string, u1_str_long, u2_str_long);
				BENCHMARK_CAPTURE(test_new, NEW_compare_long_string, u1_str_long, u2_str_long);

				// tests a random set of strings (10,000 items, random strings, 12 chars, not necessarily the same in both sets)
				BENCHMARK_CAPTURE(test_old, OLD_compare_different_strings, u1_str_diff, u2_str_diff);
				BENCHMARK_CAPTURE(test_new, NEW_compare_different_strings, u1_str_diff, u2_str_diff);

				BENCHMARK_MAIN();

include/unordered_map

[...]
public:
unordered_map(size_type n, const allocator_type& a)
: unordered_map(n, hasher(), key_equal(), a) {} // C++14
unordered_map(size_type n, const hasher& hf, const allocator_type& a)
: unordered_map(n, hf, key_equal(), a) {} // C++14
template <class InputIterator>
unordered_map(InputIterator f, InputIterator l, size_type n, const allocator_type& a)
: unordered_map(f, l, n, hasher(), key_equal(), a) {} // C++14
template <class InputIterator>
unordered_map(InputIterator f, InputIterator l, size_type n, const hasher& hf,
const allocator_type& a)
: unordered_map(f, l, n, hf, key_equal(), a) {} // C++14
unordered_map(initializer_list<value_type> il, size_type n, const allocator_type& a)
: unordered_map(il, n, hasher(), key_equal(), a) {} // C++14
unordered_map(initializer_list<value_type> il, size_type n, const hasher& hf,
const allocator_type& a)
: unordered_map(il, n, hf, key_equal(), a) {} // C++14
~unordered_map();
unordered_map& operator=(const unordered_map&);
unordered_map& operator=(unordered_map&&)
noexcept(
allocator_type::propagate_on_container_move_assignment::value &&
is_nothrow_move_assignable<allocator_type>::value &&
[...]
public:
unordered_multimap(size_type n, const allocator_type& a)
: unordered_multimap(n, hasher(), key_equal(), a) {} // C++14
unordered_multimap(size_type n, const hasher& hf, const allocator_type& a)
: unordered_multimap(n, hf, key_equal(), a) {} // C++14
template <class InputIterator>
unordered_multimap(InputIterator f, InputIterator l, size_type n, const allocator_type& a)		unordered_multimap(InputIterator f, InputIterator l, size_type n, const allocator_type& a)
: unordered_multimap(f, l, n, hasher(), key_equal(), a) {} // C++14		: unordered_multimap(f, l, n, hasher(), key_equal(), a) {} // C++14
template <class InputIterator>		template <class InputIterator>
unordered_multimap(InputIterator f, InputIterator l, size_type n, const hasher& hf,		unordered_multimap(InputIterator f, InputIterator l, size_type n, const hasher& hf,
const allocator_type& a)		const allocator_type& a)
: unordered_multimap(f, l, n, hf, key_equal(), a) {} // C++14		: unordered_multimap(f, l, n, hf, key_equal(), a) {} // C++14
unordered_multimap(initializer_list<value_type> il, size_type n, const allocator_type& a)		unordered_multimap(initializer_list<value_type> il, size_type n, const allocator_type& a)
: unordered_multimap(il, n, hasher(), key_equal(), a) {} // C++14		: unordered_multimap(il, n, hasher(), key_equal(), a) {} // C++14
unordered_multimap(initializer_list<value_type> il, size_type n, const hasher& hf,		unordered_multimap(initializer_list<value_type> il, size_type n, const hasher& hf,
const allocator_type& a)		const allocator_type& a)
: unordered_multimap(il, n, hf, key_equal(), a) {} // C++14		: unordered_multimap(il, n, hf, key_equal(), a) {} // C++14
~unordered_multimap();		~unordered_multimap();
unordered_multimap& operator=(const unordered_multimap&);		unordered_multimap& operator=(const unordered_multimap&);
unordered_multimap& operator=(unordered_multimap&&)		unordered_multimap& operator=(unordered_multimap&&)
noexcept(		noexcept(
allocator_type::propagate_on_container_move_assignment::value &&		allocator_type::propagate_on_container_move_assignment::value &&
is_nothrow_move_assignable<allocator_type>::value &&		is_nothrow_move_assignable<allocator_type>::value &&
▲ Show 20 Lines • Show All 652 Lines • ▼ Show 20 Lines	#if _LIBCPP_STD_VER > 11
unordered_map(size_type __n, const hasher& __hf, const allocator_type& __a)		unordered_map(size_type __n, const hasher& __hf, const allocator_type& __a)
: unordered_map(__n, __hf, key_equal(), __a) {}		: unordered_map(__n, __hf, key_equal(), __a) {}
template <class _InputIterator>		template <class _InputIterator>
_LIBCPP_INLINE_VISIBILITY		_LIBCPP_INLINE_VISIBILITY
unordered_map(_InputIterator __first, _InputIterator __last, size_type __n, const allocator_type& __a)		unordered_map(_InputIterator __first, _InputIterator __last, size_type __n, const allocator_type& __a)
: unordered_map(__first, __last, __n, hasher(), key_equal(), __a) {}		: unordered_map(__first, __last, __n, hasher(), key_equal(), __a) {}
template <class _InputIterator>		template <class _InputIterator>
_LIBCPP_INLINE_VISIBILITY		_LIBCPP_INLINE_VISIBILITY
unordered_map(_InputIterator __first, _InputIterator __last, size_type __n, const hasher& __hf,		unordered_map(_InputIterator __first, _InputIterator __last, size_type __n, const hasher& __hf,
const allocator_type& __a)		const allocator_type& __a)
: unordered_map(__first, __last, __n, __hf, key_equal(), __a) {}		: unordered_map(__first, __last, __n, __hf, key_equal(), __a) {}
_LIBCPP_INLINE_VISIBILITY		_LIBCPP_INLINE_VISIBILITY
unordered_map(initializer_list<value_type> __il, size_type __n, const allocator_type& __a)		unordered_map(initializer_list<value_type> __il, size_type __n, const allocator_type& __a)
: unordered_map(__il, __n, hasher(), key_equal(), __a) {}		: unordered_map(__il, __n, hasher(), key_equal(), __a) {}
_LIBCPP_INLINE_VISIBILITY		_LIBCPP_INLINE_VISIBILITY
unordered_map(initializer_list<value_type> __il, size_type __n, const hasher& __hf,		unordered_map(initializer_list<value_type> __il, size_type __n, const hasher& __hf,
const allocator_type& __a)		const allocator_type& __a)
: unordered_map(__il, __n, __hf, key_equal(), __a) {}		: unordered_map(__il, __n, __hf, key_equal(), __a) {}
#endif		#endif
_LIBCPP_INLINE_VISIBILITY		_LIBCPP_INLINE_VISIBILITY
~unordered_map() {		~unordered_map() {
static_assert(sizeof(__diagnose_unordered_container_requirements<_Key, _Hash, _Pred>(0)), "");		static_assert(sizeof(__diagnose_unordered_container_requirements<_Key, _Hash, _Pred>(0)), "");
}		}

▲ Show 20 Lines • Show All 669 Lines • ▼ Show 20 Lines

template <class _Key, class _Tp, class _Hash, class _Pred, class _Alloc>		template <class _Key, class _Tp, class _Hash, class _Pred, class _Alloc>
bool		bool
operator==(const unordered_map<_Key, _Tp, _Hash, _Pred, _Alloc>& __x,		operator==(const unordered_map<_Key, _Tp, _Hash, _Pred, _Alloc>& __x,
const unordered_map<_Key, _Tp, _Hash, _Pred, _Alloc>& __y)		const unordered_map<_Key, _Tp, _Hash, _Pred, _Alloc>& __y)
{		{
if (__x.size() != __y.size())		if (__x.size() != __y.size())
return false;		return false;
typedef typename unordered_map<_Key, _Tp, _Hash, _Pred, _Alloc>::const_iterator		typedef
		typename unordered_map<_Key, _Tp, _Hash, _Pred, _Alloc>::const_iterator
const_iterator;		const_iterator;
for (const_iterator __i = __x.begin(), __ex = __x.end(), __ey = __y.end();		for (const_iterator __i = __x.begin(), __ex = __x.end(); __i != __ex; ++__i)
__i != __ex; ++__i)
{		{
const_iterator __j = __y.find(__i->first);		const_iterator __found = __y.find(__i->first);
if (__j == __ey || !(*__i == *__j))		if (__found == __y.end() || !(*__found == *__i))
return false;		return false;
		mclow.lists: This seems odd to me. The old code cached both `__x.end()` and `__y.end()`. The new code caches only `__x.end()`. Why? Either it's worth caching `end()` or it is not.
		zoecarver: The old code compares the found value to a stored variable whereas here I compare it to `__y.end()`. I don't think it makes a difference. I think it's just a thing I did when re-writing these. I am happy to use either.
}		}
return true;		return true;
}		}
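
The caching question raised in the inline comments can be made concrete with a small sketch. It assumes a free function named `maps_equal` (a hypothetical name, not part of libc++ or of this patch) and caches both `end()` iterators once, as the old code did. Whether that is measurably faster than calling `__y.end()` inside the loop is not something this sketch shows.

```cpp
#include <string>
#include <unordered_map>

// Hypothetical helper, not libc++ code: the same per-key lookup as the
// operator== above, with both end() iterators cached before the loop.
template <class Map>
bool maps_equal(const Map& x, const Map& y) {
    if (x.size() != y.size())
        return false;
    typedef typename Map::const_iterator const_iterator;
    const const_iterator ex = x.end();  // cached end of x
    const const_iterator ey = y.end();  // cached end of y
    for (const_iterator i = x.begin(); i != ex; ++i) {
        const_iterator found = y.find(i->first);
        // Missing key, or same key with a different mapped value.
        if (found == ey || !(*found == *i))
            return false;
    }
    return true;
}
```

Caching or not caching both iterators consistently addresses the reviewer's point; the comparison result is the same either way.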

template <class _Key, class _Tp, class _Hash, class _Pred, class _Alloc>		template <class _Key, class _Tp, class _Hash, class _Pred, class _Alloc>
inline _LIBCPP_INLINE_VISIBILITY		inline _LIBCPP_INLINE_VISIBILITY
bool		bool
operator!=(const unordered_map<_Key, _Tp, _Hash, _Pred, _Alloc>& __x,		operator!=(const unordered_map<_Key, _Tp, _Hash, _Pred, _Alloc>& __x,
▲ Show 20 Lines • Show All 110 Lines • ▼ Show 20 Lines	#if _LIBCPP_STD_VER > 11
unordered_multimap(size_type __n, const hasher& __hf, const allocator_type& __a)		unordered_multimap(size_type __n, const hasher& __hf, const allocator_type& __a)
: unordered_multimap(__n, __hf, key_equal(), __a) {}		: unordered_multimap(__n, __hf, key_equal(), __a) {}
template <class _InputIterator>		template <class _InputIterator>
_LIBCPP_INLINE_VISIBILITY		_LIBCPP_INLINE_VISIBILITY
unordered_multimap(_InputIterator __first, _InputIterator __last, size_type __n, const allocator_type& __a)		unordered_multimap(_InputIterator __first, _InputIterator __last, size_type __n, const allocator_type& __a)
: unordered_multimap(__first, __last, __n, hasher(), key_equal(), __a) {}		: unordered_multimap(__first, __last, __n, hasher(), key_equal(), __a) {}
template <class _InputIterator>		template <class _InputIterator>
_LIBCPP_INLINE_VISIBILITY		_LIBCPP_INLINE_VISIBILITY
unordered_multimap(_InputIterator __first, _InputIterator __last, size_type __n, const hasher& __hf,		unordered_multimap(_InputIterator __first, _InputIterator __last, size_type __n, const hasher& __hf,
const allocator_type& __a)		const allocator_type& __a)
: unordered_multimap(__first, __last, __n, __hf, key_equal(), __a) {}		: unordered_multimap(__first, __last, __n, __hf, key_equal(), __a) {}
_LIBCPP_INLINE_VISIBILITY		_LIBCPP_INLINE_VISIBILITY
unordered_multimap(initializer_list<value_type> __il, size_type __n, const allocator_type& __a)		unordered_multimap(initializer_list<value_type> __il, size_type __n, const allocator_type& __a)
: unordered_multimap(__il, __n, hasher(), key_equal(), __a) {}		: unordered_multimap(__il, __n, hasher(), key_equal(), __a) {}
_LIBCPP_INLINE_VISIBILITY		_LIBCPP_INLINE_VISIBILITY
unordered_multimap(initializer_list<value_type> __il, size_type __n, const hasher& __hf,		unordered_multimap(initializer_list<value_type> __il, size_type __n, const hasher& __hf,
const allocator_type& __a)		const allocator_type& __a)
: unordered_multimap(__il, __n, __hf, key_equal(), __a) {}		: unordered_multimap(__il, __n, __hf, key_equal(), __a) {}
#endif		#endif
_LIBCPP_INLINE_VISIBILITY		_LIBCPP_INLINE_VISIBILITY
~unordered_multimap() {		~unordered_multimap() {
static_assert(sizeof(__diagnose_unordered_container_requirements<_Key, _Hash, _Pred>(0)), "");		static_assert(sizeof(__diagnose_unordered_container_requirements<_Key, _Hash, _Pred>(0)), "");
}		}

▲ Show 20 Lines • Show All 469 Lines • ▼ Show 20 Lines

template <class _Key, class _Tp, class _Hash, class _Pred, class _Alloc>		template <class _Key, class _Tp, class _Hash, class _Pred, class _Alloc>
bool		bool
operator==(const unordered_multimap<_Key, _Tp, _Hash, _Pred, _Alloc>& __x,		operator==(const unordered_multimap<_Key, _Tp, _Hash, _Pred, _Alloc>& __x,
const unordered_multimap<_Key, _Tp, _Hash, _Pred, _Alloc>& __y)		const unordered_multimap<_Key, _Tp, _Hash, _Pred, _Alloc>& __y)
{		{
if (__x.size() != __y.size())		if (__x.size() != __y.size())
return false;		return false;
typedef typename unordered_multimap<_Key, _Tp, _Hash, _Pred, _Alloc>::const_iterator
		auto __first1 = __x.begin();
		EricWF: You can't use auto.
		auto __last1 = __x.end();
		auto __first2 = __y.begin();
		for (; __first1 != __last1; ++__first1, (void) ++__first2)
		EricWF: Use explicit braces to make the nesting here more visible.
		if (!(*__first1 == *__first2))
		goto __not_done;
		EricWF: You can write this without goto.
		zoecarver: Yep. It requires one more comparison, but I think it is worth it. We should probably also update `is_permutation` so that it can eventually be `constexpr`.
		return true;

		__not_done:
		typedef
		typename _VSTD::unordered_multimap<_Key, _Tp, _Hash, _Pred, _Alloc>::const_iterator
const_iterator;		const_iterator;
typedef pair<const_iterator, const_iterator> _EqRng;		typedef _VSTD::pair<const_iterator, const_iterator> _EqRng;
		EricWF: There's no need to qualify pair here.
for (const_iterator __i = __x.begin(), __ex = __x.end(); __i != __ex;)		for (const_iterator __i = __first1; __i != __x.end();)
{		{
_EqRng __xeq = __x.equal_range(__i->first);		_EqRng __xeq = __x.equal_range(__i->first);
_EqRng __yeq = __y.equal_range(__i->first);		_EqRng __yeq = __y.equal_range(__i->first);
if (_VSTD::distance(__xeq.first, __xeq.second) !=		if (_VSTD::distance(__xeq.first, __xeq.second) !=
_VSTD::distance(__yeq.first, __yeq.second) ||		_VSTD::distance(__yeq.first, __yeq.second))
!_VSTD::is_permutation(__xeq.first, __xeq.second, __yeq.first))
return false;		return false;
__i = __xeq.second;		__i = __xeq.second;
}		}
return true;		return true;
}		}

template <class _Key, class _Tp, class _Hash, class _Pred, class _Alloc>		template <class _Key, class _Tp, class _Hash, class _Pred, class _Alloc>
inline _LIBCPP_INLINE_VISIBILITY		inline _LIBCPP_INLINE_VISIBILITY
Show All 10 Lines
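
The review comments on the `unordered_multimap` comparison (no `auto`, explicit braces, no `goto`) can be sketched as follows. This is a minimal illustration under assumptions, not the patch itself: `multimaps_equal` is a hypothetical free function, it uses `std::` names rather than libc++ internals, and it keeps the `is_permutation` check from the old code. The prefix loop falls through to the per-key loop, which costs one extra `i != last1` comparison when the whole range matches.

```cpp
#include <algorithm>
#include <iterator>
#include <string>
#include <unordered_map>
#include <utility>

// Hypothetical helper, not libc++ code.
template <class Key, class T, class Hash, class Pred, class Alloc>
bool multimaps_equal(const std::unordered_multimap<Key, T, Hash, Pred, Alloc>& x,
                     const std::unordered_multimap<Key, T, Hash, Pred, Alloc>& y) {
    if (x.size() != y.size())
        return false;
    typedef typename std::unordered_multimap<Key, T, Hash, Pred, Alloc>::const_iterator
        const_iterator;
    typedef std::pair<const_iterator, const_iterator> EqRng;

    // Skip the leading elements that are equal position by position.
    // Sizes are equal, so first2 cannot run past y.end() before first1 stops.
    const_iterator first1 = x.begin();
    const_iterator last1 = x.end();
    const_iterator first2 = y.begin();
    while (first1 != last1 && *first1 == *first2) {
        ++first1;
        ++first2;
    }

    // Compare the remaining elements one group of equivalent keys at a time.
    // If the prefix covered everything, this loop body never runs.
    for (const_iterator i = first1; i != last1;) {
        EqRng xeq = x.equal_range(i->first);
        EqRng yeq = y.equal_range(i->first);
        if (std::distance(xeq.first, xeq.second) != std::distance(yeq.first, yeq.second) ||
            !std::is_permutation(xeq.first, xeq.second, yeq.first)) {
            return false;
        }
        i = xeq.second;
    }
    return true;
}
```

Starting the second loop in the middle of a key group is safe here because `equal_range` always returns the whole group for that key.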

include/unordered_set

Show First 20 Lines • Show All 69 Lines • ▼ Show 20 Lines	public:
unordered_set(initializer_list<value_type>, size_type n = 0,		unordered_set(initializer_list<value_type>, size_type n = 0,
const hasher& hf = hasher(), const key_equal& eql = key_equal(),		const hasher& hf = hasher(), const key_equal& eql = key_equal(),
const allocator_type& a = allocator_type());		const allocator_type& a = allocator_type());
unordered_set(size_type n, const allocator_type& a); // C++14		unordered_set(size_type n, const allocator_type& a); // C++14
unordered_set(size_type n, const hasher& hf, const allocator_type& a); // C++14		unordered_set(size_type n, const hasher& hf, const allocator_type& a); // C++14
template <class InputIterator>		template <class InputIterator>
unordered_set(InputIterator f, InputIterator l, size_type n, const allocator_type& a); // C++14		unordered_set(InputIterator f, InputIterator l, size_type n, const allocator_type& a); // C++14
template <class InputIterator>		template <class InputIterator>
unordered_set(InputIterator f, InputIterator l, size_type n,		unordered_set(InputIterator f, InputIterator l, size_type n,
const hasher& hf, const allocator_type& a); // C++14		const hasher& hf, const allocator_type& a); // C++14
unordered_set(initializer_list<value_type> il, size_type n, const allocator_type& a); // C++14		unordered_set(initializer_list<value_type> il, size_type n, const allocator_type& a); // C++14
unordered_set(initializer_list<value_type> il, size_type n,		unordered_set(initializer_list<value_type> il, size_type n,
const hasher& hf, const allocator_type& a); // C++14		const hasher& hf, const allocator_type& a); // C++14
~unordered_set();		~unordered_set();
unordered_set& operator=(const unordered_set&);		unordered_set& operator=(const unordered_set&);
unordered_set& operator=(unordered_set&&)		unordered_set& operator=(unordered_set&&)
noexcept(		noexcept(
▲ Show 20 Lines • Show All 150 Lines • ▼ Show 20 Lines	public:
unordered_multiset(size_type n, const allocator_type& a); // C++14		unordered_multiset(size_type n, const allocator_type& a); // C++14
unordered_multiset(size_type n, const hasher& hf, const allocator_type& a); // C++14		unordered_multiset(size_type n, const hasher& hf, const allocator_type& a); // C++14
template <class InputIterator>		template <class InputIterator>
unordered_multiset(InputIterator f, InputIterator l, size_type n, const allocator_type& a); // C++14		unordered_multiset(InputIterator f, InputIterator l, size_type n, const allocator_type& a); // C++14
template <class InputIterator>		template <class InputIterator>
unordered_multiset(InputIterator f, InputIterator l, size_type n,		unordered_multiset(InputIterator f, InputIterator l, size_type n,
const hasher& hf, const allocator_type& a); // C++14		const hasher& hf, const allocator_type& a); // C++14
unordered_multiset(initializer_list<value_type> il, size_type n, const allocator_type& a); // C++14		unordered_multiset(initializer_list<value_type> il, size_type n, const allocator_type& a); // C++14
unordered_multiset(initializer_list<value_type> il, size_type n,		unordered_multiset(initializer_list<value_type> il, size_type n,
const hasher& hf, const allocator_type& a); // C++14		const hasher& hf, const allocator_type& a); // C++14
~unordered_multiset();		~unordered_multiset();
unordered_multiset& operator=(const unordered_multiset&);		unordered_multiset& operator=(const unordered_multiset&);
unordered_multiset& operator=(unordered_multiset&&)		unordered_multiset& operator=(unordered_multiset&&)
noexcept(		noexcept(
allocator_type::propagate_on_container_move_assignment::value &&		allocator_type::propagate_on_container_move_assignment::value &&
is_nothrow_move_assignable<allocator_type>::value &&		is_nothrow_move_assignable<allocator_type>::value &&
is_nothrow_move_assignable<hasher>::value &&		is_nothrow_move_assignable<hasher>::value &&
▲ Show 20 Lines • Show All 191 Lines • ▼ Show 20 Lines	template <class _InputIterator>
const key_equal& __eql = key_equal());		const key_equal& __eql = key_equal());
template <class _InputIterator>		template <class _InputIterator>
unordered_set(_InputIterator __first, _InputIterator __last,		unordered_set(_InputIterator __first, _InputIterator __last,
size_type __n, const hasher& __hf, const key_equal& __eql,		size_type __n, const hasher& __hf, const key_equal& __eql,
const allocator_type& __a);		const allocator_type& __a);
#if _LIBCPP_STD_VER > 11		#if _LIBCPP_STD_VER > 11
template <class _InputIterator>		template <class _InputIterator>
inline _LIBCPP_INLINE_VISIBILITY		inline _LIBCPP_INLINE_VISIBILITY
unordered_set(_InputIterator __first, _InputIterator __last,		unordered_set(_InputIterator __first, _InputIterator __last,
size_type __n, const allocator_type& __a)		size_type __n, const allocator_type& __a)
: unordered_set(__first, __last, __n, hasher(), key_equal(), __a) {}		: unordered_set(__first, __last, __n, hasher(), key_equal(), __a) {}
template <class _InputIterator>		template <class _InputIterator>
unordered_set(_InputIterator __first, _InputIterator __last,		unordered_set(_InputIterator __first, _InputIterator __last,
size_type __n, const hasher& __hf, const allocator_type& __a)		size_type __n, const hasher& __hf, const allocator_type& __a)
: unordered_set(__first, __last, __n, __hf, key_equal(), __a) {}		: unordered_set(__first, __last, __n, __hf, key_equal(), __a) {}
#endif		#endif
_LIBCPP_INLINE_VISIBILITY		_LIBCPP_INLINE_VISIBILITY
explicit unordered_set(const allocator_type& __a);		explicit unordered_set(const allocator_type& __a);
unordered_set(const unordered_set& __u);		unordered_set(const unordered_set& __u);
unordered_set(const unordered_set& __u, const allocator_type& __a);		unordered_set(const unordered_set& __u, const allocator_type& __a);
#ifndef _LIBCPP_CXX03_LANG		#ifndef _LIBCPP_CXX03_LANG
Show All 9 Lines	unordered_set(initializer_list<value_type> __il, size_type __n,
const hasher& __hf, const key_equal& __eql,		const hasher& __hf, const key_equal& __eql,
const allocator_type& __a);		const allocator_type& __a);
#if _LIBCPP_STD_VER > 11		#if _LIBCPP_STD_VER > 11
inline _LIBCPP_INLINE_VISIBILITY		inline _LIBCPP_INLINE_VISIBILITY
unordered_set(initializer_list<value_type> __il, size_type __n,		unordered_set(initializer_list<value_type> __il, size_type __n,
const allocator_type& __a)		const allocator_type& __a)
: unordered_set(__il, __n, hasher(), key_equal(), __a) {}		: unordered_set(__il, __n, hasher(), key_equal(), __a) {}
inline _LIBCPP_INLINE_VISIBILITY		inline _LIBCPP_INLINE_VISIBILITY
unordered_set(initializer_list<value_type> __il, size_type __n,		unordered_set(initializer_list<value_type> __il, size_type __n,
const hasher& __hf, const allocator_type& __a)		const hasher& __hf, const allocator_type& __a)
: unordered_set(__il, __n, __hf, key_equal(), __a) {}		: unordered_set(__il, __n, __hf, key_equal(), __a) {}
#endif		#endif
#endif // _LIBCPP_CXX03_LANG		#endif // _LIBCPP_CXX03_LANG
_LIBCPP_INLINE_VISIBILITY		_LIBCPP_INLINE_VISIBILITY
~unordered_set() {		~unordered_set() {
static_assert(sizeof(__diagnose_unordered_container_requirements<_Value, _Hash, _Pred>(0)), "");		static_assert(sizeof(__diagnose_unordered_container_requirements<_Value, _Hash, _Pred>(0)), "");
}		}
▲ Show 20 Lines • Show All 460 Lines • ▼ Show 20 Lines

template <class _Value, class _Hash, class _Pred, class _Alloc>		template <class _Value, class _Hash, class _Pred, class _Alloc>
bool		bool
operator==(const unordered_set<_Value, _Hash, _Pred, _Alloc>& __x,		operator==(const unordered_set<_Value, _Hash, _Pred, _Alloc>& __x,
const unordered_set<_Value, _Hash, _Pred, _Alloc>& __y)		const unordered_set<_Value, _Hash, _Pred, _Alloc>& __y)
{		{
if (__x.size() != __y.size())		if (__x.size() != __y.size())
return false;		return false;
typedef typename unordered_set<_Value, _Hash, _Pred, _Alloc>::const_iterator		typedef
		typename unordered_set<_Value, _Hash, _Pred, _Alloc>::const_iterator
const_iterator;		const_iterator;
for (const_iterator __i = __x.begin(), __ex = __x.end(), __ey = __y.end();		for (const_iterator __i = __x.begin(), __ex = __x.end(); __i != __ex; ++__i)
		mclow.lists: Same comment as in `unordered_map`.
__i != __ex; ++__i)
{		{
const_iterator __j = __y.find(*__i);		if (__y.find(*__i) == __y.end())
if (__j == __ey || !(*__i == *__j))
return false;		return false;
}		}
return true;		return true;
}		}

template <class _Value, class _Hash, class _Pred, class _Alloc>		template <class _Value, class _Hash, class _Pred, class _Alloc>
inline _LIBCPP_INLINE_VISIBILITY		inline _LIBCPP_INLINE_VISIBILITY
bool		bool
▲ Show 20 Lines • Show All 72 Lines • ▼ Show 20 Lines	template <class _InputIterator>
const key_equal& __eql = key_equal());		const key_equal& __eql = key_equal());
template <class _InputIterator>		template <class _InputIterator>
unordered_multiset(_InputIterator __first, _InputIterator __last,		unordered_multiset(_InputIterator __first, _InputIterator __last,
size_type __n , const hasher& __hf,		size_type __n , const hasher& __hf,
const key_equal& __eql, const allocator_type& __a);		const key_equal& __eql, const allocator_type& __a);
#if _LIBCPP_STD_VER > 11		#if _LIBCPP_STD_VER > 11
template <class _InputIterator>		template <class _InputIterator>
inline _LIBCPP_INLINE_VISIBILITY		inline _LIBCPP_INLINE_VISIBILITY
unordered_multiset(_InputIterator __first, _InputIterator __last,		unordered_multiset(_InputIterator __first, _InputIterator __last,
size_type __n, const allocator_type& __a)		size_type __n, const allocator_type& __a)
: unordered_multiset(__first, __last, __n, hasher(), key_equal(), __a) {}		: unordered_multiset(__first, __last, __n, hasher(), key_equal(), __a) {}
template <class _InputIterator>		template <class _InputIterator>
inline _LIBCPP_INLINE_VISIBILITY		inline _LIBCPP_INLINE_VISIBILITY
unordered_multiset(_InputIterator __first, _InputIterator __last,		unordered_multiset(_InputIterator __first, _InputIterator __last,
size_type __n, const hasher& __hf, const allocator_type& __a)		size_type __n, const hasher& __hf, const allocator_type& __a)
: unordered_multiset(__first, __last, __n, __hf, key_equal(), __a) {}		: unordered_multiset(__first, __last, __n, __hf, key_equal(), __a) {}
#endif		#endif
▲ Show 20 Lines • Show All 461 Lines • ▼ Show 20 Lines

template <class _Value, class _Hash, class _Pred, class _Alloc>		template <class _Value, class _Hash, class _Pred, class _Alloc>
bool		bool
operator==(const unordered_multiset<_Value, _Hash, _Pred, _Alloc>& __x,		operator==(const unordered_multiset<_Value, _Hash, _Pred, _Alloc>& __x,
const unordered_multiset<_Value, _Hash, _Pred, _Alloc>& __y)		const unordered_multiset<_Value, _Hash, _Pred, _Alloc>& __y)
{		{
if (__x.size() != __y.size())		if (__x.size() != __y.size())
return false;		return false;
typedef typename unordered_multiset<_Value, _Hash, _Pred, _Alloc>::const_iterator		auto __first1 = __x.begin();
		auto __last1 = __x.end();
		auto __first2 = __y.begin();
		ldionne: That's basically equivalent to:
		    bool res = std::equal(__first1, __last1, __first2);
		    if (res) return true;
		    // rest of the code after __not_done label
		Or did I miss something? If they're equivalent, I suggest we remove the raw loop.
		zoecarver: No, that would not be the same. This function does not check if the sets are the same; it removes any parts of the sets that are the same. Importantly, if only three elements are the same, it will "chop off" those elements and continue from the third element below. Does that make sense?
		for (; __first1 != __last1; ++__first1, (void) ++__first2)
		if (!(*__first1 == *__first2))
		goto __not_done;
		return true;
		__not_done:
		typedef
		typename _VSTD::unordered_multiset<_Value, _Hash, _Pred, _Alloc>::const_iterator
const_iterator;		const_iterator;
typedef pair<const_iterator, const_iterator> _EqRng;		typedef _VSTD::pair<const_iterator, const_iterator> _EqRng;
for (const_iterator __i = __x.begin(), __ex = __x.end(); __i != __ex;)		for (const_iterator __i = __first1; __i != __x.end();)
		zoecarver: Just realizing now that this should be `__first1`. That might make it faster :)
{		{
_EqRng __xeq = __x.equal_range(*__i);		_EqRng __xeq = __x.equal_range(*__i);
_EqRng __yeq = __y.equal_range(*__i);		_EqRng __yeq = __y.equal_range(*__i);
if (_VSTD::distance(__xeq.first, __xeq.second) !=		if (_VSTD::distance(__xeq.first, __xeq.second) !=
_VSTD::distance(__yeq.first, __yeq.second) ||		_VSTD::distance(__yeq.first, __yeq.second))
!_VSTD::is_permutation(__xeq.first, __xeq.second, __yeq.first))
return false;		return false;
__i = __xeq.second;		__i = __xeq.second;
}		}
return true;		return true;
}		}

template <class _Value, class _Hash, class _Pred, class _Alloc>		template <class _Value, class _Hash, class _Pred, class _Alloc>
inline _LIBCPP_INLINE_VISIBILITY		inline _LIBCPP_INLINE_VISIBILITY
Show All 10 Lines
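
The exchange about `std::equal` versus the raw loop turns on one difference: `std::equal` only answers yes or no, while the raw loop also records where the two ranges first differ so the slower comparison can resume from there. One way to express that without a hand-written loop is `std::mismatch`, sketched below. The names `common_prefix_length` and `multisets_equal` are hypothetical and are not part of the patch, and the `is_permutation` check from the old code is retained.

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <unordered_set>
#include <utility>
#include <vector>

// Hypothetical helper: number of leading positions at which two ranges hold
// equal elements. The second range must be at least as long as the first.
template <class It1, class It2>
std::size_t common_prefix_length(It1 first1, It1 last1, It2 first2) {
    std::pair<It1, It2> p = std::mismatch(first1, last1, first2);
    return static_cast<std::size_t>(std::distance(first1, p.first));
}

// Hypothetical helper, not libc++ code: multiset equality that skips the
// common prefix and then compares the rest group by group.
template <class Set>
bool multisets_equal(const Set& x, const Set& y) {
    if (x.size() != y.size())
        return false;
    typedef typename Set::const_iterator const_iterator;
    typedef std::pair<const_iterator, const_iterator> EqRng;

    // Unlike std::equal, std::mismatch reports where the ranges diverge.
    EqRng diverge = std::mismatch(x.begin(), x.end(), y.begin());

    for (const_iterator i = diverge.first; i != x.end();) {
        EqRng xeq = x.equal_range(*i);
        EqRng yeq = y.equal_range(*i);
        if (std::distance(xeq.first, xeq.second) != std::distance(yeq.first, yeq.second) ||
            !std::is_permutation(xeq.first, xeq.second, yeq.first)) {
            return false;
        }
        i = xeq.second;
    }
    return true;
}
```

For two sequences `{1, 2, 3, 4}` and `{1, 2, 9, 4}`, `std::equal` reports only that they differ, whereas `common_prefix_length` reports that the first two positions match, which is the information the prefix-skipping loop relies on.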

Revision Contents

Diff 199877
