Page MenuHomePhabricator

Implement `lcm` and `gcd` from library fundamentals V2
ClosedPublic

Authored by mclow.lists on Jun 14 2016, 2:04 PM.

Details

Summary

These are probably going to end up in C++17 as well.

This is a straight addition; no changes to existing code.

Diff Detail

Event Timeline

mclow.lists retitled this revision from to Implement `lcm` and `gcd` from library fundamentals V2.
mclow.lists updated this object.
mclow.lists added reviewers: EricWF, howard.hinnant.
mclow.lists added a subscriber: cfe-commits.

I created a top-level gcd which did nothing but make everything unsigned and do the abs once, and then called a __gcd specialized for unsigned and only one kind of unsigned, and got measurably faster results (about 10%).

template<class _Tp>
constexpr
std::enable_if_t<is_unsigned<_Tp>{}, _Tp> _LIBCPP_INLINE_VISIBILITY
__gcd(_Tp __m, _Tp __n)
{
	return __n == 0 ? __m : __gcd(__n, __m % __n);
}

template<class _Tp, class _Up>
constexpr
common_type_t<_Tp,_Up> _LIBCPP_INLINE_VISIBILITY
gcd(_Tp __m, _Up __n)
{
	static_assert((is_integral<_Tp>::value && is_integral<_Up>::value), "Arguments to gcd must be integer types");
	using _Rp = common_type_t<_Tp,_Up>;
	using _Wp = make_unsigned_t<_Rp>;
	return static_cast<_Rp>(__gcd(static_cast<_Wp>(__abs<_Tp>()(__m)),
	                              static_cast<_Wp>(__abs<_Up>()(__n))));
}

Here is the test driver I used:

#include <chrono>
#include <iostream>

int
main()
{
    std::uint64_t x = 0;
    auto t0 = std::chrono::steady_clock::now();
    for (auto i = 1'999'990'000; i <= 2'000'000'000; ++i)
    {
        for (auto j = 1'999'990'000; j <= 2'000'000'000; ++j)
        {
            x += gcd(i, j);
        }
    }
    auto t1 = std::chrono::steady_clock::now();
    std::cout << std::chrono::duration<double>(t1-t0).count() << '\n';
    return x;
}

I also tried the traditional iterative solution (as opposed to recursive) and was surprised that it was not faster. I then tried "binary" gcd (https://en.wikipedia.org/wiki/Binary_GCD_algorithm) and was again surprised it wasn't faster. Both of these alternatives were in the ball park though, and it could be that they might measure faster over other domains. The binary variant used __m >>= std::__ctz(__m); instead of a while loop to remove factors of 2 (in two places).

EricWF added inline comments.Jun 14 2016, 4:30 PM
include/experimental/numeric
45

This should be _LIBCPP_BEGIN_NAMESPACE_LFTS.

47

_VSTD:: is redundant here.

62

It might be beneficial to calculate the absolute values during the initial call instead of at every level of recursion, since most of those calls will be redundant.

80

Copy/Paste error

test/std/experimental/numeric/numeric.ops.overview/nothing_to_do.pass.cpp
10

Since it includes the header it should have: // UNSUPPORTED: c++98, c++03, c++11.

test/std/experimental/numeric/numeric.ops/nothing_to_do.pass.cpp
10

Since it includes the header it should have: // UNSUPPORTED: c++98, c++03, c++11.

Can't call std::abs as it isn't constexpr. You'll have to roll your own.

EricWF edited edge metadata.Jun 14 2016, 5:29 PM

Gosh darn it Howard. Your like 10 minutes ahead of me.

Here is a constexpr gcd test.
https://gist.github.com/EricWF/46ea4489f405ec9f83325a278fe99535

include/experimental/numeric
51

Both the call operator and abs are not constexpr.

56

Call operator is not constexpr.

mclow.lists edited edge metadata.

Updated based on comments here and elsewhere.

  • Do the abs once, rather than at every level of recursion
  • Add tests for bool.
  • Constexpr

Also, add an assertion to detect overflow in lcm.

I have not yet incorporated Eric's constexpr tests.

mclow.lists marked 8 inline comments as done.Jun 15 2016, 9:40 AM

Updated with Eric's constexpr tests.
Once this is approved, it will be applied twice - once in <experimental/numeric>, and once in <numeric>

EricWF accepted this revision.Jul 25 2016, 10:46 PM
EricWF edited edge metadata.

LGTM.

This revision is now accepted and ready to land.Jul 25 2016, 10:46 PM
Eugene.Zelenko added a subscriber: Eugene.Zelenko.

Committed in r276750.