This is an archive of the discontinued LLVM Phabricator instance.

[libc] Use __builtin_clz to find leading 1 in hypot
ClosedPublic

Authored by cratonica on Jan 19 2022, 8:13 AM.

Details

Summary

This is an optimization that using a single CPU instruction on supported
architectures (amd64 and aarch64, but possibly others) to replace what was
previously an iterative look-up-table algorithm.

Originally I suggested using inline assembly for this in
https://reviews.llvm.org/D117584.

Diff Detail

Event Timeline

cratonica created this revision.Jan 19 2022, 8:13 AM
cratonica requested review of this revision.Jan 19 2022, 8:13 AM
lntue added inline comments.Jan 19 2022, 8:45 AM
libc/src/__support/FPUtil/Hypot.h
26

builtin_clzl is just extended to either builtin_clz or builtin_clzll depending on the size of unsigned long. It might be better to keep uint32_t and uint64_t specializations and directly use __builtin_clz for uint32_t, builtin_clzll for uint64_t, together with static_assert(sizeof(unsigned int) == 4) and static_assert(sizeof(unsigned long long) == 8) instead?

sivachandra added inline comments.Jan 19 2022, 9:08 AM
libc/src/__support/FPUtil/Hypot.h
26

Considering that uint32_t and uint64_t are just type sugar and not actual types, the existing code seems to me like the best approach and keeps it really straighforward with no assumptions of any sizes.

35

Use a pattern like sizeof(unsigned int) - 1 instead of hard coding here and below. For, in the unsigned long case, 63 - <...>can be more than the width of unsigned long.

lntue added inline comments.Jan 19 2022, 9:17 AM
libc/src/__support/FPUtil/Hypot.h
26

If using this pattern then the output type for unsigned int should be kept as unsigned int, and I agree that we should use sizeof(...)*8 - 1 instead of hard-coded values.

cratonica updated this revision to Diff 401283.Jan 19 2022, 9:22 AM

Use sizeof instead of assuming width of types

cratonica marked 4 inline comments as done.Jan 19 2022, 9:26 AM

Thanks, PTAL

libc/src/__support/FPUtil/Hypot.h
26

Yes, sorry for that oversight. I originally used the uintxx_t values until I pivoted to using the non-aliased types, but forgot to drop the assumptions about bit width after that change.

35

Note that this assumes CHAR_BIT is 8, but I can't #include <climits> according to the lint rules.

cratonica updated this revision to Diff 401285.Jan 19 2022, 9:27 AM
cratonica marked 2 inline comments as done.

Fix bitwidth assumption on long long as well

sivachandra accepted this revision.Jan 19 2022, 9:49 AM
sivachandra added inline comments.
libc/src/__support/FPUtil/Hypot.h
31–32

Nit: Use unsigned int here?

35

Assuming that a byte is 8 bits wide is going to be almost always right on real world systems :)

This revision is now accepted and ready to land.Jan 19 2022, 9:49 AM
cratonica updated this revision to Diff 401295.Jan 19 2022, 9:52 AM

s/uint32_t/unsigned int/ on return type for find_leading_one

cratonica marked an inline comment as done.Jan 19 2022, 9:53 AM
lntue accepted this revision.Jan 19 2022, 10:31 AM

Thanks for the review. Since I don't yet have repo access, could someone please commit this on my behalf per https://llvm.org/docs/Phabricator.html#committing-someone-s-change-from-phabricator ?

Thanks.

lntue added a comment.Jan 20 2022, 9:58 AM

Thanks for the review. Since I don't yet have repo access, could someone please commit this on my behalf per https://llvm.org/docs/Phabricator.html#committing-someone-s-change-from-phabricator ?

Thanks.

You can check with @sivachandra about the process to obtain access to llvm repo.

I will submit this for you shortly.

This revision was automatically updated to reflect the committed changes.