This changes leverages the BSR instruction to find the leading 1 bit, reducing
what was previously an iterative look-up-table approach to a single
instruction on supported CPUs.
Details
- Reviewers
lntue sivachandra
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
libc/src/__support/find_leading_one.h | ||
---|---|---|
26 | Is the BSR instruction available for x86 also? If so then I think LLVM_LIBC_ARCH_X86 should be better. And if that's the case, you can just leave the implementation in the x86_64 folder, I don't think we need separate folders for x86 and x86-64. |
libc/src/__support/find_leading_one.h | ||
---|---|---|
26 | My understanding is that we'd be limited to 32-bit operands in that case and would therefore need to make this conditional upon both the size of the data and the architecture, which might be messier (so, x86 would drop back to the generic implementation for 64-bit numbers, or I suppose I could implement it as two BSR calls if it's possible to test this in 32-bit mode). Happy to do whatever you think is best. |
libc/src/__support/find_leading_one.h | ||
---|---|---|
26 | If that's the case then I'm fine with the current specialization. We can always specialize further if there are requests for non-64-bit x86 arch. |
libc/src/__support/x86_64/find_leading_one.h | ||
---|---|---|
20 | Nit: msbit_pos is probably a more appropriate name? | |
28–31 | It seems to me like one can use __builtin_clz and avoid target assembly? For example: https://godbolt.org/z/eY8n8P118 |
libc/src/__support/x86_64/find_leading_one.h | ||
---|---|---|
28–31 | Nice, that also works for ARM: https://godbolt.org/z/bY9nfEd16 Are we OK using such builtins in this library ? If so, should we get rid of the original find_leading_one() function altogether and just call __builtin_clz ? Or should it still be specialized for each architecture and delegate at that level? |
libc/src/__support/x86_64/find_leading_one.h | ||
---|---|---|
28–31 | Yes, it is OK to use compiler builtins for this use case. We can get rid of the original find_leading_one. |
libc/src/__support/x86_64/find_leading_one.h | ||
---|---|---|
28–31 |
libc/src/__support/x86_64/find_leading_one.h | ||
---|---|---|
28–31 | Normally you could just amend the commit and update the patch review with $ arc diff --update D.... instead of creating a new one, unless you want to keep the old review around. |
libc/src/__support/x86_64/find_leading_one.h | ||
---|---|---|
28–31 | I'd like to hang onto it for a few reasons:
That being said, I don't want to deviate from the norms used by LLVM, so I can amend it here if you'd prefer. |
libc/src/__support/x86_64/find_leading_one.h | ||
---|---|---|
28–31 | It's totally fine to me to have the a separate review patch, especially when you want to refer back to the original one in the future. Just a few comments about how Phabricator works as I know of:
|
Is the BSR instruction available for x86 also? If so then I think LLVM_LIBC_ARCH_X86 should be better. And if that's the case, you can just leave the implementation in the x86_64 folder, I don't think we need separate folders for x86 and x86-64.