While looking at D113413 I noticed that __log2i could perhaps be improved to be both slightly smaller and faster.
I don't know how to run benchmarks for this, or whether any exist, but https://godbolt.org/z/48KT3z5Tj looks like an improvement to me.
I am wondering whether __CHAR_BIT__ is preferred over CHAR_BIT (available from the <climits> header). I don't know what the convention is for library code.
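
For reference, here is a minimal sketch of the general shape of the simplification, not the actual patch. The name log2i_sketch is hypothetical, the fixed unsigned long long parameter is an assumption made to keep the example short, and __builtin_clzll is the GCC/Clang builtin. The bit width comes from CHAR_BIT rather than a hard-coded 8.

```cpp
#include <climits>  // CHAR_BIT
#include <cstddef>  // std::size_t

// Hypothetical stand-in for __log2i; this is NOT the libc++ implementation.
// Returns floor(log2(n)) for n > 0, and 0 for n == 0.
constexpr std::size_t log2i_sketch(unsigned long long n) {
    // __builtin_clzll (GCC/Clang builtin) is undefined for 0, so guard it.
    return n == 0
               ? 0
               : sizeof(n) * CHAR_BIT - 1 -
                     static_cast<std::size_t>(__builtin_clzll(n));
}
```

Swapping CHAR_BIT for __CHAR_BIT__ in this sketch would drop the <climits> include, which is the trade-off behind the question above.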