Writing support for three ACLE functions:
unsigned int __cls(uint32_t x) unsigned int __clsl(unsigned long x) unsigned int __clsll(uint64_t x)
CLS stands for "Count number of leading sign bits".
In AArch64, these two intrinsics can be translated into the 'cls'
instruction directly. In AArch32, on the other hand, this functionality
is achieved by implementing it in terms of clz (count number of leading
zeros).
I don't see a pattern match for the cls64 on ARM32, would that not fail to lower?