The CTLZ intrinsic can use the VCLZ instruction on MVE, which produces better code than expanding the operation.
Repository: rL LLVM
Event Timeline
llvm/test/CodeGen/Thumb2/mve-ctlz.ll | |
---|---|
35 | I believe these intrinsics should take an extra i1 parameter specifying whether an input of 0 is undef or not. As far as I understand, the MVE instructions produce sensible values for an input of 0, so both forms should lower to the same thing. It is worth adding tests for both. We might as well add tests for v2i64 too, just to show that they do something that doesn't look wrong. |
Added -verify-machineinstrs, a test for v2i64, and tests for the 'isundef' parameter.
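For readers unfamiliar with the i1 flag discussed here, the following is a minimal sketch of what a pair of such tests might look like. It is not copied from the patch: the function names are hypothetical, the RUN line and the expected `vclz.i32` output are assumptions based on how MVE codegen tests are usually written, and the real test file may differ.

```llvm
; Illustrative sketch only; the triple, attributes and CHECK lines are assumed.
; RUN: llc -mtriple=thumbv8.1m.main-arm-none-eabi -mattr=+mve -verify-machineinstrs %s -o - | FileCheck %s

; The second operand is the i1 flag: false means a zero input must give a
; defined result (the element bit width, 32 here); true means the result for
; a zero input is undefined.
declare <4 x i32> @llvm.ctlz.v4i32(<4 x i32>, i1)

; Hypothetical test name: zero input is defined.
define arm_aapcs_vfpcc <4 x i32> @ctlz_4i32_zero_defined(<4 x i32> %src) {
; CHECK-LABEL: ctlz_4i32_zero_defined:
; CHECK: vclz.i32 q0, q0
entry:
  %0 = call <4 x i32> @llvm.ctlz.v4i32(<4 x i32> %src, i1 false)
  ret <4 x i32> %0
}

; Hypothetical test name: zero input is undefined.
define arm_aapcs_vfpcc <4 x i32> @ctlz_4i32_zero_undef(<4 x i32> %src) {
; CHECK-LABEL: ctlz_4i32_zero_undef:
; CHECK: vclz.i32 q0, q0
entry:
  %0 = call <4 x i32> @llvm.ctlz.v4i32(<4 x i32> %src, i1 true)
  ret <4 x i32> %0
}
```

If the instruction returns the element width for a zero lane, as the review comment suggests, both variants can select the same instruction, which is why the two functions are expected to produce identical output.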
You can remove the spaces before the ctlz, and before the MVE_VCLZs8 below.