Manually optimized the code for the 8-bit integer bit-reverse builtin: the instruction count is reduced from 14 to 11, with a slight performance gain.
The performance implications for PR31810 need to be settled before this can be taken any further.
This is general legalization code: we should ensure that ISD::ROTL is legal or custom before attempting this path; otherwise the additional shift/mask code will definitely be a regression. With that in place, you can probably drop the VT.isScalarInteger() requirement.
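For readers following the thread, here is a minimal C++ sketch of a rotate-based 8-bit bit reversal of the kind discussed above. It is not the code from the patch: the helper names `rotl8` and `bitreverse8` are hypothetical, and the exact sequence the patch emits may differ. It only illustrates why the rotate matters: the nibble swap is a single rotate when the target supports one, but becomes two shifts and an OR when it has to be expanded.

```cpp
#include <cstdint>

// Hypothetical helper (not from the patch): rotate an 8-bit value left by n.
// On a target without a native rotate this expands to two shifts and an OR,
// which is the extra cost the review comment warns about.
constexpr std::uint8_t rotl8(std::uint8_t x, unsigned n) {
  n &= 7;
  return static_cast<std::uint8_t>((x << n) | (x >> ((8 - n) & 7)));
}

// Hypothetical sketch: reverse the bits of an 8-bit value.
constexpr std::uint8_t bitreverse8(std::uint8_t x) {
  // Swap the two nibbles with one rotate by 4.
  x = rotl8(x, 4);
  // Swap adjacent 2-bit pairs within each nibble.
  x = static_cast<std::uint8_t>(((x & 0x33) << 2) | ((x & 0xCC) >> 2));
  // Swap adjacent single bits within each pair.
  x = static_cast<std::uint8_t>(((x & 0x55) << 1) | ((x & 0xAA) >> 1));
  return x;
}

static_assert(bitreverse8(0x01) == 0x80, "lowest bit moves to highest bit");
static_assert(bitreverse8(0xB4) == 0x2D, "0b10110100 reverses to 0b00101101");
```

Without a legal or custom rotate, the first step costs about as much as the mask/shift step it replaces, so the sequence gains nothing over the plain three-stage swap.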
Please use the update_llc_test_checks.py script to regenerate the test checks.
Performance measurements did not show any benefit for the proposed 8-bit bit-reversal intrinsic implementation over the existing one.