Following the discussion on D22038, this change enables the setcc to srl(ctlz) transformation on the btver2 architecture.
This optimisation is beneficial only on the Jaguar architecture, where lzcnt has good reciprocal throughput.
Other architectures such as Intel's Haswell/Broadwell or AMD's Bulldozer/PileDriver do not benefit from it.
For this reason the change also adds a "HasFastLZCNT" feature, which is enabled for Jaguar.
This patch requires D23445.
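For readers unfamiliar with the setcc to srl(ctlz) fold mentioned in the summary, here is a minimal sketch of the underlying identity. It assumes a 32-bit operand and an equality comparison against zero. The helper names (ctlz32, isZeroSetcc, isZeroSrlCtlz, checkEquivalence) are hypothetical and do not come from the patch, and the loop is only a portable stand-in for the lzcnt instruction.

```cpp
#include <cassert>
#include <cstdint>

// Portable count-leading-zeros that returns 32 for an input of 0.
// This mirrors LZCNT, which is defined for a zero input.
static unsigned ctlz32(uint32_t X) {
  unsigned N = 0;
  for (uint32_t Mask = 0x80000000u; Mask != 0 && (X & Mask) == 0; Mask >>= 1)
    ++N;
  return N;
}

// Original form: compare against zero (setcc eq) and widen to an integer.
static uint32_t isZeroSetcc(uint32_t X) { return X == 0 ? 1u : 0u; }

// Transformed form: srl(ctlz(X), 5). The count is 32 only when X == 0,
// and 32 is the only possible count with bit 5 set, so the shift yields
// 1 for zero and 0 for everything else.
static uint32_t isZeroSrlCtlz(uint32_t X) { return ctlz32(X) >> 5; }

// Spot-check that both forms agree on a few boundary values.
static void checkEquivalence() {
  const uint32_t Samples[] = {0u, 1u, 2u, 0x8000u, 0x80000000u, 0xFFFFFFFFu};
  for (uint32_t X : Samples)
    assert(isZeroSetcc(X) == isZeroSrlCtlz(X));
}
```

Whether the second form is actually a win depends on how cheap lzcnt is on the target, which is presumably why the fold is gated on a subtarget feature rather than enabled everywhere.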
I would move this down with the other 'fake' features (i.e., the other fast/slow attributes). Someday, we may come up with a better way to distinguish performance "features" from architectural ones.
It would also be good to explain exactly what we mean by "fast" in this context.
Finally, use a hyphen to make this more readable: "fast-lzcnt".