As indicated by the title, this post-register-allocation, pre-rewrite pass
generalizes D21774 by rewriting patterns of the form,
  gr8<def> = instr_defining_gr8  # that may or may not use eflags
  gr32<def> = movzx gr8
into,
  gr32 = mov32r0 eflags<imp-def>  # carefully avoids clobbering eflags
  ...
  gr8<def> = instr_defining_gr8
with the goal of reducing read stalls, partial register stalls, micro-ops, and
overall binary size.
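
For a concrete picture, here is a minimal sketch of source code that typically gives rise to this pattern. The function name `is_less` is hypothetical, and the assembly in the comments is illustrative only (register choices are assumptions, and it is not actual output of this patch). The point it shows is why the zeroing instruction has to be hoisted above the flag-producing instruction: `xor` itself writes eflags.

```cpp
#include <cstdint>

// Hypothetical example function; not taken from the patch or its tests.
// The comparison yields an 8-bit flag (SETcc) that must be widened to 32 bits.
std::uint32_t is_less(std::int32_t a, std::int32_t b) {
    // Without the pass (illustrative x86-64, registers are assumptions):
    //   cmp   edi, esi
    //   setl  al            ; writes only the low byte of eax
    //   movzx eax, al       ; widens, reading the partially written register
    //
    // With the pass (illustrative):
    //   xor   eax, eax      ; MOV32r0; placed before cmp because xor writes eflags
    //   cmp   edi, esi
    //   setl  al            ; upper bits are already zero, so no movzx is needed
    return a < b ? 1u : 0u;
}
```

Both sequences compute the same value; only the placement of the zeroing differs.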
In all but a few rare cases, it performs at least as well as D21774, and in some
cases surprisingly better. IACA-annotated assembly output can be found
at https://reviews.llvm.org/P7213 (for x86-64) and at
https://reviews.llvm.org/P7214 (for x86-32).
Not all of the tests have been updated; this is still a work in progress.
EDIT: grammar.