We already do this for 16/32/64 as well as 8-bit add with register/immediate. Might as well do it for 8-bit INC/DEC too.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Are we going to end up with issues with cpus with slow/fast increment vs slow/fast lea?
INC is probably never faster than add. Just smaller. And we already convert adds to lea. And we already do this for other sizes of INC. So it shouldn’t create a new issue from that perspective.
Side note/follow-up from D58412 - in a local experiment, I removed 'FeatureSlowIncDec' from the default/generic x86-64 CPU model, and saw no regression test diffs. So I assume we're already converting to LEA in most cases already. As seen in some of the diffs here, that's probably for the best since it allows getting rid of an explicit mov register instruction?
There aren’t many tests that use x86-64 as a cpu are there? It’s not the default cpu in llc for 64-bit. Just in clang
Oh, I misunderstood how that gets set then. No, there aren't many tests that are explicitly setting -mcpu=generic. A grep shows 169 RUN lines with that string. So that seems weird...shouldn't we be testing the common case by default?
llc always uses "generic" as the cpu by default. With some special code to force "sse2" on 64-bit. Clang makes a platform specific decision. "pentium4" and "x86-64" are the 32-bit and 64-bit defaults for linux. Macs default to "yonah" and "core2" for 32-bit and 64-bit respectively. Less sure about other platforms. These are intended to enable mininum ISAs, but are also affecting our tuning decisions.
We should maybe look into making more tuning cpus for clang to use. Or we should support -mtune. So we can have separate default ISAs and tuning flags.
LGTM - the generic CPU concern is independent of this change, and as noted this is just making the logic more uniform. If there's some case where 'inc' is better, that should likely apply to all legal scalar types.