This is an archive of the discontinued LLVM Phabricator instance.

[X86] Remove FeatureSlowIncDec from Sandy Bridge and later Intel Core CPUs
ClosedPublic

Authored by craig.topper on Feb 19 2019, 3:10 PM.

Details

Summary

Inc and Dec were at one point slow on Intel CPUs due to their tendency to cause partial flag stalls on P6 derived CPU cores. This is because these instructions are defined to preserve the carry flag. This partial flag stall issue persisted until Sandy Bridge when flag merging was changed to be handled as a data dependency instead of as a stall until retirement. Sandy Bridge and later CPUs rename the C flag separately from OSPAZ so there is no flag merge needed on INC/DEC to preserve the C flag.

Given these improvements I don't know why INC/DEC was ever considered slow on Sandy Bridge. If anything they should have been disabled on the earlier CPUs instead.

Note after this patch, INC/DEC are still considered slow on Silvermont, Goldmont, Knights Landing and our generic "x86-64" CPU.

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper created this revision.Feb 19 2019, 3:10 PM

Should we change the generic model too then since most CPUs prefer inc/dec?
I'm guessing that's a lot more test diffs, so no problem to make it another patch, but just want to know if that's the way to think about the generic model.

Should we change the generic model too then since most CPUs prefer inc/dec?
I'm guessing that's a lot more test diffs, so no problem to make it another patch, but just want to know if that's the way to think about the generic model.

Related question to that. Should "pentium4" which is clang's Linux default CPU for 32-bit builds have more tuning flags on it?

Should we change the generic model too then since most CPUs prefer inc/dec?
I'm guessing that's a lot more test diffs, so no problem to make it another patch, but just want to know if that's the way to think about the generic model.

Related question to that. Should "pentium4" which is clang's Linux default CPU for 32-bit builds have more tuning flags on it?

I've never looked at how the 32-bit setting works...or at least I can't remember it now. :)
But that sounds weird. Shouldn't we have a generic 32-bit model similar to the generic 64-bit model?

chandlerc accepted this revision.Feb 19 2019, 6:13 PM

LGTM

I think we can consider tweaking the generic ones separately.

FWIW, I would be in favor of dropping this from x86-64.

Also FWIW, I would be in favor of making i686 (or a better spelling if someone wants to suggest one such as x86-32?) a thing similar to x86-64, and I'd be happy to see them follow along in similar ways of "as good as we can do for modern CPUs but supporting a particular baseline".

This revision is now accepted and ready to land.Feb 19 2019, 6:13 PM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 19 2019, 9:41 PM