This optimization is currently enabled for Intel big cores from Sandy Bridge onward, as well as Atom, Silvermont, and KNL, because 64-bit division is so slow on these cores. AMD cores already do this in hardware (they use 32-bit division based on input operand width), so it's not a win there. But since the majority of x86 CPUs benefit from this optimization, and since the potential upside is significantly greater than the downside, we should enable it for the generic x86-64 target.
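For readers unfamiliar with the bypass, here is a minimal C sketch of the idea, not the actual LLVM transform: check at run time whether both operands fit in 32 bits, and if so use the cheaper 32-bit divide. The function name `udiv64_bypass` is hypothetical, and the sketch assumes the unsigned case only.

```c
#include <stdint.h>

/* Hypothetical illustration of the slow-division bypass (unsigned case).
 * This is a sketch of the idea, not the code LLVM emits. */
uint64_t udiv64_bypass(uint64_t a, uint64_t b) {
    /* If neither operand has a bit set in its upper 32 bits, the
     * quotient also fits in 32 bits, so a 32-bit divide is exact. */
    if (((a | b) >> 32) == 0) {
        return (uint32_t)a / (uint32_t)b; /* fast path: 32-bit divide */
    }
    return a / b; /* slow path: full 64-bit divide */
}
```

The extra compare and branch is the downside on cores where 64-bit division is already fast. The upside is skipping the slow 64-bit divide whenever the operands happen to be small.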
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
I'm not sure what kind of test case to add. The codegen tests for bypassing slow division use -mattr=[+-]idivq-to-divl to control this feature explicitly, rather than assuming it is enabled or disabled for a particular target.
In which case we probably just need to add an additional -mcpu=x86-64 test to bypass-slow-division-tune.ll?