Agner's tables indicate that for SSE42+ targets (Core2 and later) we can reduce the FADD/FSUB/FMUL costs down to 1, which should fix the Himeno benchmark.
Note: the AVX512 FDIV costs look rather dodgy, but this isn't part of this patch.
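As a rough illustration of the kind of change the summary describes, the sketch below models a feature-gated cost table in plain C++. It is not the LLVM API. Every name in it (`Op`, `Ty`, `CostEntry`, `kSSE42Table`, `kBaselineCost`, `arithCost`) is hypothetical. The baseline cost of 2 is a placeholder assumption, not a value taken from the patch. Only the cost of 1 for FADD/FSUB/FMUL on SSE42+ targets comes from the summary.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for opcode and type identifiers (not LLVM's ISD/MVT).
enum class Op { FAdd, FSub, FMul, FDiv };
enum class Ty { F32, F64, V4F32, V2F64 };

struct CostEntry {
  Op op;
  Ty ty;
  unsigned cost;
};

// Entries for SSE42+ targets: FADD/FSUB/FMUL cost 1, as stated in the summary.
// FDIV is deliberately absent, so it falls through to the baseline.
static const CostEntry kSSE42Table[] = {
    {Op::FAdd, Ty::F32, 1},   {Op::FAdd, Ty::F64, 1},
    {Op::FAdd, Ty::V4F32, 1}, {Op::FAdd, Ty::V2F64, 1},
    {Op::FSub, Ty::F32, 1},   {Op::FSub, Ty::F64, 1},
    {Op::FSub, Ty::V4F32, 1}, {Op::FSub, Ty::V2F64, 1},
    {Op::FMul, Ty::F32, 1},   {Op::FMul, Ty::F64, 1},
    {Op::FMul, Ty::V4F32, 1}, {Op::FMul, Ty::V2F64, 1},
};

// Placeholder assumption: cost used when no feature-specific entry matches.
static const unsigned kBaselineCost = 2;

// Linear search of a cost table; returns nullptr when there is no match.
template <std::size_t N>
const CostEntry *lookup(const CostEntry (&table)[N], Op op, Ty ty) {
  for (std::size_t i = 0; i < N; ++i)
    if (table[i].op == op && table[i].ty == ty)
      return &table[i];
  return nullptr;
}

// Consult the newer-target table first, then fall back to the baseline.
unsigned arithCost(bool hasSSE42, Op op, Ty ty) {
  if (hasSSE42)
    if (const CostEntry *e = lookup(kSSE42Table, op, ty))
      return e->cost;
  return kBaselineCost;
}
```

Under these assumptions, `arithCost(true, Op::FMul, Ty::V4F32)` yields 1, while the same query with `hasSSE42` set to `false`, or any `Op::FDiv` query, falls back to the placeholder baseline.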
Differential D43733
[X86][SSE] Reduce FADD/FSUB/FMUL costs on later targets (PR36280)
Closed, Public
Authored by RKSimon on Feb 24 2018, 10:03 AM.
Event Timeline
Feb 26 2018, 11:19 AM: This revision is now accepted and ready to land.
Feb 26 2018, 2:13 PM: Closed by commit rL326133: [X86][SSE] Reduce FADD/FSUB/FMUL costs on later targets (PR36280) (authored by RKSimon). This revision was automatically updated to reflect the committed changes.
Revision Contents
Diff 135800
lib/Target/X86/X86TargetTransformInfo.cpp
test/Analysis/CostModel/X86/arith-fp.ll
test/Analysis/CostModel/X86/intrinsic-cost.ll
test/Transforms/SLPVectorizer/X86/PR36280.ll
test/Transforms/SLPVectorizer/X86/cse.ll
test/Transforms/SLPVectorizer/X86/horizontal.ll
test/Transforms/SLPVectorizer/X86/reorder_phi.ll
test/Transforms/SLPVectorizer/X86/simplebb.ll