This change raises the cost of an insert/extract element
operation to at least 2 in cases where the operation would
result in mixing NEON and VFP code.
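
The clamping described above can be sketched as follows. This is only an illustration, not the code from the patch: the function name `clampInsertExtractCost` and the `mixesNeonAndVfp` flag are hypothetical, and how the real cost model detects NEON/VFP mixing is not shown here.

```cpp
#include <algorithm>

// Hypothetical helper (not from the patch): returns the cost to report for
// an insert/extract element operation.
//   baseCost        - cost computed by the generic cost model (assumed input)
//   mixesNeonAndVfp - whether the operation would mix NEON and VFP code
//                     (assumed to be determined elsewhere)
unsigned clampInsertExtractCost(unsigned baseCost, bool mixesNeonAndVfp) {
  // Penalize mixing by never reporting a cost below 2.
  if (mixesNeonAndVfp)
    return std::max(baseCost, 2u);
  // Otherwise leave the generic cost untouched.
  return baseCost;
}
```

Under these assumptions, a base cost of 1 becomes 2 when mixing would occur, while costs that are already 2 or higher, and all non-mixing cases, are left unchanged.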
Event Timeline
Comment Actions
Tested with lnt/spec2000 on Cortex-A57 and Cortex-A53. I've seen only improvements (and some noise):
Cortex-A57:
lnt.MultiSource/Benchmarks/TSVC/LoopRerolling-flt/LoopRerolling-flt -18.69%
lnt.MultiSource/Benchmarks/Bullet/bullet -1.59%
Cortex-A53:
spec.cpu2000.ref.300_twolf -2.40% -- This could be noise?
Comment Actions
LGTM. Thanks!
lib/Target/ARM/ARMTargetTransformInfo.cpp, line 266:
Excellent! I meant to look at that 2.5 years ago! :)