This change enables interleaved access vectorization by default on ARM,
as it has been shown to be beneficial there.
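For context, here is a minimal sketch of the kind of loop this affects. The function name `sum_pairs` and the data layout are hypothetical and not taken from the patch. The loop reads two adjacent elements per iteration at a constant stride of 2, which is the "interleaved access" pattern. With the feature on, the vectorizer can treat both loads as one group, which on ARM would typically map to a de-interleaving load such as NEON's VLD2, rather than scalarizing or gathering them.

```c
#include <stddef.h>

/* Hypothetical example: `in` holds n interleaved pairs (a0, b0, a1, b1, ...).
 * Each iteration loads in[2*i] and in[2*i + 1], a stride-2 access group
 * with two members and no gaps. */
void sum_pairs(const int *in, int *out, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        out[i] = in[2 * i] + in[2 * i + 1];
    }
}
```

Whether a given loop is actually vectorized this way still depends on the cost model and the target, so this is an illustration of the access pattern only, not a claim about the generated code.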
Tested with lnt, spec2000 and some other internal benchmarks.
Performance Regressions - Execution Time
lnt.SingleSource/Benchmarks/Misc/himenobmtxpa 3.03%
lnt.SingleSource/Benchmarks/Shootout-C++/sieve 1.81%
lnt.MultiSource/Benchmarks/McCat/12-IOtest/iotest 1.36%
Performance Improvements - Execution Time
lnt.MultiSource/Benchmarks/PAQ8p/paq8p -18.90%
lnt.SingleSource/Benchmarks/Shootout-C++/EH/except -2.90%
lnt.MultiSource/Applications/ClamAV/clamscan -2.50%
lnt.SingleSource/UnitTests/Vectorizer/gcc-loops -1.59%
lnt.MultiSource/Applications/siod/siod -1.41%
lnt.SingleSource/Benchmarks/CoyoteBench/fftbench -1.33%
lnt.MultiSource/Benchmarks/VersaBench/bmm/bmm -1.19%
I think the paq8p change might be run-to-run performance variation, so overall there is no significant change in lnt/spec2000. I have seen good improvements in other benchmarks, though.