This is an archive of the discontinued LLVM Phabricator instance.

[ARM] Don't pessimize i32 vselect.
ClosedPublic

Authored by chatur01 on Nov 17 2015, 1:10 AM.

Details

Summary

The underlying issues surrounding codegen for 32-bit vselects have been resolved. The pessimistic costs for 64-bit vselects remain due to the bad
scalarization that is still happening there.

I tested this on A57 in T32, A32 and A64 modes. I saw no regressions, and some improvements.

From my benchmarks, I saw these improvements in A57 (T32)
spec.cpu2000.ref.177_mesa 5.95%
lnt.SingleSource/Benchmarks/Shootout/strcat 12.93%
lnt.MultiSource/Benchmarks/MiBench/telecomm-CRC32/telecomm-CRC32 11.89%

I also measured A57 A32 and A9 T32 and found no performance regressions. I see much bigger wins in third-party benchmarks with this change.

Diff Detail

Repository
rL LLVM

Event Timeline

chatur01 retitled this revision from to [ARM] Don't pessimize i32 vselect..
chatur01 updated this object.
chatur01 set the repository for this revision to rL LLVM.
chatur01 added a subscriber: llvm-commits.
rengolin accepted this revision.Nov 17 2015, 1:27 AM
rengolin added a reviewer: rengolin.

Hi Charlie,

The original commit mentions Cortex-A8's 25% improvement performance in PAQp8. It seems you've only tested in OOO cores, can you just double check on either A53 or A7/A8? It's probably a pattern we now recognise better, but would be good to be on the safe side.

If you find no regressions either, and I'm not expecting any, LGTM. Thanks!

cheers,
--renato

This revision is now accepted and ready to land.Nov 17 2015, 1:27 AM

Thanks Renato. I double-checked A53 T32 and found no regressions in SPEC 2000, 2006 or LNT.

Adobe-C++/loop_unroll was up 6.49%
ASC_Sequoia/AMGmk/AMGmk was up 5.04%

Several other improvements in LNT.

This revision was automatically updated to reflect the committed changes.

I missed some mechanical changes in test/Analysis/CostModel/ARM/select.ll in this review. I committed the changes along with what's in this revision, which I felt were appropriate for post-commit review if necessary. Sorry about that!