This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Improve code generation of vector build
ClosedPublic

Authored by evandro on Dec 21 2017, 2:13 PM.

Details

Summary

For example, instead of using dup v0.4s, wzr, which transfers a value from the general-purpose register file to the vector register file, use the more efficient movi v0.4s, #0.
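As a rough illustration of the kind of source that produces an all-zero vector build, here is a minimal C sketch. The type vec4u32 and the function zero_splat are hypothetical names chosen for this example and do not come from the patch. Whether a given compiler and optimization level actually materializes the value with dup v0.4s, wzr or movi v0.4s, #0 is an assumption that depends on the toolchain; the sketch only shows the source-level pattern.

```c
#include <stdint.h>

/* Hypothetical 128-bit value made of four 32-bit lanes (not from the patch). */
typedef struct {
    uint32_t lane[4];
} vec4u32;

/* Returns a vector with every lane set to zero.
 * Assumption: on AArch64, a vectorizing compiler may build this value
 * either with "dup v0.4s, wzr" (a general-purpose to vector register
 * transfer) or with "movi v0.4s, #0" (an immediate move that stays in
 * the vector register file). The summary above prefers the latter. */
vec4u32 zero_splat(void) {
    vec4u32 v = { {0u, 0u, 0u, 0u} };
    return v;
}
```

Inspecting the generated assembly for such a function (for instance with the compiler's -S option) is one way to check which of the two instructions a particular build chooses.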

Diff Detail

Repository
rL LLVM

Event Timeline

evandro created this revision. Dec 21 2017, 2:13 PM
fhahn added a subscriber: fhahn. Dec 21 2017, 3:13 PM
fhahn added a reviewer: fhahn. Dec 22 2017, 1:58 AM

This looks good to me for Cortex cores (A57, A72), where movi and dup have the same cost, so this should be a (smallish) improvement there.

sdesmalen accepted this revision. Jan 2 2018, 5:14 AM

LGTM. This indeed seems like a sensible change, and the available cost models confirm it.

This revision is now accepted and ready to land. Jan 2 2018, 5:14 AM
fhahn accepted this revision. Jan 2 2018, 5:17 AM
This comment was removed by evandro.

Ahem, thank you.

This revision was automatically updated to reflect the committed changes.