This is an archive of the discontinued LLVM Phabricator instance.

[SystemZ] Avoid inserting same value after replication
Closed, Public

Authored by jonpa on Nov 9 2018, 7:30 AM.

Details

Reviewers
uweigand
Summary

A minor improvement of buildVector() that skips creating an INSERT_VECTOR_ELT for a Value which has already been used for the REPLICATE.
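To illustrate the idea, here is a minimal standalone sketch, not the actual buildVector() code. It assumes elements can be modeled as plain integers (a hypothetical stand-in for SDValue) and that the replicated value is passed in explicitly; the name lanesNeedingInsert is invented for this example. It returns the lanes that would still need an element insert once the replicated value already fills every lane.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an SDValue: two elements are "the same value"
// when their integer ids compare equal.
using Elem = int;

// Returns the lane indices that still need an explicit element insert
// after `Replicated` has been splatted into every lane of the vector.
// Lanes that already hold the replicated value are skipped.
std::vector<std::size_t> lanesNeedingInsert(const std::vector<Elem> &Elems,
                                            Elem Replicated) {
  assert(!Elems.empty() && "expected at least one vector element");
  std::vector<std::size_t> Lanes;
  for (std::size_t I = 0; I < Elems.size(); ++I)
    if (Elems[I] != Replicated) // Lane differs, so an insert is required.
      Lanes.push_back(I);
  return Lanes;
}
```

For a vector { A, B, A, B } with A replicated, only lanes 1 and 3 need an insert under this model; inserting into lanes 0 and 2 would be redundant.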

Only two files seem to benefit from this on SPEC. The columns are opcode, count before, count after, and difference:

update.s
vlvgf          :                   27                   18       -9
l              :                  166                  163       -3
vrepf          :                   11                    8       -3
vlvgp          :                    9                    6       -3
vlrepf         :                    0                    3       +3
lgr            :                  186                  184       -2
Spill|Reload   :                  478                  478       +0

pullinit.s
vlvgf          :                   93                   63      -30
Spill|Reload   :                   83                   83       +0

Event Timeline

jonpa created this revision. Nov 9 2018, 7:30 AM
uweigand accepted this revision. Nov 9 2018, 7:37 AM

LGTM, thanks!

As a future enhancement, we might prefer to choose the initial value to replicate such that the number of VLVGx instructions is minimized. E.g. if we load two integers LA and LB from memory and construct the vector { LA, LB, LB, LB }, the current code, even with your change, would load and replicate LA, and then use three VLVGx to insert the LB copies. We could save two instructions by instead using load-and-replicate for LB.
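The suggested enhancement could be sketched as follows. This is a hedged standalone model, not a patch against the SystemZ backend: elements are plain integers standing in for SDValues, the helper names pickValueToReplicate and insertsAfterReplicating are hypothetical, and it assumes every candidate value is equally cheap to replicate, which the real lowering would have to check.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-in for an SDValue, compared by integer id.
using Elem = int;

// Picks the value that occurs most often in the vector, so that
// replicating it leaves the fewest lanes to fix up afterwards.
// Ties go to the value that appears first.
Elem pickValueToReplicate(const std::vector<Elem> &Elems) {
  assert(!Elems.empty() && "expected at least one vector element");
  std::map<Elem, std::size_t> Count;
  for (Elem E : Elems)
    ++Count[E];
  Elem Best = Elems[0];
  for (Elem E : Elems)
    if (Count[E] > Count[Best])
      Best = E;
  return Best;
}

// Number of element inserts still needed once `Replicated` fills all lanes.
std::size_t insertsAfterReplicating(const std::vector<Elem> &Elems,
                                    Elem Replicated) {
  std::size_t N = 0;
  for (Elem E : Elems)
    if (E != Replicated)
      ++N;
  return N;
}
```

With { LA, LB, LB, LB }, replicating LA leaves three inserts while replicating LB leaves one, which matches the two-instruction saving described above.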

This revision is now accepted and ready to land. Nov 9 2018, 7:37 AM
jonpa closed this revision. Nov 9 2018, 7:54 AM

r346504

> As a future enhancement, we might prefer to choose the initial value to replicate such that the number of VLVGx instructions is minimized. E.g. if we load two integers LA and LB from memory and construct the vector { LA, LB, LB, LB }, the current code, even with your change, would load and replicate LA, and then use three VLVGx to insert the LB copies. We could save two instructions by instead using load-and-replicate for LB.

I'll give it a try...