This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Turn splat shuffles of vector loads into strided load with stride of x0.
ClosedPublic

Authored by craig.topper on Apr 19 2021, 10:11 PM.

Details

Summary

Implementations are allowed to optimize an x0 stride to perform
fewer memory accesses. This is the case in SiFive cores.

I don't know whether this is the case in other implementations. We might
need a tuning flag for this.
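
For readers unfamiliar with the transform, here is a rough sketch of why the rewrite preserves semantics. It is a toy Python model, not the patch itself: the helper names (`vector_load`, `splat_shuffle`, `strided_load`) are hypothetical, and memory is indexed by element, whereas a real RVV strided load such as `vlse32.v v8, (a0), zero` takes a byte stride in a scalar register.

```python
# Toy model of the transform. Assumptions: memory is a Python list indexed
# by element (not by byte), and all helper names are made up for illustration.

def vector_load(mem, base, vl):
    """Unit-stride load of vl consecutive elements starting at base."""
    return [mem[base + i] for i in range(vl)]

def splat_shuffle(vec, lane):
    """Shuffle that broadcasts one lane of vec to every lane."""
    return [vec[lane]] * len(vec)

def strided_load(mem, base, stride, vl):
    """Strided load: element i comes from mem[base + i * stride]."""
    return [mem[base + i * stride] for i in range(vl)]

mem = [10, 20, 30, 40, 50, 60, 70, 80]
base, vl, lane = 2, 4, 1

# Before: load a full vector, then splat one of its lanes.
before = splat_shuffle(vector_load(mem, base, vl), lane)

# After: a single strided load from the splatted element's address with
# stride 0, which is what a stride register of x0 expresses.
after = strided_load(mem, base + lane, 0, vl)
```

Both forms yield the same vector. Whether the stride-0 form actually touches memory fewer times depends on the hardware, as noted above.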

Event Timeline

craig.topper created this revision. Apr 19 2021, 10:11 PM
craig.topper requested review of this revision. Apr 19 2021, 10:11 PM
Herald added a project: Restricted Project. Apr 19 2021, 10:11 PM
Herald added a subscriber: MaskRay.
arcbbb accepted this revision. Apr 22 2021, 2:35 AM

LGTM

This revision is now accepted and ready to land. Apr 22 2021, 2:35 AM
frasercrmck accepted this revision. Apr 22 2021, 2:42 AM

LGTM too. I don't have an implementation at hand, so can't really add anything to the discussion about performance/tuning.