This is an archive of the discontinued LLVM Phabricator instance.

merge consecutive stores of extracted vector elements (PR21711)
Closed, Public

Authored by spatel on Dec 16 2014, 3:24 PM.

Details

Summary

This patch adds a path to DAGCombiner::MergeConsecutiveStores() to combine multiple scalar stores when the store operands are extracted vector elements. This is a partial fix for PR21711 ( http://llvm.org/bugs/show_bug.cgi?id=21711 ).
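The condition described above can be sketched as a small standalone model. This is not the code from the patch: the real combine works on SelectionDAG nodes and may accept more cases than this. The struct `ExtractStore`, the function `mergeablePrefix`, and the rule that lanes must ascend in step with addresses are all assumptions made for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical model of one scalar store whose value operand is an
// extracted vector element. None of these names come from LLVM.
struct ExtractStore {
    int vectorId;        // which source vector the element is extracted from
    unsigned elemIndex;  // lane index used by the extract
    int64_t byteOffset;  // store address as an offset from a common base
};

// Returns the number of leading stores that could be replaced by a single
// wide store under this simplified rule: same source vector, lane index
// increasing by one, and address increasing by exactly one element size.
// A result of 0 or 1 means there is nothing to merge.
std::size_t mergeablePrefix(const std::vector<ExtractStore>& stores,
                            unsigned elemBytes) {
    assert(elemBytes > 0 && "element size must be non-zero");
    if (stores.empty())
        return 0;

    std::size_t n = 1;
    while (n < stores.size()) {
        const ExtractStore& prev = stores[n - 1];
        const ExtractStore& cur = stores[n];
        const bool sameVector = cur.vectorId == prev.vectorId;
        const bool nextLane = cur.elemIndex == prev.elemIndex + 1;
        const bool nextAddr = cur.byteOffset == prev.byteOffset + elemBytes;
        if (!(sameVector && nextLane && nextAddr))
            break;
        ++n;
    }
    return n;
}
```

For eight 4-byte elements of one vector stored at offsets 0 through 28, this model reports that all eight stores form one mergeable run.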

For the new test case, codegen improves from:

vmovss          %xmm0, (%rdi)
vextractps      $1, %xmm0, 4(%rdi)
vextractps      $2, %xmm0, 8(%rdi)
vextractps      $3, %xmm0, 12(%rdi)
vextractf128    $1, %ymm0, %xmm0
vmovss          %xmm0, 16(%rdi)
vextractps      $1, %xmm0, 20(%rdi)
vextractps      $2, %xmm0, 24(%rdi)
vextractps      $3, %xmm0, 28(%rdi)
vzeroupper
retq

To:

vmovups %ymm0, (%rdi)
vzeroupper
retq
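The two instruction sequences have the same memory effect: eight 4-byte stores at offsets 0 through 28 write the same 32 bytes as one unaligned 32-byte store. A minimal source-level sketch of that equivalence follows. `Vec8`, `storeElementwise`, and `storeMerged` are hypothetical names, and `std::array<float, 8>` is only a stand-in for a 256-bit vector register; the sketch models the memory effect and says nothing about what a particular compiler emits.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Stand-in for a 256-bit vector holding eight 32-bit floats.
using Vec8 = std::array<float, 8>;

// One scalar store per extracted element, as in the "before" sequence.
void storeElementwise(const Vec8& v, float* dst) {
    assert(dst != nullptr);
    for (std::size_t i = 0; i < v.size(); ++i)
        dst[i] = v[i];  // element i lands at byte offset 4 * i
}

// A single 32-byte copy, as in the "after" sequence.
void storeMerged(const Vec8& v, float* dst) {
    assert(dst != nullptr);
    std::memcpy(dst, v.data(), sizeof(float) * v.size());
}
```

Both functions leave the destination buffer with identical contents, which is the property that makes the merged store a legal replacement.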

Note that this patch depends on http://reviews.llvm.org/D6678 to avoid even worse codegen on Sandy Bridge. See http://llvm.org/bugs/show_bug.cgi?id=21711#c7 for more details.

Diff Detail

Repository
rL LLVM

Event Timeline

spatel updated this revision to Diff 17369. Dec 16 2014, 3:24 PM
spatel retitled this revision to "merge consecutive stores of extracted vector elements (PR21711)".
spatel updated this object.
spatel edited the test plan for this revision.
spatel added reviewers: hfinkel, andreadb, mkuper.
spatel added a subscriber: Unknown Object (MLST).
This revision was automatically updated to reflect the committed changes.