Page MenuHomePhabricator

[X86] Improve a dag-combine that handles a vector extract -> zext sequence.

Authored by mkuper on Dec 3 2014, 4:49 AM.



The current DAG combine turns a sequence of extracts from <4 x i32> followed by zexts into a store followed by scalar loads.
According to measurements by Martin Krastev (see PR 21269) for x86-64, a sequence of an extract, movs and shifts gives better performance. However, for 32-bit x86, the previous sequence still seems better.

Diff Detail


Event Timeline

mkuper updated this revision to Diff 16860.Dec 3 2014, 4:49 AM
mkuper retitled this revision from to [X86] Improve a dag-combine that handles a vector extract -> zext sequence..
mkuper updated this object.
mkuper edited the test plan for this revision. (Show Details)
mkuper added reviewers: qcolombet, nadav.
nadav edited edge metadata.Dec 3 2014, 8:26 AM

I did not review the code carefully but overall it looks good.

qcolombet accepted this revision.Dec 3 2014, 9:53 AM
qcolombet edited edge metadata.

Hi Michael,

This LGTM with one request.
Could you add a test case where we fall back to the old sequence of store + loads?
No need to send an updated patch.


22656 ↗(On Diff #16860)

Period at the end of the comment.

This revision is now accepted and ready to land.Dec 3 2014, 9:53 AM
mkuper edited edge metadata.Dec 3 2014, 1:28 PM
mkuper added a subscriber: Unknown Object (MLST).
mkuper added a comment.Dec 3 2014, 1:31 PM

Thanks, Quentin!

Sure, will add that.
It'll have to be slightly more contrived because it needs to have an explicit zext, as opposed to an implicit extension coming from a GEP, but that's no big deal.

mkuper closed this revision.Dec 4 2014, 5:50 AM
mkuper updated this revision to Diff 16922.

Closed by commit rL223360 (authored by @mkuper).