This is an archive of the discontinued LLVM Phabricator instance.

[DAG] visitVECTOR_SHUFFLE - MergeInnerShuffle - improve shuffle(shuffle(x,y),shuffle(x,y)) merging
ClosedPublic

Authored by RKSimon on Jan 14 2021, 4:18 AM.

Details

Summary

MergeInnerShuffle currently attempts to merge shuffle(shuffle(x,y),z) patterns into a single shuffle, using 1 or 2 of the x,y,z ops.
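
As a rough illustration of what that merge amounts to, here is a standalone sketch. It is not the DAGCombiner code: `Lane`, `Merged`, `resolveLanes` and `mergeLanes` are hypothetical names, and it assumes the usual two-operand mask convention (-1 is undef, indices in [0, N) pick from the first operand, indices in [N, 2N) pick from the second).

```cpp
#include <cassert>
#include <vector>

// Hypothetical model: one lane of the merged result is either undef
// (Src == -1) or element Elt of leaf operand Src (0 = x, 1 = y, 2 = z).
struct Lane {
  int Src;
  int Elt;
};

// Hypothetical result type: which leaves became the two operands of the
// merged shuffle (-1 if unused) and the merged mask.
struct Merged {
  int Op0 = -1;
  int Op1 = -1;
  std::vector<int> Mask;
};

// Trace every lane of shuffle(shuffle(x, y, Inner), z, Outer) back to a leaf.
std::vector<Lane> resolveLanes(const std::vector<int> &Outer,
                               const std::vector<int> &Inner) {
  assert(Outer.size() == Inner.size() && "masks must have the same width");
  const int N = static_cast<int>(Outer.size());
  std::vector<Lane> Lanes;
  for (int Idx : Outer) {
    if (Idx < 0) {
      Lanes.push_back({-1, 0}); // undef lane
    } else if (Idx >= N) {
      Lanes.push_back({2, Idx - N}); // taken directly from z
    } else {
      int In = Inner[Idx]; // look through the inner shuffle
      if (In < 0)
        Lanes.push_back({-1, 0});
      else if (In < N)
        Lanes.push_back({0, In}); // element of x
      else
        Lanes.push_back({1, In - N}); // element of y
    }
  }
  return Lanes;
}

// Succeeds only if the lanes reference at most two distinct leaves; the
// operands are assigned in the order the leaves are first encountered.
bool mergeLanes(const std::vector<Lane> &Lanes, Merged &Out) {
  const int N = static_cast<int>(Lanes.size());
  Out = Merged();
  for (const Lane &L : Lanes) {
    if (L.Src < 0) {
      Out.Mask.push_back(-1);
    } else if (Out.Op0 < 0 || Out.Op0 == L.Src) {
      Out.Op0 = L.Src;
      Out.Mask.push_back(L.Elt);
    } else if (Out.Op1 < 0 || Out.Op1 == L.Src) {
      Out.Op1 = L.Src;
      Out.Mask.push_back(L.Elt + N);
    } else {
      return false; // a third distinct leaf would be needed
    }
  }
  return true;
}
```

With `Inner = {0, 4, 1, 5}` and `Outer = {0, 2, 4, 5}` only x and z are referenced, so the sketch yields the single mask `{0, 1, 4, 5}` over (x, z). With `Outer = {0, 1, 4, 5}` all three leaves are needed and the merge is rejected.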

However, if we have already matched 2 ops, we might be able to handle the third op if it's also a shuffle that references one of the previous ops, allowing us to handle some cases like:

shuffle(shuffle(x,y),shuffle(x,y))
shuffle(shuffle(shuffle(x,z),y),z)
shuffle(shuffle(x,shuffle(x,y)),z)
etc.
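
For the first of these patterns, the effect can be sketched as a plain mask composition. This is a simplified standalone model, not the patch itself: `applyShuffle` and `composeMasks` are hypothetical helpers, both inner shuffles are assumed to take (x, y) in the same operand order, and undef lanes are modelled as 0.

```cpp
#include <cassert>
#include <vector>

// Reference semantics of a two-operand shuffle: -1 is undef (modelled as 0
// here), [0, N) selects from A and [N, 2N) selects from B.
std::vector<int> applyShuffle(const std::vector<int> &A,
                              const std::vector<int> &B,
                              const std::vector<int> &Mask) {
  assert(A.size() == B.size() && "operands must have the same width");
  const int N = static_cast<int>(A.size());
  std::vector<int> Result;
  for (int Idx : Mask)
    Result.push_back(Idx < 0 ? 0 : (Idx < N ? A[Idx] : B[Idx - N]));
  return Result;
}

// Fold shuffle(shuffle(x, y, Inner0), shuffle(x, y, Inner1), Outer) into a
// single mask over (x, y) by looking through whichever inner shuffle each
// outer lane selects.
std::vector<int> composeMasks(const std::vector<int> &Outer,
                              const std::vector<int> &Inner0,
                              const std::vector<int> &Inner1) {
  const int N = static_cast<int>(Outer.size());
  std::vector<int> MergedMask;
  for (int Idx : Outer) {
    if (Idx < 0)
      MergedMask.push_back(-1); // undef stays undef
    else if (Idx < N)
      MergedMask.push_back(Inner0[Idx]); // lane came from the first shuffle
    else
      MergedMask.push_back(Inner1[Idx - N]); // lane came from the second
  }
  return MergedMask;
}
```

For example, with `Inner0 = {0, 2, 4, 6}` (even elements), `Inner1 = {1, 3, 5, 7}` (odd elements) and `Outer = {0, 4, 1, 5}`, the composed mask is `{0, 1, 2, 3}`, i.e. the three shuffles collapse to x itself.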

This isn't an exhaustive match and it depends on the order in which the candidate ops are encountered: if one of the already-matched ops was itself a shuffle that we could have peeked through, we don't go back and try to split it. I haven't found much need for that amount of analysis yet.

This is a preliminary patch that will allow us to later improve x86 HADD/HSUB matching, but it needs to be reviewed separately as it's in generic code and affects existing Thumb2 tests.

Event Timeline

RKSimon created this revision. Jan 14 2021, 4:18 AM
RKSimon requested review of this revision. Jan 14 2021, 4:18 AM
Herald added a project: Restricted Project. Jan 14 2021, 4:18 AM
yubing added a subscriber: yubing. Jan 15 2021, 3:08 AM
dmgreen accepted this revision. Jan 15 2021, 6:09 AM

LGTM. We sometimes generate a lot of shuffles in an attempt to do lane interleaving and I know the simplification of them isn't always what it could be once all the lowering has happened. I thought more happened through simplifying buildvectors but apparently not. This looks like a good continuation to the existing code.

This revision is now accepted and ready to land. Jan 15 2021, 6:09 AM