This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombine] Avoid INSERT_SUBVECTOR reinsertions (PR28678)
ClosedPublic

Authored by RKSimon on Aug 9 2016, 11:54 AM.

Details

Summary

If the input vector to INSERT_SUBVECTOR is another INSERT_SUBVECTOR, and this inserted subvector replaces the last insertion, then insert into the common source vector.

Diff Detail

Repository
rL LLVM

Event Timeline

RKSimon updated this revision to Diff 67385.Aug 9 2016, 11:54 AM
RKSimon retitled this revision from to [DAGCombine] Avoid INSERT_SUBVECTOR reinsertions (PR28678).
RKSimon updated this object.
RKSimon added reviewers: majnemer, mkuper, spatel, andreadb.
RKSimon set the repository for this revision to rL LLVM.
RKSimon added a subscriber: llvm-commits.
mkuper edited edge metadata.Aug 9 2016, 4:08 PM

LGTM.

Although I'm somewhat curious about how we even end up with this pattern - I guess for test_mm256_insert_epi64, we really need to have two insert_elements post-legalization, but why do we end up with two insert_subvectors?

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
13810 ↗(On Diff #67385)

Could you make this comment slightly easier to parse? It's probably my fault, but took me a bit too long to understand what you meant. :-)
Maybe explicitly say that both insert into the same lane, instead of (or in addition to) using "replaces"?

mkuper accepted this revision.Aug 9 2016, 4:08 PM
mkuper edited edge metadata.
This revision is now accepted and ready to land.Aug 9 2016, 4:08 PM
This revision was automatically updated to reflect the committed changes.

Although I'm somewhat curious about how we even end up with this pattern - I guess for test_mm256_insert_epi64, we really need to have two insert_elements post-legalization, but why do we end up with two insert_subvectors?

Thanks, I've cleaned up the description by using pseudocode instead. I also realised that we need to check the subvector types are the same (e.g. to protect against inserting a 256 following by 128 into a 512).

Any time that we're doing partial insertions (i.e. something that build vector won't handle) we tend to end up with these cases - each insertion has a separate extractsub/insertelt/insertsub pattern - what's curious is that a common extractsub is being used.

If you notice for the lower 128-bit vectors we still have a lot of duplicate blends to bring the lower/upper halves back together again.