This is an archive of the discontinued LLVM Phabricator instance.

[InstCombine] convert bitcast-shuffle to vector trunc
ClosedPublic

Authored by spatel on Apr 2 2020, 5:38 AM.

Details

Summary

As discussed in D76983, that patch can turn a chain of insert/extract with scalar trunc ops into bitcast+extract and existing instcombine vector transforms end up creating a shuffle out of that (see the PhaseOrdering test for an example). Currently, that process requires at least this sequence: -instcombine -early-cse -instcombine.

Before D76983, the sequence of insert/extract would reach the SLP vectorizer and become a vector trunc there.

Based on a small sampling of public targets/types, converting the shuffle to a trunc is better for codegen in most cases (and a regression of that form is the reason this was noticed). The trunc is clearly better for IR-level analysis as well.

This means that we can induce "spontaneous vectorization" without invoking any explicit vectorizer passes (at least a vector cast op may be created out of scalar casts), but that seems to be the right choice given that we started with a chain of insert/extract, and the backend would expand back to that chain if a target does not support the op.

Diff Detail

Event Timeline

spatel created this revision.Apr 2 2020, 5:38 AM

In the spirit of https://llvm.org/docs/CodeReview.html#non-experts-should-review-code, I am reviewing the code. But I defer the approval of this patch to somebody else who has more experience than me. (I found only a single nit.)

llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
1691

i * TruncRatio could in theory overflow for ridiculous types. Maybe consider using int64_t for LSBIndex?

spatel updated this revision to Diff 254611.Apr 2 2020, 1:39 PM
spatel marked 2 inline comments as done.

Patch updated:
Widened type for scaled index and added assert.

llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
1691

Good catch (and I should've remembered from D76983...).

This patch indeed fixes our regression. Thanks !

Just one small comment about the overflow.

llvm/lib/Transforms/InstCombine/InstCombineVectorOps.cpp
1691

unsigned * unsigned = unsigned. Make 'TruncRatio' int64_t (or, use uint64_t, just like in D76983).

spatel updated this revision to Diff 254778.Apr 3 2020, 6:32 AM
spatel marked an inline comment as done.

Patch updated:
Fixed types to deal with overflow (hopefully properly this time!).

LGTM.

(But still deferring approval to somebody with more experience.)

This revision is now accepted and ready to land.Apr 3 2020, 7:05 AM
jeroen.dobbelaere accepted this revision.Apr 3 2020, 7:08 AM

LGTM. Thanks !

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptApr 5 2020, 6:56 AM