This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombiner] reduce insert+bitcast+extract vector ops to truncate (PR39016)
ClosedPublic

Authored by spatel on Oct 12 2018, 9:20 AM.

Details

Summary

This is a late backend subset of the IR transform added with:
D52439

We can confirm that the conversion to a 'trunc' is correct by running:
$ opt -instcombine -data-layout="e"
(assuming the IR transforms are correct; change "e" to "E" for big-endian)

As discussed in PR39016:
https://bugs.llvm.org/show_bug.cgi?id=39016
...the pattern may emerge during legalization, so that's why I've opted to wait for an insertelement to become a scalar_to_vector in the pattern matching here.

The DAG allows for fun variations that are not possible in IR. Result types for extracts and scalar_to_vector don't necessarily match input types, so that means we have to be a bit more careful in the transform (see code comments).

The tests show that we don't handle cases that require a shift (as we did in the IR version). I've left that as a potential follow-up because I'm not sure if that's a real concern at this late stage.

The bug report mentions an x86 regression test that isn't changed here, so I'm also not sure if this is enough to close the bug report.

Diff Detail

Repository
rL LLVM

Event Timeline

spatel created this revision.Oct 12 2018, 9:20 AM
craig.topper added inline comments.Oct 12 2018, 12:16 PM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
15526 ↗(On Diff #169418)

Does this work correctly on big endian for non power of 2 element count or element size such that XBitWidth isn't evenly divisible by VecEltBitWidth.

spatel added inline comments.Oct 13 2018, 8:02 AM
lib/CodeGen/SelectionDAG/DAGCombiner.cpp
15526 ↗(On Diff #169418)

I'll add a test with non-power-of-2 vector size, but I don't see a way to actually exercise that possibility given that this transform is only happening late (legal types are already in effect, but we can make that explicit).
AFAICT, the scalar bitwidth must be a multiple of the vector element bitwidth, so I'll add an assert for that.

spatel updated this revision to Diff 169563.Oct 13 2018, 8:07 AM
spatel marked 2 inline comments as done.

Patch updated:

  1. Add "LegalTypes" as the first predicate for trying this transform to make it less likely that any weird types invalidate the later assumptions.
  2. Add an assert that the scalar bitwidth is a multiple of the vector element bitwidth (scalar_to_vector size must be a multiple, and bitcast can't change size).
  3. Add a test that at least starts with a weird type in IR.

FYI, the post legalize types DAG combine doesn’t run if all the types in the DAG were legal and nothing was changed by legalize types. So gating a combine on LegalTypes can really mean a transform doesn’t run until after vector op legalization(again the DAG combine after this doesn’t run if nothing changed) or LegalizeDAG.

FYI, the post legalize types DAG combine doesn’t run if all the types in the DAG were legal and nothing was changed by legalize types. So gating a combine on LegalTypes can really mean a transform doesn’t run until after vector op legalization(again the DAG combine after this doesn’t run if nothing changed) or LegalizeDAG.

OK, didn't realize that limitation, although it still catches all of the cases that I've added here. Suggestions for a more appropriate predicate?
We could confirm that the type that we're casting to is legal:
TLI.isTypeLegal(NVT) ?

This revision is now accepted and ready to land.Oct 15 2018, 10:05 PM
This revision was automatically updated to reflect the committed changes.