This is an archive of the discontinued LLVM Phabricator instance.

Fold EXTRACT_VECTOR_ELT(BUILD_VECTOR(Elt[0], ...), CstX ) -> Elt[CstX]
AbandonedPublic

Authored by mehdi_amini on Apr 18 2015, 1:06 AM.

Details

Reviewers
ab
Summary

Add a new DAG combine to fold EXTRACT_VECTOR_ELT through BUILD_VECTOR.
It breaks one ARM test, but I believe it canonicalizes the test in
a different way and the DAG should be updated to handle it. Added
another variant of the test, conceptually identical but which was
failing before this patch. CC:

  • Tim to have a look at the AArch64 test change, I hope it is legal.
  • Tom/Matt for the R600 test changes.

Diff Detail

Event Timeline

mehdi_amini retitled this revision from to Fold EXTRACT_VECTOR_ELT(BUILD_VECTOR(Elt[0], ...), CstX ) -> Elt[CstX].
mehdi_amini updated this object.
mehdi_amini edited the test plan for this revision. (Show Details)
mehdi_amini added subscribers: t.p.northover, tstellarAMD, arsenm, Unknown Object (MLST).

Fix ARM/vmov.ll test (was unintentionally truncated)

ab edited edge metadata.Apr 23 2015, 3:34 PM

The combine seems sound, but ARM definitely needs to learn about special constants before this can go in.

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
11004

Would it make sense to check that the BUILD_VECTOR .hasOneUse()?

11010

space before (

test/CodeGen/AArch64/fold-constants.ll
6–8

This looks fine to me, but we can do better, I think. The code looks like it's just shuffling zeros around, so I'd expect:

mov x0, xzr
18–19

This is the part where the combiners get confused. We could catch this with:

(v1i64 bitcast (v4i16 build_vector (i16 C), undef, undef, undef))
    ->
(v1i64 bitcast (i64 zext/sext (i16 C)))

It's not clear to me that's better though. This would be more legitimate:

(i64 bitcast (v4i16 build_vector (i16 C), undef, undef, undef))
    ->
(i64 zext/sext (i16 C))

But for this testcase, we'd also need:

(i64 extract_element (v1i64 bitcast N), i32 0)
    ->
(i64 bitcast N)

And that might be a whole 'nother mess on AArch64 (it might be very useful as well, I'm not sure).

I'll have a closer look if you don't beat me to it ;)

test/CodeGen/ARM/vmov.ll
3

These failures are caused by the logic in LowerBUILD_VECTOR (recognizing splats) not firing anymore. With the combine, the vector doesn't survive until lowering, so we're stuck with a magic f64 constant that happens to have the same bitpattern as a vector splat.

As you say, ARM probably should learn about recognizing special patterns in LowerConstantFP and whatnot. That would let it materialize stuff like:

double 0x0808080808080808

using MOVI instead of constant pools (judging by the failure, I don't think it does). Feel free to file a PR and/or give it a shot; I'll try to have a look myself.

test/CodeGen/R600/ds_read2.ll
220–221

A single -NOT is enough

Thanks for the review. I'll wait for you to have time to fix the ARM64 failure ;)

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
11004

Good question, I can imagine cases where I don't want to this check but I'm not sure I see a lot of cases where it would be beneficial?

11010

grahh clang-format :(

test/CodeGen/R600/ds_read2.ll
220–221

Of course! Thanks.

D13655 implements this transformation, but the example I added here v_movQi8_double is probably still a missing optimization in the ARM backend.

mehdi_amini abandoned this revision.Dec 11 2015, 2:02 PM