Download Raw Diff

Details

Reviewers

rengolin
t.p.northover
gberry
jmolloy
sbaranga
mcrosier

Commits

rG4355e404d50c: [AArch64] Add DAG combine for extract extend pattern
rL255895: [AArch64] Add DAG combine for extract extend pattern

Summary

This patch adds a DAG combine for (any_extend (extract_vector_elt v, i)) ->
(extract_vector_elt v, i). The extend is unnecessary since it is performed
implicitly by the extract.

Diff Detail

Event Timeline

mssimpso updated this revision to Diff 42810.Dec 14 2015, 6:24 PM

mssimpso retitled this revision from to [AArch64] Add missing extract extend patterns.

mssimpso updated this object.

mssimpso added reviewers: t.p.northover, sbaranga, rengolin, mcrosier, gberry.

mssimpso added a subscriber: llvm-commits.

Herald added subscribers: rengolin, aemerson. · View Herald TranscriptDec 14 2015, 6:24 PM

sbaranga added inline comments.Dec 15 2015, 3:02 AM

lib/Target/AArch64/AArch64InstrInfo.td
3809	Any idea why we don't have (anyext (vector_extract (...) ) -> vector_extract ( )? I see that vector_extract is EXTRACT_VECTOR_ELT, which according to ISDOpcodes.h should be able to perform the anyext operation. If that would happen, wouldn't one of the patterns above match this?

Thanks very much for the quick reply, Silviu! Please see my response inline.

lib/Target/AArch64/AArch64InstrInfo.td
3809	I see what you're getting at, but as far as I know (please correct me if I'm wrong), "vector_extract" can't be used in an output pattern.

gberry added inline comments.Dec 15 2015, 7:45 AM

lib/Target/AArch64/AArch64InstrInfo.td
3809	I think what Silvu is getting at is that maybe a better place to fix this would be to add a dag-combine that did (anyext (vector_extract (x)) -> vector_extract(x) (with the wider type).

Reimplemented change as a DAG combine following feedback from Silviu and Geoff.

I first attemted to make this a target-independent combine, but that broke a lot of X86 tests. If we like the approach, I think it's best to limit the combine to AArch64 for now.

mssimpso retitled this revision from [AArch64] Add missing extract extend patterns to [AArch64] Add DAG combine for extract extend pattern.Dec 15 2015, 12:37 PM

mssimpso updated this object.

Get the DAG combine right this time.

Check for illegal vector types.

jmolloy requested changes to this revision.Dec 16 2015, 1:06 AM

jmolloy added a reviewer: jmolloy.

jmolloy added a subscriber: jmolloy.

jmolloy added inline comments.

lib/Target/AArch64/AArch64ISelLowering.cpp

8444 ↗

(On Diff #42932)

I suppose we need the sign extend because ISD::EXTRACT_VECTOR_ELT gets matched to "smov"? Perhaps that's something that needs changing - surely we should match (sign_ext_inreg (extract_vector_elt)) to smov and (zero_ext_inreg (extract_vector_elt)) to umov.

I think there needs to be more type checking going on here. Consider:

%0 = EXTRACT_VECTOR_ELT <8 x i16> %a  ; %0 is i16
%1 = ANYEXT i16 %0 to i32 ; %1 is i32
%2 = SIGN_EXTEND_INREG i32 %0 (from i8 to i32) ; bit[7] is replicated into bit[8..31]

This cannot be an SMOVv8i16, because the sign bit is bit[7], not bit[15]. This testcase shows it:

define i64 @matt(<8 x i16> %a) {
  %b = extractelement <8 x i16> %a, i32 1
  %c = trunc i16 %b to i8
  %d = sext i8 %c to i64
  ret i64 %d
}

SelectionDAG has 11 nodes:
t0: ch = EntryToken
        t2: v8i16,ch = CopyFromReg t0, Register:v8i16 %vreg0
      t14: i32 = extract_vector_elt t2, Constant:i64<1>
    t15: i64 = any_extend t14
  t13: i64 = sign_extend_inreg t15, ValueType:ch:i8
t9: ch,glue = CopyToReg t0, Register:i64 %X0, t13
t10: ch = AArch64ISD::RET_FLAG t9, Register:i64 %X0, t9:1

This revision now requires changes to proceed.Dec 16 2015, 1:06 AM

Added additional type checking noted by James.

mssimpso marked an inline comment as done.Dec 16 2015, 7:54 AM

mssimpso added inline comments.

lib/Target/AArch64/AArch64ISelLowering.cpp
8444 ↗	(On Diff #43010)	James, You're right. Thanks for joining the review! We also need to check that the vector element type is the same as the type we're sign extending from. Yes, this combine is intended to canonicalize the patterns we match to SMOV. The existing patterns attempt match (sext_inreg (vector_extract v, i)) for the legal types. However, the 16xi8-to-i64 case and the 8xi16-to-i64 case (which we have patterns for) don't actually follow this form because there is an added any_extend. An any_extend from i32 to i64 is added by the type legalizer because it legalizes the extracted elements to i32. So for example, for the case below: define i64 @f(<16 x i8> %a) { entry: %e = extractelement <16 x i8> %a, i32 2 %b = zext i8 %e to i64 ret i64 %b } we currently generate: umov w8, v0.b[2] sxtb x0, w8 With this DAG combine, we will instead generate the following code because the existing 16xi8-to-i64 pattern will fire. smov x0, v0.b[2]

mssimpso marked an inline comment as done.Dec 16 2015, 7:57 AM

mssimpso added inline comments.

lib/Target/AArch64/AArch64ISelLowering.cpp
8444 ↗	(On Diff #43010)	Sorry, that should have been a sext in the test case above.

Type checking.

James, you may need to go into Phabricator and unblock the patch.

From: James Molloy [mailto:james@jamesmolloy.co.uk]
Sent: Wednesday, December 16, 2015 11:55 AM
To: reviews+D15515+public+f270c3ee04f7413b@reviews.llvm.org; Matthew Simpson; t.p.northover@gmail.com; silviu.baranga@arm.com; renato.golin@linaro.org; mcrosier@codeaurora.org; gberry@codeaurora.org; james.molloy@arm.com
Cc: kanheim@a-bix.com; llvm-commits@lists.llvm.org
Subject: Re: [PATCH] D15515: [AArch64] Add DAG combine for extract extend pattern

That makes sense to me. LGTM!

LGTM, per James.

Closed by commit rL255895: [AArch64] Add DAG combine for extract extend pattern (authored by mssimpso). · Explain WhyDec 17 2015, 6:34 AM

This revision was automatically updated to reflect the committed changes.

Thanks very much Silviu, James, Geoff, and Chad for the feedback.

I reverted this patch because it was breaking some internal tests. After looking at the failing test case, I don't think the DAG is the right place to do this. The issue in the failing test was that the sign_extend was first matched to another pattern (a widening operation like ssubw). Without the sign_extend, the widened extract_vector_elt could no longer be matched to anything. See the code below for an example.

define i64 @f(<8 x i16> %a, i64 %b) {
  %d = extractelement <8 x i16> %a, i32 2
  %e = sext i16 %d to i64
  %t1 = sub nsw i64 %b, %e
  ret i64 %t1
}

I think the initial revision of the patch, which added the two additional patterns to the target description, is the safer approach. We'll match the smov cases, but won't risk making the DAG illegal. Were there any objections to the initial revision?

If I understand correctly, you may be able to just add patterns for the vector_extracts without extends/ands to handle this case (along with the current dagcombine).
I'm not sure this will buy you anything more than just using the original patterns you suggested though.

Ok, it seems that the original solution is the best way to do this then.

This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Add DAG combine for extract extend pattern
ClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 42810

lib/Target/AArch64/AArch64InstrInfo.td

test/CodeGen/AArch64/arm64-neon-copy.ll

This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Add DAG combine for extract extend patternClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 42810

lib/Target/AArch64/AArch64InstrInfo.td

test/CodeGen/AArch64/arm64-neon-copy.ll

[AArch64] Add DAG combine for extract extend pattern
ClosedPublic