This is an archive of the discontinued LLVM Phabricator instance.

[ARM64] Implement NEON post-increment LD1 (lane) and post-increment LD1R
ClosedPublic

Authored by HaoLiu on May 13 2014, 3:00 AM.

Details

Summary

Hi Tim,

This patch implements post-increment LD1 (lane) and post-increment LD1R. The implementation is like the implementation of NEON post-increment load with 2/3/4 vectors.
It tries to do the following 2 combines if satisfied:

(1) combine an ARM64ISD:DUP and a post-increment load into a post-increment LD1R.
(2) combine an ISD::INSERT_VECTOR_ELT and a post-increment load into a post-increment LD1 (lane).

Ask for code review.

Thanks,
-Hao

Diff Detail

Event Timeline

HaoLiu updated this revision to Diff 9340.May 13 2014, 3:00 AM
HaoLiu retitled this revision from to [ARM64] Implement NEON post-increment LD1 (lane) and post-increment LD1R.
HaoLiu updated this object.
HaoLiu edited the test plan for this revision. (Show Details)
HaoLiu added a reviewer: t.p.northover.
HaoLiu added a subscriber: Unknown Object (MLST).

ping...

-----Original Message-----
From: Hao Liu [mailto:Hao.Liu@arm.com]
Sent: Tuesday, May 13, 2014 6:01 PM
To: Hao Liu; t.p.northover@gmail.com
Cc: Amara Emerson; llvm-commits@cs.uiuc.edu
Subject: [PATCH] [ARM64] Implement NEON post-increment LD1 (lane) and
post-increment LD1R

Hi t.p.northover,

Hi Tim,

This patch implements post-increment LD1 (lane) and post-increment LD1R.
The implementation is like the implementation of NEON post-increment load
with 2/3/4 vectors.
It tries to do the following 2 combines if satisfied:

(1) combine an ARM64ISD:DUP and a post-increment load into a post-

increment LD1R.

(2) combine an ISD::INSERT_VECTOR_ELT and a post-increment load into a

post-increment LD1 (lane).

Ask for code review.

Thanks,
-Hao

http://reviews.llvm.org/D3740

Files:

lib/Target/ARM64/ARM64ISelDAGToDAG.cpp
lib/Target/ARM64/ARM64ISelLowering.cpp
lib/Target/ARM64/ARM64ISelLowering.h
test/CodeGen/ARM64/indexed-vector-ldst.ll

Hi Hao,

I have a couple of comments.

Cheers,

James

lib/Target/ARM64/ARM64ISelDAGToDAG.cpp
1159–1172

Please add braces around this if clause? either there should be no braces around if or else, or braces around both.

2670

Would a switch be clearer here?

2746

Having all these individual MVT:: cases makes it likely you'll miss one or someone will miss one on an update. Would it be easier to switch on the EVT::getVectorElementSizeInBits()?

lib/Target/ARM64/ARM64ISelLowering.cpp
7098

What about SEXTLOAD and ZEXTLOAD here?

7129

GetScalarSizeInBits() may simplify this.

7158

Typo: "write back"

HaoLiu updated this revision to Diff 9408.May 14 2014, 8:39 PM

Hi James,

Thanks for the code review. I've reworked on the code according to your comments.

Thanks,
-Hao

lib/Target/ARM64/ARM64ISelDAGToDAG.cpp
2670

Hi James,
Yes, switch is much clearer. But all the other code to select load/store is implemented like this, if we modify this, all other code needs to be modified. This will change a lot. So Tim suggested to deal with this piece of code in the future with a separate patch (see the comment in http://reviews.llvm.org/D3605).
So I just implement this piece of code like others. This will be improved in the future.

lib/Target/ARM64/ARM64ISelLowering.cpp
7098

Oh, I analysis this again. I find that this is always ISD::LOAD, no ISD::EXTLOAD or SEXTLOAD/ZEXTLOAD. So I remove EXTLOAD and check the type of memory operand type, which should be the same to the element type of the vector.

jmolloy accepted this revision.May 15 2014, 3:45 AM
jmolloy added a reviewer: jmolloy.

Hi Hao,

OK, understood. This looks fine to me, but Tim needs to OK it before it can go in.

Cheers,

James

This revision is now accepted and ready to land.May 15 2014, 3:45 AM

Hi Tim,

Do you have any comments?

If no, I'll commit this patch.

Thanks,
-Hao

t.p.northover edited edge metadata.May 16 2014, 1:05 AM

Hi Hao,

I think this looks OK. It would have been nicer to leave it later (ISelDAGToDAG) but we already had this code lying around, so why not make use of it?

Go for it!

Tim.

Hi Tim,

This patch has two steps:

(1) do post LD1 combine in ARM64ISelLowing
(2) do select post LD1 in ARM64ISelDAGToDAG

So you mean it's better to not to do post LD1 combine in ARM64ISelLowering? We can do such combine and select in one step in ARM64ISelDAGToDAG?

If so, considering other post NEON load/store like LD2/LD3/LD4 are also implemented in two steps, they also need to be refactored.

Thanks,
-Hao

Hi Hao,

This patch has two steps:

(1) do post LD1 combine in ARM64ISelLowing
(2) do select post LD1 in ARM64ISelDAGToDAG

So you mean it's better to not to do post LD1 combine in ARM64ISelLowering? We can do such combine and select in one step in ARM64ISelDAGToDAG?

No need to take it as a suggestion: it was just a vague desire for the
future really.

If so, considering other post NEON load/store like LD2/LD3/LD4 are also implemented in two steps, they also need to be refactored.

The main difference there, as I see it, is that they're *already*
intrinsics coming into the DAG. We're definitely not going to be
missing out on any generic DAG combines by fiddling them a bit more.
The same can't necessarily be said for more generic ISD::LOADs.

That said, the LD2/LD3/LD4 situation did help convince me that doing
it in ISelLowering wasn't too bad.

Cheers.

Tim.

Oh, I see. That's reasonable.

Committed in http://llvm.org/viewvc/llvm-project?rev=208955&view=rev

Thanks,
-Hao

HaoLiu closed this revision.May 13 2015, 7:55 PM