This is an archive of the discontinued LLVM Phabricator instance.

Code Generator for Gather and Scatter Intrinsics.
ClosedPublic

Authored by delena on Feb 16 2015, 2:25 AM.

Details

Summary

I'm asking to review the CodeGen part of masked gather and scatter. I know that the patch is big, I just don't see how to split it.
Just to remind the syntax:

<4 x double> @masked.gather.v4f64 (<4 x double*> %Ptrs, i32 8, <4 x i1> %Mask, <4 x double> %PassTru)
void @masked.scatter.v4f64 (<4 x double> %Val, <4 x double*> %Ptrs, i32 8, <4 x i1> %Mask)

In the target independent part - I build MGATHER and MSCATTER nodes and legalize vector types.

X86 specific:

  • the both X86 gather and scatter instructions kill mask register (implicitly)
  • the memory operand is in base-index-scale form, where the "index" is a vector of indices

Loop vectorizer, cost model, target transform changes are next in the pipe.

Diff Detail

Repository
rL LLVM

Event Timeline

delena updated this revision to Diff 20007.Feb 16 2015, 2:25 AM
delena retitled this revision from to Code Generator Patterns for X86 Gather and Scatter.
delena updated this object.
delena edited the test plan for this revision. (Show Details)
delena added reviewers: anemet, craig.topper.
delena set the repository for this revision to rL LLVM.
delena added a subscriber: Unknown Object (MLST).
craig.topper added inline comments.Mar 2 2015, 9:50 PM
../include/llvm/Target/TargetSelectionDAG.td
199

Why isn't this SDTypeProfile<2. 3 ?

../lib/Target/X86/X86InstrFragmentsSIMD.td
294

This doesn't appear to be used.

499

Why so much commented out code?

../lib/Target/X86/X86InstrInfo.td
723

Why is this commented out?

../utils/TableGen/CodeGenDAGPatterns.cpp
1623

Shouldn't this be "MadeChange |=" as it is it overwrites the change from the first call.

1624

Remove space before the --.

delena added inline comments.Mar 3 2015, 12:43 AM
../include/llvm/Target/TargetSelectionDAG.td
199

I tie the second destination to the one of the sources and create an SDNode node with 3 sources and one destination. This way it works.
Otherwise I need to rewrite a half of the tablegen code.

../lib/Target/X86/X86InstrFragmentsSIMD.td
294

You are right, I'll remove it, thanks.

499

This is the real code that will go in. I have the full implementation of the feature including tests, and trying to split it now into small patches for review.

../lib/Target/X86/X86InstrInfo.td
723

I don't want to put implementation of SelectVectorAddr() in this patch. I just show that the vectoraddr is a special complex pattern that has its own implementation.

../utils/TableGen/CodeGenDAGPatterns.cpp
1623

You are right, thanks.

1624

Ok.

hfinkel added inline comments.Mar 4 2015, 10:04 PM
../include/llvm/CodeGen/ISDOpcodes.h
686

For these nodes (and also the MLOAD/MSTORE nodes above), we need comments explaining what the operands are.

../lib/Target/X86/X86InstrInfo.td
723

Okay, then the comment should say that.

delena updated this revision to Diff 22547.Mar 24 2015, 2:56 AM
delena retitled this revision from Code Generator Patterns for X86 Gather and Scatter to Code Generator for Gather and Scatter Intrinsics..
delena updated this object.
delena edited the test plan for this revision. (Show Details)
delena added reviewers: HaoLiu, rengolin.

Craig Topper submitted TableGen changes that allow to specify the "gather" as a node with two destination registers.

craig.topper added inline comments.Mar 26 2015, 11:41 PM
../include/llvm/Target/TargetSelectionDAG.td
200

I think we should add SDTCisPtrTy for the last argument to both scatter and gather?

../lib/Target/X86/X86InstrFragmentsSIMD.td
499

Oh was this commented out to simplify review?

include/llvm/CodeGen/SelectionDAG.h
876 ↗(On Diff #22547)

Second line isn't quite aligned right. Same with the line below.

include/llvm/CodeGen/SelectionDAGNodes.h
2038 ↗(On Diff #22547)

No need for parentheses

lib/CodeGen/SelectionDAG/DAGCombiner.cpp
5130 ↗(On Diff #22547)

Please add a blank line between this function and the previous function.

5185 ↗(On Diff #22547)

Is there a reason this is commented out instead of just removed?

lib/Target/X86/X86ISelLowering.cpp
1379 ↗(On Diff #22547)

No space before EltSize.

delena updated this revision to Diff 22852.Mar 29 2015, 5:49 AM

Hi Craig,
Thank you for review.

I addressed all comments, and uploaded a new diff.

delena updated this revision to Diff 22853.Mar 29 2015, 6:27 AM

Ping..

Thanks.

rengolin edited edge metadata.Apr 16 2015, 12:45 PM

Hi Elena,

<4 x double> @masked.gather.v4f64 (<4 x double*> %Ptrs, i32 8, <4 x i1> %Mask, <4 x double> %PassTru)
void @masked.scatter.v4f64 (<4 x double> %Val, <4 x double*> %Ptrs, i32 8, <4 x i1> %Mask)

All new intrinsics need change to the LangRef document, to make sure we don't forget any. It also makes it easier for reviewers to understand the clear semantics in relation to the code. Please update docs/LangRef.rst.

cheers,
--renato

ab added a subscriber: ab.Apr 28 2015, 10:48 AM

Skimming through, the logic seems sound; a few nits (and a scary missing break).

Thanks!
-Ahmed

include/llvm/CodeGen/SelectionDAGNodes.h
1988 ↗(On Diff #22853)

One "is" too many.

lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp
1353 ↗(On Diff #22853)

missing break?

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
3724–3727 ↗(On Diff #22853)

{ } for the "if" as well, perhaps?

3775 ↗(On Diff #22853)

nullptr

3783–3786 ↗(On Diff #22853)

Same, { } ?

test/CodeGen/X86/masked_gather_scatter.ll
107–143 ↗(On Diff #22853)

Missing CHECKs ?

144–147 ↗(On Diff #22853)

Needless space?

test/CodeGen/X86/masked_memop.ll
10–11 ↗(On Diff #22853)

This is surprising, do we know why this happens?

delena updated this revision to Diff 24602.Apr 29 2015, 1:58 AM
delena edited edge metadata.
delena removed rL LLVM as the repository for this revision.

I followed Ahmed's review and uploaded a new patch.

rengolin added inline comments.Apr 29 2015, 3:54 AM
test/CodeGen/X86/masked_memop.ll
10–11 ↗(On Diff #22853)

I'm guessing the order has changed. Using AVX2-DAG on both and keeping the 32 as before should work, too, and be more explicit.

; AVX2-DAG: vpmaskmovd      32(%rdi)
; AVX2-DAG: vpmaskmovd      (%rdi)

Then, the 32 can come in any order.

ab added inline comments.Apr 30 2015, 7:44 AM
test/CodeGen/X86/masked_memop.ll
10–11 ↗(On Diff #22853)

Right, -DAG would work. What I'm saying is: the order change is a symptom - innocuous, yes - of something else - probably innocuous as well. We're just legalizing entirely new nodes and intrinsics. Why would any of the existing codegen ever change?

Elena, if you could spare some time to investigate this, it'd be much appreciated!

The test passes without this change, so it is unrelated to gather/scatter . I just revisited all my tests and thought that setting exact order of these 2 operations in incorrect.

rengolin accepted this revision.Jul 28 2015, 4:21 AM
rengolin edited edge metadata.
This revision is now accepted and ready to land.Jul 28 2015, 4:21 AM
rengolin closed this revision.Jul 28 2015, 4:21 AM

r236211