This is an archive of the discontinued LLVM Phabricator instance.

Extending Vector GetElementPtr - please review
ClosedPublic

Authored by delena on Jun 17 2015, 2:26 AM.

Details

Summary

Changing syntax of vector GEP.

Please see the start of discussion: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-March/082921.html
And the end agreement: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-March/082989.html

According to the current GEP syntax, vector GEP requires that each index must be a vector with the same number of elements.

%A = getelementptr i8, <4 x i8*> %ptrs, <4 x i64> %offsets

In this implementation I let each index be or vector or scalar. All vector indices must have the same number of elements. The scalar value will mean the splat vector value.

(1) %A = getelementptr i8, i8* %ptr, <4 x i64> %offsets
or
(2) %A = getelementptr i8, <4 x i8*> %ptrs, i64 %offset

In all cases the %A type is <4 x i8*>

In the case (2) we add the same offset to all pointers.

The case (1) covers C[B[i]] case, when we have the same base C and different offsets B[i].

I'll update the documentation once the patch is approved.

Diff Detail

Repository
rL LLVM

Event Timeline

delena updated this revision to Diff 27817.Jun 17 2015, 2:26 AM
delena retitled this revision from to Extending Vector GetElementPtr - please review.
delena updated this object.
delena edited the test plan for this revision. (Show Details)
delena added reviewers: hfinkel, nadav, aschwaighofer.
delena set the repository for this revision to rL LLVM.
delena added a subscriber: Unknown Object (MLST).

Hi Elena,

Quick comment: I think you'd need to update the documentation according to this change in LangRef (see e.g. http://llvm.org/docs/GetElementPtr.html#can-gep-index-into-vector-elements).

Thanks,
Michael

delena updated this object.Jun 17 2015, 10:56 PM
delena updated this object.Jun 17 2015, 11:03 PM
delena added reviewers: mzolotukhin, dblaikie.
delena removed a subscriber: mzolotukhin.
hfinkel edited edge metadata.Jun 22 2015, 7:13 PM

Can you please separate out the masked.store changes?

Please also include the LangRef update with this patch; they should not be separate.

include/llvm/IR/Instructions.h
842

Line too long? (looks like there are a bunch of these)

980–984
delena updated this revision to Diff 28237.Jun 23 2015, 7:23 AM
delena edited edge metadata.

Hi Hal, thank you for the review.

  1. removed all code changes related to the masked gather.
  2. added documentation, see changes in LangRef.rst.
  3. fixed code style errors.

Could somebody proceed with this review, please?

Thanks.

  • Elena

Hi Elena,

Maybe I'm missing something, but can you give an example on where this will be clearly beneficial? From what I see, you're just moving the time when you go from i8* to <N x i8*>, a bit further down the pipeline, and complicating the IR in the process. I'd have thought that it should be pretty simple for any front-end or back-end to do the splat on their own.

To justify such change, I'm looking for some kind of transformation / optimisation / selection that cannot be done unless the base pointer is a scalar.

cheers,
--renato

delena added a comment.Jul 1 2015, 1:32 AM

I really need this change to optimize masked gather and scatter code. I have a mail thread in LLVM dev, where I justify this change and people agreed with me.

Please see the start of discussion: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-March/082921.html
And the end agreement: http://lists.cs.uiuc.edu/pipermail/llvmdev/2015-March/082989.html

I'd have thought that it should be pretty simple for any front-end or back-end to do the splat on their own.

The front-end can do the splat, but I want to allow front-end to leave a scalar value and let the backend to splat.
When the frontend splats the scalar, it generates a broadcast instruction before GEP. While looking at the GEP in backend, I can't find this broadcast (but I need it!), it is moved up to another basic block.

To justify such change, I'm looking for some kind of transformation / optimisation / selection that cannot be done unless the base pointer is a scalar.

Initially I added an optimization for masked gather/scatter to this patch, but Hal asked me to remove it and submit separately.

I plan to add masked gather and scatter to the loop vectorizer. I want to use the updated form of Vector GEP there.

  • Elena

Ok, I missed that discussion, sorry. I agree it's justifiable.

The patch looks good to me, apart from some small comments. But I'd like Hal to give the final approval.

cheers,
--renato

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2792

This comment is outdated, needs to be moved down.

lib/IR/Verifier.cpp
2565

I think you should add an extra assert that the scalar types, if any, possibly must be pointers of the same family (int / float), but not necessarily the same width.

delena added a comment.Jul 1 2015, 1:50 AM

Renato,
thank you, I'll fix.

Hal, do you have more comments for this patch?

  • Elena

Just a couple inline documentation comments. Everything else looks good given the comments provided by others.

docs/LangRef.rst
6227

I think we also need to update the summary syntax at the top of the getelementptr documentation.

6253

I don't understand the <8 x i64> here. Aren't the vector lengths supposed to agree? The same issue carries through the example.

delena updated this revision to Diff 28963.Jul 2 2015, 10:43 AM
delena marked 5 inline comments as done.

I addressed all comments and uploaded a new patch.

Minor issues.

docs/LangRef.rst
6231

broadcasted -> broadcast

(broadcasted is the past tense, so 'was broadcasted' but 'will be broadcast')

6278

Add this back in your patch for masked gather

include/llvm/IR/Instructions.h
847

Missing indent.

861

Missing indent.

lib/AsmParser/LLParser.cpp
5546

I'd name this GEPWidth.

lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp
2764

Comments should end with a period.

2766

Line is too long.

lib/IR/Verifier.cpp
2554–2557

GEPWidth

I think it'd be more concise to use getType()->getVectorNumElements() instead of cast<VectorType>(getType())->getNumElements()

delena updated this revision to Diff 29069.Jul 6 2015, 3:13 AM
delena removed rL LLVM as the repository for this revision.
delena marked 5 inline comments as done.

I addressed all comments and uploaded a new patch.
Thanks.

delena added a subscriber: delena.Jul 8 2015, 3:57 AM

Hi Hal,

May I commit this patch?

Thanks.

  • Elena
hfinkel accepted this revision.Jul 8 2015, 6:40 PM
hfinkel edited edge metadata.

Hi Hal,

May I commit this patch?

Yes, LGTM. Thanks!

Thanks.

  • Elena
This revision is now accepted and ready to land.Jul 8 2015, 6:40 PM
delena closed this revision.Jul 9 2015, 12:47 AM

Committed. Thanks a lot to all reviewers.

docs/LangRef.rst
6253

Thank you! It was a bug.

6278

This is unrelated to masked.gather optimization. The masked gather is a user of vector GEP with or without scalar arguments.

Masked gather lowering will be more optimal if the base pointer is a scalar, but I'm not talking about this here.

include/llvm/IR/Instructions.h
861

I'm removing this additional assert. I don't need it.

lib/IR/Verifier.cpp
2565

I added an assert - all indices should be integer.