This is an archive of the discontinued LLVM Phabricator instance.

Re-factorization of X86InterleaveAccess into a class
ClosedPublic

Authored by Farhana on Oct 26 2016, 7:20 AM.

Details

Summary

This change-set support re-factorization of X86InterleaveAccess pass into a class without any functional changes in order to allow better code sharing.

Diff Detail

Repository
rL LLVM

Event Timeline

Farhana updated this revision to Diff 75878.Oct 26 2016, 7:20 AM
Farhana retitled this revision from to Re-factorization of X86InterleaveAccess into a class.
Farhana updated this object.
zvi added inline comments.Oct 27 2016, 7:36 AM
lib/Target/X86/X86InterleavedAccess.cpp
69 ↗(On Diff #75878)

Not related to this patch, but is setting the original vector load's alignment for all 'decomposed' loads correct?

lib/Target/X86/X86InterleavedAccess.h
18 ↗(On Diff #75878)

No need to include in the header. forward declaration is sufficient:
class X86Subtarget;

19 ↗(On Diff #75878)

Probably same as above, LLVM convention is to use forward declarations if reasonable.

53 ↗(On Diff #75878)

Consider documenting this c-tor's arguments.

66 ↗(On Diff #75878)

Does this method need to be public?

77 ↗(On Diff #75878)

Does this method need to be public?

RKSimon added inline comments.Oct 27 2016, 7:59 AM
lib/Target/X86/X86InterleavedAccess.cpp
152 ↗(On Diff #75878)

Isn't this a leak? Where is this deleted?

Farhana updated this revision to Diff 76113.Oct 27 2016, 3:13 PM

Thanks Simon and Zvi.

Here is the updated change-set that addresses your comments.

zvi added inline comments.Oct 28 2016, 2:01 AM
lib/Target/X86/X86InterleavedAccess.cpp
224 ↗(On Diff #76113)

It would better to instantiate as a function-local variable. If you do want to allocate on the heap, please use std::unique_ptr or something similar to manage the object.

Farhana updated this revision to Diff 76449.Oct 31 2016, 11:26 AM

The update addresses Zvi's comment about declaring Grp as a functional local variable.

DavidKreitzer added inline comments.Oct 31 2016, 2:14 PM
lib/Target/X86/X86InterleavedAccess.cpp
26 ↗(On Diff #76449)

elements --> element

53 ↗(On Diff #76449)

This reads funny. Did you mean "into \p NumSubVectors sub vectors of type \p T."?

55 ↗(On Diff #76449)

Please describe the return value in the comment. I assume it is supposed to indicate success? It seems odd to me that that would be necessary. I would expect isSupported() == true to imply that the transform is expected to succeed. Should this just return void instead?

64 ↗(On Diff #76449)

It would be slightly clearer if you used the variable names here, i.e.

Input-Vectors --> InputVectors
OutputVectors --> TransposedVectors

68 ↗(On Diff #76449)

Is this method really supposed to be restricted to 4x4 transpose? If so, your example should be a 4x4 transpose, not a 2x2 one. (Did you mean for the comment at the method definition to be here?)

Trasposed --> Transposed

95 ↗(On Diff #76449)

Don't duplicate the comments from the class definition unless they contain additional value (e.g. notes about the low level implementation).

116 ↗(On Diff #76449)

Why are you passing VecInst, NumSubVectors, & SubVecTy as arguments to decompose? These are all easily accessible from the class members.

119 ↗(On Diff #76449)

VecSize --> SubVecSize would be clearer

Farhana updated this revision to Diff 76517.Oct 31 2016, 8:11 PM

Thanks Dave, the update addresses your comment.

lib/Target/X86/X86InterleavedAccess.cpp
55 ↗(On Diff #76449)

So the intent is to use it for breaking down any kind of instruction such as load, shuffle. Currently, load is only supported. Also there might be some other challenges where we were not able to create the dummy vectors and break-down the instruction evenly.

68 ↗(On Diff #76449)

Yes.

116 ↗(On Diff #76449)

So the intent is to use this function in a general way and use it to break down any kind of instruction. Currently, it is used for load instruction, but it will also be used to break down the long shuffle instruction in the strided-store pattern.

RKSimon edited edge metadata.Nov 8 2016, 11:09 PM

Please can you replace the uses of uint32_t with unsigned?

Please can you replace the uses of uint32_t with unsigned?

Hi Simon,

I would think uint32_t would be preferred over unsigned because unsigned could lead to an incorrect result. Though I agree it will be safe to use unsigned here.

But in general is there a reason for wanting the size to change with the underlying architecture?

Farhana updated this revision to Diff 77381.Nov 9 2016, 11:04 AM
Farhana edited edge metadata.

Replaces uint32_t with unsigned.

delena added inline comments.Nov 9 2016, 7:35 PM
lib/Target/X86/X86InterleavedAccess.cpp
34 ↗(On Diff #77381)

const LoadInst* Inst;

50 ↗(On Diff #77381)

Do you need to keep Builder inside? I assume it may be a local var inside function.

81 ↗(On Diff #77381)

const Instruction *I

216 ↗(On Diff #77381)

Why do you need to pass an empty builder here?

220 ↗(On Diff #77381)

return Grp.isSupported() && Grp.lowerIntoOptimizedSequence();

Farhana updated this revision to Diff 77569.Nov 10 2016, 4:11 PM

This change-set addresses Elena's comments.

lib/Target/X86/X86InterleavedAccess.cpp
34 ↗(On Diff #77381)

Inst cannot be a constant pointer because we are bit-casting its pointer operand. Inst can be a both load/store, that's why I declared it as an Instruction instead of LoadInst.

50 ↗(On Diff #77381)

Yes, because Builder is used in different member functions and the different member functions might not have any idea about the insertion point. Having it as a member variable helps updating the insertion point automatically after each new instruction creation.

81 ↗(On Diff #77381)

I think it's covered by my previous comment.

216 ↗(On Diff #77381)

We are using this builder with insertionpoint set to load instruction in multiple member functions where the member functions have no idea about the central insertion point. Therefore, we need to have it as a member variable set to the central insertion point for the other member functions.

delena added inline comments.Nov 10 2016, 10:57 PM
lib/Target/X86/X86InterleavedAccess.cpp
127 ↗(On Diff #77569)

It is always "load" or you are thinking about any general case? You can call the function decomposeLoad.

141 ↗(On Diff #77569)

Zvi is right. And fix indentation, please.

34 ↗(On Diff #77381)

so "Instruction* Inst" is enough. The "const" is redundant here.

Farhana added inline comments.Nov 11 2016, 7:15 AM
lib/Target/X86/X86InterleavedAccess.cpp
127 ↗(On Diff #77569)

Yes, that's the plan to use it in a general way, for decomposing any kind of instructions such as load, store, shuffle.

141 ↗(On Diff #77569)

"And fix indentation, please"

Instruction *NewLoad =
      Builder.CreateAlignedLoad(NewBasePtr, LI->getAlignment());

Are you talking about this? Indentation is not off here. I reran clang-format, everything remained as it is.

69 ↗(On Diff #75878)

Sorry Zvi, I missed this comment earlier. Yes, that's correct.

Please can you replace the uses of uint32_t with unsigned?

I would think uint32_t would be preferred over unsigned because unsigned could lead to an incorrect result. Though I agree it will be safe to use unsigned here.

But in general is there a reason for wanting the size to change with the underlying architecture?

In general we should match whatever the original return values was - Type::getVectorNumElements() returns unsigned so the NumSubVectors should probably match that.

I'm sorry but I messed up on the types of a couple of the other instances - CreateShuffleVector should take an ArrayRef<uint32_t> and DataLayout::getTypeSizeInBits returns a uint64_t

lib/Target/X86/X86InterleavedAccess.cpp
196 ↗(On Diff #77569)

for (unsigned i = 0, e = Shuffles.size(); i != e ; ++i)

Farhana updated this revision to Diff 77650.Nov 11 2016, 12:17 PM

Hi Guys,

Is there anything else you want me to fix/clarify?

Farhana

zvi edited edge metadata.Nov 21 2016, 11:21 AM

LGTM. Thanks, Farhana!

X86InterleavedAccess.cpp
71 ↗(On Diff #77650)

Minor comment: AFAIU you can pass an ArrayRef object by value efficiently, but i did see several occurrences in LLVM's source code with ArrayRef's passed by reference. Whatever you choose, maybe make this entire file consistent (line 82, for example is inconsistent with line 71).
I see that the overriden methods such as TargetLowering::lowerInterleavedLoad pass ArrayRef objects by value, so maybe this would be a tie-breaker for following this convention?

Farhana updated this revision to Diff 78912.Nov 22 2016, 11:53 AM
Farhana edited edge metadata.

ArrayRefs are passed by value.

zvi added a comment.Nov 22 2016, 11:57 AM

Thanks, Farhana! LGTM.

RKSimon accepted this revision.Nov 23 2016, 4:04 AM
RKSimon edited edge metadata.

LGTM

This revision is now accepted and ready to land.Nov 23 2016, 4:04 AM
DavidKreitzer edited edge metadata.Nov 23 2016, 9:11 AM

Hi Farhana,
I have no further comments. This LGTM too.
-Dave

This revision was automatically updated to reflect the committed changes.