Page MenuHomePhabricator

[X86][LLVM]Expanding Supports lowerInterleaved{store|load}() in X86InterleavedAccess (VF64 stride 3)
ClosedPublic

Authored by m_zuckerman on Sep 11 2017, 7:02 AM.

Details

Summary

I continue to support different VF interleaved and in this pass for this patch, I added the vf64 stride3 support for both load and store.

Diff Detail

Repository
rL LLVM

Event Timeline

m_zuckerman created this revision.Sep 11 2017, 7:02 AM
m_zuckerman retitled this revision from [X86][LLVM]Expanding Supports lowerInterleavedStore() in X86InterleavedAccess (VF64 stride 3) to [X86][LLVM]Expanding Supports lowerInterleaved{store|load}}() in X86InterleavedAccess (VF64 stride 3) .Sep 11 2017, 7:04 AM
m_zuckerman added a subscriber: llvm-commits.
m_zuckerman retitled this revision from [X86][LLVM]Expanding Supports lowerInterleaved{store|load}}() in X86InterleavedAccess (VF64 stride 3) to [X86][LLVM]Expanding Supports lowerInterleaved{store|load}() in X86InterleavedAccess (VF64 stride 3) .Sep 11 2017, 7:11 AM
m_zuckerman edited the summary of this revision. (Show Details)
zvi added inline comments.Sep 12 2017, 2:13 AM
lib/Target/X86/X86InterleavedAccess.cpp
136 ↗(On Diff #114591)

I see this patch assumes we have support for both loads and stores which is not the case on ToT, so can you please modify the parent patch accordingly so that the path of patches leading from ToT to this patch can be understood?

zvi added inline comments.Sep 28 2017, 8:54 AM
lib/Target/X86/X86InterleavedAccess.cpp
83 ↗(On Diff #116979)

Make these methods static const or even better move them out of the class

186 ↗(On Diff #116979)

To be consistent with the existing code, maybe 'VecWidth' is a better name.

461 ↗(On Diff #116979)

Probably more efficient to use a static array and pass a partial chunk as needed:

static uint32_t Concat[] = {0, 1, ... 63};
makeArrayRef(Concat, 32);

Same for reorder below

479 ↗(On Diff #116979)

If we reach this point, can VecElems be other than 64?

593 ↗(On Diff #116979)

When you get a chance, please make change Mask to an ArrayRef in a separate NFC commit

625 ↗(On Diff #116979)

VPShuf and Vec should be ArrayRef

653 ↗(On Diff #116979)

Wouldn't this be better?

if (VecElems == 32) {
std::copy(Temp, Temp+3, TransposedMatrix);
return;
}
and handle VecElems > 32 here

m_zuckerman marked 6 inline comments as done.
m_zuckerman added inline comments.
lib/Target/X86/X86InterleavedAccess.cpp
479 ↗(On Diff #116979)

In theory, yes, but I block all the other cases with the if supported stage so in practice, the answer is no. Do you prefer to add an assertion?

zvi accepted this revision.Oct 1 2017, 12:54 AM

LGTM, and yes please add the assert

lib/Target/X86/X86InterleavedAccess.cpp
479 ↗(On Diff #116979)

I think we should since the path > 64 is not tested

This revision is now accepted and ready to land.Oct 1 2017, 12:54 AM
This revision was automatically updated to reflect the committed changes.