This is an archive of the discontinued LLVM Phabricator instance.

[VectorUtils] move x86's scaleShuffleMask to generic VectorUtils
ClosedPublic

Authored by spatel on Mar 20 2020, 9:01 AM.

Details

Summary

We have some long-standing missing shuffle optimizations that could use this transform via VectorCombine now:
https://bugs.llvm.org/show_bug.cgi?id=35454
(and we still don't get that case in the backend either)

This function is apparently templated because there's existing code in IR that treats mask values as unsigned and backend code that treats masks values as signed?

The mask values are not endian-dependent IIUC.

Diff Detail

Event Timeline

spatel created this revision.Mar 20 2020, 9:01 AM
lebedev.ri requested changes to this revision.Mar 20 2020, 9:50 AM
lebedev.ri added a reviewer: t.p.northover.

I'm not convinced this endian-agnostic, but i agree this is correct for little-endian.

This revision now requires changes to proceed.Mar 20 2020, 9:50 AM

Whether a given transform is endian-agnostic depends on the specific transform. If you're talking about reordering a bitcast and a shuffle, in particular, I'm pretty sure that's endian-agnostic. Trivial example:

%shufflefirst = shufflevector <1 x i64> %x, <1 x i64> %x, <1 x i32> <i32 0>
%z = bitcast <1 x i64> %shufflefirst to <2 x i32>

vs.

%bitcastfirst = bitcast <1 x i64> %x to <2 x i32>
%result = shufflevector <2 x i32> %bitcastfirst, <1 x i32> %bitcastfirst, <2 x i32> <i32 0, i32 1>

The shuffle is a no-op; an identity shuffle is spelled the same way on big-endian and little-endian targets.

This shouldn't be endian specific - DAGCombiner::visitVECTOR_SHUFFLE already does SHUFFLE(BITCAST,UNDEF)->BITCAST(SHUFFLE) with its own almost-identical version of ScaleShuffleMask.

This shouldn't be endian specific - DAGCombiner::visitVECTOR_SHUFFLE already does SHUFFLE(BITCAST,UNDEF)->BITCAST(SHUFFLE) with its own almost-identical version of ScaleShuffleMask.

Thanks - I knew that we had this somewhere else, but I missed that. I can remove that duplication as part of this patch if that improves the motivation.

spatel updated this revision to Diff 251894.Mar 22 2020, 8:59 AM

Patch updated:
Replace duplicate shuffle mask scaling from DAGCombiner with calls to the generic util.

lebedev.ri accepted this revision.Mar 22 2020, 10:23 AM

Thanks, that helped convince me of this not being endian-specific.
LG

llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp
19759–19760 ↗(On Diff #251894)

Do we want to preserve fastpath?

19765 ↗(On Diff #251894)

I'd personally keep ternary variant, not have two loops.

This revision is now accepted and ready to land.Mar 22 2020, 10:23 AM
RKSimon accepted this revision.Mar 22 2020, 10:46 AM

LGTM

llvm/include/llvm/Analysis/VectorUtils.h
346

Retaining a fast copy path for Scale == 1 would make sense

I'll check this in as-is and follow-up with the implementation improvements.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptMar 23 2020, 7:04 AM