This is an archive of the discontinued LLVM Phabricator instance.

[X86] Add Extend shuffle pattern to vNf32 shuffles.
ClosedPublic

Authored by goldstein.w.n on Feb 10 2023, 3:31 PM.

Details

Summary

There are some cases where its useful for float types, not quite as
hot as in the integer case, but still better than alternatives.

Diff Detail

Event Timeline

goldstein.w.n created this revision.Feb 10 2023, 3:31 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 10 2023, 3:31 PM
goldstein.w.n requested review of this revision.Feb 10 2023, 3:31 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 10 2023, 3:31 PM
RKSimon added inline comments.Feb 11 2023, 3:27 AM
llvm/test/CodeGen/X86/merge-consecutive-loads-512.ll
310 ↗(On Diff #496614)

Hmm - shuffle combining should already catch this - any idea why it doesn't?

goldstein.w.n added inline comments.Feb 11 2023, 9:13 AM
llvm/test/CodeGen/X86/merge-consecutive-loads-512.ll
310 ↗(On Diff #496614)

Yeah, for the float types it is beneficial to do lowerShuffleAsBlend before lowerShuffleAsZeroOrAnyExtend (so added when added lowerShuffleAsZeroOrAnyExtend), lowerShuffleAsBlend is what implements the the logic-op shuffles. I should/will update commit message to reflect that.

Split blend changes

goldstein.w.n added inline comments.
llvm/test/CodeGen/X86/merge-consecutive-loads-512.ll
310 ↗(On Diff #496614)

Moved the blend stuff to a new seperate patch: D143856

RKSimon added inline comments.Feb 20 2023, 6:45 PM
llvm/lib/Target/X86/X86ISelLowering.cpp
14326

Don't bother with the if() test - getBitcast will do that internally anyway:

InputV = DAG.getBitcast(VT, InputV);

14333–14334

Drop the if() and always call getBitcast

17002

We've had problems before where we've created nodes before bailing out later on - causing various problems including hasOneUse assumption failures later on.

Is there any nice way that we can do the SimpleOnly checks BEFORE calling HalfBlend? Or maybe split into an analysis lambda and a getVectorShuffle lambda.

18417

Pull this out into a helper function?

18428

return DAG.getBitcast(MVT::v8f32, ZExt);

19154

return DAG.getBitcast(MVT::v16f32, ZExt);

goldstein.w.n marked 6 inline comments as done.Feb 20 2023, 7:51 PM

Various fixups

RKSimon added inline comments.Feb 21 2023, 3:11 PM
llvm/lib/Target/X86/X86ISelLowering.cpp
17206

SmallVectorImpl<int> &InLaneMask

goldstein.w.n added inline comments.Feb 21 2023, 4:42 PM
llvm/lib/Target/X86/X86ISelLowering.cpp
17206

Any chance we can keep it as a pointer? imo output references are confusing as hell.

goldstein.w.n marked an inline comment as done.Feb 21 2023, 11:34 PM

Use reference instead of pointer and default smallvec size

RKSimon added inline comments.Feb 22 2023, 4:13 AM
llvm/lib/Target/X86/X86ISelLowering.cpp
16939

SimpleOnly is guaranteed to be true here

Maybe just return !(UseHiV1 || UseHiV2);

17208

Don't hardcode the LaneSize calculation - it prevents us using for other vector sizes in the future.

17209

InLaneMask.assign(Mask.begin(), Mask.end());

18407

Move if(!Subtarget.hasAVX2()) up here to avoid unnecessary computeInLaneShuffleMask calls

goldstein.w.n marked 4 inline comments as done.Feb 22 2023, 9:12 AM

Fix some nits

RKSimon accepted this revision.Feb 22 2023, 2:50 PM

LGTM - cheers

This revision is now accepted and ready to land.Feb 22 2023, 2:50 PM
This revision was landed with ongoing or failed builds.Feb 24 2023, 1:22 PM
This revision was automatically updated to reflect the committed changes.