This is an archive of the discontinued LLVM Phabricator instance.

[CostModel][X86] getScalarizationOverhead - improve extraction costs for > 128-bit vectors
ClosedPublic

Authored by RKSimon on May 13 2022, 1:53 AM.

Details

Summary

We were using the default getScalarizationOverhead expansion for extraction costs, which adds up all the individual element extraction costs.

This is fine for 128-bit vectors, but for 256/512-bit vectors each element extraction also has to account for extracting the upper 128-bit subvector extraction before it can handle the element. For scalarization costs we only need to extract each demanded subvector once.

Diff Detail

Event Timeline

RKSimon created this revision.May 13 2022, 1:53 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 13 2022, 1:53 AM
RKSimon requested review of this revision.May 13 2022, 1:53 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 13 2022, 1:53 AM
ABataev added inline comments.May 16 2022, 7:47 AM
llvm/lib/Target/X86/X86TargetTransformInfo.cpp
3898

Can we estimate regcopy somehow here?

3905

Maybe use BaseT::getScalarizationOverhead again here rather than getVectorInstrCost(Instruction::ExtractElement? Also, what about possible getExtractWithExtendCost here?

RKSimon added inline comments.May 16 2022, 8:04 AM
llvm/lib/Target/X86/X86TargetTransformInfo.cpp
3898

OK - I can update this to call getShuffleCost(TTI::SK_ExtractSubvector) instead

3905

We can't easily use BaseT::getScalarizationOverhead as we need the index adjustments.

I've been looking at adding getExtractWithExtendCost handling to getScalarizationOverhead but it'll require a minor refactor that we don't need for this cost fix.

RKSimon updated this revision to Diff 431650.May 24 2022, 5:25 AM

Use explicit getShuffleCost(TTI::SK_ExtractSubvector) cost call

RKSimon marked an inline comment as done.May 24 2022, 5:26 AM

@ABataev I think as a patch adding upper subvector extraction support for X86 getScalarizationOverhead this is ready now. I'd prefer to wait to a later patch to add specific getExtractWithExtendCost handling to getScalarizationOverhead once I know exactly how I'm going to implement it in SLP.

ABataev accepted this revision.May 24 2022, 6:41 AM

LG, thanks!

This revision is now accepted and ready to land.May 24 2022, 6:41 AM
This revision was landed with ongoing or failed builds.May 24 2022, 7:18 AM
This revision was automatically updated to reflect the committed changes.