This is an archive of the discontinued LLVM Phabricator instance.

[X86][Costmodel] Load/store i32 Stride=3 VF=32 interleaving costs
Closed, Public

Authored by lebedev.ri on Oct 16 2021, 10:02 AM.

Details

Summary

A few more tuples are being queried after D111546. It might be good to model them.
They all require a lot of manual assembly surgery.
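
For context, a scalar sketch of what a "Stride=3 VF=32" i32 tuple describes: an interleaved load reads 3 * 32 = 96 contiguous i32 values and splits them into 3 vectors of 32 elements, and an interleaved store does the inverse. The function names `deinterleave` and `interleave` below are hypothetical and are not part of the patch; the vectorized x86 lowering does this with shuffles rather than a scalar loop.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

constexpr std::size_t Stride = 3; // interleave factor, from the revision title
constexpr std::size_t VF = 32;    // vectorization factor, from the revision title

using WideVec = std::array<std::int32_t, Stride * VF>;
using LaneVecs = std::array<std::array<std::int32_t, VF>, Stride>;

// Scalar reference for the interleaved load: member M of the group
// receives elements M, M + Stride, M + 2 * Stride, ... of the wide vector.
LaneVecs deinterleave(const WideVec &Wide) {
  LaneVecs Lanes{};
  for (std::size_t I = 0; I < VF; ++I)
    for (std::size_t M = 0; M < Stride; ++M)
      Lanes[M][I] = Wide[I * Stride + M];
  return Lanes;
}

// Scalar reference for the interleaved store: the exact inverse mapping.
WideVec interleave(const LaneVecs &Lanes) {
  WideVec Wide{};
  for (std::size_t I = 0; I < VF; ++I)
    for (std::size_t M = 0; M < Stride; ++M)
      Wide[I * Stride + M] = Lanes[M][I];
  return Wide;
}
```

The cost being modeled is the cost of the shuffle sequence that replaces these loops for the whole group.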

The only sched models for CPUs that support AVX2
but not AVX512 are: haswell, broadwell, skylake, zen1-3.

For load we have:
https://godbolt.org/z/s5b6E6jsP - for Intel CPUs, Block RThroughput: <=32.0; for Ryzens, Block RThroughput: <=24.0
So we could pick a cost of 32.

For store we have:
https://godbolt.org/z/efh99d93b - for Intel CPUs, Block RThroughput: <=48.0; for Ryzens, Block RThroughput: <=32.0
So we could pick a cost of 48.
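
One way to read the two choices above is as a worst-case rule: take the largest Block RThroughput upper bound across the CPU families considered. The sketch below only restates that arithmetic with the numbers quoted in this summary. The `ThroughputBounds` struct and the `pickCost` helper are hypothetical names used for illustration, and the rule itself is an assumption about the reasoning, not code from the patch.

```cpp
#include <algorithm>

// Upper bounds on Block RThroughput per CPU family, as quoted in the
// summary. (Hypothetical struct, for illustration only.)
struct ThroughputBounds {
  unsigned Intel; // haswell, broadwell, skylake
  unsigned Ryzen; // zen1-3
};

constexpr ThroughputBounds LoadBounds{32, 24};  // stride-3 VF=32 i32 load
constexpr ThroughputBounds StoreBounds{48, 32}; // stride-3 VF=32 i32 store

// Assumed rule: use the worst case across families, so the cost is not
// underestimated on any of the modeled CPUs.
unsigned pickCost(const ThroughputBounds &B) {
  return std::max(B.Intel, B.Ryzen);
}
```

Under this reading, `pickCost(LoadBounds)` gives 32 and `pickCost(StoreBounds)` gives 48, matching the values proposed above.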

I'm directly using the shuffling asm that llc produced,
without any manual fixups that may be needed
to ensure sequential execution.

Event Timeline

lebedev.ri created this revision. Oct 16 2021, 10:02 AM

Actually upload the right patch.

lebedev.ri edited the summary of this revision. Oct 16 2021, 11:23 AM

@lebedev.ri I've created D111960 to see if we can reduce these costs a little - given how manual their testing is, it'd be great to ensure the codegen is reasonable before you do any more work on them.

Sure, but for these strides these patches only fill in the few missing pieces.
Once codegen improves we may have to go back and update the costs anyway,
and a few more entries to update won't really change that.
The next big batch (stride=5/7/8) can indeed wait a bit for better codegen.

After these, hopefully D111460 lands, and then I think I'd like to decide on discounting not-fully-interleaved loads before finishing with the missing strides.

RKSimon accepted this revision. Oct 17 2021, 7:00 AM

LGTM

This revision is now accepted and ready to land. Oct 17 2021, 7:00 AM

LGTM

Thank you for the reviews!