This is an archive of the discontinued LLVM Phabricator instance.

[X86] Schedule-model second (mask) output of GATHER instruction
ClosedPublic

Authored by lebedev.ri on Jun 13 2021, 2:20 PM.

Details

Summary

Much like mulx's WriteIMulH, there are two outputs of AVX2 GATHER instructions.
This was changed back in rL160110, but the sched model change wasn't present.
So right now, for sched models that are marked as complete (znver3 only now),
codegen'ning GATHER results in a crash:

DefIdx 1 exceeds machine model writes for early-clobber renamable $ymm3, dead early-clobber renamable $ymm2 = VPGATHERDDYrm killed renamable $ymm3(tied-def 0), undef renamable $rax, 4, renamable $ymm0, 0, $noreg, killed renamable $ymm2(tied-def 1) :: (load 32, align 1)

https://godbolt.org/z/Ks7zW7WGh

I'm guessing we need to deal with this like we deal with WriteIMulH?

Diff Detail

Event Timeline

lebedev.ri created this revision.Jun 13 2021, 2:20 PM
lebedev.ri requested review of this revision.Jun 13 2021, 2:20 PM

What about the avx512 gather ops - are they OK?

llvm/lib/Target/X86/X86Schedule.td
128

Maybe call this WriteVecMaskedGather to more closely match the other vector load/store classes? Maybe move it to be with the WriteVecMaskedLoad defs as well for clarity.

I'm assuming we don't need separate float/integer classes for this?

@RKSimon sanity check: i'm not adding shed class to simplify modelling of GATHER instruction scheduling.
I'm only trying to fix a sched model correctness check.

Fix AVX512 gather too.

RKSimon added inline comments.Jun 14 2021, 5:47 AM
llvm/lib/Target/X86/X86SchedBroadwell.td
210

Why does it match a scalar load latency and not a vector load latency?

lebedev.ri added inline comments.Jun 14 2021, 5:53 AM
llvm/lib/Target/X86/X86SchedBroadwell.td
210

As you can see in llvm/lib/Target/X86/X86InstrAVX512.td/llvm/lib/Target/X86/X86InstrSSE.td changes,
for gather, WriteLoad is used for some reason. This simply mimics that.
I do guess it should be WriteVecLoadX/WriteVecLoadY, but that seems like a separate change?

Matt added a subscriber: Matt.Jun 14 2021, 12:41 PM
RKSimon accepted this revision.Jun 15 2021, 1:47 AM

LGTM with one (optional) rename minor - cheers

llvm/lib/Target/X86/X86Schedule.td
128

OK, very pedantic but maybe call this WriteVecMaskedGatherWriteback so that it matches the naming convention for vector loads and will match if we add masked gather classes in the future (although tbh they are so diverse in characteristic that it might be easier to just use customs).

This revision is now accepted and ready to land.Jun 15 2021, 1:47 AM

@RKSimon thank you for the review!

llvm/lib/Target/X86/X86Schedule.td
128

Yep, thank you for the suggestion, i didn't like the name i came up anyways.

GGanesh added a subscriber: GGanesh.Jul 6 2021, 4:06 AM