This is an archive of the discontinued LLVM Phabricator instance.

[X86][LV] X86 does *not* prefer vectorized addressing
ClosedPublic

Authored by lebedev.ri on Oct 11 2021, 7:06 AM.

Details

Summary

And another attempt to start untangling this ball of threads around gather.
There's TTI::prefersVectorizedAddressing()hoop, which confusingly defaults to true,
which tells LV to try to vectorize the addresses that lead to loads,
but X86 generally can not deal with vectors of addresses,
the only instructions that support that are GATHER/SCATTER,
but even those aren't available until AVX2, and aren't really usable until AVX512.

This specializes the hook for X86, to return true only if we have AVX512 or AVX2 w/ fast gather.

Diff Detail

Event Timeline

lebedev.ri created this revision.Oct 11 2021, 7:06 AM
lebedev.ri requested review of this revision.Oct 11 2021, 7:06 AM
RKSimon added inline comments.Oct 13 2021, 12:15 PM
llvm/test/Analysis/CostModel/X86/interleaved-store-i8-stride-2.ll
43 ↗(On Diff #378650)

pre-commit fix?

llvm/test/Transforms/LoopVectorize/X86/cost-model.ll
2

pre-commit the regeneration with the update script?

lebedev.ri marked 2 inline comments as done.

@RKSimon thank you for taking a look!
Rebased, addressed review notes.

RKSimon added inline comments.Oct 13 2021, 1:13 PM
llvm/lib/Target/X86/X86TargetTransformInfo.h
245

How come none of these are const methods?

248

const?

lebedev.ri marked 2 inline comments as done.Oct 13 2021, 1:20 PM
lebedev.ri added inline comments.
llvm/lib/Target/X86/X86TargetTransformInfo.h
244

Whoops, i did *not* intend to add LSRWithInstrQueries.

Addressed review notes.
I don't recall specifically why i didn't mark them as const initially.

I think this makes sense.

@pengfei do you want to check this on Intel's benchmarks?

I think this makes sense.

@pengfei do you want to check this on Intel's benchmarks?

Yes, I can check our internal benchmark. I guess this only affect non AVX512 targets, right?

llvm/lib/Target/X86/X86TargetTransformInfo.cpp
5015–5017

Do we need to override here? I think it returns false by default.

lebedev.ri marked an inline comment as done.Oct 15 2021, 12:29 AM

I think this makes sense.

@pengfei do you want to check this on Intel's benchmarks?

Yes, I can check our internal benchmark.

Thank you.
I'm not sure how this measures standalone,
my main hope this unblocks D111460.

I guess this only affect non AVX512 targets, right?

Yes.

llvm/lib/Target/X86/X86TargetTransformInfo.cpp
5015–5017

I think we want to be explicit here.

I think this makes sense.

@pengfei do you want to check this on Intel's benchmarks?

Yes, I can check our internal benchmark.

Thank you.
I'm not sure how this measures standalone,
my main hope this unblocks D111460.

Our internal benchmarks are not impacted by this patch. So I don't see evidences it can unblock D111460.

lebedev.ri marked an inline comment as done.Oct 16 2021, 12:29 AM

I think this makes sense.

@pengfei do you want to check this on Intel's benchmarks?

Yes, I can check our internal benchmark.

Thank you.
I'm not sure how this measures standalone,
my main hope this unblocks D111460.

Our internal benchmarks are not impacted by this patch.

Cool! Thanks for checking.
Does anyone want to formally stamp this patch?

So I don't see evidences it can unblock D111460.

RKSimon accepted this revision.Oct 16 2021, 2:17 AM

LGTM - cheers

This revision is now accepted and ready to land.Oct 16 2021, 2:17 AM

LGTM - cheers

@RKSimon @craig.topper @pengfei Thank you for the review!

@RKSimon looks like i may need to go back and add costs for these new VF's :)

This revision was automatically updated to reflect the committed changes.