This is an archive of the discontinued LLVM Phabricator instance.

[CostModel][X86][AArch64] Adjust cost of the scalarization part of min/max reduction.
ClosedPublic

Authored by craig.topper on Dec 8 2018, 1:55 PM.

Details

Summary

The comment says we need 3 extracts and a select at the end. But didn't we already account for the select in the vector cost above? Aren't we just extracting the single element after taking the min/max in the vector register?
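The arithmetic behind this question can be sketched with a toy cost model. This is a minimal illustration, not the code under review: the `UnitCosts` struct, the function names, and the unit cost values are all hypothetical, and it assumes a power-of-two element count where each halving step costs one shuffle plus one vector compare+select.

```cpp
#include <cassert>

// Hypothetical unit costs; real values would come from the target's cost tables.
struct UnitCosts {
  unsigned Shuffle;   // one vector shuffle
  unsigned CmpSel;    // one vector compare + select pair (one min/max step)
  unsigned Extract;   // one extractelement
  unsigned ScalarSel; // one scalar select
};

// Number of halving steps needed for a power-of-two element count.
unsigned reductionLevels(unsigned NumElts) {
  assert(NumElts >= 2 && (NumElts & (NumElts - 1)) == 0 &&
         "expected a power-of-two element count");
  unsigned Levels = 0;
  while (NumElts > 1) {
    NumElts /= 2;
    ++Levels;
  }
  return Levels;
}

// Vector part (assumption): every level is one shuffle plus one compare+select.
unsigned vectorPartCost(unsigned NumElts, const UnitCosts &C) {
  return reductionLevels(NumElts) * (C.Shuffle + C.CmpSel);
}

// Tail as the old comment describes it: 3 extracts plus a scalar select.
unsigned oldTailCost(const UnitCosts &C) {
  return 3 * C.Extract + C.ScalarSel;
}

// Tail as the summary proposes: a single extract of the result lane.
unsigned newTailCost(const UnitCosts &C) { return C.Extract; }

// Total cost of the reduction under either tail model.
unsigned totalCost(unsigned NumElts, const UnitCosts &C, bool UseOldTail) {
  return vectorPartCost(NumElts, C) +
         (UseOldTail ? oldTailCost(C) : newTailCost(C));
}
```

With the made-up costs `UnitCosts{1, 2, 1, 1}` and four elements, the vector part costs 6 in both models, while the tail drops from 4 to 1. The old model overcharges because the select it adds at the end is already covered by the last vector compare+select.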

Diff Detail

Repository
rL LLVM

Event Timeline

craig.topper created this revision. Dec 8 2018, 1:55 PM
RKSimon added inline comments. Dec 9 2018, 3:57 AM
test/Transforms/SLPVectorizer/X86/horizontal-minmax.ll
1108 ↗(On Diff #177400)

This codegen looks like the final icmp+selects are being done with the scalars.

craig.topper marked an inline comment as done. Dec 9 2018, 8:17 AM
craig.topper added inline comments.
test/Transforms/SLPVectorizer/X86/horizontal-minmax.ll
1097 ↗(On Diff #177400)

Isn’t this shuffle, together with the vector cmp+sel after it, the last part of a reduction for elements 0 and 1?

RKSimon accepted this revision. Dec 9 2018, 9:29 AM

LGTM

test/Transforms/SLPVectorizer/X86/horizontal-minmax.ll
1097 ↗(On Diff #177400)

Yes, you're right - I misread the test IR. It has to do a <4 x i32> reduction, then a min/max with the remaining 4 scalar elements (2 in the previous block + 2 in this block).
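The shape described here can be modelled in plain C++ to show why only one extract is needed after the vector part. This is a rough sketch rather than the test IR itself: `Vec4`, `minStep`, `reduceMin4`, and `reduceMin8` are hypothetical names, `std::min` stands in for each compare+select pair, and the example uses min where the test may use max.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cstddef>

using Vec4 = std::array<int, 4>;

// One reduction step: combine lane I with lane I + Offset.
// This models a shuffle followed by a vector compare + select.
Vec4 minStep(const Vec4 &V, std::size_t Offset) {
  Vec4 R = V;
  for (std::size_t I = 0; I + Offset < V.size(); ++I)
    R[I] = std::min(V[I], V[I + Offset]);
  return R;
}

// Reduce four lanes. The result ends up in lane 0, so a single
// extract is enough to get it into a scalar.
int reduceMin4(const Vec4 &V) {
  Vec4 Step1 = minStep(V, 2);     // lanes 0 and 1 hold min(v0,v2), min(v1,v3)
  Vec4 Step2 = minStep(Step1, 1); // lane 0 holds the min of all four lanes
  return Step2[0];                // the single extractelement
}

// Fold the four leftover scalars in with scalar compare + select pairs.
int reduceMin8(const Vec4 &V, const std::array<int, 4> &Rest) {
  int Result = reduceMin4(V);
  for (int S : Rest)
    Result = std::min(Result, S);
  return Result;
}
```

The last vector step only matters for lanes 0 and 1, which matches the reading that the final shuffle and vector cmp+sel finish the reduction. The scalar compares that follow belong to the leftover elements, not to the vector reduction.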

This revision is now accepted and ready to land. Dec 9 2018, 9:29 AM
This revision was automatically updated to reflect the committed changes.