This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Cost model for scalable reductions
ClosedPublic

Authored by reames on Jun 9 2022, 3:10 PM.

Details

Summary

This extends the existing cost model for reductions to cover scalable vectors.

The existing cost model assumes that reductions are roughly logarithmic in cost for unordered variants and linear for ordered ones. This change keeps that same basic model, and extends it out to the maximum number of elements a scalable vector could possibly have.

This results in costs which aren't terribly high for unordered reductions, but are for ordered ones. This seems about right; we want to strongly bias away from using scalable ordered reductions if the cost might be linear in VL.
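As an illustration of that shape, here is a minimal C++ sketch; the helper names and the way the worst-case element count is obtained are assumptions for illustration, not code from this patch:

#include "llvm/Support/MathExtras.h"

// Hypothetical helper: worst-case element count of a scalable vector, i.e.
// the known-minimum element count scaled by the maximum possible vscale.
static unsigned worstCaseElementCount(unsigned MinNumElts, unsigned MaxVScale) {
  return MinNumElts * MaxVScale;
}

// Unordered reductions: roughly one halving step per log2 of the element count.
static unsigned unorderedReductionCost(unsigned MinNumElts, unsigned MaxVScale) {
  return llvm::Log2_32_Ceil(worstCaseElementCount(MinNumElts, MaxVScale));
}

// Ordered reductions: potentially one step per element, so linear in the
// worst-case vector length.
static unsigned orderedReductionCost(unsigned MinNumElts, unsigned MaxVScale) {
  return worstCaseElementCount(MinNumElts, MaxVScale);
}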

Diff Detail

Event Timeline

reames created this revision. Jun 9 2022, 3:10 PM
Herald added a project: Restricted Project. Jun 9 2022, 3:10 PM
reames requested review of this revision. Jun 9 2022, 3:10 PM

The existing cost model's decision to incorporate vector length into the cost seems quite odd to me.

Are you referring to the Log2_32_Ceil(VL)? If you don't build some optimized ALU, you need to perform log2(VL) stages to reduce to a single value.
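A quick self-contained illustration of that log2(VL) argument (not from the patch): a pairwise tree reduction halves the number of live partial results each stage, so e.g. 16 elements take 4 stages.

// Counts the stages of a pairwise tree reduction: each stage halves the
// number of partial results, so the answer is ceil(log2(VL)).
unsigned treeReductionStages(unsigned VL) {
  unsigned Stages = 0;
  while (VL > 1) {
    VL = (VL + 1) / 2; // combine pairs; odd element counts round up
    ++Stages;
  }
  return Stages; // treeReductionStages(16) == 4
}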

linear cost on ordered reductions makes little sense to me.

An ordered reduction should be linear across the vector, and it also needs to be linear across every vector if we need to split. So it should be (LT.first * NumElements) for reducing each vector, including the additional add for the scalar input, plus LT.first-1 scalar adds to join the reductions. I guess those could be chained using the scalar input to our instructions, so maybe that is already costed. Then some base overhead.
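A rough sketch of the cost outlined in that comment, with illustrative names (NumLegalizedParts standing in for LT.first, i.e. the number of registers the type legalizes into); this is one interpretation of the comment, not code from the patch:

// Interpretation of the ordered-reduction cost sketched above; names are
// illustrative, not LLVM API.
unsigned orderedReductionCostSketch(unsigned NumLegalizedParts, // LT.first
                                    unsigned NumEltsPerPart,
                                    unsigned BaseOverhead) {
  // One in-order step per element, repeated for every legalized part,
  // including the add that folds in the scalar start value.
  unsigned PerVector = NumLegalizedParts * NumEltsPerPart;
  // Scalar adds joining the per-part results; these may already be covered
  // if each reduction chains the previous result through its scalar operand.
  unsigned JoinAdds = NumLegalizedParts - 1;
  return PerVector + JoinAdds + BaseOverhead;
}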

reames added a comment. Jun 9 2022, 4:24 PM

@craig.topper You seem to be assuming a much different implementation than I was. I was thinking of these as likely being constant-time operations.

For your model, I think we'd have to be conservative for scalable vectors and use the maximum possible VL. That would fit naturally in the structure, but would strongly bias against using them. Should I switch to that?

I suspect we're also going to need a processor feature flag here at some point, but we can defer that. :)

reames planned changes to this revision. Jun 10 2022, 2:36 PM

For your model, I think we'd have to be conservative for scalable vectors and use the maximum possible VL. That would fit naturally in the structure, but would strongly bias against using them. Should I switch to that?

https://reviews.llvm.org/D127541 uses this approach for scatter/gather. I'll let that converge, and then return to this with that model in mind.

reames updated this revision to Diff 437742. Jun 16 2022, 3:52 PM

Rework with the max-vl based approach we used in scatter/gather.

reames edited the summary of this revision. Jun 16 2022, 3:55 PM

Don't we need to qualify Opcode?

In D126372, I tried to implement preliminary support, referring to how other architectures limit certain opcodes.

Isn't the opcode checked on line 356?

if (ISD != ISD::ADD && ISD != ISD::OR && ISD != ISD::XOR && ISD != ISD::AND &&
      ISD != ISD::FADD)

I see. Thanks
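For context, a self-contained illustration of the qualification step being discussed (names are illustrative, not LLVM's API): the IR opcode is first mapped to a generic node kind, and only a small handled set receives the custom RISC-V reduction cost, with everything else falling back to the base implementation.

enum class NodeKind { Add, Or, Xor, And, FAdd, Other };

// Mirrors the check quoted above: only these node kinds get the custom
// reduction cost; anything else falls back to the default cost model.
bool hasCustomReductionCost(NodeKind K) {
  switch (K) {
  case NodeKind::Add:
  case NodeKind::Or:
  case NodeKind::Xor:
  case NodeKind::And:
  case NodeKind::FAdd:
    return true;
  default:
    return false;
  }
}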

This revision is now accepted and ready to land. Jun 27 2022, 9:37 AM
This revision was landed with ongoing or failed builds. Jun 27 2022, 12:44 PM
This revision was automatically updated to reflect the committed changes.
Miss_Grape added inline comments.
llvm/test/Analysis/CostModel/RISCV/reduce-scalable-fp.ll:9

I would like to know how the value of the cost model is calculated. Is it related to the cycles set during scheduling?
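As background, the values printed by these CostModel tests come from the TargetTransformInfo hooks (here, the formula described in the summary), not from the scheduler's cycle counts. A purely illustrative arithmetic check, under assumed numbers (a <vscale x 4 x float> operand and a maximum vscale of 16, both hypothetical, not taken from the test file):

#include <cassert>
#include <cmath>

int main() {
  unsigned MinNumElts = 4;  // e.g. <vscale x 4 x float> (assumed example)
  unsigned MaxVScale = 16;  // assumed target-specific upper bound on vscale
  unsigned MaxVL = MinNumElts * MaxVScale;                             // 64
  unsigned Unordered = (unsigned)std::ceil(std::log2((double)MaxVL));  // 6
  unsigned Ordered = MaxVL;                                            // 64
  assert(Unordered == 6 && Ordered == 64);
  (void)Unordered; (void)Ordered; // silence unused warnings in release builds
  return 0;
}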