This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Cost illegally typed insert/extract elements correctly
AbandonedPublic

Authored by reames on Aug 30 2023, 11:28 AM.

Details

Summary

We had been deferring to BasicTTI, and BasicTTI doesn't handle this correctly. It blindly returns a cost of 1 for each extract and insert. The net effect of this was that illegal build and explode vectors looked oddly cheap compared to their legal sub types.

Noticed this while playing around with SLP and -riscv-v-fixed-length-vector-lmul-max, but I don't believe this impacts default SLP codegen.

This probably should be sunk into BasicTTI. I even intend to try to do so after this lands, I just wanted to get the target specific test changes in first. The sinking into BasicTTI will be easier if it's NFC.

Diff Detail

Event Timeline

reames created this revision.Aug 30 2023, 11:28 AM
reames requested review of this revision.Aug 30 2023, 11:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 30 2023, 11:28 AM
luke added a comment.Aug 31 2023, 4:28 AM

Makes sense to me: this is to mirror the codegen in DAGTypeLegalizer ::SplitVecRes_INSERT_VECTOR_ELT and DAGTypeLegalizer::SplitVecOp_EXTRACT_VECTOR_ELT?

asb accepted this revision.Sep 5 2023, 3:18 AM

Weak LGTM just because I haven't done much in this area. Reasoning and code seem correct to me though.

This revision is now accepted and ready to land.Sep 5 2023, 3:18 AM
reames abandoned this revision.Sep 11 2023, 12:09 PM

Abandoning this. Looks like I never updated this previously, so let me explain why.

I'd thought this patch was going to be straight forward, but applying it to x264 showed regressions on several key routines. When looking into it, I stumbled into the non-linearity issue with build and explode vectors. That is now fixed (by falling back to the stack). Given those fixes, the cost model here is wrong. Specifically, the cost of the extracts and inserts end up being effective constant time when each lane is acted upon. As such, the costs proposed by this patch no longer apply for build and explode vector idioms, which are the dominant usage of this API.

We probably should separate out a costing API for build and explode separately from the individual extract and insert instructions. Maybe once that's done, we can come back and revisit this idea once it really only applies to non-build/explode usage of these instructions.