This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Use vmv.v.i for insertion into lane 0 of undef vector when profitable
ClosedPublic

Authored by reames on Dec 8 2022, 10:00 AM.

Details

Summary

If we're initializing lane 0 of an undef vector, we can optionally write to other lanes of the vector. Doing so may require additional work, so we don't want to e.g. always use a splat. However, since we don't have an immediate form of vmv.s.x it's useful to use a vmv.v.i if the work required is expected to be equal in practice.

At the moment, the new utility is only used by one case in INSERT_VECTOR_ELT lowering. My expectation is that we will reuse this in a couple other places, but each of those deserve individual review.

This change is inspired by D137530, but is not directly related to it. I vaguely remember we discussed the tradeoffs of using vmv.v.i in another recent review, but couldn't find it.

Diff Detail

Event Timeline

reames created this revision.Dec 8 2022, 10:00 AM
reames requested review of this revision.Dec 8 2022, 10:00 AM
Herald added a project: Restricted Project. · View Herald TranscriptDec 8 2022, 10:00 AM
Herald added subscribers: eopXD, MaskRay. · View Herald Transcript
craig.topper added inline comments.Dec 8 2022, 11:45 AM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
2840

insrt -> insert

2841

This might not be strictly true on an architecture with data path width less than VLEN. If SEW fits within the datapath length, vmv.x.s could be done in less than cycles than an LMUL=1 vmv.v.i. And LMUL=2 vmv.v.i would be 4x or more cycles than than vmv.x.s in such an architecture.

But I'm not sure how big of an issue that is.

reames added inline comments.Dec 8 2022, 1:10 PM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
2841

While I agree with you in principle, I think we can safely punt this to the future. Our cost modeling for vector operations considers splats linear in LMUL. When we change that, we can change this location as well.

reames updated this revision to Diff 481431.Dec 8 2022, 1:15 PM

Address review comments

llvm/lib/Target/RISCV/RISCVISelLowering.cpp
2838

Apart from craig's concern about μ-arch implementation, LMUL>1 may increase register pressure so I limited it to LMUL<=1 in my patch, though I haven't assessed the impact of this yet :-).

reames updated this revision to Diff 481654.Dec 9 2022, 7:49 AM

Move back to LMUL1. The regalloc effect of LMUL2 isn't a point I'd considered, and we can switch back in a separate patch which independent review if it appears worthwhile.

pcwang-thead accepted this revision.Dec 12 2022, 1:02 AM

LGTM
@craig.topper Any other thoughts?

This revision is now accepted and ready to land.Dec 12 2022, 1:02 AM
This revision was landed with ongoing or failed builds.Dec 13 2022, 8:03 AM
This revision was automatically updated to reflect the committed changes.