This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Custom legalize splat_vector and disable unprofitable generic DAG combine
Changes PlannedPublic

Authored by reames on Sep 16 2022, 12:03 PM.

Details

Summary

The motivation of this patch is reduce the number of unique ways we handle splats in the RISCV backend. Before this, we would expand the splat vector into a build_vector for generic IR, but for intrinsics we'd frequently end up emitting a splat_vector during lowering and rely on later legalization. This meant that depending on the exact test case you looked at, very similar splats could take divergent paths during ISEL.

This change includes effectively a revert of D120328. This transformation is not generally profitable as it looses the information about the AVL of the vector being inserted. The result of this is that splats which could be done at a narrow VL, are instead done at VLMAX. Given splats are generally scheduled close to their consuming instruction, this results in widespread regressions on RISCV if splat_vectors make it into dag combine. (i.e. we end up needing to toggle VL repeatedly)

If desired, I could move the removed code into a target DAG for aarch64, but before doing that, I was hoping someone would have an idea on how to solve the generic problem in a profitable way. :) I don't know quite enough about AArch64 to know what's needed there, so pointers in the right direction are appreciated.

Diff Detail

Event Timeline

reames created this revision.Sep 16 2022, 12:03 PM
reames requested review of this revision.Sep 16 2022, 12:03 PM
Herald added a project: Restricted Project. · View Herald TranscriptSep 16 2022, 12:03 PM
craig.topper added inline comments.Sep 16 2022, 12:09 PM
llvm/lib/Target/RISCV/RISCVISelLowering.cpp
812

Does this do anything? RISCVTargetLowering::LowerOperation returns SDValue() for SPLAT_VECTOR unless it has i1 type.

Are you able to write tests that show the value of removing the combine for riscv? because the current ones don't highlight much. In general it seems good to simplify splat operations as subvector operations get in the way of other combines as well as isel (things like matching immediate operands).

From an AArch64 specific point of view these extracts are normally removed after operation legalisation (where for AArch64/SVE fixed length vectors are lowered to scalable vectors) when you end up with extract/insert subvector pairs. The problematic case is i1 fixed length vectors which are not type legal and hence the combine is necessary to catch those cases.

I've no objection to making this combine target specific but it does seem like something that is generically good and without seeing the exact problem it is hard to suggest an alternative.

reames planned changes to this revision.Oct 3 2022, 8:55 AM

I need to better explain this.

Matt added a subscriber: Matt.Oct 5 2022, 9:56 AM