This is an archive of the discontinued LLVM Phabricator instance.

[DAGCombine] Fold (store (insert_elt (load p)) x p) -> (store x)
ClosedPublic

Authored by luke on Jun 6 2023, 8:01 AM.

Details

Summary

If we have a store of a load with no other uses in between it, it's
considered dead and is removed. So sometimes when legalizing a fixed
length vector store of an insert, we end up producing better code
through scalarization than without.
An example is the follow below:

%a = load <4 x i64>, ptr %x
%b = insertelement <4 x i64> %a, i64 %y, i32 2
store <4 x i64> %b, ptr %x

If this is scalarized, then DAGCombine successfully removes 3 of the 4
stores which are considered dead, and on RISC-V we get:

sd a1, 16(a0)

However if we make the vector type legal (-mattr=+v), then we lose the
optimisation because we don't scalarize it.

This patch attempts to recover the optimisation for vectors by
identifying patterns where we store a load with a single insert
inbetween, replacing it with a scalar store of the inserted element.

Diff Detail

Event Timeline

luke created this revision.Jun 6 2023, 8:01 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 6 2023, 8:01 AM
luke requested review of this revision.Jun 6 2023, 8:01 AM
luke added inline comments.Jun 6 2023, 8:12 AM
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-insert.ll
8

This might be obscuring the behaviour of the actual insert_vector_elt codegen. Let me know if I should mark these stores as volatile

luke updated this revision to Diff 528873.Jun 6 2023, 8:14 AM

Reuse variable, use better comment

RKSimon added inline comments.Jun 15 2023, 3:13 AM
llvm/test/CodeGen/X86/fold-load-vec.ll
17 ↗(On Diff #528873)

Limit to cases where the insertion is the only user of the load?

luke updated this revision to Diff 535014.Jun 27 2023, 9:16 AM

Limit to cases where the store is the only user of the insert_subvector

luke marked an inline comment as done.Jun 27 2023, 9:16 AM
luke added inline comments.
llvm/test/CodeGen/X86/fold-load-vec.ll
17 ↗(On Diff #528873)

Sorry for the delay, rebased and updated now. Limiting to loads with a single use didn't fix this case, but limiting to inserts with a single use worked instead

luke marked an inline comment as done.Jun 27 2023, 9:17 AM
RKSimon accepted this revision.Jun 28 2023, 3:31 AM

LGTM - we just need to clear up whether fixed-vectors-insert.ll needs tweaking

llvm/test/CodeGen/RISCV/rvv/fixed-vectors-insert.ll
8

@craig.topper Any thoughts on this?

This revision is now accepted and ready to land.Jun 28 2023, 3:31 AM
frasercrmck added inline comments.Jun 28 2023, 3:33 AM
llvm/test/CodeGen/RISCV/rvv/fixed-vectors-insert.ll
8

Maybe we can just write a separate test function that inserts into vectors passed by value? I personally consider the load/store idiom in these tests to be from before we added proper vector calling convention support.

luke updated this revision to Diff 535335.Jun 28 2023, 5:01 AM

Rebase ontop of D153964

luke updated this revision to Diff 535400.Jun 28 2023, 7:25 AM

Manually update some non-update_test_checks.py tests to use volatile stores so they don't trigger the combine

luke marked 2 inline comments as done.Jun 28 2023, 7:30 AM
luke added inline comments.
llvm/test/CodeGen/ARM/fp16-promote.ll
902

Didn't notice some of these tests were hand written, just updated the diff there to mark some of the stores as volatile to prevent the combine from kicking in.

This revision was landed with ongoing or failed builds.Jun 28 2023, 2:45 PM
This revision was automatically updated to reflect the committed changes.