Details
Diff Detail
- Repository: rG LLVM Github Monorepo
Event Timeline
llvm/test/CodeGen/AArch64/complex-deinterleaving-multiuses.ll | ||
---|---|---|
116 | Am I correct that we still want to deinterleave in case shufflevectors have external uses? |
Thanks. Can you also add a test case based on multiple_muls_shuffle_external that passes in pointers and loads the data, as opposed to passing in vectors? The pattern of load+shuffle turning into ld2 is going to be very common in the code these are generated from, so it is good to test both if we are making cost-model-like decisions.
Otherwise this LGTM.
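For readers unfamiliar with the pattern, here is a minimal sketch of the load+shuffle shape being discussed. It is a hypothetical illustration, not the test added in this review: the function name `@deinterleave_from_load` and the value names are assumptions. Interleaved data is loaded as one wide vector and split into even and odd lanes by two shufflevectors, which is the form AArch64 can typically fold into a single ld2.

```llvm
; Hypothetical example, not taken from complex-deinterleaving-multiuses.ll.
; Memory layout is assumed to be { re0, im0, re1, im1, re2, im2, re3, im3 }.
define <4 x float> @deinterleave_from_load(ptr %p) {
entry:
  ; One wide load of the interleaved data.
  %vec = load <8 x float>, ptr %p, align 4
  ; Even lanes: the real parts.
  %re = shufflevector <8 x float> %vec, <8 x float> poison, <4 x i32> <i32 0, i32 2, i32 4, i32 6>
  ; Odd lanes: the imaginary parts.
  %im = shufflevector <8 x float> %vec, <8 x float> poison, <4 x i32> <i32 1, i32 3, i32 5, i32 7>
  ; Both shuffles are used directly here, outside any complex multiply;
  ; this is the kind of "external use" the inline comments refer to.
  %sum = fadd <4 x float> %re, %im
  ret <4 x float> %sum
}
```

When the shuffles are fed by a load like this, they may cost nothing extra because the deinterleave happens as part of the ld2. When they are fed by vector arguments instead, the shuffle instructions would have to be emitted anyway, which is why testing both variants is useful.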
llvm/test/CodeGen/AArch64/complex-deinterleaving-multiuses.ll | ||
---|---|---|
116 | I think we want the fastest code :) That is usually the smallest, but it can depend on the exact instructions. I worry that it might depend on whether the shuffle will be folded into an ld2 or needs to be emitted anyway, which may mean it needs to look at the shuffle's operands to check whether it looks like the shuffles can be ignored or not. |
Add one more test to show that an extra ld2 is not generated for load -> shuffle instructions.
llvm/test/CodeGen/AArch64/complex-deinterleaving-multiuses.ll | ||
---|---|---|
116 | Done - see multiple_muls_shuffle_external_with_loads |