This is an archive of the discontinued LLVM Phabricator instance.

[RISCV]Combine to make the fixed vector vadd of splats are scalarized
Needs ReviewPublic

Authored by liaolucy on Aug 30 2023, 6:00 AM.

Details

Reviewers
craig.topper
luke
Summary

vadd (build_vector x), (build_vector c) -> build_vector (add x, c)
Try to fix https://github.com/llvm/llvm-project/issues/65068
Subsequently, more operands can be extended.

Diff Detail

Event Timeline

liaolucy created this revision.Aug 30 2023, 6:00 AM
liaolucy requested review of this revision.Aug 30 2023, 6:00 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 30 2023, 6:00 AM
luke added a comment.Aug 30 2023, 6:18 AM

Thanks for the patch. Did you take a look at scalarizeBinOpOfSplats in DAGCombiner? I think that might be more generic, it's the combine that kicks in for the scalable vector test case in that issue. I was wondering if it would be possible to get it to work on fixed vectors too.

Thanks for the patch. Did you take a look at scalarizeBinOpOfSplats in DAGCombiner? I think that might be more generic, it's the combine that kicks in for the scalable vector test case in that issue. I was wondering if it would be possible to get it to work on fixed vectors too.

I have a question, this ir has been optimized in the opt phase. https://godbolt.org/z/MPvvTG5dT Maybe we don't need another combine in codegen?

luke added a comment.Aug 30 2023, 7:09 AM

Thanks for the patch. Did you take a look at scalarizeBinOpOfSplats in DAGCombiner? I think that might be more generic, it's the combine that kicks in for the scalable vector test case in that issue. I was wondering if it would be possible to get it to work on fixed vectors too.

I have a question, this ir has been optimized in the opt phase. https://godbolt.org/z/MPvvTG5dT Maybe we don't need another combine in codegen?

Good point, I didn't realise opt already took care of this. I guess it depends if this pattern is introduced during codegen at all

luke added a comment.Aug 30 2023, 7:19 AM

Thanks for the patch. Did you take a look at scalarizeBinOpOfSplats in DAGCombiner? I think that might be more generic, it's the combine that kicks in for the scalable vector test case in that issue. I was wondering if it would be possible to get it to work on fixed vectors too.

I have a question, this ir has been optimized in the opt phase. https://godbolt.org/z/MPvvTG5dT Maybe we don't need another combine in codegen?

Good point, I didn't realise opt already took care of this. I guess it depends if this pattern is introduced during codegen at all

I originally noticed this when looking at the code generated whilst expanding ISD::ROT{L,R}, e.g. https://godbolt.org/z/Wh99EPbja
So it can be introduced after the middle-end (but it might not be that common)

Thanks for the patch. Did you take a look at scalarizeBinOpOfSplats in DAGCombiner? I think that might be more generic, it's the combine that kicks in for the scalable vector test case in that issue. I was wondering if it would be possible to get it to work on fixed vectors too.

I have a question, this ir has been optimized in the opt phase. https://godbolt.org/z/MPvvTG5dT Maybe we don't need another combine in codegen?

Good point, I didn't realise opt already took care of this. I guess it depends if this pattern is introduced during codegen at all

I originally noticed this when looking at the code generated whilst expanding ISD::ROT{L,R}, e.g. https://godbolt.org/z/Wh99EPbja
So it can be introduced after the middle-end (but it might not be that common)

Okay, making sure it's back-end introduced looks like it can be optimized. thanks

Another solution, enabling the isExtractVecEltCheap when XlenVT == vector_element_type, It's working for pr65068.ll.

But I'm worried about the side effects of this approach. How to test fixed length vectors? Any suggestions?

Try to use the command: -march=rv64gcv -mllvm -riscv-v-vector-bits-max=128 -mllvm -riscv-v-vector-bits-min=128 , but the llvm ir scalable(not fixed length vectors.)

luke added a comment.Sep 1 2023, 7:26 AM

Another solution, enabling the isExtractVecEltCheap when XlenVT == vector_element_type, It's working for pr65068.ll.

It looks like isExtractVecEltCheap might be meant for "free" extracts:

For example, if scalar operations occur on the same register file as vector operations, then an extract element may be a sub-register rename rather than an actual instruction.

But on RISC-V we'll need a vmv.x.s, so I don't think that qualifies as "free". But this is only if it's coming from a vector_shuffle if I understand correctly: build_vector splats and splat_vector splats should indeed be free because the extract_vector_elt will be combined away.

But I'm worried about the side effects of this approach. How to test fixed length vectors? Any suggestions?

Try to use the command: -march=rv64gcv -mllvm -riscv-v-vector-bits-max=128 -mllvm -riscv-v-vector-bits-min=128 , but the llvm ir scalable(not fixed length vectors.)

Is pr65068.ll not already testing this on fixed length vectors?

luke added a comment.Sep 1 2023, 8:09 AM

But on RISC-V we'll need a vmv.x.s, so I don't think that qualifies as "free". But this is only if it's coming from a vector_shuffle if I understand correctly: build_vector splats and splat_vector splats should indeed be free because the extract_vector_elt will be combined away.

I gave this a try, and it sadly results in a lot of regressions: https://github.com/llvm/llvm-project/commit/0dee41b1ea16a022495a06680410090a25b5bb66
I haven't looked into it yet, but I'm definitely missing something