If one input of a fixed-length vector multiply is a sign extend and
the other operand is a splat of a scalar, we can use vwmulsu_vx
if the scalar value has sufficient known-zero upper bits.
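To see why the zero-bits condition matters, here is a minimal scalar model in C++. It is not LLVM code, and the helper names `widening_mul_su`, `plain_mul_i16`, and `upper_half_is_zero` are hypothetical. It assumes an i8 to i16 widening, as in the test case discussed in this review, and models a single vector lane.

```cpp
#include <cstdint>

// Hypothetical model of one lane of a signed-by-unsigned widening multiply:
// the first operand is sign-extended from 8 bits, the second is
// zero-extended from 8 bits, and the low 16 bits of the product are kept.
inline uint16_t widening_mul_su(int8_t a, uint8_t b) {
    int32_t prod = static_cast<int32_t>(a) * static_cast<int32_t>(b);
    return static_cast<uint16_t>(prod);
}

// One lane of the original pattern: mul i16 (sext i8 a), b.
inline uint16_t plain_mul_i16(int8_t a, uint16_t b) {
    // Sign-extend to 16 bits, then multiply modulo 2^16.
    uint32_t sa = static_cast<uint16_t>(static_cast<int16_t>(a));
    return static_cast<uint16_t>(sa * b);
}

// True when the upper 8 bits of the 16-bit scalar are zero, i.e. when
// truncating the scalar to 8 bits and zero-extending it back is lossless.
inline bool upper_half_is_zero(uint16_t b) {
    return (b & 0xFF00u) == 0;
}
```

When `upper_half_is_zero(b)` holds, the two functions agree for every `a`. When it does not hold, truncating `b` to 8 bits can change the result, so the narrower instruction would not be a valid replacement.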
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
This comment was removed by Chenbing.Zheng.
| llvm/lib/Target/RISCV/RISCVISelLowering.cpp | ||
|---|---|---|
| 7652–7654 | The description says that we're now supporting scalar splats but AFAICT this will only work for zero-extending loads? Feels like maybe the testing you're adding is too narrowly-focused and dependent on the load. | |
| llvm/lib/Target/RISCV/RISCVISelLowering.cpp | ||
|---|---|---|
| 7652–7654 | I agree with you. I'm sorry, but I don't have any ideas for other cases right now. May I add a FIXME here? | |
| llvm/lib/Target/RISCV/RISCVISelLowering.cpp | ||
|---|---|---|
| 7652–7654 | Something like this should work: | |

    define <8 x i16> @vwmulsu_vx_v8i16_i8(<8 x i8>* %x, i16 %b) {
      %a = load <8 x i8>, <8 x i8>* %x
      %c = and i16 %b, 255
      %d = insertelement <8 x i16> poison, i16 %c, i32 0
      %e = shufflevector <8 x i16> %d, <8 x i16> poison, <8 x i32> zeroinitializer
      %f = sext <8 x i8> %a to <8 x i16>
      %g = mul <8 x i16> %e, %f
      ret <8 x i16> %g
    }
I posted an alternative version as D119622. It uses MaskedValueIsZero, as is already done for vwmulu. A new DAG combine for VMV_V_X_VL removes the AND instructions that become unnecessary.
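The idea behind a known-zero-bits check can be sketched outside LLVM. The following is a toy model in C++, not the SelectionDAG API: the names `KnownZero`, `known_zero_after_and`, and `masked_value_is_zero` are hypothetical, and only the effect of an AND with a constant is modeled. It shows why the `and i16 %b, 255` in the suggested test makes the upper 8 bits of the splatted scalar provably zero.

```cpp
#include <cstdint>

// Toy known-bits record: a set bit in zero_mask means
// "this bit of the value is provably zero".
struct KnownZero {
    uint16_t zero_mask;
};

// After `and x, c` with a constant c, every bit that is clear in c is
// known zero, in addition to whatever was already known about x.
inline KnownZero known_zero_after_and(KnownZero x, uint16_t c) {
    return KnownZero{static_cast<uint16_t>(x.zero_mask | static_cast<uint16_t>(~c))};
}

// True when every bit selected by `mask` is known to be zero.
inline bool masked_value_is_zero(KnownZero k, uint16_t mask) {
    return (k.zero_mask & mask) == mask;
}
```

With nothing known about `%b`, the check against the upper-half mask `0xFF00` fails. After the AND with 255 it succeeds, which is the condition under which the scalar can be treated as a zero-extended 8-bit value.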