The optimization of (mul x, c) to (ADD (SLLI x, i0), (SLLI x, i1))
is currently enabled only for i32 multiplication on rv64, probably
to avoid a regression in i64 multiplication on rv32.
If the condition is instead that c is used only once, that
regression is still avoided, and other optimization
opportunities are enabled as well.
The regression occurs when the subtarget has the M extension and the data size is >= XLen, because the immediate is used twice for an i64 mul on rv32.
Changing the restriction from VT.getSizeInBits() >= Subtarget.getXLen() to ConstNode->hasOneUse() avoids that regression while enabling the other optimization opportunities.
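
For reference, the arithmetic behind the (mul x, c) rewrite can be sketched in standalone C++. This is only an illustration of the identity x * c == (x << i0) + (x << i1) for a constant c with exactly two set bits; it is not the backend code, and the names TwoShifts, splitIntoTwoShifts and mulByShiftAdd are hypothetical helpers introduced here.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical result type: shift amounts i0 < i1 such that
// c == (1 << i0) + (1 << i1), or valid == false if c has no such form.
struct TwoShifts {
  bool valid;
  unsigned i0;
  unsigned i1;
};

// Index of the lowest set bit; the caller must pass a non-zero value.
static unsigned lowestSetBitIndex(uint64_t v) {
  unsigned idx = 0;
  while ((v & 1) == 0) {
    v >>= 1;
    ++idx;
  }
  return idx;
}

// Decompose c into two powers of two if it has exactly two set bits.
TwoShifts splitIntoTwoShifts(uint64_t c) {
  TwoShifts r = {false, 0, 0};
  if (c == 0)
    return r;
  uint64_t rest = c & (c - 1); // c with its lowest set bit cleared
  if (rest == 0 || (rest & (rest - 1)) != 0)
    return r; // one set bit, or more than two set bits
  r.valid = true;
  r.i0 = lowestSetBitIndex(c);
  r.i1 = lowestSetBitIndex(rest);
  return r;
}

// Models ADD (SLLI x, i0), (SLLI x, i1) with wrapping 64-bit arithmetic.
uint64_t mulByShiftAdd(uint64_t x, TwoShifts s) {
  assert(s.valid && "constant is not a sum of two powers of two");
  return (x << s.i0) + (x << s.i1);
}
```

For example, c = 4100 = 4096 + 4 splits into shifts 2 and 12, so mulByShiftAdd(x, splitIntoTwoShifts(4100)) equals x * 4100 modulo 2^64. The sketch covers only the arithmetic; whether the rewrite is profitable depends on how many uses the constant has, which is what the hasOneUse() condition checks.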