Helps emit a compressed BNEZ for comparisons with a constant when possible.
Addresses the following issue: https://github.com/llvm/llvm-project/issues/56393
Differential D132358
[RISCV][ISel] improved compressed instruction use
Authored by dybv-sc on Aug 22 2022, 2:09 AM.
Event Timeline

Comment:
As noted in the bug, this increases the critical path length in some cases. Have you benchmarked this?

For immediates that fit in c.li, the new sequence might be larger. c.li works for all registers. c.bnez only works for x8-x15 and short distances.

Comment:
Sorry for the long silence.

    lw s0, 0(a0)
    li a2, 101
    ld a0, 0(a1)
    slliw a1, s0, 1
    addw a1, a1, s0
    sw a1, 0(a0)
    blt s0, a2, .LBB0_2

transforms to:

    lw s0, 0(a0)
    ld a0, 0(a1)
    slti a2, s0, 101
    slliw a1, s0, 1
    addw a1, a1, s0
    sw a1, 0(a0)
    bnez a2, .LBB0_2

When placed in a hot loop, the latter sequence adds 7 more cycles to each iteration. I found that it does not affect the branch predictor or the cache, so there must be a pipeline stall happening here. I'll investigate this further.

Comment:
After running more SPEC tests in different modes (train and ref) on different RISC-V boards (SiFive and THead), I got mixed results on performance. On a number of tests the performance increase was insignificant, while on others there was a slight decrease. On average, performance declined by 0.5%. On the other hand, the size reduction is uniform across all tests: on average it is 20 fewer bytes, or a 0.04% size reduction. I think these amounts can't justify the performance cost.
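
The size trade-off raised in the first comment can be made concrete with a small sketch. It is not part of the patch; it only models encoding sizes under stated assumptions: the constant fits a 12-bit signed immediate (so `li` is a single instruction), `c.li` takes a 6-bit signed immediate, `c.bnez` requires a register in x8-x15 and an even branch offset in the 9-bit signed range, and `blt` and `slti` have no compressed form. The function names are hypothetical.

```python
# Sketch: byte size of "li + blt" versus "slti + bnez" on RISC-V with the C extension.
# All function names here are illustrative; they do not come from LLVM.

def fits_signed(value, bits):
    """Return True if value fits in a two's-complement field of the given width."""
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return lo <= value <= hi

def li_size(imm, rd):
    """Size of 'li rd, imm', assuming imm fits 12 bits (a single addi)."""
    if not fits_signed(imm, 12):
        raise ValueError("sketch only models 12-bit immediates")
    # c.li accepts any destination except x0 and a 6-bit signed immediate.
    return 2 if rd != 0 and fits_signed(imm, 6) else 4

def bnez_size(rs1, offset):
    """Size of 'bnez rs1, target' for a branch offset given in bytes."""
    # c.bnez accepts only x8..x15 and an even offset in [-256, 254].
    compressible = 8 <= rs1 <= 15 and offset % 2 == 0 and fits_signed(offset, 9)
    return 2 if compressible else 4

def original_size(imm, tmp_reg):
    """li tmp, imm ; blt rs, tmp, target  (blt is always 4 bytes)."""
    return li_size(imm, tmp_reg) + 4

def transformed_size(tmp_reg, offset):
    """slti tmp, rs, imm ; bnez tmp, target  (slti is always 4 bytes)."""
    return 4 + bnez_size(tmp_reg, offset)
```

Under these assumptions, the sequence from the thread (constant 101, temporary a2 = x12) shrinks from 8 to 6 bytes when the branch target is close. For a constant that fits `c.li`, the original is already 6 bytes, so the transformed sequence ties at best and grows to 8 bytes when the temporary is outside x8-x15 or the target is too far away.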