This is an archive of the discontinued LLVM Phabricator instance.

[RISCV] Add integer scalar instructions to isAssociativeAndCommutative
ClosedPublic

Authored by HsiangKai on Dec 22 2022, 1:01 AM.

Details

Summary

Inspired by D138107.

We can add ADD, AND, OR, XOR, MUL to isAssociativeAndCommutative to
increase instruction-level parallelism by the existing MachineCombiner
pass.

Diff Detail

Event Timeline

HsiangKai created this revision.Dec 22 2022, 1:01 AM
HsiangKai requested review of this revision.Dec 22 2022, 1:01 AM
HsiangKai updated this revision to Diff 484754.Dec 22 2022, 1:09 AM

Add default case for the switch statement.

HsiangKai updated this revision to Diff 484755.Dec 22 2022, 1:10 AM

clang format

kito-cheng added inline comments.Dec 22 2022, 1:33 AM
llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
2862–2864

This seems out of scope of this patch?

HsiangKai added inline comments.Dec 22 2022, 2:31 AM
llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
2862–2864

In order to let hasReassociableSibling() return correct result for integer scalar cases, we have to handle non-fp cases here.

I think it makes sense to return true when instructions have no FRM operand in this function. There may be better way to handle it for hasReassociableSibling(). Any suggestions?

craig.topper added inline comments.Dec 22 2022, 10:25 AM
llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
1381

I don't think MULH and MULHU are associative.

HsiangKai updated this revision to Diff 484993.Dec 22 2022, 5:21 PM

Remove MULH, MULHU from the list.

llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
1381

I added them by mistake. Thanks for pointing it out.

kito-cheng added inline comments.Dec 22 2022, 7:22 PM
llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
2862–2864

Thanks for the explanation, the function name let me feel that's unrelated to integer operations, however I think this kind of change the behavior of this function, next if-condition is

if (MI1FrmOpIdx < 0 || MI2FrmOpIdx < 0)
  return false;

I didn't have idea how to improve that yet, but I feel it should be add check logic in hasReassociableSibling rather than here.

Anyway I don't have strong objection here.

kito-cheng added inline comments.Dec 22 2022, 9:56 PM
llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
1371

extract this switch to a new function isAssociativeAndCommutativeIntOpc and then use that in RISCVInstrInfo::hasReassociableSibling?

  • Move FRM operand checking to hasReassociableSibling.
  • Add ADD, SUB, ADDW, SUBW to getInverseOpcode and add test cases for the inverse cases.

clang format

HsiangKai added inline comments.Dec 22 2022, 10:37 PM
llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
1371

I moved FRM checking to hasReassociableSibling.

Have you had a chance to make some performance measurements?

llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
1379

One thing that I'm a little worried is the described in the spec possibility for microarchitecture fusion(7.1):

If both the high and low bits of the same product are required, then the recommended code sequence is: MULH[[S]U] rdh, rs1, rs2; MUL rdl, rs1, rs2 (source register specifiers must be in same order and rdh cannot be the same as rs1 or rs2). Microarchitectures can then fuse these into a single multiply operation instead of performing two separate multiplies.

As I know, we do nothing to support this fusion, however we quite often generate the recommended sequence (I checked some binaries I have on hand, it's not thorough research). Also some mul/mulh pairs we can easily fix by a simple DAG mutation at the scheduling stage:

%10:gpr = MULHU %9:gpr, %5:gpr
%11:gpr = MUL %9:gpr, %5:gpr
--->
%11:gpr = MUL %9:gpr, %5:gpr
%10:gpr = MULHU %9:gpr, %5:gpr

Machine combiner aggressively changes operands of multiplications, so I expect it'll worsen code from the fusion point of view.

Actually, I not sure that it's a real problem and just want to ensure that we are OK with it.

Have you had a chance to make some performance measurements?

Quoted from D138107,

I run C/C++ benchmarks in SPECrate 2017 on Fujitsu A64FX processor, which has two pipelines for integer operations and SIMD/FP operations each. 511.povray_r had 4% improvement. Other benchmarks (int: 500, 502, 505, 520, 523, 525, 531, 541, 557; fp: 508, 510, 519, 538, 544) were within 1% up/down. For a synthetic benchmark, it doubled the performance.

I have no performance number for RISC-V multiple issue machines. I am not sure what the impact is of the patch. Is there anyone can help to measure it?

HsiangKai added inline comments.Dec 25 2022, 6:32 PM
llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
1379

Thank you for pointing it out. It is a good point. I am not aware the possibilities of mulh/mul fusion in the microarchitecture design. It is hard to say which one is better unless we have more performance data. How about removing mul/mulw in this patch to be conservative?

Have you had a chance to make some performance measurements?

Quoted from D138107,

I run C/C++ benchmarks in SPECrate 2017 on Fujitsu A64FX processor, which has two pipelines for integer operations and SIMD/FP operations each. 511.povray_r had 4% improvement. Other benchmarks (int: 500, 502, 505, 520, 523, 525, 531, 541, 557; fp: 508, 510, 519, 538, 544) were within 1% up/down. For a synthetic benchmark, it doubled the performance.

I have no performance number for RISC-V multiple issue machines. I am not sure what the impact is of the patch. Is there anyone can help to measure it?

By the way, the number from D138107 are including SIMD/SVE instruction patterns. We only implement scalar part. (Vector instructions have more than 2 input operands. They are not applicable here.)

craig.topper added inline comments.Dec 25 2022, 10:57 PM
llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
1374

Do you plan to support Zbb MIN(U)/MAX(U)?

1379

I'm not sure we need to worry about fusion until the compiler supports the DAG mutation for such fusion.

asi-sc added inline comments.Dec 26 2022, 12:18 AM
llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
1379

From my expectations, reassociation of mul instructions that can be fused with mulh should be a quite rare case. Whereas reassociation of mul instructions happens often (I did some experiments previously). So, I agree with Craig but I'd ask you to add a comment about the possible impact of MachineCombiner on the fusion.

HsiangKai updated this revision to Diff 485280.Dec 26 2022, 3:08 AM
  • Add comment for MUL.
  • Add MIN(U)/MAX(U).

Thanks, no more comments from my side.

craig.topper added inline comments.Dec 28 2022, 11:45 AM
llvm/lib/Target/RISCV/RISCVInstrInfo.cpp
1354

swap the order of these conditions. There's no reason to call hasEqualFRM if we already know both indices are negative.

HsiangKai updated this revision to Diff 485571.Dec 28 2022, 8:18 PM

Address comments.

This revision is now accepted and ready to land.Dec 28 2022, 8:24 PM