The script (llvm-mca-compare.py) uses the llvm-mca tool to print statistics to the console for multiple input files.
The script requires the --llvm-mca-binary option, which specifies the relative path to the llvm-mca binary. The options --args [="-option1=<arg> -option2=<arg> ..."], -v and -h can also be used.
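The options described above could be declared roughly as in the following sketch. It is not taken from the script itself: the function name build_parser, the help strings and the attribute names chosen by argparse are assumptions made for illustration only.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the options listed in the description."""
    parser = argparse.ArgumentParser(
        description="Print llvm-mca statistics for several assembly files."
    )
    # One or more input files, shown as [f1], [f2], ... in the output.
    parser.add_argument("files", nargs="+", help="assembly files to analyse")
    # Mandatory path to the llvm-mca binary.
    parser.add_argument(
        "--llvm-mca-binary",
        required=True,
        metavar="<path>",
        help="relative path to the llvm-mca binary",
    )
    # Extra options forwarded to llvm-mca as one quoted string.
    parser.add_argument(
        "--args",
        default="",
        metavar="<options>",
        help='options passed through to llvm-mca, e.g. "-dispatch=10"',
    )
    # Verbose mode; -h is provided by argparse automatically.
    parser.add_argument("-v", action="store_true", help="print the command and parameters")
    return parser


opts = build_parser().parse_args(
    ["file1.s", "--llvm-mca-binary=build/bin/llvm-mca", "--args=-dispatch=10 -noalias=false", "-v"]
)
print(opts.files, opts.llvm_mca_binary, opts.args, opts.v)
```

Making --llvm-mca-binary a required option means a missing binary path is reported at parse time rather than when llvm-mca is first invoked.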
The script is used as follows:
$ llvm-project/llvm/utils/llvm-mca-compare.py file1.s --llvm-mca-binary=build/bin/llvm-mca
Input files:
[f1]: file1.s
ITERATIONS: 100
-----------------------------------------
Code region: 1
+---------------------+--------+
|                     | [f1]:  |
+=====================+========+
| Instructions:       | 1100   |
+---------------------+--------+
| Total Cycles:       | 1097   |
+---------------------+--------+
| Total uOps:         | 1900   |
+---------------------+--------+
| Dispatch Width:     | 6      |
+---------------------+--------+
| uOps Per Cycle:     | 1.73   |
+---------------------+--------+
| IPC:                | 1.0    |
+---------------------+--------+
| Block RThroughput:  | 3.17   |
+---------------------+--------+
Resource pressure per iteration:
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
|        | SKLDivider | SKLFPDivider | SKLPort0 | SKLPort1 | SKLPort2 | SKLPort3 | SKLPort4 | SKLPort5 | SKLPort6 | SKLPort7 |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f1]:  | -          | -            | 2.72     | 2.76     | 1.66     | 1.68     | 3        | 2.76     | 2.76     | 1.66     |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
$ llvm-project/llvm/utils/llvm-mca-compare.py file1.s --llvm-mca-binary=build/bin/llvm-mca -v
run: $ build/bin/llvm-mca -json file1.s
Simulation Parameters:
-march : x86_64
-mcpu : skylake
-mtriple : x86_64-unknown-linux-gnu
Input files:
[f1]: file1.s
ITERATIONS: 100
-----------------------------------------
Code region: 1
+---------------------+--------+
|                     | [f1]:  |
+=====================+========+
| Instructions:       | 1100   |
+---------------------+--------+
| Total Cycles:       | 1097   |
+---------------------+--------+
| Total uOps:         | 1900   |
+---------------------+--------+
| Dispatch Width:     | 6      |
+---------------------+--------+
| uOps Per Cycle:     | 1.73   |
+---------------------+--------+
| IPC:                | 1.0    |
+---------------------+--------+
| Block RThroughput:  | 3.17   |
+---------------------+--------+
Resource pressure per iteration:
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
|        | SKLDivider | SKLFPDivider | SKLPort0 | SKLPort1 | SKLPort2 | SKLPort3 | SKLPort4 | SKLPort5 | SKLPort6 | SKLPort7 |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f1]:  | -          | -            | 2.72     | 2.76     | 1.66     | 1.68     | 3        | 2.76     | 2.76     | 1.66     |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
$ llvm-project/llvm/utils/llvm-mca-compare.py file1.s --llvm-mca-binary=build/bin/llvm-mca --args="-dispatch=10 -noalias=false -iterations=300" -v
run: $ build/bin/llvm-mca -dispatch=10 -noalias=false -iterations=300 -json file1.s
Simulation Parameters:
-dispatch : 10
-march : x86_64
-mcpu : skylake
-mtriple : x86_64-unknown-linux-gnu
-noalias : False
Input files:
[f1]: file1.s
ITERATIONS: 300
-----------------------------------------
Code region: 1
+---------------------+--------+
|                     | [f1]:  |
+=====================+========+
| Instructions:       | 3300   |
+---------------------+--------+
| Total Cycles:       | 3097   |
+---------------------+--------+
| Total uOps:         | 5700   |
+---------------------+--------+
| Dispatch Width:     | 10     |
+---------------------+--------+
| uOps Per Cycle:     | 1.84   |
+---------------------+--------+
| IPC:                | 1.07   |
+---------------------+--------+
| Block RThroughput:  | 3      |
+---------------------+--------+
Resource pressure per iteration:
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
|        | SKLDivider | SKLFPDivider | SKLPort0 | SKLPort1 | SKLPort2 | SKLPort3 | SKLPort4 | SKLPort5 | SKLPort6 | SKLPort7 |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f1]:  | -          | -            | 2.75     | 2.75     | 1.67     | 1.67     | 3        | 2.75     | 2.75     | 1.67     |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
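The "run:" line in the verbose output suggests that the string given to --args is placed in front of -json and the input file. A minimal sketch of how such a command line could be assembled is shown below; the helper name build_command is hypothetical and the script's actual implementation may differ.

```python
import shlex


def build_command(binary, extra_args, asm_file):
    """Assemble an llvm-mca command line as a list of arguments.

    extra_args is the raw string passed to --args; shlex.split breaks it
    into separate options while respecting shell-style quoting.
    """
    return [binary, *shlex.split(extra_args), "-json", asm_file]


cmd = build_command(
    "build/bin/llvm-mca",
    "-dispatch=10 -noalias=false -iterations=300",
    "file1.s",
)
print("run: $ " + " ".join(cmd))
```

Keeping the command as a list rather than a single string lets it be handed to subprocess.run without going through a shell.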
$ llvm-project/llvm/utils/llvm-mca-compare.py file1.s file2.s file3.s file4.s --llvm-mca-binary=build/bin/llvm-mca
Input files:
[f1]: file1.s
[f2]: file2.s
[f3]: file3.s
[f4]: file4.s
ITERATIONS: 100
-----------------------------------------
Code region: 1
+---------------------+--------+--------+--------+--------+
|                     | [f1]:  | [f2]:  | [f3]:  | [f4]:  |
+=====================+========+========+========+========+
| Instructions:       | 1100   | 600    | 2800   | 1200   |
+---------------------+--------+--------+--------+--------+
| Total Cycles:       | 1097   | 897    | 2192   | 1096   |
+---------------------+--------+--------+--------+--------+
| Total uOps:         | 1900   | 1400   | 4500   | 2200   |
+---------------------+--------+--------+--------+--------+
| Dispatch Width:     | 6      | 6      | 6      | 6      |
+---------------------+--------+--------+--------+--------+
| uOps Per Cycle:     | 1.73   | 1.56   | 2.05   | 2.01   |
+---------------------+--------+--------+--------+--------+
| IPC:                | 1.0    | 0.67   | 1.28   | 1.09   |
+---------------------+--------+--------+--------+--------+
| Block RThroughput:  | 3.17   | 2.33   | 10     | 3.67   |
+---------------------+--------+--------+--------+--------+
Resource pressure per iteration:
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
|        | SKLDivider | SKLFPDivider | SKLPort0 | SKLPort1 | SKLPort2 | SKLPort3 | SKLPort4 | SKLPort5 | SKLPort6 | SKLPort7 |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f1]:  | -          | -            | 2.72     | 2.76     | 1.66     | 1.68     | 3        | 2.76     | 2.76     | 1.66     |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f2]:  | -          | -            | 1.11     | 1.95     | 1.33     | 1.34     | 2        | 1.95     | 1.99     | 1.33     |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f3]:  | -          | -            | 4.71     | 4.72     | 7        | 7        | 10       | 4.73     | 4.84     | 7        |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f4]:  | -          | -            | 3        | 3        | 1.75     | 1.76     | 2        | 3.01     | 3.99     | 1.49     |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
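As a reading aid for the summary table, the "uOps Per Cycle" and "IPC" rows appear to be plain ratios of the rows above them, rounded to two decimals. The sketch below recomputes them from the values shown for [f1] to [f4]; the helper name derived_metrics is hypothetical, and the rounding rule is an assumption that happens to agree with the numbers in the table.

```python
def derived_metrics(instructions, total_cycles, total_uops):
    """Recompute the two ratio rows from the three raw counters."""
    return {
        "uOps Per Cycle": round(total_uops / total_cycles, 2),
        "IPC": round(instructions / total_cycles, 2),
    }


# (Instructions, Total Cycles, Total uOps) copied from the table above.
table = {
    "[f1]": (1100, 1097, 1900),
    "[f2]": (600, 897, 1400),
    "[f3]": (2800, 2192, 4500),
    "[f4]": (1200, 1096, 2200),
}

for name, (instructions, cycles, uops) in table.items():
    print(name, derived_metrics(instructions, cycles, uops))
```

Read this way, [f3] retires the most instructions per cycle (1.28) even though it needs the most cycles in total.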
$ llvm-project/llvm/utils/llvm-mca-compare.py test-two-code-regions.s test-two-code-regions-opt.s --llvm-mca-binary=build/bin/llvm-mca
Input files:
[f1]: test-two-code-regions.s
[f2]: test-two-code-regions-opt.s
ITERATIONS: 100
-----------------------------------------
Code region: 1
+---------------------+--------+--------+
|                     | [f1]:  | [f2]:  |
+=====================+========+========+
| Instructions:       | 300    | 100    |
+---------------------+--------+--------+
| Total Cycles:       | 303    | 103    |
+---------------------+--------+--------+
| Total uOps:         | 300    | 100    |
+---------------------+--------+--------+
| Dispatch Width:     | 6      | 6      |
+---------------------+--------+--------+
| uOps Per Cycle:     | 0.99   | 0.97   |
+---------------------+--------+--------+
| IPC:                | 0.99   | 0.97   |
+---------------------+--------+--------+
| Block RThroughput:  | 0.75   | 0.25   |
+---------------------+--------+--------+
Resource pressure per iteration:
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
|        | SKLDivider | SKLFPDivider | SKLPort0 | SKLPort1 | SKLPort2 | SKLPort3 | SKLPort4 | SKLPort5 | SKLPort6 | SKLPort7 |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f1]:  | -          | -            | 0.75     | 0.75     | -        | -        | -        | 0.75     | 0.75     | -        |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f2]:  | -          | -            | 0.25     | 0.25     | -        | -        | -        | 0.25     | 0.25     | -        |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
-----------------------------------------
Code region: 2
+---------------------+--------+--------+
|                     | [f1]:  | [f2]:  |
+=====================+========+========+
| Instructions:       | 200    | 100    |
+---------------------+--------+--------+
| Total Cycles:       | 203    | 103    |
+---------------------+--------+--------+
| Total uOps:         | 200    | 100    |
+---------------------+--------+--------+
| Dispatch Width:     | 6      | 6      |
+---------------------+--------+--------+
| uOps Per Cycle:     | 0.99   | 0.97   |
+---------------------+--------+--------+
| IPC:                | 0.99   | 0.97   |
+---------------------+--------+--------+
| Block RThroughput:  | 0.5    | 0.25   |
+---------------------+--------+--------+
Resource pressure per iteration:
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
|        | SKLDivider | SKLFPDivider | SKLPort0 | SKLPort1 | SKLPort2 | SKLPort3 | SKLPort4 | SKLPort5 | SKLPort6 | SKLPort7 |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f1]:  | -          | -            | 0.5      | 0.5      | -        | -        | -        | 0.5      | 0.5      | -        |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f2]:  | -          | -            | 0.25     | 0.25     | -        | -        | -        | 0.25     | 0.25     | -        |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
$ llvm-project/llvm/utils/llvm-mca-compare.py test-one-code-region.s test-two-code-regions.s --llvm-mca-binary=build/bin/llvm-mca
Input files:
[f1]: test-one-code-region.s
[f2]: test-two-code-regions.s
ITERATIONS: 100
-----------------------------------------
Code region: 1
+---------------------+--------+--------+
|                     | [f1]:  | [f2]:  |
+=====================+========+========+
| Instructions:       | 100    | 300    |
+---------------------+--------+--------+
| Total Cycles:       | 103    | 303    |
+---------------------+--------+--------+
| Total uOps:         | 100    | 300    |
+---------------------+--------+--------+
| Dispatch Width:     | 6      | 6      |
+---------------------+--------+--------+
| uOps Per Cycle:     | 0.97   | 0.99   |
+---------------------+--------+--------+
| IPC:                | 0.97   | 0.99   |
+---------------------+--------+--------+
| Block RThroughput:  | 0.25   | 0.75   |
+---------------------+--------+--------+
Resource pressure per iteration:
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
|        | SKLDivider | SKLFPDivider | SKLPort0 | SKLPort1 | SKLPort2 | SKLPort3 | SKLPort4 | SKLPort5 | SKLPort6 | SKLPort7 |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f1]:  | -          | -            | 0.25     | 0.25     | -        | -        | -        | 0.25     | 0.25     | -        |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f2]:  | -          | -            | 0.75     | 0.75     | -        | -        | -        | 0.75     | 0.75     | -        |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
-----------------------------------------
Code region: 2
+---------------------+--------+--------+
|                     | [f1]:  | [f2]:  |
+=====================+========+========+
| Instructions:       | -      | 200    |
+---------------------+--------+--------+
| Total Cycles:       | -      | 203    |
+---------------------+--------+--------+
| Total uOps:         | -      | 200    |
+---------------------+--------+--------+
| Dispatch Width:     | -      | 6      |
+---------------------+--------+--------+
| uOps Per Cycle:     | -      | 0.99   |
+---------------------+--------+--------+
| IPC:                | -      | 0.99   |
+---------------------+--------+--------+
| Block RThroughput:  | -      | 0.5    |
+---------------------+--------+--------+
Resource pressure per iteration:
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
|        | SKLDivider | SKLFPDivider | SKLPort0 | SKLPort1 | SKLPort2 | SKLPort3 | SKLPort4 | SKLPort5 | SKLPort6 | SKLPort7 |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f1]:  | -          | -            | -        | -        | -        | -        | -        | -        | -        | -        |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f2]:  | -          | -            | 0.5      | 0.5      | -        | -        | -        | 0.5      | 0.5      | -        |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
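The last example shows that when one file has fewer code regions than another, the missing cells are printed as "-". One way such padding could be done is sketched below, assuming each file's statistics are held as a list with one entry per code region; the function name align_regions and this data layout are hypothetical and not taken from the script.

```python
from itertools import zip_longest


def align_regions(per_file_values, placeholder="-"):
    """Turn per-file lists of region values into one row per code region.

    per_file_values holds one list per input file, each with one value per
    code region. Files with fewer regions are padded with the placeholder.
    """
    return [list(row) for row in zip_longest(*per_file_values, fillvalue=placeholder)]


# "Instructions" for test-one-code-region.s and test-two-code-regions.s.
instructions = [[100], [300, 200]]

for region, row in enumerate(align_regions(instructions), start=1):
    print("Code region:", region, row)
```

With this layout the number of printed regions follows the file that has the most of them, which matches the two-region output above.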
Used assembly files:
-use-mca ?