This is an archive of the discontinued LLVM Phabricator instance.

[llvm-mca] Compare multiple files
ClosedPublic

Authored by mmatic05 on Dec 6 2021, 2:24 AM.

Details

Summary

The script (llvm-mca-compare.py) uses the llvm-mca tool to print statistics to the console for multiple files.
The script requires the --llvm-mca-binary option (the relative path to the llvm-mca binary). The optional --args[="-option1=<arg> -option2=<arg> ..."] option (extra options forwarded to llvm-mca), -v, and -h can also be used.

The script is used as follows:

$ llvm-project/llvm/utils/llvm-mca-compare.py file1.s --llvm-mca-binary=build/bin/llvm-mca
Input files:
[f1]: file1.s

ITERATIONS: 100


-----------------------------------------
Code region: 1

+---------------------+--------+
|                     | [f1]:  |
+=====================+========+
| Instructions:       | 1100   |
+---------------------+--------+
| Total Cycles:       | 1097   |
+---------------------+--------+
| Total uOps:         | 1900   |
+---------------------+--------+
| Dispatch Width:     | 6      |
+---------------------+--------+
| uOps Per Cycle:     | 1.73   |
+---------------------+--------+
| IPC:                | 1.0    |
+---------------------+--------+
| Block RThroughput:  | 3.17   |
+---------------------+--------+

Resource pressure per iteration: 

+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
|        | SKLDivider | SKLFPDivider | SKLPort0 | SKLPort1 | SKLPort2 | SKLPort3 | SKLPort4 | SKLPort5 | SKLPort6 | SKLPort7 |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f1]:  | -          | -            | 2.72     | 2.76     | 1.66     | 1.68     | 3        | 2.76     | 2.76     | 1.66     |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
$ llvm-project/llvm/utils/llvm-mca-compare.py file1.s --llvm-mca-binary=build/bin/llvm-mca -v
run: $ build/bin/llvm-mca -json file1.s

Simulation Parameters: 
-march : x86_64
-mcpu : skylake
-mtriple : x86_64-unknown-linux-gnu


Input files:
[f1]: file1.s

ITERATIONS: 100


-----------------------------------------
Code region: 1

+---------------------+--------+
|                     | [f1]:  |
+=====================+========+
| Instructions:       | 1100   |
+---------------------+--------+
| Total Cycles:       | 1097   |
+---------------------+--------+
| Total uOps:         | 1900   |
+---------------------+--------+
| Dispatch Width:     | 6      |
+---------------------+--------+
| uOps Per Cycle:     | 1.73   |
+---------------------+--------+
| IPC:                | 1.0    |
+---------------------+--------+
| Block RThroughput:  | 3.17   |
+---------------------+--------+

Resource pressure per iteration: 

+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
|        | SKLDivider | SKLFPDivider | SKLPort0 | SKLPort1 | SKLPort2 | SKLPort3 | SKLPort4 | SKLPort5 | SKLPort6 | SKLPort7 |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f1]:  | -          | -            | 2.72     | 2.76     | 1.66     | 1.68     | 3        | 2.76     | 2.76     | 1.66     |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
$ llvm-project/llvm/utils/llvm-mca-compare.py file1.s --llvm-mca-binary=build/bin/llvm-mca --args="-dispatch=10 -noalias=false -iterations=300"  -v
run: $ build/bin/llvm-mca -dispatch=10 -noalias=false -iterations=300 -json file1.s

Simulation Parameters: 
-dispatch : 10
-march : x86_64
-mcpu : skylake
-mtriple : x86_64-unknown-linux-gnu
-noalias : False


Input files:
[f1]: file1.s

ITERATIONS: 300


-----------------------------------------
Code region: 1

+---------------------+--------+
|                     | [f1]:  |
+=====================+========+
| Instructions:       | 3300   |
+---------------------+--------+
| Total Cycles:       | 3097   |
+---------------------+--------+
| Total uOps:         | 5700   |
+---------------------+--------+
| Dispatch Width:     | 10     |
+---------------------+--------+
| uOps Per Cycle:     | 1.84   |
+---------------------+--------+
| IPC:                | 1.07   |
+---------------------+--------+
| Block RThroughput:  | 3      |
+---------------------+--------+

Resource pressure per iteration: 

+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
|        | SKLDivider | SKLFPDivider | SKLPort0 | SKLPort1 | SKLPort2 | SKLPort3 | SKLPort4 | SKLPort5 | SKLPort6 | SKLPort7 |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f1]:  | -          | -            | 2.75     | 2.75     | 1.67     | 1.67     | 3        | 2.75     | 2.75     | 1.67     |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
$ llvm-project/llvm/utils/llvm-mca-compare.py file1.s file2.s file3.s file4.s --llvm-mca-binary=build/bin/llvm-mca
Input files:
[f1]: file1.s
[f2]: file2.s
[f3]: file3.s
[f4]: file4.s

ITERATIONS: 100


-----------------------------------------
Code region: 1

+---------------------+--------+--------+--------+--------+
|                     | [f1]:  | [f2]:  | [f3]:  | [f4]:  |
+=====================+========+========+========+========+
| Instructions:       | 1100   | 600    | 2800   | 1200   |
+---------------------+--------+--------+--------+--------+
| Total Cycles:       | 1097   | 897    | 2192   | 1096   |
+---------------------+--------+--------+--------+--------+
| Total uOps:         | 1900   | 1400   | 4500   | 2200   |
+---------------------+--------+--------+--------+--------+
| Dispatch Width:     | 6      | 6      | 6      | 6      |
+---------------------+--------+--------+--------+--------+
| uOps Per Cycle:     | 1.73   | 1.56   | 2.05   | 2.01   |
+---------------------+--------+--------+--------+--------+
| IPC:                | 1.0    | 0.67   | 1.28   | 1.09   |
+---------------------+--------+--------+--------+--------+
| Block RThroughput:  | 3.17   | 2.33   | 10     | 3.67   |
+---------------------+--------+--------+--------+--------+

Resource pressure per iteration: 

+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
|        | SKLDivider | SKLFPDivider | SKLPort0 | SKLPort1 | SKLPort2 | SKLPort3 | SKLPort4 | SKLPort5 | SKLPort6 | SKLPort7 |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f1]:  | -          | -            | 2.72     | 2.76     | 1.66     | 1.68     | 3        | 2.76     | 2.76     | 1.66     |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f2]:  | -          | -            | 1.11     | 1.95     | 1.33     | 1.34     | 2        | 1.95     | 1.99     | 1.33     |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f3]:  | -          | -            | 4.71     | 4.72     | 7        | 7        | 10       | 4.73     | 4.84     | 7        |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f4]:  | -          | -            | 3        | 3        | 1.75     | 1.76     | 2        | 3.01     | 3.99     | 1.49     |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
$ llvm-project/llvm/utils/llvm-mca-compare.py test-two-code-regions.s test-two-code-regions-opt.s --llvm-mca-binary=build/bin/llvm-mca 
Input files:
[f1]: test-two-code-regions.s
[f2]: test-two-code-regions-opt.s

ITERATIONS: 100


-----------------------------------------
Code region: 1

+---------------------+--------+--------+
|                     | [f1]:  | [f2]:  |
+=====================+========+========+
| Instructions:       | 300    | 100    |
+---------------------+--------+--------+
| Total Cycles:       | 303    | 103    |
+---------------------+--------+--------+
| Total uOps:         | 300    | 100    |
+---------------------+--------+--------+
| Dispatch Width:     | 6      | 6      |
+---------------------+--------+--------+
| uOps Per Cycle:     | 0.99   | 0.97   |
+---------------------+--------+--------+
| IPC:                | 0.99   | 0.97   |
+---------------------+--------+--------+
| Block RThroughput:  | 0.75   | 0.25   |
+---------------------+--------+--------+

Resource pressure per iteration: 

+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
|        | SKLDivider | SKLFPDivider | SKLPort0 | SKLPort1 | SKLPort2 | SKLPort3 | SKLPort4 | SKLPort5 | SKLPort6 | SKLPort7 |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f1]:  | -          | -            | 0.75     | 0.75     | -        | -        | -        | 0.75     | 0.75     | -        |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f2]:  | -          | -            | 0.25     | 0.25     | -        | -        | -        | 0.25     | 0.25     | -        |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+



-----------------------------------------
Code region: 2

+---------------------+--------+--------+
|                     | [f1]:  | [f2]:  |
+=====================+========+========+
| Instructions:       | 200    | 100    |
+---------------------+--------+--------+
| Total Cycles:       | 203    | 103    |
+---------------------+--------+--------+
| Total uOps:         | 200    | 100    |
+---------------------+--------+--------+
| Dispatch Width:     | 6      | 6      |
+---------------------+--------+--------+
| uOps Per Cycle:     | 0.99   | 0.97   |
+---------------------+--------+--------+
| IPC:                | 0.99   | 0.97   |
+---------------------+--------+--------+
| Block RThroughput:  | 0.5    | 0.25   |
+---------------------+--------+--------+

Resource pressure per iteration: 

+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
|        | SKLDivider | SKLFPDivider | SKLPort0 | SKLPort1 | SKLPort2 | SKLPort3 | SKLPort4 | SKLPort5 | SKLPort6 | SKLPort7 |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f1]:  | -          | -            | 0.5      | 0.5      | -        | -        | -        | 0.5      | 0.5      | -        |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f2]:  | -          | -            | 0.25     | 0.25     | -        | -        | -        | 0.25     | 0.25     | -        |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
$ llvm-project/llvm/utils/llvm-mca-compare.py test-one-code-region.s test-two-code-regions.s --llvm-mca-binary=build/bin/llvm-mca 
Input files:
[f1]: test-one-code-region.s
[f2]: test-two-code-regions.s

ITERATIONS: 100


-----------------------------------------
Code region: 1

+---------------------+--------+--------+
|                     | [f1]:  | [f2]:  |
+=====================+========+========+
| Instructions:       | 100    | 300    |
+---------------------+--------+--------+
| Total Cycles:       | 103    | 303    |
+---------------------+--------+--------+
| Total uOps:         | 100    | 300    |
+---------------------+--------+--------+
| Dispatch Width:     | 6      | 6      |
+---------------------+--------+--------+
| uOps Per Cycle:     | 0.97   | 0.99   |
+---------------------+--------+--------+
| IPC:                | 0.97   | 0.99   |
+---------------------+--------+--------+
| Block RThroughput:  | 0.25   | 0.75   |
+---------------------+--------+--------+

Resource pressure per iteration: 

+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
|        | SKLDivider | SKLFPDivider | SKLPort0 | SKLPort1 | SKLPort2 | SKLPort3 | SKLPort4 | SKLPort5 | SKLPort6 | SKLPort7 |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f1]:  | -          | -            | 0.25     | 0.25     | -        | -        | -        | 0.25     | 0.25     | -        |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f2]:  | -          | -            | 0.75     | 0.75     | -        | -        | -        | 0.75     | 0.75     | -        |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+



-----------------------------------------
Code region: 2

+---------------------+--------+--------+
|                     | [f1]:  | [f2]:  |
+=====================+========+========+
| Instructions:       | -      | 200    |
+---------------------+--------+--------+
| Total Cycles:       | -      | 203    |
+---------------------+--------+--------+
| Total uOps:         | -      | 200    |
+---------------------+--------+--------+
| Dispatch Width:     | -      | 6      |
+---------------------+--------+--------+
| uOps Per Cycle:     | -      | 0.99   |
+---------------------+--------+--------+
| IPC:                | -      | 0.99   |
+---------------------+--------+--------+
| Block RThroughput:  | -      | 0.5    |
+---------------------+--------+--------+

Resource pressure per iteration: 

+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
|        | SKLDivider | SKLFPDivider | SKLPort0 | SKLPort1 | SKLPort2 | SKLPort3 | SKLPort4 | SKLPort5 | SKLPort6 | SKLPort7 |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f1]:  | -          | -            | -        | -        | -        | -        | -        | -        | -        | -        |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+
| [f2]:  | -          | -            | 0.5      | 0.5      | -        | -        | -        | 0.5      | 0.5      | -        |
+--------+------------+--------------+----------+----------+----------+----------+----------+----------+----------+----------+

Used assembly files:

Diff Detail

Event Timeline

mmatic05 created this revision.Dec 6 2021, 2:24 AM
mmatic05 requested review of this revision.Dec 6 2021, 2:24 AM
djtodoro added a subscriber: petarj.Dec 6 2021, 2:31 AM

Thanks for this! (I am not very familiar with this code, but the motivation sounds reasonable -- some nits included)

llvm/test/tools/llvm-mca/X86/Inputs/diff-option-file-3.s
65 ↗(On Diff #391996)

(https://github.com/mmatic05/llvm-project.git 00f463e835d4baa285bc488c9fcb859e0461e452) not needed

llvm/test/tools/llvm-mca/X86/Inputs/diff-option-file-4.s
45 ↗(On Diff #391996)

(https://github.com/mmatic05/llvm-project.git 00f463e835d4baa285bc488c9fcb859e0461e452) not needed

llvm/tools/llvm-mca/MachineCodePerformance.cpp
1 ↗(On Diff #391996)

This is missing the license information.

llvm/tools/llvm-mca/MachineCodePerformance.h
11 ↗(On Diff #391996)

I guess this should go into the cpp unit.

llvm/tools/llvm-mca/Views/ResourcePressureView.h
100 ↗(On Diff #391996)

SmallVectorImpl instead?

llvm/tools/llvm-mca/llvm-mca.cpp
461 ↗(On Diff #391996)

no need for the curly brackets here

592 ↗(On Diff #391996)

here as well

djtodoro added inline comments.Dec 6 2021, 2:41 AM
llvm/tools/llvm-mca/Views/SummaryView.cpp
114 ↗(On Diff #391996)

is there another way we can get this info? I don't like these uints passed by ref here...

Hi @mmatic05 ,

Thanks for contributing this patch. However, I have a few questions/concerns regarding your suggested design.

Is there a reason why this feature should be implemented as part of llvm-mca?

Wouldn't it be simpler to contribute this feature as a separate script or tool?

You could have a python script that simply drives multiple mca runs under the hood, and then collects the json output from each individual run. That same script would then also diff the various outputs and pretty-print the result either as normal text, or as yet-another json file.

That way, you would be able to provide your new feature without having to change llvm-mca.
Any extension/customization on the diffing logic would be fully transparent to the main llvm-mca driver.
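
A minimal sketch of the driver described above, under stated assumptions: the helper names (build_mca_command, run_mca, collect_reports) are hypothetical, and it is assumed that llvm-mca prints only the JSON document on stdout when -json is passed. Only flags supplied by the user are forwarded.

```python
import json
import shlex
import subprocess


def build_mca_command(mca_binary, asm_file, extra_args=""):
    """Return the argv list for one llvm-mca run that emits JSON."""
    # extra_args is treated as an opaque string and split the way a shell would.
    return [mca_binary, *shlex.split(extra_args), "-json", asm_file]


def run_mca(mca_binary, asm_file, extra_args=""):
    """Run llvm-mca once and return its parsed JSON report."""
    command = build_mca_command(mca_binary, asm_file, extra_args)
    # check=True raises CalledProcessError if llvm-mca rejects the input or flags.
    completed = subprocess.run(command, capture_output=True, text=True, check=True)
    # Assumption: stdout holds nothing but the JSON document.
    return json.loads(completed.stdout)


def collect_reports(mca_binary, asm_files, extra_args=""):
    """Map each input file to its parsed report, in command-line order."""
    return {path: run_mca(mca_binary, path, extra_args) for path in asm_files}
```

With this split, the diffing and pretty-printing logic only ever sees parsed reports, so it can change without touching llvm-mca itself.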

What do you think?

-Andrea

+1 To refactor this as a helper python script (llvm-mca-diff.py ?) we can add to llvm-project\llvm\utils - there might be some json output we're missing but I think everything else can be kept out of the exe

I didn't realise that you already have a python script to do the llvm-mca comparisons.

Would it be possible to contribute your LLVM-MCA-PRETTY-PRINTER python script instead?

+1, we can keep it within llvm-project/llvm/utils

Thanks a lot to everyone for the comments; I'm working on it.

mmatic05 updated this revision to Diff 392398.Dec 7 2021, 7:44 AM
  • addressed comments
  • added separate python script which calls llvm-mca tool and has -diff option to compare multiple files

Thanks for the update. Can you please update the summary?

djtodoro added inline comments.Dec 7 2021, 7:48 AM
llvm/utils/llvm-mca-pretty-printer/llvm-mca-pretty-printer.py
33 ↗(On Diff #392398)

-use-mca ?

39 ↗(On Diff #392398)

Can this be more descriptive?

djtodoro added inline comments.Dec 7 2021, 7:50 AM
llvm/utils/llvm-mca-pretty-printer/llvm-mca-pretty-printer.py
117 ↗(On Diff #392398)

RThroughput

mmatic05 edited the summary of this revision.Dec 7 2021, 7:51 AM
ntesic added a subscriber: ntesic.Dec 7 2021, 8:00 AM

Not to descend into bike shedding, but is llvm-mca-pretty-printer.py a suitable name?

llvm/utils/llvm-mca-pretty-printer/llvm-mca-pretty-printer.py
33 ↗(On Diff #392398)

If we follow the style used by the llvm\utils\update_*_test_checks.py scripts - this should be:

--llvm-mca-binary=<path to llvm-mca>

Also, it might be easier if this was not in a subdir - llvm/utils/llvm-mca-pretty-printer.py - to avoid path problems

mmatic05 updated this revision to Diff 392691.Dec 8 2021, 2:24 AM
  • addressed comments
mmatic05 edited the summary of this revision.Dec 8 2021, 2:31 AM
mmatic05 updated this revision to Diff 392700.Dec 8 2021, 3:32 AM
mmatic05 edited the summary of this revision.
  • addressed comments
djtodoro added inline comments.Dec 8 2021, 3:41 AM
llvm/utils/llvm-mca-compare-multiple-files.py
47 ↗(On Diff #392700)

we should be using one naming convention (either camelCase or under_scores), but I vote for underscores

53 ↗(On Diff #392700)

this newline is not needed

91 ↗(On Diff #392700)

I suggest that we check for this package as:

try:
  import termtables
except ImportError:
  print('error: termtables not found.')
  return
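
The snippet above uses return, so it only works inside a function. A small self-contained sketch of the same check follows; the helper name load_optional_package is hypothetical and not taken from the patch.

```python
import importlib


def load_optional_package(name):
    """Return the imported module, or None (after printing an error) if it is missing."""
    try:
        return importlib.import_module(name)
    except ImportError:
        print("error: {} not found.".format(name))
        return None
```

A caller could write termtables = load_optional_package("termtables") and stop early when the result is None.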
RKSimon added inline comments.Dec 8 2021, 5:28 AM
llvm/utils/llvm-mca-compare-multiple-files.py
30 ↗(On Diff #392700)

why is skylake the default? should it be 'native'?

andreadb added a comment.Dec 8 2021, 5:44 AM

If the user doesn't specify an -mtriple, then we shouldn't pass a default. llvm-mca should be able to deduce the default target triple (i.e. sys::getDefaultTargetTriple()).
It should not be an x86 triple by default.

Same for the cpu, you shouldn't default it to 'skylake' (why skylake? Also, x86 may not be in the list of available targets).
Basically, that flag should not be passed to llvm-mca if the user doesn't explicitly set it. llvm-mca would default mcpu to llvm::sys::getHostCPUName().

About the flags:
do we actually need to specify -diff on the command line? Isn't it implicit if we pass multiple assembly files in input?

Also, (as a feature request) would it be possible to also get a diff of the resource pressure (for the iteration)? Comparing the summary view is important. However, I believe that most users will also want to use your script to quickly see the differences in resource pressure.

It would be very useful if we could also pass extra flags in input to llvm-mca.
As you know, flags like -noalias and/or -dispatch may affect the outcome of a simulation. When doing a diff, users should be allowed to pass extra args to llvm-mca.

This also leads to a (minor) issue with the output of this script.
The simulation setup is currently not printed out. The only context that we get is:

  • which files were passed in input to the script
  • the total number of iterations.

That information may not be enough to reproduce an experiment. For example, you would need to know what the other flags were (example: which subtarget was specified).

I think that we should print out which flags were actually passed in input to llvm-mca. If people are concerned about the verbosity of it, then we can do it only if the user has asked for more verbose output (e.g. the -v (verbose) flag was passed in input to the script).

Last: have you considered also diffing .json files? That feature could be added in the future. It would allow skipping the first step (i.e. the actual simulation step) and would simply compare the .json outputs if they are compatible (i.e. the target settings are the same). This is just an idea; if we think it would be useful, it could be contributed as a follow-up patch.

I hope it makes sense.

mmatic05 updated this revision to Diff 393119.Dec 9 2021, 5:29 AM
mmatic05 edited the summary of this revision.
  • addressed comments

Thanks a lot, everyone, for the suggestions and comments.
We do not have information about the resource pressure (per iteration) in the json object produced by the llvm-mca tool. This could be added in a future patch that parses stdout. I would also leave printing in json format for a later patch.

Thanks a lot, @mmatic05!

How about llvm/utils/llvm-mca-compare-multiple-files.py --> llvm/utils/llvm-mca-compare.py ?

  • addressed comments

Thanks a lot, everyone, for the suggestions and comments.
We do not have information about the resource pressure (per iteration) in the json object produced by the llvm-mca tool. This could be added in a future patch that parses stdout. I would also leave printing in json format for a later patch.

That's not correct.

Information about the per-"iteration resource pressure" is already in the json output.

Example:

addq    %rax, %rcx
addq    %rax, %rdx
addq    %rax, %rsi
addq    %rax, %rdi

Assuming the following simulation parameters:

"SimulationParameters": {
  "-march": "x86_64",
  "-mcpu": "znver1",
  "-mtriple": "x86_64-unknown-linux-gnu"
},

For 10 iterations, you get the following json output for the resource pressure view:

"Instructions": [
  "addq\t%rax, %rcx",
  "addq\t%rax, %rdx",
  "addq\t%rax, %rsi",
  "addq\t%rax, %rdi"
],
"ResourcePressureView": {
  "ResourcePressureInfo": [
    {
      "InstructionIndex": 0,
      "ResourceIndex": 5,
      "ResourceUsage": 1
    },
    {
      "InstructionIndex": 1,
      "ResourceIndex": 4,
      "ResourceUsage": 1
    },
    {
      "InstructionIndex": 2,
      "ResourceIndex": 3,
      "ResourceUsage": 1
    },
    {
      "InstructionIndex": 3,
      "ResourceIndex": 2,
      "ResourceUsage": 1
    },
    {
      "InstructionIndex": 4,
      "ResourceIndex": 2,
      "ResourceUsage": 1
    },
    {
      "InstructionIndex": 4,
      "ResourceIndex": 3,
      "ResourceUsage": 1
    },
    {
      "InstructionIndex": 4,
      "ResourceIndex": 4,
      "ResourceUsage": 1
    },
    {
      "InstructionIndex": 4,
      "ResourceIndex": 5,
      "ResourceUsage": 1
    }
  ]
},

There are four instructions and five indices. The last index is always for the full iteration. The index value for the iteration matches the length (in number of instructions) of the code snippet in input. So, it is 4 for that example.
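
The indexing rule above can be sketched as follows. This is only an illustration: the function name iteration_resource_pressure is hypothetical, and the data is copied from the example in this comment.

```python
def iteration_resource_pressure(instructions, pressure_info):
    """Return {ResourceIndex: ResourceUsage} for the whole-iteration entries."""
    # The iteration entries use an index equal to the number of instructions.
    iteration_index = len(instructions)
    pressure = {}
    for entry in pressure_info:
        if entry["InstructionIndex"] == iteration_index:
            pressure[entry["ResourceIndex"]] = entry["ResourceUsage"]
    return pressure


instructions = [
    "addq\t%rax, %rcx",
    "addq\t%rax, %rdx",
    "addq\t%rax, %rsi",
    "addq\t%rax, %rdi",
]
pressure_info = [
    {"InstructionIndex": 0, "ResourceIndex": 5, "ResourceUsage": 1},
    {"InstructionIndex": 1, "ResourceIndex": 4, "ResourceUsage": 1},
    {"InstructionIndex": 2, "ResourceIndex": 3, "ResourceUsage": 1},
    {"InstructionIndex": 3, "ResourceIndex": 2, "ResourceUsage": 1},
    {"InstructionIndex": 4, "ResourceIndex": 2, "ResourceUsage": 1},
    {"InstructionIndex": 4, "ResourceIndex": 3, "ResourceUsage": 1},
    {"InstructionIndex": 4, "ResourceIndex": 4, "ResourceUsage": 1},
    {"InstructionIndex": 4, "ResourceIndex": 5, "ResourceUsage": 1},
]

# Only the four entries with InstructionIndex == 4 are kept.
print(iteration_resource_pressure(instructions, pressure_info))
# prints {2: 1, 3: 1, 4: 1, 5: 1}
```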

andreadb added inline comments.Dec 9 2021, 6:47 AM
llvm/utils/llvm-mca-compare-multiple-files.py
34 ↗(On Diff #393119)

I am not entirely sure if this is the right approach. I suggest delegating the parsing of llvm-mca specific arguments to the llvm-mca driver itself. I would only parse here the flags that are specific to your script.

In one of my previous posts, I was essentially suggesting to only parse file names, and simply forward all other options to each llvm-mca invocation. Basically, llvm-mca arguments could be simply treated like a generic string to append to the llvm-mca invocation. You don't need to know what is being passed in input. If the llvm-mca run is successful, then those args made sense. You can always inspect the json file for the full set of simulation parameters and the summary view information (for the number of iterations).

As it stands, we parse most llvm-mca arguments here, and then we let llvm-mca do the same again. This design is not ideal because it forces us to mirror changes to (non-view related) llvm-mca flags in this script too. This script doesn't need to know about the semantics of llvm-mca specific flags. Once the set of input code snippets is known, we simply forward everything else to llvm-mca.

That way, you avoid duplicating the logic that parses arguments, and you can probably get rid of verify_program_inputs.
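
One way to parse only the script-specific options is sketched below. It uses the option names from the revision summary (--llvm-mca-binary, --args, -v); the function name parse_script_args and the destination names are hypothetical and not necessarily what the committed script uses.

```python
import argparse


def parse_script_args(argv):
    """Parse only the options that belong to the comparison script itself."""
    parser = argparse.ArgumentParser(
        description="Compare llvm-mca statistics for several assembly files."
    )
    parser.add_argument("file_names", nargs="+", help="assembly files to compare")
    parser.add_argument(
        "--llvm-mca-binary",
        required=True,
        metavar="<path>",
        help="path to the llvm-mca binary",
    )
    # The value is never interpreted here; it is forwarded to llvm-mca as-is.
    parser.add_argument(
        "--args",
        default="",
        metavar="<llvm-mca args>",
        help="extra options appended to every llvm-mca invocation",
    )
    parser.add_argument("-v", dest="verbose", action="store_true", help="verbose output")
    return parser.parse_args(argv)
```

Because --args stays an opaque string, a new llvm-mca flag needs no change in the script; an invalid flag simply makes the llvm-mca run fail.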

160–162 ↗(On Diff #393119)

We should get this information from the json output.

See json object "SimulationParameters". Example:

"SimulationParameters": {
  "-march": "x86_64",
  "-mcpu": "znver1",
  "-mtriple": "x86_64-unknown-linux-gnu"
},

That json object also stores information about flags like -noalias and -dispatch. Example:

"SimulationParameters": {
  "-dispatch": 2,
  "-march": "x86_64",
  "-mcpu": "znver1",
  "-mtriple": "x86_64-unknown-linux-gnu",
  "-noalias": false
},

The advantage of doing things this way is that we know exactly which "native" subtarget was selected (for the case where no -mcpu is passed in input).
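
As a sketch of this suggestion, the parameters can be read back from the parsed report instead of being tracked by the script. The function name format_simulation_parameters is hypothetical, and placing "SimulationParameters" at the top level of the report is an assumption; the example data is copied from this comment.

```python
import json


def format_simulation_parameters(report):
    """Return one 'flag : value' line per simulation parameter, sorted by flag."""
    # Assumption: the object sits at the top level of the parsed report.
    params = report.get("SimulationParameters", {})
    return ["{} : {}".format(flag, params[flag]) for flag in sorted(params)]


report = json.loads(
    """
    {
      "SimulationParameters": {
        "-dispatch": 2,
        "-march": "x86_64",
        "-mcpu": "znver1",
        "-mtriple": "x86_64-unknown-linux-gnu",
        "-noalias": false
      }
    }
    """
)

for line in format_simulation_parameters(report):
    print(line)
```

The JSON value false is parsed as the Python value False, so the -noalias line reads "-noalias : False".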

181–192 ↗(On Diff #393119)

What happens with asm files declaring multiple code regions? Your script only processes the first code region. I am not sure how we should deal with those.

If the idea is that this script shouldn't support multiple regions, then we should warn if one of the json outputs provides information about more than one code region.
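
If the script were to support several regions rather than warn, one option is to pad the files that have fewer regions with a placeholder. The sketch below assumes each file has already been reduced to a list with one value per code region; the function name align_code_regions and the "-" placeholder are illustrative choices.

```python
def align_code_regions(regions_per_file, placeholder="-"):
    """Return one row per code region, with a placeholder where a file has no such region."""
    region_count = max((len(regions) for regions in regions_per_file), default=0)
    rows = []
    for index in range(region_count):
        rows.append(
            [
                regions[index] if index < len(regions) else placeholder
                for regions in regions_per_file
            ]
        )
    return rows
```

For example, instruction counts of [100] for the first file and [300, 200] for the second give the rows [100, 300] and ["-", 200].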

mmatic05 updated this revision to Diff 393455.Dec 10 2021, 5:57 AM
mmatic05 edited the summary of this revision.
  • addressed comments

Thanks a lot, everyone, for the suggestions and comments!

Thanks a lot Milica!

Personally I don't have other questions/requests about this script. So, the patch looks good to me.

If @RKSimon and @djtodoro are also happy with it, then feel free to commit.

-Andrea

RKSimon added inline comments.Dec 13 2021, 8:35 AM
llvm/utils/llvm-mca-compare.py
2

I think we're supposed to set this to python3 for all new scripts?

djtodoro added inline comments.Dec 15 2021, 12:50 AM
llvm/utils/llvm-mca-compare.py
2

I guess, but I see that most of the scripts still use the #!/usr/bin/env python for the main interpreter.

mmatic05 updated this revision to Diff 394499.Dec 15 2021, 2:35 AM
  • Change "#!/usr/bin/env python" to "#!/usr/bin/env python3"
RKSimon accepted this revision.Dec 15 2021, 2:56 AM

LGTM with one minor - cheers

llvm/utils/llvm-mca-compare.py
120

code_regeions_len --> code_regions_len

This revision is now accepted and ready to land.Dec 15 2021, 2:56 AM
mmatic05 updated this revision to Diff 394508.Dec 15 2021, 3:13 AM
  • Change "code_regeions_len" to "code_regions_len"
djtodoro accepted this revision.Dec 15 2021, 4:06 AM

lgtm, thanks!

andreadb accepted this revision.Dec 15 2021, 8:31 AM
This revision was landed with ongoing or failed builds.Dec 21 2021, 2:57 AM
This revision was automatically updated to reflect the committed changes.