This is an archive of the discontinued LLVM Phabricator instance.

[MCA] Add tests for IPC on Cortex-A55
ClosedPublic

Authored by asavonic on Mar 8 2021, 5:48 AM.

Details

Summary

The tests compare IPC statistics that MCA provides with IPC values
measured on Cortex-A55 hardware. For hardware tests, each snippet is
run in a loop unrolled by 1000, and IPC is measured by linux-perf.

Several tests do not match the hardware: the skewed ALU is not
supported, LDR seem to be missing a forwarding path.

Diff Detail

Event Timeline

asavonic created this revision.Mar 8 2021, 5:48 AM
asavonic requested review of this revision.Mar 8 2021, 5:48 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 8 2021, 5:48 AM
andreadb added inline comments.Mar 8 2021, 6:50 AM
llvm/test/tools/llvm-mca/AArch64/Cortex/IPC/A55-0.s
1 ↗(On Diff #328983)

Do you actually need --dispatch-stats?
If the goal of these tests is to simply check the IPC, then you should be able to simply pass flags --all-views=false -summary-view.

I also suggest to pass all these tests through the update_mca python script. If you only enable the summary-view, then the number of checks in the output will be very small.

llvm/test/tools/llvm-mca/AArch64/Cortex/IPC/A55-4.s
6–7 ↗(On Diff #328983)

Not sure if it might help in this case, but in general I recommend to have a look at whether some operand constraints might be defined using MCSchedPredicate defs in SchedWriteVariant.

Any chance we can use more descriptive testcases names? :-)

llvm/test/tools/llvm-mca/AArch64/Cortex/IPC/A55-4.s
6–7 ↗(On Diff #328983)

Is this something that is fixable in llvm? I thought it would not, in general, know the input value for an instruction.

RKSimon added a subscriber: RKSimon.Mar 9 2021, 1:15 AM
RKSimon added inline comments.
llvm/test/tools/llvm-mca/AArch64/Cortex/IPC/A55-4.s
6–7 ↗(On Diff #328983)

You might be able to do something with valuetracking - if you know the upper bits are zero/signsplat etc. But I don't think we have any knownbits/signbits support this late on in the compile.

andreadb added inline comments.Mar 9 2021, 1:44 AM
llvm/test/tools/llvm-mca/AArch64/Cortex/IPC/A55-4.s
6–7 ↗(On Diff #328983)

Yeah, I don’t think that there is anything that we can do about that specifically.

asavonic updated this revision to Diff 329284.Mar 9 2021, 3:53 AM

Renamed test files
Replaced "--dispatch-stats" with " --all-views=false --summary-view"

Any chance we can use more descriptive testcases names? :-)

Done :)

llvm/test/tools/llvm-mca/AArch64/Cortex/IPC/A55-0.s
1 ↗(On Diff #328983)

Do you actually need --dispatch-stats?
If the goal of these tests is to simply check the IPC, then you should be able to simply pass flags --all-views=false -summary-view.

Thanks. I replaced --dispatch-stats with --summary-view.

I also suggest to pass all these tests through the update_mca python script. If you only enable the summary-view, then the number of checks in the output will be very small.

I cannot figure out to how to combine both the reference checks and the checks
from update_mca. They match the same lines (IPC: X.XX), so we cannot use different
check prefixes:

# RUN: llvm-mca $(args) < %s | FileCheck %s --check-prefixes=CHECK,CHECK-IPC
# CHECK-IPC: IPC:
# CHECK-IPC-SAME: 1.00

add	w8, w8, #1

# CHECK:      Iterations:        1000  <-- this check does not match, scan continues from IPC: 1.00
# CHECK-NEXT: Instructions:      1000

I tried to add two FileCheck calls, but update_mca cannot handle this:

# RUN: llvm-mca $(args) < %s > %t.log
# RUN: FileCheck %s < %t.log
# RUN: FileCheck %s --check-prefix CHECK-IPC < %t.log

update_mca_test_checks.py:93: Warning: could not split tool and filecheck commands
andreadb accepted this revision.Mar 9 2021, 6:18 AM

One last question.
What is the plan with the SDIV test cases? I don't think that there is anything that we can do to improve that simulation, since it would require knowledge that isn't available at simulation time. The risk is to end up with a test which isn't very useful in practice (it will always be marked as XFAIL). In which case, I suggest to remove those DIV tests entirely.

This revision is now accepted and ready to land.Mar 9 2021, 6:18 AM

Ops.. I have accidentally "accepted" this patch.
Basically it LGTM if you remove the SDIV tests for now as I don't think that there is value in having them (unless you prove me wrong).

+1. LGTM, but it may be odd to have a test we don't think will ever be fixed in-tree.

One last question.
What is the plan with the SDIV test cases? I don't think that there is anything that we can do to improve that simulation, since it would require knowledge that isn't available at simulation time. The risk is to end up with a test which isn't very useful in practice (it will always be marked as XFAIL). In which case, I suggest to remove those DIV tests entirely.

They are not very useful as "tests", I agree, but they can be useful as a documentation,
highlighting the cases where MCA and hardware do not match. Although there is no point
in having two tests for the same issue, so we can remove one of them.

gbedwell added inline comments.Mar 9 2021, 7:08 AM
llvm/test/tools/llvm-mca/AArch64/Cortex/IPC/A55-0.s
1 ↗(On Diff #328983)

For the non-xfailing tests the auto-generated checks should serve the purpose as they will explicitly contain the line:

# CHECK-NEXT: IPC:               1.00

For the XFAIL tests, I think you'd have to do the trick of splitting the llvm-mca and FileCheck lines as above, in order to prevent the script from overwriting the checks. The advantage of the update script is that it automates the process for anyone running the script across (for example) the entire Aarch64 directory tree after making some change, but probably no big deal either way.

One last question.
What is the plan with the SDIV test cases? I don't think that there is anything that we can do to improve that simulation, since it would require knowledge that isn't available at simulation time. The risk is to end up with a test which isn't very useful in practice (it will always be marked as XFAIL). In which case, I suggest to remove those DIV tests entirely.

They are not very useful as "tests", I agree, but they can be useful as a documentation,
highlighting the cases where MCA and hardware do not match. Although there is no point
in having two tests for the same issue, so we can remove one of them.

True.
However, to be fair, the "issue" (so to say) is in the write definition from the Cortex-A55 scheduling model.
So, the scheduling model file is probably a better place where to document the issue about the DIV latency.
That being said, I don't really have a strong opinion on this, so it is fine by me if you want to still keep one of those tests.

asavonic updated this revision to Diff 333556.Mar 26 2021, 8:08 AM
asavonic edited the summary of this revision. (Show Details)
  • Adjusted SDIV operands to match the average latency specified in the model.
  • Added FP tests.
  • Added a test for instructions with OOO write and retire.

For the non XFAIL tests, I suggest to use the usual python script to auto generate CHECK directives.

Most of those tests are run with flags --all-views=false --summary-view, so the mca output is already minimal.
Consequently, the number of CHECK lines generated would be very small.

asavonic updated this revision to Diff 335873.Apr 7 2021, 11:11 AM

Enabled auto-generated checks for all tests except the XFAIL'ed ones.

andreadb accepted this revision.Apr 7 2021, 1:15 PM

LGTM

This revision was automatically updated to reflect the committed changes.