This is an archive of the discontinued LLVM Phabricator instance.

AArch64 Cortex-A57 FDIV/FSQRT scheduling fix (W-unit)
ClosedPublic

Authored by andrew.zhogin on Dec 7 2016, 7:22 AM.

Details

Summary

According to Cortex-A57 doc (http://infocenter.arm.com/help/topic/com.arm.doc.uan0015b/Cortex_A57_Software_Optimization_Guide_external.pdf), FDIV/FSQRT instructions should use F0 unit (W-unit in AArch64SchedA57.td, the same as cryptography instructions), not F1 unit (X-unit in td, like ASIMD absolute diff accum SABA/UABA).

This patch changes FDIV/FSQRT scheduling declarations to use A57UnitW instead of A57UnitX. Also, latencies for those instructions are corrected.

Diff Detail

Event Timeline

andrew.zhogin retitled this revision from to AArch64 Cortex-A57 FDIV/FSQRT scheduling fix (W-unit).
andrew.zhogin updated this object.
andrew.zhogin added subscribers: asl, llvm-commits.
evandro added a subscriber: evandro.Dec 7 2016, 2:57 PM
rovka added a subscriber: rovka.Dec 8 2016, 1:56 AM
rovka added inline comments.
test/CodeGen/AArch64/arm64-misched-A57-div-fix.ll
6

Tests shouldn't rely on grep, please use FileCheck instead.
http://llvm.org/docs/TestingGuide.html#writing-new-regression-tests

FileCheck instead of grep in the regression test.

andrew.zhogin marked an inline comment as done.Dec 8 2016, 3:55 AM
rovka added a comment.Dec 16 2016, 5:48 AM

I don't see anything wrong with this, it matches my reading of the manual. Do you have any performance numbers to share? (e.g. test-suite results)

I don't see anything wrong with this, it matches my reading of the manual. Do you have any performance numbers to share? (e.g. test-suite results)

No, don't have any.

Can you run any benchmark and share relative numbers? Despite the fact that the change is obvious, you may uncover pre-existing bugs in the compiler that need addressing before we can change this. The LLVM test-suite, SPEC, EEMBC, anything would help to have an idea.

cheers,
--renato

test/CodeGen/AArch64/arm64-misched-A57-div-fix.ll
6

It should also check for the unit you want to appear, not just the one you don't. :)

Test corrections: added check for right scheduling unit and latency.

andrew.zhogin marked an inline comment as done.Dec 19 2016, 6:36 PM

Can you run any benchmark and share relative numbers? Despite the fact that the change is obvious, you may uncover pre-existing bugs in the compiler that need addressing before we can change this. The LLVM test-suite, SPEC, EEMBC, anything would help to have an idea.

I don't have access to Cortex-A57 system.
I did only check-all runs at my X86-64 desktop (green).

I don't have access to Cortex-A57 system.
I did only check-all runs at my X86-64 desktop (green).

Unfortunately, check-all will make no difference here. You need benchmarks, not conformance tests.

I can run these for you, but it'll take a while, since we're moving labs and my benchmarking machine is being upgraded at the moment. We can probably have something for beginning next year, unless someone else can do that quicker.

cheers,
--renato

Can you run any benchmark and share relative numbers? Despite the fact that the change is obvious, you may uncover pre-existing bugs in the compiler that need addressing before we can change this. The LLVM test-suite, SPEC, EEMBC, anything would help to have an idea.

I don't have access to Cortex-A57 system.
I did only check-all runs at my X86-64 desktop (green).

I've run this patch on top of r290135 on test-suite, spec2K, spec2K6 and a few internal benchmarks.
I saw 2 minor regressions and 2 bigger improvements on internal benchmarks.
Across test-suite, spec2K and spec2K6, all results were within noise levels.

rengolin accepted this revision.Dec 21 2016, 4:09 AM
rengolin added a reviewer: rengolin.

Thanks Kristof,

I think that address my worries. :)

LGTM, thanks!

This revision is now accepted and ready to land.Dec 21 2016, 4:09 AM

Thanks Kristof,

I think that address my worries. :)

LGTM, thanks!

Thanks for the review.

I don't have commit access so could you please commit the patch?

rengolin closed this revision.Dec 23 2016, 5:02 AM

Committed in r290426. Cheers!