This is an archive of the discontinued LLVM Phabricator instance.

[AArch64] Use 16 bytes as preferred function alignment on Cortex-A57.
ClosedPublic

Authored by fhahn on Jul 3 2017, 8:54 AM.

Details

Reviewers

t.p.northover
javed.absar
kristof.beyls
sbaranga
mcrosier

Commits

rGd4550baf3b6d: [AArch64] Use 16 bytes as preferred function alignment on Cortex-A57.
rL307389: [AArch64] Use 16 bytes as preferred function alignment on Cortex-A57.

Summary

This change gives a 0.89% speedup in execution time, a 0.94% improvement
in benchmark scores, and a 0.62% increase in binary size on a Cortex-A57.
These numbers are the geomean results on a wide range of benchmarks from
the test-suite, SPEC2000, SPEC2006 and a range of proprietary suites.

The software optimization guide for the Cortex-A57 recommends 16-byte
branch alignment.
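
For readers less familiar with the units: the patch sets PrefFunctionAlignment = 4 and the test expects .p2align 4, so the value is a log2 exponent and 2^4 gives the 16 bytes in the title. The sketch below only illustrates that arithmetic. The helpers alignmentInBytes, alignUp, and paddingFor are hypothetical names, not LLVM APIs, and the example addresses are made up.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not an LLVM API): convert a log2 alignment, as used by
// PrefFunctionAlignment in this patch and by `.p2align N`, into bytes.
constexpr std::uint64_t alignmentInBytes(unsigned log2Align) {
  return std::uint64_t{1} << log2Align;
}

// Hypothetical helper: round addr up to the next multiple of 2^log2Align.
constexpr std::uint64_t alignUp(std::uint64_t addr, unsigned log2Align) {
  return (addr + alignmentInBytes(log2Align) - 1) &
         ~(alignmentInBytes(log2Align) - 1);
}

// Hypothetical helper: padding bytes inserted before a function at addr.
constexpr std::uint64_t paddingFor(std::uint64_t addr, unsigned log2Align) {
  return alignUp(addr, log2Align) - addr;
}

static_assert(alignmentInBytes(4) == 16, "PrefFunctionAlignment = 4 means 16 bytes");
static_assert(alignmentInBytes(2) == 4, ".p2align 2 means 4 bytes");
// Made-up example: a function that would otherwise start at 0x1004.
static_assert(alignUp(0x1004, 4) == 0x1010, "rounded up to a 16-byte boundary");
static_assert(paddingFor(0x1004, 4) == 12, "12 bytes of padding in this case");
static_assert(paddingFor(0x1010, 4) == 0, "already aligned, no padding");
```

Because AArch64 instructions are 4 bytes wide, a function start can need 0, 4, 8, or 12 bytes of padding to reach a 16-byte boundary. That is at least consistent with the small binary size increase reported above, although the summary does not break the number down.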

Diff Detail

Event Timeline

fhahn created this revision. Jul 3 2017, 8:54 AM

Herald added subscribers: rengolin, aemerson. Jul 3 2017, 8:54 AM

fhahn added a parent revision: D34951: [AArch64] Add test case for preferred function alignment (NFC). Jul 3 2017, 8:54 AM

From a coding standpoint this all LGTM. However, I'm going to defer to an A57 code owner for final approval.

LGTM from a Cortex-A57 point-of-view: this patch is generating code in line with the optimization guide recommendations and the benchmark numbers quoted look good.

This revision is now accepted and ready to land. Jul 7 2017, 12:29 AM

rebased

fhahn closed this revision. Jul 7 2017, 3:43 AM

Revision Contents

Path                                                    Size
lib/Target/AArch64/AArch64Subtarget.cpp                 1 line
test/CodeGen/AArch64/preferred-function-alignment.ll    2 lines

Diff 105614

lib/Target/AArch64/AArch64Subtarget.cpp

  void AArch64Subtarget::initializeProperties() {
  [75 lines not shown]
    case Cyclone:
      CacheLineSize = 64;
      PrefetchDistance = 280;
      MinPrefetchStride = 2048;
      MaxPrefetchIterationsAhead = 3;
      break;
    case CortexA57:
      MaxInterleaveFactor = 4;
+     PrefFunctionAlignment = 4;
      break;
    case ExynosM1:
      MaxInterleaveFactor = 4;
      MaxJumpTableSize = 8;
      PrefFunctionAlignment = 4;
      PrefLoopAlignment = 3;
      break;
    case Falkor:
  [210 lines not shown]

test/CodeGen/AArch64/preferred-function-alignment.ll

  ; RUN: llc -mtriple=aarch64-unknown-linux -mcpu=generic < %s | FileCheck --check-prefix=ALIGN2 %s
  ; RUN: llc -mtriple=aarch64-unknown-linux -mcpu=cortex-a35 < %s | FileCheck --check-prefix=ALIGN2 %s
  ; RUN: llc -mtriple=aarch64-unknown-linux -mcpu=cortex-a53 < %s | FileCheck --check-prefix=ALIGN2 %s
- ; RUN: llc -mtriple=aarch64-unknown-linux -mcpu=cortex-a57 < %s | FileCheck --check-prefix=ALIGN2 %s
  ; RUN: llc -mtriple=aarch64-unknown-linux -mcpu=cortex-a73 < %s | FileCheck --check-prefix=ALIGN2 %s
  ; RUN: llc -mtriple=aarch64-unknown-linux -mcpu=cyclone < %s | FileCheck --check-prefix=ALIGN2 %s
  ; RUN: llc -mtriple=aarch64-unknown-linux -mcpu=falkor < %s | FileCheck --check-prefix=ALIGN2 %s
  ; RUN: llc -mtriple=aarch64-unknown-linux -mcpu=kryo < %s | FileCheck --check-prefix=ALIGN2 %s
  ; RUN: llc -mtriple=aarch64-unknown-linux -mcpu=thunderx < %s | FileCheck --check-prefix=ALIGN3 %s
  ; RUN: llc -mtriple=aarch64-unknown-linux -mcpu=thunderxt81 < %s | FileCheck --check-prefix=ALIGN3 %s
  ; RUN: llc -mtriple=aarch64-unknown-linux -mcpu=thunderxt83 < %s | FileCheck --check-prefix=ALIGN3 %s
  ; RUN: llc -mtriple=aarch64-unknown-linux -mcpu=thunderxt88 < %s | FileCheck --check-prefix=ALIGN3 %s
  ; RUN: llc -mtriple=aarch64-unknown-linux -mcpu=thunderx2t99 < %s | FileCheck --check-prefix=ALIGN3 %s
+ ; RUN: llc -mtriple=aarch64-unknown-linux -mcpu=cortex-a57 < %s | FileCheck --check-prefix=ALIGN4 %s
  ; RUN: llc -mtriple=aarch64-unknown-linux -mcpu=cortex-a72 < %s | FileCheck --check-prefix=ALIGN4 %s
  ; RUN: llc -mtriple=aarch64-unknown-linux -mcpu=exynos-m1 < %s | FileCheck --check-prefix=ALIGN4 %s
  ; RUN: llc -mtriple=aarch64-unknown-linux -mcpu=exynos-m2 < %s | FileCheck --check-prefix=ALIGN4 %s
  ; RUN: llc -mtriple=aarch64-unknown-linux -mcpu=exynos-m3 < %s | FileCheck --check-prefix=ALIGN4 %s

  define void @test() {
    ret void
  }

  ; CHECK-LABEL: test
  ; ALIGN2: .p2align 2
  ; ALIGN3: .p2align 3
  ; ALIGN4: .p2align 4
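
The RUN lines in this test boil down to a CPU-to-directive table. Below is a minimal sketch of that mapping, assuming the emitted alignment is the larger of a 4-byte minimum and the CPU's preference. That rule is an assumption made for illustration, not something stated in this review. The enum and function names are hypothetical and not part of LLVM, and only a subset of the tested -mcpu values is modelled.

```cpp
#include <cassert>

// Hypothetical CPU list for illustration; covers only some -mcpu values from the test.
enum class Cpu { Generic, CortexA53, CortexA57, CortexA72, ThunderX, ExynosM1 };

// Assumed per-CPU preference as a log2 value; 0 stands for "no preference".
constexpr unsigned preferredLog2(Cpu cpu) {
  return (cpu == Cpu::CortexA57 || cpu == Cpu::CortexA72 || cpu == Cpu::ExynosM1)
             ? 4
             : (cpu == Cpu::ThunderX ? 3 : 0);
}

// Assumed floor: 2^2 = 4 bytes, the size of one AArch64 instruction.
constexpr unsigned kMinLog2 = 2;

// The N expected in `.p2align N` under the assumptions above.
constexpr unsigned expectedP2Align(Cpu cpu) {
  return preferredLog2(cpu) > kMinLog2 ? preferredLog2(cpu) : kMinLog2;
}

static_assert(expectedP2Align(Cpu::Generic) == 2, "ALIGN2");
static_assert(expectedP2Align(Cpu::CortexA53) == 2, "ALIGN2");
static_assert(expectedP2Align(Cpu::ThunderX) == 3, "ALIGN3");
static_assert(expectedP2Align(Cpu::CortexA57) == 4, "ALIGN4 after this patch");
static_assert(expectedP2Align(Cpu::CortexA72) == 4, "ALIGN4");
```

Before this patch, cortex-a57 was checked against ALIGN2. In the model above that corresponds to removing CortexA57 from the first condition of preferredLog2.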