This is an archive of the discontinued LLVM Phabricator instance.

[Scheduling][ARM] Consistently enable PostRA Machine scheduling
ClosedPublic

Authored by dmgreen on Nov 3 2019, 12:59 PM.

Details

Reviewers

t.p.northover
atrick
MatzeB
samparker

Commits

rG7d9af03ff7a0: [Scheduling][ARM] Consistently enable PostRA Machine scheduling

Summary

In the ARM backend, for historical reasons, only some targets use Machine Scheduling. The rest use the old list scheduler, as they use itineraries and the list scheduler seems to produce better code (and does not crash by running out of registers on v6m code). So whether to use the MIScheduler or not is checked at runtime from the subtarget features.

This is fine, except for post-ra scheduling. Whether to use the old post-ra list scheduler or the post-ra machine scheduler is decided as the pass manager is set up, in ARM's case from a newly constructed subtarget. In some situations, like LTO, this subtarget won't include the correct cpu, so the wrong option can be picked. This can have a surprising effect on performance.
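The mismatch described above can be illustrated with a toy model. This is only a sketch: `PostRAPass`, `cpuUsesMISched`, `pickAtPipelineSetup`, `pickPerFunction` and `setupChoiceIsWrong` are hypothetical names invented for illustration, not LLVM APIs, and the rule that only "cortex-m33" enables use-misched is an assumption made to keep the example small.

```cpp
#include <cassert>
#include <string>

// Hypothetical, heavily simplified model; none of these names are LLVM APIs.
enum class PostRAPass { ListScheduler, MachineScheduler };

// Assumption for this sketch: only "cortex-m33" has the use-misched feature.
bool cpuUsesMISched(const std::string &CPU) { return CPU == "cortex-m33"; }

// Old scheme: one decision, taken while the pipeline is built, from the CPU
// string the target machine was created with (which may be empty under LTO).
PostRAPass pickAtPipelineSetup(const std::string &TargetMachineCPU) {
  return cpuUsesMISched(TargetMachineCPU) ? PostRAPass::MachineScheduler
                                          : PostRAPass::ListScheduler;
}

// Scheme after the patch: the decision is taken per function, from the CPU
// recorded on that function.
PostRAPass pickPerFunction(const std::string &FunctionCPU) {
  return cpuUsesMISched(FunctionCPU) ? PostRAPass::MachineScheduler
                                     : PostRAPass::ListScheduler;
}

// True when the setup-time choice disagrees with what the function needs.
bool setupChoiceIsWrong(const std::string &TargetMachineCPU,
                        const std::string &FunctionCPU) {
  return pickAtPipelineSetup(TargetMachineCPU) != pickPerFunction(FunctionCPU);
}
```

In this model, a target machine created without a CPU string picks the list scheduler even for a function whose own CPU would want the machine scheduler, which is the kind of silent mismatch the summary attributes to LTO.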

To fix that, this patch overrides targetSchedulesPostRAScheduling and addPreSched2 in the ARM backend, adding _both_ post-ra schedulers and picking at runtime which to execute. To pick between the two I've had to add an enablePostRAMachineScheduler() method that normally returns enableMachineScheduler() && enablePostRAScheduler(), which can be overridden to enable just one of PostRAMachineScheduler vs PostRAScheduler.
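A minimal sketch of how the two hooks interact, assuming a cut-down class hierarchy: `ToySubtargetInfo`, `ToyARMSubtarget` and `numPostRASchedulersThatRun` are hypothetical stand-ins, and the three boolean fields replace the real feature queries (the real ARM enableMachineScheduler() has further conditions that are left out here).

```cpp
#include <cassert>

// Hypothetical miniature of the subtarget hooks; not the real LLVM classes.
struct ToySubtargetInfo {
  virtual ~ToySubtargetInfo() = default;
  virtual bool enableMachineScheduler() const { return false; }
  virtual bool enablePostRAScheduler() const { return false; }
  // Default described in the summary: both other hooks must agree.
  virtual bool enablePostRAMachineScheduler() const {
    return enableMachineScheduler() && enablePostRAScheduler();
  }
};

// ARM-like override: the two post-ra hooks are made mutually exclusive.
struct ToyARMSubtarget : ToySubtargetInfo {
  bool UseMISched;    // stands in for the use-misched feature
  bool DisablePostRA; // stands in for disablePostRAScheduler()
  bool Thumb1Only;    // stands in for isThumb1Only()
  ToyARMSubtarget(bool MISched, bool Disable, bool Thumb1)
      : UseMISched(MISched), DisablePostRA(Disable), Thumb1Only(Thumb1) {}

  bool enableMachineScheduler() const override { return UseMISched; }
  bool enablePostRAScheduler() const override {
    if (enableMachineScheduler())
      return false; // the machine scheduler handles post-ra instead
    if (DisablePostRA)
      return false;
    return !Thumb1Only;
  }
  bool enablePostRAMachineScheduler() const override {
    if (!enableMachineScheduler())
      return false; // the list scheduler handles post-ra instead
    if (DisablePostRA)
      return false;
    return !Thumb1Only;
  }
};

// Both passes sit in the pipeline; each asks the subtarget whether to run.
int numPostRASchedulersThatRun(const ToySubtargetInfo &ST) {
  return int(ST.enablePostRAMachineScheduler()) +
         int(ST.enablePostRAScheduler());
}
```

With the ARM-like overrides, at most one of the two passes does any work for a given function, even though both are always present in the pipeline.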

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

dmgreen created this revision. Nov 3 2019, 12:59 PM

Herald added a project: Restricted Project. Nov 3 2019, 12:59 PM

Herald added subscribers: dexonsmith, javed.absar, hiraditya and 2 others.

samparker added inline comments. Nov 4 2019, 2:49 AM

llvm/lib/Target/ARM/ARMSubtarget.cpp
389	I know this is the existing logic, but what are we trying to do here with respect to IT blocks? If we're worried about IT blocks, like the comment suggests, shouldn't we be returning isThumb1Only()? Also, if we're in arm mode then we won't have them, but are we still concerned about predicated instructions (I assume not)?

dmgreen marked an inline comment as done. Nov 4 2019, 2:55 AM

dmgreen added inline comments.

llvm/lib/Target/ARM/ARMSubtarget.cpp
389	I think the comment should really say something like "Thumb1 cores are too simple to benefit from scheduling". I'm not sure what it means by IT blocks. I will change that. I don't think it would be a particularly bad thing to do post-ra scheduling on thumb1 cores, it just won't benefit from it either, so we might as well save the compile time.

Updated comment

samparker added inline comments. Nov 4 2019, 3:20 AM

llvm/lib/Target/ARM/ARMTargetMachine.cpp
520	nit: opportunity
522	Should we instead still skip this at -O0?

Only enable when getOptLevel() != CodeGenOpt::None

LGTM

This revision is now accepted and ready to land. Nov 4 2019, 3:38 AM

Closed by commit rG7d9af03ff7a0: [Scheduling][ARM] Consistently enable PostRA Machine scheduling (authored by dmgreen). Nov 5 2019, 2:45 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path (Size)

llvm/include/llvm/CodeGen/TargetSubtargetInfo.h (4 lines)
llvm/lib/CodeGen/MachineScheduler.cpp (2 lines)
llvm/lib/CodeGen/TargetSubtargetInfo.cpp (4 lines)
llvm/lib/Target/ARM/ (3 lines, 12 lines, 2 lines, 16 lines)
llvm/test/CodeGen/ARM/O3-pipeline.ll (1 line)
llvm/test/CodeGen/ARM/cortex-a57-misched-ldm-wrback.ll (2 lines)
llvm/test/CodeGen/ARM/cortex-a57-misched-ldm.ll (2 lines)
llvm/test/CodeGen/ARM/cortex-a57-misched-stm-wrback.ll (2 lines)
llvm/test/CodeGen/ARM/cortex-a57-misched-stm.ll (2 lines)
llvm/test/CodeGen/ARM/cortex-a57-misched-vldm-wrback.ll (2 lines)
llvm/test/CodeGen/ARM/cortex-a57-misched-vldm.ll (2 lines)
llvm/test/CodeGen/ARM/cortex-a57-misched-vstm-wrback.ll (2 lines)
llvm/test/CodeGen/ARM/cortex-a57-misched-vstm.ll (2 lines)
llvm/test/CodeGen/ARM/postrasched.ll (30 lines)

Diff 227835

llvm/include/llvm/CodeGen/TargetSubtargetInfo.h

@@ ... @@ public:
   virtual bool enableJoinGlobalCopies() const;

   /// True if the subtarget should run a scheduler after register allocation.
   ///
   /// By default this queries the PostRAScheduling bit in the scheduling model
   /// which is the preferred way to influence this.
   virtual bool enablePostRAScheduler() const;

+  /// True if the subtarget should run a machine scheduler after register
+  /// allocation.
+  virtual bool enablePostRAMachineScheduler() const;
+
   /// True if the subtarget should run the atomic expansion pass.
   virtual bool enableAtomicExpand() const;

   /// True if the subtarget should run the indirectbr expansion pass.
   virtual bool enableIndirectBrExpand() const;

   /// Override generic scheduling policy within a region.
   ///

llvm/lib/CodeGen/MachineScheduler.cpp

@@ ... @@
 bool PostMachineScheduler::runOnMachineFunction(MachineFunction &mf) {
   if (skipFunction(mf.getFunction()))
     return false;

   if (EnablePostRAMachineSched.getNumOccurrences()) {
     if (!EnablePostRAMachineSched)
       return false;
-  } else if (!mf.getSubtarget().enablePostRAScheduler()) {
+  } else if (!mf.getSubtarget().enablePostRAMachineScheduler()) {
     LLVM_DEBUG(dbgs() << "Subtarget disables post-MI-sched.\n");
     return false;
   }
   LLVM_DEBUG(dbgs() << "Before post-MI-sched:\n"; mf.print(dbgs()));

   // Initialize the context of the pass.
   MF = &mf;
   MLI = &getAnalysis<MachineLoopInfo>();

llvm/lib/CodeGen/TargetSubtargetInfo.cpp

@@ ... @@
 bool TargetSubtargetInfo::enableAdvancedRASplitCost() const {
   return false;
 }

 bool TargetSubtargetInfo::enablePostRAScheduler() const {
   return getSchedModel().PostRAScheduler;
 }

+bool TargetSubtargetInfo::enablePostRAMachineScheduler() const {
+  return enableMachineScheduler() && enablePostRAScheduler();
+}
+
 bool TargetSubtargetInfo::useAA() const {
   return false;
 }

 void TargetSubtargetInfo::mirFileLoaded(MachineFunction &MF) const { }

llvm/lib/Target/ARM/ARMSubtarget.h

@@ ... @@ public:
   unsigned getMispredictionPenalty() const;

   /// Returns true if machine scheduler should be enabled.
   bool enableMachineScheduler() const override;

   /// True for some subtargets at > -O0.
   bool enablePostRAScheduler() const override;

+  /// True for some subtargets at > -O0.
+  bool enablePostRAMachineScheduler() const override;
+
   /// Enable use of alias analysis during code generation (during MI
   /// scheduling, DAGCombine, etc.).
   bool useAA() const override { return UseAA; }

   // enableAtomicExpand- True if we need to expand our atomics.
   bool enableAtomicExpand() const override;

   /// getInstrItins - Return the instruction itineraries based on subtarget

llvm/lib/Target/ARM/ARMSubtarget.cpp

@@ ... @@ if (isMClass() && hasMinSize())
     return false;
   // Enable the MachineScheduler before register allocation for subtargets
   // with the use-misched feature.
   return useMachineScheduler();
 }

 // This overrides the PostRAScheduler bit in the SchedModel for any CPU.
 bool ARMSubtarget::enablePostRAScheduler() const {
+  if (enableMachineScheduler())
+    return false;
+  if (disablePostRAScheduler())
+    return false;
+  // Thumb1 cores will generally not benefit from post-ra scheduling
+  return !isThumb1Only();
+}
+
+bool ARMSubtarget::enablePostRAMachineScheduler() const {
+  if (!enableMachineScheduler())
+    return false;
   if (disablePostRAScheduler())
     return false;
-  // Don't reschedule potential IT blocks.
   return !isThumb1Only();
 }

 bool ARMSubtarget::enableAtomicExpand() const { return hasAnyDataBarrier(); }

 bool ARMSubtarget::useStride4VFPs() const {
   // For general targets, the prologue can grow when VFPs are allocated with
   // stride 4 (more vpush instructions). But WatchOS uses a compact unwind

llvm/lib/Target/ARM/ARMTargetMachine.h

@@ ... @@ bool isTargetHardFloat() const {
     return TargetTriple.getEnvironment() == Triple::GNUEABIHF ||
            TargetTriple.getEnvironment() == Triple::MuslEABIHF ||
            TargetTriple.getEnvironment() == Triple::EABIHF ||
            (TargetTriple.isOSBinFormatMachO() &&
             TargetTriple.getSubArch() == Triple::ARMSubArch_v7em) ||
            TargetTriple.isOSWindows() ||
            TargetABI == ARMBaseTargetMachine::ARM_ABI_AAPCS16;
   }
+
+  bool targetSchedulesPostRAScheduling() const override { return true; };
 };

 /// ARM/Thumb little endian target machine.
 ///
 class ARMLETargetMachine : public ARMBaseTargetMachine {
 public:
   ARMLETargetMachine(const Target &T, const Triple &TT, StringRef CPU,
                      StringRef FS, const TargetOptions &Options,

llvm/lib/Target/ARM/ARMTargetMachine.cpp

@@ ... @@ ARMBETargetMachine::ARMBETargetMachine(const Target &T, const Triple &TT,
     : ARMBaseTargetMachine(T, TT, CPU, FS, Options, RM, CM, OL, false) {}

 namespace {

 /// ARM Code Generator Pass Configuration Options.
 class ARMPassConfig : public TargetPassConfig {
 public:
   ARMPassConfig(ARMBaseTargetMachine &TM, PassManagerBase &PM)
-      : TargetPassConfig(TM, PM) {
-    if (TM.getOptLevel() != CodeGenOpt::None) {
-      ARMGenSubtargetInfo STI(TM.getTargetTriple(), TM.getTargetCPU(),
-                              TM.getTargetFeatureString());
-      if (STI.hasFeature(ARM::FeatureUseMISched))
-        substitutePass(&PostRASchedulerID, &PostMachineSchedulerID);
-    }
-  }
+      : TargetPassConfig(TM, PM) {}

   ARMBaseTargetMachine &getARMTargetMachine() const {
     return getTM<ARMBaseTargetMachine>();
   }

   ScheduleDAGInstrs *
   createMachineScheduler(MachineSchedContext *C) const override {
     ScheduleDAGMILive *DAG = createGenericSchedLive(C);
@@ ... @@ if (getOptLevel() != CodeGenOpt::None) {
     }));

     addPass(createIfConverter([](const MachineFunction &MF) {
       return !MF.getSubtarget<ARMSubtarget>().isThumb1Only();
     }));
   }
   addPass(createMVEVPTBlockPass());
   addPass(createThumb2ITBlockPass());
+
+  // Add both scheduling passes to give the subtarget an opportunity to pick
+  // between them.
+  if (getOptLevel() != CodeGenOpt::None) {
+    addPass(&PostMachineSchedulerID);
+    addPass(&PostRASchedulerID);
+  }
 }

 void ARMPassConfig::addPreEmitPass() {
   addPass(createThumb2SizeReductionPass());

   // Constant island pass work on unbundled instructions.
   addPass(createUnpackMachineBundles([](const MachineFunction &MF) {
     return MF.getSubtarget<ARMSubtarget>().isThumb2();

llvm/test/CodeGen/ARM/O3-pipeline.ll

@@ ... @@
 ; CHECK-NEXT: MachineDominator Tree Construction
 ; CHECK-NEXT: Machine Natural Loop Construction
 ; CHECK-NEXT: Machine Block Frequency Analysis
 ; CHECK-NEXT: If Converter
 ; CHECK-NEXT: MVE VPT block insertion pass
 ; CHECK-NEXT: Thumb IT blocks insertion pass
 ; CHECK-NEXT: MachineDominator Tree Construction
 ; CHECK-NEXT: Machine Natural Loop Construction
+; CHECK-NEXT: PostRA Machine Instruction Scheduler
 ; CHECK-NEXT: Post RA top-down list latency scheduler
 ; CHECK-NEXT: Analyze Machine Code For Garbage Collection
 ; CHECK-NEXT: Machine Block Frequency Analysis
 ; CHECK-NEXT: MachinePostDominator Tree Construction
 ; CHECK-NEXT: Branch Probability Basic Block Placement
 ; CHECK-NEXT: Thumb2 instruction size reduce pass
 ; CHECK-NEXT: Unpack machine instruction bundles
 ; CHECK-NEXT: optimise barriers pass

llvm/test/CodeGen/ARM/cortex-a57-misched-ldm-wrback.ll

 ; REQUIRES: asserts
-; RUN: llc < %s -mtriple=armv8r-eabi -mcpu=cortex-a57 -misched-postra -enable-misched -verify-misched -debug-only=machine-scheduler -o - 2>&1 > /dev/null | FileCheck %s
+; RUN: llc < %s -mtriple=armv8r-eabi -mcpu=cortex-a57 -mattr=use-misched -verify-misched -debug-only=machine-scheduler -o - 2>&1 > /dev/null | FileCheck %s
 ;

 @a = global i32 0, align 4
 @b = global i32 0, align 4
 @c = global i32 0, align 4

 ; CHECK: ******** MI Scheduling ********
 ; We need second, post-ra scheduling to have LDM instruction combined from single-loads

llvm/test/CodeGen/ARM/cortex-a57-misched-ldm.ll

 ; REQUIRES: asserts
-; RUN: llc < %s -mtriple=armv8r-eabi -mcpu=cortex-a57 -misched-postra -enable-misched -verify-misched -debug-only=machine-scheduler -o - 2>&1 > /dev/null | FileCheck %s
+; RUN: llc < %s -mtriple=armv8r-eabi -mcpu=cortex-a57 -mattr=use-misched -verify-misched -debug-only=machine-scheduler -o - 2>&1 > /dev/null | FileCheck %s

 ; CHECK: ******** MI Scheduling ********
 ; We need second, post-ra scheduling to have LDM instruction combined from single-loads
 ; CHECK: ******** MI Scheduling ********
 ; CHECK: LDMIA
 ; CHECK: rdefs left
 ; CHECK-NEXT: Latency : 3
 ; CHECK: Successors:

llvm/test/CodeGen/ARM/cortex-a57-misched-stm-wrback.ll

 ; REQUIRES: asserts
-; RUN: llc < %s -mtriple=armv8r-eabi -mcpu=cortex-a57 -misched-postra -enable-misched -verify-misched -debug-only=machine-scheduler -o - 2>&1 > /dev/null | FileCheck %s
+; RUN: llc < %s -mtriple=armv8r-eabi -mcpu=cortex-a57 -mattr=use-misched -verify-misched -debug-only=machine-scheduler -o - 2>&1 > /dev/null | FileCheck %s
 ; N=3 STMIA_UPD should have latency 2cyc and writeback latency 1cyc

 ; CHECK: ******** MI Scheduling ********
 ; We need second, post-ra scheduling to have STM instruction combined from single-stores
 ; CHECK: ******** MI Scheduling ********
 ; CHECK: schedule starting
 ; CHECK: STMIA_UPD
 ; CHECK: rdefs left

llvm/test/CodeGen/ARM/cortex-a57-misched-stm.ll

 ; REQUIRES: asserts
-; RUN: llc < %s -mtriple=armv8r-eabi -mcpu=cortex-a57 -misched-postra -enable-misched -verify-misched -debug-only=machine-scheduler -o - 2>&1 > /dev/null | FileCheck %s
+; RUN: llc < %s -mtriple=armv8r-eabi -mcpu=cortex-a57 -mattr=use-misched -verify-misched -debug-only=machine-scheduler -o - 2>&1 > /dev/null | FileCheck %s
 ; N=3 STMIB should have latency 2cyc

 ; CHECK: ******** MI Scheduling ********
 ; We need second, post-ra scheduling to have STM instruction combined from single-stores
 ; CHECK: ******** MI Scheduling ********
 ; CHECK: schedule starting
 ; CHECK: STMIB
 ; CHECK: rdefs left

llvm/test/CodeGen/ARM/cortex-a57-misched-vldm-wrback.ll

 ; REQUIRES: asserts
-; RUN: llc < %s -mtriple=armv8r-eabi -mcpu=cortex-a57 -misched-postra -enable-misched -verify-misched -debug-only=machine-scheduler -o - 2>&1 > /dev/null | FileCheck %s
+; RUN: llc < %s -mtriple=armv8r-eabi -mcpu=cortex-a57 -mattr=use-misched -verify-misched -debug-only=machine-scheduler -o - 2>&1 > /dev/null | FileCheck %s
 ;

 @a = global double 0.0, align 4
 @b = global double 0.0, align 4
 @c = global double 0.0, align 4

 ; CHECK: ******** MI Scheduling ********
 ; We need second, post-ra scheduling to have VLDM instruction combined from single-loads

llvm/test/CodeGen/ARM/cortex-a57-misched-vldm.ll

 ; REQUIRES: asserts
-; RUN: llc < %s -mtriple=armv8r-eabi -mcpu=cortex-a57 -misched-postra -enable-misched -verify-misched -debug-only=machine-scheduler -o - 2>&1 > /dev/null | FileCheck %s
+; RUN: llc < %s -mtriple=armv8r-eabi -mcpu=cortex-a57 -mattr=use-misched -verify-misched -debug-only=machine-scheduler -o - 2>&1 > /dev/null | FileCheck %s

 ; CHECK: ******** MI Scheduling ********
 ; We need second, post-ra scheduling to have VLDM instruction combined from single-loads
 ; CHECK: ******** MI Scheduling ********
 ; CHECK: VLDMDIA
 ; CHECK: rdefs left
 ; CHECK-NEXT: Latency : 6
 ; CHECK: Successors:

llvm/test/CodeGen/ARM/cortex-a57-misched-vstm-wrback.ll

 ; REQUIRES: asserts
-; RUN: llc < %s -mtriple=armv8r-eabi -mcpu=cortex-a57 -misched-postra -enable-misched -verify-misched -debug-only=machine-scheduler -o - 2>&1 > /dev/null | FileCheck %s
+; RUN: llc < %s -mtriple=armv8r-eabi -mcpu=cortex-a57 -mattr=use-misched -verify-misched -debug-only=machine-scheduler -o - 2>&1 > /dev/null | FileCheck %s

 ; CHECK: ******** MI Scheduling ********
 ; We need second, post-ra scheduling to have VSTM instruction combined from single-stores
 ; CHECK: ******** MI Scheduling ********
 ; CHECK: schedule starting
 ; CHECK: VSTMDIA_UPD
 ; CHECK: rdefs left
 ; CHECK-NEXT: Latency : 4

llvm/test/CodeGen/ARM/cortex-a57-misched-vstm.ll

 ; REQUIRES: asserts
-; RUN: llc < %s -mtriple=armv8r-eabi -mcpu=cortex-a57 -misched-postra -enable-misched -verify-misched -debug-only=machine-scheduler -o - 2>&1 > /dev/null | FileCheck %s
+; RUN: llc < %s -mtriple=armv8r-eabi -mcpu=cortex-a57 -mattr=use-misched -verify-misched -debug-only=machine-scheduler -o - 2>&1 > /dev/null | FileCheck %s

 ; CHECK: ******** MI Scheduling ********
 ; We need second, post-ra scheduling to have VSTM instruction combined from single-stores
 ; CHECK: ******** MI Scheduling ********
 ; CHECK: schedule starting
 ; CHECK: VSTMDIA
 ; CHECK: rdefs left
 ; CHECK-NEXT: Latency : 2

llvm/test/CodeGen/ARM/postrasched.ll

This file was added.

+; REQUIRES: asserts
+; RUN: llc < %s -mtriple=thumbv8m.main-none-eabi -debug-only=machine-scheduler,post-RA-sched -print-before=machine-scheduler -o - 2>&1 > /dev/null | FileCheck %s
+
+; CHECK-LABEL: test_misched
+; Pre and post ra machine scheduling
+; CHECK: ******** MI Scheduling ********
+; CHECK: t2LDRi12
+; CHECK: Latency : 2
+; CHECK: ******** MI Scheduling ********
+; CHECK: t2LDRi12
+; CHECK: Latency : 2
+
+define i32 @test_misched(i32* %ptr) "target-cpu"="cortex-m33" {
+entry:
+  %l = load i32, i32* %ptr
+  store i32 0, i32* %ptr
+  ret i32 %l
+}
+
+; CHECK-LABEL: test_rasched
+; CHECK: Subtarget disables post-MI-sched.
+; CHECK: ******** List Scheduling ********
+
+define i32 @test_rasched(i32* %ptr) {
+entry:
+  %l = load i32, i32* %ptr
+  store i32 0, i32* %ptr
+  ret i32 %l
+}
