
Details

Reviewers

arsenm
craig.topper
Carrot
RKSimon
qcolombet
reames

Summary

This is the first commit for the Spill2Reg optimization pass.
The goal of this pass is to selectively replace spills to the stack with
spills to vector registers. This can help remove back-end stalls in x86.
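
As a rough illustration of the data movement involved (not code from this patch), the sketch below copies a 64-bit scalar into a vector register and back using SSE2 intrinsics, with no stack slot in between. The names `VecReg`, `spillToVector`, `reloadFromVector` and `spillAndReload` are hypothetical, and the non-x86 branch is only a stand-in so the example compiles elsewhere.

```cpp
#include <cassert>
#include <cstdint>

#if defined(__x86_64__) || defined(_M_X64)
#include <emmintrin.h> // SSE2 intrinsics, baseline on x86-64

// A 128-bit vector register stands in for the spill destination.
using VecReg = __m128i;

// "Spill": copy a 64-bit scalar into the low lane of a vector register.
// On x86-64 this typically lowers to a register-to-register MOVQ.
inline VecReg spillToVector(std::uint64_t Value) {
  return _mm_cvtsi64_si128(static_cast<long long>(Value));
}

// "Reload": copy the low lane back into a scalar register.
inline std::uint64_t reloadFromVector(VecReg Reg) {
  return static_cast<std::uint64_t>(_mm_cvtsi128_si64(Reg));
}
#else
// Portable stand-in (assumption) so the sketch still builds on other targets.
struct VecReg {
  std::uint64_t Lane0;
};
inline VecReg spillToVector(std::uint64_t Value) { return VecReg{Value}; }
inline std::uint64_t reloadFromVector(VecReg Reg) { return Reg.Lane0; }
#endif

// Spill, reload, and check that the value survived the round trip.
inline std::uint64_t spillAndReload(std::uint64_t Value) {
  std::uint64_t Reloaded = reloadFromVector(spillToVector(Value));
  assert(Reloaded == Value && "value must survive the vector round trip");
  return Reloaded;
}
```

The pass itself works on machine instructions after register allocation; the intrinsics here only show which kind of copy replaces the store/load pair.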

RFC:
https://lists.llvm.org/pipermail/llvm-dev/2022-January/154782.html
https://discourse.llvm.org/t/rfc-spill2reg-selectively-replace-spills-to-stack-with-spills-to-vector-registers/59630

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

vporpo created this revision. · Jan 26 2022, 5:32 PM

Herald added subscribers: pengfei, hiraditya, mgorny, qcolombet. · Jan 26 2022, 5:32 PM

vporpo requested review of this revision. · Jan 26 2022, 5:32 PM

Herald added a project: Restricted Project. · Jan 26 2022, 5:32 PM

Herald added a subscriber: llvm-commits.

vporpo added a child revision: D118299: [Spill2Reg][2/9] This patch adds spill/reload collection. · Jan 26 2022, 5:37 PM

wxiao3 added a subscriber: wxiao3. · Jan 26 2022, 7:06 PM

vporpo edited the summary of this revision. · Jan 26 2022, 7:31 PM

lkail added a subscriber: lkail. · Jan 26 2022, 8:07 PM

Matt added a subscriber: Matt. · Jan 27 2022, 4:00 AM

Harbormaster completed remote builds in B145882: Diff 403450. · Jan 27 2022, 7:21 AM

venkataramanan.kumar.llvm added a subscriber: venkataramanan.kumar.llvm. · Jan 27 2022, 10:24 AM

mdchen added a subscriber: mdchen. · Jan 27 2022, 5:51 PM

Disable the pass by default.

Harbormaster completed remote builds in B146201: Diff 403892. · Jan 27 2022, 11:28 PM

tmsriram added a subscriber: tmsriram. · Jan 31 2022, 2:11 PM

vporpo edited the summary of this revision. · Feb 3 2022, 3:17 PM

arsenm added a subscriber: arsenm. · Feb 3 2022, 3:22 PM

arsenm added inline comments.

llvm/lib/CodeGen/Spill2Reg.cpp
26–28	Probably should move this to TargetPassConfig

Moved spill2reg flag to TargetPassConfig.

llvm/lib/CodeGen/Spill2Reg.cpp
26–28	Done.
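
One way to read this change is that the enable flag now decides whether the pass is scheduled at all, instead of being checked inside `runOnMachineFunction`. The toy model below sketches that idea under stated assumptions: `PipelineOptions`, `buildRegAllocPipeline`, `hasPass` and `indexOf` are hypothetical names, and the strings are illustrative labels rather than LLVM's pass registry. The pass order mirrors the `addRegAssignAndRewriteOptimized` hunk in this revision, and the flag defaults to off as in the later update.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Toy stand-in for the pipeline configuration (not LLVM's TargetPassConfig).
struct PipelineOptions {
  bool EnableSpill2Reg = false; // disabled by default
};

// Build the register-assignment part of the pipeline as a list of labels.
std::vector<std::string> buildRegAllocPipeline(const PipelineOptions &Opts) {
  std::vector<std::string> Passes;
  Passes.push_back("regalloc");
  Passes.push_back("virtregrewriter");
  // The flag is consulted once, when the pipeline is built.
  if (Opts.EnableSpill2Reg)
    Passes.push_back("spill2reg");
  Passes.push_back("regalloc-scoring");
  return Passes;
}

// True if the named pass was scheduled.
bool hasPass(const std::vector<std::string> &Passes, const std::string &Name) {
  return std::find(Passes.begin(), Passes.end(), Name) != Passes.end();
}

// Position of a pass in the pipeline; the pass must be present.
std::size_t indexOf(const std::vector<std::string> &Passes,
                    const std::string &Name) {
  auto It = std::find(Passes.begin(), Passes.end(), Name);
  assert(It != Passes.end() && "pass not scheduled");
  return static_cast<std::size_t>(It - Passes.begin());
}
```

With this shape a disabled pass costs nothing at run time, and the pass body no longer needs its own early exit.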

vporpo added reviewers: arsenm, craig.topper, Carrot. · Feb 3 2022, 4:30 PM

Herald added a subscriber: wdng. · Feb 3 2022, 4:30 PM

Harbormaster completed remote builds in B147528: Diff 405826. · Feb 3 2022, 6:14 PM

Please can you rename the Spill2Reg patch sequence using [X/N] so we can more easily track dependencies

vporpo retitled this revision from "[Spill2Reg] Initial commit. This is boilerplate code." to "[Spill2Reg][1/9] Initial commit. This is boilerplate code." · Feb 4 2022, 9:48 AM

vporpo added a reviewer: RKSimon.

xiangzhangllvm added a subscriber: xiangzhangllvm. · Feb 6 2022, 4:21 PM

Very good idea!
Going further, we could introduce a notion of "cheaper spills" for scalar registers in the register allocator (by checking whether vector registers interfere with them) and use it to recalculate the spill energy in the spill placer.
And if several scalar registers can be stored to and loaded from a single vector register, it may be profitable to spill one vector register (even when the scalar and vector registers interfere) instead of spilling several scalar registers.

Yes, a two-tier spilling scheme might make sense for some targets: first spill to consecutive lanes in the vector, and then spill the vectors to memory. I think though that in x86 it may be a lot trickier to check when this will perform better than standard spills to stack.
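
The two-tier idea can be sketched as a toy model: scalars fill consecutive lanes of one vector register, and only a full vector is written out to memory. This is not LLVM code. `TwoTierSpiller`, `VectorReg` and `LanesPerVector` are hypothetical, and the lane count of four is an assumption (for example four 64-bit lanes in a 256-bit register).

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed lane count; a real target would derive this from register widths.
constexpr std::size_t LanesPerVector = 4;

struct VectorReg {
  std::array<std::uint64_t, LanesPerVector> Lanes{};
  std::size_t Used = 0; // number of lanes currently holding a spilled scalar
};

class TwoTierSpiller {
public:
  // Tier 1: place the scalar in the next free lane of the live vector.
  // Tier 2: when the live vector is full, spill the whole vector to memory.
  // Returns a slot number that can be passed to reload().
  std::size_t spill(std::uint64_t Value) {
    if (Live.Used == LanesPerVector) {
      Memory.push_back(Live);
      Live = VectorReg{};
    }
    Live.Lanes[Live.Used++] = Value;
    return Count++;
  }

  // Read a value back from wherever its vector currently lives.
  std::uint64_t reload(std::size_t Slot) const {
    assert(Slot < Count && "unknown spill slot");
    std::size_t Vec = Slot / LanesPerVector;
    std::size_t Lane = Slot % LanesPerVector;
    const VectorReg &Reg = Vec < Memory.size() ? Memory[Vec] : Live;
    return Reg.Lanes[Lane];
  }

  // Number of whole vectors that had to go to memory.
  std::size_t vectorsInMemory() const { return Memory.size(); }

private:
  VectorReg Live;                // vector register currently being filled
  std::vector<VectorReg> Memory; // vectors already spilled to the stack
  std::size_t Count = 0;         // total scalars spilled so far
};
```

In this model five scalar spills cost one vector store instead of five scalar stores. Whether that wins on real hardware is exactly the open question raised in the reply above.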

LGTM, but please wait until more of the patch sequence has been reviewed before submitting.

llvm/lib/CodeGen/Spill2Reg.cpp
53	Pointless overload and ;?

This revision is now accepted and ready to land. · Feb 10 2022, 8:32 AM

Removed redundant overload.

Harbormaster completed remote builds in B148779: Diff 407581. · Feb 10 2022, 11:16 AM

Rebased

Herald added a project: Restricted Project. · Jun 16 2022, 12:02 PM

Herald added subscribers: jsji, StephenFan.

Harbormaster completed remote builds in B170340: Diff 437664. · Jun 16 2022, 1:33 PM

Rebased

Harbormaster completed remote builds in B187995: Diff 461933. · Sep 21 2022, 10:02 AM

vporpo added reviewers: qcolombet, reames. · Sep 21 2022, 10:06 AM

Should this patch set optimize out the spills/reloads in function "test" of the attached test.cpp (965 B)?

I tried:

llvm-project/build/bin/clang -S -emit-llvm test.cpp -march=skylake-avx512 -O2
llvm-project/build/bin/llc \
	-enable-spill2reg -simplify-mir -spill2reg-mem-instrs=0 -spill2reg-vec-instrs=99999 \
	-march x86 -mattr=+avx2 -filetype=asm --x86-asm-syntax=intel test.ll

and got:

_Z4testPfS_S_:                          # @_Z4testPfS_S_
        .cfi_startproc
# %bb.0:                                # %entry
        sub     esp, 140
        .cfi_def_cfa_offset 144
        mov     eax, dword ptr [esp + 152]
        mov     ecx, dword ptr [esp + 144]
        vmovaps zmm0, zmmword ptr [ecx]
        vmovups zmmword ptr [esp + 64], zmm0    # 64-byte Spill
        mov     ecx, dword ptr [esp + 148]
        vmovaps zmm1, zmmword ptr [ecx]
        vaddps  zmm0, zmm0, zmm1
        vaddps  zmm1, zmm1, zmmword ptr [eax]
        vmovups zmmword ptr [esp], zmm1         # 64-byte Spill
        call    _Z12print_m512_fDv16_f
        vmovups zmm0, zmmword ptr [esp]         # 64-byte Reload
        call    _Z12print_m512_fDv16_f
        vmovups zmm0, zmmword ptr [esp + 64]    # 64-byte Reload
        add     esp, 140
        .cfi_def_cfa_offset 4
        ret

Did I miss anything? Thanks!

In the test you provided, vector registers are spilled across the call. Spill2Reg does not handle vector spills/reloads. The reasoning is that if a free vector register were available for Spill2Reg to use, the register allocator would already have found it and used it to avoid the spill. It would also make little sense performance-wise: saving a vector register to another vector register and restoring it later is redundant, because you could have used the destination register in the first place.
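
The rule described above can be pictured as a filter over spill records, sketched here as a toy model. `Spill`, `RegClass`, `isSpill2RegCandidate` and `collectCandidates` are hypothetical names, not the data structures used by the patch series.

```cpp
#include <cassert>
#include <vector>

// Toy model of a spill record; none of these names come from the patch.
enum class RegClass { Scalar, Vector };

struct Spill {
  int StackSlot;  // stack slot the value was spilled to
  RegClass Class; // class of the spilled register
};

// Only scalar spills are considered. A vector spill means the allocator
// already ran out of vector registers, so there is no cheaper place for it.
bool isSpill2RegCandidate(const Spill &S) {
  assert(S.StackSlot >= 0 && "spill must refer to a valid stack slot");
  return S.Class == RegClass::Scalar;
}

// Keep the spills that the filter accepts, in their original order.
std::vector<Spill> collectCandidates(const std::vector<Spill> &Spills) {
  std::vector<Spill> Out;
  for (const Spill &S : Spills)
    if (isSpill2RegCandidate(S))
      Out.push_back(S);
  return Out;
}
```

Applied to the assembly above, both spills are 64-byte zmm spills, so a filter of this kind would return no candidates and the output would be unchanged.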

Diff 403450

llvm/include/llvm/CodeGen/MachinePassRegistry.def

  DUMMY_MACHINE_FUNCTION_PASS("legalizer", LegalizerPass, ())
  DUMMY_MACHINE_FUNCTION_PASS("irtranslator", IRTranslatorPass, ())
  DUMMY_MACHINE_FUNCTION_PASS("regbankselect", RegBankSelectPass, ())
  DUMMY_MACHINE_FUNCTION_PASS("instruction-select", InstructionSelectPass, ())
  DUMMY_MACHINE_FUNCTION_PASS("reset-machine-function", ResetMachineFunctionPass, ())
  DUMMY_MACHINE_FUNCTION_PASS("machineverifier", MachineVerifierPass, ())
  DUMMY_MACHINE_FUNCTION_PASS("machine-cycles", MachineCycleInfoWrapperPass, ())
  DUMMY_MACHINE_FUNCTION_PASS("print-machine-cycles", MachineCycleInfoPrinterPass, ())
+ DUMMY_MACHINE_FUNCTION_PASS("spill2reg", Spill2RegPass, ())
  #undef DUMMY_MACHINE_FUNCTION_PASS

llvm/include/llvm/CodeGen/Passes.h

  namespace llvm {
  ...
    /// The pass transforms amx intrinsics to scalar operation if the function has
    /// optnone attribute or it is O0.
    FunctionPass *createX86LowerAMXIntrinsicsPass();

    /// When learning an eviction policy, extract score(reward) information,
    /// otherwise this does nothing
    FunctionPass *createRegAllocScoringPass();
+
+   /// This pass replaces spills to stack with spills to registers.
+   extern char &Spill2RegID;
  } // End llvm namespace

  #endif

llvm/include/llvm/InitializePasses.h

  void initializeVirtRegRewriterPass(PassRegistry&);
  void initializeWarnMissedTransformationsLegacyPass(PassRegistry &);
  void initializeWasmEHPreparePass(PassRegistry&);
  void initializeWholeProgramDevirtPass(PassRegistry&);
  void initializeWinEHPreparePass(PassRegistry&);
  void initializeWriteBitcodePassPass(PassRegistry&);
  void initializeWriteThinLTOBitcodePass(PassRegistry&);
  void initializeXRayInstrumentationPass(PassRegistry&);
+ void initializeSpill2RegPass(PassRegistry &);

  } // end namespace llvm

  #endif // LLVM_INITIALIZEPASSES_H

llvm/lib/CodeGen/CMakeLists.txt

  add_llvm_component_library(LLVMCodeGen
  ...
    ScheduleDAGInstrs.cpp
    ScheduleDAGPrinter.cpp
    ScoreboardHazardRecognizer.cpp
    ShadowStackGCLowering.cpp
    ShrinkWrap.cpp
    SjLjEHPrepare.cpp
    SlotIndexes.cpp
    SpillPlacement.cpp
+   Spill2Reg.cpp
    SplitKit.cpp
    StackColoring.cpp
    StackMapLivenessAnalysis.cpp
    StackMaps.cpp
    StackProtector.cpp
    StackSlotColoring.cpp
    SwiftErrorValueTracking.cpp
    SwitchLoweringUtils.cpp
  ...

llvm/lib/CodeGen/CodeGen.cpp

  void llvm::initializeCodeGen(PassRegistry &Registry) {
  ...
    initializeUnpackMachineBundlesPass(Registry);
    initializeUnreachableBlockElimLegacyPassPass(Registry);
    initializeUnreachableMachineBlockElimPass(Registry);
    initializeVirtRegMapPass(Registry);
    initializeVirtRegRewriterPass(Registry);
    initializeWasmEHPreparePass(Registry);
    initializeWinEHPreparePass(Registry);
    initializeXRayInstrumentationPass(Registry);
+   initializeSpill2RegPass(Registry);
  }

  void LLVMInitializeCodeGen(LLVMPassRegistryRef R) {
    initializeCodeGen(*unwrap(R));
  }

llvm/lib/CodeGen/Spill2Reg.cpp

This file was added.

//===- Spill2Reg.cpp - Spill To Register Optimization ---------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
//
/// \file This file implements Spill2Reg, an optimization which selectively
/// replaces spills/reloads to/from the stack with register copies to/from the
/// vector register file. This works even on targets where load/stores have
/// similar latency to register copies because it can free up memory units which
/// helps avoid back-end stalls.
///
//===----------------------------------------------------------------------===//

#include "llvm/CodeGen/MachineFunctionPass.h"
#include "llvm/CodeGen/Passes.h"
#include "llvm/InitializePasses.h"
#include "llvm/Support/CommandLine.h"

using namespace llvm;

static cl::opt<bool> EnableSpill2Reg("enable-spill2reg", cl::Hidden,
                                     cl::init(true),
                                     cl::desc("Enable Spill2Reg pass"));

namespace {

class Spill2Reg : public MachineFunctionPass {
public:
  static char ID;
  Spill2Reg() : MachineFunctionPass(ID) {
    initializeSpill2RegPass(*PassRegistry::getPassRegistry());
  }
  void getAnalysisUsage(AnalysisUsage &AU) const override;
  void releaseMemory() override;
  bool runOnMachineFunction(MachineFunction &) override;
  void print(raw_ostream &O, const Module * = nullptr) const override;
};

} // namespace

void Spill2Reg::getAnalysisUsage(AnalysisUsage &AU) const {
  AU.setPreservesCFG();
  MachineFunctionPass::getAnalysisUsage(AU);
}

void Spill2Reg::releaseMemory() {}

bool Spill2Reg::runOnMachineFunction(MachineFunction &MFn) {
  if (!EnableSpill2Reg)
    return false;

  llvm_unreachable("Unimplemented");
}

void Spill2Reg::print(raw_ostream &O, const Module *m) const { ; }

char Spill2Reg::ID = 0;

char &llvm::Spill2RegID = Spill2Reg::ID;

INITIALIZE_PASS_BEGIN(Spill2Reg, "spill2reg", "Spill2Reg", false, false)
INITIALIZE_PASS_END(Spill2Reg, "spill2reg", "Spill2Reg", false, false)

llvm/lib/CodeGen/TargetPassConfig.cpp

  bool TargetPassConfig::addRegAssignAndRewriteOptimized() {
    addPass(createRegAllocPass(true));

    // Allow targets to change the register assignments before rewriting.
    addPreRewrite();

    // Finally rewrite virtual registers.
    addPass(&VirtRegRewriterID);

+   // Replace spills to stack with spills to registers.
+   addPass(&Spill2RegID);
+
    // Regalloc scoring for ML-driven eviction - noop except when learning a new
    // eviction policy.
    addPass(createRegAllocScoringPass());
    return true;
  }

  /// Return true if the default global register allocator is in use and
  /// has not be overriden on the command line with '-regalloc=...'
  ...

This is an archive of the discontinued LLVM Phabricator instance.

[Spill2Reg][1/9] Initial commit. This is boilerplate code.
Accepted · Public
