This is an archive of the discontinued LLVM Phabricator instance.

[InstrEmitter, SystemZ] Copy Access registers with the correct register class.
ClosedPublic

Authored by jonpa on Feb 22 2020, 3:40 PM.

Details

Reviewers

uweigand
kparzysz

Commits

rGae4d39c9e4ad: [SystemZ] Copy Access registers and CC with the correct register class.

Summary

On SystemZ there is a set of "access registers" that can be copied in and out of 32-bit GPRs with special instructions. These instructions can only perform the copy using the low 32-bit parts of the 64-bit GPRs. As reported and discussed at https://bugs.llvm.org/show_bug.cgi?id=44254, this is currently broken because the default register class for 32-bit integers is GRX32, which also contains the high 32-bit part registers.
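To make the register-class relationship concrete, here is a minimal standalone sketch (not LLVM code). The register classes are abbreviated to three registers each, and the helper names `makeGRX32` and `legalForAccessRegCopy` are made up for illustration; register names follow the `$r2l` / `$r4h` convention used in the tests of this revision.

```cpp
#include <set>
#include <string>

// Toy model only: a register class is just a set of register names.
using RegClass = std::set<std::string>;

// GR32: low 32-bit halves of the 64-bit GPRs (abbreviated).
const RegClass GR32 = {"r2l", "r3l", "r4l"};
// GRH32: high 32-bit halves of the 64-bit GPRs (abbreviated).
const RegClass GRH32 = {"r2h", "r3h", "r4h"};

// GRX32 contains both halves; it is the default class for 32-bit integers.
RegClass makeGRX32() {
  RegClass X = GR32;
  X.insert(GRH32.begin(), GRH32.end());
  return X;
}

// The access-register copy instructions can only use a low-half GPR.
bool legalForAccessRegCopy(const std::string &Reg) {
  return GR32.count(Reg) != 0;
}
```

A virtual register of class GRX32 may therefore end up in a register such as `r4h`, for which `legalForAccessRegCopy` returns false. That is the failure mode described in the bug report.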

I have tried to find a simple way to constrain the register class of such copies (also at -O0), but this turned out not to be simple. Just selecting a pseudo instruction with a custom inserter does not seem to work since CopyToReg/CopyFromReg have special handling in InstrEmitter.

I then tried in SystemZDAGToDAG.cpp to select a CopyToReg to (CopyToReg COPY_TO_REGCLASS), which worked fine. But I could not get the same results with CopyFromReg. (COPY_TO_REGCLASS CopyFromReg) only resulted in a later COPY into GR32, but the COPY from the Access register was still first made to GRX32.

One alternative might be to let InstrEmitter deduce the needed register class for a CopyFromReg if there is only one user which is a COPY_TO_REGCLASS. I thought this seemed a bit messy, so I instead tried adding a new TargetRegisterInfo hook that is used in InstrEmitter when emitting CopyToReg/CopyFromReg nodes.

Does this make sense, or is there a better way?

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

jonpa created this revision. Feb 22 2020, 3:40 PM

This is an "uglier" approach that does not involve any common code changes. It is not so nice because the backend needs to transform the SelectionDAG (in Select()) by first recognizing any copy to/from a special set of registers and then, in the case of CopyFromReg, inserting a target pseudo opcode solely to constrain the regclass of the created virtreg.

It would be much simpler if the target could just supply the right regclass in the first place, which was my original suggestion (please see "Diff 1" under the History tab).

Removed from reviewer list as the new patch takes an all-target-specific approach.

In D75014#1890295, @hliao wrote:

Removed from reviewer list as the new patch takes an all-target-specific approach.

Sorry - I meant to post these two approaches side-by-side, and personally I hope that the target-specific approach will not end up being used... So any comments on this would be much appreciated still! Do you think the new TRI seems reasonable?

I'm wondering if this handles all cases ... for CopyFromReg you apparently rely on the logic in EmitCopyFromReg that checks whether the value is used by some MachineNode with constrained regclass. But that logic isn't unconditionally used, e.g. it is skipped for "cloned" SUs ... not sure whether this could cause issues in more complicated scenarios.

Also, I'm not sure if the Glue handling is fully correct: for CopyToReg, you seem to simply drop the glue, which doesn't look right. For CopyFromReg, you keep the glue ... but you also glue the new node in as well, which may not be necessary (and may actually confuse code in EmitNode that scans glue chains?).

That said, if we do need to handle this problem in target code, it might actually be cleaner to just do it as a separate PreReload pass that simply looks at COPY nodes and constrains source/target virtual registers if required. This could be something like the X86 FlagsCopyLowering pass, I guess (but much simpler).
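As a rough illustration of what "constraining" a virtual register means here, the following standalone sketch models a register class as a set and constraining as set intersection. This is only a toy model under that assumption; the function `constrainClass` is hypothetical and is not the LLVM API.

```cpp
#include <algorithm>
#include <iterator>
#include <set>
#include <string>

// Toy model only: a register class is a set of register names.
using RegClass = std::set<std::string>;

// Narrow Current to the registers it shares with Required.
// Returns false and leaves Current untouched if nothing is shared.
bool constrainClass(RegClass &Current, const RegClass &Required) {
  RegClass Common;
  std::set_intersection(Current.begin(), Current.end(),
                        Required.begin(), Required.end(),
                        std::inserter(Common, Common.begin()));
  if (Common.empty())
    return false;
  Current = Common;
  return true;
}
```

In this model, a GRX32 virtual register that is the source or destination of an access-register COPY would be narrowed to GR32 before register allocation.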

(Talking about flags, it seems that COPY from/to the %cc register would also require the same handling as access registers. Well, I guess we could implement a copy *to* %cc from a high register using TMHH, but we cannot implement a copy from %cc to a high register ...)

Oh, and one more thing: either way, can you please add the original test case from D74601 so we're sure this problem is (and remains) fixed. Thanks!

In D75014#1891349, @uweigand wrote:

Oh, and one more thing: either way, can you please add the original test case from D74601 so we're sure this problem is (and remains) fixed. Thanks!

I added the two test functions I could find already as @_ZTW1x -> tls-08.ll:@fun0, and @_Z6squareiiiiiii -> tls-09.ll.

In D75014#1891587, @jonpa wrote:

In D75014#1891349, @uweigand wrote:

Oh, and one more thing: either way, can you please add the original test case from D74601 so we're sure this problem is (and remains) fixed. Thanks!

I added the two test functions I could find already as @_ZTW1x -> tls-08.ll:@fun0, and @_Z6squareiiiiiii -> tls-09.ll.

Ah, I missed that, sorry. That's fine then.

... That said, if we do need to handle this problem in target code, it might actually be cleaner to just do it as a separate PreReload pass that simply looks at COPY nodes and constrains source/target virtual registers if required. This could be something like the X86 FlagsCopyLowering pass, I guess (but much simpler).

This is now handled with a new pre-regalloc pass that transforms COPY instructions of special physregs before register allocation. An alternative would be to do this in EmitInstrWithCustomInserter(), but currently COPY instructions are not handled there.

It seems safer to build the target instructions than to just constrain the virtual register class of the register of the COPY.
I don't think kill flags are useful to manage here, so they are ignored.
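The rewrite performed by the pass can be sketched on a toy instruction list. This is a simplified standalone model, not the pass itself: instructions are plain structs, registers are strings, virtual registers start with `%`, and the names `Inst`, `isAccessReg`, `isVirtual` and `rewriteSpecialCopies` are made up for illustration.

```cpp
#include <list>
#include <string>

// Toy machine instruction: opcode, destination, source.
struct Inst {
  std::string Opcode, Dst, Src;
};

bool isAccessReg(const std::string &R) { return R == "a0" || R == "a1"; }
bool isVirtual(const std::string &R) { return !R.empty() && R[0] == '%'; }

// For every COPY from CC or an access register into a virtual register,
// insert IPM/EAR into a fresh temporary before the COPY. For every COPY from
// a virtual register into an access register, redirect the COPY to a fresh
// temporary and insert SAR after it. Returns true if anything changed.
bool rewriteSpecialCopies(std::list<Inst> &Block) {
  bool Modified = false;
  int NextTmp = 0;
  for (auto It = Block.begin(), E = Block.end(); It != E;) {
    auto Cur = It++; // advance first so insertions do not disturb the walk
    if (Cur->Opcode != "COPY")
      continue;
    if (isVirtual(Cur->Dst) && (Cur->Src == "cc" || isAccessReg(Cur->Src))) {
      std::string Tmp = "%tmp" + std::to_string(NextTmp++);
      Block.insert(Cur, Inst{Cur->Src == "cc" ? "IPM" : "EAR", Tmp, Cur->Src});
      Cur->Src = Tmp;
      Modified = true;
    } else if (isVirtual(Cur->Src) && isAccessReg(Cur->Dst)) {
      std::string Tmp = "%tmp" + std::to_string(NextTmp++);
      std::string Phys = Cur->Dst;
      Cur->Dst = Tmp;
      Block.insert(It, Inst{"SAR", Phys, Tmp}); // lands right after the COPY
      Modified = true;
    }
  }
  return Modified;
}
```

In the real pass the temporaries would be created in the GR32 class, so the inserted EAR/IPM/SAR only ever see low-half registers. The original COPY is kept and left for the coalescer to remove where possible.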

(Talking about flags, it seems that COPY from/to the %cc register would also require the same handling as access registers. Well, I guess we could implement a copy *to* %cc from a high register using TMHH, but we cannot implement a copy from %cc to a high register ...)

Handling also COPYs of CC. Copy *to* CC is now done either with TMLH or TMHH (depending on the source reg) in copyPhysReg(). A copy from CC is handled in this new pass instead.
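The CC round trip can be checked with a small standalone model. This is a sketch under simplifying assumptions: `ipm` ignores the program mask, and `testUnderMask` is a hand-written model of the condition code set by a test-under-mask on one halfword; the function names are made up for illustration. The value 28 for `IPM_CC` is inferred from the mask 12288 that tls-11.mir expects, since 3 << (28 - 16) == 12288.

```cpp
#include <cstdint>

// Bit position of the condition code within the 32-bit value written by IPM.
constexpr unsigned IPM_CC = 28;

// Simplified IPM: place the 2-bit CC at bits 28..29 (program mask ignored).
std::uint32_t ipm(unsigned CC) {
  return static_cast<std::uint32_t>(CC & 3u) << IPM_CC;
}

// Mask used by the copy to CC: the two CC bits within the high halfword.
constexpr std::uint16_t ccMask() {
  return static_cast<std::uint16_t>(3u << (IPM_CC - 16));
}

// Model of test-under-mask on one halfword: 0 if all selected bits are zero,
// 3 if all are one, otherwise 1 or 2 depending on the leftmost selected bit.
unsigned testUnderMask(std::uint16_t Half, std::uint16_t Mask) {
  unsigned Sel = Half & Mask;
  if (Mask == 0 || Sel == 0)
    return 0;
  if (Sel == Mask)
    return 3;
  unsigned Top = 0x8000u;
  while (!(Mask & Top))
    Top >>= 1;
  return (Sel & Top) ? 2 : 1;
}

// Copy to CC from a 32-bit register half. TMLH and TMHH differ only in which
// half of the 64-bit GPR they look at; both test that half's high halfword.
unsigned copyToCC(std::uint32_t Reg32) {
  return testUnderMask(static_cast<std::uint16_t>(Reg32 >> 16), ccMask());
}
```

With a two-bit mask the four possible outcomes of the test map one-to-one onto the four CC values, which is why a single TMLH or TMHH is enough to restore CC from a value produced by IPM.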

It seems safer to build the target instructions than to just constrain the virtual register class of the register of the COPY.

I'm not sure I understand this: can you explain what problem you see with constraining the register class?

Also, if you do directly emit the EAR etc., I'm wondering why you still keep the COPY in as well?

In D75014#1897768, @uweigand wrote:

It seems safer to build the target instructions than to just constrain the virtual register class of the register of the COPY.

I'm not sure I understand this: can you explain what problem you see with constraining the register class?

I remember seeing that the register allocator would create a new virtual register and give it the register class based on calling MI->getRegClassConstraint() (or TII->getRegClass() directly). So in theory, it seems that if there is no MCInstrDesc anywhere that demands a particular register class, regalloc might feel free to take the optimal one (GRX32). I am not sure this is needed, but there is no mechanism that I know of that would constrain a *COPY* register regclass, although it may be that a COPY of a physreg into a virtreg is left alone.

Maybe someone could instead confirm that physreg copies do not get their virtreg regclasses changed ever. Maybe that is obvious and I just wasn't aware. Or, if there is no guarantee for this, perhaps a target hook like I suggested (getPhysRegCopyRegClass()) would be a better solution after all, since that would also then be an error if regalloc broke that.

Also, if you do directly emit the EAR etc., I'm wondering why you still keep the COPY in as well?

I figured that the coalescer should typically remove it. And I wasn't sure if there could ever be a problem with any other connected virtregs involved having to use a high-register. In that case there would have to be a copy to/from a GRH32.

In D75014#1898783, @jonpa wrote:

In D75014#1897768, @uweigand wrote:

It seems safer to build the target instructions than to just constrain the virtual register class of the register of the COPY.

I'm not sure I understand this: can you explain what problem you see with constraining the register class?

I remember seeing that the register allocator would create a new virtual register and give it the register class based on calling MI->getRegClassConstraint() (or TII->getRegClass() directly). So in theory, it seems that if there is no MCInstrDesc anywhere that demands a particular register class, regalloc might feel free to take the optimal one (GRX32). I am not sure this is needed, but there is no mechanism that I know of that would constrain a *COPY* register regclass, although it may be that a COPY of a physreg into a virtreg is left alone.

Maybe someone could instead confirm that physreg copies do not get their virtreg regclasses changed ever. Maybe that is obvious and I just wasn't aware. Or, if there is no guarantee for this, perhaps a target hook like I suggested (getPhysRegCopyRegClass()) would be a better solution after all, since that would also then be an error if regalloc broke that.

Hmm, I see. I would expect that the one virtreg that was constrained will certainly keep its register class. However, I guess you're right that regalloc may create *new* virtregs (e.g. due to live range splitting etc.). I'm not sure either how the case where the virtreg is a COPY of a physreg will be handled in that case. Thanks for pointing this out.

Also, if you do directly emit the EAR etc., I'm wondering why you still keep the COPY in as well?

I figured that the coalescer should typically remove it. And I wasn't sure if there could ever be a problem with any other connected virtregs involved having to use a high-register. In that case there would have to be a copy to/from a GRH32.

OK. Well, it's certainly conservatively correct to do it this way.

Given that the patch as-is looks correct to me, and it does fix an actual bug, I think we should go ahead with it for now. If we find another solution later (e.g. via common code change), we can always take the SystemZ pass out again. LGTM.

llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp
824	I'm wondering if we need to check (or at least assert) that SrcReg is even a GRX32 here?

This revision is now accepted and ready to land. Mar 2 2020, 7:17 AM

jonpa marked an inline comment as done. Mar 3 2020, 7:45 AM

jonpa added inline comments.

llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp
824	The MachineVerifier would catch this with "Illegal physical register for instruction". There are no subregs anymore at this point (after regalloc), so it would be a very strange error. Perhaps an assert that copyPhysReg() is only called with two physreg operands (post-RA) would make sense? It seems that without the verifier a COPY from $r4d in tls-11.mir does not get caught anywhere.

Closed by commit rGae4d39c9e4ad: [SystemZ] Copy Access registers and CC with the correct register class. (authored by jonpa). Mar 3 2020, 7:47 AM

This revision was automatically updated to reflect the committed changes.

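Revision Contents

Path	Size
llvm/lib/Target/SystemZ/CMakeLists.txt	1 line
llvm/lib/Target/SystemZ/SystemZ.h	1 line
llvm/lib/Target/SystemZ/SystemZCopyPhysRegs.cpp	120 lines
llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp	21 lines
llvm/lib/Target/SystemZ/SystemZTargetMachine.cpp	5 lines
llvm/test/CodeGen/SystemZ/tls-08.ll	24 lines
llvm/test/CodeGen/SystemZ/tls-09.ll	37 lines
llvm/test/CodeGen/SystemZ/tls-10.mir	24 lines
llvm/test/CodeGen/SystemZ/tls-11.mir	18 lines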

Diff 247904

llvm/lib/Target/SystemZ/CMakeLists.txt

	tablegen(LLVM SystemZGenSubtargetInfo.inc -gen-subtarget)			tablegen(LLVM SystemZGenSubtargetInfo.inc -gen-subtarget)

	add_public_tablegen_target(SystemZCommonTableGen)			add_public_tablegen_target(SystemZCommonTableGen)

	add_llvm_target(SystemZCodeGen			add_llvm_target(SystemZCodeGen
	SystemZAsmPrinter.cpp			SystemZAsmPrinter.cpp
	SystemZCallingConv.cpp			SystemZCallingConv.cpp
	SystemZConstantPoolValue.cpp			SystemZConstantPoolValue.cpp
				SystemZCopyPhysRegs.cpp
	SystemZElimCompare.cpp			SystemZElimCompare.cpp
	SystemZFrameLowering.cpp			SystemZFrameLowering.cpp
	SystemZHazardRecognizer.cpp			SystemZHazardRecognizer.cpp
	SystemZISelDAGToDAG.cpp			SystemZISelDAGToDAG.cpp
	SystemZISelLowering.cpp			SystemZISelLowering.cpp
	SystemZInstrInfo.cpp			SystemZInstrInfo.cpp
	SystemZLDCleanup.cpp			SystemZLDCleanup.cpp
	SystemZLongBranch.cpp			SystemZLongBranch.cpp

llvm/lib/Target/SystemZ/SystemZ.h

	} // end namespace SystemZ			} // end namespace SystemZ

	FunctionPass *createSystemZISelDag(SystemZTargetMachine &TM,			FunctionPass *createSystemZISelDag(SystemZTargetMachine &TM,
	CodeGenOpt::Level OptLevel);			CodeGenOpt::Level OptLevel);
	FunctionPass *createSystemZElimComparePass(SystemZTargetMachine &TM);			FunctionPass *createSystemZElimComparePass(SystemZTargetMachine &TM);
	FunctionPass *createSystemZShortenInstPass(SystemZTargetMachine &TM);			FunctionPass *createSystemZShortenInstPass(SystemZTargetMachine &TM);
	FunctionPass *createSystemZLongBranchPass(SystemZTargetMachine &TM);			FunctionPass *createSystemZLongBranchPass(SystemZTargetMachine &TM);
	FunctionPass *createSystemZLDCleanupPass(SystemZTargetMachine &TM);			FunctionPass *createSystemZLDCleanupPass(SystemZTargetMachine &TM);
				FunctionPass *createSystemZCopyPhysRegsPass(SystemZTargetMachine &TM);
	FunctionPass *createSystemZPostRewritePass(SystemZTargetMachine &TM);			FunctionPass *createSystemZPostRewritePass(SystemZTargetMachine &TM);
	FunctionPass *createSystemZTDCPass();			FunctionPass *createSystemZTDCPass();
	} // end namespace llvm			} // end namespace llvm

	#endif			#endif

llvm/lib/Target/SystemZ/SystemZCopyPhysRegs.cpp

This file was added.

				//===---------- SystemZPhysRegCopy.cpp - Handle phys reg copies -----------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This pass makes sure that a COPY of a physical register will be
				// implementable after register allocation in copyPhysReg() (this could be
				// done in EmitInstrWithCustomInserter() instead if COPY instructions would
				// be passed to it).
				//
				//===----------------------------------------------------------------------===//

				#include "SystemZMachineFunctionInfo.h"
				#include "SystemZTargetMachine.h"
				#include "llvm/CodeGen/MachineDominators.h"
				#include "llvm/CodeGen/MachineFunctionPass.h"
				#include "llvm/CodeGen/MachineInstrBuilder.h"
				#include "llvm/CodeGen/MachineRegisterInfo.h"
				#include "llvm/CodeGen/TargetInstrInfo.h"
				#include "llvm/CodeGen/TargetRegisterInfo.h"
				#include "llvm/Target/TargetMachine.h"

				using namespace llvm;

				#define SYSTEMZ_COPYPHYSREGS_NAME "SystemZ Copy Physregs"

				namespace llvm {
				void initializeSystemZCopyPhysRegsPass(PassRegistry&);
				}

				namespace {

				class SystemZCopyPhysRegs : public MachineFunctionPass {
				public:
				static char ID;
				SystemZCopyPhysRegs()
				: MachineFunctionPass(ID), TII(nullptr), MRI(nullptr) {
				initializeSystemZCopyPhysRegsPass(*PassRegistry::getPassRegistry());
				}

				StringRef getPassName() const override { return SYSTEMZ_COPYPHYSREGS_NAME; }

				bool runOnMachineFunction(MachineFunction &MF) override;
				void getAnalysisUsage(AnalysisUsage &AU) const override;

				private:

				bool visitMBB(MachineBasicBlock &MBB);

				const SystemZInstrInfo *TII;
				MachineRegisterInfo *MRI;
				};

				char SystemZCopyPhysRegs::ID = 0;

				} // end anonymous namespace

				INITIALIZE_PASS(SystemZCopyPhysRegs, "systemz-copy-physregs",
				SYSTEMZ_COPYPHYSREGS_NAME, false, false)

				FunctionPass *llvm::createSystemZCopyPhysRegsPass(SystemZTargetMachine &TM) {
				return new SystemZCopyPhysRegs();
				}

				void SystemZCopyPhysRegs::getAnalysisUsage(AnalysisUsage &AU) const {
				AU.setPreservesCFG();
				MachineFunctionPass::getAnalysisUsage(AU);
				}

				bool SystemZCopyPhysRegs::visitMBB(MachineBasicBlock &MBB) {
				bool Modified = false;

				// Certain special registers can only be copied from a subset of the
				// default register class of the type. It is therefore necessary to create
				// the target copy instructions before regalloc instead of in copyPhysReg().
				for (MachineBasicBlock::iterator MBBI = MBB.begin(), E = MBB.end();
				MBBI != E; ) {
				MachineInstr *MI = &*MBBI++;
				if (!MI->isCopy())
				continue;

				DebugLoc DL = MI->getDebugLoc();
				Register SrcReg = MI->getOperand(1).getReg();
				Register DstReg = MI->getOperand(0).getReg();
				if (DstReg.isVirtual() &&
				(SrcReg == SystemZ::CC || SystemZ::AR32BitRegClass.contains(SrcReg))) {
				Register Tmp = MRI->createVirtualRegister(&SystemZ::GR32BitRegClass);
				if (SrcReg == SystemZ::CC)
				BuildMI(MBB, MI, DL, TII->get(SystemZ::IPM), Tmp);
				else
				BuildMI(MBB, MI, DL, TII->get(SystemZ::EAR), Tmp).addReg(SrcReg);
				MI->getOperand(1).setReg(Tmp);
				Modified = true;
				}
				else if (SrcReg.isVirtual() &&
				SystemZ::AR32BitRegClass.contains(DstReg)) {
				Register Tmp = MRI->createVirtualRegister(&SystemZ::GR32BitRegClass);
				MI->getOperand(0).setReg(Tmp);
				BuildMI(MBB, MBBI, DL, TII->get(SystemZ::SAR), DstReg).addReg(Tmp);
				Modified = true;
				}
				}

				return Modified;
				}

				bool SystemZCopyPhysRegs::runOnMachineFunction(MachineFunction &F) {
				TII = static_cast<const SystemZInstrInfo *>(F.getSubtarget().getInstrInfo());
				MRI = &F.getRegInfo();

				bool Modified = false;
				for (auto &MBB : F)
				Modified |= visitMBB(MBB);

				return Modified;
				}

llvm/lib/Target/SystemZ/SystemZInstrInfo.cpp

if (SystemZ::FP128BitRegClass.contains(DestReg) &&

if (DestRegHi != SrcReg)		if (DestRegHi != SrcReg)
copyPhysReg(MBB, MBBI, DL, DestRegHi, SrcReg, false);		copyPhysReg(MBB, MBBI, DL, DestRegHi, SrcReg, false);
BuildMI(MBB, MBBI, DL, get(SystemZ::VREPG), DestRegLo)		BuildMI(MBB, MBBI, DL, get(SystemZ::VREPG), DestRegLo)
.addReg(SrcReg, getKillRegState(KillSrc)).addImm(1);		.addReg(SrcReg, getKillRegState(KillSrc)).addImm(1);
return;		return;
}		}

// Move CC value from/to a GR32.		// Move CC value from a GR32.
if (SrcReg == SystemZ::CC) {
auto MIB = BuildMI(MBB, MBBI, DL, get(SystemZ::IPM), DestReg);
if (KillSrc) {
const MachineFunction *MF = MBB.getParent();
const TargetRegisterInfo *TRI = MF->getSubtarget().getRegisterInfo();
MIB->addRegisterKilled(SrcReg, TRI);
}
return;
}
if (DestReg == SystemZ::CC) {		if (DestReg == SystemZ::CC) {
BuildMI(MBB, MBBI, DL, get(SystemZ::TMLH))		unsigned Opcode =
		SystemZ::GR32BitRegClass.contains(SrcReg) ? SystemZ::TMLH : SystemZ::TMHH;
		BuildMI(MBB, MBBI, DL, get(Opcode))
.addReg(SrcReg, getKillRegState(KillSrc))		.addReg(SrcReg, getKillRegState(KillSrc))
.addImm(3 << (SystemZ::IPM_CC - 16));		.addImm(3 << (SystemZ::IPM_CC - 16));
return;		return;
}		}

// Everything else needs only one instruction.		// Everything else needs only one instruction.
unsigned Opcode;		unsigned Opcode;
if (SystemZ::GR64BitRegClass.contains(DestReg, SrcReg))		if (SystemZ::GR64BitRegClass.contains(DestReg, SrcReg))
Opcode = SystemZ::LGR;		Opcode = SystemZ::LGR;
else if (SystemZ::FP32BitRegClass.contains(DestReg, SrcReg))		else if (SystemZ::FP32BitRegClass.contains(DestReg, SrcReg))
// For z13 we prefer LDR over LER to avoid partial register dependencies.		// For z13 we prefer LDR over LER to avoid partial register dependencies.
Opcode = STI.hasVector() ? SystemZ::LDR32 : SystemZ::LER;		Opcode = STI.hasVector() ? SystemZ::LDR32 : SystemZ::LER;
else if (SystemZ::FP64BitRegClass.contains(DestReg, SrcReg))		else if (SystemZ::FP64BitRegClass.contains(DestReg, SrcReg))
Opcode = SystemZ::LDR;		Opcode = SystemZ::LDR;
else if (SystemZ::FP128BitRegClass.contains(DestReg, SrcReg))		else if (SystemZ::FP128BitRegClass.contains(DestReg, SrcReg))
Opcode = SystemZ::LXR;		Opcode = SystemZ::LXR;
else if (SystemZ::VR32BitRegClass.contains(DestReg, SrcReg))		else if (SystemZ::VR32BitRegClass.contains(DestReg, SrcReg))
Opcode = SystemZ::VLR32;		Opcode = SystemZ::VLR32;
else if (SystemZ::VR64BitRegClass.contains(DestReg, SrcReg))		else if (SystemZ::VR64BitRegClass.contains(DestReg, SrcReg))
Opcode = SystemZ::VLR64;		Opcode = SystemZ::VLR64;
else if (SystemZ::VR128BitRegClass.contains(DestReg, SrcReg))		else if (SystemZ::VR128BitRegClass.contains(DestReg, SrcReg))
Opcode = SystemZ::VLR;		Opcode = SystemZ::VLR;
else if (SystemZ::AR32BitRegClass.contains(DestReg, SrcReg))		else if (SystemZ::AR32BitRegClass.contains(DestReg, SrcReg))
Opcode = SystemZ::CPYA;		Opcode = SystemZ::CPYA;
else if (SystemZ::AR32BitRegClass.contains(DestReg) &&
SystemZ::GR32BitRegClass.contains(SrcReg))
Opcode = SystemZ::SAR;
else if (SystemZ::GR32BitRegClass.contains(DestReg) &&
SystemZ::AR32BitRegClass.contains(SrcReg))
Opcode = SystemZ::EAR;
else		else
llvm_unreachable("Impossible reg-to-reg copy");		llvm_unreachable("Impossible reg-to-reg copy");

BuildMI(MBB, MBBI, DL, get(Opcode), DestReg)		BuildMI(MBB, MBBI, DL, get(Opcode), DestReg)
.addReg(SrcReg, getKillRegState(KillSrc));		.addReg(SrcReg, getKillRegState(KillSrc));
}		}

void SystemZInstrInfo::storeRegToStackSlot(		void SystemZInstrInfo::storeRegToStackSlot(

llvm/lib/Target/SystemZ/SystemZTargetMachine.cpp

createPostMachineScheduler(MachineSchedContext *C) const override {
return new ScheduleDAGMI(C,		return new ScheduleDAGMI(C,
std::make_unique<SystemZPostRASchedStrategy>(C),		std::make_unique<SystemZPostRASchedStrategy>(C),
/*RemoveKillFlags=*/true);		/*RemoveKillFlags=*/true);
}		}

void addIRPasses() override;		void addIRPasses() override;
bool addInstSelector() override;		bool addInstSelector() override;
bool addILPOpts() override;		bool addILPOpts() override;
		void addPreRegAlloc() override;
void addPostRewrite() override;		void addPostRewrite() override;
void addPostRegAlloc() override;		void addPostRegAlloc() override;
void addPreSched2() override;		void addPreSched2() override;
void addPreEmitPass() override;		void addPreEmitPass() override;
};		};

} // end anonymous namespace		} // end anonymous namespace

if (getOptLevel() != CodeGenOpt::None)
return false;		return false;
}		}

bool SystemZPassConfig::addILPOpts() {		bool SystemZPassConfig::addILPOpts() {
addPass(&EarlyIfConverterID);		addPass(&EarlyIfConverterID);
return true;		return true;
}		}

		void SystemZPassConfig::addPreRegAlloc() {
		addPass(createSystemZCopyPhysRegsPass(getSystemZTargetMachine()));
		}

void SystemZPassConfig::addPostRewrite() {		void SystemZPassConfig::addPostRewrite() {
addPass(createSystemZPostRewritePass(getSystemZTargetMachine()));		addPass(createSystemZPostRewritePass(getSystemZTargetMachine()));
}		}

void SystemZPassConfig::addPostRegAlloc() {		void SystemZPassConfig::addPostRegAlloc() {
// PostRewrite needs to be run at -O0 also (in which case addPostRewrite()		// PostRewrite needs to be run at -O0 also (in which case addPostRewrite()
// is not called).		// is not called).
if (getOptLevel() == CodeGenOpt::None)		if (getOptLevel() == CodeGenOpt::None)

llvm/test/CodeGen/SystemZ/tls-08.ll

This file was added.

				; RUN: llc < %s -mcpu=z196 -mtriple=s390x-linux-gnu -O0 \
				; RUN: -stop-before=regallocfast 2>&1 | FileCheck %s
				; RUN: llc < %s -mcpu=z196 -mtriple=s390x-linux-gnu -O3 \
				; RUN: -stop-before=livevars 2>&1 | FileCheck %s
				;
				; Test that copies to/from access registers are handled before regalloc with
				; GR32 regs.

				@x = dso_local thread_local global i32 0, align 4
				define weak_odr hidden i32* @fun0() {
				; CHECK: name: fun0
				; CHECK: {{%[0-9]+}}:gr32bit = EAR $a0
				; CHECK: {{%[0-9]+}}:gr32bit = EAR $a1
				ret i32* @x
				}

				define i32 @fun1() {
				; CHECK: name: fun1
				; CHECK: [[VREG0:%[0-9]+]]:gr32bit = COPY %0
				; CHECK-NEXT: $a1 = SAR [[VREG0]]
				; CHECK: {{%[0-9]+}}:gr32bit = EAR $a0
				%val = call i32 asm "blah", "={a0}, {a1}" (i32 0)
				ret i32 %val
				}

llvm/test/CodeGen/SystemZ/tls-09.ll

This file was added.

				; RUN: llc < %s -mcpu=z196 -mtriple=s390x-linux-gnu -O0
				;
				; Test that a0 and a1 are copied successfully into GR32 registers.

				@x = dso_local thread_local global i32 0, align 4
				define i32 @fun0(i32 signext, i32 signext, i32 signext, i32 signext, i32 signext, i32 signext, i32 signext) {
				%8 = alloca i32, align 4
				%9 = alloca i32, align 4
				%10 = alloca i32, align 4
				%11 = alloca i32, align 4
				%12 = alloca i32, align 4
				%13 = alloca i32, align 4
				%14 = alloca i32, align 4
				%15 = load i32, i32* @x, align 4
				store i32 %0, i32* %8, align 4
				store i32 %1, i32* %9, align 4
				store i32 %2, i32* %10, align 4
				store i32 %3, i32* %11, align 4
				store i32 %4, i32* %12, align 4
				store i32 %5, i32* %13, align 4
				store i32 %6, i32* %14, align 4
				%16 = load i32, i32* %8, align 4
				%17 = add nsw i32 %15, %16
				%18 = load i32, i32* %9, align 4
				%19 = add nsw i32 %17, %18
				%20 = load i32, i32* %10, align 4
				%21 = add nsw i32 %19, %20
				%22 = load i32, i32* %11, align 4
				%23 = add nsw i32 %21, %22
				%24 = load i32, i32* %12, align 4
				%25 = add nsw i32 %23, %24
				%26 = load i32, i32* %13, align 4
				%27 = add nsw i32 %25, %26
				%28 = load i32, i32* %14, align 4
				%29 = add nsw i32 %27, %28
				ret i32 %29
				}

llvm/test/CodeGen/SystemZ/tls-10.mir

This file was added.

				# RUN: llc -mtriple=s390x-linux-gnu -mcpu=z196 -O0 -start-after=finalize-isel \
				# RUN: -stop-before=regallocfast -o - %s | FileCheck %s
				# RUN: llc -mtriple=s390x-linux-gnu -mcpu=z196 -O3 -start-after=finalize-isel \
				# RUN: -stop-before=livevars -o - %s | FileCheck %s
				#
				# Test that a COPY from CC gets implemented with an IPM to a GR32 reg.

				---
				name: fun0
				tracksRegLiveness: true
				registers:
				- { id: 0, class: grx32bit }
				body: |
				bb.0:
				liveins: $cc
				; CHECK-LABEL: name: fun0
				; CHECK: %1:gr32bit = IPM implicit $cc
				; CHECK-NEXT: %0:grx32bit = COPY %1
				; CHECK-NEXT: $r2l = COPY %0
				; CHECK-NEXT: Return implicit $r2l
				%0:grx32bit = COPY $cc
				$r2l = COPY %0
				Return implicit $r2l
				...

llvm/test/CodeGen/SystemZ/tls-11.mir

This file was added.

				# RUN: llc -mtriple=s390x-linux-gnu -mcpu=z196 -O0 -start-before=prologepilog \
				# RUN: -o - %s | FileCheck %s
				#
				# Test that a COPY to CC gets implemented with a tmlh or tmhh depending on
				# the source register.

				---
				name: fun0
				tracksRegLiveness: true
				body: |
				bb.0:
				liveins: $r3l, $r4h
				; CHECK-LABEL: fun0
				; CHECK: tmlh %r3, 12288
				; CHECK: tmhh %r4, 12288
				$cc = COPY $r3l
				$cc = COPY $r4h
				...