This is an archive of the discontinued LLVM Phabricator instance.

Generate -1/0/1 memcmp/strcmp result for z13
AbandonedPublic

Authored by uweigand on Aug 12 2016, 2:51 PM.


Details

Reviewers

Summary

The current IPM sequence for strcmp/memcmp result value generates -MAXINT/0/1 rather than -1/0/1, which is a valid but slightly unusual result. Some programs may rely on -1. There was a postgres test failure that was attributed to this. For z13 we can use lochi to compute the result just as fast but get the expected answer. Update: Apparently there are php test failures related to this as well.
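To make the values in the summary concrete, here is a small C++ model of the IPM / SRL / RLL sequence that this revision replaces (the same sequence appears in the removed addIPMSequence helper and in the old memcmp-01.ll CHECK lines). The names ipmCC, rotl32 and oldSequence are invented for this sketch, and only the CC field written by IPM is modelled, since the shift by 28 discards every other bit.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of IPM: the condition code (0..3) ends up in bits 29:28.
// Other bits of the register are ignored here because SRL 28 shifts them out.
inline uint32_t ipmCC(unsigned cc) {
  return static_cast<uint32_t>(cc & 3u) << 28;
}

// 32-bit rotate left, written so that it never shifts by the full width.
inline uint32_t rotl32(uint32_t v, unsigned n) {
  n &= 31u;
  return n == 0 ? v : (v << n) | (v >> (32u - n));
}

// IPM, then SRL by 28, then rotate left by 31 (a rotate right by one).
inline int32_t oldSequence(unsigned cc) {
  uint32_t v = ipmCC(cc);
  v >>= 28;           // v is now the plain CC value 0..3
  v = rotl32(v, 31);  // the low CC bit becomes the sign bit
  // Conversion is modular (guaranteed since C++20, and what mainstream
  // compilers did before that).
  return static_cast<int32_t>(v);
}
```

In this model CC 0 gives 0, CC 2 gives 1, and CC 1 gives 0x80000000, which is INT32_MIN rather than -1.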

Diff Detail

Event Timeline

RolandF updated this revision to Diff 67918.Aug 12 2016, 2:51 PM

RolandF retitled this revision from to Generate -1/0/1 memcmp/strcmp result for z13.

RolandF updated this object.

RolandF added a reviewer: uweigand.Aug 15 2016, 7:49 AM

RolandF added a subscriber: • zhanjunl.

RolandF added a subscriber: colpell.Aug 15 2016, 11:15 AM

RolandF updated this object.Aug 16 2016, 3:11 PM

Hmm, so we actually have two (or three) separate issues here.

First of all, I agree that memcmp etc. should return -1/0/1, to avoid confusion. However, this should be the same on z13 and earlier machines; the behaviour should not depend on the architecture level. In GCC, we actually achieve that same result using IPM, via an sll / sra sequence (instead of srl / rotl):

ipm     %r2
sll     %r2,2
sra     %r2,30

It would probably make sense to do the same in LLVM as well. (Note that this means optimizeCompareInstr would have to recognize this sequence too; or maybe it does make sense to make an intermediate SELECT_CMP node just to simplify the optimization step.)
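As a rough illustration of what the ipm / sll / sra sequence computes, the following C++ sketch models it on plain integers. The names ipmCC, sra32 and gccSequence are made up for the example, and only the CC field of IPM is modelled because the two shifts discard all other bits.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of IPM: CC (0..3) in bits 29:28, everything else ignored.
inline uint32_t ipmCC(unsigned cc) {
  return static_cast<uint32_t>(cc & 3u) << 28;
}

// Arithmetic shift right expressed on unsigned values, so the sketch does not
// depend on how the compiler shifts negative signed integers.
inline int32_t sra32(uint32_t v, unsigned n) {
  uint32_t shifted = v >> n;
  if (v & 0x80000000u)
    shifted |= ~(0xFFFFFFFFu >> n);  // replicate the sign bit
  return static_cast<int32_t>(shifted);
}

// ipm; sll 2; sra 30
inline int32_t gccSequence(unsigned cc) {
  uint32_t v = ipmCC(cc);
  v <<= 2;              // CC now occupies bits 31:30
  return sra32(v, 30);  // sign-extend the two CC bits
}
```

Under this model the two CC bits are read as a signed 2-bit number: CC 0 gives 0, CC 1 gives 1, CC 2 gives -2 and CC 3 gives -1. Which of CC 1 and CC 2 stands for "first operand low" depends on the operand order of the preceding compare, which the sketch does not model.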

The second issue is that indeed on z13, it would be preferable to use LOCHI instead of the IPM sequence. But this is a pure optimization issue, which should not affect observable behaviour (and can probably be implemented via a second patch). Note that the same LOCHI optimization would *also* be beneficial on z13 for the other cases where we currently use a(nother) IPM sequence to implement SELECT_CC.
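The exact LOCHI sequence used by the patch for z13 is not shown in this thread, so the following C++ model is only an assumption about how such a selection could look: load 0, then conditionally overwrite it with -1 or 1 depending on CC. The mask encoding (8, 4, 2, 1 selecting CC 0, 1, 2, 3) is the usual condition-mask convention; the function names are invented for the sketch.

```cpp
#include <cassert>
#include <cstdint>

// True if the 4-bit condition mask selects the given CC value (0..3).
// Mask bit 8 selects CC 0, 4 selects CC 1, 2 selects CC 2, 1 selects CC 3.
inline bool maskMatches(unsigned mask, unsigned cc) {
  return ((mask >> (3u - (cc & 3u))) & 1u) != 0;
}

// Model of LOCHI: load a halfword immediate only if the mask matches CC.
// The instruction itself leaves CC unchanged, so it can be applied repeatedly.
inline int32_t lochi(int32_t reg, int16_t imm, unsigned mask, unsigned cc) {
  return maskMatches(mask, cc) ? static_cast<int32_t>(imm) : reg;
}

// Assumed selection: LHI 0, then two LOCHI steps.
inline int32_t lochiSelect(unsigned cc) {
  int32_t r = 0;            // lhi: start with the "equal" result
  r = lochi(r, -1, 4, cc);  // CC 1 (first operand low)  -> -1
  r = lochi(r, 1, 2, cc);   // CC 2 (first operand high) -> 1
  return r;
}
```

Because none of the modelled instructions change CC, the result is selected directly from the original compare, with no shift arithmetic that could produce values other than -1, 0 or 1.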

Finally, looking at the GCC code, there is another optimization there that LLVM does not yet implement: a somewhat frequent use case is that the result of memcmp is cast to a long (either explicitly or implicitly due to ABI conventions). GCC is able to omit a separate sign-extension step by using, in this case, a sequence like:

ipm     %r2
sllg    %r2,%r2,34
srag    %r2,%r2,62

On z13, of course we'd want to implement this via LOCGHI.
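The claim that the 64-bit shifts make a separate sign extension unnecessary can be checked with a small C++ model. This is a sketch under assumptions: the helper names are invented, the program mask is treated as zero, and IPM is modelled as overwriting only bits 31:24 of the register while leaving all other (stale) bits alone.

```cpp
#include <cassert>
#include <cstdint>

// Arithmetic shift right on unsigned values (portable sign replication).
inline int32_t sra32(uint32_t v, unsigned n) {
  uint32_t shifted = v >> n;
  if (v & 0x80000000u)
    shifted |= ~(0xFFFFFFFFu >> n);
  return static_cast<int32_t>(shifted);
}

inline int64_t sra64(uint64_t v, unsigned n) {
  uint64_t shifted = v >> n;
  if (v >> 63)
    shifted |= ~(~uint64_t{0} >> n);
  return static_cast<int64_t>(shifted);
}

// 32-bit form: ipm; sll 2; sra 30.
inline int32_t narrowSequence(unsigned cc) {
  uint32_t v = static_cast<uint32_t>(cc & 3u) << 28;
  return sra32(v << 2, 30);
}

// 64-bit form: ipm; sllg 34; srag 62. `stale` is whatever the register held
// before IPM; only bits 31:24 are replaced (CC in 29:28, the rest zero here).
inline int64_t wideSequence(unsigned cc, uint64_t stale) {
  uint64_t v = (stale & ~uint64_t{0xFF000000u}) |
               (static_cast<uint64_t>(cc & 3u) << 28);
  return sra64(v << 34, 62);  // CC moves to bits 63:62, then is sign-extended
}
```

In this model the stale bits never reach the result, and the 64-bit value equals the sign-extended 32-bit one for every CC.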

Some more questions regarding details of your patch:

Why expand SELECT_CMP sometimes before and sometimes after RA? Is there any particular benefit to doing it early? (Usually, early expansion is only necessary when we want to create new basic blocks.)
What's the purpose of RotByReg in emitSelectCmp? It appears to be unused ...

The ipm/sll/sra sequence gives -2/0/1 rather than -1/0/1.

Since the ipm sequence was originally done even earlier, I hesitated delaying it all the way to post-RA. But if post-RA is preferred, I am happy to do the expansion there.

RotByReg was just left over from trying several ways of writing rotate by a literal amount before finding the one that is expected. It should have been removed.

In D23467#518772, @RolandF wrote:

The ipm/sll/sra sequence gives -2/0/1 rather than -1/0/1.

Good point, I missed that. Still, the fact remains that this is what GCC has been generating since at least 2001, and we haven't ever seen any report of application bugs due to that ... Given that, I think it still would be preferable to generate that sequence instead of the one LLVM currently generates (which apparently does cause application problems).

Just for my curiosity, do you know any further details about what is causing the misbehaviour with postgres or php? It would be interesting to understand why the particular value used by LLVM causes problems, but other values don't (as I understand, some implementations of strcmp in libc have also historically returned values other than -1/0/1).

Since the ipm sequence was originally done even earlier, I hesitated delaying it all the way to post-RA. But if post-RA is preferred, I am happy to do the expansion there.

In general, it is preferable to expand things earlier rather than later. However, "early" usually means expanding during the DAG isel phase. On the other hand, if you have to expand late, you do it after RA. The primary reason for having another expansion step at the MI level (after DAG isel), but before RA, is if you need to create new basic blocks during expansion, which you cannot do at the DAG level.

So I guess my question is rather, if you can / have to expand to the IPM sequence early, why don't you simply leave that expansion at the DAG level (as it is today, just gated by a non-z13 check)? Otherwise, if there's no difference codegen-wise, it would in general be preferable to have the expansion for z13 and pre-z13 in closer proximity, just for easier reading of the code. But in the end, that's just a preference, not a hard requirement ...

I took a look at the php problem. It was failing test /ext/standard/tests/strings/substr_compare.phpt, due to a -2 from a substr_compare operation where -1 was expected. The implementation in Zend/zend_operators.c is a call to the library memcmp function, so it is not a compiler issue.

In D23467#521100, @RolandF wrote:

I took a look at the php problem. It was failing test /ext/standard/tests/strings/substr_compare.phpt, due to a -2 from a substr_compare operation where -1 was expected. The implementation in Zend/zend_operators.c is a call to the library memcmp function, so it is not a compiler issue.

Interesting. While I'm not a PHP expert, the documentation for substr_compare I can find states:

Return Values

Returns < 0 if main_str from position offset is less than str, > 0 if it is greater than str, and 0 if they are equal. If offset is equal to or greater than the length of main_str, or the length is set and is less than 1 (prior to PHP 5.6), substr_compare() prints a warning and returns FALSE.

Given that, an implementation that calls out to a standard C memcmp and passes through the return value seems to be correct, and the test case would appear to be simply buggy. As long as this shows up only in a (possibly incorrect) test case and not in real-world PHP application, it doesn't seem like something we need to fix in the compiler.

Unlike the php test failure, which is dependent on the library memcmp behaviour and fails for both clang and gcc, the postgres test failure only happens with clang. The uuid regression test fails for clang, and the failure goes away if src/backend/utils/adt/uuid.c is compiled with gcc. The issue is the result for the uuid_internal_cmp function, which is just a 16 byte memcmp. The address of the function is stored in a table of builtins and only called by address, and the complexity of the application and test environment make it difficult to trace back to where this function is called, which may be many places.

It might be desirable to just be compatible with gcc. This diff updates the approach to use the gcc-type IPM/SLL/SRA sequence. The sequence is first translated into a SELECT_CMP operation. This makes it easier to perform the memcmp compare to zero optimization (SRA kills the CC). It also should make it easier to add support for LOCHI, since the compare to zero code can be shared, and to get the promotion to 64-bit case with shared code.

In D23467#524485, @RolandF wrote:

Unlike the php test failure, which is dependent on the library memcmp behaviour and fails for both clang and gcc, the postgres test failure only happens with clang. The uuid regression test fails for clang, and the failure goes away if src/backend/utils/adt/uuid.c is compiled with gcc. The issue is the result for the uuid_internal_cmp function, which is just a 16 byte memcmp. The address of the function is stored in a table of builtins and only called by address, and the complexity of the application and test environment make it difficult to trace back to where this function is called, which may be many places.

In src/include/utils/sortsupport.h, I see this code in ApplySortComparator:

compare = (*ssup->comparator) (datum1, datum2, ssup);
if (ssup->ssup_reverse)
        compare = -compare;

If the comparator routine returns INT_MIN, the result of the unary minus is undefined at this point. This might well explain the problem you're seeing.
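A small C++ sketch of why this matters. Negating INT_MIN in signed arithmetic is undefined; the helper below computes what a wrapping two's-complement negation would give, which is one plausible outcome rather than a guaranteed one. The function names are invented for the example, and normalizing the sign before negating is just one possible defensive pattern, not something taken from the postgres sources.

```cpp
#include <cassert>
#include <cstdint>

// Wrapping negation computed in unsigned arithmetic, so this sketch itself
// contains no signed overflow. The final conversion is modular (guaranteed
// since C++20, and what mainstream compilers did before that).
inline int32_t wrappingNegate(int32_t x) {
  return static_cast<int32_t>(0u - static_cast<uint32_t>(x));
}

// Collapse any comparator result to -1, 0 or 1.
inline int normalizeSign(int x) {
  return (x > 0) - (x < 0);
}

// Reverse a comparison without ever negating an extreme value.
inline int reverseCompare(int compare) {
  return -normalizeSign(compare);
}
```

With wrapping negation INT32_MIN maps back to INT32_MIN, so a "reversed" comparison would still look negative, whereas -2 or -1 reverse cleanly. Reversing through normalizeSign gives the expected sign for every input.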

It might be desirable to just be compatible with gcc. This diff updates the approach to use the gcc-type IPM/SLL/SRA sequence. The sequence is first translated into a SELECT_CMP operation. This makes it easier to perform the memcmp compare to zero optimization (SRA kills the CC). It also should make it easier to add support for LOCHI, since the compare to zero code can be shared, and to get the promotion to 64-bit case with shared code.

Makes sense to me. The patch looks mostly good, see inline comments for more details.

lib/Target/SystemZ/SystemZInstrFormats.td
2762	Don't we have to add a Def for CC as well? This will expand to a sequence including a SRA, so it will clobber CC ...
lib/Target/SystemZ/SystemZInstrInfo.cpp
225	Would be nice to have a 32-bit and a 64-bit variant, where a LGFR of a SELECT_CMP32 would combine into a SELECT_CMP64 that is then implemented via SLLG / SRAG.
240	I think this needs an implicit def of CC so that it matches what the instruction does. Did you try running the test cases with -verify-machineinstrs? That should have detected such mismatches ...
498	If we had the SELECT_CMP64, we wouldn't have to handle LGFR here.
501	I think if SELCMP is NULL, we also need to return false here, otherwise we'll crash below.
576	So given that we no longer generate the IPM / SRL / RLL sequence, and removeIPMBasedCompare checks for that very sequence, do we even need that routine any more?
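The null-check remark on line 501 can be illustrated in isolation. The sketch below uses a hypothetical Instr struct and opcode constant in place of the real LLVM types, and contrasts the guard as written in the patch with the form the comment asks for.

```cpp
#include <cassert>

// Hypothetical stand-ins for MachineInstr and SystemZ::SelectCmp32.
struct Instr {
  unsigned opcode;
};
constexpr unsigned kSelectCmp32 = 1;
constexpr unsigned kOtherOpcode = 2;

// Guard as written in the patch: a null definition is NOT rejected, so the
// code that follows would go on to dereference it.
inline bool patchGuardRejects(const Instr *def) {
  return def && def->opcode != kSelectCmp32;
}

// Guard in the form the review comment suggests: reject a missing definition
// as well as a definition with the wrong opcode.
inline bool suggestedGuardRejects(const Instr *def) {
  return !def || def->opcode != kSelectCmp32;
}
```

Both guards agree whenever a defining instruction exists; they differ only for the null case, which is exactly the case that would crash later.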

Now fixed in a slightly different manner as r353304.

uweigand abandoned this revision.Feb 6 2019, 7:13 AM

Revision Contents

Path

Size

lib/

Target/

SystemZ/

SystemZISelLowering.h

3 lines

SystemZISelLowering.cpp

1 line

SystemZInstrFormats.td

9 lines

1 line

60 lines

2 lines

2 lines

SystemZSelectionDAGInfo.cpp

20 lines

test/

CodeGen/

SystemZ/

8 lines

8 lines

8 lines

8 lines

Diff 69149

lib/Target/SystemZ/SystemZISelLowering.h

...
enum NodeType : unsigned {
BR_CCMASK,		BR_CCMASK,

// Selects between operand 0 and operand 1. Operand 2 is the		// Selects between operand 0 and operand 1. Operand 2 is the
// mask of condition-code values for which operand 0 should be		// mask of condition-code values for which operand 0 should be
// chosen over operand 1; it has the same form as BR_CCMASK.		// chosen over operand 1; it has the same form as BR_CCMASK.
// Operand 3 is the flag operand.		// Operand 3 is the flag operand.
SELECT_CCMASK,		SELECT_CCMASK,

		// Implements a 1/0/-1 integer result based on CC.
		SELECT_CMP,

// Evaluates to the gap between the stack pointer and the		// Evaluates to the gap between the stack pointer and the
// base of the dynamically-allocatable area.		// base of the dynamically-allocatable area.
ADJDYNALLOC,		ADJDYNALLOC,

// Extracts the value of a 32-bit access register. Operand 0 is		// Extracts the value of a 32-bit access register. Operand 0 is
// the number of the register.		// the number of the register.
EXTRACT_ACCESS,		EXTRACT_ACCESS,

...

lib/Target/SystemZ/SystemZISelLowering.cpp

...
switch ((SystemZISD::NodeType)Opcode) {
OPCODE(PCREL_WRAPPER);		OPCODE(PCREL_WRAPPER);
OPCODE(PCREL_OFFSET);		OPCODE(PCREL_OFFSET);
OPCODE(IABS);		OPCODE(IABS);
OPCODE(ICMP);		OPCODE(ICMP);
OPCODE(FCMP);		OPCODE(FCMP);
OPCODE(TM);		OPCODE(TM);
OPCODE(BR_CCMASK);		OPCODE(BR_CCMASK);
OPCODE(SELECT_CCMASK);		OPCODE(SELECT_CCMASK);
		OPCODE(SELECT_CMP);
OPCODE(ADJDYNALLOC);		OPCODE(ADJDYNALLOC);
OPCODE(EXTRACT_ACCESS);		OPCODE(EXTRACT_ACCESS);
OPCODE(POPCNT);		OPCODE(POPCNT);
OPCODE(UMUL_LOHI64);		OPCODE(UMUL_LOHI64);
OPCODE(SDIVREM32);		OPCODE(SDIVREM32);
OPCODE(SDIVREM64);		OPCODE(SDIVREM64);
OPCODE(UDIVREM32);		OPCODE(UDIVREM32);
OPCODE(UDIVREM64);		OPCODE(UDIVREM64);
...

lib/Target/SystemZ/SystemZInstrFormats.td

...
class SelectWrapper<RegisterOperand cls>
let usesCustomInserter = 1;		let usesCustomInserter = 1;
// Although the instructions used by these nodes do not in themselves		// Although the instructions used by these nodes do not in themselves
// change CC, the insertion requires new blocks, and CC cannot be live		// change CC, the insertion requires new blocks, and CC cannot be live
// across them.		// across them.
let Defs = [CC];		let Defs = [CC];
let Uses = [CC];		let Uses = [CC];
}		}

		// Implements 1/0/-1 integer result based on CC, for use in strcmp and memcmp
		// expansions.
		class SelectCmpWrapper<RegisterOperand cls>
		: Pseudo<(outs cls:$dst),
		(ins),
		[(set cls:$dst, (z_select_cmp))]> {
		let Uses = [CC];
		uweigand: Don't we have to add a Def for CC as well? This will expand to a sequence including a SRA, so it will clobber CC ...
		}

// Stores $new to $addr if $cc is true ("" case) or false (Inv case).		// Stores $new to $addr if $cc is true ("" case) or false (Inv case).
multiclass CondStores<RegisterOperand cls, SDPatternOperator store,		multiclass CondStores<RegisterOperand cls, SDPatternOperator store,
SDPatternOperator load, AddressingMode mode> {		SDPatternOperator load, AddressingMode mode> {
let Defs = [CC], Uses = [CC], usesCustomInserter = 1 in {		let Defs = [CC], Uses = [CC], usesCustomInserter = 1 in {
def "" : Pseudo<(outs),		def "" : Pseudo<(outs),
(ins cls:$new, mode:$addr, imm32zx4:$valid, imm32zx4:$cc),		(ins cls:$new, mode:$addr, imm32zx4:$valid, imm32zx4:$cc),
[(store (z_select_ccmask cls:$new, (load mode:$addr),		[(store (z_select_ccmask cls:$new, (load mode:$addr),
imm32zx4:$valid, imm32zx4:$cc),		imm32zx4:$valid, imm32zx4:$cc),
...

lib/Target/SystemZ/SystemZInstrInfo.h

...
class SystemZInstrInfo : public SystemZGenInstrInfo {
void expandRXYPseudo(MachineInstr &MI, unsigned LowOpcode,		void expandRXYPseudo(MachineInstr &MI, unsigned LowOpcode,
unsigned HighOpcode) const;		unsigned HighOpcode) const;
void expandZExtPseudo(MachineInstr &MI, unsigned LowOpcode,		void expandZExtPseudo(MachineInstr &MI, unsigned LowOpcode,
unsigned Size) const;		unsigned Size) const;
void expandLoadStackGuard(MachineInstr *MI) const;		void expandLoadStackGuard(MachineInstr *MI) const;
void emitGRX32Move(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,		void emitGRX32Move(MachineBasicBlock &MBB, MachineBasicBlock::iterator MBBI,
const DebugLoc &DL, unsigned DestReg, unsigned SrcReg,		const DebugLoc &DL, unsigned DestReg, unsigned SrcReg,
unsigned LowLowOpcode, unsigned Size, bool KillSrc) const;		unsigned LowLowOpcode, unsigned Size, bool KillSrc) const;
		void expandSelectCmp(MachineInstr &MI) const;
virtual void anchor();		virtual void anchor();

public:		public:
explicit SystemZInstrInfo(SystemZSubtarget &STI);		explicit SystemZInstrInfo(SystemZSubtarget &STI);

// Override TargetInstrInfo.		// Override TargetInstrInfo.
unsigned isLoadFromStackSlot(const MachineInstr &MI,		unsigned isLoadFromStackSlot(const MachineInstr &MI,
int &FrameIndex) const override;		int &FrameIndex) const override;
...

lib/Target/SystemZ/SystemZInstrInfo.cpp

...
void SystemZInstrInfo::emitGRX32Move(MachineBasicBlock &MBB,
}		}
unsigned Rotate = (DestIsHigh != SrcIsHigh ? 32 : 0);		unsigned Rotate = (DestIsHigh != SrcIsHigh ? 32 : 0);
BuildMI(MBB, MBBI, DL, get(Opcode), DestReg)		BuildMI(MBB, MBBI, DL, get(Opcode), DestReg)
.addReg(DestReg, RegState::Undef)		.addReg(DestReg, RegState::Undef)
.addReg(SrcReg, getKillRegState(KillSrc))		.addReg(SrcReg, getKillRegState(KillSrc))
.addImm(32 - Size).addImm(128 + 31).addImm(Rotate);		.addImm(32 - Size).addImm(128 + 31).addImm(Rotate);
}		}

		void SystemZInstrInfo::expandSelectCmp(MachineInstr &MI) const {
		uweigand: Would be nice to have a 32-bit and a 64-bit variant, where a LGFR of a SELECT_CMP32 would combine into a SELECT_CMP64 that is then implemented via SLLG / SRAG.
		const SystemZInstrInfo *TII =
		static_cast<const SystemZInstrInfo *>(STI.getInstrInfo());

		MachineBasicBlock *MBB = MI.getParent();
		unsigned DestReg = MI.getOperand(0).getReg();
		DebugLoc DL = MI.getDebugLoc();

		BuildMI(*MBB, MI, DL, TII->get(SystemZ::IPM))
		.addReg(DestReg, RegState::Define);
		BuildMI(*MBB, MI, DL, TII->get(SystemZ::SLL))
		.addReg(DestReg, RegState::Define).addReg(DestReg, RegState::Kill)
		.addReg(0).addImm(2);
		BuildMI(*MBB, MI, DL, TII->get(SystemZ::SRA))
		.addReg(DestReg, RegState::Define).addReg(DestReg, RegState::Kill)
		.addReg(0).addImm(30);
		uweigand: I think this needs an implicit def of CC so that it matches what the instruction does. Did you try running the test cases with -verify-machineinstrs? That should have detected such mismatches ...

		MI.eraseFromParent();
		}

// If MI is a simple load or store for a frame object, return the register		// If MI is a simple load or store for a frame object, return the register
// it loads or stores and set FrameIndex to the index of the frame object.		// it loads or stores and set FrameIndex to the index of the frame object.
// Return 0 otherwise.		// Return 0 otherwise.
//		//
// Flag is SimpleBDXLoad for loads and SimpleBDXStore for stores.		// Flag is SimpleBDXLoad for loads and SimpleBDXStore for stores.
static int isSimpleMove(const MachineInstr &MI, int &FrameIndex,		static int isSimpleMove(const MachineInstr &MI, int &FrameIndex,
unsigned Flag) {		unsigned Flag) {
const MCInstrDesc &MCID = MI.getDesc();		const MCInstrDesc &MCID = MI.getDesc();
...
}		}

// If the destination of MI has no uses, delete it as dead.		// If the destination of MI has no uses, delete it as dead.
static void eraseIfDead(MachineInstr *MI, const MachineRegisterInfo *MRI) {		static void eraseIfDead(MachineInstr *MI, const MachineRegisterInfo *MRI) {
if (MRI->use_nodbg_empty(MI->getOperand(0).getReg()))		if (MRI->use_nodbg_empty(MI->getOperand(0).getReg()))
MI->eraseFromParent();		MI->eraseFromParent();
}		}

		static bool removeSelectCmpCompare(MachineInstr &Compare, unsigned SrcReg,
		const MachineRegisterInfo *MRI,
		const TargetRegisterInfo *TRI) {
		MachineInstr *LGFR = nullptr;
		MachineInstr *SELCMP = getDef(SrcReg, MRI);
		if (SELCMP && SELCMP->getOpcode() == SystemZ::LGFR) {
		LGFR = SELCMP;
		SELCMP = getDef(LGFR->getOperand(1).getReg(), MRI);
		}
		uweigand: If we had the SELECT_CMP64, we wouldn't have to handle LGFR here.

		if (SELCMP && SELCMP->getOpcode() != SystemZ::SelectCmp32)
		return false;
		uweigand: I think if SELCMP is NULL, we also need to return false here, otherwise we'll crash below.

		// Check that there are no assignments to CC between SelectCmp and Compare
		if (SELCMP->getParent() != Compare.getParent())
		return false;
		MachineBasicBlock::iterator MBBI = SELCMP, MBBE = Compare.getIterator();
		for (++MBBI; MBBI != MBBE; ++MBBI) {
		MachineInstr &MI = *MBBI;
		if (MI.modifiesRegister(SystemZ::CC, TRI))
		return false;
		}

		Compare.eraseFromParent();
		if (LGFR)
		eraseIfDead(LGFR, MRI);
		eraseIfDead(SELCMP, MRI);

		return true;
		}

// Compare compares SrcReg against zero. Check whether SrcReg contains		// Compare compares SrcReg against zero. Check whether SrcReg contains
// the result of an IPM sequence whose input CC survives until Compare,		// the result of an IPM sequence whose input CC survives until Compare,
// and whether Compare is therefore redundant. Delete it and return		// and whether Compare is therefore redundant. Delete it and return
// true if so.		// true if so.
static bool removeIPMBasedCompare(MachineInstr &Compare, unsigned SrcReg,		static bool removeIPMBasedCompare(MachineInstr &Compare, unsigned SrcReg,
const MachineRegisterInfo *MRI,		const MachineRegisterInfo *MRI,
const TargetRegisterInfo *TRI) {		const TargetRegisterInfo *TRI) {
MachineInstr *LGFR = nullptr;		MachineInstr *LGFR = nullptr;
...
return true;		return true;
}		}

bool SystemZInstrInfo::optimizeCompareInstr(		bool SystemZInstrInfo::optimizeCompareInstr(
MachineInstr &Compare, unsigned SrcReg, unsigned SrcReg2, int Mask,		MachineInstr &Compare, unsigned SrcReg, unsigned SrcReg2, int Mask,
int Value, const MachineRegisterInfo *MRI) const {		int Value, const MachineRegisterInfo *MRI) const {
assert(!SrcReg2 && "Only optimizing constant comparisons so far");		assert(!SrcReg2 && "Only optimizing constant comparisons so far");
bool IsLogical = (Compare.getDesc().TSFlags & SystemZII::IsLogical) != 0;		bool IsLogical = (Compare.getDesc().TSFlags & SystemZII::IsLogical) != 0;
		if (Value == 0 &&
		!IsLogical &&
		removeSelectCmpCompare(Compare, SrcReg, MRI, &RI)) {
		return true;
		}
return Value == 0 && !IsLogical &&		return Value == 0 && !IsLogical &&
removeIPMBasedCompare(Compare, SrcReg, MRI, &RI);		removeIPMBasedCompare(Compare, SrcReg, MRI, &RI);
		uweigand: So given that we no longer generate the IPM / SRL / RLL sequence, and removeIPMBasedCompare checks for that very sequence, do we even need that routine any more?
}		}

// If Opcode is a move that has a conditional variant, return that variant,		// If Opcode is a move that has a conditional variant, return that variant,
// otherwise return 0.		// otherwise return 0.
static unsigned getConditionalMove(unsigned Opcode) {		static unsigned getConditionalMove(unsigned Opcode) {
switch (Opcode) {		switch (Opcode) {
case SystemZ::LR: return SystemZ::LOCR;		case SystemZ::LR: return SystemZ::LOCR;
case SystemZ::LGR: return SystemZ::LOCGR;		case SystemZ::LGR: return SystemZ::LOCGR;
...
MachineInstr *SystemZInstrInfo::foldMemoryOperandImpl(
MachineFunction &MF, MachineInstr &MI, ArrayRef<unsigned> Ops,		MachineFunction &MF, MachineInstr &MI, ArrayRef<unsigned> Ops,
MachineBasicBlock::iterator InsertPt, MachineInstr &LoadMI,		MachineBasicBlock::iterator InsertPt, MachineInstr &LoadMI,
LiveIntervals *LIS) const {		LiveIntervals *LIS) const {
return nullptr;		return nullptr;
}		}

bool SystemZInstrInfo::expandPostRAPseudo(MachineInstr &MI) const {		bool SystemZInstrInfo::expandPostRAPseudo(MachineInstr &MI) const {
switch (MI.getOpcode()) {		switch (MI.getOpcode()) {
		case SystemZ::SelectCmp32:
		expandSelectCmp(MI);
		return true;

case SystemZ::L128:		case SystemZ::L128:
splitMove(MI, SystemZ::LG);		splitMove(MI, SystemZ::LG);
return true;		return true;

case SystemZ::ST128:		case SystemZ::ST128:
splitMove(MI, SystemZ::STG);		splitMove(MI, SystemZ::STG);
return true;		return true;

...

lib/Target/SystemZ/SystemZInstrInfo.td

...
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	// Select instructions			// Select instructions
	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//

	def Select32Mux : SelectWrapper<GRX32>, Requires<[FeatureHighWord]>;			def Select32Mux : SelectWrapper<GRX32>, Requires<[FeatureHighWord]>;
	def Select32 : SelectWrapper<GR32>;			def Select32 : SelectWrapper<GR32>;
	def Select64 : SelectWrapper<GR64>;			def Select64 : SelectWrapper<GR64>;

				def SelectCmp32 : SelectCmpWrapper<GR32>;

	// We don't define 32-bit Mux stores because the low-only STOC should			// We don't define 32-bit Mux stores because the low-only STOC should
	// always be used if possible.			// always be used if possible.
	defm CondStore8Mux : CondStores<GRX32, nonvolatile_truncstorei8,			defm CondStore8Mux : CondStores<GRX32, nonvolatile_truncstorei8,
	nonvolatile_anyextloadi8, bdxaddr20only>,			nonvolatile_anyextloadi8, bdxaddr20only>,
	Requires<[FeatureHighWord]>;			Requires<[FeatureHighWord]>;
	defm CondStore16Mux : CondStores<GRX32, nonvolatile_truncstorei16,			defm CondStore16Mux : CondStores<GRX32, nonvolatile_truncstorei16,
	nonvolatile_anyextloadi16, bdxaddr20only>,			nonvolatile_anyextloadi16, bdxaddr20only>,
	Requires<[FeatureHighWord]>;			Requires<[FeatureHighWord]>;
...

lib/Target/SystemZ/SystemZOperators.td

...
	def z_iabs : SDNode<"SystemZISD::IABS", SDTIntUnaryOp, []>;			def z_iabs : SDNode<"SystemZISD::IABS", SDTIntUnaryOp, []>;
	def z_icmp : SDNode<"SystemZISD::ICMP", SDT_ZICmp, [SDNPOutGlue]>;			def z_icmp : SDNode<"SystemZISD::ICMP", SDT_ZICmp, [SDNPOutGlue]>;
	def z_fcmp : SDNode<"SystemZISD::FCMP", SDT_ZCmp, [SDNPOutGlue]>;			def z_fcmp : SDNode<"SystemZISD::FCMP", SDT_ZCmp, [SDNPOutGlue]>;
	def z_tm : SDNode<"SystemZISD::TM", SDT_ZICmp, [SDNPOutGlue]>;			def z_tm : SDNode<"SystemZISD::TM", SDT_ZICmp, [SDNPOutGlue]>;
	def z_br_ccmask : SDNode<"SystemZISD::BR_CCMASK", SDT_ZBRCCMask,			def z_br_ccmask : SDNode<"SystemZISD::BR_CCMASK", SDT_ZBRCCMask,
	[SDNPHasChain, SDNPInGlue]>;			[SDNPHasChain, SDNPInGlue]>;
	def z_select_ccmask : SDNode<"SystemZISD::SELECT_CCMASK", SDT_ZSelectCCMask,			def z_select_ccmask : SDNode<"SystemZISD::SELECT_CCMASK", SDT_ZSelectCCMask,
	[SDNPInGlue]>;			[SDNPInGlue]>;
				def z_select_cmp : SDNode<"SystemZISD::SELECT_CMP", SDT_ZI32Intrinsic,
				[SDNPInGlue]>;
	def z_adjdynalloc : SDNode<"SystemZISD::ADJDYNALLOC", SDT_ZAdjDynAlloc>;			def z_adjdynalloc : SDNode<"SystemZISD::ADJDYNALLOC", SDT_ZAdjDynAlloc>;
	def z_extract_access : SDNode<"SystemZISD::EXTRACT_ACCESS",			def z_extract_access : SDNode<"SystemZISD::EXTRACT_ACCESS",
	SDT_ZExtractAccess>;			SDT_ZExtractAccess>;
	def z_popcnt : SDNode<"SystemZISD::POPCNT", SDTIntUnaryOp>;			def z_popcnt : SDNode<"SystemZISD::POPCNT", SDTIntUnaryOp>;
	def z_umul_lohi64 : SDNode<"SystemZISD::UMUL_LOHI64", SDT_ZGR128Binary64>;			def z_umul_lohi64 : SDNode<"SystemZISD::UMUL_LOHI64", SDT_ZGR128Binary64>;
	def z_sdivrem32 : SDNode<"SystemZISD::SDIVREM32", SDT_ZGR128Binary32>;			def z_sdivrem32 : SDNode<"SystemZISD::SDIVREM32", SDT_ZGR128Binary32>;
	def z_sdivrem64 : SDNode<"SystemZISD::SDIVREM64", SDT_ZGR128Binary64>;			def z_sdivrem64 : SDNode<"SystemZISD::SDIVREM64", SDT_ZGR128Binary64>;
	def z_udivrem32 : SDNode<"SystemZISD::UDIVREM32", SDT_ZGR128Binary32>;			def z_udivrem32 : SDNode<"SystemZISD::UDIVREM32", SDT_ZGR128Binary32>;
...

lib/Target/SystemZ/SystemZSelectionDAGInfo.cpp

...
static SDValue emitCLC(SelectionDAG &DAG, const SDLoc &DL, SDValue Chain,
if (Size > 3 * 256)		if (Size > 3 * 256)
return DAG.getNode(SystemZISD::CLC_LOOP, DL, VTs, Chain, Src1, Src2,		return DAG.getNode(SystemZISD::CLC_LOOP, DL, VTs, Chain, Src1, Src2,
DAG.getConstant(Size, DL, PtrVT),		DAG.getConstant(Size, DL, PtrVT),
DAG.getConstant(Size / 256, DL, PtrVT));		DAG.getConstant(Size / 256, DL, PtrVT));
return DAG.getNode(SystemZISD::CLC, DL, VTs, Chain, Src1, Src2,		return DAG.getNode(SystemZISD::CLC, DL, VTs, Chain, Src1, Src2,
DAG.getConstant(Size, DL, PtrVT));		DAG.getConstant(Size, DL, PtrVT));
}		}

// Convert the current CC value into an integer that is 0 if CC == 0,
// less than zero if CC == 1 and greater than zero if CC >= 2.
// The sequence starts with IPM, which puts CC into bits 29 and 28
// of an integer and clears bits 30 and 31.
static SDValue addIPMSequence(const SDLoc &DL, SDValue Glue,
SelectionDAG &DAG) {
SDValue IPM = DAG.getNode(SystemZISD::IPM, DL, MVT::i32, Glue);
SDValue SRL = DAG.getNode(ISD::SRL, DL, MVT::i32, IPM,
DAG.getConstant(SystemZ::IPM_CC, DL, MVT::i32));
SDValue ROTL = DAG.getNode(ISD::ROTL, DL, MVT::i32, SRL,
DAG.getConstant(31, DL, MVT::i32));
return ROTL;
}

std::pair<SDValue, SDValue> SystemZSelectionDAGInfo::EmitTargetCodeForMemcmp(		std::pair<SDValue, SDValue> SystemZSelectionDAGInfo::EmitTargetCodeForMemcmp(
SelectionDAG &DAG, const SDLoc &DL, SDValue Chain, SDValue Src1,		SelectionDAG &DAG, const SDLoc &DL, SDValue Chain, SDValue Src1,
SDValue Src2, SDValue Size, MachinePointerInfo Op1PtrInfo,		SDValue Src2, SDValue Size, MachinePointerInfo Op1PtrInfo,
MachinePointerInfo Op2PtrInfo) const {		MachinePointerInfo Op2PtrInfo) const {
if (auto *CSize = dyn_cast<ConstantSDNode>(Size)) {		if (auto *CSize = dyn_cast<ConstantSDNode>(Size)) {
uint64_t Bytes = CSize->getZExtValue();		uint64_t Bytes = CSize->getZExtValue();
assert(Bytes > 0 && "Caller should have handled 0-size case");		assert(Bytes > 0 && "Caller should have handled 0-size case");
Chain = emitCLC(DAG, DL, Chain, Src1, Src2, Bytes);		Chain = emitCLC(DAG, DL, Chain, Src1, Src2, Bytes);
SDValue Glue = Chain.getValue(1);		SDValue Glue = Chain.getValue(1);
return std::make_pair(addIPMSequence(DL, Glue, DAG), Chain);		SDValue SelCmp = DAG.getNode(SystemZISD::SELECT_CMP, DL, MVT::i32, Glue);
		return std::make_pair(SelCmp, Chain);
}		}
return std::make_pair(SDValue(), SDValue());		return std::make_pair(SDValue(), SDValue());
}		}

std::pair<SDValue, SDValue> SystemZSelectionDAGInfo::EmitTargetCodeForMemchr(		std::pair<SDValue, SDValue> SystemZSelectionDAGInfo::EmitTargetCodeForMemchr(
SelectionDAG &DAG, const SDLoc &DL, SDValue Chain, SDValue Src,		SelectionDAG &DAG, const SDLoc &DL, SDValue Chain, SDValue Src,
SDValue Char, SDValue Length, MachinePointerInfo SrcPtrInfo) const {		SDValue Char, SDValue Length, MachinePointerInfo SrcPtrInfo) const {
// Use SRST to find the character. End is its address on success.		// Use SRST to find the character. End is its address on success.
...
std::pair<SDValue, SDValue> SystemZSelectionDAGInfo::EmitTargetCodeForStrcmp(
SelectionDAG &DAG, const SDLoc &DL, SDValue Chain, SDValue Src1,		SelectionDAG &DAG, const SDLoc &DL, SDValue Chain, SDValue Src1,
SDValue Src2, MachinePointerInfo Op1PtrInfo,		SDValue Src2, MachinePointerInfo Op1PtrInfo,
MachinePointerInfo Op2PtrInfo) const {		MachinePointerInfo Op2PtrInfo) const {
SDVTList VTs = DAG.getVTList(Src1.getValueType(), MVT::Other, MVT::Glue);		SDVTList VTs = DAG.getVTList(Src1.getValueType(), MVT::Other, MVT::Glue);
SDValue Unused = DAG.getNode(SystemZISD::STRCMP, DL, VTs, Chain, Src1, Src2,		SDValue Unused = DAG.getNode(SystemZISD::STRCMP, DL, VTs, Chain, Src1, Src2,
DAG.getConstant(0, DL, MVT::i32));		DAG.getConstant(0, DL, MVT::i32));
Chain = Unused.getValue(1);		Chain = Unused.getValue(1);
SDValue Glue = Chain.getValue(2);		SDValue Glue = Chain.getValue(2);
return std::make_pair(addIPMSequence(DL, Glue, DAG), Chain);		SDValue SelCmp = DAG.getNode(SystemZISD::SELECT_CMP, DL, MVT::i32, Glue);
		return std::make_pair(SelCmp, Chain);
}		}

// Search from Src for a null character, stopping once Src reaches Limit.		// Search from Src for a null character, stopping once Src reaches Limit.
// Return a pair of values, the first being the number of nonnull characters		// Return a pair of values, the first being the number of nonnull characters
// and the second being the out chain.		// and the second being the out chain.
//		//
// This can be used for strlen by setting Limit to 0.		// This can be used for strlen by setting Limit to 0.
static std::pair<SDValue, SDValue> getBoundedStrlen(SelectionDAG &DAG,		static std::pair<SDValue, SDValue> getBoundedStrlen(SelectionDAG &DAG,
[27 lines not shown]

test/CodeGen/SystemZ/memcmp-01.ll

[12 lines not shown]
; CHECK: br %r14		; CHECK: br %r14
ret i32 %res		ret i32 %res
}		}

; Check a case where the result is used as an integer.		; Check a case where the result is used as an integer.
define i32 @f2(i8 *%src1, i8 *%src2) {		define i32 @f2(i8 *%src1, i8 *%src2) {
; CHECK-LABEL: f2:		; CHECK-LABEL: f2:
; CHECK: clc 0(2,%r2), 0(%r3)		; CHECK: clc 0(2,%r2), 0(%r3)
; CHECK: ipm [[REG:%r[0-5]]]		; CHECK: ipm [[REG:%r[0-5]]]
; CHECK: srl [[REG]], 28		; CHECK: sll [[REG]], 2
; CHECK: rll %r2, [[REG]], 31		; CHECK: sra [[REG]], 30
; CHECK: br %r14		; CHECK: br %r14
%res = call i32 @memcmp(i8 *%src1, i8 *%src2, i64 2)		%res = call i32 @memcmp(i8 *%src1, i8 *%src2, i64 2)
ret i32 %res		ret i32 %res
}		}
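
For reference, the arithmetic behind the two CHECK sequences above can be sketched as follows. This is an illustrative model, not code from the patch: `ipm`, `rotl31`, `sra30`, `oldSequence` and `newSequence` are hypothetical helper names, only the low 32 bits of the register are modelled, and the program-mask bits are assumed to be zero (both sequences shift them out anyway).

```cpp
#include <cstdint>

// Assumed model of IPM on the low 32 bits: the condition code lands in
// bits 28-29 (counting the least significant bit as bit 0) and bits 30-31
// are cleared. The program mask is taken to be zero in this sketch.
constexpr std::uint32_t ipm(std::uint32_t CC) { return (CC & 3u) << 28; }

// Rotate left by 31, which is the same as rotating right by 1.
constexpr std::uint32_t rotl31(std::uint32_t X) { return (X << 31) | (X >> 1); }

// Arithmetic shift right by 30, written with unsigned operations so the
// result does not depend on how the compiler shifts negative values.
constexpr std::uint32_t sra30(std::uint32_t X) {
  return (X >> 30) | ((X & 0x80000000u) ? 0xFFFFFFFCu : 0u);
}

// Old sequence: ipm; srl 28; rll 31.
constexpr std::int32_t oldSequence(std::uint32_t CC) {
  return static_cast<std::int32_t>(rotl31(ipm(CC) >> 28));
}

// New sequence: ipm; sll 2; sra 30.
constexpr std::int32_t newSequence(std::uint32_t CC) {
  return static_cast<std::int32_t>(sra30(ipm(CC) << 2));
}
```

Under this model the old sequence maps CC 0/1/2 to 0/INT32_MIN/1, while the sll/sra pair sign-extends the two-bit CC and maps CC 0/1/2/3 to 0/1/-2/-1. The sign attached to CC 1 and CC 2 therefore differs between the two sequences; how the surrounding lowering accounts for that is outside the scope of this sketch.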

; Check a case where the result is tested for equality.		; Check a case where the result is tested for equality.
define void @f3(i8 *%src1, i8 *%src2, i32 *%dest) {		define void @f3(i8 *%src1, i8 *%src2, i32 *%dest) {
; CHECK-LABEL: f3:		; CHECK-LABEL: f3:
[70 lines not shown]
}		}

; Check the upper end of the CLC range. Here the result is used both as		; Check the upper end of the CLC range. Here the result is used both as
; an integer and for branching.		; an integer and for branching.
define i32 @f7(i8 *%src1, i8 *%src2, i32 *%dest) {		define i32 @f7(i8 *%src1, i8 *%src2, i32 *%dest) {
; CHECK-LABEL: f7:		; CHECK-LABEL: f7:
; CHECK: clc 0(256,%r2), 0(%r3)		; CHECK: clc 0(256,%r2), 0(%r3)
; CHECK: ipm [[REG:%r[0-5]]]		; CHECK: ipm [[REG:%r[0-5]]]
; CHECK: srl [[REG]], 28		; CHECK: sll [[REG]], 2
; CHECK: rll %r2, [[REG]], 31		; CHECK: sra [[REG]], 30
; CHECK: blr %r14		; CHECK: blr %r14
; CHECK: br %r14		; CHECK: br %r14
entry:		entry:
%res = call i32 @memcmp(i8 *%src1, i8 *%src2, i64 256)		%res = call i32 @memcmp(i8 *%src1, i8 *%src2, i64 256)
%cmp = icmp slt i32 %res, 0		%cmp = icmp slt i32 %res, 0
br i1 %cmp, label %exit, label %store		br i1 %cmp, label %exit, label %store

store:		store:
[103 lines not shown]

test/CodeGen/SystemZ/memcmp-02.ll

[12 lines not shown]
; CHECK: br %r14		; CHECK: br %r14
ret i64 %res		ret i64 %res
}		}

; Check a case where the result is used as an integer.		; Check a case where the result is used as an integer.
define i64 @f2(i8 *%src1, i8 *%src2) {		define i64 @f2(i8 *%src1, i8 *%src2) {
; CHECK-LABEL: f2:		; CHECK-LABEL: f2:
; CHECK: clc 0(2,%r2), 0(%r3)		; CHECK: clc 0(2,%r2), 0(%r3)
; CHECK: ipm [[REG:%r[0-5]]]		; CHECK: ipm [[REG:%r[0-5]]]
; CHECK: srl [[REG]], 28		; CHECK: sll [[REG]], 2
; CHECK: rll [[REG]], [[REG]], 31		; CHECK: sra [[REG]], 30
; CHECK: lgfr %r2, [[REG]]		; CHECK: lgfr %r2, [[REG]]
; CHECK: br %r14		; CHECK: br %r14
%res = call i64 @memcmp(i8 *%src1, i8 *%src2, i64 2)		%res = call i64 @memcmp(i8 *%src1, i8 *%src2, i64 2)
ret i64 %res		ret i64 %res
}		}
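
The extra lgfr in the i64 tests sign-extends the 32-bit result into the 64-bit return register. A minimal sketch of that step, with `lgfr` and `llgfr` as hypothetical helper names that model only the effect of the instructions on values:

```cpp
#include <cstdint>

// lgfr: load a 32-bit value into a 64-bit register with sign extension.
constexpr std::int64_t lgfr(std::int32_t X) {
  return static_cast<std::int64_t>(X);
}

// llgfr (shown only for contrast): the zero-extending counterpart.
constexpr std::int64_t llgfr(std::int32_t X) {
  return static_cast<std::int64_t>(static_cast<std::uint32_t>(X));
}
```

With sign extension a negative 32-bit result stays negative as an i64, so a later `icmp slt i64 %res, 0` still sees the intended sign; zero extension would turn -1 into 4294967295.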

; Check a case where the result is tested for equality.		; Check a case where the result is tested for equality.
define void @f3(i8 *%src1, i8 *%src2, i64 *%dest) {		define void @f3(i8 *%src1, i8 *%src2, i64 *%dest) {
[71 lines not shown]
}		}

; Check the upper end of the CLC range. Here the result is used both as		; Check the upper end of the CLC range. Here the result is used both as
; an integer and for branching.		; an integer and for branching.
define i64 @f7(i8 *%src1, i8 *%src2, i64 *%dest) {		define i64 @f7(i8 *%src1, i8 *%src2, i64 *%dest) {
; CHECK-LABEL: f7:		; CHECK-LABEL: f7:
; CHECK: clc 0(256,%r2), 0(%r3)		; CHECK: clc 0(256,%r2), 0(%r3)
; CHECK: ipm [[REG:%r[0-5]]]		; CHECK: ipm [[REG:%r[0-5]]]
; CHECK: srl [[REG]], 28		; CHECK: sll [[REG]], 2
; CHECK: rll [[REG]], [[REG]], 31		; CHECK: sra [[REG]], 30
; CHECK: lgfr %r2, [[REG]]		; CHECK: lgfr %r2, [[REG]]
; CHECK: blr %r14		; CHECK: blr %r14
; CHECK: br %r14		; CHECK: br %r14
entry:		entry:
%res = call i64 @memcmp(i8 *%src1, i8 *%src2, i64 256)		%res = call i64 @memcmp(i8 *%src1, i8 *%src2, i64 256)
%cmp = icmp slt i64 %res, 0		%cmp = icmp slt i64 %res, 0
br i1 %cmp, label %exit, label %store		br i1 %cmp, label %exit, label %store

[20 lines not shown]

test/CodeGen/SystemZ/strcmp-01.ll

	; Test strcmp using CLST, i32 version.			; Test strcmp using CLST, i32 version.
	;			;
	; RUN: llc < %s -mtriple=s390x-linux-gnu | FileCheck %s			; RUN: llc < %s -mtriple=s390x-linux-gnu | FileCheck %s

	declare signext i32 @strcmp(i8 *%src1, i8 *%src2)			declare signext i32 @strcmp(i8 *%src1, i8 *%src2)

	; Check a case where the result is used as an integer.			; Check a case where the result is used as an integer.
	define i32 @f1(i8 *%src1, i8 *%src2) {			define i32 @f1(i8 *%src1, i8 *%src2) {
	; CHECK-LABEL: f1:			; CHECK-LABEL: f1:
	; CHECK: lhi %r0, 0			; CHECK: lhi %r0, 0
	; CHECK: [[LABEL:\.[^:]*]]:			; CHECK: [[LABEL:\.[^:]*]]:
	; CHECK: clst %r2, %r3			; CHECK: clst %r2, %r3
	; CHECK-NEXT: jo [[LABEL]]			; CHECK-NEXT: jo [[LABEL]]
	; CHECK-NEXT: BB#{{[0-9]+}}			; CHECK-NEXT: BB#{{[0-9]+}}
	; CHECK-NEXT: ipm [[REG:%r[0-5]]]			; CHECK-NEXT: ipm [[REG:%r[0-5]]]
	; CHECK: srl [[REG]], 28			; CHECK: sll [[REG]], 2
	; CHECK: rll %r2, [[REG]], 31			; CHECK: sra [[REG]], 30
	; CHECK: br %r14			; CHECK: br %r14
	%res = call i32 @strcmp(i8 *%src1, i8 *%src2)			%res = call i32 @strcmp(i8 *%src1, i8 *%src2)
	ret i32 %res			ret i32 %res
	}			}

	; Check a case where the result is tested for equality.			; Check a case where the result is tested for equality.
	define void @f2(i8 *%src1, i8 *%src2, i32 *%dest) {			define void @f2(i8 *%src1, i8 *%src2, i32 *%dest) {
	; CHECK-LABEL: f2:			; CHECK-LABEL: f2:
	[21 lines not shown]
	define i32 @f3(i8 *%src1, i8 *%src2, i32 *%dest) {			define i32 @f3(i8 *%src1, i8 *%src2, i32 *%dest) {
	; CHECK-LABEL: f3:			; CHECK-LABEL: f3:
	; CHECK: lhi %r0, 0			; CHECK: lhi %r0, 0
	; CHECK: [[LABEL:\.[^:]*]]:			; CHECK: [[LABEL:\.[^:]*]]:
	; CHECK: clst %r2, %r3			; CHECK: clst %r2, %r3
	; CHECK-NEXT: jo [[LABEL]]			; CHECK-NEXT: jo [[LABEL]]
	; CHECK-NEXT: BB#{{[0-9]+}}			; CHECK-NEXT: BB#{{[0-9]+}}
	; CHECK-NEXT: ipm [[REG:%r[0-5]]]			; CHECK-NEXT: ipm [[REG:%r[0-5]]]
	; CHECK: srl [[REG]], 28			; CHECK: sll [[REG]], 2
	; CHECK: rll %r2, [[REG]], 31			; CHECK: sra [[REG]], 30
	; CHECK: blr %r14			; CHECK: blr %r14
	; CHECK: br %r14			; CHECK: br %r14
	entry:			entry:
	%res = call i32 @strcmp(i8 *%src1, i8 *%src2)			%res = call i32 @strcmp(i8 *%src1, i8 *%src2)
	%cmp = icmp slt i32 %res, 0			%cmp = icmp slt i32 %res, 0
	br i1 %cmp, label %exit, label %store			br i1 %cmp, label %exit, label %store

	store:			store:
	store i32 0, i32 *%dest			store i32 0, i32 *%dest
	br label %exit			br label %exit

	exit:			exit:
	ret i32 %res			ret i32 %res
	}			}

test/CodeGen/SystemZ/strcmp-02.ll

	; Test strcmp using CLST, i64 version.			; Test strcmp using CLST, i64 version.
	;			;
	; RUN: llc < %s -mtriple=s390x-linux-gnu | FileCheck %s			; RUN: llc < %s -mtriple=s390x-linux-gnu | FileCheck %s

	declare i64 @strcmp(i8 *%src1, i8 *%src2)			declare i64 @strcmp(i8 *%src1, i8 *%src2)

	; Check a case where the result is used as an integer.			; Check a case where the result is used as an integer.
	define i64 @f1(i8 *%src1, i8 *%src2) {			define i64 @f1(i8 *%src1, i8 *%src2) {
	; CHECK-LABEL: f1:			; CHECK-LABEL: f1:
	; CHECK: lhi %r0, 0			; CHECK: lhi %r0, 0
	; CHECK: [[LABEL:\.[^:]*]]:			; CHECK: [[LABEL:\.[^:]*]]:
	; CHECK: clst %r2, %r3			; CHECK: clst %r2, %r3
	; CHECK-NEXT: jo [[LABEL]]			; CHECK-NEXT: jo [[LABEL]]
	; CHECK-NEXT: BB#{{[0-9]+}}			; CHECK-NEXT: BB#{{[0-9]+}}
	; CHECK-NEXT: ipm [[REG:%r[0-5]]]			; CHECK-NEXT: ipm [[REG:%r[0-5]]]
	; CHECK: srl [[REG]], 28			; CHECK: sll [[REG]], 2
	; CHECK: rll [[REG]], [[REG]], 31			; CHECK: sra [[REG]], 30
	; CHECK: lgfr %r2, [[REG]]			; CHECK: lgfr %r2, [[REG]]
	; CHECK: br %r14			; CHECK: br %r14
	%res = call i64 @strcmp(i8 *%src1, i8 *%src2)			%res = call i64 @strcmp(i8 *%src1, i8 *%src2)
	ret i64 %res			ret i64 %res
	}			}

	; Check a case where the result is tested for equality.			; Check a case where the result is tested for equality.
	define void @f2(i8 *%src1, i8 *%src2, i64 *%dest) {			define void @f2(i8 *%src1, i8 *%src2, i64 *%dest) {
	[22 lines not shown]
	define i64 @f3(i8 *%src1, i8 *%src2, i64 *%dest) {			define i64 @f3(i8 *%src1, i8 *%src2, i64 *%dest) {
	; CHECK-LABEL: f3:			; CHECK-LABEL: f3:
	; CHECK: lhi %r0, 0			; CHECK: lhi %r0, 0
	; CHECK: [[LABEL:\.[^:]*]]:			; CHECK: [[LABEL:\.[^:]*]]:
	; CHECK: clst %r2, %r3			; CHECK: clst %r2, %r3
	; CHECK-NEXT: jo [[LABEL]]			; CHECK-NEXT: jo [[LABEL]]
	; CHECK-NEXT: BB#{{[0-9]+}}			; CHECK-NEXT: BB#{{[0-9]+}}
	; CHECK-NEXT: ipm [[REG:%r[0-5]]]			; CHECK-NEXT: ipm [[REG:%r[0-5]]]
	; CHECK: srl [[REG]], 28			; CHECK: sll [[REG]], 2
	; CHECK: rll [[REG]], [[REG]], 31			; CHECK: sra [[REG]], 30
	; CHECK: lgfr %r2, [[REG]]			; CHECK: lgfr %r2, [[REG]]
	; CHECK: blr %r14			; CHECK: blr %r14
	; CHECK: br %r14			; CHECK: br %r14
	entry:			entry:
	%res = call i64 @strcmp(i8 *%src1, i8 *%src2)			%res = call i64 @strcmp(i8 *%src1, i8 *%src2)
	%cmp = icmp slt i64 %res, 0			%cmp = icmp slt i64 %res, 0
	br i1 %cmp, label %exit, label %store			br i1 %cmp, label %exit, label %store

	store:			store:
	store i64 0, i64 *%dest			store i64 0, i64 *%dest
	br label %exit			br label %exit

	exit:			exit:
	ret i64 %res			ret i64 %res
	}			}

Revision Contents

Diff 69149

lib/Target/SystemZ/SystemZISelLowering.h

lib/Target/SystemZ/SystemZISelLowering.cpp

lib/Target/SystemZ/SystemZInstrFormats.td

lib/Target/SystemZ/SystemZInstrInfo.h

lib/Target/SystemZ/SystemZInstrInfo.cpp

lib/Target/SystemZ/SystemZInstrInfo.td

lib/Target/SystemZ/SystemZOperators.td

lib/Target/SystemZ/SystemZSelectionDAGInfo.cpp

test/CodeGen/SystemZ/memcmp-01.ll

test/CodeGen/SystemZ/memcmp-02.ll

test/CodeGen/SystemZ/strcmp-01.ll

test/CodeGen/SystemZ/strcmp-02.ll
