This is an archive of the discontinued LLVM Phabricator instance.

[BranchFolder] Drop kill/dead flags if they aren't present in all merged instructions
Needs Review · Public

Authored by uabelho on Aug 21 2018, 1:37 AM.

Details

Reviewers

kparzysz
gberry
craig.topper
rnk
qcolombet
MatzeB
jonpa

Summary

Just like we already do with undef flags, we need to drop kill flags if
they aren't present in all merged instructions. Otherwise we would merge

<instr1> $r0
<instr2> killed $r0

and

<instr1> killed $r0
<instr2> undef $r0

into

<instr1> killed $r0
<instr2> $r0

and then the verifier would complain about $r0 not being live at <instr2>.

Similarly, we need to drop dead flags if they aren't present in all
merged instructions. Otherwise we would merge

dead $r0 = <instr1>
<instr2> undef $r0

and

$r0 = <instr1>
<instr2> $r0

into

dead $r0 = <instr1>
<instr2> $r0

and then the verifier would complain about $r0 not being live at <instr2>.
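
The rule stated in the summary can be read as an intersection: a flag stays on the merged operand only if every merged instruction carries it. Below is a minimal sketch of that reading on a toy model. OperandFlags, mergeFlags and checkSummaryExamples are hypothetical names made up for illustration, not LLVM APIs. The diff further down this page additionally keeps a kill flag when the other operand is undef; the sketch leaves that exception out.

```cpp
#include <cassert>

// Hypothetical stand-in for the flags on one register operand.
// This is not llvm::MachineOperand, just a toy model for illustration.
struct OperandFlags {
  bool Undef;
  bool Kill;
  bool Dead;
};

// Merge the flags of the same operand taken from two otherwise identical
// instructions. A flag survives only if both inputs carry it.
OperandFlags mergeFlags(OperandFlags A, OperandFlags B) {
  return {A.Undef && B.Undef, A.Kill && B.Kill, A.Dead && B.Dead};
}

// Replays the two examples from the summary on the toy model.
void checkSummaryExamples() {
  const OperandFlags PlainOp{false, false, false};
  const OperandFlags UndefOp{true, false, false};
  const OperandFlags KilledOp{false, true, false};
  const OperandFlags DeadOp{false, false, true};

  // Kill example: <instr1> $r0 merged with <instr1> killed $r0.
  assert(!mergeFlags(PlainOp, KilledOp).Kill);
  // Kill example: <instr2> killed $r0 merged with <instr2> undef $r0.
  OperandFlags I2 = mergeFlags(KilledOp, UndefOp);
  assert(!I2.Kill && !I2.Undef);

  // Dead example: dead $r0 = <instr1> merged with $r0 = <instr1>.
  assert(!mergeFlags(DeadOp, PlainOp).Dead);
  // Dead example: <instr2> undef $r0 merged with <instr2> $r0.
  assert(!mergeFlags(UndefOp, PlainOp).Undef);
}
```

Calling checkSummaryExamples() from a main function runs the checks. In both examples the merged pair ends up with no flags at all, which is conservative but consistent on either incoming path.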

Event Timeline

Found the problem when testing my out-of-tree target, but it is also exposed by the attached test case.

Does the fix make sense or should we do something else?

Ping

Ping?

Rebased, fixed a spelling mistake.

Does anyone have an opinion about this?

Rebased, updated to drop dead flags as well.

Herald added a project: Restricted Project. · Dec 17 2019, 5:59 AM

Herald added a subscriber: hiraditya.

uabelho added reviewers: qcolombet, MatzeB, jonpa. · Dec 19 2019, 12:49 AM

I am a little curious as to how these differences in flags arise. I can understand that a kill flag can be seen as an optimization and is optional. The same would go for a dead flag. But I am not sure about an undef flag... If it is present in one case but not in the other, the two instructions do not seem equivalent, do they?

So I suppose your patch makes sense to me, although I am not familiar enough to understand the undef case... Perhaps some comment on how these three cases can emerge and may be treated would be good? (at least to me :-)

bjope added a subscriber: bjope. · Jan 24 2020, 2:43 PM

bjope added inline comments.

llvm/lib/CodeGen/BranchFolding.cpp
861 (On Diff #234285): Afaict this isn't consistent with the mirrored situation: if MO says "undef" and OtherMO says "killed", the result will be that "undef" is dropped but "killed" is not added. I think the result should be the same regardless of which bb is used as the base for the merge. Shouldn't it? Btw, is there a problem with simply clearing the killed flag unless it is set in both operands? (I'm not sure if an undef implies killed today, or if that might change in the future when the freeze stuff has been implemented. In your example above instr2 is a def, but I assume it could be a use as well, that could be undef, or maybe even a use of the same undef value as used in instr1.)
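
The asymmetry raised in this inline comment can be checked on a toy model. The sketch below assumes that Diff #234285 handles kill and undef in the same way as the BranchFolding.cpp diff shown on this page. UseFlags, mergeAsInPatch, mergeStrict and checkSymmetry are hypothetical names for illustration only, not LLVM APIs.

```cpp
#include <cassert>

// Hypothetical stand-in for the two flags on a register use operand.
struct UseFlags {
  bool Undef;
  bool Kill;
};

// Follows the shape of the kill/undef handling in the BranchFolding.cpp diff
// on this page: Base plays the role of MO, Other the role of OtherMO.
UseFlags mergeAsInPatch(UseFlags Base, UseFlags Other) {
  if (Base.Undef && !Other.Undef)
    Base.Undef = false;
  if (Base.Kill && !Other.Kill && !Other.Undef)
    Base.Kill = false;
  return Base;
}

// The simpler alternative from the comment: keep a flag only if both have it.
UseFlags mergeStrict(UseFlags Base, UseFlags Other) {
  return {Base.Undef && Other.Undef, Base.Kill && Other.Kill};
}

bool sameFlags(UseFlags A, UseFlags B) {
  return A.Undef == B.Undef && A.Kill == B.Kill;
}

void checkSymmetry() {
  const UseFlags UndefUse{true, false};
  const UseFlags KilledUse{false, true};

  // Patch-style merge: the result depends on which block is the base.
  UseFlags FromUndef = mergeAsInPatch(UndefUse, KilledUse);   // plain use
  UseFlags FromKilled = mergeAsInPatch(KilledUse, UndefUse);  // killed use
  assert(!FromUndef.Undef && !FromUndef.Kill);
  assert(!FromKilled.Undef && FromKilled.Kill);
  assert(!sameFlags(FromUndef, FromKilled));

  // Strict intersection gives the same result either way.
  assert(sameFlags(mergeStrict(UndefUse, KilledUse),
                   mergeStrict(KilledUse, UndefUse)));
}
```

In this model the patch-style merge yields a plain use when the undef operand is the base and a killed use when the killed operand is the base, while the strict variant yields a plain use in both orders.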

In D51028#1839423, @jonpa wrote:

I am a little curious as to how these differences in flags arise. I can understand that a kill flag can be seen as an optimization and is optional. The same would go for a dead flag. But I am not sure about an undef flag... If it is present in one case but not in the other, the two instructions do not seem equivalent, do they?

So I suppose your patch makes sense to me, although I am not familiar enough to understand the undef case... Perhaps some comment on how these three cases can emerge and may be treated would be good? (at least to me :-)

One full llc reproducer for my out-of-tree target showing that we need to clear the dead flag would look like this:

@e = external global i16, align 1

declare void @q(i16, i16, i16, i16, i16)

define void @x() {
bb.0:
  %tmp = alloca i16, align 1
  %tmp3 = alloca i16, align 1
  br i1 undef, label %bb.2, label %bb.1

bb.1:                                             ; preds = %bb.0
  %0 = load volatile i16, i16* @e, align 1
  store i16 0, i16* %tmp3, align 1
  tail call void @q(i16 undef, i16 undef, i16 undef, i16 undef, i16 %0)
  unreachable

bb.2:                                             ; preds = %bb.0
  %1 = load volatile i16, i16* @e, align 1
  store i16 0, i16* %tmp, align 1
  tail call void @q(i16 undef, i16 undef, i16 undef, i16 undef, i16 undef)
  unreachable
}

Note the difference in the last argument to function q in bb.1 and bb.2.

When we reach the BranchFolder, the load and the setup of the last
argument (which should be passed on the stack) have turned into the following:

In bb.1:

$r0 = mv16Sym @e
$a0h = mv_r16_rmod1_ar16 killed $r0
push_any16 killed $a0h

and in bb.2:

$r0 = mv16Sym @e
dead $a0h = mv_r16_rmod1_ar16 killed $r0
push_any16 undef $a0h

The BranchFolder wants to merge the above and without the clearing of the "dead"
flag in this patch we get:

$r0 = mv16Sym @e
dead $a0h = mv_r16_rmod1_ar16 killed $r0
push_any16 $a0h

and then the verifier pukes.

My patch removes the "dead" above so we instead get

$r0 = mv16Sym @e
$a0h = mv_r16_rmod1_ar16 killed $r0
push_any16 $a0h

which the verifier is happy with and I see nothing wrong with.

So in this case, an "undef" in the llc input could lead to a verifier error.
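
Why the merged sequence fails while both inputs pass can be illustrated with a very small liveness walk over the accesses to a single register. This is only a sketch of the idea and not how the machine verifier is implemented. Access, def, use, livenessOk and checkReproducer are hypothetical names invented for the example.

```cpp
#include <cassert>
#include <vector>

// One access to a single physical register, in program order.
struct Access {
  bool IsDef;
  bool Dead;   // only meaningful on a def
  bool Kill;   // only meaningful on a use
  bool Undef;  // only meaningful on a use
};

Access def(bool Dead = false) { return {true, Dead, false, false}; }
Access use(bool Kill = false, bool Undef = false) {
  return {false, false, Kill, Undef};
}

// Returns true if every use that is not marked undef reads a live register.
bool livenessOk(const std::vector<Access> &Seq, bool LiveIn = false) {
  bool Live = LiveIn;
  for (const Access &A : Seq) {
    if (A.IsDef) {
      Live = !A.Dead;      // a dead def leaves nothing live behind it
    } else {
      if (!A.Undef && !Live)
        return false;      // real use of a register that is not live
      if (A.Kill)
        Live = false;      // last use, the value ends here
    }
  }
  return true;
}

// Replays the $a0h accesses from the reproducer above.
void checkReproducer() {
  // bb.1: $a0h = ...; push_any16 killed $a0h
  assert(livenessOk({def(), use(/*Kill=*/true)}));
  // bb.2: dead $a0h = ...; push_any16 undef $a0h
  assert(livenessOk({def(/*Dead=*/true), use(false, /*Undef=*/true)}));
  // Merged without the fix: dead $a0h = ...; push_any16 $a0h
  assert(!livenessOk({def(/*Dead=*/true), use()}));
  // Merged with the fix: $a0h = ...; push_any16 $a0h
  assert(livenessOk({def(), use()}));
}
```

The same walk also rejects the kill example from the summary: with $r0 live on entry, a killed use followed by a plain use fails the check.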

uabelho marked an inline comment as done. · Jan 27 2020, 2:10 AM

uabelho added inline comments.

llvm/lib/CodeGen/BranchFolding.cpp
861 (On Diff #234285): I don't know. I thought that keeping "killed" where possible is better than dropping it, so that the liveness information stays as exact as we can make it, but I think it can be dropped too if we prefer that.

...
So in this case, an "undef" in the llc input could lead to a verifier error.

Ah, now I think I get it... Thanks for explaining.

Revision Contents

Path · Size
lib/CodeGen/BranchFolding.cpp · 17 lines
test/CodeGen/Hexagon/branchfolder-clear-kill.mir · 47 lines

Diff 197926

lib/CodeGen/BranchFolding.cpp

   while (CommonTailLen--) {
     assert(MBBICommon != MBBIECommon &&
            "Reached BB end within common tail length!");
     assert(MBBICommon->isIdenticalTo(*MBBI) && "Expected matching MIIs!");

     // Merge MMOs from memory operations in the common block.
     if (MBBICommon->mayLoad() || MBBICommon->mayStore())
       MBBICommon->cloneMergedMemRefs(*MBB->getParent(), {&*MBBICommon, &*MBBI});
-    // Drop undef flags if they aren't present in all merged instructions.
+    // Drop undef/kill flags if they aren't present in all merged instructions.
     for (unsigned I = 0, E = MBBICommon->getNumOperands(); I != E; ++I) {
       MachineOperand &MO = MBBICommon->getOperand(I);
       if (MO.isReg() && MO.isUndef()) {
         const MachineOperand &OtherMO = MBBI->getOperand(I);
         if (!OtherMO.isUndef())
           MO.setIsUndef(false);
       }
+      if (MO.isReg() && MO.isKill()) {
+        const MachineOperand &OtherMO = MBBI->getOperand(I);
+        // An exception to the clearing of the kill flag is if we merge
+        // something like:
+        //   <instr1> undef $r0
+        //   $r0 = <instr2>
+        // and
+        //   <instr1> killed $r0
+        //   $r0 = <instr2>
+        // Here we should keep the kill flag even if it's only set in one of the
+        // merged paths since it's undef in the other. So the only real value
+        // in $r0 that actually reaches <instr1> will indeed be killed there.
+        if (!OtherMO.isKill() && !OtherMO.isUndef())
+          MO.setIsKill(false);
+      }
     }

     ++MBBI;
     ++MBBICommon;
   }
 }

 void BranchFolder::mergeCommonTails(unsigned commonTailIndex) {

test/CodeGen/Hexagon/branchfolder-clear-kill.mir

This file was added.

# RUN: llc -march=hexagon -run-pass branch-folder %s -o - -verify-machineinstrs | FileCheck %s

# When the branchfolder merges common tails, it needs to clear both undef and
# killed flags if they differ between the merged blocks.

# In the below example, if the killed flag is not cleared we will be left with
#   A2_nop 0, killed $r0
#   A2_nop 0, $r0
# and then the verifier will complain about use of an undefined physical
# register.

---
# CHECK-LABEL: name: func0
# CHECK-LABEL: bb.0:
# CHECK: liveins: $r0, $r31
# CHECK: A2_nop implicit $r0
# CHECK: A2_nop implicit $r0
# CHECK: PS_jmpret

name: func0
tracksRegLiveness: true

body: |
  bb.0:
    liveins: $r0, $r31
    successors: %bb.1, %bb.2
    J2_jumpt undef $p0, %bb.2, implicit-def $pc
    J2_jump %bb.1, implicit-def $pc

  bb.1:
    liveins: $r0, $r31
    successors: %bb.3
    A2_nop implicit $r0
    A2_nop implicit killed $r0
    J2_jump %bb.3, implicit-def $pc

  bb.2:
    liveins: $r0, $r31
    successors: %bb.3
    A2_nop implicit killed $r0
    A2_nop implicit undef $r0
    J2_jump %bb.3, implicit-def $pc

  bb.3:
    liveins: $r31
    PS_jmpret killed $r31, implicit-def $pc
...