This is an archive of the discontinued LLVM Phabricator instance.

[BranchFolder] Drop kill/dead flags if they aren't present in all merged instructions
Needs ReviewPublic

Authored by uabelho on Aug 21 2018, 1:37 AM.

Details

Summary

Just like we already do with undef flags, we need to drop kill flags if
they aren't present in all merged instructions. Otherwise we would merge

<instr1> $r0
<instr2> killed $r0

and

<instr1> killed $r0
<instr2> undef $r0

into

<instr1> killed $r0
<instr2> $r0

and then the verifier would complain about $r0 not being live at <instr2>.

Similarly, we need to drop dead flags if they aren't present in all
merged instructions. Otherwise we would merge

dead $r0 = <instr1>
<instr2> undef $r0

and

$r0 = <instr1>
<instr2> $r0

into

dead $r0 = <instr1>
<instr2> $r0

and then the verifier would complain about $r0 not being live at <instr2>.

Diff Detail

Event Timeline

uabelho created this revision.Aug 21 2018, 1:37 AM

Found the problem when testing my out of tree target, but it's exposed also with the attached testcase.

Does the fix make sense or should we do something else?

uabelho updated this revision to Diff 197926.May 3 2019, 1:07 AM

Rebased, fixed a spelling mistake.

Anyone has an opinion about this?

uabelho updated this revision to Diff 234285.Dec 17 2019, 5:59 AM
uabelho retitled this revision from [BranchFolder] Drop kill flags if they aren't present in all merged instructions to [BranchFolder] Drop kill/dead flags if they aren't present in all merged instructions.
uabelho edited the summary of this revision. (Show Details)

Rebased, updated to drop dead flags as well.

Herald added a project: Restricted Project. · View Herald TranscriptDec 17 2019, 5:59 AM
Herald added a subscriber: hiraditya. · View Herald Transcript

I am a little curious as to how these differences in flags arise. I can understand that a kill flag can be seen as an optimization and is optional. The same would go for a dead flag. But I am not sure about an undef flag... If it is present in one case but not in the other, the two instructions do not seem equivalent, or?

So I suppose your patch makes sense to me, although I am not familiar enough to understand the undef case... Perhaps some comment on how these three cases can emerge and may be treated would be good? (at least to me :-)

bjope added a subscriber: bjope.Jan 24 2020, 2:43 PM
bjope added inline comments.
llvm/lib/CodeGen/BranchFolding.cpp
861 ↗(On Diff #234285)

Afaict this isn't consistent with the mirrored situation (if MO says "undef" and OtherMO says "killed") the result will be that "undef" is dropped, but not an addition of "killed".
I think the result should be the same regardless of which bb that use as a base for the merge. Shouldn't it?

Btw, is there a problem with simply clearing the killed flag unless it is set in both operands?
(I'm not sure if an undef implies killed today. Or if it that might change in the future when the freeze stuff has been implemented. In your example above instr2 is a def, but I assume it could be a use as well, that could be undef, or maybe even a use of the same undef value as used in instr1.)

I am a little curious as to how these differences in flags arise. I can understand that a kill flag can be seen as an optimization and is optional. The same would go for a dead flag. But I am not sure about an undef flag... If it is present in one case but not in the other, the two instructions do not seem equivalent, or?

So I suppose your patch makes sense to me, although I am not familiar enough to understand the undef case... Perhaps some comment on how these three cases can emerge and may be treated would be good? (at least to me :-)

One full llc reproducer for my out-of-tree target showing that we need to clear the dead flag would look like this:

@e = external global i16, align 1

declare void @q(i16, i16, i16, i16, i16)

define void @x() {
bb.0:
  %tmp = alloca i16, align 1
  %tmp3 = alloca i16, align 1
  br i1 undef, label %bb.2, label %bb.1

bb.1:                                             ; preds = %bb.0
  %0 = load volatile i16, i16* @e, align 1
  store i16 0, i16* %tmp3, align 1
  tail call void @q(i16 undef, i16 undef, i16 undef, i16 undef, i16 %0)
  unreachable

bb.2:                                             ; preds = %bb.0
  %1 = load volatile i16, i16* @e, align 1
  store i16 0, i16* %tmp, align 1
  tail call void @q(i16 undef, i16 undef, i16 undef, i16 undef, i16 undef)
  unreachable
}

Note the difference in the last argument to function q in bb.1 and bb2.

When we reach the BranchFolder the load and the setting up of the last
argument (which should be passed on the stack) has turned into the following:

In bb.1:

$r0 = mv16Sym @e
$a0h = mv_r16_rmod1_ar16 killed $r0
push_any16 killed $a0h

and in bb.2:

$r0 = mv16Sym @e
dead $a0h = mv_r16_rmod1_ar16 killed $r0
push_any16 undef $a0h

The BranchFolder wants to merge the above and without the clearing of the "dead"
flag in this patch we get:

$r0 = mv16Sym @e
dead $a0h = mv_r16_rmod1_ar16 killed $r0
push_any16 $a0h

and then the verifier pukes.

My patch removes the "dead" above so we instead get

$r0 = mv16Sym @e
$a0h = mv_r16_rmod1_ar16 killed $r0
push_any16 $a0h

which the verifer is happy with and I see nothing wrong with.

So in this case, an "undef" in the llc input could lead to a verifier error.

uabelho marked an inline comment as done.Jan 27 2020, 2:10 AM
uabelho added inline comments.
llvm/lib/CodeGen/BranchFolding.cpp
861 ↗(On Diff #234285)

I don't know. I thought that keeping "killed" if possible is better than dropping it to keep as exact liveness information as we can, but I think it can be dropped to if we prefer that.

...
So in this case, an "undef" in the llc input could lead to a verifier error.

Ah, now I think I get it... Thanks for explaining.