This is an archive of the discontinued LLVM Phabricator instance.

[RegisterCoalescer] Don't set read-undef in pruneValues, only clear
ClosedPublic

Authored by uabelho on Oct 6 2017, 4:33 AM.

Details

Summary

The comments in the code said

// Remove <def,read-undef> flags. This def is now a partial redef.

but the code didn't just remove read-undef, it could introduce new ones which
could case errors.

E.g. if we have something like

%vreg1<def> = IMPLICIT_DEF
%vreg2:subreg1<def, read-undef> = op %vreg3, %vreg4
%vreg2:subreg2<def> = op %vreg6, %vreg7

and we merge %vreg1 and %vreg2 then we should not set undef on the second subreg
def, which the old code did.

Now we solve this by actually do what the code comment says. We remove
read-undef flags rather than remove or introduce them.

Diff Detail

Event Timeline

uabelho created this revision.Oct 6 2017, 4:33 AM

We ran into this problem on our out-of-tree target, but I've tried to make a reasonable test case
for X86.

Does this change make sense to you? We've run with it locally for quite some time now and it
solves our problem and hasn't introduced new ones that we are aware of. I don't know this code
though so maybe there is a better fix?

MatzeB edited edge metadata.Oct 10 2017, 12:02 PM

Thanks for the patch and the testcase, this looks good.

I slightly simplified the testcase to this while trying out the patch:

# RUN: llc -mtriple=x86_64-- %s -o - -run-pass=simple-register-coalescing | FileCheck %s
---
name: f
body: |
  bb.0:
    JB_1 %bb.2, undef implicit killed %eflags
    JMP_1 %bb.1

  bb.1:
    %0 : gr64 = IMPLICIT_DEF
    NOOP implicit-def undef %1.sub_32bit : gr64
    NOOP implicit-def %1.sub_16bit : gr64
    JMP_1 %bb.3

  bb.2:
    NOOP implicit-def %0
    %1 = COPY %0

  bb.3:
    NOOP implicit killed %0
    NOOP implicit killed %1
...

# We should have a setting of both sub_32bit and sub_16bit. The first one
# should be undef and not dead, and the second should not be undef.

# CHECK-NOT:  dead
# CHECK:      NOOP implicit-def undef %1.sub_32bit
# CHECK-NOT:  undef
# CHECK-NEXT: NOOP implicit-def %1.sub_16bit
lib/CodeGen/RegisterCoalescer.cpp
2687–2690

I wonder if we shouldn't change it to this instead which:

if (MO.getSubReg() != 0 && MO.isUndef() && !EraseImpDef)
  MO.setIsUndef(false);

it's simpler, matches the comment in front of the for and it seems to pass the x86 tests at least...

Thanks for the patch and the testcase, this looks good.

I slightly simplified the testcase to this while trying out the patch:

# RUN: llc -mtriple=x86_64-- %s -o - -run-pass=simple-register-coalescing | FileCheck %s
---
name: f
body: |
  bb.0:
    JB_1 %bb.2, undef implicit killed %eflags
    JMP_1 %bb.1

  bb.1:
    %0 : gr64 = IMPLICIT_DEF
    NOOP implicit-def undef %1.sub_32bit : gr64
    NOOP implicit-def %1.sub_16bit : gr64
    JMP_1 %bb.3

  bb.2:
    NOOP implicit-def %0
    %1 = COPY %0

  bb.3:
    NOOP implicit killed %0
    NOOP implicit killed %1
...

# We should have a setting of both sub_32bit and sub_16bit. The first one
# should be undef and not dead, and the second should not be undef.

# CHECK-NOT:  dead
# CHECK:      NOOP implicit-def undef %1.sub_32bit
# CHECK-NOT:  undef
# CHECK-NEXT: NOOP implicit-def %1.sub_16bit

Nice, I like that!

Thanks for the comments, I'll update the patch.

lib/CodeGen/RegisterCoalescer.cpp
2687–2690

But this means we will never set isUndef, we will only clear it? So the stuff I talk about in the commit message that we sometimes need to introduce undef, if we've removed an IMPLICIT_DEF, isn't really true? In those cases maybe there would already be undef on the MO?

I suppose the simplified code works if
(MO.getSubReg() != 0) && !MO.isUndef() && EraseImpDef && !LR.getVNInfoBefore(Def)
can never happen?

I ran a bunch of tests with

assert(!((MO.getSubReg() != 0) && !MO.isUndef() && EraseImpDef && !LR.getVNInfoBefore(Def)) &&
       "Unexpected need of read-undef on subreg write!");
if (MO.getSubReg() != 0 && MO.isUndef() && !EraseImpDef)
  MO.setIsUndef(false);

and I didn't see the assert blowing and everything went well so maybe it's indeed what we want to do.

I'll update the patch to your suggestion!

uabelho updated this revision to Diff 118547.Oct 10 2017, 11:48 PM
uabelho retitled this revision from [RegisterCoalescer] Don't set read-undef if there is a previous def to [RegisterCoalescer] Don't set read-undef in pruneValues, only clear.
uabelho edited the summary of this revision. (Show Details)

Updated according to Matthias's suggestions.

Changed the code change, and simplified test case. Updated commit message.

MatzeB accepted this revision.Oct 11 2017, 1:44 PM

LGTM

lib/CodeGen/RegisterCoalescer.cpp
2687–2690

The only way we can have a partial write and nothing live before is when the write is marked undef. So we must already have all the undef flags we need.

Or coming from another angle coalescing should only increase the range of live intervals so we should only need to erase some undef flags but never add new ones.

This revision is now accepted and ready to land.Oct 11 2017, 1:44 PM
This revision was automatically updated to reflect the committed changes.
uabelho added inline comments.Oct 11 2017, 11:24 PM
lib/CodeGen/RegisterCoalescer.cpp
2687–2690

Great! Thank you!