This is an archive of the discontinued LLVM Phabricator instance.

[RegisterCoalescer] Fix for SubRange join unreachable
ClosedPublic

Authored by dstuttard on Jun 20 2017, 2:25 AM.

Details

Summary

During remat, some subranges might end up having invalid segments which caused problems for later
coalescing.

Added in a check to remove segments that are invalidated as part of the remat.

See http://llvm.org/PR33524

Diff Detail

Repository
rL LLVM

Event Timeline

dstuttard created this revision.Jun 20 2017, 2:25 AM

Added llvm-commits

Adding MatzeB as reviewer - you've made some recent changes in the same area.

I've added this change specifically to address a problem seen in the associated PR (http://llvm.org/PR33524), however I'm not sure that this is necessarily the right way to go about fixing this issue.

I think that the remat results in an incorrect segment in the SubRange - which looks something like [128r, 128d) and is also an undef - and removing it does clear up the problem. (Prior to the remat the segment in the SubRange is correct, albeit for an undef).
However, I wondered if the part of the code that went wrong later (results in an unreachable) should in fact be able to deal with this itself.

Test case?

lib/CodeGen/RegisterCoalescer.cpp
1235 ↗(On Diff #103178)

Debug code?

Test case?

There's an example that provokes the problem in the bugzilla http://llvm.org/PR33524.

I guess at this stage I'm looking for some indication whether this is a reasonable fix for the problem - before submission I'll tidy up the reproducer in the bugzilla and add as a test case.

lib/CodeGen/RegisterCoalescer.cpp
1235 ↗(On Diff #103178)

Yes, I'll remove.

dstuttard updated this revision to Diff 103200.Jun 20 2017, 6:35 AM

Removing debug comment
Also adding the test case from the bugzilla as a new test case (I realised it is
a bit annoying to have to go elsewhere to get hold of the reproducer)
I'll update it as a test if this change looks promising

dstuttard marked 2 inline comments as done.Jun 20 2017, 6:36 AM

Removed the commented out debug statement

qcolombet requested changes to this revision.Jun 20 2017, 10:02 AM
qcolombet added inline comments.
test/CodeGen/AMDGPU/pr33524.ll
1 ↗(On Diff #103200)

You need to add a RUN line and some FileCheck command to check we are generating correct code.
FWIW, you'll have something more robust with a .mir test (llc -stop-before simple-register-coalescing -simplify-mir)
Add a comment on what this test is checking. In particular listing the pr number here is a good practice.
Give a meaning full name to the filename, e.g., reg-coal-join-subrange.

This revision now requires changes to proceed.Jun 20 2017, 10:02 AM
dstuttard updated this revision to Diff 103354.Jun 21 2017, 4:46 AM
dstuttard edited edge metadata.

Updating the test as per review comments

I've left it as a .ll test rather than .mir as the mir print routines can't cope
with pseudo source values (used in the llvm.amdgcn.buffer.load intrinsics)

"TargetCustom pseudo source values are not supported"

FYI I tried this patch in my out-of-tree backend (hoping to resolve http://llvm.org/PR32773). I observed a segfault SR.removeValNo(RmValNo) because RmValNo may be null.

I don't know yet whether this is specific to my backend, but I thought I'd mention it in case it indicates a more general problem.

FYI I tried this patch in my out-of-tree backend (hoping to resolve http://llvm.org/PR32773). I observed a segfault SR.removeValNo(RmValNo) because RmValNo may be null.

I don't know yet whether this is specific to my backend, but I thought I'd mention it in case it indicates a more general problem.

Doh = forgot the check for null. I'll upload a change for this.
The fact that you got a segfault here is promising - it could mean it might be a similar problem. Try the new patch.
If it fails then take a look at the SubRanges around the failing SubRange join - it could be a similar problem to the one this fix addresses (but isn't caught by it) - in particular any remats that happen before the failing SubRange join.
There are a couple of failures logged in bugzilla that look similar but aren't fixed by this change, so there are definitely some other issues in this area.

Added in mising check on RmValNo for null

arsenm added a subscriber: arsenm.Jun 21 2017, 1:55 PM

Updating the test as per review comments

I've left it as a .ll test rather than .mir as the mir print routines can't cope
with pseudo source values (used in the llvm.amdgcn.buffer.load intrinsics)

"TargetCustom pseudo source values are not supported"

You can strip out the MemOperands in the MIR test

test/CodeGen/AMDGPU/pr33524.ll
1 ↗(On Diff #103200)

Remove these extra comments

2–4 ↗(On Diff #103200)

These will be redundant with the run line

67–70 ↗(On Diff #103200)

You can remove all the metadata

MatzeB edited edge metadata.Jun 21 2017, 5:17 PM

Thanks for working on this! A bunch of nitpicks are below but overal the fix looks fine.

lib/CodeGen/RegisterCoalescer.cpp
1228 ↗(On Diff #103439)

I would not describe vreg2:sub0 as undef here, the COPY is a normal definition like any other.
It just happens that after coalescing we don't have a definition left because the copy was reading a partially undef value (but that effect will be visible after the ==> arrow).

1231 ↗(On Diff #103439)

This would be the place to mention that vreg2:sub0 is undef now and the subrange needs to be removed.

1232 ↗(On Diff #103439)

Why do you need DstIdx == 0, it seems to me that we need the fixup regardless of DstIdx.

1238 ↗(On Diff #103439)

"undef tagged as def" is a strange description. How about: "Removing undefined subrange ..." as debug message?

1241–1243 ↗(On Diff #103439)

How about if (VNInfo *RmValNo = getVNInfoAt(CurrIdx.getRegSlot())) (It shouldn't matter here because NewMI should not write to that part of the register. But writing getRegSlot() feels more natural to check for liveranges going out of an instruction).

1246 ↗(On Diff #103439)

You should call DstInt.removeEmptySubRanges() after cleaning subranges.

dstuttard updated this revision to Diff 103576.Jun 22 2017, 7:40 AM

Updating in line with comments from reviewers

I haven't updated the test to .mir as I don't fully understand what
@arsenm means - perhaps if we discuss offline I can do a subsequent patch to
update the test, but the one there will suffice for now?

dstuttard marked 6 inline comments as done.Jun 22 2017, 8:22 AM

Updating in line with comments from reviewers

I haven't updated the test to .mir as I don't fully understand what
@arsenm means - perhaps if we discuss offline I can do a subsequent patch to
update the test, but the one there will suffice for now?

The TargetCustom error is from the pseudo value source used for the buffer intrinsics memory operands. If you remove those you should avoid the error. Also it may still reproduce if you replace the intrinsics in the IR with volatile loads

MatzeB accepted this revision.Jun 26 2017, 6:25 PM

The code fix LGTM, but please wait for @qcolombet/@arsenm before committing.

Making a .mir test seems indeed hard at the moment, as the printer already fails; manually stripping memory operands only works after printing I presume?

The code fix LGTM, but please wait for @qcolombet/@arsenm before committing.

Making a .mir test seems indeed hard at the moment, as the printer already fails; manually stripping memory operands only works after printing I presume?

Yes, stripping the operands only works after printing :(

@arsenm I'll try using volatile loads and stores to see if that works in this case. Would it be acceptable in the meantime to accept this change and then I'll update the .ll test with a .mir one when it works?

arsenm accepted this revision.Jun 30 2017, 8:48 AM

LGTM

dstuttard updated this revision to Diff 105062.Jul 3 2017, 7:07 AM

Updating the test to a .mir test
Replacing the buffer.load intrinsics with load volatile worked

@qcolombet - any further comments or are you happy for this to go in?

qcolombet accepted this revision.Jul 5 2017, 10:08 AM

LGTM.

Thanks

This revision is now accepted and ready to land.Jul 5 2017, 10:08 AM
This revision was automatically updated to reflect the committed changes.