This is an archive of the discontinued LLVM Phabricator instance.

[RegisterCoalescer] Avoid "Use not jointly dominated by defs" in removePartialRedundancy
Needs ReviewPublic

Authored by tpr on Sep 10 2018, 12:42 AM.

Details

Summary

removePartialRedundancy was using extendToIndices on each subrange
without passing in an Undefs vector containing main reg defs that are
undef in the subrange, causing the above assert.

Unfortunately I can only reproduce this in an ll test. Turning it into a
mir test makes the problem go away.

Change-Id: I4f3ba2afb20d79f9467afb3d882f0328d38531be

Diff Detail

Event Timeline

tpr created this revision.Sep 10 2018, 12:42 AM

Unfortunately I can only reproduce this in an ll test. Turning it into a mir test makes the problem go away.

Why is that?

uabelho added inline comments.
test/CodeGen/AMDGPU/regcoal-removepartial-redundancy-not-jointly-dominated.ll
3

I tried running the test case on trunk now and it doesn't trigger the verifier even without the fix in this patch so something must have changed so the "wanted" situation doesn't happen with this ll-input anymore.

tpr added a comment.Sep 14 2018, 5:03 AM

Hmm, me too for trunk. It only fails on our fork that is 340950 plus local changes, so the code generated by instruction selection is probably different.

I will try again to turn it into a mir test, and/or determine why that doesn't work.

tpr updated this revision to Diff 165599.Sep 14 2018, 3:34 PM

V2: mir test

tpr added a comment.Sep 14 2018, 3:40 PM
  1. That test only fails without this fix if you have the fix in D51257.
  1. I finally tracked down why I couldn't get a mir test before: Compiling a ll file as usual, arriving at RegisterCoalescer, the numbering of the MBBs was not quite in order. Using -stop-before names the MBBs in the mir according to those out-of-order numbers, but reading the mir in again with -run-pass fails to preserve it. Instead it renumbers the MBBs in order. RegisterCoalescing processes MBBs with the same loop depth and critical edgeness and connectedness in numbering order, so the pass behaved differently.

I finally tracked down why I couldn't get a mir test before: Compiling a ll file as usual, arriving at RegisterCoalescer, the numbering of the MBBs was not quite in order. Using -stop-before names the MBBs in the mir according to those out-of-order numbers, but reading the mir in again with -run-pass fails to preserve it. Instead it renumbers the MBBs in order. RegisterCoalescing processes MBBs with the same loop depth and critical edgeness and connectedness in numbering order, so the pass behaved differently.

Could you file a public report for this issue?

test/CodeGen/AMDGPU/regcoal-removepartial-redundancy-not-jointly-dominated.mir
8 ↗(On Diff #165599)

Please check that the code after coalescing is correct.
Eg., the pass could silently crash and produce an empty function, the test would still pass!

foad added a subscriber: foad.Sep 12 2019, 6:02 AM

If I apply this now, the new test fails with:

*** Bad machine code: Too few operands ***
- function:    _amdgpu_ps_main
- basic block: %bb.6  (0x53b7738)
- instruction: %8:sreg_32_xm0_xexec = S_BUFFER_LOAD_DWORD_IMM undef %9:sreg_128, 2708, 0 :: (dereferenceable invariant load 4)
5 operands expected, but 4 given.

Does this mean that the format of buffer load instructions has changed since you wrote this mir test?

Herald added a project: Restricted Project. · View Herald TranscriptSep 12 2019, 6:02 AM
foad added a comment.Sep 12 2019, 7:04 AM

Answering my own question: yes S_BUFFER_LOAD_DWORD_IMM needs an extra 0 operand because of a6322941ffc15275e837435eee7aabba0ca15211 and S_ENDPGM also needs a 0 operand as in 7df225dfc25adc8371188dc57f3adf300b0bd697.