This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Fix access beyond the end of the basic block in execMayBeModifiedBeforeAnyUse.
ClosedPublic

Authored by vpykhtin on Oct 14 2020, 5:00 AM.

Details

Summary

I was wrong in thinking that MRI.use_instructions return unique instructions and mislead Jay in his previous patch D64393.

First loop counted more instructions than it was in reality and the second loop went beyond the basic block with that counter.

Jay, as you suggested, I used your previous code that relied on MRI.use_operands to constrain the number of instructions to check among.
I just inlined modifiesRegister to reduce the number of passes over instruction operands and added assert on BB end boundary.

I'm trying to make a test for this but this is hard.

Diff Detail

Event Timeline

vpykhtin created this revision.Oct 14 2020, 5:00 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 14 2020, 5:00 AM
vpykhtin requested review of this revision.Oct 14 2020, 5:00 AM
foad added inline comments.Oct 14 2020, 6:00 AM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
7096

Please leave the blank line!

7122

Is it possible to enter this loop with NumUse == 0? If so, I would expect the assert on the next line to fail.

7131–7132

Use a range-based for over I->operands() ?

vpykhtin updated this revision to Diff 298166.Oct 14 2020, 9:30 AM

Per review fixes.

vpykhtin marked 3 inline comments as done.Oct 14 2020, 9:32 AM
vpykhtin added inline comments.
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
7122

I added the check before the loop.

rampitec added inline comments.Oct 14 2020, 9:55 AM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
7132

It will miss an exec subreg clobbers. Do we even use regmasks except calls? Then if it is a call it cannot be folded anyway.

7143

The indention suggests this return belongs to the else while it belongs to if. Missing braces.

Actually do we have a test with a call in between def and use?

vpykhtin marked an inline comment as done.Oct 14 2020, 10:21 PM

Actually do we have a test with a call in between def and use?

Probably not, how a call looks like at this stage?

llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
7132

I took this code from modifiesRegister, need to check.

Actually do we have a test with a call in between def and use?

Probably not, how a call looks like at this stage?

It should be s_setpc_b64 at this point and it is a terminator and branch. Although I am not sure it can be detected as an EXEC change.

Actually do we have a test with a call in between def and use?

Probably not, how a call looks like at this stage?

It should be s_setpc_b64 at this point and it is a terminator and branch. Although I am not sure it can be detected as an EXEC change.

DPP combiner is constrained to work on a def and uses in one BB, I assume if a call is a terminator then there should be no uses after it.

vpykhtin updated this revision to Diff 299082.Oct 19 2020, 9:33 AM

Added test, more review issues fixed.

vpykhtin marked 2 inline comments as done.Oct 19 2020, 9:33 AM
rampitec added inline comments.Oct 19 2020, 11:25 AM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
7141

I do not completely agree with clang tidy here, but I suggest to write "else if" on a single line and adjust indention accordingly.

llvm/unittests/Target/AMDGPU/ExecMayBeModifiedBeforeAnyUse.cpp
38 ↗(On Diff #299082)

tidy is right here, "auto *" is preferable.

vpykhtin updated this revision to Diff 299407.Oct 20 2020, 10:30 AM

Fixed formattind and lint issues.

vpykhtin marked 2 inline comments as done.Oct 20 2020, 10:31 AM
rampitec added inline comments.Oct 20 2020, 11:16 AM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
7142

Indention is off here.

vpykhtin updated this revision to Diff 299596.Oct 21 2020, 1:41 AM

Fixed formatting issue

This revision is now accepted and ready to land.Oct 21 2020, 9:10 AM

Jay, Matt do you have objections with this patch?

arsenm added inline comments.Oct 22 2020, 9:35 AM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
7107–7108

Why change to the uses instead of instructions?

foad accepted this revision.Oct 22 2020, 9:52 AM
vpykhtin marked an inline comment as done.Oct 22 2020, 9:55 AM
vpykhtin added inline comments.
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
7107–7108

This is the core problem. instructions aren't guaranteed to be unique on iteration depending on the order of uses in use-def list for a register. See the unittest below that shows how this can happen.

vpykhtin marked an inline comment as done.Oct 22 2020, 10:28 AM
foad added inline comments.Oct 22 2020, 11:19 AM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
7107–7108

To put it another way: counting uses here allows you to exit early from the loop below when you have seen all the uses. Counting instructions here (with use_nodbg_instructions) does not work reliably because you might count some instructions twice depending on the order they appear in the use list.