This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] With XNACK, cannot clause a load with result coalesced with operand
ClosedPublic

Authored by tpr on Jan 21 2019, 2:08 AM.

Details

Summary

With XNACK, an smem load whose result is coalesced with an operand (thus
it overwrites its own operand) cannot appear in a clause, because some
other instruction might XNACK and restart the whole clause.

The clause breaker already realized that an smem that overwrites an
operand cannot appear in a clause, and broke the clause. The problem
that this commit fixes is that the SIFormMemoryClauses optimization
formed a bundle with early clobber, which caused the earlier code that
set up the coalesced operand to be removed as dead.

Change-Id: I703c4d5b0bf7d6060222bec491f45c18bb3c0016

Diff Detail

Repository
rL LLVM

Event Timeline

tpr created this revision.Jan 21 2019, 2:08 AM
rampitec added inline comments.Jan 21 2019, 10:03 AM
test/CodeGen/AMDGPU/smem-no-clause-coalesced.mir
7 ↗(On Diff #182762)

You need to have some positive checks, at least a function label.

rampitec added inline comments.Jan 21 2019, 10:05 AM
lib/Target/AMDGPU/SIFormMemoryClauses.cpp
126 ↗(On Diff #182762)

MI.uses() maybe? Then you do not need to check for isDef() and have less to scan.

arsenm added inline comments.Jan 21 2019, 10:07 AM
lib/Target/AMDGPU/SIFormMemoryClauses.cpp
123 ↗(On Diff #182762)

This assumes the instruction has any operands

tpr marked 4 inline comments as done.Jan 22 2019, 12:39 AM
tpr added inline comments.
lib/Target/AMDGPU/SIFormMemoryClauses.cpp
126 ↗(On Diff #182762)

Good idea to use MI.uses(), but it looks like I still need to check isDef() according to the comment in MachineInstr.h:

/// Returns a range that includes all operands that are register uses.
/// This may include unrelated operands which are not register uses.

So I guess it might include implicit defs at the end or something.

tpr updated this revision to Diff 182854.Jan 22 2019, 12:40 AM
tpr marked an inline comment as done.

V2: Addressed review comments.

This revision is now accepted and ready to land.Jan 22 2019, 10:03 AM
arsenm added inline comments.Jan 22 2019, 10:09 AM
test/CodeGen/AMDGPU/smem-no-clause-coalesced.mir
1 ↗(On Diff #182854)

There's no point in the second check prefix if you don't add a no-xnack run line

27–29 ↗(On Diff #182854)

Can you run-pass=none to repack the register numbers

tpr marked 2 inline comments as done.Jan 23 2019, 4:02 AM
This revision was automatically updated to reflect the committed changes.