This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Waitcnt pass. Add S_WAITCNT 0 if incomplete predecessor info
Needs Review, Public

Authored by msearles on Nov 17 2017, 9:47 AM.
This revision needs review, but all reviewers have resigned.

Details

Reviewers
arsenm
Summary

When merging waitcnt info from predecessors, add an S_WAITCNT 0 at the top of a block if a predecessor has not yet been visited (so the predecessor info is incomplete) and we are sure that the block will not be revisited.
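
The rule in the summary can be illustrated with a small standalone sketch. This is not the code from SIInsertWaitcnts.cpp: the `Block` struct, its `Visited` and `WillRevisit` fields, and the function `needsConservativeWait` are hypothetical names made up for illustration, and the toy CFG is just a vector of blocks indexed by position.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical toy model of a CFG block; these are NOT the data
// structures used by the real waitcnt insertion pass.
struct Block {
  std::vector<std::size_t> Preds; // indices of predecessor blocks
  bool Visited = false;           // has the pass already processed this block?
  bool WillRevisit = false;       // will a later iteration process it again?
};

// Returns true when a conservative "S_WAITCNT 0" would be placed at the
// top of block Idx under the rule described in the summary.
bool needsConservativeWait(const std::vector<Block> &CFG, std::size_t Idx) {
  const Block &B = CFG[Idx];
  if (B.WillRevisit)
    return false; // a later visit can merge complete predecessor info
  for (std::size_t Pred : B.Preds)
    if (!CFG[Pred].Visited)
      return true; // this predecessor's pending counters are unknown
  return false;    // every predecessor was visited, so the merge is complete
}
```

In this sketch, a block with predecessors 0 and 1, where only block 0 has been visited and no revisit is scheduled, would get the conservative wait. Once block 1 is visited, or once the block is known to be revisited, the function returns false.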

Diff Detail

Event Timeline

msearles created this revision. Nov 17 2017, 9:47 AM
msearles abandoned this revision. Nov 17 2017, 9:49 AM
msearles updated this revision to Diff 123383. Nov 17 2017, 10:32 AM
arsenm added inline comments. Nov 17 2017, 11:00 AM
lib/Target/AMDGPU/SIInsertWaitcnts.cpp
1320

s/pred/Pred

test/CodeGen/AMDGPU/waitcnt-no-preds.ll
3

A smaller MIR test is probably possible

7

CHECK-LABEL is usually only used for function names

9

Regex at least for the first number

15

instnamer

91–99

Remove metadata

msearles updated this revision to Diff 123700. Nov 20 2017, 6:25 PM

Adjust per reviewer comments.

msearles marked 4 inline comments as done. Nov 20 2017, 6:28 PM
msearles added inline comments.
test/CodeGen/AMDGPU/waitcnt-no-preds.ll
3

I was not successful in my attempts to create a MIR test: with llvm.amdgcn.buffer.load.f32(), MIR generation fails; without llvm.amdgcn.buffer.load.f32(), the generated code is perturbed and the bug is no longer hit. I did, however, reduce the .ll test a fair bit.

msearles marked an inline comment as done. Nov 20 2017, 6:28 PM
arsenm added a comment. Jul 9 2018, 3:15 AM

Is this still necessary? I thought I saw a similar patch before.

arsenm resigned from this revision. Feb 21 2019, 6:53 PM