This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU/SI: Optimize adjacent s_nop instructions
ClosedPublic

Authored by tstellarAMD on Mar 30 2016, 8:21 AM.

Details

Reviewers
nhaehnle
arsenm
Summary

Use the s_nop operand to encode how long to wait, folding adjacent
s_nop instructions into one. This is somewhat distasteful, since it
would be better to just emit s_nop with the right argument in the
first place. That would require changing TII::insertNoop to emit N
operands, which would be easy. Slightly more problematic is that the
post-RA scheduler and hazard recognizer represent nops as a single
null node, so it would also require inventing another way of
representing N nops.

Patch by: Matt Arsenault
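
As a rough illustration of the arithmetic behind the merge, here is a minimal standalone sketch. It assumes the SI encoding, in which an s_nop immediate N in the range 0..7 stands for N + 1 wait states, so a single s_nop covers at most 8. The helper name mergeNopImm and the constant MaxWaitStates are hypothetical and are not taken from the patch.

```cpp
#include <cassert>
#include <optional>

// Assumption: the s_nop immediate N encodes N + 1 wait states and is limited
// to 0..7 (SI encoding), so one s_nop can cover 1..8 wait states.
constexpr unsigned MaxWaitStates = 8;

// Hypothetical helper: returns the immediate for a single s_nop that replaces
// two adjacent ones, or std::nullopt if the combined wait does not fit.
std::optional<unsigned> mergeNopImm(unsigned Imm0, unsigned Imm1) {
  assert(Imm0 < MaxWaitStates && Imm1 < MaxWaitStates &&
         "s_nop immediate out of range");
  // Work in wait states, not in encoded immediates.
  unsigned Total = (Imm0 + 1) + (Imm1 + 1);
  if (Total > MaxWaitStates)
    return std::nullopt; // would overflow the operand; leave both nops alone
  return Total - 1;      // convert back to the encoded immediate
}
```

Under these assumptions, two "s_nop 0" instructions fold into one "s_nop 1", while "s_nop 7" followed by "s_nop 0" is left untouched because nine wait states cannot be encoded in one instruction.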

Event Timeline

tstellarAMD retitled this revision to AMDGPU/SI: Optimize adjacent s_nop instructions.
tstellarAMD updated this object.
tstellarAMD added a reviewer: arsenm.
tstellarAMD added a subscriber: llvm-commits.

Looks mostly good, just one comment.

lib/Target/AMDGPU/SIShrinkInstructions.cpp
260–261

I think you need to guard against the case where Next == MBB.end(). In that case, NextMI.getOpcode() (or, I guess, technically already the assignment to NextMI) seems to invoke undefined behavior.
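
The hazard described in this comment can be sketched outside of LLVM with a simplified model. The block below is only an illustration: Instr, Opcode, and mergeAdjacentNops are hypothetical stand-ins for MachineInstr and the loop in SIShrinkInstructions.cpp, a std::list stands in for the basic block, and the limit of 8 wait states assumes the SI s_nop encoding (immediate N means N + 1 wait states).

```cpp
#include <cassert>
#include <iterator>
#include <list>

// Hypothetical, simplified stand-ins for LLVM's machine instruction types.
enum class Opcode { SNop, Other };
struct Instr {
  Opcode Op;
  unsigned Imm; // for SNop: encoded immediate, N means N + 1 wait states
};

// Folds each s_nop into an immediately following s_nop when the combined
// wait fits in one instruction (assumed limit: 8 wait states).
void mergeAdjacentNops(std::list<Instr> &Block) {
  for (auto I = Block.begin(); I != Block.end();) {
    auto Next = std::next(I);
    if (I->Op != Opcode::SNop) {
      I = Next;
      continue;
    }
    assert(I->Imm < 8 && "s_nop immediate out of range");
    // The guard the review asks for: when the s_nop is the last instruction,
    // Next is end() and must not be dereferenced.
    if (Next == Block.end())
      break;
    if (Next->Op == Opcode::SNop) {
      unsigned Total = (I->Imm + 1) + (Next->Imm + 1);
      if (Total <= 8) {
        Next->Imm = Total - 1; // keep the second nop, widened
        Block.erase(I);        // std::list::erase leaves Next valid
      }
    }
    I = Next;
  }
}
```

Without the end() check, a block that ends in an s_nop would read through the past-the-end iterator, which is the undefined behavior the comment points out.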

Fix possible undefined behavior.

tstellarAMD marked an inline comment as done. Apr 22 2016, 11:07 AM
nhaehnle accepted this revision. Apr 22 2016, 3:54 PM
nhaehnle added a reviewer: nhaehnle.

LGTM

This revision is now accepted and ready to land. Apr 22 2016, 3:54 PM
arsenm closed this revision. Apr 25 2016, 12:59 PM
arsenm edited edge metadata.

r267456