This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Mark s_barrier as having side effects but not accessing memory.
ClosedPublic

Authored by foad on Sep 5 2019, 3:21 AM.

Details

Summary

This fixes poor scheduling in a function containing a barrier and a few
load instructions.

Without this fix, ScheduleDAGInstrs::buildSchedGraph adds an artificial
edge in the dependency graph from the barrier instruction to the exit
node representing live-out latency, with a latency of about 500 cycles.
As a result, the scheduler computes a critical path latency of about
500 cycles for the whole graph. None of the load instructions then
appear to be on the critical path, so it schedules them with no regard
for their (80 cycle) latency, which gives poor results.
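
The effect described above can be illustrated with a toy model. This is only a sketch, not LLVM's scheduler: the graph shape and the node names (entry, load, use, barrier, exit) are assumptions made for illustration, and only the 500 and 80 cycle figures come from the summary. It computes the longest-latency path to the exit node with and without the artificial barrier edge, and the resulting slack of the load.

```python
def longest_path(edges, start, end):
    """Longest-latency path from start to end in a DAG given as {(src, dst): latency}."""
    succs = {}
    for (src, dst), latency in edges.items():
        succs.setdefault(src, []).append((dst, latency))

    def longest(node):
        if node == end:
            return 0
        # -inf marks a node that cannot reach the end node.
        return max(
            (latency + longest(dst) for dst, latency in succs.get(node, [])),
            default=float("-inf"),
        )

    return longest(start)


# Hypothetical dependency graph: one load feeding one use, plus a barrier.
without_edge = {
    ("entry", "load"): 0,
    ("load", "use"): 80,      # load latency quoted in the summary
    ("use", "exit"): 0,
    ("entry", "barrier"): 0,
    ("barrier", "exit"): 0,   # no artificial live-out latency
}

# Same graph, but with the artificial barrier-to-exit edge of about 500 cycles.
with_edge = dict(without_edge)
with_edge[("barrier", "exit")] = 500

critical_with = longest_path(with_edge, "entry", "exit")
critical_without = longest_path(without_edge, "entry", "exit")

# Slack of the load: how far it is from being on the critical path.
load_path = longest_path(without_edge, "load", "exit")
slack_with = critical_with - load_path
slack_without = critical_without - load_path

print(critical_with, slack_with)
print(critical_without, slack_without)
```

In this model the load has zero slack once the artificial edge is gone, so a latency-aware scheduler would treat it as critical; with the edge present, the load appears to have hundreds of cycles of slack.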

Event Timeline

foad created this revision. Sep 5 2019, 3:21 AM
Herald added a project: Restricted Project. Sep 5 2019, 3:22 AM
arsenm accepted this revision. Sep 5 2019, 9:00 AM

LGTM

llvm/include/llvm/IR/IntrinsicsAMDGPU.td, line 209

This ends up not adding readnone to the intrinsic declaration, correct?

llvm/test/CodeGen/AMDGPU/schedule-barrier.mir, line 44

You can remove the into and IR fragment

This revision is now accepted and ready to land. Sep 5 2019, 9:00 AM
foad marked 2 inline comments as done. Sep 6 2019, 3:04 AM
foad added inline comments.
llvm/include/llvm/IR/IntrinsicsAMDGPU.td, line 209

The declaration comes out as:

; Function Attrs: convergent nounwind
declare void @llvm.amdgcn.s.barrier() #0
This revision was automatically updated to reflect the committed changes.
foad marked an inline comment as done.