This is an archive of the discontinued LLVM Phabricator instance.

[HazardRec] Allow inserting multiple wait-states simultaneously
ClosedPublic

Authored by kerbowa on Oct 19 2020, 5:11 PM.

Details

Summary

If a target can encode multiple wait-states into a noop allow emitting such
instructions directly.

Diff Detail

Event Timeline

kerbowa created this revision.Oct 19 2020, 5:11 PM
kerbowa requested review of this revision.Oct 19 2020, 5:11 PM
foad added a comment.Oct 20 2020, 2:01 AM

Looks good modulo comments inline.

llvm/include/llvm/CodeGen/ScheduleHazardRecognizer.h
120

I think unsigned would make more sense, and WaitStates seems like AMDGPU-specific terminology.

llvm/include/llvm/CodeGen/TargetInstrInfo.h
1349

Likewise.

foad added a comment.Oct 20 2020, 2:10 AM

There's a comment in AMDGPUTargetMachine.cpp that you could remove:

// FIXME: This stand-alone pass will emit indiv. S_NOP 0, as needed. It would
// be better for it to emit S_NOP <N> when possible.

Do we properly count number of pre-existing wait states if we have s_nop > 0?

kerbowa updated this revision to Diff 299433.Oct 20 2020, 11:42 AM

Address comments.

Do we properly count number of pre-existing wait states if we have s_nop > 0?

In "HazardRecognizerMode", yes. It uses SIInstrInfo::getNumWaitStates which accounts for noops with immediates > 0.

The post-RA scheduler should not insert these combined noops so we did not handle it there.

This revision is now accepted and ready to land.Oct 20 2020, 11:58 AM
dmgreen accepted this revision.Oct 20 2020, 2:51 PM
dmgreen added a subscriber: dmgreen.

Looks nice and clean.

This revision was landed with ongoing or failed builds.Oct 20 2020, 5:06 PM
This revision was automatically updated to reflect the committed changes.