This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Define mode register
ClosedPublic

Authored by arsenm on Jan 3 2020, 12:29 PM.

Details

Reviewers
rampitec
Summary

This should eventually model FP mode constraints as well as the other
special fields it tracks.

Diff Detail

Event Timeline

arsenm created this revision. Jan 3 2020, 12:29 PM
Herald added a project: Restricted Project. Jan 3 2020, 12:29 PM

I would define individual fields as separate registers. That way we will be able to reschedule instructions across some mode changes if these instructions are not affected by a particular change.
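The false-dependency argument above can be illustrated with a minimal sketch. It is a toy dependency model, not LLVM's scheduler or the AMDGPU backend; the `Instr` type, the `depends` helper, and the register names `MODE`, `MODE_FP_ROUND`, and `MODE_CLAMP` are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    """Toy instruction: a name plus the registers it reads and writes."""
    name: str
    reads: frozenset = frozenset()
    writes: frozenset = frozenset()


def depends(earlier: Instr, later: Instr) -> bool:
    """Return True if `later` cannot be moved above `earlier`.

    Checks read-after-write, write-after-read, and write-after-write
    hazards on any shared register.
    """
    return bool(
        earlier.writes & later.reads
        or earlier.reads & later.writes
        or earlier.writes & later.writes
    )


# Model A: the whole mode is one register (hypothetical name "MODE").
set_round_a = Instr("set_round", writes=frozenset({"MODE"}))
fp_add_a = Instr("fp_add", reads=frozenset({"MODE"}))
clamp_op_a = Instr("clamp_op", reads=frozenset({"MODE"}))

# Model B: each field is its own register (hypothetical names).
set_round_b = Instr("set_round", writes=frozenset({"MODE_FP_ROUND"}))
fp_add_b = Instr("fp_add", reads=frozenset({"MODE_FP_ROUND"}))
clamp_op_b = Instr("clamp_op", reads=frozenset({"MODE_CLAMP"}))

# With one register, an op that only cares about clamping is still
# ordered against a rounding-mode write (a false dependency).
single_register_blocks = depends(set_round_a, clamp_op_a)

# With per-field registers, only the true dependency remains.
split_registers_block = depends(set_round_b, clamp_op_b)
true_dependency_kept = depends(set_round_b, fp_add_b)
```

In this sketch the single-register model orders `clamp_op` after the rounding-mode write, while the split model does not. The cost is that every instruction needs an accurate list of the fields it reads.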

I think that's way overcomplicated, especially given how infrequent mode writes are. There's a cost to adding each register and register operand.

We used to have this optimization with HSAIL, and it was deemed quite profitable.

Unlike HSAIL, in LLVM the FP mode is going to be a middle-end, IR-level problem. The backend won't be responsible for inserting or maintaining mode switches. All of that should be done in the IR on the constrained FP intrinsics. Separately tracking the bits at the machine level is going to increase the cost and complexity too much (I think even the single register is too much, but it is manageable).

arsenm updated this revision to Diff 265711. May 22 2020, 5:33 AM

Rebase and fix asserts when actually used

Well, actually it was in the SC, so in the backend. The point of splitting it is to be able to reschedule without false dependencies and to minimize the required switches. At the very least we need to minimize switches.

Minimizing the switches is an IR problem. The backend isn’t going to be responsible for scheduling instructions around mode switches; we just need to express the possible dependency. You can handle those issues in an IR pass that understands setreg intrinsic calls and constrained intrinsics. The other issue with breaking this down is that the documentation for which instructions read which subfields is quite poor and varies per subtarget (at least for the denormal mode). We would also have to understand the mode switch perfectly, which won’t always be the case if a variable setting is used with s_setreg. In the immediate case, we would have to define a lot of setreg variants, each setting a different set of mode bits.

rampitec accepted this revision. May 22 2020, 11:17 AM

OK, let's try to handle it in IR.

This revision is now accepted and ready to land. May 22 2020, 11:17 AM