This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Remove cndmask from readsExecAsData
ClosedPublic

Authored by rampitec on Jan 21 2022, 9:59 AM.

Diff Detail

Event Timeline

rampitec created this revision.Jan 21 2022, 9:59 AM
rampitec requested review of this revision.Jan 21 2022, 9:59 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 21 2022, 9:59 AM
Herald added a subscriber: wdng. · View Herald Transcript
ruiling added inline comments.
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
133

The function name is confusing, all these instructions access EXEC just as normal vector instructions. One cause of the failure @sebastian-ne showed in D116053 is we are dropping convergent attribute when lowering amdgcn.icmp into V_CMP. I am not sure if there is any other reason. I think may be a right fix for the issue is define convergent version of V_CMP, I am not sure whether it sounds reasonable? Please also add some comments this is just workaround for the issue that we are lowering convergent icmp into normal V_CMP. We should fix it some other way.

140

Did you see a failure caused by v_readfirstlane? if not, please also remove it.

foad accepted this revision.Jan 24 2022, 6:40 AM

LGTM. I agree with all of @ruiling's comments but I don't think they should block this patch, which is clearly a step in the right direction.

This revision is now accepted and ready to land.Jan 24 2022, 6:40 AM
foad added inline comments.Jan 24 2022, 6:53 AM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
133

I think may be a right fix for the issue is define convergent version of V_CMP, I am not sure whether it sounds reasonable?

Yes, sounds good to me. There is also a related problem in that the "convergent" attribute in IR prevents an instruction from being sunk into an "if" but it does not prevent it from being hoisted out of an "if". For ballot we need both. I'm not sure if the MachineIR convergent flag does both.

ruiling added inline comments.Jan 24 2022, 7:14 AM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
133

Yes, sounds good to me. There is also a related problem in that the "convergent" attribute in IR prevents an instruction from being sunk into an "if" but it does not prevent it from being hoisted out of an "if". For ballot we need both. I'm not sure if the MachineIR convergent flag does both.

MachineIR have prohibited both sinking and hoisting.

rampitec added inline comments.Jan 24 2022, 10:57 AM
llvm/lib/Target/AMDGPU/SIInstrInfo.cpp
133

The function name and the whole idea reflects that unlike other VALU where exec just predicates the execution these instructions use exec much like a normal operand which is reflected in the actual numerical answer. I.e., for cmp that is:

dst = cmp(src0, src1) & exec

For readfirstlane that is:

dst = vgpr[cttz(exec)]

Marking cmp as convergent might be a better fix because it will still allow to move compares within a block, although it sounds a little orthogonal to the exec use modeling to me.

140

No, because it is convergent. But technically is has the same effect of using exec.

This revision was automatically updated to reflect the committed changes.