This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Do not ignore exec use where exec is read as data
ClosedPublic

Authored by rampitec on Jan 20 2022, 11:02 AM.

Details

Summary

Compares, v_cndmask_b32, and v_readfirstlane_b32 use EXEC
in a way which modifies the result. This implicit EXEC use
shall not be ignored for the purposes of instruction moves.

Diff Detail

Event Timeline

rampitec created this revision.Jan 20 2022, 11:02 AM
rampitec requested review of this revision.Jan 20 2022, 11:02 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 20 2022, 11:02 AM
Herald added a subscriber: wdng. · View Herald Transcript
vangthao accepted this revision.Jan 20 2022, 12:53 PM

LGTM! Thanks!

This revision is now accepted and ready to land.Jan 20 2022, 12:53 PM
foad reopened this revision.Jan 21 2022, 3:38 AM

Compares, v_cndmask_b32, and v_readfirstlane_b32 use EXEC in a way which modifies the result.

This doesn't make any sense to me for cndmask. It handles EXEC exactly the same as any other VALU instruction: it writes a result in any lane which is enabled by EXEC, and does nothing for any lane that is disabled.

This revision is now accepted and ready to land.Jan 21 2022, 3:38 AM

Compares, v_cndmask_b32, and v_readfirstlane_b32 use EXEC in a way which modifies the result.

This doesn't make any sense to me for cndmask. It handles EXEC exactly the same as any other VALU instruction: it writes a result in any lane which is enabled by EXEC, and does nothing for any lane that is disabled.

It writes a scalar register and content of that register is different with different exec. Same thing as compare. See the bug found by Sebastian here: D116053.

foad added a comment.Jan 21 2022, 9:43 AM

Compares, v_cndmask_b32, and v_readfirstlane_b32 use EXEC in a way which modifies the result.

This doesn't make any sense to me for cndmask. It handles EXEC exactly the same as any other VALU instruction: it writes a result in any lane which is enabled by EXEC, and does nothing for any lane that is disabled.

It writes a scalar register and content of that register is different with different exec. Same thing as compare. See the bug found by Sebastian here: D116053.

No, v_cndmask_b32 does not write a scalar register!

Compares, v_cndmask_b32, and v_readfirstlane_b32 use EXEC in a way which modifies the result.

This doesn't make any sense to me for cndmask. It handles EXEC exactly the same as any other VALU instruction: it writes a result in any lane which is enabled by EXEC, and does nothing for any lane that is disabled.

It writes a scalar register and content of that register is different with different exec. Same thing as compare. See the bug found by Sebastian here: D116053.

No, v_cndmask_b32 does not write a scalar register!

Ugh, my bad!

rampitec closed this revision.Jan 21 2022, 11:23 AM

Closing this because it is landed. Here is patch to remove cndmask: D117909