This is an archive of the discontinued LLVM Phabricator instance.

RFD: AMDGPU: Stop validating earlyclobber operands in assembler
ClosedPublic

Authored by foad on Sep 20 2022, 2:55 AM.

Details

Summary

This validation was introduced in D34003 for v_qsad/v_mqsad instructions
but it applies to all instructions with earlyclobber operands, which now
includes v_mad_i64/v_mad_u64.

In all these cases I do not think there is documentation saying that the
destination must not overlap the sources. Rather there are *some* cases
where the instruction may not function correctly if there is an overlap,
and we are using earlyclobber as a conservative way of preventing
codegen from generating those cases.

I think it is unhelpful for the assembler to enforce the earlyclobber
restriction because it prevents assembling cases where the programmer
knows that in fact the overlap is safe.

See also: https://github.com/llvm/llvm-project/issues/57610

Diff Detail

Event Timeline

foad created this revision.Sep 20 2022, 2:55 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 20 2022, 2:55 AM
foad requested review of this revision.Sep 20 2022, 2:55 AM
Herald added a project: Restricted Project. · View Herald TranscriptSep 20 2022, 2:55 AM
dp accepted this revision.Sep 20 2022, 7:54 AM

LGTM

This revision is now accepted and ready to land.Sep 20 2022, 7:54 AM
foad added a comment.Sep 20 2022, 8:47 AM
In D134272#3802809, @dp wrote:

LGTM

Thanks. I'd like to get some more opinions (at least @rampitec's) before landing this.

dp added a comment.Sep 20 2022, 8:52 AM

I asked shader writers (Ilya Perminov in particular), and they are happy with your change.

Why not keep this validation and limit it to specific opcodes? For example multiplass DGEMM has the same restriction for the same reason.

Why not keep this validation and limit it to specific opcodes? For example multiplass DGEMM has the same restriction for the same reason.

Not only opcodes, but the specifically invalid overlaps

Why not keep this validation and limit it to specific opcodes? For example multiplass DGEMM has the same restriction for the same reason.

Not only opcodes, but the specifically invalid overlaps

Right, it affects specific operands only, we just have no way to express it in the td.

foad added a comment.Sep 21 2022, 5:28 AM

Why not keep this validation and limit it to specific opcodes? For example multiplass DGEMM has the same restriction for the same reason.

Not only opcodes, but the specifically invalid overlaps

Right, it affects specific operands only, we just have no way to express it in the td.

Adding checks for specific operands of specific (DGEMM?) opcodes sounds great, if the restrictions are clearly documented, but I don't think it belongs in this patch.

For mad/qsad/mqsad I don't think there are any clearly documented restrictions, just some reports of hardware bugs that we try to work around in codegen, so I don't think the assembler should be enforcing any restrictions.

So I'd still like to proceed with this patch.

rampitec accepted this revision.Sep 21 2022, 12:43 PM

OK, let's remove this and add a more specific verification later if needed.

This revision was landed with ongoing or failed builds.Sep 21 2022, 1:47 PM
This revision was automatically updated to reflect the committed changes.