This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU] Define SGPR_NULL64 register. NFCI.
Closed, Public

Authored by rampitec on Jun 10 2022, 12:11 PM.

Details

Summary

On gfx10+ the null register can be used as both a 32-bit and a 64-bit operand.
Define a 64-bit version of the register for use during codegen.

Event Timeline

rampitec created this revision. Jun 10 2022, 12:11 PM
Herald added a project: Restricted Project. Jun 10 2022, 12:11 PM
rampitec requested review of this revision. Jun 10 2022, 12:11 PM
Herald added a subscriber: wdng.
foad accepted this revision. Jun 13 2022, 12:58 PM

I guess this is OK. I'm a bit surprised that null is defined like a real physical register, but I guess it has always worked this way. And MIPS seems to do the same for their r0 register which works the same way.

This revision is now accepted and ready to land. Jun 13 2022, 12:58 PM

It is in fact a real HW register, although quite a special one. In any case, we need to fit it into an operand, so it needs to be part of an actual register class (RC), and the size has to match.
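
To illustrate the point about operand sizes, here is a minimal Python sketch rather than LLVM code. It assumes a toy model in which an operand accepts a register only if that register is a member of a class of the operand's width. `Reg`, `REG_CLASSES`, `fits_operand`, and the class contents are hypothetical names made up for illustration, and the reads-as-zero flag is modeled as an assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reg:
    """Toy register description (hypothetical, not an LLVM type)."""
    name: str
    bits: int                    # width as seen by an operand
    reads_as_zero: bool = False  # assumption: null behaves like a zero register


SGPR_NULL = Reg("SGPR_NULL", 32, reads_as_zero=True)
SGPR_NULL64 = Reg("SGPR_NULL64", 64, reads_as_zero=True)

# Toy register classes keyed by width; the contents are illustrative only.
REG_CLASSES = {
    32: {SGPR_NULL, Reg("SGPR0", 32)},
    64: {SGPR_NULL64, Reg("SGPR0_SGPR1", 64)},
}


def fits_operand(reg: Reg, operand_bits: int) -> bool:
    """A register fits only if it belongs to a class of the operand's width."""
    return reg in REG_CLASSES.get(operand_bits, set())
```

In this model `fits_operand(SGPR_NULL, 64)` is false while `fits_operand(SGPR_NULL64, 64)` is true, which is the gap a separate 64-bit definition closes.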

This revision was landed with ongoing or failed builds. Jun 13 2022, 1:23 PM
This revision was automatically updated to reflect the committed changes.

LGTM, but could there be more opportunities to use this?
Are there cases where we want the value 0 in a source operand (a normal VSrc, not VOPDstS64orS32 as used in https://reviews.llvm.org/D127542), where it can't be folded and we could use the 64-bit sgpr null instead? It might be interesting to have a test case for that with folding excluded.

0 is an inline literal, so usually we can use an inline 0 instead.

I can see one marginal case: a 64-bit add/sub is expanded and we are unable to shrink the first instruction, so we produce V_ADD_CO_CI_U32/V_SUB_CO_CI_U32, e.g. for an add or sub with an SGPR operand. Here null can be used as the carry-in.
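
The expansion described here can be sketched in Python as plain arithmetic: a 64-bit add split into two 32-bit adds with carry-out and carry-in. This models the arithmetic only, not the actual instructions or their encodings. `add_co_ci_u32` and `add_u64` are hypothetical helpers, and passing 0 as the carry-in of the low half stands in for reading null, on the assumption that null reads as zero.

```python
from __future__ import annotations

MASK32 = 0xFFFFFFFF


def add_co_ci_u32(a: int, b: int, carry_in: int) -> tuple[int, int]:
    """32-bit add with carry-in; returns (low 32 bits of the sum, carry-out)."""
    total = (a & MASK32) + (b & MASK32) + (carry_in & 1)
    return total & MASK32, total >> 32


def add_u64(a: int, b: int) -> int:
    """64-bit add built from two 32-bit adds with carry."""
    # Low half: the carry-in is 0, the value null is assumed to supply.
    lo, carry = add_co_ci_u32(a, b, 0)
    # High half: consumes the carry produced by the low half.
    hi, _ = add_co_ci_u32(a >> 32, b >> 32, carry)
    return (hi << 32) | lo
```

The low half needs a carry-in operand only because the carry-in form of the instruction is used there; its value is always zero.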

And even this case is impractical. On gfx10 we can use 2 constants, so an addc with a vgpr and an sgpr will end up being shrunk and use vcc as the carry-in, and an operation with 2 sgprs will be SALU.

Ok, thanks! It seems we have many ways to optimize instructions with zero operands, so the use of the null sgpr is quite specific.

Right. The problem with null being used in place of vcc as a carry-in, in particular, is that it prevents shrinking. And for a normal vsrc we can do better with inline literals.