[AMDGPU] Use s_add_i32 for address additions
Closed, Public

Authored by sebastian-ne on May 28 2021, 8:38 AM.

Details

Summary

Using s_add_i32 allows the add instruction to be converted to s_addk_i32,
and to v_add_nc_u32 (rather than v_add_co_u32) when it is converted to a
VALU instruction.
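
To illustrate why the choice of opcode matters for this conversion, here is a small standalone C++ model of the three scalar adds. It is not LLVM code. It assumes the SCC semantics given in the AMD ISA documentation: s_add_u32 sets SCC to the unsigned carry-out, while s_add_i32 and s_addk_i32 set it to signed overflow. All type and function names are invented for this sketch.

```cpp
#include <cassert>
#include <cstdint>

// Toy model of a scalar add: the 32-bit result plus the SCC status bit.
// These names are hypothetical and do not exist in LLVM.
struct ScalarAddResult {
  uint32_t Value;
  bool SCC;
};

// s_add_u32 (assumed semantics): SCC is the unsigned carry-out.
constexpr ScalarAddResult sAddU32(uint32_t A, uint32_t B) {
  uint32_t Sum = A + B;
  return {Sum, Sum < A};
}

// s_add_i32 (assumed semantics): SCC is signed overflow, i.e. both inputs
// have the same sign and the result has the opposite sign.
constexpr ScalarAddResult sAddI32(uint32_t A, uint32_t B) {
  uint32_t Sum = A + B;
  bool Overflow = ((~(A ^ B) & (A ^ Sum)) >> 31) != 0;
  return {Sum, Overflow};
}

// s_addk_i32 (assumed semantics): D = D + signext(simm16), SCC = overflow.
constexpr ScalarAddResult sAddKI32(uint32_t D, int16_t Imm) {
  return sAddI32(D, static_cast<uint32_t>(static_cast<int32_t>(Imm)));
}

// The short form is only available when the constant fits in 16 signed bits.
constexpr bool fitsInSImm16(int32_t Imm) {
  return Imm >= INT16_MIN && Imm <= INT16_MAX;
}

// Small demonstration: same result bits, different SCC.
inline void demoScalarAdds() {
  const uint32_t Base = 0x100;
  const int32_t Offset = -16;
  assert(fitsInSImm16(Offset));

  ScalarAddResult U = sAddU32(Base, static_cast<uint32_t>(Offset));
  ScalarAddResult I = sAddI32(Base, static_cast<uint32_t>(Offset));
  ScalarAddResult K = sAddKI32(Base, static_cast<int16_t>(Offset));

  assert(U.Value == I.Value && I.Value == K.Value); // identical result bits
  assert(I.SCC == K.SCC);                           // identical SCC
  assert(U.SCC != I.SCC);                           // carry-out differs here
}
```

In this model, adding the immediate -16 to 0x100 gives 0xF0 in all three cases, but s_add_u32 sets SCC (carry-out) while s_add_i32 and s_addk_i32 leave it clear. Under these assumptions, only the s_add_i32 form can be swapped for s_addk_i32 without knowing whether SCC is read afterwards.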

Event Timeline

sebastian-ne created this revision. May 28 2021, 8:38 AM
sebastian-ne requested review of this revision. May 28 2021, 8:38 AM
Herald added a project: Restricted Project. May 28 2021, 8:38 AM
foad added a comment. May 28 2021, 8:56 AM

This allows to convert the add instruction to s_addk_i32

Nice. (But perhaps we should be able to convert s_add_u32 -> s_addk_i32 if scc is dead?)

and v_add_nc_u32 instead of needing v_add_co_u32 when converting to a VALU instruction.

None of the tests show this. Why is it better? Just because it does not clobber vcc?

llvm/lib/Target/AMDGPU/SIFrameLowering.cpp
311

Matt usually objects to this extra indentation on the grounds that clang-format is wrong.

1297–1302

*=?

This allows to convert the add instruction to s_addk_i32

Nice. (But perhaps we should be able to convert s_add_u32 -> s_addk_i32 if scc is dead?)

That would be nice, but how can I find out if SCC is unused? The dead flag is unreliable: at least for GlobalISel, it is not set when the ShrinkInstructions pass runs. In some review, Matt suggested that we should remove the flag altogether.
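
One way to answer the "is SCC unused?" question without trusting the dead flag is a local forward scan from the defining instruction. The sketch below is a standalone toy model in C++, not LLVM code: `ToyInst` and `isSCCDefDead` are hypothetical names, and the instruction representation is reduced to two booleans. It is conservative at the end of the block unless the caller states that SCC is not live-out.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction: records only how it touches SCC.
struct ToyInst {
  bool ReadsSCC;
  bool WritesSCC;
};

// Returns true if the SCC value defined by Block[DefIdx] is provably unused.
// Reaching the end of the block without a redefinition counts as "live"
// unless the caller passes SCCLiveOut = false.
inline bool isSCCDefDead(const std::vector<ToyInst> &Block, std::size_t DefIdx,
                         bool SCCLiveOut = true) {
  for (std::size_t I = DefIdx + 1; I < Block.size(); ++I) {
    if (Block[I].ReadsSCC)
      return false; // a later instruction observes this definition
    if (Block[I].WritesSCC)
      return true;  // overwritten before any use
  }
  return !SCCLiveOut;
}

// Small demonstration of the three possible outcomes.
inline void demoSCCScan() {
  // def, unrelated instruction, redefinition: the first def is dead.
  std::vector<ToyInst> Overwritten{{false, true}, {false, false}, {false, true}};
  assert(isSCCDefDead(Overwritten, 0));

  // def followed by a reader: the def is live.
  std::vector<ToyInst> Used{{false, true}, {true, false}};
  assert(!isSCCDefDead(Used, 0));

  // def at the end of the block: unknown, so treated as live by default.
  std::vector<ToyInst> AtEnd{{false, true}};
  assert(!isSCCDefDead(AtEnd, 0));
}
```

An instruction that both reads and writes SCC is treated as a use, because the read is checked first. A real implementation would also need to bound the scan and handle successors, which this sketch leaves out.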

and v_add_nc_u32 instead of needing v_add_co_u32 when converting to a VALU instruction.

None of the tests show this. Why is it better? Just because it does not clobber vcc?

I hoped it could save a register, but you’re right, it doesn’t change anything.

rampitec accepted this revision. Jun 1 2021, 2:42 PM

LGTM modulo Jay's comments.

This revision is now accepted and ready to land. Jun 1 2021, 2:42 PM
arsenm accepted this revision. Jun 1 2021, 2:44 PM
This revision was automatically updated to reflect the committed changes.
sebastian-ne marked an inline comment as done.