This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Force shrinking of add/sub even if the carry is used
ClosedPublic

Authored by arsenm on Aug 28 2018, 4:54 AM.

Details

Reviewers
rampitec
Summary

The original motivating example uses a 64-bit add, so the carry
is used. Insert a copy from VCC. This may allow shrinking of
the instruction that uses the carry. At worst, we are replacing a
mov that materializes the constant with a copy of VCC.
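
To illustrate why the carry is live in a 64-bit add, here is a minimal sketch of how such an add decomposes into two 32-bit halves. It models only the arithmetic, not the patch or any LLVM API; the function names are hypothetical and chosen for illustration.

```python
MASK32 = 0xFFFFFFFF  # low 32 bits


def add_u32_with_carry(a, b, carry_in=0):
    """Add two 32-bit values plus a carry-in; return (result, carry_out)."""
    total = a + b + carry_in
    return total & MASK32, total >> 32


def add_u64_via_halves(x, y):
    """Hypothetical model of a 64-bit add built from two 32-bit adds."""
    # Low half: produces the carry (held in VCC on the hardware in question).
    lo, carry = add_u32_with_carry(x & MASK32, y & MASK32)
    # High half: consumes that carry, which is why the carry has a use.
    hi, _ = add_u32_with_carry(x >> 32, y >> 32, carry)
    return (hi << 32) | lo
```

The low-half add is the instruction whose carry output is used; the high-half add is the carry-in operation that consumes it.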

Event Timeline

arsenm created this revision. Aug 28 2018, 4:54 AM

I am not sure that a copy + e32 instruction is better than a single e64 instruction. In fact I think it is worse.

In practice the copy is always eliminated, since it's usually paired with a carry-in operation. The total cycle count is the same, with reduced code size.

Usually the add is shrunk, so it’s closer to code-size neutral.

This revision is now accepted and ready to land. Aug 28 2018, 10:06 AM
arsenm closed this revision. Aug 28 2018, 11:45 AM

r340862