This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Apply i16 add->sub pattern with zext to i32
ClosedPublic

Authored by arsenm on Jan 7 2020, 10:40 AM.

Details

Reviewers
rampitec
kerbowa
Summary

This was only applying the deeper nested zext pattern, and missing the
special case code size fold.
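
As a rough illustration of the kind of fold the summary refers to (this is not the TableGen pattern from the patch): an i16 add of a negative constant can be rewritten as a sub of the negated constant. That saves code size when the negated value fits an inline immediate and the original does not. The sketch assumes an integer inline-immediate range of -16..64, and all helper names are hypothetical. It only checks that the two forms agree after zero-extension to i32.

```python
MASK16 = 0xFFFF

# Assumption: integer inline immediates cover -16..64 inclusive.
INLINE_MIN, INLINE_MAX = -16, 64


def is_inline_imm(value: int) -> bool:
    """Hypothetical check for whether a constant can be encoded inline."""
    return INLINE_MIN <= value <= INLINE_MAX


def should_fold_add_to_sub(imm: int) -> bool:
    """True when `add x, imm` needs a literal but `sub x, -imm` does not."""
    return not is_inline_imm(imm) and is_inline_imm(-imm)


def zext_add_i16(x: int, imm: int) -> int:
    """Model zext(add i16 x, imm) to i32: wrap at 16 bits, high bits zero."""
    return (x + imm) & MASK16


def zext_sub_i16(x: int, imm: int) -> int:
    """Model zext(sub i16 x, imm) to i32: wrap at 16 bits, high bits zero."""
    return (x - imm) & MASK16


if __name__ == "__main__":
    for imm in (-64, -17, -16, 64):
        print(imm, should_fold_add_to_sub(imm))
```

Under these assumptions, `add x, -64` and `sub x, 64` produce the same zero-extended result for every 16-bit `x`, so the choice between them is purely an encoding-size decision.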

Event Timeline

arsenm created this revision. Jan 7 2020, 10:40 AM
Herald added a project: Restricted Project. Jan 7 2020, 10:40 AM

Will it work correctly both with and without sram-ecc? I.e., do we make any assumptions anywhere about the contents of the high 16 bits of an i16 value?

That only matters for memory accesses as far as I know. This isn't really a new pattern, and the existing predicates don't check

It is more than memory, as far as I know; even arithmetic instructions will either zero or preserve the high bits.

This is controlled by a bit starting in gfx9, I think. Eventually we need to split the instruction definitions to add a tied operand for the preserved-high case. These are separate problems from this patch anyway.
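
The concern in this exchange can be pictured with a toy model. It is only a sketch with hypothetical helper names, and it does not describe any particular hardware generation. If a 16-bit operation zeroes the high half of its 32-bit destination, the register already holds the zero-extended result. If it preserves the high half, the register contents depend on what was there before, so an explicit zext cannot simply be dropped.

```python
def write_i16_result(old_reg: int, result16: int, preserve_high: bool) -> int:
    """Toy model of a 16-bit op writing into a 32-bit register.

    preserve_high=False: the high 16 bits are cleared.
    preserve_high=True:  the high 16 bits keep their previous contents.
    """
    low = result16 & 0xFFFF
    high = (old_reg & 0xFFFF0000) if preserve_high else 0
    return high | low


def zext_i16_to_i32(value16: int) -> int:
    """Zero-extend a 16-bit value to 32 bits."""
    return value16 & 0xFFFF


if __name__ == "__main__":
    old = 0xDEAD0000  # arbitrary stale contents in the destination register
    res = 0x1234
    print(hex(write_i16_result(old, res, preserve_high=False)))
    print(hex(write_i16_result(old, res, preserve_high=True)))
```

In this model only the zeroing behavior makes the register equal to `zext_i16_to_i32(res)` regardless of its old contents.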

This revision is now accepted and ready to land. Jan 7 2020, 12:21 PM