This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Fold omod into instructions
ClosedPublic

Authored by arsenm on Feb 21 2017, 9:15 AM.

Event Timeline

arsenm created this revision. Feb 21 2017, 9:15 AM
artem.tamazov requested changes to this revision. Feb 22 2017, 2:57 AM

At a minimum, the comments need to be updated.

lib/Target/AMDGPU/SIFoldOperands.cpp
790–792

Output modifiers are not compatible with output denorms, i.e.:

  • If output denorms are allowed (in the HW MODE register), then any output modifier is ignored.
  • If output denorms are not allowed, then denorms are flushed to +/-0 first. Then, if the output modifier is non-zero, -0 is forced to +0 prior to applying the modifier.

Output modifiers are not IEEE compliant (-0*x=+0). The hardware ignores output modifiers if the IEEE bit is set in the HW MODE register.

The above applies to all supported floating types, including f16, f32, f64.
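The rules above can be sketched as a small software model. This is only an illustration of the behaviour described in this comment, not code from the patch or a statement of how the hardware is built: `ModeState`, `applyOutputModifier`, and the idea of passing the modifier as a plain float scale (with 1.0f standing for "no modifier") are all hypothetical, and only f32 is modeled.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical model of the two MODE register settings named in the comment.
struct ModeState {
  bool IEEE;            // IEEE bit is set
  bool OutputDenormals; // output denormals are allowed
};

// Hypothetical model of an f32 output modifier. Scale is the factor the
// modifier applies; 1.0f stands for "no modifier" (a zero omod field).
inline float applyOutputModifier(float Result, float Scale, ModeState Mode) {
  assert(Scale > 0.0f && "the modifier is modeled as a positive scale");

  // With output denormals disallowed, a denormal result is flushed to a
  // zero of the same sign first.
  if (!Mode.OutputDenormals && std::fpclassify(Result) == FP_SUBNORMAL)
    Result = std::copysign(0.0f, Result);

  // The modifier is ignored if the IEEE bit is set or output denormals are
  // allowed; with no modifier selected there is nothing to apply.
  if (Mode.IEEE || Mode.OutputDenormals || Scale == 1.0f)
    return Result;

  // A non-zero modifier forces -0 to +0 before scaling (-0 == 0 is true).
  if (Result == 0.0f)
    Result = 0.0f;

  return Result * Scale;
}
```

In this model a result of -0 comes back as +0 whenever a modifier is applied, which is the non-IEEE behaviour (-0*x=+0) that makes the fold unsafe when the sign of zero matters.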

829

Nope. Please see the comment above.

853

Yes. If IEEE is set, OMOD does not work.

This revision now requires changes to proceed. Feb 22 2017, 2:57 AM
arsenm updated this revision to Diff 89459. Feb 22 2017, 7:04 PM
arsenm edited edge metadata.

Don't fold if the IEEE bit is set. Don't fold unless no-signed-zeros-fp-math is enabled.
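As a rough sketch, the conditions in this update amount to a legality check like the one below. It is not the code from the diff: `FoldContext` and `mayFoldOMod` are hypothetical names, and the third check (output denormals) is not part of this update note; it is an inference from the reviewer's earlier remark that the hardware ignores output modifiers when output denorms are allowed.

```cpp
// Hypothetical summary of the state a fold would need to inspect.
struct FoldContext {
  bool IEEEBitSet;             // IEEE bit set in the MODE register
  bool NoSignedZerosFPMath;    // no-signed-zeros-fp-math is enabled
  bool OutputDenormalsEnabled; // assumption: taken from the review comment
};

// Returns true only when none of the blocking conditions applies:
//  - the IEEE bit makes the hardware ignore the output modifier,
//  - without no-signed-zeros, the -0 to +0 change would be observable,
//  - with output denormals allowed, the modifier is likewise ignored.
constexpr bool mayFoldOMod(const FoldContext &Ctx) {
  return !Ctx.IEEEBitSet && Ctx.NoSignedZerosFPMath &&
         !Ctx.OutputDenormalsEnabled;
}
```

Keeping the check as one conjunction makes each blocking condition easy to audit against the review comments.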

mareko edited edge metadata. Feb 23 2017, 6:57 AM

Other than that, nice work.

lib/Target/AMDGPU/AMDGPUCallingConv.td
38

Why are the calling conventions being changed?

arsenm added inline comments. Feb 23 2017, 4:03 PM
lib/Target/AMDGPU/AMDGPUCallingConv.td
38

I needed a way to get an f16 input into a graphics shader. Previously this would just assert on an unhandled value type. I can commit this separately.

mareko added inline comments. Feb 24 2017, 1:42 AM
lib/Target/AMDGPU/AMDGPUCallingConv.td
38

OK. I guess you can keep it here.

This revision is now accepted and ready to land. Feb 27 2017, 8:38 AM
arsenm closed this revision. Feb 27 2017, 11:50 AM

r296372