This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Partially fix not respecting dynamic denormal mode
ClosedPublic

Authored by arsenm on Jun 15 2023, 3:52 PM.

Details

Reviewers
rampitec
foad
Pierre-vh
cdevadas
Group Reviewers
Restricted Project
Summary

The most notable issue was producing v_mad_f32 in functions with the
dynamic mode, since it just ignores the mode. fdiv lowering is still
somewhat broken because it involves a mode switch and we need to query
the original mode.

Diff Detail

Event Timeline

arsenm created this revision.Jun 15 2023, 3:52 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 15 2023, 3:52 PM
arsenm requested review of this revision.Jun 15 2023, 3:52 PM
Herald added a project: Restricted Project. · View Herald TranscriptJun 15 2023, 3:52 PM
Herald added a subscriber: wdng. · View Herald Transcript

Can you please explain why does it need reversal of the HasFP32Denormals? This creates a lot turmoil.

arsenm added a comment.Jul 5 2023, 2:31 PM

Can you please explain why does it need reversal of the HasFP32Denormals? This creates a lot turmoil.

It's conceptually backwards. The default logic has to be treat like IEEE. It needs to be the positive condition that we definitely have flushing on

Can you please explain why does it need reversal of the HasFP32Denormals? This creates a lot turmoil.

It's conceptually backwards. The default logic has to be treat like IEEE. It needs to be the positive condition that we definitely have flushing on

It should be possible to logically separate it into 2 patches then: one reversing the flag but not changing the behavior, and then the fix itself?

arsenm added a comment.Jul 5 2023, 2:47 PM

It should be possible to logically separate it into 2 patches then: one reversing the flag but not changing the behavior, and then the fix itself?

Yes, but it's kind of the same thing. I'd have to go spend the time to re-invert all the logic here and doesn't seem worth the effort

It should be possible to logically separate it into 2 patches then: one reversing the flag but not changing the behavior, and then the fix itself?

Yes, but it's kind of the same thing. I'd have to go spend the time to re-invert all the logic here and doesn't seem worth the effort

Most of the patch is the logic inversion, I believe I have scanned it 3 times already. Could you please highlight the actual fix, or why does the inversion itself fixes it?

arsenm added a comment.Jul 7 2023, 8:23 AM

It should be possible to logically separate it into 2 patches then: one reversing the flag but not changing the behavior, and then the fix itself?

Yes, but it's kind of the same thing. I'd have to go spend the time to re-invert all the logic here and doesn't seem worth the effort

Most of the patch is the logic inversion, I believe I have scanned it 3 times already. Could you please highlight the actual fix, or why does the inversion itself fixes it?

Because there's a 3rd option now, dynamic. It could be either mode. !IEEE doesn't imply denormals can be flushed, it has to be ==PreserveSign. This fixes illegally forming v_mad_f32 (see v_mac_f32_dynamic_ftz, this would be wrong if the function was called from an IEEE context if it used v_mad_f32)

rampitec accepted this revision.Jul 7 2023, 12:03 PM

It should be possible to logically separate it into 2 patches then: one reversing the flag but not changing the behavior, and then the fix itself?

Yes, but it's kind of the same thing. I'd have to go spend the time to re-invert all the logic here and doesn't seem worth the effort

Most of the patch is the logic inversion, I believe I have scanned it 3 times already. Could you please highlight the actual fix, or why does the inversion itself fixes it?

Because there's a 3rd option now, dynamic. It could be either mode. !IEEE doesn't imply denormals can be flushed, it has to be ==PreserveSign. This fixes illegally forming v_mad_f32 (see v_mac_f32_dynamic_ftz, this would be wrong if the function was called from an IEEE context if it used v_mad_f32)

OK, thanks.

This revision is now accepted and ready to land.Jul 7 2023, 12:03 PM