This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU: Fold FP clamp as modifier bit
ClosedPublic

Authored by arsenm on Feb 17 2017, 11:29 PM.

Details

Reviewers
arsenm
Summary

The manual is unclear on the details of this. It's not
clear to me if denormals are not allowed with clamp,
or if that is only omod. Not allowing denorms for
fp16 or fp64 isn't useful so I also question if that
is really a restriction. Same with whether this is valid
without IEEE mode enabled.

Diff Detail

Event Timeline

arsenm created this revision.Feb 17 2017, 11:29 PM
mareko added a subscriber: mareko.Feb 19 2017, 3:50 PM

I only know that exceptions won't occur with the clamp modifier. No idea about denormals.

Also, shouldn't this handle MIN as well?

I only know that exceptions won't occur with the clamp modifier. No idea about denormals.

Also, shouldn't this handle MIN as well?

There's no practical reason to handle min. The higher level operation minnum(x, x) is folded to x in the IR, so this should only be appearing when we emit this pattern for the clamp operation, where max was arbitrarily chosen.

I only know that exceptions won't occur with the clamp modifier. No idea about denormals.

Also, shouldn't this handle MIN as well?

There's no practical reason to handle min. The higher level operation minnum(x, x) is folded to x in the IR, so this should only be appearing when we emit this pattern for the clamp operation, where max was arbitrarily chosen.

I don't understand. FPClamp(x) = min(max(x, 0), 1). I don't see min handled here, that's why I asked.

I only know that exceptions won't occur with the clamp modifier. No idea about denormals.

Also, shouldn't this handle MIN as well?

There's no practical reason to handle min. The higher level operation minnum(x, x) is folded to x in the IR, so this should only be appearing when we emit this pattern for the clamp operation, where max was arbitrarily chosen.

I don't understand. FPClamp(x) = min(max(x, 0), 1). I don't see min handled here, that's why I asked.

This isn't directly matching the clamp pattern. In the DAG we match that to AMDGPUISD::CLAMP. We emit that as the max(x, x) clamp. We match that here

arsenm updated this revision to Diff 89227.Feb 21 2017, 9:05 AM

Allow with denormals enabled

arsenm accepted this revision.Feb 22 2017, 3:39 PM

r295905

This revision is now accepted and ready to land.Feb 22 2017, 3:39 PM
arsenm closed this revision.Feb 22 2017, 3:40 PM
foad added a subscriber: foad.Jun 29 2022, 4:02 AM
foad added inline comments.
test/CodeGen/AMDGPU/clamp.ll
108–109

Does this FIXME still make sense? Not sure what it was trying to say in the first place.

128

Ditto.

Herald added a project: Restricted Project. · View Herald TranscriptJun 29 2022, 4:02 AM
arsenm added inline comments.Jun 29 2022, 6:16 AM
test/CodeGen/AMDGPU/clamp.ll
108–109

The source modifier is used here, so I don't think so