This is an archive of the discontinued LLVM Phabricator instance.

[NVPTX] Remove ftz variants of cvt with rounding mode
ClosedPublic

Authored by bkramer on Aug 21 2018, 8:30 AM.

Diff Detail

Repository
rL LLVM

Event Timeline

bkramer created this revision. Aug 21 2018, 8:30 AM
tra added a comment. Aug 21 2018, 9:44 AM

This is a surprise. PTX ISA does not mention that .ftz is not applicable to cvt.*.f16.* instructions.
Is it only cvt that does not support .ftz or does it impact other instructions? PTX spec has add/sub/mul/fma/set/setp instructions that support f16 and have .ftz variant.

In D51042#1207769, @tra wrote:

This is a surprise. PTX ISA does not mention that .ftz is not applicable to cvt.*.f16.* instructions.
Is it only cvt that does not support .ftz or does it impact other instructions? PTX spec has add/sub/mul/fma/set/setp instructions that support f16 and have .ftz variant.

It's only cvt with an explicit rounding mode. I ran the output of f16-instructions.ll with FTZ through ptxas and removed instructions until ptxas accepted the file. This might even be a bug in ptxas.
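
The elimination described above can be sketched roughly as follows. This is a simplified per-instruction variant, not the procedure actually used in this review: `toy_accepts` is a hypothetical stand-in for an assembler check that only mirrors what this thread reports (cvt with an explicit rounding mode, .ftz, and an f16 operand type is rejected), and the sample instructions are illustrative rather than taken from f16-instructions.ll. It does not invoke ptxas and makes no claim about ptxas's real rules.

```python
from typing import Callable, Iterable, List, Tuple

# PTX floating-point rounding modifiers that count as "explicit" here.
ROUNDING_MODES = {"rn", "rz", "rm", "rp"}


def toy_accepts(instruction: str) -> bool:
    """Hypothetical acceptor: an assumption modelled on this thread, NOT ptxas."""
    mnemonic = instruction.split()[0]
    opcode, *modifiers = mnemonic.split(".")
    is_cvt = opcode == "cvt"
    has_rounding = any(m in ROUNDING_MODES for m in modifiers)
    has_ftz = "ftz" in modifiers
    touches_f16 = "f16" in modifiers
    return not (is_cvt and has_rounding and has_ftz and touches_f16)


def partition(
    instructions: Iterable[str], accepts: Callable[[str], bool]
) -> Tuple[List[str], List[str]]:
    """Split instructions into (kept, rejected) by checking each one on its own."""
    kept: List[str] = []
    rejected: List[str] = []
    for inst in instructions:
        (kept if accepts(inst) else rejected).append(inst)
    return kept, rejected


# Illustrative input only.
sample = [
    "cvt.rn.ftz.f16.f32 %h0, %f0;",
    "cvt.rn.f16.f32 %h1, %f1;",
    "add.ftz.f16 %h2, %h0, %h1;",
    "cvt.ftz.f32.f16 %f2, %h2;",
]
kept, rejected = partition(sample, toy_accepts)
```

To run this against a real assembler, `accepts` would have to wrap each instruction in a minimal PTX module and check the assembler's exit status; that wrapper is left out because its details are not given in this thread.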

tra accepted this revision. Aug 21 2018, 10:59 AM
In D51042#1207769, @tra wrote:

This is a surprise. PTX ISA does not mention that .ftz is not applicable to cvt.*.f16.* instructions.
Is it only cvt that does not support .ftz or does it impact other instructions? PTX spec has add/sub/mul/fma/set/setp instructions that support f16 and have .ftz variant.

It's only cvt with an explicit rounding mode. I ran the output of f16-instructions.ll with FTZ through ptxas and removed instructions until ptxas accepted the file. This might even be a bug in ptxas.

It may be worth filing a bug with NVIDIA to either fix the problem or clarify the docs.

This revision is now accepted and ready to land. Aug 21 2018, 10:59 AM
This revision was automatically updated to reflect the committed changes.