This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU/GlobalISel: Legalize s16->s64 G_FPTOSI/G_FPTOUI
ClosedPublic

Authored by Petar.Avramovic on Jul 17 2020, 3:07 AM.

Details

Summary

Add narrowScalarFor action and
narrow scalar for typeIndex==0 for G_FPTOSI/G_FPTOUI.
Legalize using narrowScalarFor as s16->s32 G_FPTOSI/G_FPTOUI
followed by s32->s64 G_SEXT/G_ZEXT.

Diff Detail

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptJul 17 2020, 3:07 AM
foad added inline comments.Jul 17 2020, 3:28 AM
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-fptosi.mir
598–633

This expansion is huge! How about implementing it as: fpext from f16 to f32, then fptosi/fptoui from f32 to i32 (you know this will be in range because the range of f16 is only about +/-65536), then sext/zext from i32 to i64?

Petar.Avramovic edited the summary of this revision. (Show Details)

This might be narrowScalar on TypeIndex==0 but it requires that s16 is at TypeIndex==1 thus I used custom legalize.

foad added a comment.Jul 17 2020, 4:40 AM

The generated code looks much better, thanks. The implementation looks OK to me too, but perhaps wait to see if @arsenm has comments?

This might be narrowScalar on TypeIndex==0 but it requires that s16 is at TypeIndex==1 thus I used custom legalize.

You can do
.narrowScalarIf(all(typeIs(0, S64), typeIs(1, S16)), changeTo(0, S32))

but we should have a narrowFor like the other actions

Also it might be worth fixing this in the DAG. It ends up with the huge expansion now

Petar.Avramovic edited the summary of this revision. (Show Details)
arsenm accepted this revision.Jul 17 2020, 10:24 AM

LGTM with nit

llvm/lib/Target/AMDGPU/AMDGPULegalizerInfo.cpp
658

Missing space after ,

This revision is now accepted and ready to land.Jul 17 2020, 10:24 AM
This revision was automatically updated to reflect the committed changes.