This is an archive of the discontinued LLVM Phabricator instance.

[AMDGPU][ConstantFolding] Fold llvm.amdgcn.fract intrinsic
ClosedPublic

Authored by foad on Feb 26 2020, 7:17 AM.

Diff Detail

Event Timeline

foad created this revision.Feb 26 2020, 7:17 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 26 2020, 7:17 AM
arsenm added inline comments.Feb 26 2020, 7:27 AM
llvm/lib/Analysis/ConstantFolding.cpp
1810

This should match the instruction behavior (although I guess we can ignore the bug on SI)

1811–1814

The specs aren't necessarily relevant here, since this just needs to match the instruction behavior. Talking about the min when it isn't here is potentially confusing

foad marked an inline comment as done.Feb 26 2020, 7:46 AM
foad added inline comments.
llvm/lib/Analysis/ConstantFolding.cpp
1810

Is there a good public reference for that? The Vega ISA Reference Guide doesn't go into much detail.

arsenm added inline comments.Feb 26 2020, 7:56 AM
llvm/lib/Analysis/ConstantFolding.cpp
1810

This is always a problem, and no. I just go by this comment:

// V_FRACT is buggy on SI, so the F32 version is never used and (x-floor(x)) is
// used instead. However, SI doesn't have V_FLOOR_F64, so the most efficient
// way to implement it is using V_FRACT_F64.
// The workaround for the V_FRACT bug is:
//    fract(x) = isnan(x) ? x : min(V_FRACT(x), 0.99999999999999999)

// Convert floor(x) to (x - fract(x))
foad updated this revision to Diff 246767.Feb 26 2020, 9:33 AM
foad marked an inline comment as done.

Implement the OpenCL-like behaviour.

foad marked 3 inline comments as done.Feb 26 2020, 9:33 AM
foad added inline comments.
llvm/lib/Analysis/ConstantFolding.cpp
1810

OK, so it sounds like the (non-buggy) hardware uses the same trick as the OpenCL definition, to avoid ever returning 1.0. I'll try to confirm that on some real hardware.

foad marked 2 inline comments as done.Feb 27 2020, 6:16 AM
foad added inline comments.
llvm/lib/Analysis/ConstantFolding.cpp
1810

I've confirmed this for f16 and f32 types, on some real gfx9 hardware.

arsenm accepted this revision.Feb 27 2020, 6:26 AM
This revision is now accepted and ready to land.Feb 27 2020, 6:26 AM
foad marked an inline comment as done.Feb 27 2020, 6:36 AM
foad added inline comments.
llvm/lib/Analysis/ConstantFolding.cpp
1810

... and confirmed for f64 too.

This revision was automatically updated to reflect the committed changes.