This is an archive of the discontinued LLVM Phabricator instance.

[HIP] Use native math functions for `-fcuda-approx-transcendentals`
ClosedPublic

Authored by yaxunl on Jul 9 2023, 7:26 AM.

Details

Summary

CUDA allows replacing standard math functions with less accurate
native math functions when -use_fast_math is specified
(https://docs.nvidia.com/cuda/cuda-c-programming-guide/index.html#intrinsic-functions).
Cuda-clang does this when -fcuda-approx-transcendentals
or -ffast-math is specified. HIP-clang currently passes option
-fcuda-approx-transcendentals to clang -cc1 and predefines
__CLANG_CUDA_APPROX_TRANSCENDENTALS__ but does
not replace standard math functions with native math functions.

This patch implements this in a similar approach as cuda-clang.

Diff Detail

Event Timeline

yaxunl created this revision.Jul 9 2023, 7:26 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 9 2023, 7:26 AM
Herald added a subscriber: mattd. · View Herald Transcript
yaxunl requested review of this revision.Jul 9 2023, 7:26 AM
yaxunl updated this revision to Diff 538435.Jul 9 2023, 7:43 AM

fix test

b-sumner added inline comments.Jul 9 2023, 8:45 AM
clang/lib/Headers/__clang_hip_math.h
304

We could consider multiplying native_sin here with the native_recip of native_cos.

arsenm added a subscriber: arsenm.Jul 9 2023, 4:22 PM

This would be a lot easier if we clang fp was fully featured. As far as I can tell it lets you set #pragma clang fp reassociate(on), and contract(fast), but doesn't have a way to set arcp, afn, ninf or nnan

clang/lib/Headers/__clang_hip_math.h
160

We should have llvm.exp10 but don't today. Just inline __builtin_exp2f(M_LOG2_10_F * x)

163

__builtin_expf

280

We want llvm.log10.f32 with afn set. You get closer by just using __builtin_log10f. Ideally we would have a pragma to set afn locally

283

builtin_log2f or builtin_amdgcn_log. Ideally would be llvm.log2.f32 with afn set

286

Same as log10 case, except with log

301

Really we want these inlined with FMF set, the built libraries currently have no flags

arsenm added inline comments.Jul 9 2023, 4:30 PM
clang/lib/Headers/__clang_hip_math.h
163

Maybe this should just be __builtin_amdgcn_exp2f

yaxunl marked 8 inline comments as done.Jul 12 2023, 1:32 PM
yaxunl added inline comments.
clang/lib/Headers/__clang_hip_math.h
160

done

163

builtin_expf causes extra calculations. will use builtin_amdgcn_exp2f

280

will use __builtin_log10f

283

will use __builtin_amdgcn_logf

286

will use __builtin_logf

304

will do

yaxunl updated this revision to Diff 539715.Jul 12 2023, 1:54 PM
yaxunl marked 6 inline comments as done.

revised by comments

ping

It passes internal CI. Also tested with Blender main branch with -ffast-math and no regressions were found.

arsenm accepted this revision.Jul 24 2023, 2:54 PM
This revision is now accepted and ready to land.Jul 24 2023, 2:54 PM
This revision was landed with ongoing or failed builds.Jul 25 2023, 7:05 AM
This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. · View Herald TranscriptJul 25 2023, 7:05 AM