This is an archive of the discontinued LLVM Phabricator instance.

Do not copy long double and 128-bit fp format from aux target for AMDGPU
ClosedPublic

Authored by yaxunl on Jan 31 2019, 10:47 AM.

Details

Summary

rC352620 caused regressions because it copied floating point format from
aux target.

floating point format decides whether extended long double is supported.
It is x86_fp80 on x86 but IEEE double on amdgcn.

Document usage of long doubel type in HIP programming guide
https://github.com/ROCm-Developer-Tools/HIP/pull/890

Diff Detail

Repository
rC Clang

Event Timeline

yaxunl created this revision.Jan 31 2019, 10:47 AM

Okay, so you silently have an incompatible ABI for anything in the system headers that mentions long double. Do you have any plans to address or work around that, or is the hope that it just doesn't matter?

I feel like this should be a special case for AMDGPU rather than a general behavior with aux targets.

Okay, so you silently have an incompatible ABI for anything in the system headers that mentions long double. Do you have any plans to address or work around that, or is the hope that it just doesn't matter?

I feel like this should be a special case for AMDGPU rather than a general behavior with aux targets.

If host do not pass long double to device we will be fine. So we need to diagnose long double kernel arguments. However I'd like to do it in separate patch since we want to fix the regression first.

Since this maybe a special case for AMDGPU, I will fix it in AMDGPUTargetInfo.

Okay, so you silently have an incompatible ABI for anything in the system headers that mentions long double. Do you have any plans to address or work around that, or is the hope that it just doesn't matter?

I feel like this should be a special case for AMDGPU rather than a general behavior with aux targets.

If host do not pass long double to device we will be fine. So we need to diagnose long double kernel arguments. However I'd like to do it in separate patch since we want to fix the regression first.

Okay. Do you also need to look for global structs and other way that information might be passed? I suppose at some level you just have to document it as a danger and treat further diagnostics as QoI.

Since this maybe a special case for AMDGPU, I will fix it in AMDGPUTargetInfo.

Alright. That should be as easy as saving the old value and restoring it after the overwrite.

yaxunl updated this revision to Diff 184569.Jan 31 2019, 12:23 PM
yaxunl retitled this revision from Do not copy floating pointer format from aux target to Do not copy long double and 128-bit fp format from aux target for AMDGPU.
yaxunl edited the summary of this revision. (Show Details)

Fix in AMDGPUTargetInfo.

Okay, so you silently have an incompatible ABI for anything in the system headers that mentions long double. Do you have any plans to address or work around that, or is the hope that it just doesn't matter?

I feel like this should be a special case for AMDGPU rather than a general behavior with aux targets.

If host do not pass long double to device we will be fine. So we need to diagnose long double kernel arguments. However I'd like to do it in separate patch since we want to fix the regression first.

Okay. Do you also need to look for global structs and other way that information might be passed? I suppose at some level you just have to document it as a danger and treat further diagnostics as QoI.

I created a pull request to document long double usage in HIP https://github.com/ROCm-Developer-Tools/HIP/pull/890

Explanatory comment, please. Otherwise LGTM.

Explanatory comment, please. Otherwise LGTM.

will do when committing.

This revision was not accepted when it landed; it landed in state Needs Review.Jan 31 2019, 1:57 PM
This revision was automatically updated to reflect the committed changes.