
yaxunl (Yaxun Liu)
User

Projects

User does not belong to any projects.

User Details

User Since
May 13 2015, 10:16 AM (200 w, 5 d)

Recent Activity

Fri, Mar 15

yaxunl added a comment to D59316: [HIP-Clang] propagate -mllvm options to opt and llc.
Fri, Mar 15, 2:43 PM · Restricted Project
yaxunl added a comment to D59316: [HIP-Clang] propagate -mllvm options to opt and llc.

Here we are looking at the code which emulates a "linker" for the HIP toolchain. The offloading action builder requests that the offloading toolchain have a linker, but amdgpu does not have a real (ISA-level) linker, so we have to emulate one. If we had an ISA-level linker we could get rid of all this stuff, but I don't think that will happen any time soon.

This isn't really true. We do run lld to link the final executable. It also doesn't change that opt and llc should never be involved in the process.

Can lld do ISA level linking? That is, one device function in one object file calls another device function in a different object file, and we let lld link them together?

We can't link multiple objects, but we do need to link the single object with lld. The relocations, even for functions in the same module, are 0 until lld fixes them up. Do we have execution tests for function calls using HIP? Since it looks like lld isn't getting used here, I suspect they aren't working.

Fri, Mar 15, 12:35 PM · Restricted Project
yaxunl added a comment to D59316: [HIP-Clang] propagate -mllvm options to opt and llc.

Here we are looking at the code which emulates a "linker" for the HIP toolchain. The offloading action builder requests that the offloading toolchain have a linker, but amdgpu does not have a real (ISA-level) linker, so we have to emulate one. If we had an ISA-level linker we could get rid of all this stuff, but I don't think that will happen any time soon.

This isn't really true. We do run lld to link the final executable. It also doesn't change that opt and llc should never be involved in the process.

Fri, Mar 15, 12:14 PM · Restricted Project

Thu, Mar 14

yaxunl accepted D59316: [HIP-Clang] propagate -mllvm options to opt and llc.

LGTM. Thanks!

Thu, Mar 14, 11:02 AM · Restricted Project
yaxunl added a comment to D59316: [HIP-Clang] propagate -mllvm options to opt and llc.

Here we are looking at the code which emulates a "linker" for the HIP toolchain. The offloading action builder requests that the offloading toolchain have a linker, but amdgpu does not have a real (ISA-level) linker, so we have to emulate one. If we had an ISA-level linker we could get rid of all this stuff, but I don't think that will happen any time soon.

Thu, Mar 14, 11:02 AM · Restricted Project

Tue, Mar 5

yaxunl committed rGc5be267003ef: [CUDA][HIP][Sema] Fix template kernel with function as template parameter (authored by yaxunl).
[CUDA][HIP][Sema] Fix template kernel with function as template parameter
Tue, Mar 5, 10:20 AM
yaxunl committed rL355421: [CUDA][HIP][Sema] Fix template kernel with function as template parameter.
[CUDA][HIP][Sema] Fix template kernel with function as template parameter
Tue, Mar 5, 10:20 AM
yaxunl committed rC355421: [CUDA][HIP][Sema] Fix template kernel with function as template parameter.
[CUDA][HIP][Sema] Fix template kernel with function as template parameter
Tue, Mar 5, 10:20 AM
yaxunl closed D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter.
Tue, Mar 5, 10:19 AM · Restricted Project
yaxunl committed rG071949c3afae: Allow bundle size to be 0 in clang-offload-bundler (authored by yaxunl).
Allow bundle size to be 0 in clang-offload-bundler
Tue, Mar 5, 9:52 AM
yaxunl committed rC355419: Allow bundle size to be 0 in clang-offload-bundler.
Allow bundle size to be 0 in clang-offload-bundler
Tue, Mar 5, 9:52 AM
yaxunl committed rL355419: Allow bundle size to be 0 in clang-offload-bundler.
Allow bundle size to be 0 in clang-offload-bundler
Tue, Mar 5, 9:52 AM
yaxunl closed D58057: Allow bundle size to be 0 in clang-offload-bundler.
Tue, Mar 5, 9:52 AM · Restricted Project
yaxunl committed rGab851939fc6b: [HIP] Do not unbundle object files for -fno-gpu-rdc (authored by yaxunl).
[HIP] Do not unbundle object files for -fno-gpu-rdc
Tue, Mar 5, 8:09 AM
yaxunl committed rC355410: [HIP] Do not unbundle object files for -fno-gpu-rdc.
[HIP] Do not unbundle object files for -fno-gpu-rdc
Tue, Mar 5, 8:09 AM
yaxunl committed rL355410: [HIP] Do not unbundle object files for -fno-gpu-rdc.
[HIP] Do not unbundle object files for -fno-gpu-rdc
Tue, Mar 5, 8:09 AM
yaxunl closed D58917: [HIP] Do not unbundle object files for -fno-gpu-rdc.
Tue, Mar 5, 8:09 AM · Restricted Project, Restricted Project
yaxunl added a reviewer for D58057: Allow bundle size to be 0 in clang-offload-bundler: ABataev.
Tue, Mar 5, 7:02 AM · Restricted Project
yaxunl added a comment to D58057: Allow bundle size to be 0 in clang-offload-bundler.

Alexey, could you please also review this patch? Thanks.

Tue, Mar 5, 7:02 AM · Restricted Project
yaxunl added inline comments to D58917: [HIP] Do not unbundle object files for -fno-gpu-rdc.
Tue, Mar 5, 7:02 AM · Restricted Project, Restricted Project

Mon, Mar 4

yaxunl created D58917: [HIP] Do not unbundle object files for -fno-gpu-rdc.
Mon, Mar 4, 12:01 PM · Restricted Project, Restricted Project

Thu, Feb 28

yaxunl committed rGfd2c5c05fc8f: Partial revert of r353952: [HIP] Handle compile -m options and propagate into… (authored by yaxunl).
Partial revert of r353952: [HIP] Handle compile -m options and propagate into…
Thu, Feb 28, 9:08 AM
yaxunl committed rC355106: Partial revert of r353952: [HIP] Handle compile -m options and propagate into….
Partial revert of r353952: [HIP] Handle compile -m options and propagate into…
Thu, Feb 28, 9:08 AM
yaxunl committed rL355106: Partial revert of r353952: [HIP] Handle compile -m options and propagate into….
Partial revert of r353952: [HIP] Handle compile -m options and propagate into…
Thu, Feb 28, 9:08 AM

Wed, Feb 27

yaxunl committed rG785cbd850b76: [NFC] minor revision of r354929 [CUDA][HIP] Check calling convention based on… (authored by yaxunl).
[NFC] minor revision of r354929 [CUDA][HIP] Check calling convention based on…
Wed, Feb 27, 7:48 AM
yaxunl committed rL354990: [NFC] minor revision of r354929 [CUDA][HIP] Check calling convention based on….
[NFC] minor revision of r354929 [CUDA][HIP] Check calling convention based on…
Wed, Feb 27, 7:45 AM
yaxunl committed rC354990: [NFC] minor revision of r354929 [CUDA][HIP] Check calling convention based on….
[NFC] minor revision of r354929 [CUDA][HIP] Check calling convention based on…
Wed, Feb 27, 7:45 AM
yaxunl added inline comments to D57716: [CUDA][HIP] Check calling convention based on function target.
Wed, Feb 27, 7:11 AM · Restricted Project
yaxunl added inline comments to D57716: [CUDA][HIP] Check calling convention based on function target.
Wed, Feb 27, 6:57 AM · Restricted Project

Tue, Feb 26

yaxunl committed rGe739ac0e2555: [HIP] change kernel stub name (authored by yaxunl).
[HIP] change kernel stub name
Tue, Feb 26, 6:03 PM
yaxunl committed rC354948: [HIP] change kernel stub name.
[HIP] change kernel stub name
Tue, Feb 26, 6:03 PM
yaxunl committed rL354948: [HIP] change kernel stub name.
[HIP] change kernel stub name
Tue, Feb 26, 6:03 PM
yaxunl closed D58518: [HIP] change kernel stub name.
Tue, Feb 26, 6:03 PM · Restricted Project, Restricted Project
yaxunl committed rGfa49c3a888e8: [CUDA][HIP] Check calling convention based on function target (authored by yaxunl).
[CUDA][HIP] Check calling convention based on function target
Tue, Feb 26, 2:27 PM
yaxunl committed rL354929: [CUDA][HIP] Check calling convention based on function target.
[CUDA][HIP] Check calling convention based on function target
Tue, Feb 26, 2:27 PM
yaxunl committed rC354929: [CUDA][HIP] Check calling convention based on function target.
[CUDA][HIP] Check calling convention based on function target
Tue, Feb 26, 2:27 PM
yaxunl closed D57716: [CUDA][HIP] Check calling convention based on function target.
Tue, Feb 26, 2:27 PM · Restricted Project
yaxunl accepted D58623: [AMDGPU] Allow using integral non-type template parameters.

LGTM. Thanks!

Tue, Feb 26, 9:38 AM · Restricted Project
yaxunl added a comment to D57716: [CUDA][HIP] Check calling convention based on function target.

ping

Tue, Feb 26, 8:23 AM · Restricted Project
yaxunl added a reviewer for D58518: [HIP] change kernel stub name: rjmccall.
Tue, Feb 26, 8:23 AM · Restricted Project, Restricted Project
yaxunl committed rGd83c74028db0: [OpenCL] Fix assertion due to blocks (authored by yaxunl).
[OpenCL] Fix assertion due to blocks
Tue, Feb 26, 8:20 AM
yaxunl committed rL354893: [OpenCL] Fix assertion due to blocks.
[OpenCL] Fix assertion due to blocks
Tue, Feb 26, 8:20 AM
yaxunl committed rC354893: [OpenCL] Fix assertion due to blocks.
[OpenCL] Fix assertion due to blocks
Tue, Feb 26, 8:20 AM
yaxunl closed D58658: [OpenCL] Fix assertion due to blocks.
Tue, Feb 26, 8:20 AM · Restricted Project
yaxunl added a comment to D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter.

I would like to fix the validation issue only and leave the overload resolution issue for the future.

As I understand it, the "validation issue" is just that you'd like a diagnostic to be emitted when resolving the template argument in order to force SFINAE to pick a different template. I think that's actually just the overload-resolution issue.

Tue, Feb 26, 7:16 AM · Restricted Project

Mon, Feb 25

yaxunl created D58658: [OpenCL] Fix assertion due to blocks.
Mon, Feb 25, 5:39 PM · Restricted Project

Fri, Feb 22

yaxunl added inline comments to D58518: [HIP] change kernel stub name.
Fri, Feb 22, 3:47 PM · Restricted Project, Restricted Project
yaxunl reopened D58518: [HIP] change kernel stub name.
Fri, Feb 22, 1:45 PM · Restricted Project, Restricted Project
yaxunl updated the diff for D58518: [HIP] change kernel stub name.

Fixed regressions.

Fri, Feb 22, 1:45 PM · Restricted Project, Restricted Project
yaxunl updated the diff for D57716: [CUDA][HIP] Check calling convention based on function target.

modify test to use non-template functions.

Fri, Feb 22, 8:44 AM · Restricted Project
yaxunl added inline comments to D57716: [CUDA][HIP] Check calling convention based on function target.
Fri, Feb 22, 8:40 AM · Restricted Project

Thu, Feb 21

yaxunl committed rG00ebc0cb92e9: revert r354615: [HIP] change kernel stub name (authored by yaxunl).
revert r354615: [HIP] change kernel stub name
Thu, Feb 21, 8:20 PM
yaxunl committed rL354651: revert r354615: [HIP] change kernel stub name.
revert r354615: [HIP] change kernel stub name
Thu, Feb 21, 8:19 PM
yaxunl committed rC354651: revert r354615: [HIP] change kernel stub name.
revert r354615: [HIP] change kernel stub name
Thu, Feb 21, 8:19 PM
yaxunl updated the diff for D57716: [CUDA][HIP] Check calling convention based on function target.

Revised per Artem's comments.

Thu, Feb 21, 2:04 PM · Restricted Project
yaxunl committed rG8d7cf0e2d4b5: [HIP] change kernel stub name (authored by yaxunl).
[HIP] change kernel stub name
Thu, Feb 21, 12:13 PM
yaxunl committed rL354615: [HIP] change kernel stub name.
[HIP] change kernel stub name
Thu, Feb 21, 12:12 PM
yaxunl committed rC354615: [HIP] change kernel stub name.
[HIP] change kernel stub name
Thu, Feb 21, 12:11 PM
yaxunl closed D58518: [HIP] change kernel stub name.
Thu, Feb 21, 12:11 PM · Restricted Project, Restricted Project
yaxunl updated the diff for D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter.

I would like to fix the validation issue only and leave the overload resolution issue for the future.

Thu, Feb 21, 11:43 AM · Restricted Project
yaxunl accepted D58509: [CodeGen] Fix string literal address space casting..

LGTM. Thanks!

Thu, Feb 21, 11:28 AM · Restricted Project
yaxunl added a comment to D58518: [HIP] change kernel stub name.
In D58518#1406124, @tra wrote:

My guess is that this is needed because the HIP debugger can see symbols from both host and device executables at the same time. Is that so?

If that's the case, I guess HIP may have a similar naming problem for a __host__ __device__ foo() if it's used on both host and device.

Thu, Feb 21, 11:24 AM · Restricted Project, Restricted Project
yaxunl created D58518: [HIP] change kernel stub name.
Thu, Feb 21, 10:05 AM · Restricted Project, Restricted Project

Feb 15 2019

yaxunl added a comment to D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter.

It is totally unreasonable, at the time you are resolving a template argument, to investigate how the corresponding template parameter is used within the template and use that to shape how the template argument is resolved. That is simply not how the C++ template model works. Given that CUDA doesn't distinguish between host and device functions in the type system, if you are going to have a rule here, it has to be based on, at most, (1) the current semantic context (which may not even be a function), (2) the template being specialized, and (3) the declarations in the template-argument set.

As I've said before on a previous patch, I think the *best* rule would be to recognize a hard difference between host and device function types, probably by making function types default to being host function types and requiring function pointers that can store device function pointers to be explicitly annotated. However, that would not be source-compatible with ordinary CUDA, which is presumably unacceptable.

The second-best rule would be to preserve compatibility by making an unannotated function type still be "unknown whether host or device", but to also allow the creation of explicitly host-only and device-only function types. For source compatibility, DREs to functions would formally have the unknown function type. Converting a pointer to an unknown function into a pointer to a host function would do some basic checking on the operand expression (basically to verify that it's not obviously a device function), and resolving an overload set in the context of a host-only function pointer type would do the obvious filtering.

Otherwise, you're going to be stuck where you are right now, which is that you're messing around with heuristics because somebody added a language extension that isn't actually very well thought out. But if that's what you have to do, it's what you have to do. For this specific question, where you are trying to resolve an overloaded template argument, I think there are basically two sensible options.

  • You can filter the overloads by the host-ness of the template. This makes some sense, because it's probably most likely that a function template that takes a function as a template argument is going to call it — but not necessarily, because it very well might decide instead to call over to the device to invoke the function. Also, not all templates have a "host-ness"; that's pretty much exclusive to function templates.
  • You can filter the overload by the host-ness of the current context. Again, this makes some sense because it's likely that a host function is trying to pass down a host function — but again, it's not hard to think of exceptions. And again, this has the problem that the context isn't always a function and so doesn't necessarily have a host-ness. Any sort of additional template-specific guidance seems doomed to gradually turn into the second design I mentioned above where you have the ability to be more specific about function types.

For the time being, this is still a Clang extension, and while Artem mentioned that NVIDIA is investigating it, that's presumably still an investigation and we still have an opportunity to shape their thinking. So I would really recommend taking the second approach, or maybe even trying to convince them to take the first. (How common is higher-order programming on the device, anyway, that you can't break source compatibility for it?) For this specific line of inquiry, that would probably mean not trying to automatically use any particular filter on the overload set but instead just relying on the programmer to annotate what kind of function they want.
Feb 15 2019, 9:28 PM · Restricted Project
yaxunl added a comment to D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter.

But what we've just been talking about is not a validity rule, it's an overload-resolution rule. It's not *invalid* to use a device function as a template argument to a host function template (or to a class template, which of course is neither host nor device). All you need to do is to resolve otherwise-intractable overload ambiguities by matching with the host-ness of the current context, which there's probably already code to do for when an overload set is used as e.g. a function argument.

Feb 15 2019, 4:39 PM · Restricted Project

Feb 14 2019

yaxunl added a comment to D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter.
In D56411#1398291, @tra wrote:

That said, does CUDA have a general rule resolving __host__ vs. __device__ overloads based on context? And does it allow overloading based solely on __host__ vs. __device__?

NVCC does not. Clang does. See https://goo.gl/EXnymm for the details.

AFAICT, NVIDIA is starting to consider adopting Clang's approach:
http://lists.llvm.org/pipermail/cfe-dev/2018-November/060070.html (original message from Bryce apparently didn't make it to the cfe-dev archive)

Okay. Probably the template-argument rule ought to be the same as the address-of-function rule, which I assume means that there's a final pass that resolves ambiguities in favor of functions that can be used from the current context, to the extent that that's meaningful. It's hard to tell because that document does not appear to include a formal specification.

Regardless, that has no effect on this patch.

Feb 14 2019, 1:00 PM · Restricted Project
yaxunl added a comment to D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter.
In D56411#1398291, @tra wrote:

That said, does CUDA have a general rule resolving __host__ vs. __device__ overloads based on context? And does it allow overloading based solely on __host__ vs. __device__?

NVCC does not. Clang does. See https://goo.gl/EXnymm for the details.

AFAICT, NVIDIA is starting to consider adopting Clang's approach:
http://lists.llvm.org/pipermail/cfe-dev/2018-November/060070.html (original message from Bryce apparently didn't make it to the cfe-dev archive)

Feb 14 2019, 11:28 AM · Restricted Project
yaxunl added a comment to D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter.

I think the diagnostic should come during instantiation when you find an evaluated use of a host function within a device function.

It seems the body of a function template is checked only during parsing of the definition of the template itself. When a function template is instantiated, the body of the instantiated function is not checked again.

No, that's not correct. However, it's checked somewhat differently, and it's possible that the existing diagnostic is not set up to fire along all common paths. Try moving the diagnostic to MarkFunctionReferenced, and note that OdrUse will be false in all the unevaluated contexts.

You are right. After I disabled the current diagnostic, I saw PerformPendingInstantiations at the end of parsing the TU, where the AST of the instantiated function is iterated and MarkFunctionReferenced is called. I will try to fix my patch as suggested. Thanks.

I have one concern. If we want to do overload resolution of a function-type template argument based on host or device, we need to do that before template instantiation, right?

e.g. suppose we have two functions with the same name f and the same type, but one is __host__ and the other is __device__, and we pass the name as a template argument to a template function g. We want to choose the __device__ f if g itself is __device__ and the __host__ f if g itself is __host__. If we want to do this, we have to do the check before template instantiation, right?

Yes, you would need to check that when resolving the overload to a single declaration. That would be separate from diagnosing uses.

That said, does CUDA have a general rule resolving __host__ vs. __device__ overloads based on context? And does it allow overloading based solely on __host__ vs. __device__?

Feb 14 2019, 9:03 AM · Restricted Project
yaxunl added a comment to D56411: [CUDA][HIP][Sema] Fix template kernel with function as template parameter.

I think the diagnostic should come during instantiation when you find an evaluated use of a host function within a device function.

It seems the body of a function template is checked only during parsing of the definition of the template itself. When a function template is instantiated, the body of the instantiated function is not checked again.

No, that's not correct. However, it's checked somewhat differently, and it's possible that the existing diagnostic is not set up to fire along all common paths. Try moving the diagnostic to MarkFunctionReferenced, and note that OdrUse will be false in all the unevaluated contexts.

You are right. After I disabled the current diagnostic, I saw PerformPendingInstantiations at the end of parsing the TU, where the AST of the instantiated function is iterated and MarkFunctionReferenced is called. I will try to fix my patch as suggested. Thanks.

Feb 14 2019, 8:35 AM · Restricted Project

Feb 13 2019

yaxunl committed rGc18e9ecd4fc1: [CUDA][HIP] Use device side kernel and variable names when registering them (authored by yaxunl).
[CUDA][HIP] Use device side kernel and variable names when registering them
Feb 13 2019, 6:00 PM
yaxunl committed rL354004: [CUDA][HIP] Use device side kernel and variable names when registering them.
[CUDA][HIP] Use device side kernel and variable names when registering them
Feb 13 2019, 6:00 PM
yaxunl committed rC354004: [CUDA][HIP] Use device side kernel and variable names when registering them.
[CUDA][HIP] Use device side kernel and variable names when registering them
Feb 13 2019, 6:00 PM
yaxunl closed D58163: [CUDA][HIP] Use device side kernel and variable names when registering them.
Feb 13 2019, 5:59 PM · Restricted Project
yaxunl updated the diff for D58163: [CUDA][HIP] Use device side kernel and variable names when registering them.

Revised per Artem's comments.

Feb 13 2019, 11:42 AM · Restricted Project
yaxunl added inline comments to D58163: [CUDA][HIP] Use device side kernel and variable names when registering them.
Feb 13 2019, 11:27 AM · Restricted Project
yaxunl added inline comments to D58163: [CUDA][HIP] Use device side kernel and variable names when registering them.
Feb 13 2019, 11:08 AM · Restricted Project
yaxunl added a comment to rG7084b56ee2e0: [HIP] Handle compile -m options and propagate into LLC.

It seems we only need to remove -mattr=-code-object-v3. We do not need to handle the feature options, e.g. -mcode-object-v3, here, since they will be automatically passed to clang -cc1 and handled there. You can keep the test since it is still correct.

Feb 13 2019, 10:53 AM

Feb 12 2019

yaxunl created D58163: [CUDA][HIP] Use device side kernel and variable names when registering them.
Feb 12 2019, 7:57 PM · Restricted Project

Feb 11 2019

yaxunl accepted D56871: [AMDGPU] Require at least protected visibility for certain symbols.

LGTM. Thanks!

Feb 11 2019, 10:37 AM · Restricted Project
yaxunl created D58057: Allow bundle size to be 0 in clang-offload-bundler.
Feb 11 2019, 8:52 AM · Restricted Project

Feb 6 2019

yaxunl added a comment to D57829: [HIP] Disable emitting llvm.linker.options in device compilation.
In D57829#1387412, @tra wrote:

Could you elaborate on why you want to disable this metadata? I think the original idea of llvm.linker.options was that it should be ignored if the back-end does not support it.

Feb 6 2019, 11:16 AM
yaxunl created D57831: AMDGPU: set wchar_t and wint_t to be unsigned short on windows.
Feb 6 2019, 11:06 AM
yaxunl created D57829: [HIP] Disable emitting llvm.linker.options in device compilation.
Feb 6 2019, 10:53 AM

Feb 5 2019

yaxunl updated the diff for D57716: [CUDA][HIP] Check calling convention based on function target.

My last fix was not right. This patch fixes the issue by checking the calling convention based on whether it is a host or device function.

Feb 5 2019, 12:24 PM · Restricted Project

Feb 4 2019

yaxunl created D57716: [CUDA][HIP] Check calling convention based on function target.
Feb 4 2019, 2:36 PM · Restricted Project
yaxunl added a comment to D57707: Add Triple::isAMDGPU.

Needs unit test

Feb 4 2019, 12:48 PM
yaxunl updated the diff for D57707: Add Triple::isAMDGPU.

add usage of it

Feb 4 2019, 12:47 PM
yaxunl created D57707: Add Triple::isAMDGPU.
Feb 4 2019, 12:29 PM

Jan 31 2019

yaxunl committed rL352801: Do not copy long double and 128-bit fp format from aux target for AMDGPU.
Do not copy long double and 128-bit fp format from aux target for AMDGPU
Jan 31 2019, 1:57 PM
yaxunl committed rC352801: Do not copy long double and 128-bit fp format from aux target for AMDGPU.
Do not copy long double and 128-bit fp format from aux target for AMDGPU
Jan 31 2019, 1:57 PM
yaxunl closed D57527: Do not copy long double and 128-bit fp format from aux target for AMDGPU.
Jan 31 2019, 1:57 PM
yaxunl added a comment to D57527: Do not copy long double and 128-bit fp format from aux target for AMDGPU.

Explanatory comment, please. Otherwise LGTM.

Jan 31 2019, 1:04 PM
yaxunl added a comment to D57527: Do not copy long double and 128-bit fp format from aux target for AMDGPU.

Okay, so you silently have an incompatible ABI for anything in the system headers that mentions long double. Do you have any plans to address or work around that, or is the hope that it just doesn't matter?

I feel like this should be a special case for AMDGPU rather than a general behavior with aux targets.

If the host does not pass long double to the device we will be fine. So we need to diagnose long double kernel arguments. However, I'd like to do that in a separate patch since we want to fix the regression first.

Okay. Do you also need to look for global structs and other ways that information might be passed? I suppose at some level you just have to document it as a danger and treat further diagnostics as QoI.

I created a pull request to document long double usage in HIP https://github.com/ROCm-Developer-Tools/HIP/pull/890

Jan 31 2019, 12:24 PM
yaxunl updated the diff for D57527: Do not copy long double and 128-bit fp format from aux target for AMDGPU.

Fix in AMDGPUTargetInfo.

Jan 31 2019, 12:23 PM
yaxunl added a comment to D57527: Do not copy long double and 128-bit fp format from aux target for AMDGPU.

Okay, so you silently have an incompatible ABI for anything in the system headers that mentions long double. Do you have any plans to address or work around that, or is the hope that it just doesn't matter?

I feel like this should be a special case for AMDGPU rather than a general behavior with aux targets.

Jan 31 2019, 11:09 AM
yaxunl created D57527: Do not copy long double and 128-bit fp format from aux target for AMDGPU.
Jan 31 2019, 10:47 AM

Jan 30 2019

yaxunl committed rL352620: [HIP] Fix size_t for MSVC environment.
[HIP] Fix size_t for MSVC environment
Jan 30 2019, 4:27 AM
yaxunl committed rC352620: [HIP] Fix size_t for MSVC environment.
[HIP] Fix size_t for MSVC environment
Jan 30 2019, 4:27 AM
yaxunl closed D56318: [HIP] Fix size_t for MSVC environment.
Jan 30 2019, 4:27 AM

Jan 29 2019

yaxunl updated the diff for D56318: [HIP] Fix size_t for MSVC environment.

Use const argument.

Jan 29 2019, 8:20 PM