
gregrodgers (Greg Rodgers)
User

Projects

User does not belong to any projects.

User Details

User Since
Aug 22 2016, 7:51 AM (223 w, 3 d)

Recent Activity

Jul 29 2020

gregrodgers added a comment to D84743: [Clang][AMDGCN] Universal device offloading macros header.

This is all excellent feedback. Thank you.
I don't understand what I see on the godbolt link. So far, we have only tested with clang. We will test with gcc to understand the failure.
I will make the change to use numeric values for _DEVICE_ARCH and change "UNKNOWN" to some integer (maybe -1). The value _DEVICE_GPU is intended to be generational within a specific _DEVICE_ARCH.
To be clear, this is primarily for users or library writers to implement device specific or host-only code. This is not something that should be automatic. Users or library authors will opt-in with their own include. So maybe it does not belong in clang/lib/Headers.
As noted in the header comment, I expect the community to help keep this up to date as offloading technology evolves.

Jul 29 2020, 7:59 AM · Restricted Project

Jun 12 2020

gregrodgers accepted D81109: llvm-link: Add support for archive files as inputs..

LGTM, only one new auto in existing code with 11 previous autos. Looks like consistent usage.

Jun 12 2020, 9:12 AM · Unknown Object (Project), Restricted Project

May 14 2020

gregrodgers commandeered D42800: Let CUDA toolchain support amdgpu target.
May 14 2020, 4:21 PM
gregrodgers abandoned D42800: Let CUDA toolchain support amdgpu target.
May 14 2020, 4:21 PM
gregrodgers abandoned D27928: Add isGPU() to Triple.
May 14 2020, 4:21 PM · Unknown Object (Project)

Mar 30 2020

gregrodgers added a comment to D76987: Rename options --cuda-gpu-arch and --no-cuda-gpu-arch.

This was discussed on llvm-dev three years ago. Here is the thread.

Mar 30 2020, 10:49 AM · Restricted Project

Feb 6 2020

gregrodgers added a comment to D73657: [OPENMP] Load plugins from same directory as the libomptarget.so and quick fail mechanism for offloading plugins.

I will rework this patch to

  • Try dlopen on the relative library name first to allow for an LD_LIBRARY_PATH search. If that fails, I will load using the full path name.
  • Reorganize the names arrays into a single array to avoid a counter.
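
The two points above could be sketched roughly as follows. This is not the code from the patch: the plugin names, the function names, and passing the fallback directory as a parameter are all assumptions made for illustration. Only dlopen, dlclose and RTLD_NOW come from the standard <dlfcn.h> interface.

```cpp
#include <dlfcn.h>
#include <string>

// Hypothetical plugin names, kept in a single array so that iterating over
// it needs no separate counter.
static const char *const kSketchPluginNames[] = {
    "libsketch_plugin_a.so",
    "libsketch_plugin_b.so",
};

// Try the bare name first so the dynamic loader applies its normal search
// (including LD_LIBRARY_PATH). If that fails, fall back to a full path
// built from `dir`.
inline void *open_plugin(const std::string &dir, const char *name) {
  if (void *handle = dlopen(name, RTLD_NOW))
    return handle;
  const std::string full_path = dir + "/" + name;
  return dlopen(full_path.c_str(), RTLD_NOW);
}

// Returns how many of the listed plugins could be opened.
inline int count_loadable_plugins(const std::string &dir) {
  int loaded = 0;
  for (const char *name : kSketchPluginNames) {
    if (void *handle = open_plugin(dir, name)) {
      ++loaded;
      dlclose(handle);
    }
  }
  return loaded;
}
```

A name without a slash is what triggers the loader's library search; a name containing a slash is treated as a path, which is why the fallback builds one explicitly.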
Feb 6 2020, 8:37 PM · Unknown Object (Project)

Jan 29 2020

gregrodgers added reviewers for D73657: [OPENMP] Load plugins from same directory as the libomptarget.so and quick fail mechanism for offloading plugins: ronl, ABataev, jdoerfert, RaviNarayanaswamy.
Jan 29 2020, 11:58 AM · Unknown Object (Project)
gregrodgers created D73657: [OPENMP] Load plugins from same directory as the libomptarget.so and quick fail mechanism for offloading plugins.
Jan 29 2020, 11:49 AM · Unknown Object (Project)

Aug 23 2018

gregrodgers added a comment to D50845: [CUDA/OpenMP] Define only some host macros during device compilation.

I have a longer comment on header files, but let me first understand this patch.

Aug 23 2018, 1:37 PM

Aug 22 2018

gregrodgers added a comment to D47849: [OpenMP][Clang][NVPTX] Enable math functions called in an OpenMP NVPTX target device region to be resolved as device-native function calls.

I like the idea of using an automatic include as a cc1 option (-include). However, I would prefer a more general automatic include for OpenMP, not just for math functions (clang_cuda_device_functions.h). Clang cuda automatically includes clang_cuda_runtime_wrapper.h. It includes other files as needed like clang_cuda_device_functions.h. Let's hypothetically call my proposed automatic include for OpenMP clang_openmp_runtime_wrapper.h.

Aug 22 2018, 9:58 AM · Restricted Project

Jun 25 2018

gregrodgers added a comment to D48455: Remove hip.amdgcn.bc hc.amdgcn.bc from HIP Toolchains.

Why not provide a specific list of --hip-device-lib= for VDI builds? I am not sure about defining functions inside headers instead of using a hip bc lib.

Jun 25 2018, 8:27 AM · Unknown Object (Project), Restricted Project

May 7 2018

gregrodgers added a comment to D46185: [OpenMP] Allow nvptx sm_30 to be used as an offloading device.

I agree that George's proposed RMW code is correct. This was my first attempt at RMW code. Maybe we should implement atomicMax as a device function in an architecture-specific (e.g., sm_30) device library. This way the code in loop.cu can remain just a call to atomicMax. Such an implementation would need an overloaded atomicMax.
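
For readers unfamiliar with the read-modify-write pattern under discussion, here is a host-side analog written with std::atomic. It is not the code from the review and not device code; the name atomic_max is an assumption for this sketch. A device version would follow the same compare-and-swap loop shape.

```cpp
#include <atomic>

// Atomically stores max(target, value) into target and returns the value
// that was observed before the update.
template <typename T>
T atomic_max(std::atomic<T> &target, T value) {
  T observed = target.load();
  // Retry only while our value is still larger than what is stored.
  // On failure, compare_exchange_weak refreshes `observed` with the
  // current contents, so the comparison is re-evaluated each round.
  while (observed < value &&
         !target.compare_exchange_weak(observed, value)) {
  }
  return observed;
}
```

Making it a template is one way to get the overloads mentioned above, since each element type gets its own instantiation.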

May 7 2018, 9:12 AM · Unknown Object (Project)

Apr 4 2018

gregrodgers added a comment to D44992: [OpenMP] enable bc file compilation using the latest clang.

So, will the deviceRTLs/nvptx change? Instead of extern shared, what will it use for those data structures?

Apr 4 2018, 9:41 AM · Unknown Object (Project)

Apr 2 2018

gregrodgers added a comment to D44992: [OpenMP] enable bc file compilation using the latest clang.

Maybe my search is missing something, but the only place I see CUDARelocatableDeviceCode is in lib/Sema/SemaDeclAttr.cpp to allow for extern shared. How could this be causing slowness? I would think forcing extern to be global would be slower.

Apr 2 2018, 9:13 AM · Unknown Object (Project)

Feb 5 2018

gregrodgers added a comment to D42800: Let CUDA toolchain support amdgpu target.

Here are my replies to the inline comments. Everything should be fixed in the next revision.

Feb 5 2018, 2:18 PM

Feb 1 2018

gregrodgers added a comment to D42800: Let CUDA toolchain support amdgpu target.

Sorry, all my great inline comments got lost somehow. I am a newbie to Phabricator. I will try to reconstruct my comments.

Feb 1 2018, 6:45 PM
gregrodgers requested changes to D42800: Let CUDA toolchain support amdgpu target.

Thanks to everyone for the reviews. I hope I replied to all inline comments. Since I sent this to Sam to post, we discovered a major shortcoming. As tra points out, there are a lot of cuda headers in the cuda sdk that are processed. We are able to override asm() expansions with #undef and redefine them as an equivalent amdgpu component so the compiler never sees the asm(). I am sure we will need to add more redefines as we broaden our testing. But that is not the big problem. We would like to be able to run cudaclang for AMD GPUs without an install of cuda. Of course you must always install cuda if any of your targeted GPUs are NVidia GPUs. To run cudaclang without cuda when only non-NVidia gpus are specified, we need an open set of headers and we must replace the fatbin tools used in the toolchain. The latter can be addressed by using the libomptarget methods for embedding multiple target GPU objects. The former is going to take a lot of work. I am going to be sending an updated patch that has the stubs for the open headers noted in clang_cuda_runtime_wrapper.h. They will be included with the CC1 flag -DUSE_OPEN_HEADERS__. This will be generated by the cuda driver when it finds no cuda installation and none of the target GPUs are NVidia.

Feb 1 2018, 6:42 PM

Dec 19 2016

gregrodgers added a comment to D27928: Add isGPU() to Triple.

Justin, the commonality between nvptx and amdgcn LLVM IR is exactly why I would like isGPU(). I actually do want to assume that "isGPU" <--> "isNVPTX || isAMDGCN".
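
The equivalence stated above can be written down as a small standalone sketch. SketchTriple and its Arch enum are hypothetical stand-ins invented for this example. This is not the llvm::Triple class and not the contents of the patch.

```cpp
// Hypothetical, reduced set of architectures for illustration.
enum class Arch { x86_64, nvptx, nvptx64, amdgcn };

struct SketchTriple {
  Arch arch;

  bool isNVPTX() const {
    return arch == Arch::nvptx || arch == Arch::nvptx64;
  }
  bool isAMDGCN() const { return arch == Arch::amdgcn; }

  // The proposed predicate: true exactly when the target is NVPTX or AMDGCN.
  bool isGPU() const { return isNVPTX() || isAMDGCN(); }
};
```

Code generation shared by both targets could then test isGPU() once instead of repeating the two-way check at every call site.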

Dec 19 2016, 6:14 PM · Unknown Object (Project)
gregrodgers added a comment to D27928: Add isGPU() to Triple.

I can email you a bigger patch from our development tree. I would rather not post it in public yet because it still needs some work. Here are two examples from this patch.

Dec 19 2016, 2:22 PM · Unknown Object (Project)
gregrodgers added a comment to D27928: Add isGPU() to Triple.

Thank you Justin, Yes, I plan to use this extensively in clang for common OpenMP code generation. But I don't have those patches ready yet.
isGPU() may also be used for compilation of cuda code to LLVM IR as an alternative to isNVPTX(). I will discuss with google authors first.
I formatted to 80 characters. Thank you for your patience with a new contributor.

Dec 19 2016, 11:15 AM · Unknown Object (Project)
gregrodgers updated the diff for D27928: Add isGPU() to Triple.

Formatted to 80 characters.

Dec 19 2016, 11:01 AM · Unknown Object (Project)
gregrodgers retitled D27928: Add isGPU() to Triple from to Add isGPU() to Triple.
Dec 19 2016, 9:52 AM · Unknown Object (Project)