
tra (Artem Belevich)
User

Projects

User does not belong to any projects.

User Details

User Since
Jan 8 2015, 1:53 PM (273 w, 5 d)

Recent Activity

Today

tra added a child revision for D77665: [CUDA] Simplify GPU variant handling. NFC.: D77670: [CUDA] Add partial support for recent CUDA versions..
Tue, Apr 7, 12:31 PM · Restricted Project
tra added a parent revision for D77670: [CUDA] Add partial support for recent CUDA versions.: D77665: [CUDA] Simplify GPU variant handling. NFC..
Tue, Apr 7, 12:31 PM · Restricted Project
tra added inline comments to D77665: [CUDA] Simplify GPU variant handling. NFC..
Tue, Apr 7, 12:31 PM · Restricted Project
tra created D77670: [CUDA] Add partial support for recent CUDA versions..
Tue, Apr 7, 11:58 AM · Restricted Project
tra updated the diff for D77665: [CUDA] Simplify GPU variant handling. NFC..

Enumerate all known GPU variants during libdevice detection instead of
hardcoding them.

Tue, Apr 7, 11:55 AM · Restricted Project
tra created D77665: [CUDA] Simplify GPU variant handling. NFC..
Tue, Apr 7, 11:26 AM · Restricted Project

Yesterday

tra added a comment to D77451: Accept -x cu to indicate language is CUDA, transfer CUDA language flag to header-file arguments.

NVCC uses different options that should be properly translated

Interested to see how this will work. Is clang itself going to support these args (act compatibly with nvcc), or is the idea that only the tools will?

Mon, Apr 6, 1:37 PM · Restricted Project
tra added a reviewer for D77451: Accept -x cu to indicate language is CUDA, transfer CUDA language flag to header-file arguments: sammccall.

Please add some more details to the bug description. This change is to make clangd work when compilation database sees CUDA sources compiled with nvcc. NVCC uses different options that should be properly translated. This patch only deals with recognizing the sources as CUDA, but does not handle the compiler options. While not perfect, it's still a useful improvement.

Mon, Apr 6, 12:00 PM · Restricted Project

Fri, Apr 3

tra added a comment to D77352: [llvm] Fix missing FileCheck directive colons.

LGTM for NVPTX tests.

Fri, Apr 3, 9:40 AM · Restricted Project
tra accepted D77398: [cuda][hip] Fix `RegisterVar` function prototype..
Fri, Apr 3, 9:08 AM · Restricted Project

Thu, Apr 2

tra added a comment to D77240: [CUDA] Add missing cmath overloads.

I just noticed those as well. I forgot to put the new definitions into the forward declare header. Will do it in a second. The OpenMP math overlay doesn't have one so I forgot :(

Thu, Apr 2, 9:44 AM · Restricted Project

Wed, Apr 1

tra added a comment to D77240: [CUDA] Add missing cmath overloads.

We do have a problem. With your patch I see a lot of errors about function redefinitions conflicting with the ones in CUDA's math_functions.hpp:

Wed, Apr 1, 4:20 PM · Restricted Project
tra added a comment to D77240: [CUDA] Add missing cmath overloads.

At least that one is defined in what is "now" __clang_cuda_math.h:

Wed, Apr 1, 3:11 PM · Restricted Project
tra updated subscribers of D77239: [CUDA][NFCI] Use unqualified lookup for math functions.

The other macro uses an unqualified lookup already and the qualified one
will cause problems in the OpenMP overlay.

Wed, Apr 1, 2:38 PM · Restricted Project
tra accepted D77238: [CUDA][NFC] Split math.h functions out of __clang_cuda_device_functions.h.
Wed, Apr 1, 2:13 PM · Restricted Project
tra added a comment to D77240: [CUDA] Add missing cmath overloads.

We'll need to make sure that all of these new functions are available in all supported CUDA versions.
E.g. acoshf does not seem to be present in CUDA-9.

Wed, Apr 1, 2:13 PM · Restricted Project

Tue, Mar 31

tra added a comment to D77149: [Alignment][NFC] Use Align in SelectionDAG::getMemIntrinsicNode.

LGTM for NVPTX changes.

Tue, Mar 31, 9:57 AM · Restricted Project

Mon, Mar 30

tra accepted D76987: Rename options --cuda-gpu-arch and --no-cuda-gpu-arch.

+ @echristo who OK'ed the idea conditional on the actual patch. :-)

Mon, Mar 30, 12:30 PM · Restricted Project
tra added a comment to D76365: [cuda][hip] Add CUDA builtin surface/texture reference support..
Mon, Mar 30, 10:16 AM · Restricted Project
tra accepted D76795: [HIP] Change default --gpu-max-threads-per-block value to 1024.
Mon, Mar 30, 10:16 AM
tra resigned from D76994: [DAG] Fix PR45049: LegalizeTypes crash.

Looks OK to me, but I'm not that familiar with the legalizer.

Mon, Mar 30, 9:44 AM · Restricted Project

Fri, Mar 27

tra added a comment to D76365: [cuda][hip] Add CUDA builtin surface/texture reference support..

The new revision is accepted, right? Just want to confirm, as it seems you accepted it before I posted the new change.

Fri, Mar 27, 3:59 PM · Restricted Project
tra added a comment to D76948: [cuda][hip] Add CUDA builtin surface/texture reference support..

I tried that before submitting this one. But, as it's in the closed state, I cannot submit that anymore. I will attach the difference against the previous change somewhere.

Fri, Mar 27, 2:17 PM · Restricted Project
tra reopened D76365: [cuda][hip] Add CUDA builtin surface/texture reference support..

Reopened for further work

Fri, Mar 27, 2:17 PM · Restricted Project
tra added a comment to D76948: [cuda][hip] Add CUDA builtin surface/texture reference support..

Would it be possible to update the old review with the new diff? It would make it easier to see the incremental changes you've made. If the old review can be reopened that would be great as it would keep all relevant info in one place, but I'm fine doing the review here, too, if phabricator does not let you do it.

Fri, Mar 27, 1:43 PM · Restricted Project
tra committed rGfe8063e1a0e9: Revert "[cuda][hip] Add CUDA builtin surface/texture reference support." (authored by tra).
Revert "[cuda][hip] Add CUDA builtin surface/texture reference support."
Fri, Mar 27, 10:20 AM
tra added a reverting change for rG6a9ad5f3f4ac: [cuda][hip] Add CUDA builtin surface/texture reference support.: rGfe8063e1a0e9: Revert "[cuda][hip] Add CUDA builtin surface/texture reference support.".
Fri, Mar 27, 10:19 AM
tra added a comment to D76365: [cuda][hip] Add CUDA builtin surface/texture reference support..

That's a partial template specialization that needs handling. I am revising that patch. Please revert it first. Thanks.

Fri, Mar 27, 10:16 AM · Restricted Project
tra added a comment to D76365: [cuda][hip] Add CUDA builtin surface/texture reference support..
In D76365#1946345, @tra wrote:

Looks like the change breaks compilation for us:

In file included from <built-in>:1:
In file included from llvm_unstable/toolchain/lib/clang/google3-trunk/include/__clang_cuda_runtime_wrapper.h:104:
In file included from cuda/include/cuda_runtime.h:116:
cuda/include/cuda_surface_types.h:91:42: error: illegal device builtin surface reference type 'surface<void, dim>' declared here
struct  __device_builtin_surface_type__  surface<void, dim> : public surfaceReference
                                         ^
cuda/include/cuda_surface_types.h:91:42: note: 'surface<void, dim>' needs to be instantiated from a class template with the 2nd template argument as an integral value
1 error generated when compiling for sm_60.

I'm investigating, but we may need to roll back this patch. Stay tuned.

Fri, Mar 27, 10:16 AM · Restricted Project
tra added a comment to D76365: [cuda][hip] Add CUDA builtin surface/texture reference support..

Looks like the change breaks compilation for us:

Fri, Mar 27, 9:41 AM · Restricted Project

Thu, Mar 26

tra accepted D76365: [cuda][hip] Add CUDA builtin surface/texture reference support..

LGTM. Next step is to figure out what various __nv_tex_surf_handler(<string>...) maps to for various strings (there are ~110 of them in CUDA-10.2) and implement its replacement. I think we should be able to do it in the wrapper file.

Thu, Mar 26, 10:18 AM · Restricted Project

Wed, Mar 25

tra added inline comments to D76365: [cuda][hip] Add CUDA builtin surface/texture reference support..
Wed, Mar 25, 10:16 AM · Restricted Project

Mon, Mar 23

tra accepted D76520: [CUDA][HIP] Add -Xarch_device and -Xarch_host options.
Mon, Mar 23, 12:33 PM · Restricted Project
tra added a comment to D76520: [CUDA][HIP] Add -Xarch_device and -Xarch_host options.

-Xarch_ does not work for passing -cc1 options in the beginning. This patch does not change that.

This requires some further changes to how the options after -Xarch_ are handled. I would suggest doing that in another patch.

Mon, Mar 23, 10:55 AM · Restricted Project
tra added a comment to D76520: [CUDA][HIP] Add -Xarch_device and -Xarch_host options.

-Xarch_ works with driver options having value, e.g. -fcf-protection=branch. I added a test for that.

-mframe-pointer=none is a cc1 option. That's why it cannot be passed by -Xarch_. If it is made a driver option it can be passed.
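
The routing described in this thread can be sketched as a small host-side model. This is an illustrative assumption, not clang's driver code: SplitArgs and splitXarchArgs are hypothetical names, and the model covers only the case where -Xarch_host or -Xarch_device applies to the single argument that follows it.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model, not clang's driver code: the arguments each
// compilation side ends up seeing.
struct SplitArgs {
  std::vector<std::string> host;    // arguments seen by the host compilation
  std::vector<std::string> device;  // arguments seen by device compilations
};

// Routes the single argument following -Xarch_host / -Xarch_device to one
// side only; every other argument goes to both sides.
inline SplitArgs splitXarchArgs(const std::vector<std::string> &args) {
  SplitArgs out;
  for (std::size_t i = 0; i < args.size(); ++i) {
    const std::string &a = args[i];
    const bool hostOnly = (a == "-Xarch_host");
    const bool deviceOnly = (a == "-Xarch_device");
    if ((hostOnly || deviceOnly) && i + 1 < args.size()) {
      // An option with a value such as "-fcf-protection=branch" is a single
      // argv entry, so the value travels with the option.
      const std::string &next = args[++i];
      (hostOnly ? out.host : out.device).push_back(next);
      continue;
    }
    // Unprefixed arguments (and, in this sketch, a trailing -Xarch_* with
    // nothing after it) are passed to both sides unchanged.
    out.host.push_back(a);
    out.device.push_back(a);
  }
  return out;
}
```

Calling splitXarchArgs on {"-O2", "-Xarch_device", "-fcf-protection=branch"} leaves the host side with only -O2, while the device side also receives -fcf-protection=branch.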

Mon, Mar 23, 9:48 AM · Restricted Project

Fri, Mar 20

tra added inline comments to D76365: [cuda][hip] Add CUDA builtin surface/texture reference support..
Fri, Mar 20, 5:22 PM · Restricted Project
tra added a reviewer for D76520: [CUDA][HIP] Add -Xarch_device and -Xarch_host options: echristo.

Does it handle options with values? E.g. if I want to pass -mframe-pointer=none ? I vaguely recall the current -Xarch_* implementation had some limitations.
It may be worth adding a test for that.

Fri, Mar 20, 1:01 PM · Restricted Project
tra added inline comments to D76365: [cuda][hip] Add CUDA builtin surface/texture reference support..
Fri, Mar 20, 12:28 PM · Restricted Project

Thu, Mar 19

tra added a comment to D76365: [cuda][hip] Add CUDA builtin surface/texture reference support..
Thu, Mar 19, 5:33 PM · Restricted Project
tra added a comment to D76365: [cuda][hip] Add CUDA builtin surface/texture reference support..

I believe LLVM does have nvvm.texsurf.handle implemented: https://github.com/llvm/llvm-project/blob/d9972f848294b06807c8764615852ba2bc1e8a74/llvm/include/llvm/IR/IntrinsicsNVVM.td#L1150

Thu, Mar 19, 4:27 PM · Restricted Project
tra added a comment to D76365: [cuda][hip] Add CUDA builtin surface/texture reference support..

Note that the clang-based one needs to define the texture fetch functions, as they could not be reused from the CUDA SDK. That part is enclosed in #if defined(__clang__).

Thu, Mar 19, 4:27 PM · Restricted Project
tra accepted D76455: [NFC] Refactor handling of Xarch option.
Thu, Mar 19, 3:56 PM · Restricted Project
tra added a comment to D76455: [NFC] Refactor handling of Xarch option.

+1 for refactoring, but what's the long-term plan?
A long time ago, echristo@ and I had a vague idea to change clang's option parsing to allow something like -Xarch_host <host-only args> -Xarch_device <args for all GPU compilations...> -Xarch=<target> <options for <target> only...>

Thu, Mar 19, 3:22 PM · Restricted Project

Wed, Mar 18

tra accepted D76032: [HIP] Fix duplicate clang -cc1 options on MSVC toolchain.
Wed, Mar 18, 10:52 AM · Restricted Project

Mon, Mar 16

tra committed rG74bf95d71dfc: [CUDA] Updated CompileCudaWithLLVM doc. (authored by tra).
[CUDA] Updated CompileCudaWithLLVM doc.
Mon, Mar 16, 3:52 PM
tra updated subscribers of D76030: [CUDA] Warn about unsupported CUDA SDK version only if it's used..

@hans -- this should be cherry-picked into 10 if it's not too late yet.

Mon, Mar 16, 3:51 PM · Restricted Project
tra accepted D76039: [HIP] Let clang recognize .hip extension.
Mon, Mar 16, 12:34 PM · Restricted Project
tra added a comment to D75811: [CUDA] Choose default architecture based on CUDA installation.

Your help here and over on CMake's side has been very helpful. Thank you!
I'll @ you on CMake's side if I need any help while working on CUDA support. Hopefully you won't mind. :)

Mon, Mar 16, 12:01 PM · Restricted Project

Thu, Mar 12

tra added a comment to D75811: [CUDA] Choose default architecture based on CUDA installation.

After some work on my CMake changes, Clang detection as a CUDA compiler works and I can compile CUDA code.

Thu, Mar 12, 11:23 AM · Restricted Project
tra committed rGeb2ba2ea953b: [CUDA] Warn about unsupported CUDA SDK version only if it's used. (authored by tra).
[CUDA] Warn about unsupported CUDA SDK version only if it's used.
Thu, Mar 12, 10:19 AM
tra closed D76030: [CUDA] Warn about unsupported CUDA SDK version only if it's used..
Thu, Mar 12, 10:18 AM · Restricted Project

Wed, Mar 11

tra committed rG8527c1ed66c6: Added constraints on cl-options.cu test (authored by tra).
Added constraints on cl-options.cu test
Wed, Mar 11, 4:30 PM
tra added a reviewer for D76032: [HIP] Fix duplicate clang -cc1 options on MSVC toolchain: rnk.

LGTM, but I'm not familiar with the details of clang-cl. I've added Reid who'd have a better idea.

Wed, Mar 11, 4:30 PM · Restricted Project
tra updated the diff for D76030: [CUDA] Warn about unsupported CUDA SDK version only if it's used..

clang-formatted the changes.

Wed, Mar 11, 4:30 PM · Restricted Project
tra updated the diff for D76030: [CUDA] Warn about unsupported CUDA SDK version only if it's used..

Added a test.

Wed, Mar 11, 3:19 PM · Restricted Project
tra created D76030: [CUDA] Warn about unsupported CUDA SDK version only if it's used..
Wed, Mar 11, 3:19 PM · Restricted Project
tra added a comment to D73231: [CUDA] Assume the latest known CUDA version if we've found an unknown one..

This patch should fix it: https://reviews.llvm.org/D76030

Wed, Mar 11, 3:19 PM · Restricted Project
tra committed rG0c06a389e593: [CUDA,clang-cl] Filter out unsupported arguments for device-side compilation. (authored by tra).
[CUDA,clang-cl] Filter out unsupported arguments for device-side compilation.
Wed, Mar 11, 2:08 PM
tra closed D75310: [CUDA,clang-cl] Filter out unsupported arguments for device-side compilation..
Wed, Mar 11, 2:07 PM · Restricted Project
tra added inline comments to D73231: [CUDA] Assume the latest known CUDA version if we've found an unknown one..
Wed, Mar 11, 10:45 AM · Restricted Project

Tue, Mar 10

tra added a comment to D75817: [NVPTX] Fix instruction selection for addresses in case of addrspacecasts.

I did some more testing, and it appears that the InstCombine pass (opt -instcombine) is responsible for this.
Maybe it would be better to ensure InstCombine doesn't reorder instead of fixing this in instruction selection as I do now?
I'm not sure if that would influence other backends that rely on this reordering, though.

Tue, Mar 10, 9:12 AM · Restricted Project

Mon, Mar 9

tra added a comment to D75811: [CUDA] Choose default architecture based on CUDA installation.

Magically changing compiler target based on something external to compiler is a bad idea IMO. I would expect a compilation with exactly the same compiler options to do exactly the same thing. If we magically change default target, that will not be the case.

It'd be the same behaviour as NVCC, which compiles for the lowest architecture it supports.

Mon, Mar 9, 2:35 PM · Restricted Project
tra requested changes to D75817: [NVPTX] Fix instruction selection for addresses in case of addrspacecasts.

While such reordering may be beneficial in general, GEP(ASC(x)) is not always equivalent to ASC(GEP(x)), so we can't just blindly do it. I believe this has been discussed a few times in the past, though I can't find the relevant emails now.

Mon, Mar 9, 10:46 AM · Restricted Project
tra requested changes to D75806: [CUDA] Add CUDA 10.2 detection.

This looks like an elaborate way to achieve the effect of -Wno-unknown_cuda_version. :-)

Mon, Mar 9, 10:13 AM · Restricted Project
tra added a comment to D75811: [CUDA] Choose default architecture based on CUDA installation.

I'm not sure that's the problem worth solving.

Mon, Mar 9, 9:41 AM · Restricted Project
tra accepted D68578: [HIP] Fix device stub name.

Few nits. LGTM otherwise.

Mon, Mar 9, 9:41 AM · Restricted Project
tra added a comment to D75788: [OpenMP] Provide math functions in OpenMP device code via OpenMP variants.

Couple of nits below. LGTM for CUDA headers otherwise.

Mon, Mar 9, 9:08 AM · Restricted Project

Feb 27 2020

tra created D75310: [CUDA,clang-cl] Filter out unsupported arguments for device-side compilation..
Feb 27 2020, 3:43 PM · Restricted Project

Feb 25 2020

tra added a comment to D75001: [OpenMP][cmake] ignore warning on unknown CUDA version .

@tra Will it also include -fopenmp-targets=nvptx64-nvidia-cuda?

Feb 25 2020, 12:10 PM · Restricted Project
tra added a comment to D75001: [OpenMP][cmake] ignore warning on unknown CUDA version .

I like this way better. I was hoping we could do it in our cmake only :)

Give others a day or so to comment before your commit but I'm fine with this.

Well, that doesn't really work if openmp-commits is only subscribed on commit. That said, the solution is a bit ugly but I don't have an alternative right now.

Somewhat related, that means Clang issues a warning for every compilation should there be an "unsupported" CUDA version around, even if it's not used? @tra maybe we can only issue the warning if CUDA is going to be used?

Feb 25 2020, 10:00 AM · Restricted Project

Feb 19 2020

tra added inline comments to D68578: [HIP] Fix device stub name.
Feb 19 2020, 9:21 AM · Restricted Project

Feb 18 2020

tra added inline comments to D68578: [HIP] Fix device stub name.
Feb 18 2020, 10:25 AM · Restricted Project
tra added a comment to D68578: [HIP] Fix device stub name.

Nice! Thank you for making these changes.

Feb 18 2020, 9:03 AM · Restricted Project

Feb 14 2020

tra added a comment to D74571: [OpenMP][CUDA] Add CUDA 10.2 support.

That sounds like the right approach for OpenMP. We require a minimal CUDA version, based on what we use internally, but no maximal version if possible since we don't use new features anyway.

Feb 14 2020, 10:48 AM

Feb 13 2020

tra added a comment to D74571: [OpenMP][CUDA] Add CUDA 10.2 support.

Interesting distinction.

Should compiling without warning indicate comprehensive support, or merely that we ran the tests and they passed?

Feb 13 2020, 3:54 PM
tra committed rG019ab61e25f2: [NVPTX, LSV] Move the LSV optimization pass to later when the graph is cleaner (authored by nouiz).
[NVPTX, LSV] Move the LSV optimization pass to later when the graph is cleaner
Feb 13 2020, 12:24 PM
tra closed D74444: [NVPTX, LSV] Move the LSV optimization pass to later when the graph is cleaner.
Feb 13 2020, 12:24 PM · Restricted Project
tra added a comment to D74571: [OpenMP][CUDA] Add CUDA 10.2 support.

Do the in tree tests all pass with the 10.2 toolchain? That's not exactly the same as whether it works but is the closest approximation we have available.

Assuming yes, this patch seems uncontroversial.

Yes, in tree tests pass with 10.2.

Feb 13 2020, 12:05 PM
tra requested changes to D74571: [OpenMP][CUDA] Add CUDA 10.2 support.

It's a bit premature to call CUDA-10.2 supported. We can compile using it, but clang/llvm has no support for the new things introduced by CUDA-10.2.
E.g. CUDA-10.2 introduces new PTX version with new instructions (and matching clang builtins)
https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#ptx-isa-version-6-5

Feb 13 2020, 11:38 AM

Feb 12 2020

tra accepted D74444: [NVPTX, LSV] Move the LSV optimization pass to later when the graph is cleaner.
Feb 12 2020, 1:47 PM · Restricted Project
tra accepted D74444: [NVPTX, LSV] Move the LSV optimization pass to later when the graph is cleaner.

Minor test nits. LGTM otherwise.
I'm curious: what does the generated PTX for the function look like before and after the patch?

Feb 12 2020, 12:24 PM · Restricted Project

Feb 11 2020

tra added inline comments to D74444: [NVPTX, LSV] Move the LSV optimization pass to later when the graph is cleaner.
Feb 11 2020, 2:49 PM · Restricted Project
tra added a comment to D74078: [Clang] When using SEH, create a impl instance for CrashRecoveryContext. NFCI..

Tensorflow folks report that the __try() here does not compile on windows:

ERROR: T:/tmp/nsz6drem/external/llvm-project/llvm/BUILD:3716:1: C++ compilation of rule '@llvm-project//llvm:support' failed (Exit 2)
external/llvm-project/llvm/lib/Support/CrashRecoveryContext.cpp(218): error C2712: Cannot use __try in functions that require object unwinding
Feb 11 2020, 1:36 PM · Restricted Project

Feb 10 2020

tra accepted D74360: [llvm][TableGen] Define FieldInit::isConcrete overload.

LGTM to unblock our build/integration.

Feb 10 2020, 4:04 PM · Restricted Project
tra added a comment to D74360: [llvm][TableGen] Define FieldInit::isConcrete overload.

LGTM, but wait a bit before landing in case @nhaehnle has concerns.

Feb 10 2020, 3:10 PM · Restricted Project
tra added a comment to D74339: Make .rodata* and .eh_frame* the last of all PROGBITS sections..

+@tra because of D47396

What types of relocations are from/to .nv_fatbin?

Feb 10 2020, 12:45 PM · Restricted Project

Feb 4 2020

tra added a reviewer for D73979: [HIP] Allow non-incomplete array type for extern shared var: rsmith.
Feb 4 2020, 2:05 PM
tra updated subscribers of D73979: [HIP] Allow non-incomplete array type for extern shared var.

All extern shared vars are sharing the same address, however, they may be used as different types in different functions.

For example,

__device__ int foo() {
  extern __shared__ int a;
  for (...) a+=...;
  return a;
}

__device__ double bar(int x) {
  extern __shared__ double b[10];
  for(...) b[x]+=...;
  return b[0];
}

__global__ void k() {
  foo();
  //...
  bar();
}
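
The aliasing in the example above can be modelled on the host. This is a rough sketch under stated assumptions: DynamicShared, fooView and barView are hypothetical names, the byte buffer merely stands in for a block's dynamically sized shared memory, and the loops elided in the original are replaced by a single update.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical host-side stand-in for a block's dynamic shared memory.
// On a GPU the size would come from the kernel launch configuration.
struct DynamicShared {
  std::vector<unsigned char> bytes;
  explicit DynamicShared(std::size_t n) : bytes(n, 0) {}

  // Every "extern __shared__" view starts at offset 0 of the same buffer.
  // memcpy is used so that reinterpreting the bytes stays well defined.
  template <typename T> T load(std::size_t index) const {
    assert((index + 1) * sizeof(T) <= bytes.size());
    T v;
    std::memcpy(&v, bytes.data() + index * sizeof(T), sizeof(T));
    return v;
  }
  template <typename T> void store(std::size_t index, T v) {
    assert((index + 1) * sizeof(T) <= bytes.size());
    std::memcpy(bytes.data() + index * sizeof(T), &v, sizeof(T));
  }
};

// Models foo(): treats the start of the buffer as a single int.
inline int fooView(DynamicShared &s) {
  s.store<int>(0, s.load<int>(0) + 1);
  return s.load<int>(0);
}

// Models bar(): treats the same bytes as an array of doubles.
inline double barView(DynamicShared &s, std::size_t x) {
  s.store<double>(x, s.load<double>(x) + 1.0);
  return s.load<double>(0);
}
```

Because both views start at the same address, a write through fooView changes the bytes that barView later reads as b[0], which is the overlap the comment describes.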
Feb 4 2020, 2:05 PM
tra accepted D73942: [hip] Properly populate macros based on host processor..

Thank you for adding the escape hatch option.

Feb 4 2020, 12:23 PM · Restricted Project
tra added a comment to D73979: [HIP] Allow non-incomplete array type for extern shared var.

Based on CUDA usage of extern shared var (https://devblogs.nvidia.com/using-shared-memory-cuda-cc/), it seems CUDA also assumes all extern shared vars have the same address, therefore HIP and CUDA have similar behavior.

Feb 4 2020, 12:12 PM
tra added a comment to D73979: [HIP] Allow non-incomplete array type for extern shared var.

A better description for the change would be helpful.

Feb 4 2020, 10:58 AM
tra added a comment to D73942: [hip] Properly populate macros based on host processor..

On one hand the change makes sense to me and fits well with what we've done so far.
On the other hand, I worry that this is likely to break things.
We sort of have been implicitly relying on not having the macros related to advanced CPU features enabled on device side which typically results in device-side compilation seeing a more portable/simpler version of the code.
Defining the macros that enable more host-side CPU-specific code may trigger interesting compatibility issues. Postponed diagnostics combined with ignoring (some) errors in the wrong-side-only code should probably deal with most of them, but I suspect we'll see new interesting failure cases. We would not know until we try.

Feb 4 2020, 10:21 AM · Restricted Project

Jan 29 2020

tra accepted D73651: [OpenCL][CUDA][HIP][SYCL] Add norecurse.

LGTM for CUDA.

Jan 29 2020, 11:31 AM · Restricted Project

Jan 28 2020

tra updated subscribers of D73231: [CUDA] Assume the latest known CUDA version if we've found an unknown one..

@hans : that's another candidate for 10.x cherry-pick, if you're OK with it.

Jan 28 2020, 10:30 AM · Restricted Project
tra updated the summary of D73231: [CUDA] Assume the latest known CUDA version if we've found an unknown one..
Jan 28 2020, 10:30 AM · Restricted Project
tra committed rG12fefeef203a: [CUDA] Assume the latest known CUDA version if we've found an unknown one. (authored by tra).
[CUDA] Assume the latest known CUDA version if we've found an unknown one.
Jan 28 2020, 10:21 AM
tra closed D73231: [CUDA] Assume the latest known CUDA version if we've found an unknown one..
Jan 28 2020, 10:21 AM · Restricted Project

Jan 27 2020

tra updated the diff for D73231: [CUDA] Assume the latest known CUDA version if we've found an unknown one..

Use std::string instead of Twine which can't be stored.

Jan 27 2020, 4:46 PM · Restricted Project

Jan 24 2020

tra accepted D73299: [HIP] Fix environment variable HIP_DEVICE_LIB_PATH.
Jan 24 2020, 5:21 PM · Restricted Project
tra committed rG0df13627c6a4: [CUDA] Fix order of memcpy arguments in __shfl_*(<64-bit type>). (authored by tra).
[CUDA] Fix order of memcpy arguments in __shfl_*(<64-bit type>).
Jan 24 2020, 3:14 PM
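
For context on the commit title above, here is a minimal host-side sketch of how a 64-bit value can be passed through a 32-bit-only shuffle by splitting it into two halves with memcpy. It is not the code from clang's CUDA headers: shuffle32 and shuffle64 are hypothetical names, and shuffle32 simply returns its input, where a real __shfl_* would read from another lane. The point is the argument order, memcpy(destination, source, size).

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for a 32-bit warp shuffle. A real __shfl_* reads the
// value from another lane; this host-side model just returns its input.
inline int shuffle32(int v) { return v; }

// Moves a 64-bit value through a 32-bit-only primitive by splitting it into
// two halves and reassembling the result.
inline long long shuffle64(long long val) {
  struct Bits { int a, b; };
  static_assert(sizeof(Bits) == sizeof(long long),
                "two ints must cover the 64-bit value");
  Bits tmp;
  std::memcpy(&tmp, &val, sizeof(val));  // destination first: fill tmp from val
  tmp.a = shuffle32(tmp.a);
  tmp.b = shuffle32(tmp.b);
  long long ret;
  std::memcpy(&ret, &tmp, sizeof(tmp));  // destination first: fill ret from tmp
  return ret;
}
```

With the identity shuffle, shuffle64 returns its argument unchanged. Swapping the first two arguments of either memcpy would overwrite the input or read an uninitialized value instead.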