User Details
- User Since
- Jan 8 2015, 1:53 PM (315 w, 4 d)
Today
LGTM
Thu, Jan 21
LGTM.
Wed, Jan 20
Added a test for the corner case Richard has pointed out in the comments.
Removed unneeded changes.
Addressed Richard's comments.
Tue, Jan 19
LGTM for __clang_cuda_builtin_vars.h.
Presumably, __managed__ variables would have to be memory-mapped into the host address space.
I'd propose splitting the patch into two. One with the addition of CUID and the other that changes the way we havdle static vars.
CUID is useful on its own and is relatively uncontroversial.
Thu, Jan 14
Do you have any measurements showing actual impact on performance or memory consumption?
Go ahead.
Wed, Jan 13
For CUDA we have tests in the test-suite (https://github.com/llvm/llvm-test-suite/tree/main/External/CUDA) and a handful of buildbots running them (e.g. http://lab.llvm.org:8011/#/builders/55).
AMD should probably set up some public build/test bots for HIP, too.
In this case, manual testing and rollback if something breaks is about all we can do.
Mon, Jan 11
This adds a cuda header type with file extension "cuh". The output type file
extension is "cuhi" - not sure if this is a good choice. This allows
language servers to properly handle cuh files without additional arguments.
Thu, Jan 7
LGTM
Wed, Jan 6
LGTM overall.
LGTM.
Tue, Jan 5
There's --cuda-noopt-device-debug option specifically to allow compiling GPU code with full debug info. Clang will generate optimized PTX, but ptxas optimizations will be disabled.
LGTM.
Dec 15 2020
@jdoerfert -- do you happen to know how archives with bitcode are used by OpenMP? Does OpenMP ever link just the archive alone w/o any other bitcode files?
The change look OK. But we should still wait for @jsjodin to confirm that the first-file-is-different for archive files is unintentional.
Dec 14 2020
The patch could use an OK with OMP folks, considering that we've changed the way offload bunder is invoked for OMP.
Use device in the test case.
Dec 10 2020
My first variant of the patch only helped with some cases when the surface/texture attribute type was used.
Trying to reduce real-world failure resulted in an example that I've added as the test case which was still failing with this patch applied.
Found another corner case (reference within a template with the surface/texture
attibute.) and figured out a better fix.
Dec 9 2020
Reorganized tests for unsupported debug options & dwarf version clamping.
llvm::Linker::linkModules() is a static member, so there is no need
to pass reference to llvm::Linker instance to loadArFile() function.
Dec 8 2020
It appears that we need to add special handling for texture/surface references. Nominally they are host-side objects, but they are accessed/used from device functions as far as Sema is concerned.
The patch description describes what the patch does, but does not tell us much about the problem is is supposed to fix.
Could you give us more details on why the patch is needed?
Dec 7 2020
Adjusted openmp test for the changed -gembed-source warning.
Addressed comments.
Updated to address the comments. PTAL.
-E by default prints preprocessed output to stdout. CUDA will print preprocessed output from all subcompilations. What does HIP do in this case? Printing out the bundle is probably not what the user will expect.
IMO preprocessed output is frequently used as a debugging tool, so it's important for users to be able to read it. Bundled output is rather cumbersome to deal with. It's possible to manually unbundle it, but the tool is not documented well and it's not particularly suitable for human use.
Dec 4 2020
Removed forgotten pop_macro(DEVICE)
@jlebar PTAL. The previous fix did not quite fix the problem. __device__ operator new/delete was still not defined, but the issue was not reported due to deferred diags if the functions were not used.
Better fix for the early inclusion of <new>
Reopening the tracker for a better fix.
Simplified dwarf version clamping.
Dec 3 2020
Updated according to Devid's feedback.
Dec 2 2020
Dec 1 2020
I'd suggest adding more details on the background of this change to the commit log (point to the comment in the isBetterOverloadCandidate ?) and outline the intention to enable the new way to do overloading after some soak time.
While I agree that the default GPU choice is not likely to be correct, or usable, for everyone, but the warning seems to be a half-measure.
If the default is not usable, then it should not be the default. If it's usable, then we don't need a warning.
Nov 30 2020
LGTM in general.