- User since: Jan 8 2015, 1:53 PM (390 w, 6 d)
Thu, Jun 30
Syntax/style looks OK to me with a few nits.
Wed, Jun 29
Tue, Jun 28
Do we have tests that verify -target-feature arguments? It may be worth adding a test case there checking for redundant features.
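For illustration, the "last occurrence wins" behavior that makes earlier duplicate features redundant could be sketched like this (a hypothetical helper for exposition, not clang's actual implementation):

```python
def resolve_target_features(args):
    """Collapse repeated -target-feature style flags ("+name"/"-name").

    For each feature name, only the last +/- occurrence takes effect,
    so any earlier occurrence of the same feature is redundant.
    Illustrative sketch only; not clang's actual code.
    """
    resolved = {}
    for arg in args:
        sign, name = arg[0], arg[1:]
        resolved[name] = sign  # a later occurrence overwrites an earlier one
    # dicts preserve insertion order, so the first position of each
    # feature is kept while its final sign is used
    return [sign + name for name, sign in resolved.items()]

# "+ptx70" appears twice; the first occurrence is redundant.
features = resolve_target_features(["+ptx70", "-sm_70", "+ptx70"])
```

A test along these lines could flag command lines where the resolved list is shorter than the input, i.e. where redundant features were passed.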
We will no longer have a cached CUDA installation, so we will usually create it twice.
Mon, Jun 27
@nikic -- we could use your input here. The patch looks OK to me, but I'm not particularly familiar with AA machinery in general and could use a second opinion.
Thu, Jun 23
The linker wrapper cannot do anything with these embedded PTX files because we do not know how to link them.
Wed, Jun 22
LGTM with a nit.
Very nice. LGTM.
Tue, Jun 21
Good catch! Thank you for fixing this.
Thu, Jun 16
Playing devil's advocate, I've got to ask -- do we even want to support JIT?
LGTM in general.
Wed, Jun 15
Tue, Jun 14
AFAICT, the test case you've added works fine with the compiler at HEAD: https://cuda.godbolt.org/z/q3xYMfdeb
I guess it only shows up in assertion-enabled builds. Can you check what happens if you run the test case compiled as CUDA? I suspect it will have the same issue -- we don't override setAuxTarget at all.
Do you have a practical use case where this feature would be needed?
Mon, Jun 13
Good catch. Thank you for fixing this.
LGTM overall with a couple of nits.
Fri, Jun 10
Thu, Jun 9
I think the root cause of the problem here is that CUDA compilation assumes that the GPU-side looks identical to the host side of the compilation. The test was intended to verify that and, AFAICT, it did its job flagging the issue here.
Wed, Jun 8
Jun 6 2022
Looks OK syntax-wise. The library path should probably be fixed, though that appears to be a somewhat separate issue and could be done in its own patch if the required change is not trivial.
Jun 3 2022
clang: error: unable to execute command: Executable "clang-offload-packager" doesn't exist!
Jun 2 2022
This patch only supports printing these sections; later we will want to support dumping files the user may be interested in via another flag. I am unsure whether this belongs in llvm-objdump or llvm-objcopy.
Jun 1 2022
LGTM in principle, with a few nits.
May 31 2022
May 25 2022
How much work would it take to add CUID generation to the new driver, using the same logic the old driver does, however imperfect it is? I'd be OK with that as a possibly permanent solution.
Currently, we infer the use of the device linker from the user specifying an offloading toolchain, e.g. --offload-arch=... or -fopenmp-targets=..., but this shouldn't be strictly necessary.
Is this patch in its current form blocking any of your other work? The no-cuid approach, even if we figure out how to do it, will likely take some time. Do you need an interim solution until then?
LGTM. Do you need me to land it?
May 24 2022
I'm still itching to figure out a way to avoid CUID altogether, and with the new driver it may be possible.
CUID serves two purposes:
a) avoid name conflicts during device-side linking ("must be globally unique" part)
b) allow host to refer to something in the GPU executable ("stable within TU" part)
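One way to satisfy both requirements is to derive the ID deterministically from the TU's identity, e.g. a hash over the input path and the compilation options. This is a sketch of the general idea only (hypothetical function, not the driver's actual algorithm):

```python
import hashlib

def compute_cuid(input_path, compile_options):
    """Derive a compilation unit ID.

    Deterministic hashing makes the ID stable within a TU (same inputs
    always produce the same ID, so host and device sides agree), while
    hashing the path and options makes collisions between different TUs
    unlikely. Illustrative sketch, not the actual driver logic.
    """
    h = hashlib.sha256()
    h.update(input_path.encode())
    for opt in compile_options:
        h.update(opt.encode())
    return h.hexdigest()[:16]

cuid = compute_cuid("kernel.cu", ["-O2", "--offload-arch=sm_70"])
```

Note the trade-off: a content- or path-derived ID is reproducible across rebuilds, whereas a random ID guarantees global uniqueness but breaks build determinism.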
I would suggest splitting it into separate LLVM and MLIR patches.
May 23 2022
Couple of nits. LGTM otherwise.
The one downside is that the output is no longer stable across multiple compilations of the same file.
The syntax is confusing. Normally only the triple would be the argument of the -Xoffload-linker option. Having both -Xoffload-linker and -Xoffload-linker= variants also looks odd to me.
Looks like it's a wash -- public API is smaller, but the implementations didn't really benefit much. If anything, they got a bit less readable.
May 20 2022
My original suggestion, which resulted in this patch, was to generalize the API to a single function accepting an Instruction* and let the target deal with the details.
For many targets there's a fair amount of duplicated code across the per-instruction shouldExpandX implementations, and using a single call would let us consolidate the atomic expansion logic in fewer places.
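The consolidation idea can be illustrated roughly as follows (hypothetical names and a toy instruction type; the real TargetLowering hooks differ):

```python
from collections import namedtuple

# Toy stand-in for an IR instruction: its kind and, for RMW ops, the operation.
Inst = namedtuple("Inst", "kind op")

class Target:
    """A single hook receives the whole instruction, so the target's
    atomic-expansion policy lives in one place instead of being spread
    over separate shouldExpandAtomicRMW/Load/Store-style variants.
    Hypothetical sketch; not LLVM's actual TargetLowering API.
    """
    def should_expand_atomic(self, inst):
        # All per-instruction decisions are made here, in one switch.
        if inst.kind == "atomicrmw" and inst.op == "fadd":
            return "cmpxchg_loop"  # no native FP atomic add: emulate it
        return "none"              # everything else is handled natively

target = Target()
decision = target.should_expand_atomic(Inst("atomicrmw", "fadd"))
```

The design choice is the usual narrow-vs-wide interface trade-off: one wide hook shrinks the public API and centralizes each target's policy, at the cost of making every implementation dispatch on the instruction itself.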
May 19 2022
May 18 2022
LGTM with a possible nit.
May 17 2022
@echristo - FYI, in case you have an opinion on target builtin feature checks.
My concerns have been addressed. I'll defer the final LGTM to @Anastasia.
May 16 2022
What's the minimum SM version we currently support? I found in one of your patches (https://reviews.llvm.org/D24943) that there was a function hasAtomAddF32, but now I can't find it. Was it removed because we now require a minimum SM version, such as SM20?
Oops, I forgot to hit 'submit' last week, so there's some overlap with Aaron's question.
May 13 2022
May 11 2022
Good catch. LGTM.
May 10 2022
Please add a TODO around the items that need further work, so we don't forget about them. Review comments tend to fade from memory.
LGTM in principle.