OpenMP recently moved to the new offloading driver, this had the effect
of making it more difficult to inspect intermediate code for the device.
This patch adds -foffload-host-only and -foffload-device-only to
control which sides get compiled. This will allow users to more easily
inspect output without needing the temp files.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Small change, returning an empty action type should let us stop without an extra check.
clang/include/clang/Driver/Options.td | ||
---|---|---|
2529–2532 | We should probably alias --cuda-host-only/--cuda-device-only to these. |
ping
clang/include/clang/Driver/Options.td | ||
---|---|---|
2529–2532 | Unfortunately I can't make these a strict alias for -foffload-host-only/-foffload-device-only without breaking their current usage. I could just check them along with these new versions, but that makes it a little cluttered. |
clang/include/clang/Driver/Options.td | ||
---|---|---|
2529–2532 | Could you give me an example why we can use just one set of options to control host-only and device-only compilation? Is that because we also have --cuda-compile-host-device? E.g. I believe you could've used --cuda-host-only and --cuda-device-only instead of -foffload-*-only without the loss of functionality of your patch. |
clang/include/clang/Driver/Options.td | ||
---|---|---|
2529–2532 | I could have used those options, but this applies in general to the new driver, so it wouldn't be intuitive if we used --cuda-device-only when offloading to x86_64 with OpenMP for example. The problem is that the --cuda-device-only is already used in the current offloading driver so if I just make it an alias to this option it won't work. I was hoping to avoid touching the old driver, but I guess I could go back and replace all uses of --cuda-host/device-only with these new options. Also I did forget to add the fno- variants for these and use them correctly. I'm not sure if there's an explicit reason to, but everything else in the OpenMP world uses -f prefixes so I'm mostly just sticking with that. |
Adding -fno variants and replacing the uses of --cuda-host-only with and alias and this flag.
clang/include/clang/Driver/Options.td | ||
---|---|---|
2529–2532 |
On the other hand, adding another set of options replicating the functionality creates more opportunities for conflicting usage. E.g. what should we do if user specifies --cuda-device-only and -foffload-device-only ? We then need to diagnose that in a sensible way, at the very least. In general, we have been transitioning some options that used to be CUDA-specific into more general variants. Usually --cuda* -> --offload* (--cuda-gpu-arch->--offload-arch) or --gpu*. (-fcuda-rdc -> -fgpu-rdc) Speaking of names and renaming. Perhaps -f is not the best choice here. -f options are conventionally used to control code generation parameters. -foffload-*-only mostly controls the driver behavior. Perhaps it would make sense to call them --offload-host-only, --offload-device-only and --offload-host-device and alias them to the matching --cuda* counterparts (or vice-versa + search/replace their use in the code). |
clang/include/clang/Driver/Options.td | ||
---|---|---|
2529 | We should be using "--" for the new options. | |
clang/lib/Driver/Driver.cpp | ||
4206–4207 | Comment may need updating. | |
4312–4323 | This will not always be correct. E.g. --offload-host-only --offload-host-device would be true here, but we would still want to compile for both and device. Is there a reason we can no longer rely on HostAction? | |
4330 | This could use a comment. I don't quite understand what we're doing here and why. | |
4404 | Same problem as with the host-checking above. |
clang/include/clang/Driver/Options.td | ||
---|---|---|
2529 | What's the reason for preferring -- over -? Is it just because -o can bind to output? | |
clang/lib/Driver/Driver.cpp | ||
4312–4323 | Guess I should just use the last argument. Host action is used below, can probably merge these if statements. | |
4330 | OpenMP requires host IR for the device compile, but -E doesn't generate one, but we don't use it for preprocessing. |
clang/include/clang/Driver/Options.td | ||
---|---|---|
2529 | Convention, I guess. Legacy (e.g. -nostdlib ) and single-letter options (e.g. -o -m, -f ) use single dash. Long options typically use double-dash. | |
clang/lib/Driver/Driver.cpp | ||
4312–4323 | I guess the general idea is to avoid the ambiguity about what controls the behavior of the function. Is that the Args, or the HostAction? bool CompileHostOnly = false; bool CompileDeviceOnly = false; ... Arg *PartialCompilationArg = Args.getLastArg( options::OPT_cuda_host_only, options::OPT_cuda_device_only, options::OPT_cuda_compile_host_device); CompileHostOnly = PartialCompilationArg && PartialCompilationArg->getOption().matches( options::OPT_cuda_host_only); CompileDeviceOnly = PartialCompilationArg && PartialCompilationArg->getOption().matches( options::OPT_cuda_device_only); ... |
clang/lib/Driver/Driver.cpp | ||
---|---|---|
4312–4323 | I more or less copied that below, but I suppose we are recalculating them each phase. We don't have a monolithic class with my new implementation, but I suppose I could add these to the Compilation or something if we don't want to recalculate it. Beside that, does anything else seem amiss? |
LGTM with a minor nit.
clang/lib/Driver/Driver.cpp | ||
---|---|---|
4332 | Nit: (!isa<CompileJobAction>(HostAction) && PL.back() != phases::Preprocess) -> !(isa<CompileJobAction>(HostAction) || PL.back() == phases::Preprocess) It's a bit easier to understand that way, IMO. We could also return early if HostOnly is set and make this condition simpler. |
We should be using "--" for the new options.