This is an archive of the discontinued LLVM Phabricator instance.

[OpenMP] Add options to only compile the host or device when offloading
ClosedPublic

Authored by jhuber6 on Apr 21 2022, 5:41 PM.

Details

Summary

OpenMP recently moved to the new offloading driver, this had the effect
of making it more difficult to inspect intermediate code for the device.
This patch adds -foffload-host-only and -foffload-device-only to
control which sides get compiled. This will allow users to more easily
inspect output without needing the temp files.

Diff Detail

Event Timeline

jhuber6 created this revision.Apr 21 2022, 5:41 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 21 2022, 5:41 PM
Herald added a subscriber: guansong. · View Herald Transcript
jhuber6 requested review of this revision.Apr 21 2022, 5:41 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 21 2022, 5:41 PM
jhuber6 updated this revision to Diff 424449.Apr 22 2022, 5:38 AM

Moving some stuff.

jhuber6 updated this revision to Diff 425214.Apr 26 2022, 7:38 AM

Small change, returning an empty action type should let us stop without an extra check.

tra added inline comments.Apr 26 2022, 11:46 AM
clang/include/clang/Driver/Options.td
2535–2538

We should probably alias --cuda-host-only/--cuda-device-only to these.

ping

clang/include/clang/Driver/Options.td
2535–2538

Unfortunately I can't make these a strict alias for -foffload-host-only/-foffload-device-only without breaking their current usage. I could just check them along with these new versions, but that makes it a little cluttered.

tra added inline comments.Apr 26 2022, 1:48 PM
clang/include/clang/Driver/Options.td
2535–2538

Could you give me an example why we can use just one set of options to control host-only and device-only compilation?

Is that because we also have --cuda-compile-host-device?
Or is it because -f options have -fno- variants to override them, while --cuda-host/device-only don't, so we can't just mark them as an alias in Options.td?
To think of it, those are just facets of the same issue -- difference in the options syntax.
I do not see semantic differences between the existing options and -foffload-*-only.

E.g. I believe you could've used --cuda-host-only and --cuda-device-only instead of -foffload-*-only without the loss of functionality of your patch.

jhuber6 added inline comments.Apr 26 2022, 2:00 PM
clang/include/clang/Driver/Options.td
2535–2538

I could have used those options, but this applies in general to the new driver, so it wouldn't be intuitive if we used --cuda-device-only when offloading to x86_64 with OpenMP for example. The problem is that the --cuda-device-only is already used in the current offloading driver so if I just make it an alias to this option it won't work. I was hoping to avoid touching the old driver, but I guess I could go back and replace all uses of --cuda-host/device-only with these new options. Also I did forget to add the fno- variants for these and use them correctly. I'm not sure if there's an explicit reason to, but everything else in the OpenMP world uses -f prefixes so I'm mostly just sticking with that.

jhuber6 updated this revision to Diff 425301.Apr 26 2022, 2:13 PM

Adding -fno variants and replacing the uses of --cuda-host-only with and alias and this flag.

tra added inline comments.Apr 26 2022, 2:28 PM
clang/include/clang/Driver/Options.td
2535–2538

it wouldn't be intuitive if we used --cuda-device-only when offloading to x86_64 with OpenMP

On the other hand, adding another set of options replicating the functionality creates more opportunities for conflicting usage. E.g. what should we do if user specifies --cuda-device-only and -foffload-device-only ? We then need to diagnose that in a sensible way, at the very least.

In general, we have been transitioning some options that used to be CUDA-specific into more general variants. Usually --cuda* -> --offload* (--cuda-gpu-arch->--offload-arch) or --gpu*. (-fcuda-rdc -> -fgpu-rdc)
Picking the offloading side should follow the same pattern.

Speaking of names and renaming. Perhaps -f is not the best choice here. -f options are conventionally used to control code generation parameters. -foffload-*-only mostly controls the driver behavior. Perhaps it would make sense to call them --offload-host-only, --offload-device-only and --offload-host-device and alias them to the matching --cuda* counterparts (or vice-versa + search/replace their use in the code).

jhuber6 updated this revision to Diff 425336.Apr 26 2022, 4:28 PM

Updating to not use -f/-fno versions.

tra added inline comments.Apr 26 2022, 4:46 PM
clang/include/clang/Driver/Options.td
2535

We should be using "--" for the new options.

clang/lib/Driver/Driver.cpp
4205

Comment may need updating.

4215

This will not always be correct. E.g. --offload-host-only --offload-host-device would be true here, but we would still want to compile for both and device.

Is there a reason we can no longer rely on HostAction?

4223

This could use a comment. I don't quite understand what we're doing here and why.
Only doing host-side preprocessing if -E is passed?

4278

Same problem as with the host-checking above.

jhuber6 added inline comments.Apr 26 2022, 4:49 PM
clang/include/clang/Driver/Options.td
2535

What's the reason for preferring -- over -? Is it just because -o can bind to output?

clang/lib/Driver/Driver.cpp
4215

Guess I should just use the last argument. Host action is used below, can probably merge these if statements.

4223

OpenMP requires host IR for the device compile, but -E doesn't generate one, but we don't use it for preprocessing.

jhuber6 updated this revision to Diff 425344.Apr 26 2022, 5:04 PM

Making suggested changes.

tra added inline comments.Apr 27 2022, 12:16 PM
clang/include/clang/Driver/Options.td
2535

Convention, I guess. Legacy (e.g. -nostdlib ) and single-letter options (e.g. -o -m, -f ) use single dash. Long options typically use double-dash.

clang/lib/Driver/Driver.cpp
4215

I guess the general idea is to avoid the ambiguity about what controls the behavior of the function. Is that the Args, or the HostAction?
Ideally I'd prefer to parse command line options once, save results somewhere we could use them and then use those flargs to control the behavior, regardless of which options were used to specify it. E.g. CUDA/HIPActionBuilder classes have these member fields:

  bool CompileHostOnly = false;
  bool CompileDeviceOnly = false;

...
      Arg *PartialCompilationArg = Args.getLastArg(
          options::OPT_cuda_host_only, options::OPT_cuda_device_only,
          options::OPT_cuda_compile_host_device);
      CompileHostOnly = PartialCompilationArg &&
                        PartialCompilationArg->getOption().matches(
                            options::OPT_cuda_host_only);
      CompileDeviceOnly = PartialCompilationArg &&
                          PartialCompilationArg->getOption().matches(
                              options::OPT_cuda_device_only);
...
jhuber6 added inline comments.Apr 27 2022, 12:37 PM
clang/lib/Driver/Driver.cpp
4215

I more or less copied that below, but I suppose we are recalculating them each phase. We don't have a monolithic class with my new implementation, but I suppose I could add these to the Compilation or something if we don't want to recalculate it. Beside that, does anything else seem amiss?

tra accepted this revision.Apr 27 2022, 2:40 PM

LGTM with a minor nit.

clang/lib/Driver/Driver.cpp
4225

Nit: (!isa<CompileJobAction>(HostAction) && PL.back() != phases::Preprocess) -> !(isa<CompileJobAction>(HostAction) || PL.back() == phases::Preprocess)

It's a bit easier to understand that way, IMO. We could also return early if HostOnly is set and make this condition simpler.

This revision is now accepted and ready to land.Apr 27 2022, 2:40 PM
jhuber6 updated this revision to Diff 426049.Apr 29 2022, 7:15 AM

Rebase after landing the cuda support for the new driver.