This is an archive of the discontinued LLVM Phabricator instance.

[Clang] Make offloading flags accept '-' and '--'
ClosedPublic

Authored by jhuber6 on Oct 3 2022, 7:05 AM.

Details

Summary

Currently all of the flags beginning with --offload such as
--offload-arch or --offload-device-only require the double-dash
form. However, if a user uses a single dash it will instead name a file
'ffload-arch' for example. This has been the cause of a lot of user
confusion. This patch changes these options to also accept a single
dash. This is similar to many other driver arguments beginning with
-o.

Diff Detail

Event Timeline

jhuber6 created this revision.Oct 3 2022, 7:05 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 3 2022, 7:05 AM
Herald added a subscriber: StephenFan. · View Herald Transcript
jhuber6 requested review of this revision.Oct 3 2022, 7:05 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 3 2022, 7:05 AM
Herald added a subscriber: cfe-commits. · View Herald Transcript
JonChesterfield accepted this revision.Oct 3 2022, 7:08 AM

Like that a lot, good quality of life improvement.

This revision is now accepted and ready to land.Oct 3 2022, 7:08 AM
This revision was automatically updated to reflect the committed changes.
MaskRay added a comment.EditedOct 3 2022, 9:00 AM

Single-dash long options starting with -o conflict with the short option -o so I am unsure I like this direction.

However, if a user uses a single dash it will instead name a file 'ffload-arch' for example.

That's intended. A user should be able to specify an output name ffload-arch with -offload-arch.

This is similar to many other driver arguments beginning with -o.

What options?

I think this should be reverted. It's moving in the backward direction.

Single-dash long options starting with -o conflict with the short option -o so I am unsure I like this direction.

However, if a user uses a single dash it will instead name a file 'ffload-arch' for example.

That's intended. A user should be able to specify an output name ffload-arch with -offload-arch.

I made this patch primarily because I've gotten tired of getting messages from users wondering why their code wasn't offloading to the correct architecture only when they used -offload-arch=. This is something that's not clearly obvious to users and I think it's highly unlikely that a user would want a file named -ffload-arch.

This is similar to many other driver arguments beginning with -o.

What options?

There were just a few I saw after running clang --help 2>&1 | grep -e '^\ *-o. It was just a handful but I figured having anything more than -o meant this would be uncontroversial.

I think this should be reverted. It's moving in the backward direction.

That's fine if you think that this is unacceptable with the existing semantics. Maybe an alternative route would be to re-parse -o options and give a warning if it matches a recognized option? I feel like using -o without a space is uncommon enough to warrant it.

Single-dash long options starting with -o conflict with the short option -o so I am unsure I like this direction.

However, if a user uses a single dash it will instead name a file 'ffload-arch' for example.

That's intended. A user should be able to specify an output name ffload-arch with -offload-arch.

This is similar to many other driver arguments beginning with -o.

What options?

./bin/clang --help-hidden | grep "  \-o"
  -objcmt-allowlist-dir-path=<value>
  -objcmt-atomic-property Make migration to 'atomic' properties
  -objcmt-migrate-all     Enable migration to modern ObjC
  -objcmt-migrate-annotation
  -objcmt-migrate-designated-init
  -objcmt-migrate-instancetype
  -objcmt-migrate-literals
  -objcmt-migrate-ns-macros
  -objcmt-migrate-property-dot-syntax
  -objcmt-migrate-property
  -objcmt-migrate-protocol-conformance
  -objcmt-migrate-readonly-property
  -objcmt-migrate-readwrite-property
  -objcmt-migrate-subscripting
  -objcmt-ns-nonatomic-iosonly
  -objcmt-returns-innerpointer-property
  -objcmt-whitelist-dir-path=<value>
  -object-file-name=<file>
  -o <file>               Write output to <file>

The alternative is to at least warn if users write -offload... rather than --offlaod...

There are traditionally some single-dash long options (perhaps classical Mac OS style) which conflict with short options. I think nowadays we try to avoid such single-dash short options.
(For example, GNU ld now requires all new long options to use two-dashes, after knowing the problem with -omagic.)

I can make a patch to make cc1 -o Separate instead of JoinedOrSeparate, which will hopefully solve your pain of specifying a misspelled -offloat* cc1 option without good diagnostics.

There are traditionally some single-dash long options (perhaps classical Mac OS style) which conflict with short options. I think nowadays we try to avoid such single-dash short options.
(For example, GNU ld now requires all new long options to use two-dashes, after knowing the problem with -omagic.)

I can make a patch to make cc1 -o Separate instead of JoinedOrSeparate, which will hopefully solve your pain of specifying a misspelled -offloat* cc1 option without good diagnostics.

If that's possible, then that would be a good solution. I figured that allowing -ofoo was set in stone and this was the less controversial option. Just for reference, the problem I commonly encounter with users is the following situation

clang foo.cu -offload-arch=sm_80 -o foo // No offload arch, the  first -o isn't observed as its replaced by the second one.

There are traditionally some single-dash long options (perhaps classical Mac OS style) which conflict with short options. I think nowadays we try to avoid such single-dash short options.
(For example, GNU ld now requires all new long options to use two-dashes, after knowing the problem with -omagic.)

I can make a patch to make cc1 -o Separate instead of JoinedOrSeparate, which will hopefully solve your pain of specifying a misspelled -offloat* cc1 option without good diagnostics.

If that's possible, then that would be a good solution. I figured that allowing -ofoo was set in stone and this was the less controversial option. Just for reference, the problem I commonly encounter with users is the following situation

A cc1 specific Separate -o only helps misspelled -offload-*...

clang foo.cu -offload-arch=sm_80 -o foo // No offload arch, the  first -o isn't observed as its replaced by the second one.

I think these users have to accept that -offload-arch=sm_80 means -o ffload-arch=sm_80.
There are a number of unfortunate Separate short options. -o is among them the most used. Long options have to avoid conflicts with -o by not having the single-dash form.

MaskRay added a comment.EditedOct 3 2022, 11:03 AM

We really want these --offload-* users to stick with one canonical form, not -offload-* in some places while --offload-* in other places.

Another angle is that people find -offload-* working with a new clang may try -offload-* on an old clang and get -o ffload-*.

And -offload-* doesn't help misspelled options.

We really want these --offload-* users to stick with one canonical form, not -offload-* in some places while --offload-* in other places.

Another angle is that people find -offload-* working with a new clang may try -offload-* on an old clang and get -o ffload-*.

And -offload-* doesn't help misspelled options.

This is fine, as long as users get more distinct feedback that -offload-arch isn't doing what they think it does. Normally we'd get some error and a helpful suggestion if the option is misspelled, but with -o options we don't get anything. My only qualm with the current state is that it's not obvious that -offload-arch is actually -o ffload-arch for most cases.

MaskRay added a comment.EditedOct 3 2022, 11:23 AM

We really want these --offload-* users to stick with one canonical form, not -offload-* in some places while --offload-* in other places.

Another angle is that people find -offload-* working with a new clang may try -offload-* on an old clang and get -o ffload-*.

And -offload-* doesn't help misspelled options.

This is fine, as long as users get more distinct feedback that -offload-arch isn't doing what they think it does. Normally we'd get some error and a helpful suggestion if the option is misspelled, but with -o options we don't get anything.
My only qualm with the current state is that it's not obvious that -offload-arch is actually -o ffload-arch for most cases.

You can make -oxxx an error if offloading is used (-o xxx is still allowed). Then no --offload-* needs a -offload-* form. They are new users and they can likely live with stricter rules.
It's perhaps too destructive to remove -oxxx for regular targets.

This patch probably provides some convenience but regresses other aspects, and I think it should be reverted.

We really want these --offload-* users to stick with one canonical form, not -offload-* in some places while --offload-* in other places.

Another angle is that people find -offload-* working with a new clang may try -offload-* on an old clang and get -o ffload-*.

And -offload-* doesn't help misspelled options.

This is fine, as long as users get more distinct feedback that -offload-arch isn't doing what they think it does. Normally we'd get some error and a helpful suggestion if the option is misspelled, but with -o options we don't get anything.
My only qualm with the current state is that it's not obvious that -offload-arch is actually -o ffload-arch for most cases.

You can make -oxxx an error if offloading is used (-o xxx is still allowed). Then no --offload-* needs a -offload-* form.

The problem here is that the --offload-* options are used to enable the offloading toolchain for some targets (e.g. OpenMP). Would a check on -o' that emits a warning if it matches closely a known option work? E.g. -offload-arch` would warn that the user may mean --offload-arch.

This patch probably provides some convenience but regresses other aspects, and I think it should be reverted.

If you think this should be reverted then I'm fine with it. I would like to see some kind of solution to this problem however, as I have dealt with this problem many times when working with users.

We really want these --offload-* users to stick with one canonical form, not -offload-* in some places while --offload-* in other places.

Another angle is that people find -offload-* working with a new clang may try -offload-* on an old clang and get -o ffload-*.

And -offload-* doesn't help misspelled options.

This is fine, as long as users get more distinct feedback that -offload-arch isn't doing what they think it does. Normally we'd get some error and a helpful suggestion if the option is misspelled, but with -o options we don't get anything.
My only qualm with the current state is that it's not obvious that -offload-arch is actually -o ffload-arch for most cases.

You can make -oxxx an error if offloading is used (-o xxx is still allowed). Then no --offload-* needs a -offload-* form.

The problem here is that the --offload-* options are used to enable the offloading toolchain for some targets (e.g. OpenMP). Would a check on -o' that emits a warning if it matches closely a known option work? E.g. -offload-arch` would warn that the user may mean --offload-arch.

If we don't have Joined -o, we won't have the option collision problem. But we have Joined -o, and we should not introduce new aliases to collide with -o.
I am fine if Joined -o can go away but there are too many uses so such a change would be destructive.

My idea is to just disallow Joined -o when targeting a specific environment (e.g. when offloading toolchain is used).

This patch probably provides some convenience but regresses other aspects, and I think it should be reverted.

If you think this should be reverted then I'm fine with it. I would like to see some kind of solution to this problem however, as I have dealt with this problem many times when working with users.

As mentioned, making users stick with more forms instead of all using the canonical form will do harm in other aspects.
I have explained these aspects in my previous comment.

My idea is to just disallow Joined -o when targeting a specific environment (e.g. when offloading toolchain is used).

This seems difficult as we only know which offloading toolchains are active after we fully construct the Compilation in the driver. We could potentially re-parse some options afterwards to do some static checks like this.

My idea is to just disallow Joined -o when targeting a specific environment (e.g. when offloading toolchain is used).

This seems difficult as we only know which offloading toolchains are active after we fully construct the Compilation in the driver. We could potentially re-parse some options afterwards to do some static checks like this.

I think I'm fine if you want to warning for every Joined -o use of ffloading-*. Created D135093 to revert this patch.

Another idea is to reject multiple -o if some are used as the Joined form. Note: multiple Separate -o should be allowed.
It will not catch -offload-* when -o output is not specified, but is probably useful enough.

Sorry, but I'll revert this patch shortly.

tra added a comment.Oct 4 2022, 10:57 AM

Another idea is to reject multiple -o if some are used as the Joined form. Note: multiple Separate -o should be allowed.
It will not catch -offload-* when -o output is not specified, but is probably useful enough.

Is there a reason we should not always reject -o<existing double-dash option>? I doubt there are many real life uses of such output names and if/when someone needs them, it's easy enough to disambiguate by using a separate -o argument or -o./output.
This would address the issue for all --o* options.