This is an archive of the discontinued LLVM Phabricator instance.

[CommandLine] Don't print empty sentinel values from EnumValN lists in help text
ClosedPublic

Authored by jhenderson on Jan 21 2019, 10:08 AM.

Details

Summary

In order to make an option value truly optional, both the ValueOptional attribute and an empty-named value are required. Prior to this change, this empty-named value appears in
the command-line help text:

-some-option - some help text
  =v1        - description 1
  =v2        - description 2
  =          -

This change improves the help text for these sort of options in a number of ways:

  1. ValueOptional options with an empty-named value now print their help text twice: both without and then with '=<value>' after the name. The latter version then lists the allowed values after it.
  2. Empty-named values with no help text in ValueOptional options are not listed in the permitted values.
-some-option         - some help text
-some-option=<value> - some help text
  =v1                - description 1
  =v2                - description 2
  1. Otherwise empty-named options are printed as =<empty> rather than simply '='.
  2. Option values without help text do not have the '-' separator printed.
-some-option=<value> - some help text
  =v1                - description 1
  =v2
  =<empty>           - description

It also tweaks the llvm-symbolizer -functions help text to not print a trailing ':' as that looks bad combined with 1) above.

Diff Detail

Repository
rL LLVM

Event Timeline

jhenderson created this revision.Jan 21 2019, 10:08 AM

Do such sentinel trigger if not passing the equal sign (ie. when using only "-mhvx" in the example you quote) or only with equal sign without a value (one has to use "-mhvx=" to select the option)?

If the former I think this change is fine. I would also suggest adding an early continue when only the description is empty on line 1683 to avoid the empty '-'.

But if the latter (one has to do "-mhvx=" to select the option) I'm not sure hiding the option is right since it is a value (albeit empty) people can use.

lib/Support/CommandLine.cpp
1683 โ†—(On Diff #182812)

Add early continue here as well when only description is empty?

jhenderson marked an inline comment as done.Jan 22 2019, 1:50 AM

Do such sentinel trigger if not passing the equal sign (ie. when using only "-mhvx" in the example you quote) or only with equal sign without a value (one has to use "-mhvx=" to select the option)?

If the former I think this change is fine. I would also suggest adding an early continue when only the description is empty on line 1683 to avoid the empty '-'.

But if the latter (one has to do "-mhvx=" to select the option) I'm not sure hiding the option is right since it is a value (albeit empty) people can use.

Assuming ValueOptional is specified, then yes, you don't need the equal sign. I've written up a table below of different input options versus their behaviour to clarify things:

Code--option--option=--option foo--option=foo
default, no sentinelnot allowednot allowedallowedallowed
default, sentinelnot allowedallowedallowedallowed
ValueOptional, no sentinelnot allowednot allowednot allowedallowed
ValueOptional, sentinelallowedallowednot allowedallowed
lib/Support/CommandLine.cpp
1683 โ†—(On Diff #182812)

Seems reasonable to me, although the '\n' is required still. To avoid repeating it, I'll write it as:

if (!getDescription(i).empty())
  outs().indent(NumSpaces) << " -   " << getDescription(i);
outs() << '\n';

Don't print empty description for option value with no description.

jhenderson marked an inline comment as done.Jan 22 2019, 3:12 AM

Do such sentinel trigger if not passing the equal sign (ie. when using only "-mhvx" in the example you quote) or only with equal sign without a value (one has to use "-mhvx=" to select the option)?

If the former I think this change is fine. I would also suggest adding an early continue when only the description is empty on line 1683 to avoid the empty '-'.

But if the latter (one has to do "-mhvx=" to select the option) I'm not sure hiding the option is right since it is a value (albeit empty) people can use.

Assuming ValueOptional is specified, then yes, you don't need the equal sign. I've written up a table below of different input options versus their behaviour to clarify things:

Code--option--option=--option foo--option=foo
default, no sentinelnot allowednot allowedallowedallowed
default, sentinelnot allowedallowedallowedallowed
ValueOptional, no sentinelnot allowednot allowednot allowedallowed
ValueOptional, sentinelallowedallowednot allowedallowed

Thanks for the explanation. In that case I'd like to differentiate between the optional and non optional case.

  1. For optional omitting the line is probably fine, or perhaps put a [default] instead of empty equal sign.
  2. To handle the non optional case I think we should introduce single quotes after the equal sign to indicate that this is not an error but an empty value, ie. =''. Perhaps also put quote around other values for consistency.

Does that make sense?

Thanks for the explanation. In that case I'd like to differentiate between the optional and non optional case.

  1. For optional omitting the line is probably fine, or perhaps put a [default] instead of empty equal sign.
  2. To handle the non optional case I think we should introduce single quotes after the equal sign to indicate that this is not an error but an empty value, ie. =''. Perhaps also put quote around other values for consistency.

Does that make sense?

I'm afraid I'm not quite sure what you mean by the "optional and non optional case". Do you mean when ValueOptional is and is not used?

Thanks for the explanation. In that case I'd like to differentiate between the optional and non optional case.

  1. For optional omitting the line is probably fine, or perhaps put a [default] instead of empty equal sign.
  2. To handle the non optional case I think we should introduce single quotes after the equal sign to indicate that this is not an error but an empty value, ie. =''. Perhaps also put quote around other values for consistency.

Does that make sense?

I'm afraid I'm not quite sure what you mean by the "optional and non optional case". Do you mean when ValueOptional is and is not used?

Yes my bad, that is what I meant: 1) is when ValueOptional is used, 2) is when it isn't.

Best regards,

Thomas

ruiu added a comment.Jan 22 2019, 4:28 PM

Ideally I believe we should print out something like this for an option that takes an optional argument:

-mhvx                                            - Enable Hexagon Vector eXtensions
-mhvx=<value>                                    - Enable Hexagon Vector eXtensions
   =v60                                           -   Build for HVX v60
   =v62                                           -   Build for HVX v62
   =v65                                           -   Build for HVX v65
   =v66                                           -   Build for HVX v66

Do you think that's doable?

Ideally I believe we should print out something like this for an option that takes an optional argument:

-mhvx                                            - Enable Hexagon Vector eXtensions
-mhvx=<value>                                    - Enable Hexagon Vector eXtensions
   =v60                                           -   Build for HVX v60
   =v62                                           -   Build for HVX v62
   =v65                                           -   Build for HVX v65
   =v66                                           -   Build for HVX v66

Do you think that's doable?

+1, that is much clearer. Perhaps even:

-mhvx                                            - Enable Hexagon Vector eXtensions
-mhvx='<value>'                            - Enable Hexagon Vector eXtensions
   ='v60'                                           -   Build for HVX v60
   ='v62'                                           -   Build for HVX v62
   ='v65'                                           -   Build for HVX v65
   ='v66'                                           -   Build for HVX v66

To be able to list ='' for sentinel value which are not ValueOptional.

Sounds reasonable to try it at least. I'll take a look.

To be able to list ='' for sentinel value which are not ValueOptional.

Thinking about it, I have a slight concern that adding the quotes could cause confusion, because theoretically a valid value could iteslf have quotes in, so this makes the help text ambiguous: does the value include the quotes either end or not? How about =<empty> for the "sentinel" value?

jhenderson edited the summary of this revision. (Show Details)
jhenderson added a reviewer: chandlerc.

I've tweaked the output as discussed. For a ValueOptional with sentinel value, it prints something like the following:

-mhvx                                            - Enable Hexagon Vector eXtensions
-mhvx=<value>                                    - Enable Hexagon Vector eXtensions
  =v60                                           -   Build for HVX v60
  =v62                                           -   Build for HVX v62
  =v65                                           -   Build for HVX v65
  =v66                                           -   Build for HVX v66

Without a sentinel value it prints:

-mhvx=<value>                                    - Enable Hexagon Vector eXtensions
  =v60                                           -   Build for HVX v60
  =v62                                           -   Build for HVX v62
  =v65                                           -   Build for HVX v65
  =v66                                           -   Build for HVX v66

For ValueRequired with an empty string value it prints:

-mhvx=<value>                                    - Enable Hexagon Vector eXtensions
  =v60                                           -   Build for HVX v60
  =v62                                           -   Build for HVX v62
  =v65                                           -   Build for HVX v65
  =v66                                           -   Build for HVX v66
  =<empty>

The latter also demonstrates what happens if the description is empty.

I like this new version very much. I agree, <empty> is less likely to be confusing. One last nit: why do you check for the description being empty when deciding to skip an empty value for a ValueOptional option in the second block but not in the first block? Any reason for the inconsistency?

I like this new version very much. I agree, <empty> is less likely to be confusing. One last nit: why do you check for the description being empty when deciding to skip an empty value for a ValueOptional option in the second block but not in the first block? Any reason for the inconsistency?

It wasn't intentional, but I think it is okay (although I could be persuaded to do something different). As things stand, an empty string with help text would produce something like this for a ValueOptional option:

-mhvx                                            - Enable Hexagon Vector eXtensions
-mhvx=<value>                                    - Enable Hexagon Vector eXtensions
  =v60                                           -   Build for HVX v60
  =v62                                           -   Build for HVX v62
  =v65                                           -   Build for HVX v65
  =v66                                           -   Build for HVX v66
  =<empty>                                       -   Same as v66

We could instead suppress the =<empty> line always (even if it has a description), or we could always print the =<empty> line regardless of whether it has a description or not, I don't mind.

I like this new version very much. I agree, <empty> is less likely to be confusing. One last nit: why do you check for the description being empty when deciding to skip an empty value for a ValueOptional option in the second block but not in the first block? Any reason for the inconsistency?

It wasn't intentional, but I think it is okay (although I could be persuaded to do something different). As things stand, an empty string with help text would produce something like this for a ValueOptional option:

-mhvx                                            - Enable Hexagon Vector eXtensions
-mhvx=<value>                                    - Enable Hexagon Vector eXtensions
  =v60                                           -   Build for HVX v60
  =v62                                           -   Build for HVX v62
  =v65                                           -   Build for HVX v65
  =v66                                           -   Build for HVX v66
  =<empty>                                       -   Same as v66

We could instead suppress the =<empty> line always (even if it has a description), or we could always print the =<empty> line regardless of whether it has a description or not, I don't mind.

Fair enough, that's quite a nice output. LGTM, thanks! Will approve in one day if nobody objects.

ruiu added a comment.Jan 23 2019, 9:16 AM

I found this file: llvm/unittests/Support/CommandLineTest.cpp

Maybe you can add a test for the new functionality to that file.

grimar added a subscriber: grimar.Jan 23 2019, 11:35 PM

I found this file: llvm/unittests/Support/CommandLineTest.cpp

Maybe you can add a test for the new functionality to that file.

That's where I would test this, yes, but the function is all about printing to outs(), and I don't know of a way using LLVM's tools to test that. Can you (or anybody else) suggest anything?

I guess I could just put a test in "test/Other" which runs a tools with --help that prints the various forms available (although I don't think the "=<empty>" path is used anywhere currently).

Remove colon from help text, which now won't have anything listed after it. Without this change, the following would have been the output of llvm-symbolizer --help with this patch:

-functions                        - Print function name for a given address:
-functions=<value>                - Print function name for a given address:
  =none                           -   omit function name
  =short                          -   print short function name
  =linkage                        -   print function linkage name (default)

I checked and couldn't find anywhere else that would be impacted in this way.

I found this file: llvm/unittests/Support/CommandLineTest.cpp

Maybe you can add a test for the new functionality to that file.

That's where I would test this, yes, but the function is all about printing to outs(), and I don't know of a way using LLVM's tools to test that. Can you (or anybody else) suggest anything?

I guess I could just put a test in "test/Other" which runs a tools with --help that prints the various forms available (although I don't think the "=<empty>" path is used anywhere currently).

Can't you use something like dup2 (but more portable across platforms) to redirect the output to a file, read that file and check its content?

Add unit tests.

Windows appears to accept dup et al, so I'm using that for now.

thopre added inline comments.Jan 29 2019, 6:29 AM
unittests/Support/CommandLineTest.cpp
860โ€“861 โ†—(On Diff #184059)

Why not make them private with getters? Did you deem it over-engineering?

917โ€“918 โ†—(On Diff #184059)

I understand the use of variable to avoid duplication but it makes it harder to see that we are testing for correct alignment. Could you choose variables with the right length to have the same alignment as the output text such that we can see the alignment by just looking at the text?

929โ€“930 โ†—(On Diff #184059)

I think it's better to go to next line for every \n, I'd suggest also keeping the \n at end of line to make the reference string as close as the output. Likewise for other tests below and above.

jhenderson marked 3 inline comments as done.Jan 29 2019, 6:36 AM
jhenderson added inline comments.
unittests/Support/CommandLineTest.cpp
860โ€“861 โ†—(On Diff #184059)

Yes, essentially. If you prefer me to add getters, I don't mind.

917โ€“918 โ†—(On Diff #184059)

Sounds good to me. I'll do that.

929โ€“930 โ†—(On Diff #184059)

That's what I wanted to do, but this is how clang-format lines it up. I'm happy to not run clang-format on this section, if you are.

thopre added inline comments.Jan 29 2019, 6:48 AM
unittests/Support/CommandLineTest.cpp
860โ€“861 โ†—(On Diff #184059)

I'm thinking more in terms of readability as someone reading understands from getter without setter that it's a constant value. I would have prefer a simple const though, and the testcase is small enough and obvious enough that it's not much of a problem. Maybe you made the good call, let's leave it as is.

929โ€“930 โ†—(On Diff #184059)

I'm quite new to the community so not sure how is it viewed to ignore clang-format. In my opinion formatting is to improve readability by consistency, in this case having things as close as possible to the output is more readable so I'd be in favour of ignoring clang-format just for that bit yes.

jhenderson marked 4 inline comments as done.

Improve test formatting, by overriding clang-format.

unittests/Support/CommandLineTest.cpp
929โ€“930 โ†—(On Diff #184059)

I'm okay with not using clang-format here, personally. I agree with your assessment that formatting the code via clang-format makes it worse.

thopre accepted this revision.Jan 30 2019, 1:32 PM

Splendid, thanks for addressing all my remarks and thanks ruiu for the excellent output formatting suggestion. LGTM.

This revision is now accepted and ready to land.Jan 30 2019, 1:32 PM
This revision was automatically updated to reflect the committed changes.

This broke the output of llvm-nm --help, which now loops (near) infinitely for me (both built with GCC on Ubuntu 16.04 and with Apple Clang on macOS). It's surprising that this doesn't seem to show up for others, as this is actually tested by test/tools/llvm-nm/libtool-response-file.test.

Herald added a project: Restricted Project. ยท View Herald TranscriptJan 31 2019, 11:13 PM

This broke the output of llvm-nm --help, which now loops (near) infinitely for me (both built with GCC on Ubuntu 16.04 and with Apple Clang on macOS). It's surprising that this doesn't seem to show up for others, as this is actually tested by test/tools/llvm-nm/libtool-response-file.test.

This only seems to be happening if the AArch64 target is enabled. In llvm-nm --help, the output is broken after -aarch64-neon-syntax=<value>, after which it just feeds an infinite stream of blank spaces.

This broke the output of llvm-nm --help, which now loops (near) infinitely for me (both built with GCC on Ubuntu 16.04 and with Apple Clang on macOS). It's surprising that this doesn't seem to show up for others, as this is actually tested by test/tools/llvm-nm/libtool-response-file.test.

This only seems to be happening if the AArch64 target is enabled. In llvm-nm --help, the output is broken after -aarch64-neon-syntax=<value>, after which it just feeds an infinite stream of blank spaces.

How strange. I'm pretty sure I have that target enabled locally. I'll try digging into it, and revert if it's not immediately obvious to allow for more investigation. For the record, I built and ran on both GCC 4.8.2 on Ubuntu 14.04 and Visual Studio 2015 on Windows.

Looks like a build bot picked up the same failure: http://green.lab.llvm.org/green/job/clang-stage2-Rthinlto/15346/

I'm staring at the code and cannot for the life of me see how the change could have caused this. Unfortunately, the test passes for me with current trunk (I just double-checked). @mstorsjo, since you are seeing it, would you mind debugging to identify where this goes wrong?

I've been able to reproduce the failure by extending the length of the switch by a large amount. I think, though not yet confirmed, that the problem is that this patch didn't adjust the apparent option width due to adding "=<value>". You may be seeing it because you are building without the Hexagon target? That causes the -aarch64-neon-syntax=<value> line to be the longest arg string, and greater than the GlobalWidth, I believe, causing an underflow somewhere in the adjusted code.

I'll look at fixing that.

I've been able to reproduce the failure by extending the length of the switch by a large amount. I think, though not yet confirmed, that the problem is that this patch didn't adjust the apparent option width due to adding "=<value>". You may be seeing it because you are building without the Hexagon target? That causes the -aarch64-neon-syntax=<value> line to be the longest arg string, and greater than the GlobalWidth, I believe, causing an underflow somewhere in the adjusted code.

Yes, that's correct, I'm building with ARM+AArch64+X86 only (and could also reproduce it in a build with AArch64 only).

I'll look at fixing that.

Great, thanks!

jhenderson updated this revision to Diff 184741.Feb 1 2019, 6:57 AM

Fix getOptionWidth function for new behaviour, add some asserts for protection, add new unit tests for getOptionWidth, and pull out some of the brittle constants and strings into a single set of static variables.

jhenderson reopened this revision.Feb 1 2019, 6:58 AM

Hi, could somebody take a look at the latest update? I've fixed the problem exposed, and tried to make the existing code a little less brittle and therefore less likely to run into the same issue in the future.

This revision is now accepted and ready to land.Feb 1 2019, 6:58 AM
jhenderson requested review of this revision.Feb 1 2019, 6:58 AM
thopre added inline comments.Feb 4 2019, 1:44 AM
lib/Support/CommandLine.cpp
1662โ€“1663 โ†—(On Diff #184741)

I wonder if a small inline function for this would make sense. BTW, do you know why the width is not computed from the return value of getOptionInfo?

unittests/Support/CommandLineTest.cpp
1023 โ†—(On Diff #184741)

Remove the first occurrence of "via". Any reason why the text here is different from the other two below?

jhenderson marked 4 inline comments as done.Feb 4 2019, 2:50 AM
jhenderson added inline comments.
lib/Support/CommandLine.cpp
1662โ€“1663 โ†—(On Diff #184741)

I'm not sure I see a getOptionInfo function? There is printOptionInfo but that does the printing, and needs to know about spacing. I guess you could pull the common logic out into a function, but that would result in it being computed twice for every piece of help text, and would be a different design to the current design.

unittests/Support/CommandLineTest.cpp
1023 โ†—(On Diff #184741)

I can't remember there being a good reason, but I've tweaked all the comments anyway to clarify which case it is that we actually want to test and what we can't distinguish it from.

jhenderson updated this revision to Diff 185016.Feb 4 2019, 2:59 AM
jhenderson marked 2 inline comments as done.
  • Reword some unit-test comments.
  • Pulled out some common logic into a small function.
  • Renamed some variables to be more consistent with existing naming structure (this kind of switch is named in the following style: --<arg>=<option>).
thopre added inline comments.Feb 4 2019, 3:06 AM
lib/Support/CommandLine.cpp
1662โ€“1663 โ†—(On Diff #184741)

Sorry yes I meant printOptionInfo. My thought was that printing would be done in a string and then getOptionWidth could be replaced by a strlen of sort. Yes it's a big redesign, I was just musing about it, sorry.

My comment was really about adding a skipOptionValue function that would return true when it's a ValueOptional option with empty value and value description. Trying to factor the rest of the code sounds more difficult but this is easily done.

thopre added a comment.Feb 4 2019, 3:11 AM

Forgot some nits about the description (please ignore if it doesn't end up in the commit message):

  • remove the last sentence about testing since you've added unit testing
  • instead of describing a change I find that a small example like you did in your second quote block would be clearer
  • you mention --option=var but your example is of the form -option=var which is confusing
jhenderson edited the summary of this revision. (Show Details)Feb 4 2019, 3:24 AM

Forgot some nits about the description (please ignore if it doesn't end up in the commit message):

  • remove the last sentence about testing since you've added unit testing
  • instead of describing a change I find that a small example like you did in your second quote block would be clearer
  • you mention --option=var but your example is of the form -option=var which is confusing

Yeah, I actually changed it quite a bit in the original commit, but I since realised that there were other improvements I could make, so I've updated the description to match my planned submission. Please let me know if you are happy with the latest round of updates (both code and description).

thopre accepted this revision.Feb 4 2019, 3:29 AM

LGTM. Thanks

This revision is now accepted and ready to land.Feb 4 2019, 3:29 AM
This revision was automatically updated to reflect the committed changes.