This is an archive of the discontinued LLVM Phabricator instance.

[OptTable] Support grouped short options
ClosedPublic

Authored by MaskRay on Jul 12 2020, 12:21 AM.

Details

Summary

POSIX.1-2017 12.2 Utility Syntax Guidelines, Guideline 5 says:

One or more options without option-arguments, followed by at most one option that takes an option-argument, should be accepted when grouped behind one '-' delimiter.

i.e. -abc represents -a -b -c. The grouped short options are very common. Many
utilities extend the syntax by allowing (an option with an argument) following a
sequence of short options.

This patch adds the support to OptTable, similar to cl::Group for CommandLine
(D58711). llvm-symbolizer will use the feature (D83530). CommandLine is exotic
in some aspects. OptTable is preferred if the user wants to get rid of the
behaviors.

  • cl::opt<bool> i(...) can be disabled via -i=false or -i=0, which is different from conventional --no-i.
  • Handling --foo & --no-foo requires a comparison of argument positions, which is a bit clumsy in user code.

OptTable::parseOneArg (non-const reference InputArgList) is added along with
ParseOneArg (const ArgList &). The duplicate does not look great at first
glance. However, The implementation can be simpler if ArgList is mutable.
(ParseOneArg is used by clang-cl (FlagsToInclude/FlagsToExclude) and lld COFF
(case-insensitive). Adding grouped short options can make the function even more
complex.)

The implementation allows a long option following a group of short options. We
probably should refine the code to disallow this in the future. Allowing this
seems benign for now.

Diff Detail

Event Timeline

MaskRay created this revision.Jul 12 2020, 12:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptJul 12 2020, 12:21 AM
jhenderson added inline comments.Jul 13 2020, 1:14 AM
llvm/include/llvm/Option/OptTable.h
200–201

"return" -> "\return"
"Index" -> "\p Index"
"Args" -> "\p Args".

(i.e. use the doxygen style used elsewhere in this file)

llvm/include/llvm/Option/Option.h
208–216

This comment probably wants updating given the API change.

llvm/lib/Option/OptTable.cpp
333

I've not looked too much in depth yet at this, but this function seems very similar to the one immediately below? Can the two somehow be combined?

350–351

I feel like this would be more clearly written as:

unsigned ArgSize = matchOption(Start, Str, IgnoreCase);
if (!ArgSize)
llvm/unittests/Option/OptionParsingTest.cpp
359–363

If we probably should disallow this, why don't we?

grimar added inline comments.Jul 13 2020, 3:00 AM
llvm/lib/Option/OptTable.cpp
472–476

What happens when GroupedShortOptions is true and FlagsToInclude/FlagsToExclude are present?

474

This looks strange. We should not have 2 versions of parseOneArg/ParseOneArg that have different naming styles (lower-case vs upper-case) of the same name.

Also, their signatures (given that InputArgList inherits from ArgList) are very close:

Arg *ParseOneArg(const ArgList &Args, unsigned &Index,
                 unsigned FlagsToInclude = 0,
                 unsigned FlagsToExclude = 0) const;

Arg *parseOneArg(InputArgList &Args, unsigned &Index) const;

And it is probably too easy to call one instead of another.
So it looks like either they should be combined, or, if it is not possible, probably the new parseOneArg should be renamed to something else?

MaskRay updated this revision to Diff 277459.Jul 13 2020, 9:18 AM
MaskRay marked 9 inline comments as done.

Address comments

ikudrin added inline comments.Jul 13 2020, 9:54 AM
llvm/include/llvm/Option/ArgList.h
414

Add an empty line before the method.

llvm/include/llvm/Option/OptTable.h
62

Wouldn't it be better if the grouping is a property of options themselves rather then a flag in the parser? That would allow better tuning of options.

llvm/lib/Option/OptTable.cpp
358

The magic value 2 needs an explanatory comment.

358

The code can't get here if GroupedShortOptions is false, no? Though, I'd prefer the two parsing methods to be merged.

MaskRay added inline comments.Jul 13 2020, 10:18 AM
llvm/lib/Option/OptTable.cpp
333

Answered below.

472–476

There is no such use case.

FlagsToInclude/FlagsToExclude is only used by clang-cl. clang does not use grouped short options.

474

I have mentioned in the description that having two methods is not great but isn't a bad choice.

... The implementation can be simpler if ArgList is mutable.

(ParseOneArg is used by clang-cl (FlagsToInclude/FlagsToExclude) and lld COFF
(case-insensitive). Adding grouped short options can make the function even more
complex.)

ParseOneArg requires const ArgList &Args in some clang call sites. The interface has been made complex by clang-cl and lld COFF (case-insensitive parsing). Having a duplicate parseOneArg actually isolates the complexity without adding too many lines.

In the future, we may consider dropping -long-option support from this overload. This may allow us to optimize/simplify the function. Combining it with ParseShortOptions may add a large switch inside a for loop, which may not be desired.

llvm/unittests/Option/OptionParsingTest.cpp
359–363

This syntax only make sense when there is one dash. If we drop -long-option support, this will be naturally disallowed.

MaskRay updated this revision to Diff 277486.Jul 13 2020, 10:34 AM
MaskRay marked 6 inline comments as done.

Address comments. Improve tests

llvm/include/llvm/Option/OptTable.h
62

Grouped short options is all or nothing. I think it is hardly the case that a utility supports some short options being grouped. This is a compromised design in CommandLine AFAICT.

llvm/lib/Option/OptTable.cpp
358

Removed GroupedShortOptions. I have explained that some call sites require refactoring (due to const reference) if we want to merge parseOneOption & ParseOneOption.

MaskRay added a comment.EditedJul 14 2020, 2:12 PM

I have mentioned why I chose for a new parseOneArg rather than extending the existing ParseOneArg. I'll make more comments here:

On the interface side:

  • ParseOneArg needs to work with a const reference and a DerivedArgList (only used in clang driver). clang driver has unique requirement that some features require to parse on argument on the fly.
  • parseOneArg needs to work with a mutable ArgList. Since we don't need complex stuff in clang, I simply make the parameter InputArgList &. The function is only called/driven by OptTable::ParseArgs. The instance is mutable - so there is no point making InputArgList & const. It would made several called functions complex.

On the feature side:

  • (a) ParseOneArg needs to deal with Windows specific stuff (clang-cl (FlagsToInclude/FlagsToExclude), lld COFF case-insensitive options).
  • (b) parseOneArg deals with GNU getopt_long style command line.

The two use cases are different and no utility wants to have both features (no overlap). (a) added sufficient complexity to ParseOneArg. I don't want to make it more complex. The duplicated stuff is small - just 50 lines of code.

grimar added inline comments.Jul 15 2020, 2:03 AM
llvm/include/llvm/Option/OptTable.h
211

How about renaming this to parseOneArgGrouped and making it private?

MaskRay updated this revision to Diff 278196.Jul 15 2020, 8:13 AM
MaskRay marked an inline comment as done.

parseOneArg (public) -> parseOneArgGrouped (private)

jhenderson added inline comments.Jul 16 2020, 1:19 AM
llvm/include/llvm/Option/Option.h
224–226

It might be nice to keep the Index parameter in the same place as it was before (i.e. move CurArg after it). I don't feel strongly about this though if you prefer your current approach.

llvm/lib/Option/OptTable.cpp
348

"Search for an option which matches Str."

This looks fine to me probably. 3 last nits.

llvm/include/llvm/Option/OptTable.h
94

Doesn't seem you need doxygen comments for a private method?

llvm/lib/Option/OptTable.cpp
380

++Index

llvm/lib/Option/Option.cpp
111

why unsigned and not size_t?

MaskRay updated this revision to Diff 278481.Jul 16 2020, 8:00 AM
MaskRay marked 7 inline comments as done.

Address comments

llvm/include/llvm/Option/Option.h
224–226

I want to keep out parameters in the end.

llvm/lib/Option/OptTable.cpp
380

++Index changes semantics.

grimar added inline comments.Jul 16 2020, 8:08 AM
llvm/include/llvm/Option/OptTable.h
94

Nor addressed?

jhenderson accepted this revision.Jul 17 2020, 12:58 AM

Hesitant LGTM. I don't really follow much of the command line behaviour, so best wait for others to approve too.

This revision is now accepted and ready to land.Jul 17 2020, 12:58 AM

Hesitant LGTM. I don't really follow much of the command line behaviour, so best wait for others to approve too.

I do not feel myself confident in accepting changes for the lib/Option code too. So +1 hesistant LGTM with the unaddressed nit resolved somehow.
(I've never saw "///" comments used for protected/private methods I think, I am not sure it worth documenting them so hard,
since they are not a part of a public API. So I'd expect the comment for the parseOneArgGrouped to be rewritten/reduced probably).

grimar accepted this revision.Jul 17 2020, 2:01 AM
MaskRay updated this revision to Diff 278800.Jul 17 2020, 9:18 AM

Drop parseOneArgGrouped's comment in OptTable.h
(Moved to OptTable.cpp)

This revision was automatically updated to reflect the committed changes.