This is an archive of the discontinued LLVM Phabricator instance.

Replace getopt with llvm::cl in lldb driver
AbandonedPublic

Authored by labath on Feb 29 2016, 9:26 AM.

Details

Reviewers
zturner
clayborg
Summary

This replaces the hand-rolled getopt option parser in lldb driver with the one in llvm. This
results in a lot less code, as the llvm's parser does much of the work (e.g., formatting of
--help output) for us, and side-steps the problem of using the internal getopt implementation in
liblldb on platforms which don't have a system getopt (windows, netbsd).

This change has tiny behaviour changes, which I try to enumerate here:

  • boolean arguments can no longer be specified twice (e.g., --no-lldbinit --no-lldbinit will result in an error). I needed to tweak TestFormats to account for this. Can be changed, but I'm not sure if there is any need for it.
  • I have removed the --script-language option, as it seemed to be ignored anyway.
  • --help output has changed a bit and now reads (note one has to pass --help-hidden to see short option letters. again can be changed, but i think it makes the initial help more readable):

USAGE: lldb [options] <program-and-arguments>

OPTIONS:

General options:

-arch=<string>                 - Tells the debugger to use the specified architecture when starting and running the program.  <string> must be one of the architectures for which the program was compiled.
-attach-name=<string>          - Tells the debugger to attach to a process with the given name.
-attach-pid=<uint>             - Tells the debugger to attach to a process with the given pid
-batch                         - Tells the debugger to running the commands from -s, -S, -o & -O, and then quit.  However if any run command stopped due to a signal or crash, the debugger will return to the interactive prompt at the place of the crash.
-core=<string>                 - Tells the debugger to use the file <string> as the core file.
-debug                         - Tells the debugger to print out extra information for debugging itself.
-editor                        - Tells the debugger to open source files using the host's "external editor" mechanism.
-file=<string>                 - Tells the debugger to use the file <string> as the program to be debugged.
-no-lldbinit                   - Do not automatically parse any '.lldbinit' files.
-no-use-colors                 - Do not use colors.
-one-line=<string>             - Tells the debugger to execute this one-line lldb command after any file provided on the command line has been loaded.
-one-line-before-file=<string> - Tells the debugger to execute this one-line lldb command before any file provided on the command line has been loaded.
-one-line-on-crash=<string>    - When in batch mode, tells the debugger to execute this one-line lldb command if the target crashes.
-repl=<string>                 - Runs lldb in REPL mode with a stub process.
-repl-langauge=<string>        - Chooses the language for the REPL.
-source=<string>               - Tells the debugger to read in and execute the lldb commands in the given file, after any file provided on the command line has been loaded.
-source-before-file=<string>   - Tells the debugger to read in and execute the lldb commands in the given file, before any file provided on the command line has been loaded.
-source-on-crash=<string>      - When in batch mode, tells the debugger to source this file of lldb commands if the target crashes.
-source-quietly                - Don't echo the commands when executing them.
-wait-for                      - Tells the debugger to wait for a process with the given pid or name to launch before attaching.

Generic Options:

-help                          - Display available options (-help-hidden for more)
-help-list                     - Display list of available options (-help-list-hidden for more)
-version                       - Display the version of this program

If you don't provide -file then the first argument will be the file to be
debugged which means that 'lldb -- <filename> [<ARG1> [<ARG2>]]' also
works. But remember to end the options with "--" if any of your
arguments begin with "-".

Multiple "-source" and "-one-line" options can be provided. They will be
processed from left to right in order, with the source files and commands
interleaved.

Diff Detail

Event Timeline

labath updated this revision to Diff 49386.Feb 29 2016, 9:26 AM
labath retitled this revision from to Replace getopt with llvm::cl in lldb driver.
labath updated this object.
labath added reviewers: clayborg, zturner.
labath added a subscriber: lldb-commits.
zturner edited edge metadata.Feb 29 2016, 9:29 AM
zturner added a subscriber: zturner.

Love this change in principle. I will look over it on the windows side to
make sure nothing breaks.

One question: we are using getopt still for parsing interactive command
lines right?

Yes, we are, and unfortunately we can't get rid of it that easily there, as llvm:cl does not really have interactive uses in mind (global variables everywhere, --help kills your program, etc.). It probably wouldn't be too hard to add a version which can do that, as all the blocks are in place, but that's way out of my scope now. ;)

clayborg edited edge metadata.Feb 29 2016, 10:46 AM

As long as both long and short options are still supported? Can you still type any of:

  • --arch=x86_64
  • --arch x86_64
  • -arch=x86_64
  • -arch x86_64
  • -ax86_64
  • -a x86_64

Are we able to track which options can be used with other options with the llvm solution? I didn't look too close. There were bits in the old option definitions which defined with options could be specified with which other options. The other thing that might throw people for a loop is if llvm doesn't support this style:

% lldb /bin/ls --arch=x86_64 -- -lAF

getopt_long allowed you to have arguments mixed in between the options so the arguments would have bee "/bin/ls" and "-lAF". Options would have been "--arch=x86_64".

long and short options are supported, but the one I'm not sure about is the
case where you use a short option with no space. Your example "-ax86_64"
might not work. It might, just that it should be tested. I'm 99%
confident the rest of them all work.

Also not sure about this example: "% lldb /bin/ls --arch=x86_64 -- -lAF"

If I understand correctly, this runs LLDB with the --arch=x86_64 argument,
and specifies the program to debug as "/bin/ls -IAF"?

Seems like a confusing syntax to me, is there any other way to specify this
currently? like:

% lldb "/bin/ls -IAF" --arch=x86_64
% lldb /bin/ls --arch-x86_64 --args=-IAF

Or something along those lines?

long and short options are supported, but the one I'm not sure about is the
case where you use a short option with no space. Your example "-ax86_64"
might not work. It might, just that it should be tested. I'm 99%
confident the rest of them all work.

Also not sure about this example: "% lldb /bin/ls --arch=x86_64 -- -lAF"

If I understand correctly, this runs LLDB with the --arch=x86_64 argument,
and specifies the program to debug as "/bin/ls -IAF"?

Yes.

Seems like a confusing syntax to me, is there any other way to specify this
currently? like:

% lldb "/bin/ls -IAF" --arch=x86_64
% lldb /bin/ls --arch-x86_64 --args=-IAF

Or something along those lines?

Not sure, I am just pointing out what is going to change for people by making this switch due to the way getopt_long works.

For LLVM you might need to do this:

% lldb --arch=x86_64 -- /bin/ls -lAF

All args might need to be past all of the options and after the "--". I just don't know and was pointing out the differences people are likely to be confused by.

In general I prefer to outsource these things to underlaying OSes with their own implementations, not enforcing the "right one" to everybody.

I'm used to NetBSD syntax, that differs with GNU and even if I'm using their software, I automatically pickup the more portable syntax for my command-line.

Other than my preferences, I have nothing against this code to happen.

Just to make it clear -- NetBSD has system getopt(3), we just deliberately rejected -long -only syntax.

labath added a comment.Mar 1 2016, 4:22 AM

As long as both long and short options are still supported? Can you still type any of:

  • --arch=x86_64
  • --arch x86_64
  • -arch=x86_64
  • -arch x86_64
  • -ax86_64
  • -a x86_64

All the variants work (including -a=x86_64) except the "-ax86_64" version. Not sure if it's that important to maintain this one.

Are we able to track which options can be used with other options with the llvm solution? I didn't look too close. There were bits in the old option definitions which defined with options could be specified with which other options.

These bits were there, but they weren't used for any verification purposes, only for formatting the --help output, which I did not find altogether useful anyway, as it has too many options and variants (some of which are wrong).

lldb -h
lldb -v [[--] <PROGRAM-ARG-1> [<PROGRAM_ARG-2> ...]]
lldb -a <arch> -f <filename> [-c <filename>] [-s <filename>] [-o <none>] [-S <filename>] [-O <none>] [-k <none>] [-K <filename>] [-Q] [-b] [-e] [-x] [-X] [-l <script-language>] [-d] [[--] <PROGRAM-ARG-1> [<PROGRAM_ARG-2> ...]]
lldb -n <process-name> -w [-s <filename>] [-o <none>] [-S <filename>] [-O <none>] [-k <none>] [-K <filename>] [-Q] [-b] [-e] [-x] [-X] [-l <script-language>] [-d]
lldb -p <pid> [-s <filename>] [-o <none>] [-S <filename>] [-O <none>] [-k <none>] [-K <filename>] [-Q] [-b] [-e] [-x] [-X] [-l <script-language>] [-d]
lldb -P
lldb -r [<none>] -R <none>

We can't get llvm to produce this kind of output for us, but we can use "option categories". By default, we have "general" and "generic" categories ( :P ), but I was thinking about adding a "repl" category (for --repl, --repl-language) and a "scripting category" (all --source and --one-line options, --batch, etc.). IMHO, this would help the output readability more than the previous "option sets" variant.

This would not help the enforcement of option combinations in a declarative manner, but as we have introspection, it wouldn't be too hard to express "no option from the 'repl' category can be used with other options" with a bit of code.

The other thing that might throw people for a loop is if llvm doesn't support this style:

% lldb /bin/ls --arch=x86_64 -- -lAF

This works as expected.

So, yes, there are tiny regressions, but these are not important IMO, and I think this makes a nice code simplification. However, if I'm not interested in shoving this in if people don't approve, so... let me know what you think

I think there's a lot of value in having lldb's help output and cl syntax
match all of the other llvm tools that people are already familiar with.
Simplicity through consistency. So lgtm

labath added a subscriber: jingham.Mar 2 2016, 1:42 AM

Greg, Jim,

any objections?

IIUC, we can't get out of the business of supporting the HostGetOpt on freebsd & windows since we can't use the llvm version for the command line. So I don't see that it much eases the support burden. The differences in the help output don't seem significant, which doesn't argue strongly one way or the other.

So while I'm not sure I see the point of doing this, I don't see it doing any harm either.

clayborg requested changes to this revision.Mar 2 2016, 10:18 AM
clayborg edited edge metadata.

So this is nice that you cleaned up this one place where we can use LLVM's option parsing stuff, but it now makes the driver inconsistent with the rest of LLDB. It also makes it so you have to link some LLVM .a files into the lldb command line driver which we didn't have before. If we are going to make a change away from getopt_long(), I would rather change everything over. Otherwise we have our command line that behaves one way, and all of our command interpreter commands behaving another. And if we use option parsing from the driver, I would rather have the one consistent way that we handle options be exported through the lldb::SB API with something like:

lldb::SBOptionParser
lldb::SBOption

So while it seems nice to update just the driver, I would really rather us have one consistent way that we parse all command line options. For this to happen we would need to:

  • switch over to using LLVM's option parser everywhere
  • export the option stuff through the lldb::SB API so the driver can use it without having to directly link against llvm
  • make the LLVM option parser self contained so it doesn't use global variables where each option parser has its own data (thread safe and safe to use concurrently)

In general we have been trying to do this:

  • lldb.so or LLDB.framework can use llvm and clang APIs directly, nothing outside should
  • if LLDB needs to export something that is fragile (like the LLVM option parser) it needs to wrap it up in a public API that will remain stable
This revision now requires changes to proceed.Mar 2 2016, 10:18 AM

Our current consistent way of parsing options is to use getopt_long() everywhere.

Our current consistent way of parsing options is to use getopt_long() everywhere.

Actually with usage of getopt_long_only() which is unportable.

labath abandoned this revision.Mar 3 2016, 1:03 AM

Ok, fair enough. I'll see if I can do something like that in the future, but I'm going to abandon this for the moment.

Feel free to check this in a branch if you want to keep it around for when you later work on this so you don't lose the changes?