This is an archive of the discontinued LLVM Phabricator instance.

[lldb] Add -l/--language option to script command
ClosedPublic

Authored by JDevlieghere on Sep 1 2020, 6:43 PM.

Details

Summary

Make it possible to run the script command with a different language than currently selected.

$ ./bin/lldb -l python
(lldb) script -l lua
>>> io.stdout:write("Hello, Wolrd!\n")
Hello, Wolrd!

Because of the ability to pass anything after the script command, it is implemented as a CommandObjectRaw. In a first attempt I tried to change this to a CommandObjectParsed in order to have the existing infrastructure parse the language argument and have the script contents as the first argument. That didn't work out, first because it unconditionally removed quotes from within the script expression and second because it attempts to expand script commands between backticks. Given how easy it is to parse the language argument, I think it's warranted to keep the CommandObjectRaw and just do the parsing manually.

Diff Detail

Event Timeline

JDevlieghere requested review of this revision.Sep 1 2020, 6:43 PM
JDevlieghere created this revision.
kastiglione added inline comments.
lldb/source/Commands/CommandObjectScript.cpp
122

this is missing error handling for unsupported languages, such as script -l swift

lldb/test/Shell/ScriptInterpreter/Lua/lua-python.test
8

is a.out not a universal default?

lldb/test/Shell/ScriptInterpreter/Python/python.test
9

So this SyntaxError is from python because it's attempting to evaluate --language invalid print(...)? I think it'd be good to handle unknown languages, but maybe you have a reason for this?

kastiglione added inline comments.Sep 1 2020, 8:31 PM
lldb/source/Commands/CommandObjectScript.cpp
107–108

generally, lldb supports more than space delimited flags. For example I often type expr -lobjc -- ... with no space. A quick check shows --language=objc is also supported. Should these cases be handled too?

JDevlieghere planned changes to this revision.Sep 1 2020, 10:51 PM
JDevlieghere marked 3 inline comments as done.
JDevlieghere added inline comments.
lldb/source/Commands/CommandObjectScript.cpp
107–108

Yes, if that works for other command it should work for this command too. I was already on the fence about this, but supporting that would definitely tip the scales beyond the complexity I'm willing to accept to work around the limitations of CommandObjectParsed. I'll need to come up with a different solution.

122

I actually did that on purpose, I don't think either Python or Lua support statements that begin with -l or --language but I didn't want to intercept this unless it meant something to LLDB. But I agree that's not great UX.

lldb/test/Shell/ScriptInterpreter/Lua/lua-python.test
8

Not sure about Windows, but regardless I prefer to be explicit.

JDevlieghere edited reviewers, added: kastiglione; removed: l.frisken.Sep 1 2020, 10:52 PM
teemperor requested changes to this revision.Sep 2 2020, 1:31 AM
teemperor added a subscriber: teemperor.

So CommandObjectRaw does support arguments, but they way it works is that you need to have a delimiter for the 'raw' part which is --. If you have that, then you can just use invoke the normal option parsing like CommandObjectExpression does. The handcrafted implementation here adds a completely new lldb command syntax which has a 'raw' part without the -- which seems inconsistent (not saying that the syntax here is worse or better than script --language foo --, but it's just inconsistent with the way the rest of LLDB works). It's obviously also missing the other quirky things that are part of the command syntax (e.g., quoting arguments isn't supported and so on).

JDevlieghere marked 2 inline comments as done.
  • Remove custom parsing and use the command options insofar possible.
  • Require -- as a delimiter when language and code are specified together.

So CommandObjectRaw does support arguments, but they way it works is that you need to have a delimiter for the 'raw' part which is --. If you have that, then you can just use invoke the normal option parsing like CommandObjectExpression does. The handcrafted implementation here adds a completely new lldb command syntax which has a 'raw' part without the -- which seems inconsistent (not saying that the syntax here is worse or better than script --language foo --, but it's just inconsistent with the way the rest of LLDB works). It's obviously also missing the other quirky things that are part of the command syntax (e.g., quoting arguments isn't supported and so on).

Yes, I really wanted to avoid that, and after spending (way too much time) trying to hack that up I concluded that with the lossy conversion to Args that really wasn't an option (pun intended?). So using the delimiter is indeed what I ended up settling for. I did add a special case for the situation where you pass a language but no code, because I thought it would be silly to require you to type script --language lua -- to launch the interactive Lua interpreter.

labath added a comment.Sep 2 2020, 8:15 AM

because I thought it would be silly to require you to type script --language lua -- to launch the interactive Lua interpreter.

For better or worse, this is how expr works:

(lldb) expr --show-types
error: <user expression 0>:1:3: use of undeclared identifier 'show'
--show-types
  ^
error: <user expression 0>:1:8: use of undeclared identifier 'types'
--show-types
       ^
(lldb) expr --show-types --
Enter expressions, then terminate with an empty line to evaluate:
  1: ^D

I agree that silly, but maybe the same fix should then be applied to the expr command (and any other command with similar behavior).

I agree that silly, but maybe the same fix should then be applied to the expr command (and any other command with similar behavior).

Sure, we can use the same trick in other places that use OptionsWithRaw. I'll do that in a separate patch.

This revision is now accepted and ready to land.Sep 2 2020, 9:28 AM

I agree that silly, but maybe the same fix should then be applied to the expr command (and any other command with similar behavior).

Sure, we can use the same trick in other places that use OptionsWithRaw. I'll do that in a separate patch.

Actually, after discussion this with @teemperor offline, for the expo command we can't use that trick because a valid option might also be a valid expression. expr --flag would parse correctly if flag was an option, but it might also be an expression that decrements --flag.

kastiglione added inline comments.Sep 2 2020, 11:36 AM
lldb/source/Commands/CommandObjectScript.cpp
37

is the default's name available in a define? If maybe it'd be good to include it here? I don't actually know when/where this string gets printed.

JDevlieghere marked an inline comment as done.Sep 2 2020, 11:58 AM
JDevlieghere added inline comments.
lldb/source/Commands/CommandObjectScript.cpp
37

It's set dynamically as a debugger property, while the options here are static.

labath requested changes to this revision.Sep 3 2020, 12:21 AM

I agree that silly, but maybe the same fix should then be applied to the expr command (and any other command with similar behavior).

Sure, we can use the same trick in other places that use OptionsWithRaw. I'll do that in a separate patch.

Actually, after discussion this with @teemperor offline, for the expo command we can't use that trick because a valid option might also be a valid expression. expr --flag would parse correctly if flag was an option, but it might also be an expression that decrements --flag.

That's actually a very good point. But... that also applies to the script command. With python script --flag returns the value of the flag variable (negated twice). With lua, it executes the command "--flag", which is a comment. Neither of these are as useful as the c++ --flag, but they still create ambiguities. And I think these commands should disambiguate in the same way. No matter what we choose as the primary interpretation, the "other" meaning can always be obtained by adding a -- to the appropriate place, so it's only a matter of choosing the best default. Given the expr status quo, I'd stick with that.

This revision now requires changes to proceed.Sep 3 2020, 12:21 AM

Given the expr status quo, I'd stick with that.

I want to give a +1 to not copying expr. I think the tradeoff is not not worth it. My user perspective is that lldb could be improved by making more decisions with the user in mind, not lldb's engineers (to give some context). At this time, there's only -l/--language and I don't think the cases of ambiguity have enough likelihood to force the use of --.

Taking --language as an example, it's only valid python if language is a defined variable, and once you consider that it takes an argument, it becomes invalid in python (--language python or --language=python). For Lua, I can't think of practical cases for evaluating a comment, unless Lua uses comments for semantics/metaprogramming?. The common case of -l <language> <code> is not valid for either python or Lua.

I think it's a practical tradeoff to not require -- at this point. Will there be more flags added, I don't know, but if there are I also think it's not a major problem to require -- at a later time if the balances shift.

I have to agree with @kastiglione that this isn't very intuitive. I doubt that more than a handful of users are aware that "raw commands" are a thing, so I think this inconsistency is for once in favor of the user. LLDB commands are anyway not meant to be stable, so we can just fix this if this ever turns out to be a real issue.

Having said that, I think this shouldn't be done here but in its own patch with a less ad hoc implementation and some dedicated parsing tests. Good thing the command parsing implementation is so great, so that will be lots of fun for @JDevlieghere

lldb/source/Commands/CommandObjectScript.cpp
99

We -> we.

labath added a comment.Sep 4 2020, 5:45 AM

I'm not married to the current way we process commands, but I do value their consistency (both between different commands, and between a command and their documentation). This would make script behave differently than expr even though they have identical modes of operation (raw input, special handling of empty argument, etc.), and differently from its documentation. The help script command contains this:

Invoke the script interpreter with provided code and display any results.  Start the interactive
interpreter if no code is supplied.  Expects 'raw' input (see 'help raw-input'.)

I'm not sure how this is patch coded, but probably after the argument are added, this additional snippet will appear:

Important Note: Because this command takes 'raw' input, if you use any command options you must use
 ' -- ' between the end of the command options and the beginning of the raw input.

help raw-input says this:

<raw-input> -- Free-form text passed to a command without prior interpretation, allowing spaces
               without requiring quotes.  To pass arguments and free form text put two dashes ' -- '
               between the last argument and any raw input.

If we do want to change that, we should at least make sure all of these things reflect that. And yeah, that should be a separate patch (maybe a small RFC even).

FWIW, if we do want to change the handling of raw-commands, I don't think changing that for _all_ raw commands (including expr) would be such a bad thing. Users mostly interact with the expression command via the p alias, which already includes the -- thingy in its expansion. And for the advanced uses of these commands, we could still tell a consistent story (e.g. option processing stops at the first token which does not look like an option, and everything after that is considered to be the "raw" part).

JDevlieghere marked an inline comment as done.Sep 8 2020, 11:20 AM

Sounds reasonable to me. I'll split it up in two patches. I'll update this one to use the double dashes so we can land the functionality and keep things consistent with the rest of LLDB. I'll create another patch to do the handling of raw commands to use a heuristic.

labath accepted this revision.Sep 15 2020, 1:53 AM

This looks fine (sorry about the delay, I've been OOO).

lldb/test/Shell/ScriptInterpreter/Lua/lua.test
2

Did you add the -- here just for consistency? It believe it should not be necessary to use the double dash if the script command has no arguments (similar to how its possible to write expr 1+2 without any dashes (but expr -A -- 1+2 does require it)

This revision is now accepted and ready to land.Sep 15 2020, 1:53 AM

This looks fine (sorry about the delay, I've been OOO).

I hope you had a good time. Glad to have you back! :-)

lldb/test/Shell/ScriptInterpreter/Lua/lua.test
2

I think I meant to have both but updated the line without copying it first. I'll fix that before landing.

This revision was automatically updated to reflect the committed changes.
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptSep 15 2020, 9:40 AM
MaskRay added inline comments.
llvm/lib/Support/MemoryBuffer.cpp
460 ↗(On Diff #291960)

I guess this was unintended. Fixed in 03f1516d6075f42dce95bcf9fde3f6fde97abd35