This is an archive of the discontinued LLVM Phabricator instance.

clean up the implementation of PythonCallable::GetNumArguments
ClosedPublic

Authored by lawrence_danna on Oct 15 2019, 8:53 AM.

Details

Summary

The current implementation of PythonCallable::GetNumArguments
is not exception safe, has weird semantics, and is just plain
incorrect for some kinds of functions.

Python 3.3 introduces inspect.signature, which lets us easily
query for function signatures in a sane and documented way.

This patch leaves the old implementation in place for < 3.3,
but uses inspect.signature for modern pythons. It also leaves
the old weird semantics in place, but with FIXMEs grousing about
it. We should update the callers and fix the semantics in a
subsequent patch. It also adds some tests.

Event Timeline

lawrence_danna created this revision.Oct 15 2019, 8:53 AM
Herald added a project: Restricted Project. · View Herald TranscriptOct 15 2019, 8:53 AM

put in missing consumeError()

Harbormaster completed remote builds in B39592: Diff 225083.

Thanks for jumping onto this. Apart from the inline comments, I have one high level question: What are the cases that the old method is wrong for? Was it just builtin functions, or are there other cases too? Is it possible to fix it to work for builtings too? (okay, those were three questions)

What I am wondering is how important it is to maintain two methods for fetching the argument information. Builtin functions are not likely to be used as lldb callbacks, so if it's just those, it may be sufficient to just leave a TODO to use the new method once we are in a position to require python>=3.3.

lldb/source/Plugins/ScriptInterpreter/Python/PythonDataObjects.cpp
59–62

Btw, what are the reasons that this can fail? The string object was created two lines above, so we definitely know it's a string. If the call cannot fail in this particular circumstance, then a one can express that (and make the code simpler) with return cantFail(str.AsUTF8()).

828

Make this a raw string? Among other benefits, it means clang-format will not reflow it weirdly...

865–892

Likewise, should most of these takeErrors be cantFail too ?

lldb/source/Plugins/ScriptInterpreter/Python/PythonDataObjects.h
352

Maybe it's time for a utility function like py2_const_cast(format) ?

lldb/unittests/ScriptInterpreter/Python/PythonDataObjectsTests.cpp
643–647

If you implemented operator== and operator<<(raw_ostream&) for the ArgInfo class, you could write this as EXPECT_THAT_EXPECTED(lambda.GetArgInfo(), llvm::HasValue(ArgInfo(1, false, false))

lawrence_danna marked 8 inline comments as done.

review fixes

lawrence_danna added inline comments.Oct 16 2019, 1:52 PM
lldb/source/Plugins/ScriptInterpreter/Python/PythonDataObjects.cpp
59–62

It is possible!

chr(0xDFFF).encode('utf8')
865–892

Yea, i guess given that we control the script there's really no reasonable way these can fail if the first check doesn't.

lldb/unittests/ScriptInterpreter/Python/PythonDataObjectsTests.cpp
643–647

eh, do i have to? I actually like having them each on their own line. It makes the patches nicer when members are added or removed, and i'd like to remove all the members except max_positional_args.

Thanks for jumping onto this. Apart from the inline comments, I have one high level question: What are the cases that the old method is wrong for? Was it just builtin functions, or are there other cases too? Is it possible to fix it to work for builtings too? (okay, those were three questions)

What I am wondering is how important it is to maintain two methods for fetching the argument information. Builtin functions are not likely to be used as lldb callbacks, so if it's just those, it may be sufficient to just leave a TODO to use the new method once we are in a position to require python>=3.3.

Looks like the old implementation also doesn't work for class or static methods, or for objects with __call__ method. I added a bunch of tests for script commands with various callable types in the followon patch.

fixup: VAR_KEYWORD

Thanks for jumping onto this. Apart from the inline comments, I have one high level question: What are the cases that the old method is wrong for? Was it just builtin functions, or are there other cases too? Is it possible to fix it to work for builtings too? (okay, those were three questions)

What I am wondering is how important it is to maintain two methods for fetching the argument information. Builtin functions are not likely to be used as lldb callbacks, so if it's just those, it may be sufficient to just leave a TODO to use the new method once we are in a position to require python>=3.3.

Looks like the old implementation also doesn't work for class or static methods, or for objects with __call__ method. I added a bunch of tests for script commands with various callable types in the followon patch.

Ok, I see... Would it be hard/impossible to fix that using the previous method, so that we have a unified implementation? Or would that result in a bunch of ifdefs too ? I'm sorry if that's obvious, but I don't know much about the C python api..

lldb/source/Plugins/ScriptInterpreter/Python/PythonDataObjects.cpp
59–62

Right. Thanks for explaining.

865–892

Yes, that's what I was thinking.

lldb/source/Plugins/ScriptInterpreter/Python/PythonDataObjects.h
317

Could you also add a comment that this is because py2 apis accept strings as "char *", and that one should be careful to not use it if the function can actually modify the string.

lldb/unittests/ScriptInterpreter/Python/PythonDataObjectsTests.cpp
643–647

No, you don't. These are mostly fine. The main thing I don't like is when people then try to "reduce duplication" and create a function like compare_arginfo, with the EXPECTs inside. Then, when the checks fail, the error message points you to that function and you're left guessing which of the compare invocations did this fail for. The only thing that can be said here is that if the ASSERT_THAT_EXPECTED macro fails it will terminate the execution of all the subsequent checks too, whereas if this was just one EXPECT call, then you'd be able to see results of all other tests too (and maybe notice a pattern). But that's something I can live with...

695

I don't see this defined anywhere. I guess it leaked from the other patch?

700

Raw string too?

lawrence_danna marked 3 inline comments as done.Oct 17 2019, 1:07 AM

Thanks for jumping onto this. Apart from the inline comments, I have one high level question: What are the cases that the old method is wrong for? Was it just builtin functions, or are there other cases too? Is it possible to fix it to work for builtings too? (okay, those were three questions)

What I am wondering is how important it is to maintain two methods for fetching the argument information. Builtin functions are not likely to be used as lldb callbacks, so if it's just those, it may be sufficient to just leave a TODO to use the new method once we are in a position to require python>=3.3.

Looks like the old implementation also doesn't work for class or static methods, or for objects with __call__ method. I added a bunch of tests for script commands with various callable types in the followon patch.

Ok, I see... Would it be hard/impossible to fix that using the previous method, so that we have a unified implementation? Or would that result in a bunch of ifdefs too ? I'm sorry if that's obvious, but I don't know much about the C python api..

I did fix the previous method for python2, at least for the callable types I could think of to test.

There must be some way to get something resembling the old method to work for python3, because inspect does it. But it would be harder to get right, and a lot less likely to keep being correct in future versions of python. And I don't think it's possible to have a single implementation like that that works for both 2 and 3 without ifdefs.

I was thinking that since python2 is EOL in like two months anyway, the best thing to do is to just keep the old implementation around until we drop python 2 support, and then delete it.

lawrence_danna marked 3 inline comments as done.

fixed

labath accepted this revision.Oct 17 2019, 1:50 AM

Looks like the old implementation also doesn't work for class or static methods, or for objects with __call__ method. I added a bunch of tests for script commands with various callable types in the followon patch.

Ok, I see... Would it be hard/impossible to fix that using the previous method, so that we have a unified implementation? Or would that result in a bunch of ifdefs too ? I'm sorry if that's obvious, but I don't know much about the C python api..

I did fix the previous method for python2, at least for the callable types I could think of to test.

There must be some way to get something resembling the old method to work for python3, because inspect does it. But it would be harder to get right, and a lot less likely to keep being correct in future versions of python. And I don't think it's possible to have a single implementation like that that works for both 2 and 3 without ifdefs.

Right, that's what I was thinking/fearing. I think I am satisfied with that explanation. Thanks for explaining.

I was thinking that since python2 is EOL in like two months anyway, the best thing to do is to just keep the old implementation around until we drop python 2 support, and then delete it.

Yeah, though I don't know if that means that lldb will drop support for it straight away. I haven't heard of any plans like that, though the next weeks dev meeting will probably be a good place to talk about it. Will you be there by any chance?

lldb/source/Plugins/ScriptInterpreter/Python/PythonDataObjects.cpp
822

static const char get_arg_info_script[] (one pointer more efficient)

This revision is now accepted and ready to land.Oct 17 2019, 1:50 AM
lawrence_danna marked an inline comment as done.

char * -> char[]

This revision was automatically updated to reflect the committed changes.