Page MenuHomePhabricator

allow arbitrary python streams to be converted to SBFile
ClosedPublic

Authored by lawrence_danna on Sep 29 2019, 12:23 AM.

Details

Summary

This patch adds SWIG typemaps that can convert arbitrary python
file objects into lldb_private::File.

A SBFile may be initialized from a python file using the
constructor. There are also alternate, tagged constructors
that allow python files to be borrowed, and for the caller
to control whether or not the python I/O methods will be
called even when a file descriptor is available.I

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
lawrence_danna marked an inline comment as done.Oct 3 2019, 12:17 PM
labath added inline comments.Oct 3 2019, 12:23 PM
lldb/source/Plugins/ScriptInterpreter/Python/PythonDataObjects.cpp
1299–1306

Sorry, I meant &&: IsPythonObjectValid() && Base::IsValid(). Basically, I'm trying to see if there's a reasonable way to reduce the number of classes floating around, and this PresumptivelyValidFile seems like it could be avoided...

1312

There's no need to put a semicolon after the body of the constructor.

1339

Are you sure you're looking at the right place (the comments tend to move around as you reorganize code)? I was asking why is PythonFile a friend of the BinaryPythonFile class. That doesn't seem relevant to the PyErr_Occured checks...

lawrence_danna marked 2 inline comments as done.Oct 3 2019, 12:23 PM
lawrence_danna added inline comments.
lldb/source/Plugins/ScriptInterpreter/Python/PythonDataObjects.cpp
1316–1325

they actually have different behavior. The OwnedPythonFile version chains into the base class, but the PythonIOFile version also flushes the stream on the python side if the file is borrowed. The idea is that closing any file should flush the buffer you've been writing too even if the file is borrowed. That way you don't get any weird surprises where you "close" a file and nothing gets written.

lawrence_danna marked 5 inline comments as done.

remove unnecessary friends

lldb/source/Plugins/ScriptInterpreter/Python/PythonDataObjects.cpp
1299–1306

If i did it that way, with && and no PresumptivelyValidFile then the python IO files would chain into the base File class and say they're invalid.

That's how I coded it up at first, I didn't notice I needed PresumptivelyValidFile until I saw the python IO files coming up as invalid.

1339

oh, yea it got moved around. fixed.

lawrence_danna marked an inline comment as done.

semicolons

labath added inline comments.Oct 3 2019, 1:07 PM
lldb/source/Plugins/ScriptInterpreter/Python/PythonDataObjects.cpp
1299–1306

Yeah, but that's where the second part of this idea comes in. The python IO files can re-override IsValid() so that it does *not* chain into the base class, and just calls IsPythonObjectValid() from OwnedPythonFile. That arrangement seems to make sense to me, though I am willing to be convinced otherwise.

rm class PresumptivelyValidFile

lawrence_danna marked 2 inline comments as done.Oct 3 2019, 1:14 PM
lawrence_danna added inline comments.
lldb/source/Plugins/ScriptInterpreter/Python/PythonDataObjects.cpp
1299–1306

oh, yea that works.

lawrence_danna marked an inline comment as done.Oct 3 2019, 1:16 PM

Thanks. I was just about to hit approve, but then I noticed one other thing... :/ It seems that somebody (I guess it was @zturner) spent a lot of time in creating the whole PythonObject hierarchy, and it's (worthwhile) goal seems to be to enable the rest of the to code to avoid dealing with the python APIs directly. However, here you're ignoring all of that, and jumping to the C python interface directly.

I'd like to know what's your take on that? Have you considered basing these file classes on the PythonDataObject stuff?

The main reason I mention that is your comment about PyErr_Occurred, and the resulting boiler plate it produced. This seems like something that would be nicely solved by some c++-like wrapper, which is exactly what the PythonObject stuff is AFAICT.

Most of your interactions seem to be about calling methods. Would it be possible to add a PythonDataObject wrapper for this (and any other frequently used python API)? I'm hoping that we could have something like:

if (Expected<PythonInteger> i = m_py_obj.CallMethod<PythonInteger>(pybuffer))
  l_bytes_written = i->GetInteger();
else
  return Status(i.takeError());

instead of stuff like

auto bytes_written = Take<PythonObject>(
    PyObject_CallMethod(m_py_obj, "write", "(O)", pybuffer.get()));
if (PyErr_Occurred())
  return Status(llvm::make_error<PythonException>("Write"));
long l_bytes_written = PyLong_AsLong(bytes_written.get());
if (PyErr_Occurred())
  return Status(llvm::make_error<PythonException>("Write"));

It doesn't look like adding this should be too hard (including the template-based computation of the python signature), but it could go a long way towards improving the readability, and it might actually fix other issues too -- it seems that the PythonDataObjects currently completely ignore the PyErr_Occurred stuff, which seems to be wrong, and it could actually be the reason why @amccarth was unable to run the test suite with a debug python on windows. Improving that, even if it's just for the stuff that's needed for your work would be really helpful, as it would provide a pattern on how to fix the remaining issues.

I'm sorry that I'm bringing this up so late in the review, but I am also learning this stuff as we go along -- unfortunately, I don't think we have anyone really knowledgeable about the python stuff still active in the lldb community..

lldb/source/Plugins/ScriptInterpreter/Python/PythonDataObjects.cpp
1223

Maybe this GIL-talking can be done inside IsPythonSideValid (instead of two IsValid methods)?

1255

extra semicolon here too

1427

This is an example where _I think_ you may be using PyErr_Occurred incorrectly. If python really asserts that PyErr_Occurred is always called, then this is not right, as you will not call it if the result is null. There are a couple of other examples like that in the code too.

Most of your interactions seem to be about calling methods. Would it be possible to add a PythonDataObject wrapper for this (and any other frequently used python API)? I'm hoping that we could have something like...

I think this is a good idea, but there's a lot of places in PythonDataObjects that need to be fixed to check for exceptions and return Expected<>. I really think it ought to be done in a separate patch.

lawrence_danna marked 4 inline comments as done.

minor fixes

lldb/source/Plugins/ScriptInterpreter/Python/PythonDataObjects.cpp
1427

I think it is correct.

Python doesn't assert that PyErr_Occured is called, it asserts that the error is cleared before you call back in to another thing that can cause an error. PythonException will clear the error if there is one, or say "unknown error" if there isn't one (which should never happen if utf8==NULL, but i'm checking out of abundance of caution)

I agree about the separate patch stuff, but it seems to be that this should be done before this one. After all, all (most?) of the existing code has already been DataObject-ized and this patch is the thing that's deviating from that practice. I don't think you should rewrite all of the existing code -- hopefully we can find a way to fit this in there. For instance, the CallMethod thingy would be a new api, so we can have that return expected independently of the other APIs. I guess the main part you will need is (and that already existing in the non-Expected form) is the type conversions, but maybe there we can introduce a new kind of conversion, which is more strict in checking for errors. Or possibly have the current form return Expected<T>, but then have a helper function to suppress those errors and return a default-constructed T() for legacy uses (so legacy code would be sth like suppressError(object.AsType<PythonInteger>()))

lldb/source/Plugins/ScriptInterpreter/Python/PythonDataObjects.cpp
1427

Ok, thanks for the explanation.

lawrence_danna marked an inline comment as done.Oct 4 2019, 3:09 PM

better integration with PythonObject

I agree about the separate patch stuff, but it seems to be that this should be done before this one. After all, all (most?) of the existing code has already been DataObject-ized and this patch is the thing that's deviating from that practice. I don't think you should rewrite all of the existing code -- hopefully we can find a way to fit this in there.

Yea... I tried updating all the PythonDataObjects code to use Errors and Expected<> and it wound up being just way too big of a patch.

I've updated this patch to add a giant warning at the top of PythonDataObjects.h about the dangers, as well adding just enough Expected<> support to do what I need for this patch.

How's it look now?

I changed my mind about splitting this into another patch. When I first said that I had in mind a much more extensive rewrite of the PythonDataObjects. I think it's fine now as one patch -- though we should still go back later add Expected<> everywhere in PythonDataObjects.

python2 const fix

fix typemaps to deal with None correctly

add safe PythonModule functions and use them

split off generic exception handling stuff into a separate patch

@labath anything else for this one?

labath added a comment.Oct 9 2019, 6:52 AM

I think this is looking pretty good. I like how the read/write methods look now. Just one more question about the PythonBuffer class.

lldb/source/Plugins/ScriptInterpreter/Python/PythonDataObjects.cpp
1258–1265

Storing the Error as a member is quite unusual. It would be better to have a factory function which returns Expected<PythonBuffer>.

lawrence_danna marked an inline comment as done.Oct 9 2019, 9:33 AM
lawrence_danna added inline comments.
lldb/source/Plugins/ScriptInterpreter/Python/PythonDataObjects.cpp
1258–1265

Oh yea I’ll fix that

lawrence_danna marked an inline comment as done.

fixed pybuffer error handling weirdness

labath accepted this revision.Oct 9 2019, 11:41 AM
labath added inline comments.
lldb/source/Plugins/ScriptInterpreter/Python/PythonDataObjects.cpp
1284

I would slightly prefer if the PyObject_GetBuffer happened in the Create function (and the constructor took an already-constructed Py_buffer). That way, the call is next to the exception check and you avoid creating an "invalid" PythonBuffer object, even as an implementation detail.

This revision is now accepted and ready to land.Oct 9 2019, 11:41 AM

Pavel would slightly prefer if the PyObject_GetBuffer happened in the Create function

lawrence_danna marked an inline comment as done.Oct 9 2019, 12:14 PM
This revision was automatically updated to reflect the committed changes.
jankratochvil added a subscriber: jankratochvil.EditedOct 10 2019, 5:06 AM

Fixed now by rL374322, there was a regression - Linux Fedora 30 x86_64:

======================================================================
ERROR: test_exceptions (TestFileHandle.FileHandleTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/jkratoch/redhat/llvm-monorepo2/lldb/packages/Python/lldbsuite/test/python_api/file_handle/TestFileHandle.py", line 483, in test_exceptions
    self.assertRaises(TypeError, lldb.SBFile, None)
  File "/home/jkratoch/redhat/llvm-monorepo2/lldb/third_party/Python/module/unittest2/unittest2/case.py", line 534, in assertRaises
    callableObj(*args, **kwargs)
  File "/home/jkratoch/redhat/llvm-monorepo2-clangassert/lib/python2.7/site-packages/lldb/__init__.py", line 5334, in __init__
    this = _lldb.new_SBFile(*args)
NotImplementedError: Wrong number or type of arguments for overloaded function 'new_SBFile'.
  Possible C/C++ prototypes are:
    lldb::SBFile::SBFile()
    lldb::SBFile::SBFile(int,char const *,bool)
    lldb::SBFile::SBFile(lldb::FileSP)

Yeah, as Jan mentions, I've had to make some tweaks (r374331, r374322, r374299) in order to get this passing on all python and swig versions. Let me know if you have questions/issues/etc about any of those.