This is an archive of the discontinued LLVM Phabricator instance.

Fix FILE * leak in Python API
AbandonedPublic

Authored by fjricci on Mar 24 2016, 1:27 PM.

Details

Summary

This reverts some of the changes made by:
r257644
r250530
r250525

The changes made in these patches result in leaking the FILE* passed
to SetImmediateOutputFile. GetStream() will dup() the fd held by the
python caller and create a new FILE*. It will then pass this FILE*
to SetImmediateOutputFile, which always uses the flag
transfer_ownership=false when it creates a File from the FILE*.

Since transfer_ownership is false, the lldb File destructor will not
close the underlying FILE*. Because this FILE* came from a dup-ed fd,
it will also not be closed when the python caller closes its file.

Leaking the FILE* causes issues if the same file is used multiple times
by different python callers during the same lldb run, even if these
callers open and close the python file properly, as you can end up
with issues due to multiple buffered writes to the same file.

Diff Detail

Event Timeline

fjricci updated this revision to Diff 51593.Mar 24 2016, 1:27 PM
fjricci retitled this revision from to Fix FILE * leak in Python API.
fjricci updated this object.
fjricci added subscribers: sas, lldb-commits.
zturner requested changes to this revision.Mar 24 2016, 1:31 PM
zturner edited edge metadata.

This patch won't work. PyFile_AsFile doesn't exist on Python 3. Anything that you need done to make this work has to be wrapped up in PythonFile class as it was before, because that is the place where you can add Python version specific code to do different things with Py2 and Py3.

This revision now requires changes to proceed.Mar 24 2016, 1:31 PM

Because python 3 doesn't use FILE* as the underlying implementation anymore, I think the only way to make this work is to continue to use dup() to make the FILE*, but then to actually keep track of the ownership of the FILE*. This can't be done inside PythonFile, because the PythonFile will go out of scope before SetImmediateOutputFile is called. This leaves me thinking that best possibility would be to expose the transfer_ownership flag to the python API and then change the call to set this flag. However, this flag isn't currently exposed above the implementation level, and I'm not sure that it makes sense for it to be from an API point of view.

As an alternative fix, I can also change PythonFile to use PyFile_AsFile() on Python < 3, and the currently existing implementation on Python >= 3. However, this won't actually fix the FILE* leak on python 3.

zturner added a comment.EditedMar 24 2016, 3:16 PM

I'm confused about something. This is the implementation of File::GetStream.

if (!StreamIsValid())
{
    if (DescriptorIsValid())
    {
        const char *mode = GetStreamOpenModeFromOptions (m_options);
        if (mode)
        {
            if (!m_should_close_fd)
            {
                // We must duplicate the file descriptor if we don't own it because
                // when you call fdopen, the stream will own the fd
#ifdef _WIN32
                m_descriptor = ::_dup(GetDescriptor());
#else
                m_descriptor = dup(GetDescriptor());
#endif
                m_should_close_fd = true;
            }

            do
            {
                m_stream = ::fdopen (m_descriptor, mode);
            } while (m_stream == NULL && errno == EINTR);

            // If we got a stream, then we own the stream and should no
            // longer own the descriptor because fclose() will close it for us

            if (m_stream)
            {
                m_own_stream = true;
                m_should_close_fd = false;
            }
        }
    }
}
return m_stream;

If this is successful, the final if (m_stream) check passes, and it sets m_own_stream to true. This means that in the File destructor, it will call fclose. It doesn't matter that nobody calls close, because after you've fdopen'ed something, you're not supposed to call close, only fclose.

Am I missing something?

fjricci added inline comments.Mar 24 2016, 3:20 PM
scripts/Python/python-typemaps.swig
532 ↗(On Diff #51593)

The problem is that here, we save the FILE* ($1) and let the File (file) go out of scope. So the File gets destructed (but it's after calling file.Clear(), so the close doesn't happen). But we still hold onto the FILE*, and we pass it, as $1, to the API call (not seen here, but it comes right after this block in the generated swig cpp code).

When we pass the FILE* to SetImmediateOutputFile, it then generates a new File from this FILE*, but doesn't take ownership, because the transfer_ownership flag is always false. And the File destructor only closes files if it has ownership of the FILE*.

zturner added inline comments.Mar 24 2016, 3:23 PM
scripts/Python/python-typemaps.swig
532 ↗(On Diff #51593)

I think SWIG explicitly has a mechanism to handle this. Let me find it.

zturner added inline comments.Mar 24 2016, 3:25 PM
scripts/Python/python-typemaps.swig
532 ↗(On Diff #51593)

It's the freearg typemap.

Basically you can put a rule that just calls fclose() on the argument, and it will generate this code after it's called the API.

Would this work?

fjricci added inline comments.Mar 24 2016, 3:56 PM
scripts/Python/python-typemaps.swig
532 ↗(On Diff #51593)

So the issue is that we don't want to fclose() after SetImmediateOutputFile(), we want it to fclose() when we're done writing to the FILE*, which happens when the CommandReturnObject is destructed.

Could there be an overload of SBCommandReturnObject constructor that takes
an fd instead of a FILE*, and pass through to the private method which does
its own duping? Then this typemap could be changed to convert to int
instead of to FILE*

Also I'm OOO until tomorrow now, so I can't look at the code again until
then

Would we want this overloaded method taking an int (fd) instead of a FILE* to be present in the API? Or hidden somehow? I'm new to swig, but it looks like python-extensions.swig includes some private extensions to the CommandReturnObject class already.

Overloading with an fd argument seems quite a bit more complex than doing what we do now but setting transfer_ownership in the underlying CommandReturnObject::SetImmediateOutputFile() call to true (which could probably be done either with a new swig extension or by changing the api). Do you think it's worth the added complexity?

As long as it only changes the private API and not the public it's probably
fine, if you think that will work feel free to give it a try and upload a
new patch

It appears that %extend directives only have access to public class members. So my idea won't work. I'll try to overload with a fd call, presumably that call can be exposed in the API? Or is it better if it isn't?

The difficulty with overloading is that, for overloaded functions in swig, it looks like pointer arguments take precedence over integer arguments. But there must be a way to specify.

clayborg requested changes to this revision.Mar 25 2016, 9:37 AM
clayborg edited edge metadata.

Seems like we need to just add the following to the API:

void
SetImmediateOutputFile (FILE *fh, bool transfer_ownership);

void
SetImmediateErrorFile (FILE *fh, bool transfer_ownership);

This should solve our problems right? Note you must add these APIs, you can't remove or change existing ones since there are programs out there that are linking against the old API.

I think that would be sufficient to solve the problem.

I could use a %rename to rename the SetImmediateOutputFile (FILE *fh) call to a private function in %extend, which will then call SetImmediateOutputFile (FILE *fh, transfer_ownership=true) instead.

Does that seem reasonable? I don't see any %rename calls in the existing lldb source, which is my primary hesitation.

Actually, I think this may open a new can of worms. Adding that flag to the API doesn't necessarily achieve what we want. The current bug is that we take a Python file (owned by the Python caller), and we know that the Python caller does not want to transfer ownership of the file. So inside of the swig logic, we make a copy of the file, which is owned by lldb (although really owned by nobody, since lldb doesn't take ownership of it).

This means that the problem is that we take an argument and change its owner, which then causes us to want different behavior from the API function we're calling. This is why the original swig worked, it didn't do anything to change the owner of the file, so it was consistent on both sides of the swig interface.

The only caller that would actually want to tell lldb to take ownership is the swig wrapper, not the Python itself. (What would it mean if the python caller decided to set that flag to true? Presumably we'd then have to take ownership of the python file object as well, which I don't think is something we want, and I don't think it would even be possible, given python garbage collection.)

We don't necessarily need to expose this new API to swig. We get to pick and choose what we expose to Python in the SBCommandReturnObject.i file. So leave the .i file as is and use %rename to specify "transfer_ownership = true". Will that work?

fjricci updated this revision to Diff 51678.Mar 25 2016, 1:12 PM
fjricci edited edge metadata.

Extend SBCommandReturnObject API to allow for taking file ownership

fjricci updated this revision to Diff 51680.Mar 25 2016, 1:13 PM
fjricci edited edge metadata.

Correct whitespace change

clayborg requested changes to this revision.Mar 25 2016, 2:55 PM
clayborg edited edge metadata.

I would prefer to do this with overloading if possible. See inlined comments.

include/lldb/API/SBCommandReturnObject.h
89–93

If we end up using overloading, see below, then these should be marked as deprecated with a comment. We should also add some header doc to specify that the above two functions do not take ownership of the file handle.

95–102

It might be better to just overload:

void
SetImmediateOutputFile(FILE *fh, bool transfer_ownership);

void
SetImmediateErrorFile (FILE *fh, bool transfer_ownership);
scripts/interface/SBCommandReturnObject.i
87–94

Not sure if we can do this if we overload.

This revision now requires changes to proceed.Mar 25 2016, 2:55 PM
fjricci updated this revision to Diff 51698.Mar 25 2016, 4:31 PM
fjricci edited edge metadata.

Use function overloading

fjricci updated this revision to Diff 51699.Mar 25 2016, 4:32 PM
fjricci edited edge metadata.

Fix whitespace

clayborg accepted this revision.Mar 25 2016, 4:32 PM
clayborg edited edge metadata.

Looks good. Thanks for taking the time to get this working right.

sas added a comment.Mar 25 2016, 4:46 PM

Committed as r264476.

fjricci abandoned this revision.Mar 25 2016, 4:46 PM

Committed.