This is an archive of the discontinued LLVM Phabricator instance.

Support for Python 3 in libclang python bindings
ClosedPublic

Authored by jbcoe on Oct 28 2016, 8:42 AM.

Details

Summary

Python bindings tests now pass in Python 3.

map in Python 3 is lazily evaluated so the method by which functions are registered needed updating.

Strings are unicode in Python 3 not UTF-8, I've tried to create an new c_types-like class (c_string_p) to automate the conversion.

String conversions made explicit where required.

It would be useful to add Python 3 tests to continuous integration.

Diff Detail

Repository
rL LLVM

Event Timeline

jbcoe updated this revision to Diff 76197.Oct 28 2016, 8:42 AM
jbcoe retitled this revision from to Incomplete support for Python 3 in libclang python bindings.
jbcoe updated this object.
jbcoe set the repository for this revision to rL LLVM.
jbcoe added a subscriber: cfe-commits.
jbcoe updated this revision to Diff 76338.Oct 30 2016, 11:21 AM
jbcoe retitled this revision from Incomplete support for Python 3 in libclang python bindings to Support for Python 3 in libclang python bindings.
jbcoe updated this object.

Python bindings tests now pass in Python 3.

jbcoe updated this revision to Diff 76344.Oct 30 2016, 3:21 PM
jbcoe removed rL LLVM as the repository for this revision.

Remove mistakenly committed debugging output.

compnerd accepted this revision.Nov 2 2016, 10:06 AM
compnerd edited edge metadata.

It would be nice if there is a simple way to handle the possible performance impact for python 2. Worst case, we can deal with it when it becomes an issue.

bindings/python/clang/cindex.py
518 ↗(On Diff #76344)

IIRC, range and xrange did have some performance difference. This would slow down the bindings on python 2. The difference is obviously not immediately visible unless count is very high. I wonder if we can do anything here to detect the python version and dispatch to xrange in python 2.

This revision is now accepted and ready to land.Nov 2 2016, 10:06 AM
mgorny added a subscriber: mgorny.Nov 2 2016, 10:29 AM

A few comments/questions. However, please note that those are generic Python comments and I haven't used or tested the clang Python API yet.

bindings/python/clang/cindex.py
77 ↗(On Diff #76344)

What is the Python3 version you're aiming to support? If I recall correctly, u"" is allowed (again) since 3.4. And even then, the condition looks weird and makes me think a while before I figure out what it's supposed to mean.

518 ↗(On Diff #76344)

The difference is that range() used to construct a complete list in Python 2. In Python 3, xrange() (that uses iterator) got renamed to range().

If you want to avoid performance impact (not sure if it's really measurable here), you can do something alike C for loop:

i = 0
while i < count:
    #...
    i += 1

It's not really idiomatic Python though. OTOH, it won't take more lines than the conditional.

623 ↗(On Diff #76344)

Why are you changing this? The old version seems to be correct for Python3.

2371 ↗(On Diff #76344)

Seems to be mistakenly added.

2573 ↗(On Diff #76344)

I may be wrong but I think you could use a list comprehension here.

args_array = (c_string_p * len(args))([c_string_p(x) for x in args])

You can also try without [] to avoid the overhead of constructing list, if the function can take an iterator.

bindings/python/tests/cindex/test_translation_unit.py
62 ↗(On Diff #76344)

Could you try inverting this? Python 2.7 already has io.StringIO, so that branch is much more likely to work.

jbcoe updated this revision to Diff 76774.Nov 2 2016, 1:28 PM
jbcoe edited edge metadata.
jbcoe marked an inline comment as done.

Respond to review comments.

bindings/python/clang/cindex.py
77 ↗(On Diff #76344)

Replaced with a simple version check

518 ↗(On Diff #76344)

I've defined xrange to be range for python3. A bit confusing but then we can use xrange uniformly.

623 ↗(On Diff #76344)

filter has changed in Python 3 and the replaced code does not behave the same as this simple list comprehension. I've not delved deeper into why but see test failures without this change.

2573 ↗(On Diff #76344)

I can't get this to work with a comprension or generator, I agree either would be an improvement.

bindings/python/tests/cindex/test_translation_unit.py
62 ↗(On Diff #76344)

Python 2.7's io.StringIO needs a unicode string.

This revision was automatically updated to reflect the committed changes.

I get tons of errors when running these tests with python 3 (version 3.4.3) with a fresh build of clang and llvm. The number of failures vs errors vary wildly between each invocation, the number of total failures + errors stays the same, which led me to think there was some sort of memory corruption.

Running the tests under valgrind make them pass, but also shows errors which I assume are ctypes-related.

This is valgrind's output for  LD_LIBRARY_PATH=/home/meh/devel/sandbox/llvm/build/lib valgrind python3 -m nose tests.cindex.test_cursor:test_get_template_argument_value as an example

jbcoe added a comment.Nov 27 2016, 2:33 AM

Thanks Mathieu.

Tests ran without issues when I submitted my patch but I tested on macOS.

I'll have a look into this and see if I can reproduce the problem.

Hey, for what it's worth when skipping the free at this line: https://github.com/llvm-mirror/clang/blob/master/bindings/python/clang/cindex.py#L181 the issues are gone, I haven't investigated the reason why.

jbcoe added a comment.EditedDec 13 2016, 1:40 PM

Hi Mathieu,
What platform did you run the tests on? How did you run them? I don't get any failures when running tests on macOS 10.12.1.

command: env PYTHONPATH=$(pwd) LD_LIBRARY_PATH=/Users/jon/DEV/LLVM/build-ninja/lib nosetests-3.4 .

python: Python 3.5.2 (default, Oct 11 2016, 04:59:56)
[GCC 4.2.1 Compatible Apple LLVM 8.0.0 (clang-800.0.38)] on darwin

Python installed using homebrew.

I'm keen to get to the bottom of this failure. I've been distracted by other things recently.

Hey, I'm on Linux, Fedora 23 to be specific.

MathieuDuponchelle added a comment.EditedJan 11 2017, 1:03 AM

Hi Mathieu,
What platform did you run the tests on? How did you run them? I don't get any failures when running tests on macOS 10.12.1.

Hey @jbcoe, do you think you can have a look at this issue? Please see my earlier comment for a way to reproduce this (LD_LIBRARY_PATH=/home/meh/devel/sandbox/llvm/build/lib valgrind python3 -m nose tests.cindex.test_cursor:test_get_template_argument_value is a valid way to reproduce this issue)

Please note that I'm thankful for you doing the porting work, much appreciated, with this issue fixed I'll be perfectly happy :)

jbcoe added a comment.Jan 11 2017, 2:38 AM

@MathieuDuponchelle I'm still keen to get to the bottom of this failure. I've not had time to reproduce it yet.

I'm thankful for you taking a look at this and reporting the issue. We need to get this working well for everyone.

@MathieuDuponchelle I've set up a docker image with Fedora 23 and reproduced the errors as you reported.

I've no immediate idea what the problem might be. Might be worth looking with address-sanitizer.

@MathieuDuponchelle I've set up a docker image with Fedora 23 and reproduced the errors as you reported.

I've no immediate idea what the problem might be. Might be worth looking with address-sanitizer.

Hey, good to know!

Two questions:

Sadly I can't run valgrind on macOS Sierra.

I'll try and look at the effects of bindings/python/clang/cindex.py L181 in more detail today.

Just out of curiosity, will you be using the Python 3 bindings for anything? If so what?

jbcoe added a comment.Jan 13 2017, 8:50 AM

I've reverted this change and submitted https://reviews.llvm.org/D28682 to explicitly check the Python version.

I'll pick this up again, I do want Python 3 support.

I've reverted this change and submitted https://reviews.llvm.org/D28682 to explicitly check the Python version.

I'll pick this up again, I do want Python 3 support.

I'm not entirely sure why you reverted the patch altogether, apart from the problem I raised it worked flawlessly, and in any case did not create any problems when the bindings were used from python 2.