This is an archive of the discontinued LLVM Phabricator instance.

Fix formatting of wchar, char16, and char32
Needs ReviewPublic

Authored by zturner on Nov 1 2018, 11:50 AM.

Details

Reviewers
jingham
Summary

char16, char32, and wchar_t were previously broken. If you had a simple variable like wchar_t x = L'1' and wrote p x LLDB would output (wchar_t) x = 1\0. This is because it was using eFormatChar with a size of 2. What we needed was to introduce a special format specifically for wchar_t. The only valid sizes of wchar_t on all existing compilers are 2 and 4, so there's no real point trying to handle sizes of 1 and 8.

Along the way, I accidentally stumbled across the reason that references that char16 and char32 types were also formatting incorrectly, so I fixed that as well.

Diff Detail

Event Timeline

zturner created this revision.Nov 1 2018, 11:50 AM

There were a bunch of other tests testing wchar_t handling, all in:

lang/cpp/wchar_t

as well as some tests in:

functionalities/data-formatter/data-formatter-stl/libcxx/string/TestDataFormatterLibcxxString.py

Those tests weren't failed (except for the latter test, and that only for android/gmodules. Do you know why those tests were passing when the functionality was broken? Maybe they need fixing to be more accurate?

No, but thanks for the pointer. Interestingly, it worked for me on my home machine but not on my work machine, and I'm not sure what the difference between the two is.

Clearly the test in lang/cpp/wchar_t is making it into lldb_private::formatters::WCharSummaryProvider whereas locally I am not.

Update: It's because There was a problem with my PYTHONPATH and python was getting disabled at CMake configure time. So I was effectively running with LLDB_DISABLE_PYTHON. So basically it's only broke on the LLDB_DISABLE_PYTHON codepath.

So the deal is that we were relying on a summary formatter to print wchar_t before, and now you have a format option that handles them as well? Do we need both? Maybe the summary also handles wchar_t * strings?

As an aside, for reasons that are not entirely clear to me most of the data formatter code is #ifndef LLDB_DISABLE_PYTHON. That shouldn't be true, C++ based formatters (which all the built-in formatters are) should be able to work in the absence of Python. Figuring out why this is true is on my list of things to investigate some spare afternoon...

Side question, should we just kill the LLDB_DISABLE_PYTHON=0 codepath? It can be a headache to get LLDB linking with Python (particularly on Windows), but it would be nice to be able to delete all the preprocessor stuff.

It doesn't seem unreasonable to want to build lldb for smaller systems that don't have Python available, and in fact we do that internally at Apple. Actually, it DOES seem a little unreasonable to me because after all you can just run the debugserver/lldb-server and connect remotely. But I have not to date been able to convince the folks who have to work on said systems that they don't really want an on device lldb. Until I can win that argument - which I project happening only just shy of never - I'd rather not lose the ability to build lldb without Python.

Note also, the vast majority of the uses of LLDB_DISABLE_PYTHON are related to the requirement that we have Python to use any of the data formatter facilities. Those uses shouldn't be necessary. All the scripted interpreters should go through the generic ScriptInterpreter interface. There's a ScriptInterpreterNone that should stand in for the Python interpreter in every use except directly managing the Python interpreter. So there's no structural reason why we should need LLDB_DISABLE_PYTHON for anything but the initializers. We should just be able to not build the *Python.cpp files and use the define only to not initialize the Python script interpreter. Something got balled up in how the data formatters were implemented that this isn't true, IMHO.

If somebody wants to spend time looking at this, figuring how to untangle this would serve your purposes and also get the architecture back right at the same time.

labath added a subscriber: labath.Nov 3 2018, 3:11 AM

lldb-server is used on architectures that don't have python (readily) available, and it uses the same codebase as the rest of lldb. (Granted, it only needs a small subset of that codebase, and this subset doesn't/shouldn't care about python, but we aren't able to split out that part cleanly (yet)). So I don't think requiring python is a good idea.

What we could do is make it a build-time error if we failed to detect&configure python *AND* the user hasn't explicitly disabled it. This would catch this problem early and avoid other surprises later on (e.g. the entire dotest test suite only works with python enabled. without it, you'd likely get some weird error).