This is an archive of the discontinued LLVM Phabricator instance.

RFC: [lldb] Fix editline unicode on Linux
ClosedPublic

Authored by jankratochvil on Jul 11 2021, 6:42 AM.

Details

Summary

[[ https://lists.llvm.org/pipermail/lldb-dev/2021-July/016961.html | [lldb-dev] proposed change to remove conditional WCHAR support in libedit wrapper ]]
There is already setlocale in lldb/source/Core/IOHandlerCursesGUI.cpp but that does not apply for Editline GUI editing.
Unaware how to make automated test for this, it requires pty.

Diff Detail

Event Timeline

jankratochvil created this revision.Jul 11 2021, 6:42 AM
jankratochvil requested review of this revision.Jul 11 2021, 6:42 AM
jankratochvil edited the summary of this revision. (Show Details)
teemperor requested changes to this revision.Jul 12 2021, 3:48 AM

In retrospective that's a rather obvious fix. Thanks!

This LGTM, but I think this deserves a comment about why we are setting this. Usually LLVM avoids the locale-dependent functions, so this code looks as-is a bit out of place in LLVM. A comment pointing out that editline calls locale-dependent functions seems good enough. Also two small nits as we're anyway touching this code.

FWIW, the Python interpreter runs in the same process as LLDB and Python code can change the locale of the current process. So in theory a user could have a script that does locale.setlocale(locale.LC_ALL, "C") and this way break editline again. But that seems such an obscure case that I don't think we should bother changing/restoring the locale when we enter/leave editline.

lldb/tools/driver/Driver.cpp
871 ↗(On Diff #357799)

Can you make this std::setlocale?

872 ↗(On Diff #357799)

I don't think we need this if we set LC_ALL?

This revision now requires changes to proceed.Jul 12 2021, 3:48 AM
jankratochvil marked an inline comment as done.Jul 12 2021, 1:18 PM
jankratochvil added inline comments.
lldb/tools/driver/Driver.cpp
871 ↗(On Diff #357799)

OK, I see <clocale> is for std::setlocale.

872 ↗(On Diff #357799)

Originally I have just copy-pasted it. But when testing it more now I think it makes some sense:

#include <locale.h>
#include <stdio.h>
#include <wctype.h>
int main(void) {
  printf("setlocale(LC_ALL  )=%s\n",setlocale(LC_ALL  ,""));
  printf("iswprint(0x17e)=%d\n",iswprint(0x17e));
  printf("setlocale(LC_CTYPE):%s\n",setlocale(LC_CTYPE,""));
  printf("iswprint(0x17e)=%d\n",iswprint(0x17e));
  return 0;
}
$ LANG=C LC_NAME=foobar LC_CTYPE=en_US.UTF-8 ./setlocale
setlocale(LC_ALL  )=(null)
iswprint(0x17e)=0
setlocale(LC_CTYPE):en_US.UTF-8
iswprint(0x17e)=1

Because: Setting all of the categories of the locale of the process is similar to successively setting each individual category of the locale of the process, except that all error checking is done before any actions are performed.

teemperor accepted this revision.Jul 13 2021, 2:15 AM

LGTM, thanks!

lldb/tools/driver/Driver.cpp
872 ↗(On Diff #357799)

Good point, didn't think of that case.

This revision is now accepted and ready to land.Jul 13 2021, 2:15 AM
This revision was landed with ongoing or failed builds.Jul 13 2021, 4:01 AM
This revision was automatically updated to reflect the committed changes.
jankratochvil marked an inline comment as done.