This is an archive of the discontinued LLVM Phabricator instance.

Do not enforce absolute path argv0 in windows
ClosedPublic

Authored by takuto.ikuta on May 31 2018, 3:21 AM.

Details

Summary

Even if we support no-canonical-prefix on clang-cl(https://reviews.llvm.org/D47480), argv0 becomes absolute path in clang-cl and that embeds absolute path in /showIncludes.

This patch removes such full path normalization from InitLLVM on windows, and that removes absolute path from clang-cl output (obj/stdout/stderr) when debug flag is disabled.

Diff Detail

Event Timeline

takuto.ikuta created this revision.May 31 2018, 3:21 AM
takuto.ikuta edited the summary of this revision. (Show Details)
ruiu added a subscriber: ruiu.May 31 2018, 5:52 AM

Looks like this patch contains two unrelated changes. Please separate your change from the lit config changes.

I was under the impression that some tools rely on the fact that arg[0] is always expanded to an absolute path. Does this work with lldb and its test suite?

takuto.ikuta edited the summary of this revision. (Show Details)

Looks like this patch contains two unrelated changes. Please separate your change from the lit config changes.

First patch made on some wrong assumption, fixed and reverted test config change.

I was under the impression that some tools rely on the fact that arg[0] is always expanded to an absolute path. Does this work with lldb and its test suite?

I tried, but there is no check-lldb target. Can I ask you what is the target name to run lldb test suite?

rnk added a comment.May 31 2018, 10:27 AM

I think this would be easy to unit test in llvm/unittests/Support/CommandLine.cpp. We'd just check that the filename is "SupportTests.exe" on Windows and the path is relative after calling this, I guess. Right? Look at how ProgramTest.cpp does this to get a reasonable argv0:

sys::fs::getMainExecutable(TestMainArgv0, &ProgramTestStringArg1);

I was under the impression that some tools rely on the fact that arg[0] is always expanded to an absolute path. Does this work with lldb and its test suite?

I tried, but there is no check-lldb target. Can I ask you what is the target name to run lldb test suite?

The LLDB test suite isn't in very good shape on Windows. It is complicated to set up and build, I don't want to block this fix on @takuto.ikuta setting up that build environment. This is a Windows-only change, and I believe it makes it more consistent with Linux, so as long as check-llvm, check-clang, and check-lld pass, this should be relatively safe.

In D47578#1117874, @rnk wrote:

The LLDB test suite isn't in very good shape on Windows. It is complicated to set up and build, I don't want to block this fix on @takuto.ikuta setting up that build environment. This is a Windows-only change, and I believe it makes it more consistent with Linux, so as long as check-llvm, check-clang, and check-lld pass, this should be relatively safe.

OK, I'll take rnk's word for that.

ruiu added inline comments.May 31 2018, 11:24 AM
llvm/lib/Support/Windows/Process.inc
236

It looks like this function is a bit too complicated. I'd define a function that expands DOS-style 8.3 path and adds missing extension, and use that function.

Add test and split function

takuto.ikuta marked an inline comment as done.May 31 2018, 9:12 PM
In D47578#1117874, @rnk wrote:

I think this would be easy to unit test in llvm/unittests/Support/CommandLine.cpp. We'd just check that the filename is "SupportTests.exe" on Windows and the path is relative after calling this, I guess. Right? Look at how ProgramTest.cpp does this to get a reasonable argv0:

sys::fs::getMainExecutable(TestMainArgv0, &ProgramTestStringArg1);

Added test.

I was under the impression that some tools rely on the fact that arg[0] is always expanded to an absolute path. Does this work with lldb and its test suite?

I tried, but there is no check-lldb target. Can I ask you what is the target name to run lldb test suite?

The LLDB test suite isn't in very good shape on Windows. It is complicated to set up and build, I don't want to block this fix on @takuto.ikuta setting up that build environment. This is a Windows-only change, and I believe it makes it more consistent with Linux, so as long as check-llvm, check-clang, and check-lld pass, this should be relatively safe.

I confirmed that this patch passed check-llvm, check-clang and check-lld.

I tried to test lldb, but failed to build. lldb looks cannot be built with MSVC 2017 15.7.2

FAILED: tools/lldb/tools/lldb-mi/CMakeFiles/lldb-mi.dir/MIDriverMain.cpp.obj
c:\src\chromium\src\third_party\llvm-build\Release+Asserts\bin\clang-cl.exe  /nologo -DGTEST_HAS_RTTI=0 -DIMPORT_LIBLLDB -DLLDB_CONFIGURATION_RELEASE -DLLDB_DISABLE_CURSES -DLLDB_DISABLE_LIBEDIT -DLLDB_DISABLE_PYTHON -DLLDB_PYTHON_HOME=\"\" -DLLDB_USE_BUILTIN_DEMANGLER -DUNICODE -D_CRT_NONSTDC_NO_DEPRECATE -D_CRT_NONSTDC_NO_WARNINGS -D_CRT_SECURE_NO_DEPRECATE -D_CRT_SECURE_NO_WARNINGS -D_DEBUG_POINTER_IMPL="" -D_HAS_EXCEPTIONS=0 -D_SCL_SECURE_NO_DEPRECATE -D_SCL_SECURE_NO_WARNINGS -D_UNICODE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -Itools\lldb\tools\lldb-mi -IC:\src\git\llvm-project-20170507\lldb\tools\lldb-mi -Itools\lldb\include -IC:\src\git\llvm-project-20170507\lldb\include -Iinclude -IC:\src\git\llvm-project-20170507\llvm\include -IC:\src\git\llvm-project-20170507\llvm\..\clang\include -Itools\lldb\..\clang\include /DWIN32 /D_WINDOWS   /Zc:inline /Zc:strictStrings /Oi /Zc:rvalueCast /Brepro /W4  -W -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -Wcovered-switch-default -Wdelete-non-virtual-dtor -Wstring-conversion -Wno-deprecated-declarations -Wno-unknown-pragmas -Wno-strict-aliasing -Wno-deprecated-register -Wno-vla-extension /MD /O2 /Ob2 /DNDEBUG   -wd4018 -wd4068 -wd4150 -wd4251 -wd4521 -wd4530  /EHs-c- /GR- /showIncludes /Fotools\lldb\tools\lldb-mi\CMakeFiles\lldb-mi.dir\MIDriverMain.cpp.obj /Fdtools\lldb\tools\lldb-mi\CMakeFiles\lldb-mi.dir\ -c C:\src\git\llvm-project-20170507\lldb\tools\lldb-mi\MIDriverMain.cpp
In file included from C:\src\git\llvm-project-20170507\lldb\tools\lldb-mi\MIDriverMain.cpp:37:
C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.14.26428\include\csignal(19,13):  error: no member named 'sig_atomic_t' in the global namespace
using _CSTD sig_atomic_t; using _CSTD raise; using _CSTD signal;
      ~~~~~ ^
C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.14.26428\include\csignal(19,39):  error: no member named 'raise' in the global namespace
using _CSTD sig_atomic_t; using _CSTD raise; using _CSTD signal;
                                ~~~~~ ^
C:\Program Files (x86)\Microsoft Visual Studio\2017\Professional\VC\Tools\MSVC\14.14.26428\include\csignal(19,58):  error: no member named 'signal' in the global namespace
using _CSTD sig_atomic_t; using _CSTD raise; using _CSTD signal;
                                                   ~~~~~ ^
C:\src\git\llvm-project-20170507\lldb\tools\lldb-mi\MIDriverMain.cpp(74,10):  error: use of undeclared identifier 'SIGINT'
  signal(SIGINT, sigint_handler);
         ^
C:\src\git\llvm-project-20170507\lldb\tools\lldb-mi\MIDriverMain.cpp(185,10):  error: use of undeclared identifier 'SIGINT'
  signal(SIGINT, sigint_handler);
         ^
5 errors generated.
ruiu added a comment.Jun 1 2018, 10:17 AM

It seems like you are trying a bit too hard to keep the original code which is not always a good idea to keep the clarity of the code.

So, IIUC, you this function:

  • returns a full filename of the current executable. That can be written using GetModuleFileName and GetLongPathName

then you basically do the following to

sys::path::remove_filename(Arg0);
sys::path::append(Arg0, sys::path::filename(ExectuablePath));
llvm/lib/Support/Windows/Process.inc
211–212

This function is used only by your new function, so it is probably better to inline it.

236

You can always ignore Argv0, no? Since GetModuleFileName is always available, you probably should use it unconditionally, ignoring the original argv[0].

261

I believe GetModuleFileName always returns a path with an extension, though it might be 8.3 path.

278–279

Don't convert utf-8 to utf-16 only to convert it back to utf-8 immediately after returning from this function.

Address comments

takuto.ikuta marked 3 inline comments as done.Jun 4 2018, 2:36 AM

Thank you for review

llvm/lib/Support/Windows/Process.inc
261

I meant to say about filename in original argv0 here.

ruiu added a comment.Jun 6 2018, 2:36 PM

It seems you converted the same string back and forth between UTF-8 and UTF-16 to call Windows functions and LLVM functions. That's not only a waste of time (which is not a big problem) but complicates the code.

I'd define std::error GetExecutableName(std::string &Result) which returns a filename (fully expanded but not a fullpath) of the current executable in UTF-8. Then, add something like this at the *end* of GetCommandLineArguments():

SmallVector<char> Arg0(Args[0], strlen(Args[0]);
sys::path::remove_filename(Arg0);
GetExecutableName(Filename);
sys::path::append(Arg0, Filename);
Args[0] = AllocateString(Arg0, Alloc);
ruiu added a comment.Jun 6 2018, 2:46 PM

So the best practice is, when you get a UTF-16 string from an Windows API, you should convert it to UTF-8 as soon as you can so that you can always process strings as UTF-8. Likewise, you should always keep all string in UTF-8 in memory and convert them to UTF-16 just before passing them to an Windows API if necessary. If you find yourself writing UTF-8 ↔ UTF-16 conversion code too frequently, that's like a signal that you are not doing text processing in UTF-8. Basically you need to do coding conversion up to only once for each new string.

I see. Changed not to convert many times.

I confirmed that this passed check-lld, check-llvm and check-clang. If this looks good for you, can I ask you to merge this?

Thanks.

ruiu added inline comments.Jun 7 2018, 12:51 PM
llvm/lib/Support/Windows/Process.inc
240

The intention of the code is to return a success, so it is less confusing if you directly return a success (i.e. std::error_code())

261–262

Looks like defining a variable on at a time is more common than defining two variables in one line.

takuto.ikuta marked 2 inline comments as done.

Can the patch be merged this time?

ruiu added inline comments.Jun 8 2018, 2:35 PM
llvm/lib/Support/Windows/Process.inc
216

Can't this be size_t?

227

and eliminate this static_cast.

234

ec -> EC

251–252

EC

255–256

I

thakis added a comment.Jun 8 2018, 5:30 PM

ruiu: This review has now gone on for a week, with one cycle per day due to timezones. Since the comments are fairly minor nits, do you think you could address them yourself while landing this, in the interest of not spending another week on this?

ruiu accepted this revision.Jun 8 2018, 5:55 PM

Well, I don't think that pointing out that we shouldn't convert strings between UTF8 and UTF16 back and forth is nitpicking, but yeah, this has already been addressed, so feel free to submit. LGTM.

This revision is now accepted and ready to land.Jun 8 2018, 5:55 PM

Follow llvm style

takuto.ikuta marked 5 inline comments as done.Jun 11 2018, 1:45 AM

Rui-san, can I ask you to merge this?

hans added a subscriber: hans.Jun 13 2018, 7:37 AM

Rui-san, can I ask you to merge this?

Committed here: http://llvm.org/viewvc/llvm-project?view=revision&revision=334602

takuto.ikuta closed this revision.Jun 15 2018, 2:26 AM