Shell32.dll depends on gdi32.dll and user32.dll, which are mostly DLLs
for Windows GUI functionality. LLVM's utilities don't typically need GUI
functionality, and loading these DLLs seems to be slowing down startup.
Also, we already have an implementation of Windows command line
tokenization in cl::TokenizeWindowsCommandLine, so we can just use it.
The goal is to get the original argv in UTF-8, so that it can pass
through most LLVM string APIs. A Windows process starts life with a
UTF-16 string for its command line, and it can be retrieved with
GetCommandLineW from kernel32.dll.
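The UTF-16 to UTF-8 step can be illustrated with a small portable sketch.
This is not the conversion routine LLVM uses: utf16ToUtf8 is a hypothetical
helper written for illustration, and a std::u16string stands in for the
buffer that GetCommandLineW would return on Windows.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper for illustration only; not an LLVM API.
// Converts UTF-16 to UTF-8. Unpaired surrogates become U+FFFD.
std::string utf16ToUtf8(const std::u16string &In) {
  std::string Out;
  for (std::size_t I = 0; I < In.size(); ++I) {
    char32_t CP = In[I];
    if (CP >= 0xD800 && CP <= 0xDBFF) {
      // High surrogate: combine with the following low surrogate if present.
      if (I + 1 < In.size() && In[I + 1] >= 0xDC00 && In[I + 1] <= 0xDFFF) {
        CP = 0x10000 + ((CP - 0xD800) << 10) + (In[I + 1] - 0xDC00);
        ++I;
      } else {
        CP = 0xFFFD;
      }
    } else if (CP >= 0xDC00 && CP <= 0xDFFF) {
      CP = 0xFFFD; // Stray low surrogate.
    }

    // Encode the code point as 1 to 4 UTF-8 bytes.
    if (CP < 0x80) {
      Out += static_cast<char>(CP);
    } else if (CP < 0x800) {
      Out += static_cast<char>(0xC0 | (CP >> 6));
      Out += static_cast<char>(0x80 | (CP & 0x3F));
    } else if (CP < 0x10000) {
      Out += static_cast<char>(0xE0 | (CP >> 12));
      Out += static_cast<char>(0x80 | ((CP >> 6) & 0x3F));
      Out += static_cast<char>(0x80 | (CP & 0x3F));
    } else {
      Out += static_cast<char>(0xF0 | (CP >> 18));
      Out += static_cast<char>(0x80 | ((CP >> 12) & 0x3F));
      Out += static_cast<char>(0x80 | ((CP >> 6) & 0x3F));
      Out += static_cast<char>(0x80 | (CP & 0x3F));
    }
  }
  return Out;
}
```

Converting the whole command line once, rather than each argument
separately, means everything downstream can work on a single UTF-8 string.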
Previously, we would:
- Get the wide command line
- Call CommandLineToArgvW to handle quoting rules and separate it into arguments
- For each wide argument, expand wildcards (* and ?) using FindFirstFileW.
- Convert each argument to UTF-8
Now we:
- Get the wide command line and convert the whole thing to UTF-8
- Tokenize the UTF-8 command line with cl::TokenizeWindowsCommandLine
- For each argument, expand wildcards if present
  - This requires converting back to UTF-16 to call FindFirstFileW
  - Results of FindFirstFileW must be converted back to UTF-8
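The new flow can be sketched in portable C++ as follows. This is a
simplified illustration under stated assumptions, not the code in this
change: tokenizeSketch is a hypothetical stand-in for
cl::TokenizeWindowsCommandLine that handles only whitespace, double quotes,
and backslashes before a quote; hasWildcard, buildArgv, and the GlobFn
callback are also hypothetical names. The callback stands in for the
FindFirstFileW lookup together with its UTF-16 round trip.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <vector>

// Simplified, hypothetical stand-in for cl::TokenizeWindowsCommandLine.
std::vector<std::string> tokenizeSketch(const std::string &Cmd) {
  std::vector<std::string> Args;
  std::string Cur;
  bool InQuotes = false, HaveToken = false;
  for (std::size_t I = 0; I < Cmd.size(); ++I) {
    char C = Cmd[I];
    if (C == '\\') {
      // Count the run of backslashes, then look at what follows it.
      std::size_t N = 0;
      while (I < Cmd.size() && Cmd[I] == '\\') {
        ++N;
        ++I;
      }
      if (I < Cmd.size() && Cmd[I] == '"') {
        Cur.append(N / 2, '\\'); // Each pair collapses to one backslash.
        if (N % 2 == 1)
          Cur += '"'; // Odd count: the quote is literal.
        else
          InQuotes = !InQuotes; // Even count: the quote toggles quoting.
      } else {
        Cur.append(N, '\\'); // Not before a quote: backslashes are literal.
        --I;                 // Re-examine the character after the run.
      }
      HaveToken = true;
    } else if (C == '"') {
      InQuotes = !InQuotes;
      HaveToken = true; // An empty "" still produces an argument.
    } else if ((C == ' ' || C == '\t') && !InQuotes) {
      if (HaveToken) {
        Args.push_back(Cur);
        Cur.clear();
        HaveToken = false;
      }
    } else {
      Cur += C;
      HaveToken = true;
    }
  }
  if (HaveToken)
    Args.push_back(Cur);
  return Args;
}

// Expansion is attempted only when the argument contains '*' or '?'.
bool hasWildcard(const std::string &Arg) {
  return Arg.find_first_of("*?") != std::string::npos;
}

// Hypothetical callback: takes a UTF-8 pattern, returns UTF-8 matches.
using GlobFn = std::function<std::vector<std::string>(const std::string &)>;

std::vector<std::string> buildArgv(const std::string &Utf8Cmd,
                                   const GlobFn &Glob) {
  std::vector<std::string> Argv;
  for (const std::string &Arg : tokenizeSketch(Utf8Cmd)) {
    if (!hasWildcard(Arg)) {
      Argv.push_back(Arg); // Common case: no conversion back to UTF-16.
      continue;
    }
    std::vector<std::string> Matches = Glob(Arg);
    if (Matches.empty())
      Argv.push_back(Arg); // Assumption: keep the pattern if nothing matches.
    else
      Argv.insert(Argv.end(), Matches.begin(), Matches.end());
  }
  return Argv;
}
```

In this sketch, only arguments that actually contain a wildcard pay for
the round trip through UTF-16; all other arguments stay in UTF-8 from
tokenization onward.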