Windows command line argument processing treats consecutive double quotes
as a single double-quote. This patch implements this functionality.
Fixes PR39506.
Differential D58662
Handle consecutive-double-quotes in Windows argument parsing
Authored by Sunil_Srivastava on Feb 25 2019, 5:58 PM.
Event Timeline

Comment

In terms of practical effect, this patch allows users to do, on Windows,

    clang ... -DFOO="""ABC"""

on a file having

    const char *p = FOO;

Comment

lgtm

I experimented with this program to confirm the behavior matches MSVCRT:

    #include <stdio.h>
    extern "C" const char *GetCommandLineA(void);
    int main(int argc, char **argv) {
      puts(GetCommandLineA());
      for (int i = 0; i < argc; ++i) {
        puts(argv[i]);
      }
    }

    C:\src\llvm-project\build>t.exe -DFOO="""ABC"""
    t.exe -DFOO="""ABC"""
    t.exe -DFOO="ABC"

    C:\src\llvm-project\build>t.exe -DFOO=""" A BC"""
    t.exe -DFOO=""" A BC"""
    t.exe -DFOO=" A BC"

    C:\src\llvm-project\build>t.exe tok1 -DFOO=""" A BC""" tok3
    t.exe tok1 -DFOO=""" A BC""" tok3
    t.exe tok1 -DFOO=" A BC" tok3

    C:\src\llvm-project\build>t.exe tok1 -DFOO="" A BC""" tok3
    t.exe tok1 -DFOO="" A BC""" tok3
    t.exe tok1 -DFOO= A BC" tok3

Comment

Thanks for the lgtm Reid, but unfortunately our internal testing found a case where this patch causes trouble. Specifically (with the same string initialization that I listed),

    clang -c -DFOO=\""\\\\\\""\\"\" p.cpp

used to work but fails with this patch. Backslashes need to be recognized specially. So I will have to update this review after more study and experimentation.

Comment

Hi Reid,

Update: We did find two cases which 'work' before my proposed patch and will fail after it, but on further study we think they are defective test cases. Our behavior with this proposed patch will be similar to that of the Microsoft compiler on Windows hosts. Still, since these cases will start failing with this patch, I am going to describe them here.

Case 1:

    clang -c -DFOO=\""\\\\\\""\\"\" p.cpp

This currently compiles with FOO becoming a string of four backslashes, made of a first set of three and a second set of one. With the proposed patch it will fail. -E will show (with the source file listed above):

    const char *p = "\\\"\";

The Microsoft compiler does exactly the same thing.

It currently works because the consecutive double quotes (DQs) after the six backslashes terminate the first string and start the next, and therefore disappear themselves. With this patch they will survive as one DQ.

Case 2:

    clang -c """" p.cpp

This currently works because four DQs become two empty strings, which are concatenated into one empty string, which in turn gets filtered out, leaving just p.cpp. With the proposed patch four DQs will become a string of one DQ, which is not a valid argument, so we will get the error:

    clang: error: no such file or directory: '"'

In this case, the Microsoft compiler survives with a warning, but only because it accepts any unrecognized source file type with just a warning:

    $ cl /nologo -c """" p.cpp
    cl : Command line warning D9024 : unrecognized source file type '"', object file assumed
    cl : Command line warning D9027 : source file '"' ignored
    $ cl /nologo -c junk p.cpp
    cl : Command line warning D9024 : unrecognized source file type 'junk', object file assumed
    cl : Command line warning D9027 : source file 'junk' ignored

The clang driver is just more strict about this.

So, in summary, I believe this patch is valid, and if you still stand by your LGTM, I will be happy to check it in.
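
The quoting rules discussed in this thread can be sketched as a small standalone tokenizer. This is only an illustration, not the code in the patch: the function name tokenizeWindows is hypothetical, and the sketch assumes the MSVCRT-style rules as I understand them (a run of 2k backslashes before a double quote yields k backslashes and the quote acts as a delimiter; a run of 2k+1 yields k backslashes and a literal quote; inside a quoted region, two consecutive double quotes yield one literal quote and the region stays open).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper for illustration only; this is NOT LLVM's tokenizer.
// Assumed rules (MSVCRT-style):
//   - 2k backslashes + '"'   -> k backslashes, the quote is a delimiter
//   - 2k+1 backslashes + '"' -> k backslashes and a literal '"'
//   - '""' inside a quoted region -> one literal '"', region stays open
std::vector<std::string> tokenizeWindows(const std::string &cmd) {
  std::vector<std::string> args;
  std::string cur;
  bool inQuotes = false;  // currently inside a "..." region
  bool haveToken = false; // the current argument has started
  const std::size_t n = cmd.size();
  std::size_t i = 0;
  while (i < n) {
    const char c = cmd[i];
    if (c == '\\') {
      // Measure the whole run of backslashes before deciding what it means.
      std::size_t run = 0;
      while (i < n && cmd[i] == '\\') {
        ++run;
        ++i;
      }
      haveToken = true;
      if (i < n && cmd[i] == '"') {
        cur.append(run / 2, '\\');
        if (run % 2 == 1) {
          cur.push_back('"'); // odd run: the quote is escaped
          ++i;
        }
        // Even run: leave the quote for the next iteration (delimiter).
      } else {
        cur.append(run, '\\'); // backslashes not before a quote are literal
      }
      continue;
    }
    if (c == '"') {
      haveToken = true;
      if (inQuotes && i + 1 < n && cmd[i + 1] == '"') {
        cur.push_back('"'); // consecutive double quotes -> one literal quote
        i += 2;
      } else {
        inQuotes = !inQuotes; // plain delimiter: open or close the region
        ++i;
      }
      continue;
    }
    if (!inQuotes && (c == ' ' || c == '\t')) {
      if (haveToken) {
        args.push_back(cur);
        cur.clear();
        haveToken = false;
      }
      ++i;
      continue;
    }
    cur.push_back(c);
    haveToken = true;
    ++i;
  }
  if (haveToken)
    args.push_back(cur);
  return args;
}
```

Under these assumed rules, the argument string `""""` tokenizes to a single argument consisting of one double quote, and `-DFOO=\""\\\\\\""\\"\"` tokenizes to `-DFOO="\\\"\"`, which lines up with Case 2 and Case 1 as described above.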