Response files where not being correctly read on Windows, this change fixes the issue and adds some tests.
Details
Diff Detail
Event Timeline
llvm/test/tools/llvm-ar/response-utf8.test | ||
---|---|---|
8 | The reliance on en_US.UTF-8 is probably ok but not ideal. Do we need XFAIL: system-darwin (http://45.33.8.238/mac/1350/step_10.txt) and what's the resolution to the reverted D68472 (@thopre)? |
llvm/tools/llvm-ar/llvm-ar.cpp | ||
---|---|---|
1095 | If we do this, we almost certainly want to add an --rsp-quoting flag like clang-cl and lld-link have, for cross builds targeting windows on linux. |
llvm/test/tools/llvm-ar/response-utf8.test | ||
---|---|---|
8 | I haven't had time to test manually on Mac OS X yet as I was on holidays. I'll try to do it in the coming days and if all looks good I'll revert the revert and remove the XFAIL for system-darwin. |
llvm/tools/llvm-ar/llvm-ar.cpp | ||
---|---|---|
1095 | I agree that flag would be useful but that it should be in another diff. I think that this change is useful by itself and could be committed first. |
llvm/tools/llvm-ar/llvm-ar.cpp | ||
---|---|---|
259 | MatchFlagWithArg function defined in ar_main does something very similar to this function, so can you factor it out and then use that function instead of writing a new one? That doesn't only reduce the code size but also allows users use --rsp-quoting <style> in addition to --rsp-quoting=<style>. |
llvm/test/tools/llvm-ar/response-utf8.test | ||
---|---|---|
9 | Could you try using the same approach as in https://github.com/llvm/llvm-project/blob/c3eded068c64bfe614d25359927a2917ff8e4a35/llvm/test/tools/llvm-ar/mri-nonascii.test#L19 ? |
- Pulled out the MatchFlagWithArg lambda expression into static function, and done the same with some code that handles hyphen removal.
- Added test cases to response.test.
- Removed the python call in response-utf8.test. I would prefer the file comparison but it should be sufficient to show the archive is functioning properly.
llvm/test/tools/llvm-ar/response-utf8.test | ||
---|---|---|
9 | @thopre Hi, have you tried fixing the macOS (or Windows) problem? | |
llvm/tools/llvm-ar/llvm-ar.cpp | ||
1089 | Prefer using in new code. It can be used in an alias template and thus improve consistency with other constructs of the language. | |
1092 | I feel that we can get rid of some if (Arg.empty()) checks if matchFlagWithArg itself recognizes one or two dash, then removeHyphens may probably be unnecessary. | |
1116 | Argv is passed as a non-const reference but it is not actually mutated in the function as I expected. Should getRspQuoting remove --rsp-quoting so that on L1204 we don't need if (matchFlagWithArg("rsp-quoting", Arg, ArgIt, Argv.end())... ? |
- Use using instead of typedef
- removeHyphens is now called from matchFlagWithArg
- Removed Argv reference argument from getRspQuoting
- Fix response.test that failed on linux
llvm/tools/llvm-ar/llvm-ar.cpp | ||
---|---|---|
1116 | Good catch, it was left over from a previous implementation. I think with the current implementation it may be best not to modify Argv. The call to matchFlagWithArg is cleaner than removing rsp-quoting arguments that can be in one vector element or in two elements. If there is an implementation you had in mind I'd be happy to make changes. |
Context not available.
https://llvm.org/docs/GettingStarted.html#sending-patches
apt install arcanist (clone libphutil + clone arcanist) cd llvm; arc install-certificate arc diff 'HEAD^' # Make sure the commit message includes `Differential Revision: `. Update the revision with a full context
llvm/test/tools/llvm-ar/response.test | ||
---|---|---|
25 | If \ has to be used, it is recommended to indent the continuation lines to make them stand out. | |
llvm/tools/llvm-ar/llvm-ar.cpp | ||
182 | Why KeyLetters? I think the ar terminology uses operations and modifiers. Collectively they can be called options. | |
1123 | Redundant pair of braces | |
1181 | space after , |
llvm/test/tools/llvm-ar/response-utf8.test | ||
---|---|---|
9 | My bad, forgot about this message. I'm not sure what problem you are referring to. The file I linked to in my previous message passes on all three of Linux, Mac OS X and Microsoft Windows. |
llvm/test/tools/llvm-ar/response-utf8.test | ||
---|---|---|
9 | Sorry, I didn't notice that llvm/test/tools/llvm-ar/mri-nonascii.test has been re-committed. |
llvm/tools/llvm-ar/llvm-ar.cpp | ||
---|---|---|
182 | I wanted to make the use of the variable clearer as the term Options differs in meaning:
I hoped that changing the variable name would lessen confusion but can change it back if required. |
llvm/tools/llvm-ar/llvm-ar.cpp | ||
---|---|---|
86 | clang nor lld support --rsp-quoting=default. If you pass --rsp-quoting=default to clang, it looks like it is accepted, but the truth is that clang ignores an --rsp-quoting if it is passed with an unknown argument (you can try --rsp-quoting=foo). So, for consistency, maybe you shouldn't add that one to llvm-ar? | |
392 | Could you revert this naming change? It makes code diff a bit hard to follow. | |
1089 | I don't think this using improves code readability. | |
1127 | It seems that this code can be improved in terms of readability, like this: for (StringRef Arg : Argv) { if (Arg == "-rsp-quoting=posix" || Arg == "--rsp-quoting=posix") Ret = cl::TokenizeGNUCommandLine; else if (Arg == "-rsp-quoting=windows" || Arg == "--rsp-quoting=windows") Ret = cl::TokenizeWindowsCommandLine; } This is I believe basically what we do in the clang driver. |
llvm/tools/llvm-ar/llvm-ar.cpp | ||
---|---|---|
1127 | llvm-ar handles all the other long arguments with MatchFlagWithArg and the majority of this patch is so rsp-quoting can share this behaviour. Would you prefer to use the code above? It would reduce the size of the patch but would mean behaviour inconsistencies in how long arguments are handled. |
UTF-8 tests on Windows used to cause llvm-ar trouble. I don't know enough of Windows (nor do I have a Windows machine...) Hope other people can review and confirm the response-utf8.test test will not cause trouble.
llvm/test/tools/llvm-ar/response.test | ||
---|---|---|
28 | Prefer single quotes. It does less translation. Shell quoting is difficult. With single quotes, a reader can think less what happens here. |
llvm/test/tools/llvm-ar/response.test | ||
---|---|---|
28 | If you do want a form feed, you need echo -e. You are currently using a bash/zsh extension. |
llvm/test/tools/llvm-ar/response.test | ||
---|---|---|
28 | I wasn't looking for a form feed but have used echo -e to clarify what is output |
The use of python in the test response.test was failing on my Windows machine with Python 3. This patch fixes the issue.
At @gbreynoo's request, I ran this test locally, and it passed fine. It's worth noting that I sit opposite him, so it's possible that my Windows machine has some unusual but identical-to-his setup that causes it to work. I also inspected the two response files used by the utf8 test, and they have been encoded in UTF-8 (the bytes the '£' character are 0xc2a3 as expected). Meanwhile, the file "response-utf8.test.tmp-£.a" appears to be named as such when viewed in the UI or via PowerShell's or ls's directory listing. The only issue I found was when lit printed the commands, the output got corrupted slightly by the '£' character, but that's not really an issue with the test, I think. I think the response-utf8.test looks good to me.
I haven't looked at the rest of the code. @MaskRay, if there is anything else you'd like me to take a look at, please let me know.
The tests are good, and I don't have more questions about Windows. But please mark all addresses issues "done". Click the "Done" button before uploading a Differential. When the Differential is uploaded, all marked comments will be marked "Done".
llvm/tools/llvm-ar/llvm-ar.cpp | ||
---|---|---|
1103 | removeHyphens is only used once. Consider inlining it. I feel a bit strange about removeHyphens. Does it accept an option without a dash? |
llvm/tools/llvm-ar/llvm-ar.cpp | ||
---|---|---|
1103 | I have inlined the function.. |
Thanks Maskray for your comment regarding marking comments as "done", I'll look out for that in future.
llvm/tools/llvm-ar/llvm-ar.cpp | ||
---|---|---|
1103 | Regarding the your question, the removeHyphens function would accept options without a dash and return the same value. |
llvm/test/tools/llvm-ar/response.test | ||
---|---|---|
24 | There is no positive --rsp-quoting= test, so it is hard to tell whether the feature works. | |
llvm/tools/llvm-ar/llvm-ar.cpp | ||
1102 | Unused | |
1121 | Three-dash ---rsp-quoting will be accidentally accepted. | |
1137 | SmallVector -> ArrayRef | |
1165 | I think we may need some refactoring first. Let me check if the parsing logic can be simplified. |
- Removed unused function removeHyphens
- Fixed matchFlagWithArg so 3 hyphens are no longer excepted before an argument
- getRspQuoting now uses ArraryRef instead of SmallVector
llvm/tools/llvm-ar/llvm-ar.cpp | ||
---|---|---|
1165 | By moving to iterators my hope was this area would be easier to read, however it is still not ideal. Any suggestions for further refactoring would be appreciated. |
It looks like there is some inconsistency about what default to use.
clang seems to default to GNU style, even when running on windows: https://github.com/llvm/llvm-project/blob/master/clang/tools/driver/driver.cpp#L386
This can be overridden by --driver-mode=cl or --rsp-quoting=windows, but the default with no arguments seems to be GNU.
The wasm version of lld (which I work on) also defaults to GNU style (https://github.com/llvm/llvm-project/blob/9989b859efccafacb0cc1f8d393d8b9fc49f4037/lld/wasm/Driver.cpp#L168) although the other lld flavors have logic that matches that in this change.
Do wasm-ld and clang need updating too?
lld's default choices are different among platforms as we are trying to match the native linkers' behaviors. So is true to clang; if you invoke clang as clang-cl (which is a Microsoft-compatible compiler driver), the value is set to Windows by default. I think we didn't have any other policy than this, as we are creating drop-in replacements.
As to wasm-ld, I don't think you need to care about compatibility, as this is the first (and perhaps the only?) linker for wasm. So I would stick with GNU-style quoting instead of make it vary depending on platform to keep it simple.
Hi sbc100, thanks for highlighting this inconsistency. I think ideally for the sake of consistency, wasm-ld and clang would be updated too. I wonder if it would be worth the effort to have some kind of shared implementation for this?
Regarding Rui's point, I think there is a balance with giving Windows users the command line behaviour they expect. Is it least surprising to them that there is difference in default behaviour between platforms or that the default behaviour for Windows is not Windows style?
This issue we in emscripten is really consistency. We have a compiler driver that invokes clang, llvm-ar, llvm-nm, and wasm-ld. We have logic that uses response files for long command lines. I think it's reasonable to expect that all the tools should have the same default, no? Otherwise each subprocess callsite needs to encode the rsp style .. per tool and per platform.
Yeah, that's another type of consistency, and if you feel that that consistency is more important, feel free to make a change to wasm-ld so that the tool defaults to runtime OS's rsp-quoting style.
The reliance on en_US.UTF-8 is probably ok but not ideal. Do we need XFAIL: system-darwin (http://45.33.8.238/mac/1350/step_10.txt) and what's the resolution to the reverted D68472 (@thopre)?