When writing a dependency (.d) file, if space or # is immediately
preceded by backslashes, escape the backslashes as well as the space
or # character.
This straddles the fence between BSD Make (which does no escaping at
all, and does not support space or # in filespecs) and GNU Make (which
does support escaping, but will fall back to the filespec as-written
if the escaping doesn't match an existing file).
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
lib/Frontend/DependencyFile.cpp | ||
---|---|---|
298 ↗ | (On Diff #24255) | Why # also? We are trying to be gcc-compatible. See https://gcc.gnu.org/git/?p=gcc.git;a=blob;f=libcpp/mkdeps.c;h=78bbf09a0cc8c3ba0fe974b55e1ededf02c66665;hb=HEAD#l47 |
lib/Frontend/DependencyFile.cpp | ||
---|---|---|
298 ↗ | (On Diff #24255) | No, we're trying to produce something that GNU Make can use. What gcc produces is broken by any definition. $ cat tspace.c |
lib/Frontend/DependencyFile.cpp | ||
---|---|---|
298 ↗ | (On Diff #24255) | Please file a bug with gcc. Also, curiously that .d file seems to work for me on windows under cygwin: Sean@Sean-PC ~ $ cat >tspace.c #include "foo\ bar.h" #include "foo\#bar.h" int i; Sean@Sean-PC ~ $ mkdir foo Sean@Sean-PC ~ $ cd foo Sean@Sean-PC ~/foo $ mkdir ' bar.h' Sean@Sean-PC ~/foo $ mkdir '#bar.h' Sean@Sean-PC ~/foo $ cd .. Sean@Sean-PC ~ $ cat >tspace.d tspace.o: tspace.c foo\\\ bar.h foo\\#bar.h Sean@Sean-PC ~ $ make -f tspace.d cc -c -o tspace.o tspace.c make: cc: Command not found <builtin>: recipe for target 'tspace.o' failed make: *** [tspace.o] Error 127 Also, clang's current output seems to work fine on cygwin. In what cases is clang's output currently breaking? Is it just the case of filenames containing backslashes followed by space/hash on unix? The set of strings that can be correctly escaped in make is inherently limited, so we are inherently compromising on what file names can be properly represented. Does it make sense to extend the set of properly representable filenames to files containing backslashes in their name? E.g. include "foo\ bar.h". On unix we would be instructing make to look for a file 'foo\ bar.h' in the current directory but on windows in my testing above we would be instructing make to look for a file ' bar.h' in directory 'foo'. Any project for which that would actually be correct behavior is insane. I think that realistically, backslashes are only going to end up in includes on windows where they will be intended as directory separators, and will result in compilation errors on other platforms. I believe that clang's current output actually works correctly in that case. We might be overthinking this. What case are we fixing? And does it make sense to fix that case? |
lib/Frontend/DependencyFile.cpp | ||
---|---|---|
298 ↗ | (On Diff #24255) | Cygwin is not doing the right thing for me. Using Clang's depfile: probinson@PROBINSON-T3610 ~ probinson@PROBINSON-T3610 ~ probinson@PROBINSON-T3610 ~ With gcc's depfile: probinson@PROBINSON-T3610 ~/foo probinson@PROBINSON-T3610 ~ But if I add the extra backslash manually, it works correctly. probinson@PROBINSON-T3610 ~ probinson@PROBINSON-T3610 ~ So, what am I fixing? If we're going to do anything special with space and hash, then we should do a special thing that causes them to work properly with a tool that knows how to interpret the special thing. The only tool I have handy that is willing to do a special thing with space and hash is GNU Make (on Linux or Windows). The special thing to do is backslash-quote those characters. But that doesn't work in GNU Make unless you also backslash-quote the preceding backslashes; and that characteristic is the same for quoting either space or hash. Therefore, what clang produces won't work at all, and what gcc produces will work only for space and not for hash. My fix makes Clang's output work for both space and hash. Provably. I freely admit--always have, I think--that this is a rather obscure corner case that just isn't going to happen in practice. But it's provably wrong, and my fix provably corrects it. This is a problem? Alternatively, we can just abandon any special processing for space and hash, and say Don't Do That. But leaving this thing in its current provably wrong state just leaves me professionally offended. |
lib/Frontend/DependencyFile.cpp | ||
---|---|---|
298 ↗ | (On Diff #24255) | And why I want to fix this... aside from the professionally-offended part... :-) We are looking at modifying the PS4 toolchain's builder to move away from double-quotes, to doing things the GNU-Make style way. But I don't want to require our builder to correctly interpret broken depfiles; Clang should be producing correctly formed depfiles, and our builder should be able to depend on depfiles being correctly formed. |
Please file a bug with gcc.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65852
I traded comments on this with somebody yesterday; not actually Confirmed, but he also didn't reject it.
Ping. Sean, I filed a gcc bug as noted previously, which two people have commented on and admitted it looks funny.
Please add a lengthy comment describing exactly how to reproduce the case where GCC's output is "broken" and our deviation from its behavior results in things not being broken. Also, in the test file, please isolate all testing of this behavioral deviation and identify it as such. Other than that, this LGTM.
Does gcc intend to fix this soon? Isn't being compatible with gcc important
than the other things?
If gcc emitted an incorrect relocation, would you argue that it's important to be compatible with gcc? Even if you could not point to any linker that handled that buggy relocation in a reasonable way?
Update the test to isolate the not-like-gcc behavior, and incorporate a Lengthy Comment on the function implementing this behavior as requested by reviewer.
If a program (say, ninja) tries to be compatible with gnu make's depfile parsing, it would previously convert "\ " to a space from what I understand. Now it's going to get "\\\ " and think that that's "\ ".
You're mixing things up. #include "\ " would be converted by gcc to "\\ " (because it escapes the space but not the backslash) which would be de-escaped by GNU Make as "\" followed by a space delimiter.
Now Clang will give it "\\\ " which will be handled as "\ " which is correct.
(Remember that the string you hand to #include is NOT a normal C string; it has no escaping in it.)
--paulr
From: cfe-commits-bounces@cs.uiuc.edu [mailto:cfe-commits-bounces@cs.uiuc.edu] On Behalf Of Nico Weber
Sent: Friday, May 08, 2015 4:36 PM
To: reviews+D9208+public+0ac91376ffbd3e34@reviews.llvm.org
Cc: cfe-commits@cs.uiuc.edu
Subject: Re: [PATCH] Fix dependency file escaping
Paul, can you go ahead and commit the non-controversial part of this patch? It sounds like we're getting really held up on the escaping of # when the other part of this patch is pure win.
Looks like we timed-out simultaneously... if people really insist that the # part is a problem, I can undo that as a post-commit fixup.