Most importantly, we should not replace slashes with backslashes
because that would invalidate the path.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp | ||
---|---|---|
117–125 ↗ | (On Diff #141784) | Shouldn't we instead just use something like LLVM_ON_WIN32 etc? |
llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp | ||
---|---|---|
117–125 ↗ | (On Diff #141784) | I don't think so. Imagine that you have something like a ThinLTO distributed build where bitcode files are copied to a remote machine where code generation happens. In that case the code generator won't necessarily be running the same operating system as the bitcode producer was, and the consumer of the object file will be expecting a path style consistent with the bitcode producer. |
Can you elaborate on the use case? What tool do you expect will consume these Unix style paths? What aspect of the path is invalidated by changing the slash style?
I think the right way to address these distributed cross-compilation use cases is probably going to be to have a flag like "-fdebug-compilation-dir" that the build system supplies with a Windows-style drive letter path.
We can use the paths to provide better linker error messages. See D45467.
What aspect of the path is invalidated by changing the slash style?
It means for example that users can't copy the path out of the linker error message and paste it to their editor.
I think the right way to address these distributed cross-compilation use cases is probably going to be to have a flag like "-fdebug-compilation-dir" that the build system supplies with a Windows-style drive letter path.
Yes this is one example of a case where a "foreign" path ends up being embedded in the directory component of a debug info path and the code generator will need to respect that and not the path convention of the host machine. A similar situation is where the frontend is run on the local machine and the backend runs on the remote machine -- this is the distributed thinlto situation. In that case the frontend will "naturally" embed host-style paths in the directory component of debug info paths and the backend will again need to respect them.
For this use case, is it actually important to preserve symlinks? Would it be enough to make the linker canonicalize the slashes to native format when printing the diagnostic?
llvm/lib/CodeGen/AsmPrinter/CodeViewDebug.cpp | ||
---|---|---|
117–125 ↗ | (On Diff #141784) | Yes, this definitely shouldn't be conditional on macros, it's deep in LLVM codegen. llc should do the same thing with IR everywhere. |
Assuming that we don't care about symlinks, that would work I suppose, but it wouldn't necessarily do the right thing in distributed links, and it would mean that any other tools that consume CodeView on non-Windows (e.g. LLDB remote debugging a Windows binary) would need to do the same thing.
OK, here are the two competing use cases that I am imagining:
- (yours) Full cross-compilation from Unix, with linker diagnostics referring to files that exist locally on Unix.
- Distributed cross-compilation on Unix, local linking on Windows with the VC linker.
I think the correct solution to 2 is going to be to use -fdebug-compilation-dir and -fdebug-prefix-map=/unix/path=C:/windows/path so that these DIFile directory components start with a drive letter that makes sense on the system doing the link.
Basically, if we have a DIFile directory that starts with /, let's trust that the producer wants Unix paths in their CodeView.