GCC 8 implements -fmacro-prefix-map. Like -fdebug-prefix-map, it replaces a string prefix for the __FILE__ macro.
-ffile-prefix-map is the union of -fdebug-prefix-map and -fmacro-prefix-map.
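To illustrate how the three flags relate, here is a minimal sketch, not the patch's code. PrefixMaps, PrefixOption and addPrefixMapping are hypothetical names invented for this example; the only behaviour taken from the description above is that -ffile-prefix-map feeds both the debug remapping and the macro remapping.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical container for the two remapping tables; not Clang's real types.
struct PrefixMaps {
  std::map<std::string, std::string> Debug; // consulted for debug info paths
  std::map<std::string, std::string> Macro; // consulted for __FILE__ expansion
};

// Which command-line flag an OLD=NEW pair came from (hypothetical enum).
enum class PrefixOption { Debug, Macro, File };

// Record one OLD=NEW pair. The File kind populates both tables, which is
// what "union of -fdebug-prefix-map and -fmacro-prefix-map" means here.
void addPrefixMapping(PrefixMaps &Maps, PrefixOption Kind,
                      const std::string &Old, const std::string &New) {
  if (Kind == PrefixOption::Debug || Kind == PrefixOption::File)
    Maps.Debug[Old] = New;
  if (Kind == PrefixOption::Macro || Kind == PrefixOption::File)
    Maps.Macro[Old] = New;
}
```

With this shape, a pair given through the combined flag shows up in both tables, while a pair given through -fmacro-prefix-map leaves the debug table untouched.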
Repository: rC Clang
include/clang/Lex/Preprocessor.h | ||
---|---|---|
155 ↗ | (On Diff #156003) | Scrolling back up, put this implementation as close to the debug implementation as you can. They are so closely related that having them far enough apart that future changes could cause them to diverge is troublesome to me. |
include/clang/Lex/PreprocessorOptions.h | ||
174 | It seems this can be StringRefs as well. | |
lib/Driver/ToolChains/Clang.cpp | ||
616–617 | filtered can take multiple options, you should be able to not add anything here except adding OPT_ffile_prefix_map_EQ to the filtered line, plus a ternary in the Diag line. | |
628 | See advice above. Additionally/alternatively, I wonder if these two functions could be trivially combined. | |
1279 | Is there a good reason for this to not live with the call to addDebugPrefixMapArg? | |
lib/Frontend/CompilerInvocation.cpp | ||
3094 | Again, this is so much like the debug-prefix option, it should probably just be right next to it. Additionally, we don't use curly braces on single-line bodies. | |
lib/Lex/PPMacroExpansion.cpp | ||
1462 | make MacroPrefixMap const. | |
1533 | This change shouldn't be necessary, SmallString is still likely the right answer here. | |
lib/Lex/Preprocessor.cpp | ||
160 ↗ | (On Diff #156003) | I'm unconvinced that this is necessary. ExpandBuiltinMacro is in Preprocessor, so it has access to PPOpts directly. |
lib/Driver/ToolChains/Clang.cpp | ||
---|---|---|
616 | I find it confusing that -ffile-prefix-map implies -fdebug-prefix-map. I'm not sure that is desirable in every case. It seems better to have a combined option that explicitly does both. | |
619 | I'd prefer the bailout style here, i.e. if (...) { D.diag(...); continue; } | |
628 | Or at least the for loop could be refactored into a small helper function that takes the option name, output option and error as argument. | |
lib/Lex/PPMacroExpansion.cpp | ||
1465 | This doesn't handle directory vs string prefix correctly, does it? At the very least, it should have a test case :) |
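The bailout style and the "small helper function that takes the option name, output option and error as argument" suggested above could look roughly like the sketch below. It is only an illustration: FakeArg and forwardPrefixMapArgs are hypothetical stand-ins, not clang's Arg class or the driver's real helper, and diagnostics are collected as plain strings instead of going through the diagnostics engine.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a parsed driver argument; not clang's Arg class.
struct FakeArg {
  std::string Spelling; // option as the user wrote it, e.g. "-ffile-prefix-map"
  std::string Value;    // text after '=', e.g. "/src=/x"
  bool Claimed = false; // mimics marking the argument as used
};

// Forward every well-formed OLD=NEW value as OutputOption + Value. Values
// without '=' produce a diagnostic that names the option actually spelled.
void forwardPrefixMapArgs(std::vector<FakeArg> &Args,
                          const std::string &OutputOption,
                          std::vector<std::string> &CmdArgs,
                          std::vector<std::string> &Diags) {
  for (FakeArg &A : Args) {
    A.Claimed = true; // claim first, so the bailout below cannot skip it
    if (A.Value.find('=') == std::string::npos) {
      Diags.push_back("invalid argument '" + A.Value + "' to " + A.Spelling);
      continue; // bailout style: no else branch needed
    }
    CmdArgs.push_back(OutputOption + A.Value);
  }
}
```

Taking the diagnostic's option name from the argument itself, rather than hard-coding it, is one way to keep a malformed -ffile-prefix-map value from being reported against a different flag.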
lib/Driver/ToolChains/Clang.cpp | ||
---|---|---|
616 | -ffile-prefix-map is the combined option. -fmacro-prefix-map is the preprocessor option, and -fdebug-prefix-map is the codegen option. | |
628 | Good ideas. I'll look into them. | |
1279 | Mostly because this is the function that adds preprocessor specific options. There's no other reason why it couldn't be done alongside addDebugPrefixMapArg in this file. | |
lib/Frontend/CompilerInvocation.cpp | ||
3094 | This is here because it's a preprocessor option. This function handles those. The DebugPrefixMap handling is a codegen option. | |
lib/Lex/PPMacroExpansion.cpp | ||
1462 | I shall. | |
1465 | Good catch. I expect not. But on the other hand, it's exactly what debug-prefix-map does :) I'll add test cases in a future review. My first goal was getting something sort-of working. | |
1533 | I tried that long ago. It didn't work, I don't remember exactly why. But I agree that SmallString should be enough. I'll dig more. | |
lib/Lex/Preprocessor.cpp | ||
160 ↗ | (On Diff #156003) | It has access to PPOpts, but the implementation file doesn't have a full definition of PreprocessorOptions. I could add that to this file, then this becomes redundant. |
include/clang/Basic/DiagnosticDriverKinds.td | ||
---|---|---|
118 | Since these are otherwise identical, perhaps a %select{...|...} for the flag name? | |
include/clang/Lex/PreprocessorOptions.h | ||
174 | Did you miss this one? Or is there a good reason these cannot be stringrefs? | |
lib/Driver/ToolChains/Clang.cpp | ||
621 | With the continue above, 'else' is unnecessary/against coding standard. | |
636 | See above. | |
lib/Lex/PPMacroExpansion.cpp | ||
1461 | Did clang-format do this formatting? It looks REALLY weird... | |
1533 | Just noting to handle this before approval. | |
1536 | First, comments end in a period. Second, isn't that what the next line does? |
include/clang/Lex/PreprocessorOptions.h | ||
---|---|---|
174 | I didn't miss it. StringRefs here don't survive. The function that adds them to the map creates temporary strings, that go away once that function ends causing StringRefs to dangle. std::string keeps copies. | |
lib/Driver/ToolChains/Clang.cpp | ||
621 | Next diff will have that. | |
lib/Lex/PPMacroExpansion.cpp | ||
1461 | No, that's my text editor. I'll fix it. | |
1465 | There should be a test, but apparently the debug prefix map code also does this. What do you think the correct behaviour should be? a string prefix, or a directory prefix? | |
1533 | Yup, with some changes to remapMacroPath SmallString works fine. | |
1536 | Yes, old comment is old ;) |
include/clang/Lex/PreprocessorOptions.h | ||
---|---|---|
174 | Oh! I hadn't realized that getAllArgValues gives a vector<string>. That is actually pretty odd for our codebase. Looking into it, there is no reason that function cannot return a vector of StringRef... Alright, at one point someone should likely fix that, but that person should change this type. |
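The lifetime problem described in this exchange can be sketched with standard-library types. This is an analogy under stated assumptions: getValues is a hypothetical stand-in for an API that returns owned strings by value (as the thread says getAllArgValues does), and std::string plays the role of the owning type chosen for the map.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical stand-in for an API that returns its strings by value.
std::vector<std::string> getValues() { return {"/src=/x", "/obj=/y"}; }

// Build a map whose keys and values own their text. Each entry copies the
// characters it needs, so the map stays valid after the temporary vector
// returned by getValues() is destroyed at the end of the loop.
// A map of non-owning views (llvm::StringRef, std::string_view) filled the
// same way would be left pointing into that destroyed vector.
std::map<std::string, std::string> buildOwningMap() {
  std::map<std::string, std::string> Map;
  for (const std::string &V : getValues()) {
    std::string::size_type Eq = V.find('=');
    if (Eq == std::string::npos)
      continue; // malformed value, skipped in this sketch
    Map[V.substr(0, Eq)] = V.substr(Eq + 1);
  }
  return Map;
}
```

The cost is one copy per entry, which is small for a handful of command-line options.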
FYI, gcc uses the stupid and bad string prefix matching approach. If clang supports the same option, it should have the same behavior. You could decide that it's a bad idea, but then the option should be called something else, otherwise people will have to go back to testing compiler names and versions again, which is :(.
The functionality looks correct to me, but could you include some tests in test/Driver/ and test/Preprocessor/ just to be sure?
test/Driver/debug-prefix-map.c and test/CodeGen/debug-prefix-map.c could serve as inspiration.
The documentation should probably be updated too: docs/ClangCommandLineReference.rst
(It would be nice to have this feature for Reproducible Builds)
lib/Lex/PPMacroExpansion.cpp | ||
---|---|---|
1465 | It should be a string prefix (like GCC) |
lib/Lex/PPMacroExpansion.cpp | ||
---|---|---|
1465 | I disagree. I consider it a bug in GCC that it is a string prefix. It's quite inconsistent as well. |
lib/Lex/PPMacroExpansion.cpp | ||
---|---|---|
1465 | I agree with you, it should have been a directory prefix, but GCC implements it as a string prefix although GCC documents it as: If you decide to fix -fmacro-prefix-map to use a directory prefix match, then the -fdebug-prefix-map should also be fixed for consistency. What about implementing the (buggy) GCC-compatible behavior first and then fixing both cases in a future patch? (I don't mind when the buggy behavior is fixed, I just want to see this functionality moving forward.) Another edge case that I ran into is when using the option to drop directories. When using -ffile-prefix-map=/src=, the command cd /src && cc /src/foo.c would have __FILE__ equal to /foo.c. As a naive "fix", one would try -ffile-prefix-map=/src/= which indeed produces __FILE__ equal to foo.c. Matching with a trailing slash, however, fails to correctly remap some debug information, namely DW_AT_comp_dir. This contains the working directory (/src), which is not matched by /src/. By using a proper directory prefix match, this would be nicely fixed. |
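The /src versus /src/ edge case above can be reproduced with a few lines of plain string-prefix replacement. This is a sketch of the GCC-compatible behaviour as described in the thread, not code from the patch; remapStringPrefix is a hypothetical name.

```cpp
#include <cassert>
#include <string>

// Plain string-prefix replacement: if Path starts with Old, swap that prefix
// for New. There is no check that the match ends at a directory boundary.
std::string remapStringPrefix(const std::string &Path, const std::string &Old,
                              const std::string &New) {
  if (Path.compare(0, Old.size(), Old) == 0)
    return New + Path.substr(Old.size());
  return Path; // no match: leave the path untouched
}
```

Under this rule, mapping /src to the empty string turns /src/foo.c into /foo.c, and mapping /src/ to the empty string turns it into foo.c but leaves a bare /src (the working directory) unchanged. It also rewrites unrelated paths such as /srcdir/a.c, which is the inconsistency the directory-prefix proposal is meant to remove.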
PostgreSQL 11 is now using LLVM to do JITing of SQL expressions. We'd need this feature to strip the build directory off the .bc bitcode files so the .deb packages build reproducibly.
@dankm: Are you still working on this? What can we do to help move this forward?
lib/Lex/PPMacroExpansion.cpp | ||
---|---|---|
1465 | Yes, I'm going to submit my code with tests, and hoist the prefix remapping (for debug-prefix-map and macro-prefix-map) into a common location. Most probably part of Path. |
Added unit tests for the prefix remapping.
Switched the sorting on the prefix map, so that <somepath>/sub gets remapped before <somepath> if both are specified.
I intend to do a more invasive change after this review to unify path prefix remapping.
FYI, according to my comment on D49652, assuming I checked it correctly, gcc applies the maps in reverse order of command line specification, not sorted order. It seems unlikely that anyone is actually depending on the order though.
Yeah, I noticed that, but it appears to be undefined by GCC's documentation. I agree with review D49652, but I also want to get this in before 8.0 branches, even if it's not ideal.
Right now we apply them in strict alphabetical order, switching to reverse order lets one map /objdir/<sysroot> to / while remapping /objdir/ to /some/other/dir, which is my use case.
That's the other reason why I find the GCC specification as string prefix confusing. I still say we should just go with mapping of path names and then the order question mostly goes away.
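The ordering question can be made concrete with a small sketch. It assumes that entries are tried in descending key order and that only the first match is applied; both are assumptions made for illustration, and PrefixMap and remapFirstMatch are hypothetical names rather than the patch's own.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <string>

// Keys sorted in descending order, so "/objdir/sysroot/" is visited before
// "/objdir/" (a longer string with the same leading characters sorts higher).
using PrefixMap = std::map<std::string, std::string, std::greater<std::string>>;

// Apply only the first entry whose key is a string prefix of Path.
std::string remapFirstMatch(const PrefixMap &Map, const std::string &Path) {
  for (const auto &Entry : Map)
    if (Path.compare(0, Entry.first.size(), Entry.first) == 0)
      return Entry.second + Path.substr(Entry.first.size());
  return Path;
}
```

With /objdir/sysroot/ mapped to / and /objdir/ mapped to /some/other/dir/, a header under the sysroot is rewritten by the more specific rule. In ascending order the shorter /objdir/ key would match first and the sysroot rule would never fire.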
It would be nice to have this for Clang 8.0, the branch date is within 5 days :)
lib/Driver/ToolChains/Clang.cpp | ||
---|---|---|
620 | For clang -ffile-prefix-map=foo, wouldn't this report invalid argument 'foo' to -fdebug-prefix-map? If so, perhaps some method of A or A->getOption() can be used? | |
633 | Same concern here about -ffile-prefix-map=foo showing an error message about -fmacro-prefix-map. | |
test/Preprocessor/file_test.c | ||
6 | Any reason to keep this comment? |
Could you add more tests to check the error message for bad options (missing =):
-fdebug-prefix-map=bad -fmacro-prefix-map=bad -ffile-prefix-map=bad
FWIW, GCC emits two errors for -ffile-prefix-map=bad.
Another edge case is -ffile-prefix-map==foo/, GCC currently uses this to prepend foo/ to every path. Not sure if that is intentional, but that is the current behavior (one which is also replicated by this patch I believe).
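Both edge cases raised here (a value with no = at all, and a value that starts with =) come down to how the OLD=NEW text is split. The sketch below assumes the first = is the separator, which is an assumption for illustration and not a statement about either compiler's parser; splitPrefixMapValue is a hypothetical helper.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Split "OLD=NEW" at the first '='. Returns std::nullopt when the value has
// no '=' at all, which is the "bad" case that should produce a diagnostic.
std::optional<std::pair<std::string, std::string>>
splitPrefixMapValue(const std::string &Value) {
  std::string::size_type Eq = Value.find('=');
  if (Eq == std::string::npos)
    return std::nullopt;
  return std::make_pair(Value.substr(0, Eq), Value.substr(Eq + 1));
}
```

For -ffile-prefix-map==foo/ the value is =foo/, so OLD is empty and NEW is foo/. An empty OLD is a string prefix of every path, which would explain why foo/ ends up prepended everywhere.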
Could you also mark review comments that are completed as "done"? It should make the diff easier to read (I hope) :)
include/clang/Basic/DiagnosticDriverKinds.td | ||
---|---|---|
118–119 | Maybe rename _to_prefix_map to _to_option? (And maybe swap the order of parameters so %0 comes before %1?) |
renamed err_drv_invalid_argument_to_prefix_map to err_drv_invalid_argument_to_option
added more frontend tests for macro-prefix-map and file-prefix-map.
Some more got added with the latest diff
> FWIW, GCC emits two errors for -ffile-prefix-map=bad.
Yes, this does too. It looked odd to me, but it's not a huge deal.
> Another edge case is -ffile-prefix-map==foo/, GCC currently uses this to prepend foo/ to every path. Not sure if that is intentional, but that is the current behavior (one which is also replicated by this patch I believe).
Yes, with this patch it does that for file-prefix-map and macro-prefix-map. It already did that (sort of) for debug-prefix-map, although it seems to add the prefix twice for some debugging information. I'll fix that later, since it has behaved that way since at least version 5.0.
> Could you also mark review comments that are completed as "done"? It should make the diff easier to read (I hope) :)
Yes, I tried to do that with this comment. I'm new to phabricator.
Except one thing, it looks reasonable to me. I'll try to run some tests and report back tomorrow.
(Not very familiar with Phabricator either. I still see some comments, hopefully the "Collapse" function does something useful here.)
test/Driver/prefix-map.S | ||
---|---|---|
7 ↗ | (On Diff #181363) | Maybe restore the old file name (debug-prefix-map.S) since this still tests the debug prefix functionality? And otherwise this comment needs to be updated. |
Tests pass here, using it on a large CMake project with a CMAKE_BUILD_TYPE=Debug and c/cxxflags -ffile-prefix-map=$builddir= -ffile-prefix-map=$srcdir/= -fuse-ld=lld successfully strips all traces of $builddir and $srcdir.
If you could take care of the previous comment (undo the rename or rename debug-prefix-map.c), then I've no further comments.
If @joerg or someone else could give the final review/pass, that would be great :)
lib/Driver/ToolChains/Clang.cpp | ||
---|---|---|
619 | Wouldn't using if (...) { D.diag(...); continue; } also skip the A->claim() call? Presumably that could result in spurious errors as well about unused arguments? |
Still fine by me, thanks!
As for the commit message, perhaps reference:
https://bugs.llvm.org/show_bug.cgi?id=38135
As discussed with dankm on IRC, I still would like to see the correct behavior going into 8.0, i.e. not change it later. Since this also matters for potential faster implementations later, it seems like a good idea to do it now. The changes are well-localized.
(1) Do path prefix matching and not string prefix matching. The difference is that the file name must be longer than the prefix and the prefix must be followed by a path separator.
(2) The longest prefix match wins. Substitution is applied only once per file name, independent of the rules. This gives more predictable output and allows switching to a tree lookup later.
Changes still look reasonable, but the preceding patch (https://reviews.llvm.org/D56769) needs some work.
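Points (1) and (2) can be sketched as follows. This is a minimal illustration, assuming prefixes are written without a trailing separator and that / is the only separator; isDirectoryPrefix and remapLongestPrefix are hypothetical names, not functions from the patch or from llvm::sys::path.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

// (1) Path-prefix match: Path must be longer than Prefix, must start with it,
// and the character right after the prefix must be a path separator.
bool isDirectoryPrefix(const std::string &Path, const std::string &Prefix) {
  if (Path.size() <= Prefix.size() ||
      Path.compare(0, Prefix.size(), Prefix) != 0)
    return false;
  return Path[Prefix.size()] == '/';
}

// (2) The longest matching prefix wins, and it is substituted exactly once.
std::string remapLongestPrefix(const std::map<std::string, std::string> &Map,
                               const std::string &Path) {
  const std::pair<const std::string, std::string> *Best = nullptr;
  for (const auto &Entry : Map)
    if (isDirectoryPrefix(Path, Entry.first) &&
        (!Best || Entry.first.size() > Best->first.size()))
      Best = &Entry;
  if (!Best)
    return Path; // no rule applies
  return Best->second + Path.substr(Best->first.size());
}
```

Because the winner is chosen by length rather than by iteration order, the result no longer depends on how the rules were sorted or in which order they appeared on the command line.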
lib/CodeGen/CGDebugInfo.cpp | ||
---|---|---|
607 | Any reason for dropping remapDIPath here? Wouldn't this result in the full path being included even when using: clang -fdebug-prefix-map=/full/path/= /full/path/source.c | |
lib/Lex/PPMacroExpansion.cpp | ||
1466 | Style: space between if and ( |
I'll update the style nit, and spend some non-tired time on the string remapping. Thanks
lib/CodeGen/CGDebugInfo.cpp | ||
---|---|---|
607 | Whoops. That probably shouldn't have been included this round. This is a bugfix. MainFileName is already remapped from earlier in this function, this keeps it from remapping twice if you have an empty old prefix. |
lib/CodeGen/CGDebugInfo.cpp | ||
---|---|---|
607 | The remapping was done here: std::string MainFileDir; if (const FileEntry *MainFile = SM.getFileEntryForID(SM.getMainFileID())) { MainFileDir = remapDIPath(MainFile->getDir()->getName()); (Observation: the declaration could probably be moved inside the if block since it is not used outside.) What about the second case though? For example, assume /tmp/testdir/mytest.ii: # 1 "/tmp/mytest.c" # 1 "<built-in>" # 1 "<command-line>" # 31 "<command-line>" # 1 "/usr/include/stdc-predef.h" 1 3 4 # 32 "<command-line>" 2 # 1 "/tmp/mytest.c" int main(int argc, const char *argv[]) { return 0; } What happens if you now compile with clang -fdebug-prefix-map=/tmp/=/bla/ /tmp/testdir/mytest.ii from /tmp/testdir? Unless this affects the current patch, consider moving it to a separate change. |
lib/CodeGen/CGDebugInfo.cpp | ||
---|---|---|
476–478 | Looking at llvm/lib/Support/Path.cpp, replace_path_prefix() returns void, but here, inside the if (), a bool return value is expected. |
lib/CodeGen/CGDebugInfo.cpp | ||
---|---|---|
476–478 | Never mind, I guess I needed to look into https://reviews.llvm.org/D56769 as well. |
Hi @dankm, any progress on this feature? The proposed branch off date for Clang 9.0.0 is 18 July 2019: https://lists.llvm.org/pipermail/cfe-dev/2019-June/062628.html
Latest changes. I've been sitting on these for months, so I don't remember all that changed. The path remapping contract changed somewhat, and it's now based on the git monorepo.
Thanks for picking this up again. I've left some nitpicks below in a quick review.
The "strict" parameter is not precisely defined, if that is fixed I think this would be ready for merge.
clang/test/Driver/debug-prefix-map.c | ||
---|---|---|
8 ↗ | (On Diff #212723) | What about combining these two tests? The command is the same, maybe you could have a new -check-prefix to reduce the number of invocations? Likewise for the cases below. |
llvm/include/llvm/Support/Path.h | ||
172 ↗ | (On Diff #212723) | "strict checking" is ambiguous on its own. What about something like: If strict is true, a directory separator following \a OldPrefix will also be stripped. Otherwise, directory separators will only be matched and stripped when present in \a OldPrefix. Or whatever semantics you would like to assign to "strict mode". |
181 ↗ | (On Diff #212723) | Why have a variant with the parameters swapped, is it common in LLVM to have such convenience wrappers? Why not require callers to pass Style::native whenever they want to modify "strict"? |
llvm/lib/Support/Path.cpp | ||
512 ↗ | (On Diff #212723) | this condition is duplicated above |
@dankm do you still plan to work on this? We would really like to see this landed and we could help if needed.
clang/test/Driver/debug-prefix-map.c | ||
---|---|---|
8 ↗ | (On Diff #212723) | The tests will need more thinking, you're probably right, but I don't have much time to work on this at the moment. How do you feel about addressing this in the future? |
At this point sure. Unless it's accepted as-is now, then I don't have a commit bit to finish it off.
The tests need fixing... I can commit it. Now that we've migrated to the llvm monorepo, the git commit message can retain the author info properly...
Add back remapDIPath that was unintentionally deleted by D69213, caught by a test.
Small adjustment of the code
There's still one failing test on Windows after the fix attempt: http://45.33.8.238/win/3052/step_6.txt
Please take a look and revert if it's not an easy fix. (And please watch bots after committing stuff.)
XFAIL'ed clang/test/Preprocessor/file_test.c. I didn't receive an email from http://45.33.8.238/win/3052/step_6.txt, and I'm not sure if that was because the author email was @dankm's, not mine. If there is a console, please tell me :)
Does this work on Windows?
--- i/clang/test/Preprocessor/file_test.c
+++ w/clang/test/Preprocessor/file_test.c
@@ -1,8 +1,7 @@
-// XFAIL: system-windows
 // RUN: %clang -E -ffile-prefix-map=%p=/UNLIKELY_PATH/empty -c -o - %s | FileCheck %s
 // RUN: %clang -E -fmacro-prefix-map=%p=/UNLIKELY_PATH/empty -c -o - %s | FileCheck %s
 // RUN: %clang -E -fmacro-prefix-map=%p=/UNLIKELY_PATH=empty -c -o - %s | FileCheck %s -check-prefix CHECK-EVIL
-// RUN: %clang -E -fmacro-prefix-map=%p/= -c -o - %s | FileCheck %s --check-prefix CHECK-REMOVE
+// RUN: %clang -E -fmacro-prefix-map=%/p/= -c -o - %/s | FileCheck %s --check-prefix CHECK-REMOVE
startswith is not ideal because /t will match /tmp. However, gcc appears to use something similar to startswith, see:
% gcc -c -g -fdebug-prefix-map=/t=x /tmp/c/a.c -o /tmp/c/a.o
% llvm-dwarfdump /tmp/c/a.o | grep -m 1 DW_AT_name
DW_AT_name ("xmp/c/a.c")
The ugly path separator pattern {{(/|\\\\)}} appears in 60+ tests. Can we teach clang and other tools to
- accept both / and \ input
- but only output /
on Windows? We can probably remove llvm::sys::path::Style::{windows,posix,native} from include/llvm/Support/Path.h and only keep the posix form.
In general they do, AFAIK, although it's not feasible in cases where / is the character that introduces an option, which is common on standard Windows utilities.
> - but only output /
> on Windows?
This is often actually incorrect for Windows.
> We can probably remove llvm::sys::path::Style::{windows,posix,native} from include/llvm/Support/Path.h and only keep the posix form.
It's highly unlikely that will be correct for all cases, and certainly will not match users' expectations. Debug info, for example, will want normal Windows syntax file paths with \.