This is an archive of the discontinued LLVM Phabricator instance.

-fdebug-prefix-map=: make the last win when multiple prefixes match
ClosedPublic

Authored by MaskRay on Apr 21 2023, 5:47 PM.

Details

Summary

For
clang -c -g -fdebug-prefix-map=a/b=y -fdebug-prefix-map=a=x a/b/c.c,
we apply the longest prefix substitution, but
GCC has always been picking the last applicable option (a=x, see
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109591).

I feel that GCC's behavior is reasonable given the convention that the last
value wins for the same option.

Before D49466, Clang appeared to apply the shortest prefix substitution,
which likely made the least sense.

Diff Detail

Event Timeline

MaskRay created this revision.Apr 21 2023, 5:47 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 21 2023, 5:47 PM
MaskRay requested review of this revision.Apr 21 2023, 5:47 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald Transcript
joerg added a comment.Apr 21 2023, 5:55 PM

For me long matching prefix makes more sense, but if the same prefix is used multiple times, the last option should win.

scott.linder added inline comments.
clang/include/clang/Basic/CodeGenOptions.h
209

What benefit does forcing allocation have? Why not use the default, or switch to std::vector?

clang/lib/CodeGen/CGDebugInfo.cpp
72–77

Can you use the member-initializer-list here?

MaskRay marked 2 inline comments as done.Apr 24 2023, 3:17 PM
MaskRay added inline comments.
clang/include/clang/Basic/CodeGenOptions.h
209

This doesn't force allocation. I use inline element size 0 to make the member variable smaller than a std::vector.
std::vector likely compiles to larger code.

clang/lib/CodeGen/CGDebugInfo.cpp
72–77

No, this doesn't work. We convert a vector of std::string to a vector of StringRef.

scott.linder added inline comments.Apr 25 2023, 10:58 AM
clang/include/clang/Basic/CodeGenOptions.h
209

Sorry, I just didn't realize this was idiomatic (i.e. I hadn't read https://llvm.org/docs/ProgrammersManual.html#vector), and I didn't see other similar uses in the file.

There are 15 std::vector members of CodeGenOptions, but no SmallVector<_, 0> members; maybe the rest can be converted in an NFC patch, so things are consistent?

clang/lib/CodeGen/CGDebugInfo.cpp
72–77

If all we are doing is creating another vector which shares the underlying strings with the original, why not just save a reference to the original vector? Is the cost of the extra dereference when accessing it greater than the cost of traversing it plus the extra storage for the StringRefs?

It seems like the original utility was just to get the std::map sorting behavior, which we no longer need.

MaskRay updated this revision to Diff 516863.Apr 25 2023, 12:29 PM
MaskRay marked 3 inline comments as done.

remove a DebugPrefixMap variable

MaskRay marked an inline comment as done.Apr 25 2023, 12:32 PM
MaskRay added inline comments.
clang/include/clang/Basic/CodeGenOptions.h
209

It seems that SmallVector is used more frequently but std::vector is used as well. I'd prefer that we don't change the types for existing variables...

% rg SmallVector lib -l | wc -l => 439
% rg std::vector lib -l | wc -l => 255
clang/lib/CodeGen/CGDebugInfo.cpp
72–77

Thanks for the giving this more attention. Actually I found that the variable can be removed. Removed:)

scott.linder accepted this revision.Apr 25 2023, 2:04 PM

LGTM, thank you!

Does this warrant a release note, as it is changing the behavior in a backwards-incompatible manner? I do think changing to match GCC is worthwhile, even if it means a change in behavior for Clang.

I also don't think the "longest first" heuristic is useful enough to outweigh the benefits of behaving like GCC; it seems like the user can always sort their own options to get the same effect.

This revision is now accepted and ready to land.Apr 25 2023, 2:04 PM
MaskRay updated this revision to Diff 516924.Apr 25 2023, 2:36 PM
MaskRay marked an inline comment as done.

Update HelpText

LGTM, thank you!

Does this warrant a release note, as it is changing the behavior in a backwards-incompatible manner? I do think changing to match GCC is worthwhile, even if it means a change in behavior for Clang.

I also don't think the "longest first" heuristic is useful enough to outweigh the benefits of behaving like GCC; it seems like the user can always sort their own options to get the same effect.

Thank you! I feel that specifying multiple -fdebug-prefix-map= is a very uncommon situation, so a release note entry is likely overkill.
I updated the HelpText, though.

For me long matching prefix makes more sense, but if the same prefix is used multiple times, the last option should win.

"if the same prefix is used multiple times, the last option should win." I think this interpretation will add some complexity, e.g. whether we prefer -fdebug-prefix-map=a/=b or -fdebug-prefix-map=a=c. GCC's current rule (simple) seems quite reasonable.

From the discussion on https://gcc.gnu.org/bugzilla/show_bug.cgi?id=109591 , GCC will likely remain the current behavior and will improve the documentation.

This revision was landed with ongoing or failed builds.Apr 25 2023, 3:12 PM
This revision was automatically updated to reflect the committed changes.
dyung added a subscriber: dyung.Apr 25 2023, 6:21 PM

@MaskRay , a test you modified, debug-prefix-map.s is failing on the PS5 Windows bot. Can you take a look?

https://lab.llvm.org/buildbot/#/builders/216/builds/20460

dyung added inline comments.Apr 25 2023, 6:22 PM
llvm/test/MC/ELF/debug-prefix-map.s
54

This might need another regex for the path separator between src_root and bar.
https://lab.llvm.org/buildbot/#/builders/216/builds/20460/steps/7/logs/FAIL__LLVM__debug-prefix-map_s

<stdin>:11:2: note: possible intended match here
 DW_AT_comp_dir [DW_FORM_string] ("/src_root\\bar")
 ^
dankm added a comment.Apr 27 2023, 2:48 PM

Thank you! I feel that specifying multiple -fdebug-prefix-map= is a very uncommon situation, so a release note entry is likely overkill.
I updated the HelpText, though.

The Yocto project and my WIP changes against FreeBSD both specify multiple -fdebug-prefix-map= options, but neither depend on the longest-match first semantics. They did at one point, but GCC behaved differently.

So, I suppose this LGTM.