This is an archive of the discontinued LLVM Phabricator instance.

[lld][COFF] Add support for /map
ClosedPublic

Authored by saudi on Nov 21 2019, 12:14 PM.

Details

Summary

We use an external (commercial) software package to modify the generated executable as a post-process.
That software requires a map file with the same format as generated by link.exe /MAP.

This implementation is based on observation from the output of link.exe's map files.
We didn't need support for the /MAPINFO option.

Enhancements from the lldmap counterpart:

  • Removed duplicates of symbols (that is duplicate Symbol* values)
  • Added output for thunk symbols for dllimport stubs

I've also modified the default file name for /lldmap to use the .lldmap extension (instead of .map). This is to avoid conflict if both /map and /lldmap are used.

Diff Detail

Event Timeline

saudi created this revision.Nov 21 2019, 12:14 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 21 2019, 12:14 PM
ruiu added a comment.Nov 21 2019, 8:30 PM

Thank you for the patch. Overall, I think it is a good thing to add link.exe-compatible /map file output to lld.

lld/COFF/CMakeLists.txt
19–20

Except a few features, everything in lld/COFF is Microsoft-compatible, so adding a MS suffix feels a bit odd to me. You probably should rename existing MapFile.cpp to LLDMapFile.cpp and then place your file as MapFile.cpp.

lld/COFF/Config.h
188

Ditto -- rename this to lldMapFile and name yours mapFile

lld/COFF/Driver.cpp
720

We should avoid changing the suffix because it breaks backward compatibility. Instead, you may want to add a warning if /map and /lldmap has the same filename as an output file.

lld/COFF/MapFileMS.cpp
9 ↗(On Diff #230503)

It is a bit odd that we intend to print out in a format that is not exactly the same as Microsoft's. What is the difference? Can't we output the exact same output?

saudi updated this revision to Diff 231567.Nov 29 2019, 10:49 AM

I changed the code quite a bit.

  • Better match the map file format (added "entry point" line, separate list for static symbols, made header addresses relative to the section instead of absolute)
  • Optimized (parallel sort of symbols etc...), added timers for feedback with /time option
  • Applied fixes previously asked by Rui Ueyama (renaming mapFileMS to mapFile, ...)
saudi marked 5 inline comments as done.Nov 29 2019, 11:10 AM
saudi added inline comments.
lld/COFF/MapFileMS.cpp
9 ↗(On Diff #230503)

I used the same format observed from MS generated map files.
The remaining differences are the "f" and "i" flags shown in some of the symbol lines from link.exe, I couldn't figure out what they mean.
I fixed the comment, as it is not supposed to be just "close to" the link format.

saudi marked an inline comment as done.Dec 12 2019, 9:04 AM

Ping! :)

rnk added a comment.Jan 15 2020, 12:00 PM

A couple of nits, but I think it looks pretty good, thanks!

lld/COFF/MapFile.cpp
62

Is v.erase(end, v.end()) not more idiomatic?

lld/test/COFF/map.test
28

Please add some absolute symbols with "negative" RVAs to the inputs, since those are an interesting edge case in the sorting.

saudi updated this revision to Diff 248294.Mar 4 2020, 1:12 PM

For this update, I followed Reid's suggestions:

  • replaced std::resize() -> std::erase()
  • Modified the test to include symbol of type DefinedAbsolute

I also added a missing null-check

saudi marked 2 inline comments as done.Mar 4 2020, 1:13 PM
ruiu added a comment.Mar 5 2020, 10:27 PM

Overall looking good.

lld/COFF/Driver.cpp
1571–1572

nit: You can omit !config->mapFile.empty() && because if A==B and A=="", B must not be empty.

another nit: we usually do A != "" instead of !A.empty if A is StringRef.

1575

config->lldmapFile == ""

lld/COFF/LLDMapFile.cpp
128

nit: please make sure that a source file ends with \n instead of some other byte.

lld/COFF/MapFile.cpp
68–69

If you are adding config->imageBase to both A and B before comparing A < B, you don't have to add that at all, do you?

239

TDS -> tds, but I guess you can remove this local variable and use config->timestamp directly.

240

FormattedTime -> formattedTime

241

strftime's %c is locale-aware, so the exact output string of the function is not predictable. It may be in your local language (e.g. 2020年3月15日) depending on the language settings, and even for US English, I believe it's different depending on libc. That's not suitable for machine processing.

What is the output of MSVC link in the US English setting? We should always format a timestamp in that format whatever the user's language setting is.

saudi updated this revision to Diff 248777.Mar 6 2020, 10:26 AM
saudi marked an inline comment as done.

Implemented the fixes suggested by Rui Ueyama.

Notes on the timestamp:
I replaced the strftime call with a local implementation that reproduces what link.exe outputs.
Assuming link.exe was built using MSVC, I suspect that its implementation uses strftime("%c"), as it matches the format from Visual C++'s CRT library using the "C" locale.
I couldn't find a way on my machine to have link.exe output differently : my machine is configured as French, but link.exe's /map outputs english message (e.g. "Fri Mar 6 11:50:39 2020").

saudi updated this revision to Diff 248791.Mar 6 2020, 10:33 AM
saudi marked an inline comment as done.

Updated diff for one missing fix

saudi marked 5 inline comments as done.Mar 6 2020, 10:35 AM
saudi added inline comments.
lld/COFF/MapFile.cpp
68–69

This was to support underflowing that happens with DefinedAbsolute symbols (RVA = VA - imageBase).

saudi updated this revision to Diff 248792.Mar 6 2020, 10:41 AM
saudi marked an inline comment as done.Mar 9 2020, 7:12 AM
ruiu accepted this revision.Mar 9 2020, 11:12 PM

LGTM

This revision is now accepted and ready to land.Mar 9 2020, 11:12 PM
amccarth accepted this revision.Mar 10 2020, 8:35 AM

Sorry, somehow I'd missed that I'd been added as a reviewer. This looks good to me. (Though I do have a couple questions in the inline comments.)

lld/COFF/LLDMapFile.cpp
69

Stupid question: Is it possible for two symbols to share an address? If so, should this be a std::stable_sort for deterministic build outputs?

lld/COFF/MapFile.cpp
61

Is the lambda necessary in this sort? Don't std::pairs naturally use lexicographic ordering?

Do all the Defined *s live inside a single object l(ike an array)? If not, then comparing pointers to individual ones is technically undefined behavior though it's very likely to work as expected.

saudi marked 2 inline comments as done.Mar 10 2020, 10:27 AM
saudi added inline comments.
lld/COFF/LLDMapFile.cpp
69

There definitely are symbols that share an address.
Note that LLDMapFile.cpp was previously MapFile.cpp which was renamed and mostly unchanged (just adapted for the rename).
But I don't mind fixing it on top, should I do that in this commit?

lld/COFF/MapFile.cpp
61

Indeed, the Define*s come from various ObjFiles so we can't expect them to come from the same array.

This sort is just to regroup entries with the same pointer value, for std::unique to work.
I just noticed:

  • comparing the entire pair is not necessary here (we just need to regroup same pointers)
  • we just need to compare the value of the pointers, so maybe converting to size_t would be more relevant?

I guess the sort could become :

parallelSort(v, [](const auto &a, const auto &b) {
  return (size_t)a.first < (size_t)b.first;
});

Thanks for answering my questions.

lld/COFF/LLDMapFile.cpp
69

I think switching to stable_sort should probably be a separate commit. That way, if I'm wrong, it can be rolled back without affecting anything else.

lld/COFF/MapFile.cpp
61

Casting to size_t is fine, but I don't think it's necessary given how you answered the first question here.

I always get nervous when I see a custom comparator that ignores some of the fields--I worry that somehow it's not providing a strict weak total ordering. But I think it's fine here.

That said, you could just use std::lexicographical_compare rather than a lambda.

ruiu added inline comments.Mar 13 2020, 12:43 AM
lld/COFF/LLDMapFile.cpp
69

Good point. You can do either in this patch or in a follow-up patch, but this should definitely be stable_sort for build reproducibility. Our promise to users is that as long as you are using the exact same version of lld and feeding the exact same set of files and command line options, lld's outputs will be the same no matter what the host system is. For that reason we generally avoid unstable sort.

This revision was automatically updated to reflect the committed changes.
lld/test/COFF/Inputs/map.yaml