This is an archive of the discontinued LLVM Phabricator instance.

Add -gnu-map option to emit a map file in the GNU-style format.
Needs Review · Public

Authored by ruiu on Jun 12 2019, 2:36 AM.

Details

Summary

lld supports the -Map option to write layout information to a specified
file in a human-readable format. GNU ld and gold support that option as
well, but their output format differs from lld's.

If the output were consumed only by humans, that would not be a problem.
However, in practice, there are a lot of post-link tools that parse map
files, and they almost always support the GNU-style map file for
historical reasons. It is not too hard to update a post-link tool if it is
written in a scripting language, but if it is a proprietary tool, it's not
easy to update. In reality, it is sometimes very hard to update a tool so
that it can read the lld-style map file.

So, in this patch, I added a -gnu-map option to print out a map file in
the GNU-style format. I'm not super happy to do this, as this is basically
a duplicate feature, but given that the amount of code needed to implement
the feature is very small, I think doing this makes sense. It should help
users who have tools that consume map files.

Event Timeline

ruiu created this revision. Jun 12 2019, 2:36 AM
Herald added a project: Restricted Project.

That being said, -Map output is oftentimes processed by post-link tools

Just curious, do you have some concrete examples? Since ld.bfd and gold have different -Map formats, these tools may have to learn the format of gold's -Map output when they migrate from ld.bfd to gold, no?

MaskRay added inline comments. Jun 12 2019, 2:50 AM
lld/ELF/Options.td
192

Without looking at the documentation, I think people would most likely think -gnu-map means a GNU ld compatible format.

ld.bfd is GNU ld :) (I have to say -gold-map sounds like a weird option name)

ruiu added a comment. Jun 12 2019, 2:51 AM

Yes, we have an internal user who has a proprietary tool to process -Map output. As for the differences between bfd and gold, they don't look too big, so perhaps most tools can consume both. For this output, I modeled gold.

I think it is unlikely that tools will accept both the gold and bfd map file formats. From a conversation with an employee of a Linux distro, gold was used about 3% of the time, although I expect its use to be significantly higher at Google. I've seen map file post-processing used most in embedded systems, as this is where placement of sections and size information is most crucial. I think most of those users are with ld.bfd at the moment, though.

At the Euro LLVM binutils (LLD as honorary member) it was suggested that a machine readable format option for outputs such as the map file would be ideal. There was some belief that binutils could be persuaded to adopt a similar format. The TI proprietary linker does this http://downloads.ti.com/docs/esd/SLAU131K/Content/SLAU131K_HTML/xml_link_information_file_description.html#STDZ0820750 although they use XML which is probably overkill. Something like JSON would be quite simple for tools to parse.

Would it be worth asking if the customer would be willing to rewrite their parser to consume something like JSON and go down that route? Having said all that I've no objections to adding support for other linker map file formats, and adding gold doesn't prevent adding others later.

lld/ELF/Options.td
192

I agree with MaskRay here. Perhaps another command line option --map-file-format=<format> where <format> defaults to lld. We could then add gold, bfd etc.

Not had a chance to go through the gold source yet to see if there is anything obviously missing. From a quick experiment using a linker-script derived from ld.bfd it seems like gold's map file remains much the same, whereas ld.bfd will give information about symbol resolution etc. Will take a look at the gold source this afternoon.

lld/test/ELF/gnu-map.s
65

The path names will be different, so this will need some kind of regex to skip the first part.

I've been through the gold source to see if I could spot any significant differences. I've recorded what I found, although I'm not sure how significant the differences are to someone attempting to parse the map file.

lld/ELF/MapFile.cpp
257

gold right justifies the size, but I doubt that is significant.

259

gold will print out the LMA at this point if it is different, followed by "(before compression)" if the output section has been subject to debug compression.

268

Gold has some custom name printing for its linker generated content (do_print_to_mapfile). For example the .plt comes out as

** PLT

.rela.dyn comes out as

** dynamic relocs

273

gold size is right justified (not significant).

274

Gold does not print the path and object for linker generated content like the .plt, whereas LLD prints out <internal>:(.plt)

277

gold only prints "ordinary symbols", for example it does not print sharedFoo, sharedBar, sharedFunc1, sharedFunc2

MaskRay added inline comments. Jun 13 2019, 11:58 PM
lld/ELF/MapFile.cpp
245

gold reports ** file header and ** segment headers.

257

gold essentially uses 0x%08llx for 32-bit targets.

(the actual code is fprintf(this->map_file_, "0x%0*llx %10s", parameters->target().get_size() / 4, static_cast<unsigned long long>(os->address()), sizebuf);)
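
The `*` width in that format string is the target's bit size divided by 4, which is the number of hex digits in a full-width address. As a rough illustration only, here is a Python sketch of the same formatting. The function name `format_addr_size` and the sample size string "0x20" are assumptions made for the example; how gold fills `sizebuf` is not shown in this thread.

```python
def format_addr_size(address: int, size_text: str, target_bits: int) -> str:
    """Mimic the "0x%0*llx %10s" format string quoted above.

    target_bits // 4 is the number of hex digits in a full-width address,
    which is what the fprintf call passes as the '*' width argument.
    size_text stands in for gold's sizebuf (its contents are assumed here).
    """
    return "0x%0*x %10s" % (target_bits // 4, address, size_text)


# 32-bit targets get 8 hex digits, 64-bit targets get 16.
line32 = format_addr_size(0x1000, "0x20", 32)
line64 = format_addr_size(0x1000, "0x20", 64)
```

The `%10s` conversion is what right-justifies the size column, which matches the earlier observation that gold right-justifies sizes.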

259

LMA is printed if either AT(LMA) or AT>LMA_REGION is specified.

This can be tested with if (OSec->LMAExpr || OSec->LMARegion) but we don't seem to store the LMA address as gold does. So I don't know how we can dump this piece of information.

268

** gdb_index
** GOT PLT
** PLT
** .LA25.stubs
** section headers
** .reginfo
** string table
** relocs for static relocation sections.

MaskRay added inline comments. Jun 14 2019, 12:01 AM
lld/test/ELF/gnu-map.s
65

This can be tested with:

FileCheck -DOBJ=%t.o %s

// CHECK-NEXT: .... [[OBJ]]:(.text)

phosek added a subscriber: phosek. Jun 19 2019, 3:10 PM

I just had a discussion with one of our internal teams that's looking at lld and this exact request came up (and it wasn't the first time). There are several examples of people parsing (or asking for help with parsing) .map files:

There are also many internal examples from Google that I cannot show publicly.

Given how often we're being asked for this, especially from teams working in the embedded space that heavily relies on -Map output, JSON output is something I'd like to look into soon.

When that's supported, we could provide a script that would allow converting the JSON output to GNU-style format. That would obviate the need for supporting every possible output format directly in lld.
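
To make the idea of such a conversion script concrete, here is a minimal Python sketch. Everything about the input is an assumption: no JSON schema exists in this thread, so the "sections" list and its "name", "address" and "size" keys are hypothetical. The text layout is also only an approximation: the address and size columns follow the "0x%0*llx %10s" formatting quoted from gold above, while the 16-column name field is invented for the example.

```python
import json


def json_to_text_map(json_text: str, target_bits: int = 64) -> str:
    """Render a hypothetical JSON map as plain-text lines.

    The schema ({"sections": [{"name", "address", "size"}, ...]}) is an
    assumption for illustration, not a format lld emits.
    """
    doc = json.loads(json_text)
    lines = []
    for sec in doc["sections"]:
        size_text = "0x%x" % sec["size"]
        # Name column width (16) is arbitrary; address/size follow the
        # "0x%0*llx %10s" pattern quoted from gold in this review.
        lines.append(
            "%-16s0x%0*x %10s"
            % (sec["name"], target_bits // 4, sec["address"], size_text)
        )
    return "\n".join(lines) + "\n"


example = json.dumps(
    {"sections": [{"name": ".text", "address": 0x201000, "size": 0x20}]}
)
text_map = json_to_text_map(example)
```

A real converter would also have to reproduce input sections, symbols, and gold's special names for linker generated content, none of which this sketch attempts.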

mierle added a subscriber: mierle. Jul 8 2019, 1:26 PM

I am one of the engineers at Google wanting to have machine readable .map file output from lld and ld. Regarding Peter's suggestion to eventually only offer a .json format and helper scripts, then removing the other .map output formats, I'm less excited. In practice maintaining scripts (and their associated runtimes and dependencies) can be a big burden. The map export code isn't enormous and so I would suggest we just keep it.

lld/ELF/Options.td
192

As someone who wants to get a JSON-based map format added to LLD, I like Peter's suggestion. With JSON added, the flags could be:

  • --map-file-format=lld
  • --map-file-format=gold
  • --map-file-format=json
  • --map-file-format=csv

which feels clean, readable, and extendable.
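
One way to picture why a single format option extends cleanly is a table of writers keyed by format name, where adding a format means adding one entry. The Python sketch below is illustrative only and is not how lld implements options: the writer functions, the section dictionaries, and `render` are hypothetical, and the "lld" writer is a placeholder rather than lld's real map layout. A "gold" writer would be registered the same way.

```python
import argparse
import csv
import io
import json


def write_lld(sections):
    # Placeholder: stands in for the default text format, not its real layout.
    return "".join("%s %x %x\n" % (s["name"], s["address"], s["size"]) for s in sections)


def write_json(sections):
    return json.dumps({"sections": sections})


def write_csv(sections):
    buf = io.StringIO()
    writer = csv.writer(buf)
    writer.writerow(["name", "address", "size"])
    for s in sections:
        writer.writerow([s["name"], s["address"], s["size"]])
    return buf.getvalue()


# Adding a new format means adding one entry here.
WRITERS = {"lld": write_lld, "json": write_json, "csv": write_csv}


def parse_format(argv):
    parser = argparse.ArgumentParser()
    parser.add_argument("--map-file-format", choices=sorted(WRITERS), default="lld")
    return parser.parse_args(argv).map_file_format


def render(argv, sections):
    return WRITERS[parse_format(argv)](sections)


SECTIONS = [{"name": ".text", "address": 0x201000, "size": 0x20}]
json_map = render(["--map-file-format=json"], SECTIONS)
```

With no option given, `parse_format([])` falls back to "lld", matching the suggestion that the default stays the lld format.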