This is an archive of the discontinued LLVM Phabricator instance.

[COFF] Move section name encoding into BinaryFormat
ClosedPublic

Authored by npmiller on Feb 2 2022, 7:43 AM.

Details

Summary

Large COFF section names are moved into the string table and the section header field is the offset into the string table encoded in ASCII for offset smaller than 7 digits and in base64 for larger offsets.

The operation of taking the string table offsets is done in a few places in the codebase, so it is helpful to move this operation into BinaryFormat so that it can be shared everywhere it's done.

So this patch takes the implementation of this operation from llvm/lib/MC/WinCOFFObjectWriter.cpp and moves it into BinaryFormat.

This is a prerequisite for https://reviews.llvm.org/D118692 that fixes an issue where llvm-objcopy was implementing this encoding operation incorrectly.

Diff Detail

Event Timeline

npmiller created this revision.Feb 2 2022, 7:43 AM
npmiller requested review of this revision.Feb 2 2022, 7:43 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 2 2022, 7:43 AM
mstorsjo added a reviewer: rnk.Feb 2 2022, 7:56 AM
mstorsjo added a subscriber: mstorsjo.
rnk accepted this revision.Feb 2 2022, 8:55 AM

lgtm

This revision is now accepted and ready to land.Feb 2 2022, 8:55 AM
jhenderson accepted this revision.Feb 2 2022, 11:27 PM

Some nits from me. Otherwise, LGTM.

llvm/include/llvm/BinaryFormat/COFF.h
735

NameSize isn't referenced anywhere in this signature - I have to go a long way to find what it refers to. Perhaps replace NameSize with the NameSize constant value or something to that effect?

llvm/lib/BinaryFormat/CMakeLists.txt
15

Nit: filenames look to be in alphabetical order in this list, so move COFF.cpp much earlier.

llvm/lib/BinaryFormat/COFF.cpp
2

Nit: No need for the "*- C++-*" bit of this line - this is only for .h files where the extension doesn't signify the language.

npmiller updated this revision to Diff 405555.Feb 3 2022, 2:35 AM

Fixed file order in CMakeLists.txt, removed unnecessary *- C++-* in .cpp source file.

For the NameSize comment I simply added the namespace, COFF::NameSize, which should make it clear that it's referring to the constant value.

jhenderson accepted this revision.Feb 3 2022, 2:39 AM

Great, looks good.

@npmiller If you don't have commit access, I can push this patch for you. Can you provide your preferred git author line, i.e. "Real Name <email@address>"?

@npmiller If you don't have commit access, I can push this patch for you. Can you provide your preferred git author line, i.e. "Real Name <email@address>"?

@mstorsjo yeah I don't have commit access so that would be great, for the git author line: "Nicolas Miller <nicolas.miller@codeplay.com>", thank you!

@npmiller If you don't have commit access, I can push this patch for you. Can you provide your preferred git author line, i.e. "Real Name <email@address>"?

@mstorsjo yeah I don't have commit access so that would be great, for the git author line: "Nicolas Miller <nicolas.miller@codeplay.com>", thank you!

Awesome, I'll push it in a little while after building and running tests then.

Btw, lld/COFF/Writer.cpp also has another instance of code doing the same (OutputSection::writeHeaderTo), and it'd be nice to refactor that to use the same shared routine now. It's probably less common to have over 10^6 bytes of string table in an executable, but it's clearly possible. It has a slightly different set of bugs (it writes the null byte into the following field if it's up to 10^7 bytes, and doesn't implement the base64 alternative at all), so it'd be good to fix the same there too.

This revision was landed with ongoing or failed builds.Feb 21 2022, 3:51 AM
This revision was automatically updated to reflect the committed changes.