This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objcopy] Add --build-id-link-dir flag
ClosedPublic

Authored by jakehehrlich on Nov 10 2018, 3:01 PM.

Details

Summary

This flag does not exist in GNU objcopy but has a major use case. Debugging tools support the .build-id directory structure to find debug binaries. There is no easy way to build this structure up however. One way to do it is by using llvm-readelf and some crazy shell magic. This implements the feature directly. It is most often the case that you'll want to strip a file and send the original to the .build-id directory but if you just want to send a file to the .build-id directory you can copy to /dev/null instead.

I happen to have implemented the "find build id" algorithm like 3 or 4 times now. You can find reference implementations that are known to work here and here.

The .build-id use case documentation can be found here: https://sourceware.org/gdb/onlinedocs/gdb/Separate-Debug-Files.html

In Fuchsia we now have a lot of tools that fetch binaries based on their build id but we do it using this awful non-standard thing we call "ids.txt". This will let us replace it.

Diff Detail

Repository
rL LLVM

Event Timeline

jakehehrlich created this revision.Nov 10 2018, 3:01 PM
mcgrathr added inline comments.Nov 10 2018, 3:53 PM
llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
140 ↗(On Diff #173532)

Is there a check (and test coverage) for the error case where the build ID is less than two bytes?

142 ↗(On Diff #173532)

Where's the documentation for the behavior here?
The linked-to gdb documentation describes the .build-id/ directory convention in which it looks for xx/xxx.debug, not xx/xxx.
This is also what lldb implements (it only ever looks for xx/xxx.debug, never xx/xxx).

If the intent here is to publish the full, unstripped input file as the debug file, then it should have the .debug suffix for the gdb convention.
elfutils does look for xx/xxx when looking for an ELF file cold by build ID (which is the use case here, but not something AFAIK that either gdb or lldb supports), and then only looks for xx/xxx.debug if the main file doesn't have debug info.
But that's not the common convention.

When objcopy is being used to strip and produce a separate debug file (with SHT_NOBITS replacements for SHF_ALLOC sections, etc.), then what you want to link as xx/xxx.debug is the separate debug file, not the input file. In that case the best behavior IMHO is to also link xx/xxx to the stripped output file. That's consistent with e.g. Fedora/CentOS/RHEL debuginfo rpms, which include the separate debug file in /usr/lib/debug/.build-id/xx/xxx.debug and a symlink to the installed (stripped) binary as /usr/lib/debug/.build-id/xx/xxx.

143 ↗(On Diff #173532)

Is this stray debug output?

146 ↗(On Diff #173532)

In POSIX terms, this is usually done by checking for errno==EEXIST rather than another filesystem call to see if that was the reason it failed.

llvm/tools/llvm-objcopy/ELF/Note.h
22 ↗(On Diff #173532)

static is always superfluous inside an anonymous namespace.
This should be const char *const or constexpr if LLVM allows newer C++.
Except it's unused, so it shouldn't be here at all.

24 ↗(On Diff #173532)

Actually some systems do use 64-bit fields in ELFCLASS64 SHT_NOTE (or PT_NOTE) data.

91 ↗(On Diff #173532)

So this tool does not handle cross-endian file access?
And what's wrong with ELFFile::notes() anyway?

jakehehrlich added inline comments.Nov 10 2018, 4:25 PM
llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
140 ↗(On Diff #173532)

This ArrayRef will be empty in that case. There is however no check to make sure that the build ID isn't empty so the [0] above could fail in theory.

142 ↗(On Diff #173532)

--only-keep-debug is a nop in llvm-objcopy. There is also no all in one "strip-to-file" option right now. It's certainly doable but it doesn't currently exist. The semantics would be the same weather the input file was copied or we went though the motions.

143 ↗(On Diff #173532)

yes.

146 ↗(On Diff #173532)

create_hard_link doesn't seem to give any better information that could be used to do this across all systems.

llvm/tools/llvm-objcopy/ELF/Note.h
24 ↗(On Diff #173532)

llvm doesn't handle them and they're outliers.

91 ↗(On Diff #173532)

Ah crap...well that *is* an issue that I should fix I suppose.

llvm/tools/llvm-objcopy/ELF/Note.h
18 ↗(On Diff #173532)

i think we should not have anonymous namespaces in headers,
every time this header is included everything will be duplicated.
so i would remove this anonymous namespace, leave only the "public" interface declarations in this header and move the details of implementation into .cpp

22 ↗(On Diff #173532)

if i remember correctly the existing Visual Studio build bots were not happy with this, though things might have changed.

llvm/tools/llvm-objcopy/ELF/Note.h
22 ↗(On Diff #173532)

(i meant "with this usage of constexpr")

I started to rewrite it in terms of ELFFile. The main challenge nis dealing
with the Error object that we don't really care about.

cc'ing blaike as a local debug expert -- I'm not familiar with systems using --build-id

llvm/test/tools/llvm-objcopy/build-id-link-dir.test
3 ↗(On Diff #173532)

Does this also work for in-place invocations?

llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
140 ↗(On Diff #173532)

I agree -- this should error (with a test too :) ) if build id is less than 2 bytes

142 ↗(On Diff #173532)

nit: comment the true here, e.g. /*LowerCase*/true

144 ↗(On Diff #173532)

Why create a hard link instead of copying the file?

llvm/tools/llvm-objcopy/ELF/Note.h
1 ↗(On Diff #173532)

I don't know if you need this file at all. NoteIterator/NoteRange looks like it's reimplementing the note iterator in libobject
e.g. https://github.com/llvm-mirror/llvm/blob/master/include/llvm/Object/ELF.h#L228

llvm/tools/llvm-objcopy/ObjcopyOpts.td
167 ↗(On Diff #173532)

Should this be applied to llvm-strip as well?

phosek added inline comments.Nov 12 2018, 11:22 AM
llvm/tools/llvm-objcopy/ObjcopyOpts.td
167 ↗(On Diff #173532)

Yes, I believe we'll need that as well.

@mcgrathr 's comments are probably more relevant than mine - especially those that, by the sounds of it, remove the need for adding a new iterator (but I only found that after I'd already written some of that - and figured the comments might be useful general feedback/'learnings' anyway :) )

llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
131 ↗(On Diff #173532)

Prefer = instead of () to make it clear you're not relying on an explicit ctor/conversion?

SmallString<123> Path = Config.BuildIdLinkDir;
llvm/tools/llvm-objcopy/ELF/Note.h
59 ↗(On Diff #173532)

An Explicit default ctor seems a bit novel - is there anything this is particularly protecting against? (seems like it'd make it invalid to use {} to construct a NoteIterator - is that important?)

Also: Maybe use non-static data member initializers?

const uint64_t *Begin = nullptr;
const uint64_t *End = nullptr;
const Note *Cur = nullptr;

(& then this ctor could be "NoteIterator() = default;")

62 ↗(On Diff #173532)

Might as well use the init list for Cur too?

66–70 ↗(On Diff #173532)

Usually folks omit {} on single-line (or single statement) scopes in LLVM code.

(also could use a conditional operator, since each branch is an assignment to the same variable - but that's super subjective, of course :) )

73–74 ↗(On Diff #173532)

Prefer non-member versions of operator overloads so any implicit conversions apply equally to the LHS and RHS? (they can still be friend definitions defined inline in the class definition here)

& maybe define one in terms of the other for consistency?

83–86 ↗(On Diff #173532)

Const qualifying value return types is a bit problematic, so far as I recall - doesn't allow the value to be moved from, etc, etc?

90 ↗(On Diff #173532)

I don't think there's much use of {} ctor invocation in LLVM, so I'd suggest using () for consistency.

90 ↗(On Diff #173532)

Worth using std::find? Probably not, but figured I'd mention/ask in case.

cc'ing blaike as a local debug expert -- I'm not familiar with systems using --build-id

Thanks! Had it on my "to look at" list in any case (& have had some discussions with the Fuschia folks about build-id previously) but I really know very little about build-id, where it came from, who uses it/why, how broad the adoption (& thus how reasonable the request for tooling functionality, etc) is. But certainly here to pay attention/provide what feedback & learn what I can from it all.

So I fixed this issue: https://reviews.llvm.org/D54451

but now I seem to have found a different issue that causes a program failure to happen due to Error being handled improperly. After which we should be using the ELFType iterator implementation.

llvm/test/tools/llvm-objcopy/build-id-link-dir.test
3 ↗(On Diff #173532)

It works it's just odd because you modify the the thing you sent to the build id directory.

llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
144 ↗(On Diff #173532)

To save space but I guess copying would make the inplace case more sensible. Generally the build system may still need the unstripped file around. For instance, LLD doesn't support linking against binaries stripped with --strip-sections. Because debug binaries total to a rather large amount, not hard-linking would all at once double the size of the build. Currently in Fuchsia, on a lower layer, our build requires ~30gb of space. Of that 10 gb is for unstripped binaries. If we copied we'd suddenly shoot our build size up to 40gb for the smallest part. This isn't a terrible cost mind you but some people have complained about the space requirements. That seems like a high cost to pay to me.

llvm/tools/llvm-objcopy/ELF/Note.h
1 ↗(On Diff #173532)

Yeah, I'm rewriting everything in terms of that right now. This file will go away.

llvm/tools/llvm-objcopy/ObjcopyOpts.td
167 ↗(On Diff #173532)

Sounds good, I'll add it.

Not technically something we can't work around, the following code solves my problem now that I understand it.

Error Err = Error::success();
if (Err) assert(false); // This only serves to ensure that Err is checked.

This is despite that fact that I check each time before the destructor would be called...I also have to give to the library unchecked as well or else if it assigns an error the program will crash. Will upload change soon.

jakehehrlich added inline comments.Nov 12 2018, 6:37 PM
llvm/test/tools/llvm-objcopy/build-id-link-dir.test
3 ↗(On Diff #173532)

Sorry, this is not right as my tests show. I thought this would be the case but I remembered that FileBuffer atomically renames the file into place which has different semantics. I wrote that comment piror to thinking about that and didn't delete this.

llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
144 ↗(On Diff #173532)

So also follow up on this comment as well, most of this is still correct but this behavior is actually exactly as sensible as copying and also saves significant space. Also almost all tools use the rename at the end thing needed to make this true to gain atomicity and avoid causing problems on failure. So I think hard linking is just strictly better.

rupprecht added inline comments.Nov 13 2018, 12:18 PM
llvm/test/tools/llvm-objcopy/build-id-link-dir.test
5 ↗(On Diff #173804)

Is test -e portable (i.e. on windows)? Just ls might be a better option.

llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
123 ↗(On Diff #173804)

This should probably error instead of assert, otherwise this will cause weird failures depending on debug vs release builds

I don't really understand why it's needed from this comment either -- can you elaborate the comment?

(Probably the libobject code should just be fixed, but I don't see that as a blocker)

125 ↗(On Diff #173804)

The ELF dumper in llvm-readobj only looks for notes in the program headers if it's a core file:
https://github.com/llvm-mirror/llvm/blob/master/tools/llvm-readobj/ELFDumper.cpp#L3883
Honestly I'm not sure why it does it though. Do you need to do similar logic here?

173 ↗(On Diff #173804)

BuildIdBytes.getValue().slice(1) instead

578 ↗(On Diff #173804)

Is the In here supposed to be Out?

llvm/tools/llvm-objcopy/ObjcopyOpts.td
167 ↗(On Diff #173532)

Maybe you just haven't gotten to it yet, but I still don't see the Stripopts.td change in this patch

169 ↗(On Diff #173804)

This looks long -- you should be able to format this automatically with git-clang-format --extensions td master

dblaikie added inline comments.Nov 13 2018, 12:23 PM
llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
155–156 ↗(On Diff #173804)

Curious API design - perhaps you wouldn't trip over the Error clearing issues if you used

Expected<ArrayRef<uint8_t>> findBuildID(In);

Errors as out parameters are, tricky.

llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
140 ↗(On Diff #173532)

as David mentioned below - maybe return Expected<...> instead ? imho that would be more readable, no "Out" parameters, etc ?

jakehehrlich marked 16 inline comments as done.
  1. Switched to expected
  2. Used slice
  3. Formatted td file
llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
123 ↗(On Diff #173804)

This is very subtle and this assert should never trigger, hence the assert. I changed it to llvm_unreachable instead. Error has state, checked vs unchecked. libObject assigns to the error rather than returning an error, but this is faulty for this reason. You can only construct an error using Error::success() but by default that is unchecked. If you assign to an unchecked Error, your program will assert fail. To avoid this, I just manually check the successful error. This is just an indication that libObject chose a bad way to propagate errors in this case. I can't really avoid it unfortunately.

125 ↗(On Diff #173804)

That's not really ideal behavior; ELFDumper should check both. For build ids the correct place to look is always via the program headers because only non-ET_REL binaries have build ids and such binaries will always have a PT_NOTE if they have a build id. This means stripped binaries can (even those stripped with --strip-sections) can be matched up with their build id.

155–156 ↗(On Diff #173804)

As I have discovered by trying to deal with the one in libObject you'll see above. I've switched the code to use Expected.

578 ↗(On Diff #173804)

No, the build id is unchanged. Thinking about it, we should probably extract the build id once and then pass the build id...not recompute it twice.

llvm/tools/llvm-objcopy/ObjcopyOpts.td
169 ↗(On Diff #173804)

woah...I didn't know that was a thing!

dblaikie added inline comments.Nov 14 2018, 5:50 PM
llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
137–138 ↗(On Diff #174139)

Maybe sink this into the "unwrapOrError" loop (& sink the Error itself into just before the "notes" loop) - shortest scope as needed helps make the code a bit easier to understand/reason about.

123 ↗(On Diff #173804)

This also probably isn't ideal - ah, here's the tool for this:

http://llvm.org/doxygen/classllvm_1_1ErrorAsOutParameter.html - hopefully that addresses your issues here, specifically "notes" should use an ErrorAsOutParameter to safely clear the error state of the callers new Error object.

adding @lhames just as FYI for some stumbling over the Error-as-out-parameter usage.

Ah, found the documentation for it (beyond the basic doxygen): http://llvm.org/docs/ProgrammersManual.html#fallible-constructors

I would probably use more detailed help messages for the new options (i.e. what exactly the last two options mean (name's suffix) and how exactly the path is constructed from the build-id),
so that users can understand what's going on without looking at the actual implementation.
other than that - LGTM

This revision is now accepted and ready to land.Nov 14 2018, 8:53 PM
jhenderson requested changes to this revision.Nov 15 2018, 3:07 AM

Sorry for not getting to this earlier. It's been a busy week for me.

llvm/test/tools/llvm-objcopy/build-id-link-dir.test
5 ↗(On Diff #173804)

I can't verify for all Windows versions, but on mine, it is part of GnuWin32 tools, which is a prerequisite, I believe.

11 ↗(On Diff #174139)

--build-id-link-output= ... looks weird and I don't think should be valid syntax. Could it not just be --build-id-link-output?

If specifying an output is valid, there should be a test for that...

Basically, I'm a little confused as to what is going on here and in the next test.

llvm/test/tools/llvm-objcopy/no-build-id2.test
1 ↗(On Diff #174139)

Why is this test a separate test? Perhaps naming it differently would help clarify.

llvm/tools/llvm-objcopy/CopyConfig.cpp
291 ↗(On Diff #174139)

clang-format?

llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
125 ↗(On Diff #174139)

Why the constexpr here? Pretty sure it's not going to achieve much. Also, I don't think this is the correct case for constants in LLVM - I think it's supposed to be the same as other local variables.

139 ↗(On Diff #174139)

Use makeStringError, and llvm::errc::invalid_argument, to avoid the std::make_error_code. It might be nice to also mention the object file name in the error message.

162 ↗(On Diff #174139)

It might be nice to add a little bit more to this error message to make it clear what build ID is too small (e.g. "the build ID in file "foo" is smaller than two bytes").

165 ↗(On Diff #174139)

/*LowerCase*/

179 ↗(On Diff #174139)

Comment the true here too.

182 ↗(On Diff #174139)

This should probably mention where it's trying to link to, as well.

llvm/tools/llvm-objcopy/ObjcopyOpts.td
169 ↗(On Diff #174139)

Use -> Used
or maybe better yet:
Set directory for --build-id-link-input and --build-id-link-output to <dir>

Using the <dir> meta variable name for the two options in this help text is a bit confusing, given that <dir> is also the name of the meta variable for this option.

This revision now requires changes to proceed.Nov 15 2018, 3:07 AM
jakehehrlich marked 9 inline comments as done.
jhenderson added inline comments.Nov 27 2018, 2:49 AM
llvm/test/tools/llvm-objcopy/bad-build-id.test
4 ↗(On Diff #175368)

You don't need to capture "T" here, since you don't use it in a later CHECK. You can do {{.*}} instead.

llvm/test/tools/llvm-objcopy/build-id-link-dir.test
11 ↗(On Diff #174139)

The weird syntax still exists: trailing '=' looks like something is missing. I'd prefer a solution to it. Either --build-id-link-output should be a valid switch on its own with no argument, as well as --build-id-link-output=<some value> or some other solution si needed.

llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
139 ↗(On Diff #174139)

Oops, yes, createStringError. You don't need the std::make_error_code now.

132–135 ↗(On Diff #175368)

clang-format, and unnecessary braces.

145 ↗(On Diff #175368)

I might be missing something, but I think the parentheses here are all unnecessary.

164 ↗(On Diff #175368)

Remove this return and the parentheses, since error() is a no-return function.

P.S. Why do we have separate error and reportError functions?

170–171 ↗(On Diff #175368)

Could these two lines be combined to the following?

if (auto EC = sys::fs::create_hard_link(ToLink, Path)) {
jakehehrlich marked 4 inline comments as done and an inline comment as not done.

Responded to latest comments.

llvm/test/tools/llvm-objcopy/build-id-link-dir.test
5 ↗(On Diff #173804)

I have a windows box with a minimal install for this exact reason. I'll try it later to make sure it works and report back.

11 ↗(On Diff #174139)

Setting an output is valid. It's just that stripped binaries traditionally have no extension. an empty "=" is commonly used to pass an empty string and is valid for all "=" arguments across llvm from the argument parser's perspective. This is equivalent to --build-id-link-output ""

The imagined use case for --build-idlink-output that uses a non-empty string would be if you're not stripping, you're adding or removing something and you need ".debug" afterwards. Or using "--only-keep-debug" despite that not having a proper implementation at this exact moment. The example for this would be llvm-objcopy --build-id-link-output=.debug --only-keep-debug <input> <output> there's also this informally proposed idea that has floated around a few people here of adding a third ".dwp" file to the build id directory but I think people wanted to put that feature directly in llvm-dwp. Also in that case you're more or less directly using llvm-objcopy just to do the copy to the build id directory and no other operation.

llvm/test/tools/llvm-objcopy/no-build-id2.test
1 ↗(On Diff #174139)

The intent was for one to have notes but no build id entry and the other to just not have notes. Those would need two different yaml files. I guess I forgot to add that. They've been added now.

llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
125 ↗(On Diff #174139)

In other code bases I've been working in constexpr has been prefered. As for the case correctness. This is a defined constant in system's <elf.h> but not in llvm's BinaryFormat header for elf. I'm adding it here as it would normally appear in the header. I'll remove the constexpr but I'd like to keep the name.

179 ↗(On Diff #174139)

I realized it wasn't needed and dropped in instead. I commented the other one as requested however.

164 ↗(On Diff #175368)
  1. Because I copied the error handeling functions from other places in the very first commit and we've just kept it so far
  2. They all do slightly different things. Though I think I'm not consistent about when I use the Error form of reportError and when I manually use error() with an Error
170–171 ↗(On Diff #175368)

yep. Thanks!

LGTM, with a couple of minor comments.

llvm/test/tools/llvm-objcopy/build-id-link-dir.test
11 ↗(On Diff #174139)

Ah, okay that makes more sense now. Thanks. "" is a bit clearer to me, I guess.

llvm/test/tools/llvm-objcopy/no-build-id2.test
1 ↗(On Diff #174139)

Okay. Maybe rename "no-build-id2.test" to "no-build-id-no-notes.test"? Not too bothered though. Alternatively, a comment in the two files explaining the difference.

llvm/tools/llvm-objcopy/ELF/ELFObjcopy.cpp
125 ↗(On Diff #174139)

Could we not just add it to the header?

jhenderson accepted this revision.Dec 3 2018, 3:53 AM
This revision is now accepted and ready to land.Dec 3 2018, 3:53 AM
jakehehrlich marked an inline comment as done.

Aight, uploading the final patch so people can see exactly what I'm going to push.

This revision was automatically updated to reflect the committed changes.