This is an archive of the discontinued LLVM Phabricator instance.

[lld][ELF] Add support for -b binary
ClosedPublic

Authored by Bigcheese on Aug 30 2016, 3:44 PM.

Details

Summary

Implemented by building an ELF file in memory.

The -b formats 'elf', 'default', and 'binary' match gold behavior.

This is needed for a bunch of kernel drivers in FreeBSD.

Diff Detail

Event Timeline

Bigcheese updated this revision to Diff 69771.Aug 30 2016, 3:44 PM
Bigcheese retitled this revision from to [lld][ELF] Add support for -b binary.
Bigcheese updated this object.
Bigcheese added a reviewer: ruiu.
Bigcheese added a project: lld.
Bigcheese added a subscriber: llvm-commits.
ruiu added inline comments.Aug 30 2016, 4:52 PM
ELF/Driver.cpp
500

I'd directly call readLinkerScript if OPT_script. It should allows you to remove the second parameter from addFile.

506

We use S as a local variable name in other places for a return value of getValue.

515

I think getSpelling always return -b, so I'd just print out "unknown -b format: ".

ELF/InputFiles.cpp
714

Can you move all these details into SimpleELFWriter so that we can just do something like

std::vector<uint8_t> Buf = createELFFromBinary(MB);
return createELFFile<ObjectFile>(MemoryBufferRef(Buf, MB.getBufferIdentifier()));
ELF/SimpleELFWriter.h
1

Writer is I think a confusing name because it sounds like an alternative Writer. It is actually rather a part of readers. I'd probably name this BinaryFile.h.

Bigcheese added inline comments.Aug 30 2016, 5:40 PM
ELF/Driver.cpp
500

I was looking at that, but then I'd have to copy:

if (Config->Verbose)
  outs() << Path << "\n";

Optional<MemoryBufferRef> Buffer = readFile(Path);
if (!Buffer.hasValue())
  return;
MemoryBufferRef MBRef = *Buffer;

Which seems like more complexity than adding a param.

506

Will do.

515

It can also return -format=.

Also I believe that would be bad practice in general. It makes it really easy to get into the situation where we print the "wrong" thing.

ELF/InputFiles.cpp
714

If the concern is just having this much code in InputFile.cpp I'm fine with moving it somewhere else and calling it from here.

As for moving the code into SimpleELFWriter, I would rather keep SimpleELFWriter simple and not mix responsibilities. I plan to extend/refactor SimpleELFWriter to LLVMObject at some point and use it for final ELF emission for MC, split dwarf, objcopy (maybe), etc.

This was extracted (and reduced) from our internal version of the linker which has need of it elsewhere. So I'm already at 3 active uses of this code including this patch.

ruiu added inline comments.Aug 30 2016, 6:15 PM
ELF/Driver.cpp
500

Ah right. I'm fine with your change.

ELF/InputFiles.cpp
714

My concern was mostly about the amount of code you added to this file. What we want to do here is to wrap a binary blob with an ELF header, and we are not interested in the details, so I wanted to move all these details to other file.

I see a room to simplify the writer thing, but as long as it is encapsulated into one file, we can improve the implementation independently from other file.

ELF/SimpleELFWriter.h
31

Please create a .cpp file and move the definitions to that file.

rafael added inline comments.
ELF/InputFiles.cpp
724

These names look wrong. With gold and bfd I get

_binaryhome_espindola_llvm_build_tools_lld_test_ELF_Output_format_binary_test_tmp_binary_start
_binary
home_espindola_llvm_build_tools_lld_test_ELF_Output_format_binary_test_tmp_binary_size
_binary__home_espindola_llvm_build_tools_lld_test_ELF_Output_format_binary_test_tmp_binary_end

emaste added inline comments.Sep 7 2016, 11:16 AM
ELF/InputFiles.cpp
724

I believe it should be _binary_ + alnum_pathname + {_start, _end, _size}, where alnum_pathname is the full pathname passed to the linker with non-alnum characters replaced by _.

I still thing that writing an ELF to read it back is a bad design, but the refactoring needed to avoid it is non-trivial. We are almost there for symbols, but sections require more work.

So please upload a last version with the linux build fixed, the symbol names changes and the nit in this comment. With that, it probably LGTM and will try to refactor it afterwards.

ELF/Driver.cpp
507

You can write this as

case OPT_format: {
  StringRef Val = Arg->getValue();
  if (Val == "elf" || Val == "default")
    Config->Binary = false;
  else if (Val == "binary")
    Config->Binary = true;
  else
    error("unknown " + Arg->getSpelling() + " format: " + Arg->getValue() +
          " (supported formats: elf, default, binary)");
  break;
}
Bigcheese updated this revision to Diff 70644.Sep 7 2016, 8:37 PM
  • Split SimpleELFWriter into .cpp
  • Fix nits
  • Fix linux build
  • Fix symbol names
rafael accepted this revision.Sep 8 2016, 4:35 AM
rafael added a reviewer: rafael.

LGTM

test/elf/format-binary.test
2

The test directory name is wrong. It should be ELF, not elf.

This revision is now accepted and ready to land.Sep 8 2016, 4:35 AM
emaste added a comment.Sep 8 2016, 8:54 AM

One issue remains in using this for the FreeBSD kernel module build -- we don't specify the output format / emulation right now, because we build GNU ld with the output target compiled-in. This results in an error (see D24348 for my suggested improvement to the error message).

I think it's much better if we don't have to build the linker multiple times though (once for each supported target). The right thing for us to do is make the build pass in -m.

ruiu added inline comments.Sep 8 2016, 9:19 AM
ELF/CMakeLists.txt
20 ↗(On Diff #70644)

I still don't like this file name. SimpleELFWriter sounds like it is for -o binary instead of -b. Can you rename BinaryFile.cpp?

ELF/InputFiles.cpp
736

Please move these details to SimpleELFWriter.cpp.

emaste added a comment.Sep 8 2016, 1:26 PM

The right thing for us to do is make the build pass in -m.

That change is now in review for FreeBSD, https://reviews.freebsd.org/D7837

Bigcheese updated this revision to Diff 70753.Sep 8 2016, 2:30 PM
Bigcheese edited edge metadata.
  • Change name to ELFCreator
ruiu added inline comments.Sep 8 2016, 2:32 PM
ELF/InputFiles.cpp
736

How about this?

Bigcheese added inline comments.Sep 8 2016, 2:35 PM
ELF/InputFiles.cpp
736

Well, the code doesn't belong in ELFCreator.cpp. I can make a BinaryFile.cpp for it, but it's only 40 lines of code.

ruiu added inline comments.Sep 8 2016, 2:39 PM
ELF/InputFiles.cpp
736

That distinction doesn't make much sense to me because this is part of the linker. If you have a concrete plan to move ELFCreator to some library, it may make sense, but it doesn't seem to happen soon.

silvas added a subscriber: silvas.Sep 8 2016, 3:14 PM
silvas added inline comments.
ELF/InputFiles.cpp
736

Like Michael already said, ELFCreator is already used in at least 2 other places for PS4. It doesn't make sense to move this BinaryFile-specific code into ELFCreator.

ruiu added inline comments.Sep 8 2016, 5:17 PM
ELF/InputFiles.h
310

Is this equivalent to std::unique_ptr<char *>? If so, please write that way.

Can you use uint8_t instead of char?

Bigcheese added inline comments.Sep 8 2016, 5:51 PM
ELF/InputFiles.h
310

It needs to be [] so that delete[] is called instead of delete. But I'm actually going to change it to std::vector.

I can change it to uint8_t.

ruiu accepted this revision.Sep 9 2016, 2:57 PM
ruiu edited edge metadata.

LGTM. Let's land this. Thank you for doing this!

This revision was automatically updated to reflect the committed changes.