This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objdump] Print file format in lowercase to match GNU output.
ClosedPublic

Authored by rupprecht on Feb 11 2020, 12:00 PM.

Details

Summary

GNU objdump prints the file format in lowercase, e.g. elf64-x86-64. llvm-objdump prints ELF64-x86-64 right now, even though piping that into llvm-objcopy refuses that as a valid arch to use.

As an example of a problem this causes, see: https://github.com/ClangBuiltLinux/linux/issues/779

Diff Detail

Event Timeline

rupprecht created this revision.Feb 11 2020, 12:00 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald Transcript
MaskRay accepted this revision.Feb 11 2020, 1:23 PM

Verified that coff, elf, and mach-o are printed as lower cases. I believe bpf is similar.

llvm/tools/llvm-objdump/llvm-objdump.cpp
2176

outs() << ":\tfile format " << O->getFileFormatName().lower() << "\n\n";

This revision is now accepted and ready to land.Feb 11 2020, 1:23 PM
MaskRay added a subscriber: grimar.EditedFeb 11 2020, 1:26 PM

Wait. I wonder whether we can change llvm-readobj to use lower case names as well. The following should be updated:

StringRef ELFObjectFile<ELFT>::getFileFormatName() const {
  bool IsLittleEndian = ELFT::TargetEndianness == support::little;
  switch (EF.getHeader()->e_ident[ELF::EI_CLASS]) {
  case ELF::ELFCLASS32:
    switch (EF.getHeader()->e_machine) {
    case ELF::EM_386:
      return "ELF32-i386";
    case ELF::EM_IAMCU:
      return "ELF32-iamcu";
    case ELF::EM_X86_64:
      return "ELF32-x86-64";
    case ELF::EM_ARM:

These are almost bfdnames, except that they use upper cases. I don't find a compelling argument using upper case ELF, so I vote for changing these ELF to elf. @jhenderson @grimar @sbc100 thoughts?

// lib/Object/WasmObjectFile.cpp
StringRef WasmObjectFile::getFileFormatName() const { return "WASM"; }
MaskRay added a subscriber: sbc100.Feb 11 2020, 1:30 PM
MaskRay added a comment.EditedFeb 11 2020, 1:55 PM

I think the patch is good to go on its own, after the outs() << ":\tfile format " << O->getFileFormatName().lower() << "\n\n"; change.

If the consensus is to also change llvm-readobj output, we can teach getFileFormatName() to use lower case names and remove .lower() from llvm-objdump.cpp . It will involve a bunch of other test case updates. They can be done in a separate change.

rupprecht updated this revision to Diff 243991.Feb 11 2020, 1:57 PM
rupprecht marked an inline comment as done.
  • Use StringRef::lower()

Wait. I wonder whether we can change llvm-readobj to use lower case names as well. The following should be updated:

StringRef ELFObjectFile<ELFT>::getFileFormatName() const {
  bool IsLittleEndian = ELFT::TargetEndianness == support::little;
  switch (EF.getHeader()->e_ident[ELF::EI_CLASS]) {
  case ELF::ELFCLASS32:
    switch (EF.getHeader()->e_machine) {
    case ELF::EM_386:
      return "ELF32-i386";
    case ELF::EM_IAMCU:
      return "ELF32-iamcu";
    case ELF::EM_X86_64:
      return "ELF32-x86-64";
    case ELF::EM_ARM:

These are almost bfdnames, except that they use upper cases. I don't find a compelling argument using upper case ELF, so I vote for changing these ELF to elf. @jhenderson @grimar @sbc100 thoughts?

// lib/Object/WasmObjectFile.cpp
StringRef WasmObjectFile::getFileFormatName() const { return "WASM"; }

Yep... I was originally thinking we could update those to be lowercase, but that's kinda a broader change, and I imagine most llvm developers are used to seeing "ELF64-..." and I'm not sure it's worth changing all the tools just so that objdump can be more GNU compatible.

I personally wouldn't mind it. If others are in favor of it too, I'm all for it and can make the change.

tpimh added a subscriber: tpimh.Feb 12 2020, 1:13 AM

I don't really have a strong opinion either way. By strict style rules, "ELF" is an acronym, so should be all upper-case. In regular text, I'd expect it to be ELF everywhere too. On the other hand, as pointed out, "elf" is the bfd naming style.

For WASM, I recall there being some discussion to standardise on a particular format, and it would make sense if we can be consistent across all file formats. Perhaps @sbc100 could bring his insights from that side over.

Wasm is not an acronym so we don't usually display it in uppercase. So yes, I think we should change WasmObjectFile::getFileFormatName() to return lowercase.

This revision was automatically updated to reflect the committed changes.