This is an archive of the discontinued LLVM Phabricator instance.

[llvm-objcopy] Default --output-target to --input-target when unspecified
ClosedPublic

Authored by MaskRay on Sep 3 2019, 11:34 PM.

Details

Summary

Fixes PR42171.

In GNU objcopy, if -O (--output-target) is not specified, the value is
copied from -I (--input-target).

objcopy -I binary -B i386:x86-64 a.txt b       # b is copied from a.txt
llvm-objcopy -I binary -B i386:x86-64 a.txt b  # b is an x86-64 object file

This patch changes our behavior to match GNU. GNU objcopy has another
strange behavior:

objcopy -I binary -O elf64-x86-64 a.txt b.o                # e_machine is EM_NONE
objcopy -I binary -B i386:x86-64 -O elf64-x86-64 a.txt b.o # e_machine is EM_X86_64

It is documented that -B is not used unless -I binary. Then, it is
unclear why the e_machine field cannot be inferred from -O. We just
fill in the e_machine field regardless of -B.

Diff Detail

Repository
rL LLVM

Event Timeline

MaskRay created this revision.Sep 3 2019, 11:34 PM
MaskRay retitled this revision from [llvm-objcopy] Set --output-target to --input-target is unspecified to [llvm-objcopy] Set --output-target to --input-target if unspecified.Sep 3 2019, 11:35 PM
MaskRay retitled this revision from [llvm-objcopy] Set --output-target to --input-target if unspecified to [llvm-objcopy] Default --output-target to --input-target when unspecified.

Would you mind explaining the purpose of the -B switch now? I'm just struggling to grasp it a little at the moment.

test/tools/llvm-objcopy/ELF/binary-input-and-output.test
11 ↗(On Diff #218596)

Nit: are specified -> are both specified.

15 ↗(On Diff #218596)

Would you also mind adding a second hash to the existing comments in this file for consistency throughout? Happy for that to be a separate NFC commit if you want, or in this one.

If you do that, please add the missing trailing full-stops too.

21 ↗(On Diff #218596)

Nit: are specified -> are both specified.

Perhaps also worth adding "for non-binary output" or similar, to distinguish this case from the above similar one.

test/tools/llvm-objcopy/ELF/binary-input-error.test
3–4 ↗(On Diff #218596)

This is valid behaviour now, right? We need a test case to show this works.

test/tools/llvm-objcopy/ELF/binary-input.test
4 ↗(On Diff #218596)

I think you can remove the -B here, right?

test/tools/llvm-objcopy/ELF/new-symbol-visibility.test
4 ↗(On Diff #218596)

Can you remove the -B here and elsewhere in this test too?

MaskRay updated this revision to Diff 218635.Sep 4 2019, 3:32 AM
MaskRay marked 7 inline comments as done.
MaskRay edited the summary of this revision. (Show Details)

Address review comments

binutils-gdb/binutils/NEWS says

* New command-line switch to objcopy -B (or --binary-architecture) which sets
  the architecture of the output file to the given argument.  This option only
  makes sense, if the input target is binary.  Otherwise it is ignored.
  By Stefan Geuken.

I assume it means -B is only useful when -I binary is used. Now there are 2 cases:

  • -I binary -O binary => it is a copy. -B is ignored.
  • -I binary -O elf64-x86-64 => elf64-x86-64 makes the output little-endian and 64-bit, but e_machine is not set. -B i386:x86-64 is supposed to set the e_machine field. This means if -I binary is used and -O is not binary, -B should always be set. I don't know why it behaves that way. I guess we may be able to ignore -B entirely. I'll do that in the next patch.
% file b.o
b.o: ELF 64-bit LSB relocatable, no machine, version 1 (SYSV), not stripped
test/tools/llvm-objcopy/ELF/binary-input-error.test
3–4 ↗(On Diff #218596)

Added a test to binary-input-and-output.test

## If -O is not specified, it defaults to -I, i.e. "binary".
# RUN: llvm-objcopy -I binary %t.txt %t.3.txt
# RUN: cmp %t.txt %t.3.txt
jhenderson added inline comments.Sep 4 2019, 9:11 AM
test/tools/llvm-objcopy/ELF/binary-input-and-output.test
22 ↗(On Diff #218635)

%t.3.txt -> %t.5.txt here and the line below.

test/tools/llvm-objcopy/ELF/binary-output-target.test
18–19 ↗(On Diff #218635)

Was this case just missing before? Maybe worth making this change independently.

MaskRay updated this revision to Diff 218729.Sep 4 2019, 9:18 AM
MaskRay marked 2 inline comments as done.

%t.3.txt -> %t.5.txt

Delete a powerpc32 test which did not exist before

test/tools/llvm-objcopy/ELF/binary-output-target.test
18–19 ↗(On Diff #218635)

It was missed before. Will add it separately.

So it seems like we've been preferring the architecture implied by -O instead of the one preferred by -B. My intuition is that it should be the opposite. Did we find a reason to prefer using -O?

If our intent is to prefer using -O over -B when possible (as seems to be the case) then this all looks good to me. I just want an explanation if one exists.

MaskRay added a comment.EditedSep 5 2019, 12:09 AM

If our intent is to prefer using -O over -B when possible (as seems to be the case) then this all looks good to me. I just want an explanation if one exists.

I prefer -O to -B. -O seems more natural to describe the desired output format (O -> output, the opposite of -I -> input).

{llvm-,}objcopy -O binary a.o a.txt can extract the text file. There is no -B alternative, so -O has to be kept around.

After this change, we can essentially ignore -B, like what elftoolchain (used by FreeBSD) does:

case 'B':
  /* ignored */
  break;
  ...
case 'I':
case 's':
  set_input_target(ecp, optarg);
  break;
  ...
case 'O':
  set_output_target(ecp, optarg);
  break;
% elfcopy/elfcopy -I binary -O elf64-x86-64 /tmp/c/a.txt /tmp/c/a.o; file /tmp/c/a.o
/tmp/c/a.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped

# I have tried a very old version of binutils (~2003). The weird behavior has been around for many years...
% /tmp/p/binutils-2.14/Release/binutils/objcopy -I binary -O elf64-x86-64 a.txt a.o; file a.o  
a.o: ELF 64-bit LSB relocatable, no machine, version 1 (SYSV), not stripped
% /tmp/p/binutils-2.14/Release/binutils/objcopy -I binary -B i386:x86-64 -O elf64-x86-64 a.txt a.o; file a.o
a.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped
jakehehrlich accepted this revision.Sep 13 2019, 4:06 PM

I'm happy with this since the votes seem to favor this behavior.

This revision is now accepted and ready to land.Sep 13 2019, 4:06 PM
llvm/trunk/test/tools/llvm-objcopy/ELF/binary-input-and-output.test