This is an archive of the discontinued LLVM Phabricator instance.

Add support for OUTPUT_ARCH linker script command
AbandonedPublic

Authored by void on Dec 5 2018, 12:27 AM.

Details

Summary

The OUTPUT_ARCH linker script command overrides the "-m <arch>"
command line flag.

Event Timeline

void created this revision.Dec 5 2018, 12:27 AM
void added a comment.Dec 5 2018, 12:41 AM

Note: I tried to add as many BFD arch entries as I could muster from binutils. It's obviously not complete, so let me know of any further entries you think should be here.
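For readers unfamiliar with what such a table looks like, here is a minimal sketch of a lookup from BFD architecture names to ELF properties. It is not the code from this patch: `ArchInfo`, `lookupBfdArch`, and the `kEM*` constants are hypothetical names chosen for illustration, and only three entries are shown. The numeric values are the standard ELF `e_machine` codes.

```cpp
#include <cstdint>
#include <optional>
#include <string_view>

// Standard ELF e_machine values (names are hypothetical, values are from the ELF spec).
constexpr std::uint16_t kEM386 = 3;
constexpr std::uint16_t kEMX86_64 = 62;
constexpr std::uint16_t kEMAArch64 = 183;

// Hypothetical record describing what an OUTPUT_ARCH name implies.
struct ArchInfo {
  bool is64Bit;
  std::uint16_t machine;
};

// Illustrative, deliberately incomplete table of BFD arch names.
std::optional<ArchInfo> lookupBfdArch(std::string_view name) {
  struct Entry {
    std::string_view name;
    ArchInfo info;
  };
  static constexpr Entry table[] = {
      {"i386", {false, kEM386}},
      {"i386:x86-64", {true, kEMX86_64}},
      {"aarch64", {true, kEMAArch64}},
  };
  for (const Entry &e : table)
    if (e.name == name)
      return e.info;
  return std::nullopt; // Unknown name: the caller decides whether to warn or error.
}
```

Returning an empty optional for unknown names leaves the policy question (ignore, warn, or fail) to the caller, which matters when the table is known to be incomplete.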

grimar added a subscriber: grimar.Dec 5 2018, 4:38 AM

What is your use case? I wonder what is the reason to use OUTPUT_ARCH to override -m?

We started to support OUTPUT_FORMAT recently, but it does not override the
EKind, EMachine and MipsN32Abi values if they were already found from reading the files.

ELF/ScriptParser.cpp
407

I did not debug it, but I think the following case would fail:

If you take a 32-bit object test32.o and a script saying the arch is 64-bit,
and invoke ld.lld test32.o -T script

then the script should override the EKind/EMachine and crash the linker somewhere
(I suspect it will try to parse the files using a wrong ELFT then).
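The mismatch described in this comment could in principle be detected before any file is parsed with the wrong ELF type. The following is a rough sketch under stated assumptions, not lld code: `ElfClass`, `readElfClass`, and `scriptArchConflicts` are hypothetical helpers. The only facts relied on are the ELF magic bytes and the `EI_CLASS` byte at offset 4 of the header (1 for 32-bit, 2 for 64-bit).

```cpp
#include <cstddef>
#include <cstdint>
#include <optional>

enum class ElfClass { Elf32, Elf64 };

// Reads the EI_CLASS byte from an ELF header, or returns nullopt if the
// buffer is too short, lacks the ELF magic, or has an unknown class.
std::optional<ElfClass> readElfClass(const std::uint8_t *data, std::size_t size) {
  if (size < 5 || data[0] != 0x7f || data[1] != 'E' || data[2] != 'L' ||
      data[3] != 'F')
    return std::nullopt;
  switch (data[4]) {
  case 1:
    return ElfClass::Elf32;
  case 2:
    return ElfClass::Elf64;
  default:
    return std::nullopt;
  }
}

// True when the input is a recognizable ELF file whose class differs from
// the class implied by the linker script.
bool scriptArchConflicts(ElfClass scriptClass, const std::uint8_t *data,
                         std::size_t size) {
  std::optional<ElfClass> fileClass = readElfClass(data, size);
  return fileClass.has_value() && *fileClass != scriptClass;
}
```

A linker could report a diagnostic when such a conflict is found instead of continuing with mismatched settings; whether that is the right policy is exactly what this thread is debating.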

ruiu added a comment.Dec 5 2018, 7:36 AM

What is the motivation to do this? We don't try too hard to implement every detail of the linker script because it's just too complicated and there is usually an easy workaround that works without a linker script.

void added a comment.Dec 5 2018, 12:56 PM

What is your use case? I wonder what is the reason to use OUTPUT_ARCH to override -m?

It was pointed out that the "-m" flag has slightly different semantics than what we would expect:

From Ian Taylor:

"The documentation for the option says:

-m emulation
    Emulate the emulation linker. You can list the available emulations with the ‘--verbose’ or ‘-V’ options.
    If the ‘-m’ option is not used, the emulation is taken from the LDEMULATION environment variable, if that is defined.
    Otherwise, the default emulation depends upon how the linker was configured.

"The point is that the emulation does not primarily set the expected object file format. It sets the default linker behavior. With the BFD linker, it primarily sets the linker script to use (the BFD linker always uses a linker script). If you then provide a different linker script using the -T option, that naturally overrides the linker script selected by the -m option."
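The selection order in the quoted documentation (the -m option first, then the LDEMULATION environment variable, then the configured default) can be sketched as a small function. This is only an illustration of that precedence: `pickEmulation` and its parameters are hypothetical, and it does not model how a linker script passed with -T then replaces the script that the emulation selected.

```cpp
#include <optional>
#include <string>

// Returns the emulation name following the documented precedence:
// -m option, then LDEMULATION, then the linker's configured default.
std::string pickEmulation(const std::optional<std::string> &mFlag,
                          const std::optional<std::string> &ldEmulationEnv,
                          const std::string &configuredDefault) {
  if (mFlag)
    return *mFlag;
  if (ldEmulationEnv)
    return *ldEmulationEnv;
  return configuredDefault;
}
```

For example, `pickEmulation(std::nullopt, std::string("elf_i386"), "elf_x86_64")` selects the value from the environment because no -m option was given.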

void added a comment.Dec 5 2018, 12:58 PM

What is the motivation to do this? We don't try too hard to implement every detail of the linker script because it's just too complicated and there is usually an easy workaround that works without a linker script.

The motivation is to support linking Linux, which uses linker scripts heavily. I added a comment by IanT from another thread about why this is a good addition.

void marked an inline comment as done.Dec 5 2018, 2:12 PM
void added inline comments.
ELF/ScriptParser.cpp
407

It doesn't look like that's the case:

[morbo@fawn:llvm] cat script 
OUTPUT_ARCH(i386:x86-64)
[morbo@fawn:llvm] clang -m32 -c z.c
[morbo@fawn:llvm] ./llvm.opt.obj/bin/ld.lld z.o -T script 
ld.lld: warning: cannot find entry symbol _start; defaulting to 0x401000

ruiu added a comment.Dec 5 2018, 2:52 PM

Even with this patch, don't you still have to make a change to lld so that it accepts a set of input files of different targets?

void added a comment.Dec 5 2018, 3:02 PM

Even with this patch, don't you still have to make a change to lld so that it accepts a set of input files of different targets?

I don't understand. They should all be for the same target. This is just telling the linker which target they're actually for.

ruiu added a comment.Dec 5 2018, 3:10 PM

Hmm, you said that you are handling inputs that are a mix of i386 and x86-64 object files, so I thought this was for that issue. What am I missing?

ruiu added a comment.Dec 5 2018, 3:17 PM

Looks like what you are trying to do is to generate an x86-64 executable from i386 object files. That's what we do not expect in lld. With this patch, in theory you can create AArch64 executables from x86-64 object files, but that doesn't make sense and is likely to crash the linker because we assume that all object files are uniform in terms of target types.

I think this use case is too tricky to directly support in the linker. I'd probably use objcopy or something to transplant i386 code into an x86-64 object file before passing it to the linker, so that we don't deal with the trickiness in the linker.

grimar added a comment.Dec 5 2018, 3:19 PM

What is the motivation to do this? We don't try too hard to implement every detail of the linker script because it's just too complicated and there is usually an easy workaround that works without a linker script.

The motivation is to support linking Linux, which uses linker scripts heavily. I added a comment by IanT from another thread about why this is a good addition.

Now I remember. We saw this bug earlier and decided to report an issue to the Linux kernel and not change the linker.
Bug reported here: https://bugzilla.kernel.org/show_bug.cgi?id=194091.

void added a comment.Dec 5 2018, 3:19 PM

Looks like what you are trying to do is to generate an x86-64 executable from i386 object files. That's what we do not expect in lld. With this patch, in theory you can create AArch64 executables from x86-64 object files, but that doesn't make sense and is likely to crash the linker because we assume that all object files are uniform in terms of target types.

I think this use case is too tricky to directly support in the linker. I'd probably use objcopy or something to transplant i386 code into an x86-64 object file before passing it to the linker, so that we don't deal with the trickiness in the linker.

Sorry, I've never said that I want to generate an x86-64 executable from i386 object files. The OUTPUT_ARCH command doesn't do that anyway...

void added a comment.Dec 5 2018, 3:23 PM

What is the motivation to do this? We don't try too hard to implement every detail of the linker script because it's just too complicated and there is usually an easy workaround that works without a linker script.

The motivation is to support linking Linux, which uses linker scripts heavily. I added a comment by IanT from another thread about why this is a good addition.

Now I remember. We saw this bug earlier and decided to report an issue to the Linux kernel and not change the linker.
Bug reported here: https://bugzilla.kernel.org/show_bug.cgi?id=194091.

That bug report is very old and no action has been taken on it. From what I copied above from Ian Taylor, the bug you reported isn't actually a bug, but a misunderstanding of what the -m flag means.

ruiu added a comment.Dec 5 2018, 3:27 PM

Sorry, I'm really confused. Could you explain again for me why you need this patch to link the Linux kernel?

void added a comment.Dec 5 2018, 3:49 PM

Sorry, I'm really confused. Could you explain again for me why you need this patch to link the Linux kernel?

The Linux kernel uses linker scripts heavily. They include the OUTPUT_ARCH command. It's a feature that's not quite the same as the -m command line option. It's not cruft and has an actual real use.

void abandoned this revision.Dec 5 2018, 3:57 PM
tpimh added a subscriber: tpimh.Dec 6 2018, 12:58 AM