Page MenuHomePhabricator

[lld] [ELF] Support customizing behavior on target triple
Needs RevisionPublic

Authored by mgorny on Jan 13 2019, 2:00 PM.

Details

Summary

Support customizing LLD behavior on target triple; in particular,
having defaults depend on target system. The triple can be either
passed via --target= option, inferred from filename (${triple}-ld.lld)
or stored at build time as LLVM default.

Diff Detail

Event Timeline

mgorny created this revision.Jan 13 2019, 2:00 PM

I'm happy with this approach since the triple can be set on the commandline.
If this gets merged I will update CHERI lld to use the triple instead of the new emulation that we added.

ELF/Driver.cpp
757

I think failing on an unknown triple would be good since it can avoid user error where the wrong defaults are chosen just because of a small typo.

mgorny updated this revision to Diff 181552.Jan 14 2019, 7:09 AM
mgorny retitled this revision from [lld] [ELF] Support inferring target triple from filename to [lld] [ELF] Support customizing behavior on target triple.
mgorny edited the summary of this revision. (Show Details)
mgorny marked 2 inline comments as done.
mgorny added inline comments.
ELF/Driver.cpp
757

Somewhat done. Note that LLD apparently doesn't fail on errors immediately, so it behaves a bit weird.

mgorny updated this revision to Diff 181554.Jan 14 2019, 7:18 AM
mgorny marked an inline comment as done.

Fixed leaving triple unset on invalid --target.

I think we need to be very careful here. If I've understood this correctly then if this functionality is used for anything critical then a manually supplied target will be needed when doing cross-linking. For example my host LLD is x86_64, is just called ld.lld and will have an inferred x86_64 target triple. If someone customises the behaviour of LLD on the triple in a way that doesn't get caught by the test suite then we could get some strange breakages when doing cross-linking. I personally would prefer to see any option like this not try and auto-infer the target unless it can do it reliably and accurately from the input objects and I don't know if that is possible for all supported targets.

I think this might be better raised on lllvm-dev as I suspect that we need to give this wider visibility. I'm not totally opposed to this as I can see that --target has some advantages over adding extra emulations, or relying on the --target in the compiler driver but I think we need to be careful to define how this will interact with the emulation, and what the bounds of what we can customise with the option are?

ruiu accepted this revision.Jan 14 2019, 10:21 AM

Let me make it clear again that I'm *not* okay with this approach. Please explicitly pass command line arguments from the compiler driver to the linker.

This revision is now accepted and ready to land.Jan 14 2019, 10:21 AM
ruiu requested changes to this revision.Jan 14 2019, 10:22 AM
This revision now requires changes to proceed.Jan 14 2019, 10:22 AM
krytarowski added a comment.EditedJan 30 2019, 7:06 AM

@ruiu we need some resolution here. We have stronger feelings from the community to customize the linker directly based on detected triple.

At least other !GNU platforms will benefit from it too (at least FreeBSD, OpenBSD and other BSD derivations like mentioned CheriBSD).
FreeBSD already adds such target customization based on an emulation name hack, we cannot repeat it on NetBSD in the same way.

In the worst case we can even produce a new flavor of lld target that resembles/duplicates original ELF and targets (Net)BSD.

It does allow us better customization with keeping compat with other toolchain components, especially from the the GNU world.

Ideally we would get something for LLVM 8.0.. as it could be still backportable at least for downstream purposes. RC2 is planned for Feburary 6th and we need unpatched lld functional out of the box.

Under what circumstances and what model will you let to support this approach?

ruiu added a comment.Jan 30 2019, 10:44 AM

Kamil,

I understand how adding --target option to the linker would make target-specific customization easier, but that's as I said not acceptable by design in lld. The code we have for FreeBSD just sets one bit in the output file, and that's fundamentally different from adding a "master switch" to turn on all customization for some specific target. We've been making changes to lld to make it work for many targets, but the features we've added are on by default (such as the code for Android TLS layout or supporting .openbsd.randomdata section for OpenBSD) or enabled by per-feature basis by a command line flag (such as *-freebsd emulation name). There's no flag like "turning lld into FreeBSD mode" or something like that. It seems to be working pretty well, and I'm happy about that design. So, I'm sorry, but I don't think I can support this patch's approach at all.

Honestly what's special about NetBSD from the lld's perspective seems to be the code ownership of the compiler driver. I can help you solve technical issues, but I don't think I should accept a patch that's against one of the lld's fundamental design choices to solve non-technical problems.

I think the main concern is that if we link ld -> ld.lld, gcc will no longer work out of the box. So we would have to repeat all the logic in gcc and in any other compiler that calls ld directly.

ruiu added a comment.Jan 30 2019, 11:09 AM

But that's only adding missing -L and perhaps a few more, no? That doesn't seem too hard to do in gcc. I don't want to repeat what compiler drivers do in the linker. Also, even with this patch, you need to make a change to gcc to pass --target parameter to the linker, right?

But that's only adding missing -L and perhaps a few more, no? That doesn't seem too hard to do in gcc.

Probably not, provided nobody will block it. I'm not the one opposed to it.

I don't want to repeat what compiler drivers do in the linker. Also, even with this patch, you need to make a change to gcc to pass --target parameter to the linker, right?

Well, no, since gcc doesn't really support cross-compiling the way clang does, so it will just be happy with LLVM's default triple. We will probably want to patch GNU ld to accept/ignore --target though.

ruiu added a comment.Jan 30 2019, 11:29 AM

If you still need to patch GNU ld, it doesn't seems like this patch makes things easier for you. (But even if this would make it easier for you, this patch's approach is not okay by design though.)

If we pass flags from clang, we sacrifice:

  • lld usable as a standalone executable
  • gcc capable to use lld as a functional linker
  • clang driver not capable to call unpatched gnu ld/gold as we grow local flags to workaroud the customization (such as -z nognustack)
  • compat with a number of programs programs that call linker directly and we customize the passed flags (gcc itself is one of them)

We just gain clang+lld tandem to work and we must retain GCC/LD interoperability (which as I understand isn't a priority/interesting for other OSes).

In theory we can patch:

  • GCC to pass lld flags
  • GNU LD (/gold) to accept new flags
  • all the programs using linker to stop doing so (like the GCC project)

We need to tune in lld at least for start (out of my head, likely not a finished list):

  • tune the DT_NEEDED behavior,
  • rpath behavior,
  • GNU stack generation
  • specify default target/emulation search paths
  • disable 3 segments (disable ro one)

We can address all the issues either in lld or in all the other projects around which certainly doesn't scale (and I expect in particular problems to push e.g. -z nognustack to GNU ld(1)).

OpenBSD indeed patches downstream their linkers for additional flags and accidentally it works better in the lld design. In NetBSD we struggle to get sane defaults rather than a set of additional flags. GNU stack is unwanted also on at least OpenBSD(/Fuchsia..) and there is no way to disable it in lld as of now (a proposal is to grow a custom flag). All OSes need to pass default paths for standalone linker usage (I've listed a dozen of examples from FreeBSD ports in other review).

The current feeling in the project is that if we cannot set the defaults we should fork the lld driver downstream and patch it there, which is a lose as the upstream version will keep to be unusable.

If you still need to patch GNU ld, it doesn't seems like this patch makes things easier for you. (But even if this would make it easier for you, this patch's approach is not okay by design though.)

If we can get standalone lld functional out of the box and respecting target, we no longer need to patch GCC/LD/distribution/3rd-party-software.

If there is a strong feeling to retain lld ELF target as it is, we can grow a custom target like Darwin/Windows and specify customization there. The mentioned Darwin adds /usr/lib in the linker directly. We need analogous opportunity on our end.

krytarowski added a comment.EditedJan 30 2019, 3:18 PM

chandlerc added a comment.

There was a long discussion between two NetBSD maintainers about this (both already in the reviewers list of this patch). I'm not sure if there is an existing thread that would be better to follow up on as opposed to starting a fresh thread.

The discussion is ongoing since 2017 and we have an impasse.

I think the question was: should the path search logic be handled in the clang driver or in LLD itself. FWIW, I prefer that NetBSD follow the same design as every other platform here with the (somewhat C and C++ specific) search logic provided by the clang driver.

Path logic is unfortunately just the tip of iceberg of customization we need, but it's true that its the first problem to encounter when attempting to use lld/NetBSD. Depending on the approach of solving it we can follow up with other changes.

Also to be clear, the current design of LLD eliminates usa-cases of using ld that we find important to keep supporting (and they are in real-world usage) - upstream doesn't want to handle them for (as far as I understand) the sake of simplicity -- the result is that LLD deliverables are highly biased with a specific OS target, which can (and actually do) produce dysfunctional binaries for others. So in other words this patch is more like adding a missing feature that can be turned into benefit of every ELF target, adding skeleton of standalone support and we can follow up with other OS-specific changes.

For now we need to use GCC with lld with these patches applied, but this is not a desirable solution (I call it a failure).

@ruiu ping?

LLVM 9 is branching soon and we would like to unearth from local patches.

From the internal discussions it was decided that we want to maintain our current behavior that differs to Linux.

MaskRay requested changes to this revision.EditedJul 1 2019, 10:27 PM

krytarowski: The discussion is ongoing since 2017 and we have an impasse.

peter.smith: I think this might be better raised on lllvm-dev as I suspect that we need to give this wider visibility. I'm not totally opposed to this as I can see that --target has some advantages over adding extra emulations, or relying on the --target in the compiler driver but I think we need to be careful to define how this will interact with the emulation, and what the bounds of what we can customise with the option are?

Peter suggested raising your problem on the llvm-dev mailing list. Have you done that?

Personally I am opposed to this change. The compiler driver (gcc,clang) has a set of arch/OS dependent defaults. It seems weird/redundant to add another sets of defaults on the linker side.

A few replies to some bullet points you raised in a previous comment.

GCC to pass lld flags

Compiler drivers have more information (OS version, toolchain path, linker name, etc) to decide if some linker flags are needed. Some -f options can generate linker options as well. Your statistics on the other revision suggested there are very few users of ld. For them, they usually want to accomplish a very specific goal by dancing with a linker script.

Have you thought that the arch/os/environment dependent behaviors will actually make them harder to port?

rpath behavior

From my reading of https://blog.netbsd.org/tnf/entry/the_first_report_on_lld , I don't see why NetBSD must maintain the --disable-new-dtags local patch.

GNU stack generation

-z nognustack makes sense to me. I commented on D56554.

specify default target/emulation search paths

This job should be done in the compiler drivers.

disable 3 segments

Personally I don't think it is really necessary to change the default.. But you can add --no-rosegment if you are really bothered by the layout.

I think your requirement can be simply accomplished with a shell script wrapper called ld:

#!/bin/sh
# customization
exec ld.lld <options>

the result is that LLD deliverables are highly biased with a specific OS target, which can (and actually do) produce dysfunctional binaries for others

I don't agree. You might have this belief because many mainstream OSes are dominated by GNU toolchain for a decade (there is an interesting topic on llvm-dev/2019-June that may catch your eyes).
I'm glad this doesn't happen on lld land.

If you read my recent test changes in various components of llvm and lld, you'll find I avoid -unknown-linux-gnu as long as it makes sense :) I don't want my changes geared toward Linux "quirks" (as you might say).

krytarowski added a comment.EditedJul 2 2019, 8:02 AM

Personally I am opposed to this change. The compiler driver (gcc,clang) has a set of arch/OS dependent defaults. It seems weird/redundant to add another sets of defaults on the linker side.

The NetBSD toolchain relies on internal knowledge in a linker that can be used as a standalone program. This is our design choice and this was decided internally to be kept. You might not like it from some point of view, but it is our real-world use-case, we wrote set of patches for it, we maintain a buildbot building and running lld tests... we try to upstream it.

A similar customization is actually done for (at least) Darwin (as the only OS using MachO) and FreeBSD (by an accident it's possible to differentiate FreeBSD on emul name). Please see that Darwin appends default paths (last time I checked) and for FreeBSD there are some fixups based on detecting FreeBSD. For OpenBSD there is an accident that it works thanks to embedding 'openbsd' in customization parts. For NetBSD we don't have such escape detection viable to include fixups and we shall detect triple target based on lld executable name.

If you don't like to see this code by defaut in lld, we can include it under #ifdef or overload the 'elf' namespace into 'netbsdelf'.

We need to unearth from our local patches and get upstream version of lld functional.