This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Support a few elf32lriscv_* & elf64lriscv_* emulations
Accepted · Public

Authored by MaskRay on Jan 30 2021, 7:29 PM.

Details

Summary

Also add tests showing that such emulations can take an _fbsd suffix to indicate
ELFOSABI_FREEBSD. cf. https://reviews.llvm.org/D95370#2532572
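
The suffix handling described in the summary can be sketched roughly as follows. This is an illustrative Python model, not the code in lld/ELF/Driver.cpp: the names `Emulation`, `parse_emulation`, and `_BASE` are hypothetical, and only the two base RISC-V names are listed because the summary does not spell out which suffixed names the patch accepts. The numeric constants are the standard ELF values.

```python
from typing import NamedTuple

# Standard ELF constants.
ELFCLASS32, ELFCLASS64 = 1, 2
EM_RISCV = 243
ELFOSABI_NONE, ELFOSABI_FREEBSD = 0, 9


class Emulation(NamedTuple):
    """Hypothetical result type: what an emulation name selects."""
    ei_class: int
    e_machine: int
    osabi: int


# Assumption: only the base names are modelled here.
_BASE = {
    "elf32lriscv": (ELFCLASS32, EM_RISCV),
    "elf64lriscv": (ELFCLASS64, EM_RISCV),
}


def parse_emulation(name: str) -> Emulation:
    """Strip an optional _fbsd suffix, then look up the base name."""
    osabi = ELFOSABI_NONE
    base = name
    if base.endswith("_fbsd"):
        base = base[: -len("_fbsd")]
        osabi = ELFOSABI_FREEBSD
    if base not in _BASE:
        # Mirrors the idea of a negative test: unknown names are rejected.
        raise ValueError(f"unknown emulation: {name}")
    ei_class, e_machine = _BASE[base]
    return Emulation(ei_class, e_machine, osabi)
```

In this model, `parse_emulation("elf64lriscv_fbsd")` differs from `parse_emulation("elf64lriscv")` only in the `osabi` field, and an unrecognized name raises an error instead of silently falling back to a default.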

Event Timeline

MaskRay created this revision. Jan 30 2021, 7:29 PM
MaskRay requested review of this revision. Jan 30 2021, 7:29 PM
Herald added a project: Restricted Project. Jan 30 2021, 7:29 PM
luismarques added inline comments. Jan 31 2021, 2:56 PM
lld/ELF/Driver.cpp
160

Remove this case?

MaskRay updated this revision to Diff 320379. Jan 31 2021, 3:09 PM
MaskRay marked an inline comment as done.

Make default value explicit
Delete stray case
Add a negative test

luismarques accepted this revision. Jan 31 2021, 3:10 PM

Overall seems fine to me.
I don't know whether it's worth testing the various suffix combinations more thoroughly?

lld/test/ELF/emulation-riscv.s
23–25

Unless the -NEXT suffix interacts across prefixes (?!), aren't the -NEXT suffixes here unneeded and misleading?

64–66

Ditto.

This revision is now accepted and ready to land. Jan 31 2021, 3:10 PM
MaskRay marked 2 inline comments as done. Edited Jan 31 2021, 6:48 PM
MaskRay added a subscriber: jimw.

Overall seems fine to me.

Thanks!

I don't know whether it's worth testing the various suffix combinations more thoroughly?

That is excessive and does not seem too useful to me. I don't intend to add them in my binutils emulation & target triple patch https://sourceware.org/pipermail/binutils/2021-January/115156.html

Actually, I am not sure adding _ilp32f/_ilp32/_lp64f/_lp64 suffixes is a good idea if the only difference is the different library paths (which can be suppressed by ld -nostdlib). https://sourceware.org/bugzilla/show_bug.cgi?id=22962 @jimw

LLD does not have the concept of default library paths. It works for all targets we support because the compiler drivers pass through the library paths.
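
As a rough illustration of "the compiler drivers pass through the library paths", the sketch below shows a driver-like helper that turns an ABI name into explicit -L arguments while always using the plain elfXXlriscv emulation. Everything here is an assumption for illustration: the helper names, the sysroot, and the `lib<xlen>/<abi>` directory layout are hypothetical and are not taken from GCC, Clang, or any particular distribution.

```python
from typing import List


def linker_search_args(sysroot: str, xlen: int, abi: str) -> List[str]:
    """Build explicit -L flags for one ABI (hypothetical directory layout)."""
    libdir = f"lib{xlen}/{abi}"
    return [f"-L{sysroot}/{libdir}", f"-L{sysroot}/usr/{libdir}"]


def link_command(linker: str, sysroot: str, xlen: int, abi: str,
                 inputs: List[str]) -> List[str]:
    """Assemble a linker command line without any ABI-specific emulation."""
    # The emulation only encodes the ELF class; the ABI shows up in -L.
    emulation = f"elf{xlen}lriscv"
    return [linker, "-m", emulation,
            *linker_search_args(sysroot, xlen, abi), *inputs]
```

Under these assumptions, `link_command("ld.lld", "/sysroot", 64, "lp64d", ["a.o"])` yields a command that names `elf64lriscv` and carries the lp64d directories as -L flags, so the linker itself needs no built-in default search path.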

lld/test/ELF/emulation-riscv.s
23–25

The -NEXT suffix does interact across prefixes.

64–66

The -NEXT suffix does interact across prefixes.

Yeah I really don't like the ABI emulations. elfXXlriscv is sufficient, the others should go (or at least never see the light of day in LLD if BFD needs to keep them for backwards compatibility).

And really people are supposed to invoke the linker through the GCC or Clang driver, that's the only supported interface. Anything else and they can deal with having to specify the search path manually, but 99% of the time the code is bad and can just use the driver, perhaps with the odd -Wl,-foo. Then you get the right search path by virtue of -mabi, no need for hacky emulations to support dubious glibc RISC-V-specific multiarch silliness.

jimw added a comment. Jan 31 2021, 8:29 PM

The binutils folks used to make a joke "The correct way to spell GNU ld is GCC". Use of ld directly is discouraged. However, we do have default paths in the linker, and those paths must be correct. There are some situations where calling ld directly is not hard. And there are people with 20 year old code that uses ld directly that still needs to work if it isn't wrong. So I believe we need the emulations.

The emulations are only visible between gcc and ld. If you don't plan to support using gcc with lld then you don't need the emulations. Or we could have a configure option or command line option to tell gcc that it will be used with lld instead of ld, and then change the emulation names depending on which linker is being used. I see that we already have the -fuse-ld= option in gcc that we could use. There are also --enable-ld=X and --enable-gold configure options, but they don't seem to do much, other than setting a configure variable to give the name of the ld to use. It seems to be primarily useful if doing a combined tree build (binutils+gcc) in which case gcc needs to know which of the two linkers (ld.bfd or gold) to use. It is also used to simplify some feature tests, for features which are known to be supported in gold and/or bfd. Anyways, it doesn't seem to be used to control how gcc emits linker options. But we could extend it for that use.

The binutils folks used to make a joke "The correct way to spell GNU ld is GCC". Use of ld directly is discouraged. However, we do have default paths in the linker, and those paths must be correct. There are some situations where calling ld directly is not hard. And there are people with 20 year old code that uses ld directly that still needs to work if it isn't wrong. So I believe we need the emulations.

That doesn't quite follow. The code needs patching *anyway* to support RISC-V if it needs the emulation to be specified, so *how* you require it to be patched is entirely up to you. You could instead just say "don't do that" and make them write sane code rather than hacky BFD-specific code, whereas instead if you invoke LD directly and need to support multiarch then you need to have some hacky config stuff to detect what ABI is in use (which, incidentally, always ends up being parsing the output of cc -v's --with-abi which is extremely GCC-specific and makes zero sense for Clang due to being much more flexible). Or you could pass -L /foo/lp64d rather than -m elf64lriscv_lp64d. So I disagree with that statement.

The emulations are only visible between gcc and ld. If you don't plan to support using gcc with lld then you don't need the emulations.

We do, but that makes no sense. GCC shouldn't need the emulations because it can just give the right search paths like Clang does. That's far cleaner, and more flexible, than having a whole family of emulations just to be able to configure ld's default search path.

jimw added a comment. Jan 31 2021, 9:07 PM

When you configure GNU ld, you specify a default emulation, which in turn specifies the default built-in paths. So if an OS is ilp32f by default, then just calling ld will work. But we still need a way to handle cross compilers to other ABIs, and we do that via emulations.

On an x86 system, we have 3 emulations, because we have 3 ABIs. One for 32-bit code, one for 64-bit code, and one for the x32 ABI. On a MIPS system we have 3 emulations because we have 3 ABIs. One for 32-bit code, one for 64-bit code, and one for the N32 ABI. On RISC-V, we have 6 ABIs, so we have 6 emulations.

An OS distro has many thousands of packages, and not all of them use ld the way that you might expect. If you don't want to fix an entire OS distro to use ld correctly, then I would argue that you need the emulations, because this is how binutils has traditionally solved this problem.

MaskRay marked 2 inline comments as done. Feb 1 2021, 11:50 AM

I skimmed binutils-gdb/ld/emulparams, and I am not sure _ilp32/_ilp32f/_lp64/_lp64f are how binutils traditionally solves minor ABI variant problems.
For architectures which provide very different flavors, e.g. x86-64's ILP32 ABI (elf32_x86_64), the emulation is there because ILP32 is entirely different.
If we read that file, we will notice that it changes many defaults, not just the default library search paths.

Let's take a look at GCC: gcc/config/arm/linux-elf.h does not change emulation due to floating point ABI differences.
gcc/config/mips/linux.h has 3 variants but they are for fundamentally different ABIs.
RISC-V is currently unique in this matter.

LLD by design does not want default library search paths. It actually works surprisingly well. We don't need special code for ARM/AArch64/mips/PowerPC/etc.
Different platforms have different customization needs. Some may want /lib, some may want /lib64, some may want another location.
As jrtc27 said, users should just do that via the compiler driver, and if they really need to invoke ld directly (I'd say in 99.9% of cases that is the inferior choice), they can specify an explicit -L instead of passing through a magic -m.

I certainly want LLD to be usable by GCC, but I also want it to be maintainable and readable. I'd be conservative about taking these unnecessary quirks. https://maskray.me/blog/2020-12-19-lld-and-gnu-linker-incompatibilities

"One reason that I am subscribed to the binutils mailing list is I want to participate in its design processes (I am proud to say that I have managed to find some early issues of various new things)."

I'd hope future GCC versions can be changed to not rely on -m emulations. They can pass -L regardless of -fuse-ld={bfd,gold,lld}.

MaskRay added a comment. Edited Feb 1 2021, 11:55 AM

My personal criterion for what should be an emulation: its BFD name (OUTPUT_FORMAT) should differ in e_machine/EI_CLASS/EI_DATA/EI_OSABI (the file format shown in objdump -h output and the values accepted by objcopy's -I/-O).
x86-64 and AArch64's ILP32 ABIs meet this criterion.

elf*riscv_{ilp32,ilp32f,lp64,lp64f} currently fail this condition.

FreeBSD needs separate emulations because they have different EI_OSABI values.
Please also see the description of D95749 for my comment on OSABI.
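
The criterion above can be made concrete with a small sketch: two names deserve separate emulations only if they differ in at least one of e_machine, EI_CLASS, EI_DATA, or EI_OSABI. The table and the function name `justifies_separate_emulation` are hypothetical. The RISC-V rows encode the claims made in this thread (the ABI-suffixed name matches the base name in all four fields, while _fbsd changes EI_OSABI) rather than values read out of binutils. The numeric constants are the standard ELF values.

```python
# Standard ELF constants.
EM_386, EM_X86_64, EM_RISCV = 3, 62, 243
ELFCLASS32, ELFCLASS64 = 1, 2
ELFDATA2LSB = 1
ELFOSABI_NONE, ELFOSABI_FREEBSD = 0, 9

# (e_machine, EI_CLASS, EI_DATA, EI_OSABI) per emulation name.
# Assumption: rows reflect the discussion, not a dump of linker internals.
EMULATIONS = {
    "elf_i386": (EM_386, ELFCLASS32, ELFDATA2LSB, ELFOSABI_NONE),
    "elf_x86_64": (EM_X86_64, ELFCLASS64, ELFDATA2LSB, ELFOSABI_NONE),
    "elf32_x86_64": (EM_X86_64, ELFCLASS32, ELFDATA2LSB, ELFOSABI_NONE),
    "elf64lriscv": (EM_RISCV, ELFCLASS64, ELFDATA2LSB, ELFOSABI_NONE),
    "elf64lriscv_lp64f": (EM_RISCV, ELFCLASS64, ELFDATA2LSB, ELFOSABI_NONE),
    "elf64lriscv_fbsd": (EM_RISCV, ELFCLASS64, ELFDATA2LSB, ELFOSABI_FREEBSD),
}


def justifies_separate_emulation(a: str, b: str) -> bool:
    """True if the two names differ in at least one header-level field."""
    return EMULATIONS[a] != EMULATIONS[b]
```

With this table, the x32 name is distinct from elf_x86_64 because EI_CLASS differs, the _fbsd name is distinct because EI_OSABI differs, and elf64lriscv_lp64f is indistinguishable from elf64lriscv.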

jimw added a comment. Feb 4 2021, 5:46 PM

I added the emulations here.

https://sourceware.org/pipermail/binutils/2018-May/102785.html

The bug that this fixes is here:

https://sourceware.org/bugzilla/show_bug.cgi?id=22962

The bug report discussion concluded that it had to be fixed in the linker, and fixing it in the linker required adding emulations.