This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Make binding (weak or non-weak) logic consistent for Undefined and SharedSymbol
ClosedPublic

Authored by MaskRay on Aug 1 2019, 8:51 AM.

Details

Summary

This is a case missed by D64136. If %t1.o has a weak reference on foo,
and %t2.so has a non-weak reference on foo:

0. ld.lld %t1.o %t2.so          # ok; STB_WEAK; accepted since D64136
1. ld.lld %t2.so %t1.o          # undefined symbol: foo; STB_GLOBAL
2. gold %t1.o %t2.so            # ok; STB_WEAK
3. gold %t2.so %t1.o            # undefined reference to 'foo'; STB_GLOBAL
4. ld.bfd %t1.o %t2.so          # undefined reference to `foo'; STB_WEAK
5. ld.bfd %t2.so %t1.o          # undefined reference to `foo'; STB_WEAK

It can be argued that in both cases, the binding of the undefined foo
should be set to STB_WEAK, because the binding should not be affected by
referenced from shared objects.

--allow-shlib-undefined doesn't suppress errors (3,4,5), but -shared or
--noinhibit-exec allows ld.bfd/gold to produce a binary:

3. gold -shared %t2.so %t1.o    # ok; STB_GLOBAL
4. ld.bfd -shared %t2.so %t1.o  # ok; STB_WEAK
5. ld.bfd -shared %t1.o %t1.o   # ok; STB_WEAK

If %t2.so has DT_NEEDED entries, ld.bfd will load them (lld/gold don't
have the behavior). If one of the DSO defines foo and it is in the
link-time search path (e.g. DT_NEEDED entry is an absolute path, via
-rpath=, via -rpath-link=, etc),
ld.bfd %t1.o %t2.so and ld.bfd %t1.o %t2.so will not error.

In this patch, we make Undefined and SharedSymbol share the same binding
computing logic. Case 1 will be allowed:

0. ld.lld %t1.o %t2.so          # ok; STB_WEAK; accepted since D64136
1. ld.lld %t2.so %t1.o          # ok; STB_WEAK; changed by this patch

In the future, we can explore the option that turns both (0,1) into
errors if --no-allow-shlib-undefined (default when linking an
executable) is in action.

Event Timeline

MaskRay created this revision.Aug 1 2019, 8:51 AM
MaskRay updated this revision to Diff 212829.Aug 1 2019, 8:55 AM
MaskRay edited the summary of this revision. (Show Details)

Mention --allow-shlib-undefined

MaskRay updated this revision to Diff 212831.Aug 1 2019, 8:57 AM
MaskRay retitled this revision from [ELF] resolveUndefined: make STB_WEAK logic consistent for Undefined and SharedFile to [ELF] resolveUndefined: make STB_WEAK logic consistent for Undefined and SharedSymbol.

Retitle

MaskRay edited the summary of this revision. (Show Details)Aug 1 2019, 9:01 AM

IIUC the desired state you are proposing is:

ld.lld %t1.o %t2.so          # ok; STB_WEAK
ld.lld %t2.so %t1.o          # ok; STB_WEAK

I agree that ELF is underspecified here and I agree that the shared library shouldn't change the binding to non-weak in the executable (in the case where we allow undefined symbols in shared libraries). Personally I think ld.bfd's behaviour is preferable given the intention behind giving error messages from undefined symbols; i.e. catching likely errors at the earliest possible stage. If we permit it then the unfortunate user will get an undefined symbol error at run time from the dynamic loader instead of the static linker.

MaskRay updated this revision to Diff 212843.Aug 1 2019, 9:51 AM
MaskRay edited the summary of this revision. (Show Details)

Forget to initialize referenced

MaskRay updated this revision to Diff 212951.EditedAug 1 2019, 7:04 PM
MaskRay edited the summary of this revision. (Show Details)

I agree that ELF is underspecified here and I agree that the shared library shouldn't change the binding to non-weak in the executable (in the case where we allow undefined symbols in shared libraries). Personally I think ld.bfd's behaviour is preferable given the intention behind giving error messages from undefined symbols; i.e. catching likely errors at the earliest possible stage. If we permit it then the unfortunate user will get an undefined symbol error at run time from the dynamic loader instead of the static linker.

I agree erroring would be preferable. However, ld.bfd recursively loads DT_NEEDED entries to resolve undefined referenced. This behavior is not implemented in gold and lld. We may have to be permissive, as what we did in the implementation of --no-allow-shlib-undefined.

Added an example to the description:

If one of the DSO defines foo and it is in the link-time search path (e.g. DT_NEEDED entry is an absolute path, via -rpath=, via -rpath-link=, etc), ld.bfd %t1.o %t2.so and ld.bfd %t1.o %t2.so will not error.

MaskRay updated this revision to Diff 213014.Aug 2 2019, 4:28 AM
MaskRay retitled this revision from [ELF] resolveUndefined: make STB_WEAK logic consistent for Undefined and SharedSymbol to [ELF] Make binding (weak or non-weak) logic consistent for Undefined and SharedSymbol.
MaskRay edited the summary of this revision. (Show Details)

Retitle. Mention that:

In the future, we can explore the option that turns both (0,1) into errors.
--no-allow-shlib-undefined is in action.

ruiu accepted this revision.Aug 6 2019, 5:54 AM

Honestly I don't have a strong preference on the behavior of weak symbols in this case. I have a slight concern that if there is a program that depends on the current behavior, they may break by this change, but this case is too subtle and linker-dependent, so I'm perhaps worried too much.

LGTM

This revision is now accepted and ready to land.Aug 6 2019, 5:54 AM

Honestly I don't have a strong preference on the behavior of weak symbols in this case. I have a slight concern that if there is a program that depends on the current behavior, they may break by this change, but this case is too subtle and linker-dependent, so I'm perhaps worried too much.

LGTM

Don't worry! After the binding fix, the check is actually relaxed (weak undef doesn't error in lld while it may in a few scenarios in ld.bfd).

This revision was automatically updated to reflect the committed changes.