This is an archive of the discontinued LLVM Phabricator instance.

[ELF] Hint -z nostart-stop-gc for __start_ undefined references
ClosedPublic

Authored by MaskRay on Nov 30 2021, 4:42 PM.

Details

Summary

Make users aware what to do with ld.lld 13.0.0 / GNU ld<2015-10 --gc-sections
behavior.

Diff Detail

Event Timeline

MaskRay created this revision.Nov 30 2021, 4:42 PM
MaskRay requested review of this revision.Nov 30 2021, 4:42 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 30 2021, 4:42 PM
MaskRay updated this revision to Diff 390859.Nov 30 2021, 4:46 PM

change path

MaskRay updated this revision to Diff 390866.Nov 30 2021, 5:02 PM

Suggest KEEP

MaskRay updated this revision to Diff 391415.Dec 2 2021, 11:53 AM

start_stop_gc => start-stop-gc

I think we can flesh out the content later.
Having the message in the 13.0.1 code may help.

I am happy to discuss the wording for start-stop-gc.rst. Having the code in 13.0.1 is more important, though, so I am pushing it.

This revision was not accepted when it landed; it landed in state Needs Review.Dec 2 2021, 11:58 AM
This revision was landed with ongoing or failed builds.
This revision was automatically updated to reflect the committed changes.
smeenai added a subscriber: smeenai.Dec 2 2021, 7:59 PM

This won't diagnose the case where the actual contents of the C-named section get dropped, which can only be caught at runtime. For example, our application of encapsulation symbols for library merging was relying on C-named sections being retained; the behavior change in LLD 13 didn't result in any link-time errors, but it caused run-time crashes because all the JNI_OnLoads were dropped. (It was straightforward for us to add __attribute__((retain)) because of how our builds are set up, so it wasn't a problem for us.) This is still a good change; just pointing out that it doesn't capture all possible cases which might change program behavior.

Generally once a patch is sent for review it should remain in review until approved, rather than committed without approval - so this patch should probably be reverted until approved?

Generally once a patch is sent for review it should remain in review until approved, rather than committed without approval - so this patch should probably be reverted until approved?

Hmm, is that standard convention? I've observed a bunch of cases where contributors put up a review as an FYI or to get some additional thoughts, but then commit if there's no feedback or objections, for patches where they would have otherwise been comfortable committing directly for post-commit review.

Generally once a patch is sent for review it should remain in review until approved, rather than committed without approval - so this patch should probably be reverted until approved?

Hmm, is that standard convention?

Generally - not sure if it's well documented, but I can probably find a few instances of this being sited over the years by folks other than myself.

I've observed a bunch of cases where contributors put up a review as an FYI or to get some additional thoughts, but then commit if there's no feedback or objections, for patches where they would have otherwise been comfortable committing directly for post-commit review.

Yeah, there's some wriggle room, especially for code owners (like @MaskRay in this case) who might be seeking feedback but not feel like they're blocked on receiving that feedback. In that case I think the best/necessary thing to do is to document that ahead of time in the initial review - that the review is seeking feedback, but does not require it to proceed. That way it doesn't end up looking like the initial review required feedback, but the patch was then committed without feedback because the author wasn't willing to wait for folks to give that feedback. (that's the main reason for this rule: To ensure reviewers don't feel rushed/authors don't use short timelines as a way to push things through without the necessary review)

I'm happy to approve in post; my apologies for not getting to it in time. Regardless of the opinion on the default value of start-stop-gc this particular change adds a useful warning and documentation.

As an aside Arm's proprietary linker introduced a "weakly live" value for sections referenced by the equivalent of __start and __stop. The linker would follow relocations from a "weakly live" section to see if it could reach an existing "live" section from the same object file, where "live" is the result of classic garbage collection assuming --start-stop-gc . This heuristic kind of worked for what we needed it do, which was to permit all exception tables (reachable via start and stop) to be thrown away if no object called throw. We didn't have any real rigorous theory to back up the heuristic though other than it was conservative enough to keep what we need to keep and gave good enough GC results in most cases. I'd not recommend LLD follow that though as it was a heuristic tuned to a system library.

I'm happy to approve in post; my apologies for not getting to it in time. Regardless of the opinion on the default value of start-stop-gc this particular change adds a useful warning and documentation.

As an aside Arm's proprietary linker introduced a "weakly live" value for sections referenced by the equivalent of __start and __stop. The linker would follow relocations from a "weakly live" section to see if it could reach an existing "live" section from the same object file, where "live" is the result of classic garbage collection assuming --start-stop-gc . This heuristic kind of worked for what we needed it do, which was to permit all exception tables (reachable via start and stop) to be thrown away if no object called throw. We didn't have any real rigorous theory to back up the heuristic though other than it was conservative enough to keep what we need to keep and gave good enough GC results in most cases. I'd not recommend LLD follow that though as it was a heuristic tuned to a system library.

Ah, thanks for weighing in. Sounds good to me!

This won't diagnose the case where the actual contents of the C-named section get dropped, which can only be caught at runtime.
For example, our application of encapsulation symbols for library merging was relying on C-named sections being retained; the behavior change in LLD 13 didn't result in any link-time errors, but it caused run-time crashes because all the JNI_OnLoads were dropped. (It was straightforward for us to add __attribute__((retain)) because of how our builds are set up, so it wasn't a problem for us.) This is still a good change; just pointing out that it doesn't capture all possible cases which might change program behavior.

This won't diagnose the case when the __start_ reference is weak.
Among the few Debian Code Search examples, I cannot find undefined weak examples.
I know that Clang InstrProfiling uses undefined weak. Such examples are just rare.
For such cases, we have to rely on the developer to understand the problem. (The section just doesn't exist, so presumably not too difficult to troubleshoot.)

Generally once a patch is sent for review it should remain in review until approved, rather than committed without approval - so this patch should probably be reverted until approved?

Hmm, is that standard convention?

Generally - not sure if it's well documented, but I can probably find a few instances of this being sited over the years by folks other than myself.

I've observed a bunch of cases where contributors put up a review as an FYI or to get some additional thoughts, but then commit if there's no feedback or objections, for patches where they would have otherwise been comfortable committing directly for post-commit review.

Yeah, there's some wriggle room, especially for code owners (like @MaskRay in this case) who might be seeking feedback but not feel like they're blocked on receiving that feedback. In that case I think the best/necessary thing to do is to document that ahead of time in the initial review - that the review is seeking feedback, but does not require it to proceed. That way it doesn't end up looking like the initial review required feedback, but the patch was then committed without feedback because the author wasn't willing to wait for folks to give that feedback. (that's the main reason for this rule: To ensure reviewers don't feel rushed/authors don't use short timelines as a way to push things through without the necessary review)

Ah, ok. Rushed on this because the combination of (a) the the code change would be useful for 13.0.1 and the code was the final form and the documentation page can evolve separately as I mentioned..
(b) this was (clearly as I think) what the release manager and some folks wanted.
(I was on the plane for many hours today. Pushed this so that there was a commit id with which the release manager knew what to cherry pick)

I'm happy to approve in post; my apologies for not getting to it in time. Regardless of the opinion on the default value of start-stop-gc this particular change adds a useful warning and documentation.

Thanks!