This is an archive of the discontinued LLVM Phabricator instance.

[ELF] - LTO: do not optimize away symbols accessed from linkerscript.
ClosedPublic

Authored by grimar on Aug 23 2017, 4:58 AM.

Details

Summary

Previously when BC file had global variable that was accessed from script,
it was optimized away.
In this patch I add left side of assignment expression to list of symbols referenced
by script. That marks them used from regular objects for LTO and fixes issue observed.

Diff Detail

Event Timeline

grimar created this revision.Aug 23 2017, 4:58 AM
davide added a subscriber: davide.Aug 23 2017, 5:43 AM

Out of curiosity, where did you hit this?
Is there any way to test that the assignment updates the value of foo to be 1 ?

grimar added a comment.EditedAug 23 2017, 6:04 AM

Out of curiosity, where did you hit this?

It is very close to D37009 btw.

Looks phab ate my first reply. Duplicating again for history:

Out of curiosity, where did you hit this?

I did not hit, just basing on my knowledge about how things work, I supposed this case will not work and checked.

Is there any way to test that the assignment updates the value of foo to be 1 ?

I believe yes, I'll update the testcase.

What happens if foo is specified in the linker script but actually doesn't appear in any bitcode file?
IIRC we should error in that case.

This comment was removed by grimar.

What happens if foo is specified in the linker script but actually doesn't appear in any bitcode file?
IIRC we should error in that case.

I think we will create absolute symbol foo for this case.
If it would be foo = bar then we would error out.

I was wrong, if all inputs are BC and we have foo = 1 script then I do not see it in output and link succeeds.
I'll take a look closer on this case.

davide requested changes to this revision.Aug 23 2017, 6:33 AM

Yes, I was afraid this wasn't enough.

This revision now requires changes to proceed.Aug 23 2017, 6:33 AM

Yes, I was afraid this wasn't enough.

As far I see without this patch situation is the same. We do not create absolute symbols if have
foo = 1 script and only bitcode inputs that does not have foo symbol.
I think such case should be fixed in a different patch then.

grimar updated this revision to Diff 112357.Aug 23 2017, 6:50 AM
grimar edited edge metadata.
  • Improved testcase.
grimar updated this revision to Diff 112365.Aug 23 2017, 7:19 AM

My appologies. Everything is fine with emiting absolute symbols.
Looks I looked into temporarily LTO object instead of final binary.

I updated testcase to show they work fine. Sorry for all noise.

grimar planned changes to this revision.Aug 23 2017, 8:14 AM
grimar requested review of this revision.Aug 23 2017, 8:52 AM
ruiu added inline comments.Aug 23 2017, 9:07 AM
ELF/ScriptParser.cpp
632

Why false?

705–711

This function doesn't seem like the best place to do this. Is readAssignment better, isn't it?

grimar added inline comments.Aug 23 2017, 9:40 AM
ELF/ScriptParser.cpp
632

I need to distinguish assignments inside output section declaration and outside (global).
Because for example we have in one of testcases:

ONLY_IF_RO { abc = 1; *(foo) }

And in that case we should not create abc at all.
I decided to handle only global assignments for now, probably it can be enough.
That is why false.

705–711

Unfortunaly not, because PROVIDE/HIDDEN/PROVIDE_HIDDEN are using it too.

We should not provide symbol that was not referenced.
When we add it to Opt.ReferencedSymbols it creates undefined symbol and that works like it was referenced.
(https://github.com/llvm-mirror/lld/blob/master/ELF/LinkerScript.cpp#L148)
What breaks PROVIDE logic.

davide added a subscriber: pcc.EditedAug 25 2017, 10:39 AM

FWIW, this is just an incarnation of a more general problem, i.e. the fact that the interaction between LTO and linker scripts is/wasn't well studied in the past.
I guess we may want to fix this case, but I'm afraid several other problems will arise.
I wonder when/where we should draw the line between supporting arbitrary interactions between the two or just suggesting people to pass -fno-lto to an handful of files in their compile step, as GCC seems to be doing.
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=65252
(also, cc @pcc for thoughts)

pcc added a comment.Aug 25 2017, 11:04 AM

I feel like a reasonable position to start with would be "you can't use a linker script to talk about sections in an LTO input file, only symbols". i.e. a SECTIONS clause would not be able to refer to sections in LTO input files, but you would be able to redefine symbols (that support already exists for --wrap and --defsym of course) and define symbols in terms of other symbols defined in LTO input files.

That basically seems to be the position that GCC has taken, and doesn't seem too onerous to support in LLD.

grimar updated this revision to Diff 112882.Aug 28 2017, 4:30 AM
  • Reimplemented, taking considerations from D37208 into account.
grimar updated this revision to Diff 113070.Aug 29 2017, 5:19 AM
  • Removed excessive variable.
pcc added inline comments.Sep 12 2017, 11:07 AM
ELF/LTO.cpp
170 ↗(On Diff #113070)

I think we should be setting LinkerRedefined to true for these symbols instead of VisibleToRegularObj, otherwise the optimizer may IPO past them.

grimar added inline comments.Sep 13 2017, 6:24 AM
ELF/LTO.cpp
170 ↗(On Diff #113070)

I am not really familar with LTO logic, but if that is true, that sounds like a bug of LTO for me.
May be I am missing something, but if linker says symbols should be visible (and that means used) from regular object,
I would expect LTO should not eliminate them out.

But actually I think it does not do that:
I looked how both VisibleToRegularObj and LinkerRedefined are used. They set symbol resolution to
GlobalResolution::External:
https://github.com/llvm-mirror/llvm/blob/master/lib/LTO/LTO.cpp#L426

And that prevents making them internal here:
https://github.com/llvm-mirror/llvm/blob/master/lib/LTO/LTO.cpp#L812

After all when parsing LTO objects and symbols:
https://github.com/llvm-mirror/lld/blob/master/ELF/SymbolTable.cpp#L125
there is no bar symbol at all (because VisibleToRegularObj was not set for it,
but foo is still there). That probably shows that logic is correct.
What am I missing ?

grimar added inline comments.Sep 13 2017, 6:26 AM
ELF/LTO.cpp
170 ↗(On Diff #113070)

When I mentioned bar and foo, I meant those from my testcase. foo is assigned in script, bar is not.

pcc added inline comments.Sep 13 2017, 11:36 AM
ELF/LTO.cpp
170 ↗(On Diff #113070)

By "IPO past them" I meant that the optimizer may apply inlining or otherwise use the symbol's definition to optimize code that references the symbol, which is what setting LinkerRedefined prevents. That is orthogonal to whether the symbol is dropped.

For example suppose we have these two TUs:

int f() {
  return g();
}
int g() {
   return 1;
}
int h() {
   return 2;
}

and this linker script:

g = h;

If we mark g as VisibleToRegularObj, that would not prevent g from being inlined into f at LTO time, causing f to incorrectly return 1. Setting LinkerRedefined prevents g from being inlined, so f will return 2 as a result of the linker script replacing g with h.

grimar planned changes to this revision.Sep 14 2017, 12:01 AM
grimar added inline comments.
ELF/LTO.cpp
170 ↗(On Diff #113070)

Ah, got it, thanks for explanation ! I'll try to prepare testcase and update the diff with this change.

grimar updated this revision to Diff 115218.Sep 14 2017, 7:33 AM
  • Addressed review comment, added testcase for IPO.
pcc accepted this revision.Sep 14 2017, 9:25 AM

LGTM

ELF/LTO.cpp
128 ↗(On Diff #115218)

I would inline this function into the caller.

ruiu added inline comments.Sep 14 2017, 11:24 AM
ELF/LTO.cpp
168 ↗(On Diff #115218)

"following symbols" doesn't explain what this code is for. It's basically saying that this code does something as is written. Please update the comment.

grimar updated this revision to Diff 115377.Sep 15 2017, 2:09 AM
  • Addressed comments.
ELF/LTO.cpp
128 ↗(On Diff #115218)

Done. Thanks for review !

168 ↗(On Diff #115218)

Done.

ruiu added inline comments.Sep 15 2017, 2:17 PM
ELF/LTO.cpp
134 ↗(On Diff #115377)

Remove llvm::.

168 ↗(On Diff #115218)

This commend needs to explain "why" instead of "what", as what this code does is obvious by, well, code.

grimar updated this revision to Diff 115612.Sep 18 2017, 2:27 AM
  • Addressed review comments.
ruiu added inline comments.Sep 18 2017, 12:27 PM
ELF/LTO.cpp
165 ↗(On Diff #115612)

IPO -> interprocedural optimization

167 ↗(On Diff #115612)

Linkerscript is not a word. It's a linker script.

  1. Symbols redefined in linker scripts.
grimar updated this revision to Diff 115815.Sep 19 2017, 2:59 AM
  • Addressed review comments.
ruiu added inline comments.Sep 19 2017, 1:08 PM
ELF/LTO.cpp
165–170 ↗(On Diff #115815)

"We have to do this because ..." part explains why we can't do IPO for both (1) and (2) symbols. So, writing it as part of (2) is not correct.

grimar updated this revision to Diff 115987.Sep 20 2017, 5:30 AM
  • Updated comment.
ruiu accepted this revision.Sep 24 2017, 2:50 PM

LGTM

ELF/LTO.cpp
166 ↗(On Diff #115987)

can -> would

grimar accepted this revision.Sep 25 2017, 2:35 AM

Committed as r314097

This revision was automatically updated to reflect the committed changes.