This is an archive of the discontinued LLVM Phabricator instance.

[ELF] - Dedupliсate FDEs correctly when two sections are ICFed
AbandonedPublic

Authored by grimar on Sep 14 2017, 5:09 AM.

Details

Reviewers
ruiu
rafael
Summary

This is PR34518,

Previously when LLD ICFed 2 sections it leaved 2 duplicate entries in .eh_frame
pointing to the same address. After that it "deduplicated" them from .eh_frame_header,
adjusting the amount of valid FDEs in its header.
As a result excessive entries in .eh_frame and excessive dummy data in .eh_frame_header
was emited to output. Patch fixes that.

Diff Detail

Event Timeline

grimar created this revision.Sep 14 2017, 5:09 AM
grimar retitled this revision from [ELF] - Deduplivate FDEs correctly when two sections are ICFed to [ELF] - Dedupliсate FDEs correctly when two sections are ICFed.Sep 14 2017, 5:13 AM
grimar edited the summary of this revision. (Show Details)
ruiu added inline comments.Sep 15 2017, 2:53 PM
ELF/SyntheticSections.cpp
459–460

You are making this attribute-accessing function non-idempotent, which is generally bad. Again, instead of "fixing" a problem, you want to think harder to find out what this function should do. Please implement in a different way.

grimar updated this revision to Diff 115651.Sep 18 2017, 7:52 AM
  • Reimplemented according to review comments.
ruiu edited edge metadata.Sep 18 2017, 12:11 PM

I don't think this is in the right direction. This seems too slow and too complicated. Why don't you just skip FDEs whose section IS does not satisfy IS == IS->Repl? Usually IS == IS->Repl, but if IS has been merged by ICF, IS != IS->Repl, so you can easily identify if a given section has been merged by ICF.

That said, first of all, do you think you are doing is right? It seems that the current ICF code merges two different functions that are otherwise identical except FDEs. We should take FDEs into account when comparing functions in ICF, shouldn't we? (I'm not saying that you should update this patch, but I want you to think.)

ELF/SyntheticSections.cpp
436

This temporary variable is redundant.

grimar added a comment.EditedSep 19 2017, 4:46 AM
In D37848#874114, @ruiu wrote:

I don't think this is in the right direction. This seems too slow and too complicated. Why don't you just skip FDEs whose section IS does not satisfy IS == IS->Repl? Usually IS == IS->Repl, but if IS has been merged by ICF, IS != IS->Repl, so you can easily identify if a given section has been merged by ICF.

Honestly that was the first I thought of when started to work on the fix. I faced with breakage of eh-frame-hdr-icf.s that time.
Given its code:

.globl _start, f1, f2
_start:
  ret

.section .text.f1, "ax"
f1:
  .cfi_startproc
  ret
  .cfi_endproc

.section .text.f2, "ax"
f2:
  .cfi_startproc
  ret
  .cfi_endproc

ICF merged both .text.f1 and .text.f2 to .text. That way simple checking of IS == IS->Repl would just omit both two FDEs we have here.
I supposed that it is not correct and that we can leave single FDE and use its data then. Since FDE describes
"how to unwind to the caller if the current instruction pointer is in the range covered by the FDE." I supposed it is fine to use it.
I did not think that it can be ICF issue here. ICF stands for "identical code folding", I did not think about we should take FDE data into account
when doing ICF.
Just like we do not support --icf=safe currently (and so far ICF is not safe anyways), I supposed we can assume that FDE data for identical code can be reused.

FWIW I just checked gold's behavior and it seems it is to use FDE of section that is alive after ICF. For case above it just drops both FDEs, but if we have 2
sections with different FDEs it uses the first FDE in order and ICF eliminates the second section. That also makes me think FDEs can be reused.

That said, first of all, do you think you are doing is right? It seems that the current ICF code merges two different functions that are otherwise identical except FDEs.
We should take FDEs into account when comparing functions in ICF, shouldn't we? (I'm not saying that you should update this patch, but I want you to think.)

Given above I am not sure that we should do that. Taking FDEs into account is probably safest possible way to handle things, but I am not sure it is worth that as it may
slow down ICF and make its results to be worse.
If that path be chosen that seems we also need to check that their CIEs are the same as it should be possible to have different personality functions for example.

ruiu added a comment.Sep 19 2017, 1:05 PM

You are saying that it is not correct to merge a .text section with other .text section if one has an FDE and the other doesn't. But you are also saying that we should merge two .text sections if both have FDEs even if their FDE contents are different. These two statements does not sound consistent to me. I don't think I'm convinced.

If you are saying that checking existence of FDEs is enough, can you argue why that is enough? If not, you could write code to demonstrate a problem (I guess you could write code that uses exceptions and behaves differently when two sections are merged by ICF) so that we can fix it. I don't think this patch fixes a real problem.

In D37848#875500, @ruiu wrote:

You are saying that it is not correct to merge a .text section with other .text section if one has an FDE and the other doesn't.

No I did not :) What I was trying to say is that it is seems to me strange that we can merge such sections and lose valid FDE in result.
And so I *supposed* we should merge them but keep and use valid FDE. That is actually what code currently do. If we have 3 identical code sections,
and one of them does not have FDE, then all 3 be merged, and both 2 FDEs be placed to .eh_frame and one of FDEs will be filtered out
from .eh_frame_hdr finally.

But you are also saying that we should merge two .text sections if both have FDEs even if their FDE contents are different. These two statements does not sound consistent to me. I don't think I'm convinced.

That is what we already do and it works. I can make synthetic testcase which will break it (can corrupt one of FDEs for example),
but synthetic case is probably not the good reason to change current behavior. Especially when we know that gold also just uses one of FDEs,
so behavior is probably acceptable.

If you are saying that checking existence of FDEs is enough, can you argue why that is enough? If not, you could write code to demonstrate a problem (I guess you could write code that uses exceptions and behaves differently when two sections are merged by ICF) so that we can fix it. I don't think this patch fixes a real problem.

Patch fixes the real problem which is about redundant FDE entries LLD currently emits. We emit redundant duplicate entries in .eh_frame and have to adjust .eh_frame_hdr which also contains garbage data at its end.
So we have output larger it can be currently.

As mentioned earlier in previous comments, simpler approach can be used: just skip FDEs whose section IS does not satisfy IS == IS->Repl,
but I can prepare the sample to demonstrate the possible problem that can happen then.
It is also unfortunately a bit synthetic.
I tried to use -fno-asynchronous-unwind-tables flag to omit .eh_frame from output (https://gcc.gnu.org/bugzilla/show_bug.cgi?id=43232),
but it does not omit it for me. So what can be used to demonstrate the difference is next code:

int test() {
  try {
    throw 1;
  }catch(...){}
 return 0;
}

If I compile 2 files with the same test() and test1() to asm and drop .eh_frame from one of them manually. Then ICF will combine test() and test1().
Then depending on order of objects in inputs there will be either zero FDEs or single one in ouput I believe.

So my question is then - since testcase is synthetic, should we care about it ?
If no then I think I agree we can just check IS == IS->Repl for now.

ruiu added a comment.Sep 21 2017, 2:15 PM

That is what we already do and it works. I can make synthetic testcase which will break it (can corrupt one of FDEs for example),
but synthetic case is probably not the good reason to change current behavior. Especially when we know that gold also just uses one of FDEs,
so behavior is probably acceptable.

It may not work, and that's what we are talking about. What gold or other linkers do don't really matter when we are talking about what is the correct behavior.

First of all, no optimization should change any program's defined behavior. Non-safe ICF does not really satisfy as it could break pointer equality, but it is described. It's akin to -ffast-math flag that allows compilers to do unsafe optimizations. But still, it shouldn't break any assumption that is not described. Naturally, it shouldn't change exception handler's behavior, for example.

I wonder if you really understood my point. What I was saying that, if two functions have different FDEs, we probably shouldn't merge these functions because merging them could change their behavior when exception is thrown. We are not talking about the current lld's or gold's behavior but what it should be. Are we on the same page?

In D37848#878146, @ruiu wrote:

I wonder if you really understood my point. What I was saying that, if two functions have different FDEs, we probably shouldn't merge these functions because merging them could change their behavior when exception is thrown. We are not talking about the current lld's or gold's behavior but what it should be. Are we on the same page?

Now - yes.
I'll be happy to try to reimplement this patch to change how ICF handles sections with/without FDEs to most correct way then.

grimar abandoned this revision.Sep 26 2017, 6:53 AM

Abandoning. Will try to reimplement in according to discussed way.