This is an archive of the discontinued LLVM Phabricator instance.

[lld-link] Add safe icf mode to lld-link, which does safe icf for all sections.
ClosedPublic

Authored by zequanwu on Feb 24 2021, 8:15 PM.

Details

Summary

lld-link's /opt:safeicf will do safe icf for all sections.

Diff Detail

Event Timeline

zequanwu requested review of this revision.Feb 24 2021, 8:15 PM
zequanwu created this revision.
Herald added a project: Restricted Project. · View Herald TranscriptFeb 24 2021, 8:15 PM

Code looks good. I don't have suggestion whether a new mode should be added.

rnk added a comment.Feb 25 2021, 1:47 PM

We had a user request to bring the "safe" and "all" ICF modes into alignment between the ports: https://crbug.com/1087923

I think LLD's /OPT:ICF behavior should match MSVC, which doesn't give you safe ICF, it should give you real ICF. So, I guess the thing to do is add a safeicf mode (/OPT:SAFEICF?), and then update these tests to use it.

LLD's current behavior is a bit emergent: we do safe ICF for non-executable data. That's pretty useful. Very few programs rely on function pointer address significance, but many programs do things like taking the address of a global and stuffing it in a hash table. The question is, should we have a mode that preserves this behavior? Fewer modes is better, but if we don't have a behavior preserving mode, Chrome either takes a binary size hit (safe ICF) or is potentially broken by global data ICF.

zequanwu updated this revision to Diff 326844.Feb 26 2021, 4:43 PM
  • Add /OPT:safeicf, which will mark all sections as unique if .addrsig doesn't exist. By default, /OPT:icf will do aggressive icf, matching the behavior of MSVC linker's /OPT:icf.
  • Update test cases.
pcc added a subscriber: pcc.Feb 26 2021, 5:05 PM

As far as I know the current behavior is that /opt:icf gives you full ICF for executable sections and safe ICF for non-executable sections. This is more than what MSVC does, which is full ICF for executable sections and no ICF for non-executable sections. I think it makes sense to add a mode that gives you full ICF for non-executable sections, but the behavior of /opt:icf should be preserved.

In D97436#2591761, @pcc wrote:

As far as I know the current behavior is that /opt:icf gives you full ICF for executable sections and safe ICF for non-executable sections. This is more than what MSVC does, which is full ICF for executable sections and no ICF for non-executable sections. I think it makes sense to add a mode that gives you full ICF for non-executable sections, but the behavior of /opt:icf should be preserved.

Initially, we found that the current behavior of /opt:icf merges executable sections even if they are in address-significant table, which obviously is a bug. This patch is trying to fix it and adding a /opt:safeicf to do safe icf for both executable and non-executable sections. The behavior of /opt:icf is preserved.

pcc added a comment.Feb 26 2021, 9:12 PM
In D97436#2591761, @pcc wrote:

As far as I know the current behavior is that /opt:icf gives you full ICF for executable sections and safe ICF for non-executable sections. This is more than what MSVC does, which is full ICF for executable sections and no ICF for non-executable sections. I think it makes sense to add a mode that gives you full ICF for non-executable sections, but the behavior of /opt:icf should be preserved.

Initially, we found that the current behavior of /opt:icf merges executable sections even if they are in address-significant table, which obviously is a bug.

No it isn't. This matches the behavior of MSVC, which doesn't have a concept of an address significance table.

This patch is trying to fix it and adding a /opt:safeicf to do safe icf for both executable and non-executable sections. The behavior of /opt:icf is preserved.

If you think that /opt:icf should do safe ICF for executable sections then I don't get what you think the difference between /opt:icf and /opt:safeicf should be.

zequanwu updated this revision to Diff 327228.Mar 1 2021, 11:56 AM

Preserve the bahavior of /opt:icf. Using /opt:safeicf to do safe icf for all sections.

zequanwu updated this revision to Diff 327232.Mar 1 2021, 12:03 PM

Update test case.

rnk added a comment.Mar 1 2021, 4:07 PM

Some nits.

I'm a bit worried that our end state for Chrome is going to be slightly confusing. In order to allow users to suppress ICF with attributes, this is what we will have to do:

  • Add a new mode to clang -faddrsig that stops putting address taken functions in .llvm_addrsig (-faddrsig-data(-only)? -fno-addrsig-code?)
  • Add a new attribute to clang that puts a function in .llvm_adddrsig (not sure if this should be called literally __attribute__((addrsig)) or something general like noicf)
  • Switch from /OPT:IFC to /OPT:SAFEICF in Chrome

Someone reading the build files might think that we are using safe ICF, but the compiler flag (-fno-addrsig-code or whatever) will mean we are effectively using regular, aggressive ICF. I guess we'll just have to live with that. It's pretty normal for build files to be confusing. =/

lld/COFF/Config.h
86

This isn't precisely MSVC's behavior. LLD will do ICF for code like MSVC, but it will do safe ICF for data using .llvm_addrsig, which MSVC doesn't look at. I'd make the comment more explicit, like: "Aggressive ICF for code, but safe ICF for data, similar to MSVC's behavior"

It seems to me that MSVC will not merge any readonly data sections:

extern "C" int printf(const char *, ...);
extern const int g1;
extern const int g2;
const int g1 = 42;
const int g2 = 42;
int main() {
  printf("g1: %d, g2: %d\n", g1, g2);
  printf("&g1: %p, &g2: %p\n", &g1, &g2);
}

g1 and g2 have distinct addresses:

$ cl -Z7 -Gy -Gw -O1 -c t.cpp -Facl.asm && link t.obj  -OPT:REF,ICF -debug && ./t.exe
Microsoft (R) C/C++ Optimizing Compiler Version 19.28.29336 for x64
Copyright (C) Microsoft Corporation.  All rights reserved.

t.cpp
Microsoft (R) Incremental Linker Version 14.28.29336.0
Copyright (C) Microsoft Corporation.  All rights reserved.

g1: 42, g2: 42
&g1: 00007FF77EDD22E0, &g2: 00007FF77EDD22E4
lld/COFF/ICF.cpp
86–89

nit: I feel like this would read better as:

// Under regular (not safe) ICF, all code sections are eligible.
if (icfLevel == ICFLevel::All && ...EXECUTE)
  return true;

I think that's functionally equivalent: under safe ICF we'd fall through to the main return.

zequanwu updated this revision to Diff 327336.Mar 1 2021, 6:17 PM
zequanwu marked 2 inline comments as done.

address comment.

rnk added a comment.Mar 2 2021, 1:39 PM

Thanks, I have some more driver code simplification suggestions.

lld/COFF/Driver.cpp
1554–1557

I see, I had forgotten about this limited ICF behavior.

1606–1607

I think this FIXME is wrong, LLD will merge readonly data in both safe and regular ICF modes. In both modes, it will refuse to merge readonly data unless the object file has .llvm_addrsig and the data is not address taken.

Let's make the following changes:

  • get rid of the "limited" ICF level 1, it is equivalent to ICFLevel::All
  • Use Optional<ICFLevel> as the type for icfLevel
  • if (!icfLevel && config->doGC) icfLevel = ICFLevel::All; -- that should make /opt:ref enable ICF
  • Delete this FIXME comment and the discussion of limited ICF.
zequanwu added inline comments.Mar 3 2021, 12:20 PM
lld/COFF/Driver.cpp
1606–1607

This will change the default behavior. Do you meant that? Or I misunderstand.
In current behavior, if doGC is false and /opt:icf is not given (icfLevel ==1), it will set icfLevel to 0, which disable icf:

if (icfLevel == 1 && !doGC)
  icfLevel = 0;

By your changes, if doGC is false and /opt:icf is not given, icfLevel will be ICFLevel::ALL which enables icf.

rnk added inline comments.Mar 3 2021, 12:31 PM
lld/COFF/Driver.cpp
1606–1607

Yes, sorry, my intent is to use Optional to handle the case where neither /OPT:REF nor /OPT:ICF appear. In that case, icfLevel will be None, and doICF should become ICFLevel::None. Maybe a better way of writing that is:

if (!icfLevel)
  icfLevel = config->doGC ? ICFLevel::All : ICFLevel::None;
zequanwu updated this revision to Diff 327920.Mar 3 2021, 2:27 PM

Address comment.

zequanwu retitled this revision from [lld-link] Fix addrsig symbols merging in ICF. to [lld-link] Add safe icf mode to lld-link, which does safe icf for all sections..Mar 3 2021, 2:27 PM
zequanwu edited the summary of this revision. (Show Details)
rnk accepted this revision.Mar 3 2021, 2:29 PM

lgtm

This revision is now accepted and ready to land.Mar 3 2021, 2:29 PM
This revision was landed with ongoing or failed builds.Mar 3 2021, 2:53 PM
This revision was automatically updated to reflect the committed changes.
thakis added a subscriber: thakis.Mar 3 2021, 4:29 PM

I'm super confused by this change. lld-link used to do aggressive icf for code and safe icf for data. The consensus on https://bugs.chromium.org/p/chromium/issues/detail?id=1087923 was that we want aggressive icf for both code and data, like lld on other platforms. But this change adds an option to make ICF less aggressive, not more aggressive, from what I understand (correct?)

I'm super confused by this change. lld-link used to do aggressive icf for code and safe icf for data. The consensus on https://bugs.chromium.org/p/chromium/issues/detail?id=1087923 was that we want aggressive icf for both code and data, like lld on other platforms. But this change adds an option to make ICF less aggressive, not more aggressive, from what I understand (correct?)

The motivation for this is from: https://bugs.chromium.org/p/chromium/issues/detail?id=1169276, which is for better debugging experience.