lld-link's /opt:safeicf will do safe icf for all sections.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
We had a user request to bring the "safe" and "all" ICF modes into alignment between the ports: https://crbug.com/1087923
I think LLD's /OPT:ICF behavior should match MSVC, which doesn't give you safe ICF, it should give you real ICF. So, I guess the thing to do is add a safeicf mode (/OPT:SAFEICF?), and then update these tests to use it.
LLD's current behavior is a bit emergent: we do safe ICF for non-executable data. That's pretty useful. Very few programs rely on function pointer address significance, but many programs do things like taking the address of a global and stuffing it in a hash table. The question is, should we have a mode that preserves this behavior? Fewer modes is better, but if we don't have a behavior preserving mode, Chrome either takes a binary size hit (safe ICF) or is potentially broken by global data ICF.
- Add /OPT:safeicf, which will mark all sections as unique if .addrsig doesn't exist. By default, /OPT:icf will do aggressive icf, matching the behavior of MSVC linker's /OPT:icf.
- Update test cases.
As far as I know the current behavior is that /opt:icf gives you full ICF for executable sections and safe ICF for non-executable sections. This is more than what MSVC does, which is full ICF for executable sections and no ICF for non-executable sections. I think it makes sense to add a mode that gives you full ICF for non-executable sections, but the behavior of /opt:icf should be preserved.
Initially, we found that the current behavior of /opt:icf merges executable sections even if they are in address-significant table, which obviously is a bug. This patch is trying to fix it and adding a /opt:safeicf to do safe icf for both executable and non-executable sections. The behavior of /opt:icf is preserved.
No it isn't. This matches the behavior of MSVC, which doesn't have a concept of an address significance table.
This patch is trying to fix it and adding a /opt:safeicf to do safe icf for both executable and non-executable sections. The behavior of /opt:icf is preserved.
If you think that /opt:icf should do safe ICF for executable sections then I don't get what you think the difference between /opt:icf and /opt:safeicf should be.
Preserve the bahavior of /opt:icf. Using /opt:safeicf to do safe icf for all sections.
Some nits.
I'm a bit worried that our end state for Chrome is going to be slightly confusing. In order to allow users to suppress ICF with attributes, this is what we will have to do:
- Add a new mode to clang -faddrsig that stops putting address taken functions in .llvm_addrsig (-faddrsig-data(-only)? -fno-addrsig-code?)
- Add a new attribute to clang that puts a function in .llvm_adddrsig (not sure if this should be called literally __attribute__((addrsig)) or something general like noicf)
- Switch from /OPT:IFC to /OPT:SAFEICF in Chrome
Someone reading the build files might think that we are using safe ICF, but the compiler flag (-fno-addrsig-code or whatever) will mean we are effectively using regular, aggressive ICF. I guess we'll just have to live with that. It's pretty normal for build files to be confusing. =/
lld/COFF/Config.h | ||
---|---|---|
86 | This isn't precisely MSVC's behavior. LLD will do ICF for code like MSVC, but it will do safe ICF for data using .llvm_addrsig, which MSVC doesn't look at. I'd make the comment more explicit, like: "Aggressive ICF for code, but safe ICF for data, similar to MSVC's behavior" It seems to me that MSVC will not merge any readonly data sections: extern "C" int printf(const char *, ...); extern const int g1; extern const int g2; const int g1 = 42; const int g2 = 42; int main() { printf("g1: %d, g2: %d\n", g1, g2); printf("&g1: %p, &g2: %p\n", &g1, &g2); } g1 and g2 have distinct addresses: $ cl -Z7 -Gy -Gw -O1 -c t.cpp -Facl.asm && link t.obj -OPT:REF,ICF -debug && ./t.exe Microsoft (R) C/C++ Optimizing Compiler Version 19.28.29336 for x64 Copyright (C) Microsoft Corporation. All rights reserved. t.cpp Microsoft (R) Incremental Linker Version 14.28.29336.0 Copyright (C) Microsoft Corporation. All rights reserved. g1: 42, g2: 42 &g1: 00007FF77EDD22E0, &g2: 00007FF77EDD22E4 | |
lld/COFF/ICF.cpp | ||
86–89 | nit: I feel like this would read better as: // Under regular (not safe) ICF, all code sections are eligible. if (icfLevel == ICFLevel::All && ...EXECUTE) return true; I think that's functionally equivalent: under safe ICF we'd fall through to the main return. |
Thanks, I have some more driver code simplification suggestions.
lld/COFF/Driver.cpp | ||
---|---|---|
1554–1557 | I see, I had forgotten about this limited ICF behavior. | |
1606–1607 | I think this FIXME is wrong, LLD will merge readonly data in both safe and regular ICF modes. In both modes, it will refuse to merge readonly data unless the object file has .llvm_addrsig and the data is not address taken. Let's make the following changes:
|
lld/COFF/Driver.cpp | ||
---|---|---|
1606–1607 | This will change the default behavior. Do you meant that? Or I misunderstand. if (icfLevel == 1 && !doGC) icfLevel = 0; By your changes, if doGC is false and /opt:icf is not given, icfLevel will be ICFLevel::ALL which enables icf. |
lld/COFF/Driver.cpp | ||
---|---|---|
1606–1607 | Yes, sorry, my intent is to use Optional to handle the case where neither /OPT:REF nor /OPT:ICF appear. In that case, icfLevel will be None, and doICF should become ICFLevel::None. Maybe a better way of writing that is: if (!icfLevel) icfLevel = config->doGC ? ICFLevel::All : ICFLevel::None; |
I'm super confused by this change. lld-link used to do aggressive icf for code and safe icf for data. The consensus on https://bugs.chromium.org/p/chromium/issues/detail?id=1087923 was that we want aggressive icf for both code and data, like lld on other platforms. But this change adds an option to make ICF less aggressive, not more aggressive, from what I understand (correct?)
The motivation for this is from: https://bugs.chromium.org/p/chromium/issues/detail?id=1169276, which is for better debugging experience.
This isn't precisely MSVC's behavior. LLD will do ICF for code like MSVC, but it will do safe ICF for data using .llvm_addrsig, which MSVC doesn't look at. I'd make the comment more explicit, like: "Aggressive ICF for code, but safe ICF for data, similar to MSVC's behavior"
It seems to me that MSVC will not merge any readonly data sections:
g1 and g2 have distinct addresses: