This is an archive of the discontinued LLVM Phabricator instance.

[ThinLTO] Fix ThinLTOCodegenerator to export llvm.used symbols
ClosedPublic

Authored by steven_wu on Apr 8 2019, 1:46 PM.

Details

Summary

Reapply r357931 with fixes to ThinLTO testcases and llvm-lto tool.

ThinLTOCodeGenerator currently does not preserve llvm.used symbols and
it can internalize them. In order to pass the necessary information to the
legacy ThinLTOCodeGenerator, the input to the code generator is
rewritten to be based on lto::InputFile.

Now ThinLTO using the legacy LTO API will requires data layout in
Module.

"internalize" thinlto action in llvm-lto is updated to run both
"promote" and "internalize" with the same configuration as
ThinLTOCodeGenerator. The old "promote" + "internalize" option does not
produce the same output as ThinLTOCodeGenerator.

This fixes: PR41236
rdar://problem/49293439

Diff Detail

Repository
rL LLVM

Event Timeline

steven_wu created this revision.Apr 8 2019, 1:46 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 8 2019, 1:46 PM

What part of the patch caused the need to change the internalize action to do promote+internalize in one go?

What part of the patch caused the need to change the internalize action to do promote+internalize in one go?

It is to test the fix in the patch (llvm.used) which doesn't reproduce with llvm-lto without change. In the previous patch, I updated promote directly to behave the same as ThinLTOCodegenerator and I thought no tests is broken. It turns out my local configuration is wrong and some tests are labelled unsupported.
It turns out there are lots of weak resolution tests rely on the current configuration of promote function, which is testing the output from a pipeline doesn't match ThinLTOCodeGenerator. Merging promote + internalize and matching that with ThinLTOCodeGenerator is the best solution I have right now.

steven_wu updated this revision to Diff 194207.Apr 8 2019, 2:42 PM

Add one file that was missing from previous patch

Hi Steven,

I missed that this work was being done - sorry.

Currently the legacy api does not require the IRSymtab to be written to the bitcode files so there is potentially a performance degradation due to this approach and the generated bitcode files will be larger, did you measure that? There is also a hidden performance cost that if the bitcode needs upgrading then the IRSymtab must be regenerated which can be expensive.

An alternative approach is for the legacy api to get the list of used symbols from the bitcode and expose them via the lto_symbol_attributes; the system linker can then add "used" symbols to the preserve list directly. This seems more in-keeping with the original design IMO.

Hi Steven,

I missed that this work was being done - sorry.

Currently the legacy api does not require the IRSymtab to be written to the bitcode files so there is potentially a performance degradation due to this approach and the generated bitcode files will be larger, did you measure that?

The legacy API is not writing out IRSymtab but build that in memory during code generation. The only requirement change is that now data layout is required for thin LTO .

There is also a hidden performance cost that if the bitcode needs upgrading then the IRSymtab must be regenerated which can be expensive.

I did a simple performance test in https://reviews.llvm.org/D60226. Yes, the cost of hitting the upgrade path is quite significant (50% thin link time increase) but not measurable if no upgrade is hit. I don’t have big concerns over this. Let me know if you think it is harmful.

An alternative approach is for the legacy api to get the list of used symbols from the bitcode and expose them via the lto_symbol_attributes; the system linker can then add "used" symbols to the preserve list directly. This seems more in-keeping with the original design IMO.

That was one of the proposals I had but some change needs to made to encode and convey this info. See previous review for more info.

ormris added a subscriber: ormris.Apr 10 2019, 9:29 AM

Sorry for the delay in replying. I was trying to understand if the upgrade path was important. Comments inline:

Hi Steven,

I missed that this work was being done - sorry.

Currently the legacy api does not require the IRSymtab to be written to the bitcode files so there is potentially a performance degradation due to this approach and the generated bitcode files will be larger, did you measure that?

The legacy API is not writing out IRSymtab but build that in memory during code generation. The only requirement change is that now data layout is required for thin LTO .

My understanding is that the on disk bitcode contains a binary representation of the IRSymtab and this is read into memory when loading the file (which is why the upgrade path is more expensive than the no-upgrade path). I am saying that currently the legacy api does not require the IRSymtab to be written to disk. Writing it may cost more time and make the input files bigger.

There is also a hidden performance cost that if the bitcode needs upgrading then the IRSymtab must be regenerated which can be expensive.

I did a simple performance test in https://reviews.llvm.org/D60226. Yes, the cost of hitting the upgrade path is quite significant (50% thin link time increase) but not measurable if no upgrade is hit. I don’t have big concerns over this. Let me know if you think it is harmful.

My concern is with patch releases. These might hit the upgrade path for thirdparty bitcode libraries. This might be inconvenient for developers to work around. I will try to get back to you on Monday after I have gathered more information on this.

An alternative approach is for the legacy api to get the list of used symbols from the bitcode and expose them via the lto_symbol_attributes; the system linker can then add "used" symbols to the preserve list directly. This seems more in-keeping with the original design IMO.

That was one of the proposals I had but some change needs to made to encode and convey this info. See previous review for more info.

I can have a go at implementing this to see what would be involved, if that would help?

Sorry for the delay in replying. I was trying to understand if the upgrade path was important. Comments inline:

Hi Steven,

I missed that this work was being done - sorry.

Currently the legacy api does not require the IRSymtab to be written to the bitcode files so there is potentially a performance degradation due to this approach and the generated bitcode files will be larger, did you measure that?

The legacy API is not writing out IRSymtab but build that in memory during code generation. The only requirement change is that now data layout is required for thin LTO .

My understanding is that the on disk bitcode contains a binary representation of the IRSymtab and this is read into memory when loading the file (which is why the upgrade path is more expensive than the no-upgrade path). I am saying that currently the legacy api does not require the IRSymtab to be written to disk. Writing it may cost more time and make the input files bigger.

My patch didn't change that. IRSymtabs are built during thin-link time, hold in memory and it is not written out the disk. Any additional IO during thinLTO can potentially be a huge overhead so we are definitely introducing that. Am I misunderstanding your question?

There is also a hidden performance cost that if the bitcode needs upgrading then the IRSymtab must be regenerated which can be expensive.

I did a simple performance test in https://reviews.llvm.org/D60226. Yes, the cost of hitting the upgrade path is quite significant (50% thin link time increase) but not measurable if no upgrade is hit. I don’t have big concerns over this. Let me know if you think it is harmful.

My concern is with patch releases. These might hit the upgrade path for thirdparty bitcode libraries. This might be inconvenient for developers to work around. I will try to get back to you on Monday after I have gathered more information on this.

Sure. @dexonsmith, do you have any concerns on this?

An alternative approach is for the legacy api to get the list of used symbols from the bitcode and expose them via the lto_symbol_attributes; the system linker can then add "used" symbols to the preserve list directly. This seems more in-keeping with the original design IMO.

That was one of the proposals I had but some change needs to made to encode and convey this info. See previous review for more info.

I can have a go at implementing this to see what would be involved, if that would help?

Sure. I think my initial proposal was in the bug report: https://bugs.llvm.org/show_bug.cgi?id=41236
You can take a look and see if you agree with our discussion.

LGTM! Thanks for waiting for me to confirm. Comments inline:

Sorry for the delay in replying. I was trying to understand if the upgrade path was important. Comments inline:

Hi Steven,

I missed that this work was being done - sorry.

Currently the legacy api does not require the IRSymtab to be written to the bitcode files so there is potentially a performance degradation due to this approach and the generated bitcode files will be larger, did you measure that?

The legacy API is not writing out IRSymtab but build that in memory during code generation. The only requirement change is that now data layout is required for thin LTO .

My understanding is that the on disk bitcode contains a binary representation of the IRSymtab and this is read into memory when loading the file (which is why the upgrade path is more expensive than the no-upgrade path). I am saying that currently the legacy api does not require the IRSymtab to be written to disk. Writing it may cost more time and make the input files bigger.

My patch didn't change that. IRSymtabs are built during thin-link time, hold in memory and it is not written out the disk. Any additional IO during thinLTO can potentially be a huge overhead so we are definitely introducing that. Am I misunderstanding your question?

Sorry for the confusion. I see that this patch hasn't changed the behavior. This is because the IRSymtab is always written to the bitcode files. I imagined that, downstream, if you are shipping a toolchain that only uses the legacy api you would have disabled the generation of the IRSymtab, as it is not required. That's why I thought that there would be an impact from requiring the IRSymtab.

There is also a hidden performance cost that if the bitcode needs upgrading then the IRSymtab must be regenerated which can be expensive.

I did a simple performance test in https://reviews.llvm.org/D60226. Yes, the cost of hitting the upgrade path is quite significant (50% thin link time increase) but not measurable if no upgrade is hit. I don’t have big concerns over this. Let me know if you think it is harmful.

My concern is with patch releases. These might hit the upgrade path for thirdparty bitcode libraries. This might be inconvenient for developers to work around. I will try to get back to you on Monday after I have gathered more information on this.

Sure. @dexonsmith, do you have any concerns on this?

As things are now, this patch would mean that customers who receive patch releases would hit the upgrade path. However, we think that by the time we ship a toolchain containing this patch, we should either be able to change our release processes so that this is not an issue (by guaranteeing that patch releases will not need to upgrade the IRSymtab), or we will commit to improving the performance of the upgrade path. Therefore, this is no longer a concern for me.

An alternative approach is for the legacy api to get the list of used symbols from the bitcode and expose them via the lto_symbol_attributes; the system linker can then add "used" symbols to the preserve list directly. This seems more in-keeping with the original design IMO.

That was one of the proposals I had but some change needs to made to encode and convey this info. See previous review for more info.

I can have a go at implementing this to see what would be involved, if that would help?

Sure. I think my initial proposal was in the bug report: https://bugs.llvm.org/show_bug.cgi?id=41236
You can take a look and see if you agree with our discussion.

Thanks. Actually, after looking at this more, I want to accept this patch rather than any of the options that you list on the bug. Using the IRSymtab is a departure from the original design of the legacy api... but I now believe that this is a "good thing", as it means that the legacy and the resolution api's will use the same mechanism. I would like the legacy api and the resolution api to share as much code as possible, as this will improve the situation with maintaining and extending LTO. I am also interested in speeding up access to the "llvm.linker.options" metadata from the legacy api. In Sony we use llvm.linker.options for "dependent libraries" - #pragma comment(lib, ...). Currently, accessing this metadata is slow; but, if we use the IRSymtab access the dependent libraries, then we will be able to increase performance without having to make any changes to the bitcode loader.

tejohnson accepted this revision.Apr 16 2019, 8:59 AM

LGTM
Sorry for the delay, just catching up after a vacation.

This revision is now accepted and ready to land.Apr 16 2019, 8:59 AM

Thanks Teresa and Ben for the feedback! I am landing this change.

This revision was automatically updated to reflect the committed changes.