This is an archive of the discontinued LLVM Phabricator instance.

[WIP] Early ThinLTOLayer2 prototype
AbandonedPublic

Authored by sgraenitz on Sep 6 2018, 11:32 AM.

Details

Reviewers
lhames
Summary

Prototype for an ORC layer that discovers future callees from ThinLTO module summaries. It aims to minimize frontloading and compiler interception overhead.

When adding a module, instead of parsing the entire IR code, the layer will only read its summary and merge it into the global index.
When querying an initial function, the layer will identify the defining module and stage it for compilation. Additionally it discovers all callees from that function, stage their modules and so on.
Materialization for the initial function will then trigger materialization of all callees.

Discovery is not yet configurable. It only runs once and unlimited. Undiscovered call sites will cause resolution errors in the linking layer.
The goal is to limit discovery and insert special stubs at the end points, that would trigger further discovery once execution arrives there. This requires some more work though.

For now I would like to discuss the general approach and find a good way to get rid of the current hack in Core.cpp (search for hackReplaceGuidWithNameIn).

Diff Detail

Event Timeline

sgraenitz created this revision.Sep 6 2018, 11:32 AM
mgrang added inline comments.Sep 6 2018, 11:37 AM
lib/ExecutionEngine/Orc/ThinLTOLayer.cpp
363
sgraenitz added a comment.EditedSep 6 2018, 12:10 PM

TEST INSTRUCTIONS
I am aware that this must change in case it gets upstreamed some day. Still, if you have a good idea how I could do this properly with gtest, please let me know. Also note, that the .ll files may not fit all systems.

$ ninja OrcJITTests
$ ninja llvm-as
$ bin/llvm-as ../llvm/unittests/ExecutionEngine/Orc/Inputs/Bar.ll -o unittests/ExecutionEngine/Orc/Bar.bc
$ bin/llvm-as ../llvm/unittests/ExecutionEngine/Orc/Inputs/Foo.ll -o unittests/ExecutionEngine/Orc/Foo.bc
$ unittests/ExecutionEngine/Orc/OrcJITTests -debug -debug-only="orc-thinlto-layer,function-import" --gtest_filter=LookupOrders/ThinLTOLayer2Test.*
lib/ExecutionEngine/Orc/Core.cpp
49

This helped me a lot in debugging.

1400

The hack I referred to in the summary. The straightforward solution would be another extension point in the API (callback or something), but I wonder if we could even keep the GUIDs and do this in the resolver?

1420

@lhames This may aid readability, as it's easy to confuse which symbols/symbol flags are modified here.

sgraenitz added inline comments.Sep 7 2018, 3:33 AM
unittests/ExecutionEngine/Orc/Inputs/Bar.ll
23

notEligibleToImport: 1 popped up recently and currently fails discovery with: ignored! No qualifying callee with summary found. Didn't find a quick way to fix that, so for now I will just clear it manually. Generated with:

clang -c -flto=thin Foo.cpp -o - | llvm-dis -o Foo.ll
sgraenitz updated this revision to Diff 164391.Sep 7 2018, 4:59 AM

[WIP] Manuall clear flag notEligibleToImport in test IR files + minor fixes

sgraenitz updated this revision to Diff 164393.Sep 7 2018, 5:20 AM

[WIP] Remove additional test inputs committed accidentally

sgraenitz updated this revision to Diff 164395.Sep 7 2018, 5:29 AM

[WIP] Support unused functions in emitted modules

sgraenitz updated this revision to Diff 164439.Sep 7 2018, 9:05 AM

[WIP] Add lazy compile callback stubs for undiscovered symbols.

sgraenitz updated this revision to Diff 164446.Sep 7 2018, 9:31 AM
sgraenitz marked an inline comment as done.

[WIP] Clean up test IR files + minor fixes.

sgraenitz added inline comments.Sep 7 2018, 9:32 AM
lib/ExecutionEngine/Orc/ThinLTOLayer.cpp
363

Ok changed that. (Although dups will be removed in the below line and their order is of no relevance.)

tejohnson added inline comments.Sep 7 2018, 10:14 AM
unittests/ExecutionEngine/Orc/Inputs/Bar.ll
23

There are 2 categories that will cause notEligibleToImport to be set:

  1. Correctness: the value is a local used in some opaque way (e.g. in module asm or on the llvm.used) that means the backend cannot safely promote it to global scope and rename.
  2. Profitability: if we think the backend cannot inline it anyway, then there has traditionally not been a reason to import.

Looks like you are being hit here by reason 2 - since bar has a "noinline" attribute.

I've been thinking about splitting these categories into 2 different flags, looks like that might be useful for you.

sgraenitz added inline comments.Sep 8 2018, 7:30 AM
unittests/ExecutionEngine/Orc/Inputs/Bar.ll
23

Thanks for the clarification. IIUC it's a hint for the importer to ignore the function for an aggregate of reasons summaries don't otherwise provide. So it's only a matter of interpretation, in the way that the summary generator may decide to set notEligibleToImport on a GV, but that doesn't cause it to omit regular information in the summary, right?

I've been thinking about splitting these categories into 2 different flags, looks like that might be useful for you.

Well, yes greater detail usually allows for better reasoning. But it sounds to me like both categories don't affect pure discovery the same way they affect actual cross-module-import. After all I only use the summaries to select which modules to compile at which time. Function import and inlining will certainly be interesting at some point, but it feels like a distance future.

So far I just used computeImportForFunction() and it's fine for my simple examples. I guess discovery in real-world code would profit from an adapted version that interprets summary information slightly different (will probably need that anyway for configurability).

Again, thank you so much for making the time and sharing your knowledge!

sgraenitz abandoned this revision.May 9 2019, 2:36 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 9 2019, 2:36 AM