This is an archive of the discontinued LLVM Phabricator instance.

[Orc] Add verylazy example for C-bindings
ClosedPublic

Authored by vchuravy on Jun 23 2021, 10:52 AM.

Details

Summary

Still WIP, based on the Kaleidoscope/BuildingAJIT/Chapter4.

Diff Detail

Event Timeline

vchuravy created this revision.Jun 23 2021, 10:52 AM
vchuravy requested review of this revision.Jun 23 2021, 10:52 AM
Herald added a project: Restricted Project. · View Herald TranscriptJun 23 2021, 10:52 AM

@lhames this currently fails with

➜  build git:(vc/orcv2-lazy-capi) ✗ bin/OrcV2CBindingsVeryLazy 0
Error: Duplicate definition of symbol 'foo_body'
fish: Job 1, 'bin/OrcV2CBindingsVeryLazy 0' terminated by signal SIGSEGV (Address boundary error)

Should I have used independent JITDylibs or done a triple indirection foo -> foo_trampoline -> foo_body?

vchuravy updated this revision to Diff 354162.Jun 23 2021, 11:45 PM

Use IRLayer::emit instead of LLJIT::add

vchuravy updated this revision to Diff 354575.Jun 25 2021, 12:14 PM

Use MaterializationResponsibility::replace

vchuravy updated this revision to Diff 354676.Jun 26 2021, 7:27 AM

Create separate MaterializationUnit's

vchuravy updated this revision to Diff 355991.Jul 1 2021, 12:58 PM

clang-format and rebase

vchuravy updated this revision to Diff 356215.Jul 2 2021, 10:49 AM

add first draft of docs and unit test for MaterializationResponisibility

I see a couple of warnings when building this code:

/path/to/llvm-project/llvm/examples/OrcV2Examples/OrcV2CBindingsVeryLazy/OrcV2CBindingsVeryLazy.c:127:14: warning: variable 'TSM' is used uninitialized whenever 'if' condition is false [-Wsometimes-uninitialized]
  } else if (strcmp(CSym, "bar_body") == 0) {
             ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/path/to/llvm-project/llvm/examples/OrcV2Examples/OrcV2CBindingsVeryLazy/OrcV2CBindingsVeryLazy.c:133:10: note: uninitialized use occurs here
  assert(TSM);
         ^~~
/Library/Developer/CommandLineTools/SDKs/MacOSX11.5.sdk/usr/include/assert.h:93:25: note: expanded from macro 'assert'
    (__builtin_expect(!(e), 0) ? __assert_rtn(__func__, __FILE__, __LINE__, #e) : (void)0)
                        ^
/path/to/llvm-project/llvm/examples/OrcV2Examples/OrcV2CBindingsVeryLazy/OrcV2CBindingsVeryLazy.c:127:10: note: remove the 'if' if its condition is always true
  } else if (strcmp(CSym, "bar_body") == 0) {
         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/path/to/llvm-project/llvm/examples/OrcV2Examples/OrcV2CBindingsVeryLazy/OrcV2CBindingsVeryLazy.c:120:33: note: initialize the variable 'TSM' to silence this warning
  LLVMOrcThreadSafeModuleRef TSM;
                                ^
                                 = NULL

and it asserts on execution on Darwin with:

./bin/OrcV2CBindingsVeryLazy 
Assertion failed: ((!RequiresNullTerminator || BufEnd[0] == 0) && "Buffer is not null terminated!"), function init, file /path/to/llvm-project/llvm/lib/Support/MemoryBuffer.cpp, line 49.

I'll see if I can figure out what's going on there tomorrow.

Regarding the MaterializationResponsibility API: it would be good to add the addDependencies and addDependenciesForAll methods. These declare dependencies for each materializing symbol so that ORC can track them in its dependence graph, which in turn ensures that lookups remain safe regardless of what threads the symbols are being materialized on. These methods are an essential part of the MaterializationResponsibility API.

llvm/include/llvm-c/Orc.h
704–720

This can be removed for now, since the operation isn't supported in the C++ API yet.

llvm/lib/ExecutionEngine/Orc/OrcV2CBindings.cpp
475–484

I'll remove this function on the mainline. This code can be dropped from this review.

vchuravy updated this revision to Diff 357817.Jul 11 2021, 1:01 PM

Address part of the review

vchuravy marked 2 inline comments as done.Jul 11 2021, 1:11 PM

Regarding the MaterializationResponsibility API: it would be good to add the addDependencies and addDependenciesForAll methods. These declare dependencies for each materializing symbol so that ORC can track them in its dependence graph, which in turn ensures that lookups remain safe regardless of what threads the symbols are being materialized on. These methods are an essential part of the MaterializationResponsibility API.

I added methods for both, but I haven't added a test yet. I marked in the unit tests which API methods for MR are not yet covered.

I think we also talked about the type hierarchy for the Layers and that in theory we should allow for some polymorphism instead of requiring specific types.

llvm/examples/OrcV2Examples/OrcV2CBindingsVeryLazy/OrcV2CBindingsVeryLazy.c
149

@lhames am I right that I need to pass the LLJIT as the Context?

llvm/include/llvm/ExecutionEngine/Orc/LLJIT.h
188

@lhames are you okay with making this public?

We need either that or a way to get the TM from LLJIT.
For Julia I copied the implementation https://github.com/maleadt/LLVM.jl/blob/600f5cd22252d12c64ff3e00f2921ba101323a70/deps/LLVMExtra/lib/orc.cpp#L229-L244

llvm/unittests/ExecutionEngine/Orc/OrcCAPITest.cpp
380 ↗(On Diff #357817)
400 ↗(On Diff #357817)

@lhames the flags returned here are wrong so I might have a mistake somewhere in the conversion code.

409 ↗(On Diff #357817)

@lhames Ideas for tests are appreciated :)

I found a few more issues while getting the example working on Darwin:

  1. Rather than exposing applyDataLayout, it would be better to add an LLVMOrcLLJITGetDataLayoutString function, then use the existing LLVMSetDataLayout function.
  1. You should uses strlen rather than sizeof when building the MemoryBuffers for the source -- MemoryBuffer's null termination guarantee is that the character one past the end of the buffer is null, so 'sizeof' causes assertions if the memory following the null terminator happens to be non-zero.
  1. You should compares mangled+interned symbol names, rather than doing string comparisons on symbol names. That makes the example work regardless of the platform mangling scheme. (Side note: in a real world example we'd want to intern the comparison symbols once and re-use them, but the extra memory management felt like it would just complicate things here so I've opted to re-intern the strings on each call.)
  1. You'll need to add a LLVMOrcDisposeMaterializationResponsibility function: MaterializationResponsibility objects are uniquely owned -- If you hand it off to a base layer you're off the hook, but if you fail the materialization you need a way to dispose of it. This should be called on MR after you fail the materialization in the example.

I've also added some inline notes about ownership in the comments.

llvm/include/llvm-c/Orc.h
615–622

typo in 'lenght'.

The comment should include a note that the pool entries are not retained by this operation: The entries need not be released, but must be retained if the caller wants them to outlive any mutating operations on the MR object.

626–628

This should include a note that it does not release the entries.

641–644

Should include a note that the returned entry is not retained -- the caller should retain it if they want it to outlive any mutating operations on the MR.

651

This should include a not that it only frees the list, it does not release the entries.

742–744

This should include a note that the function returns ownership of the instance, and the client is responsible for freeing it (after resolving/emitting, or failing).

vchuravy updated this revision to Diff 357896.Jul 12 2021, 4:16 AM

Apply changes from lhames

vchuravy updated this revision to Diff 357900.Jul 12 2021, 4:49 AM

Update docs

vchuravy updated this revision to Diff 357953.Jul 12 2021, 8:32 AM

Update unittests

vchuravy added inline comments.Jul 12 2021, 8:34 AM
llvm/unittests/ExecutionEngine/Orc/OrcCAPITest.cpp
432 ↗(On Diff #357953)

@lhames any ideas what I am doing wrong here?

lhames added inline comments.Jul 14 2021, 6:41 PM
llvm/unittests/ExecutionEngine/Orc/OrcCAPITest.cpp
432 ↗(On Diff #357953)

-debug-only=orc tells the story:

In main defining materializing symbols { ("_other", [Callable]) }
In main replacing symbols with MU@0x10d721db0 ("<Absolute Symbols>", { ("_other", [Callable]) })
Adding dependencies for _foo: { (main, { _other }) }
In main adding dependencies for _foo: { (main, { _other }) }
Adding dependencies for all symbols in { ("_foo", [Callable]) }: { (main, { _other }) }
In main adding dependencies for _foo: { (main, { _other }) }
In main resolving { ("_foo": 0x00000001000eb8a0 [Callable]) }
In main emitting { ("_foo", [Callable]) }
Done dispatching MaterializationUnits.

You've hit a not-yet-considered corner case: Adding a dependency on a symbol (other) for which there is no corresponding lookup. Materialization is currently only triggered by lookup, not by adding dependencies, so the lookup for foo is left stuck (due to the dependence on other) with no way to trigger materialization of other.

A real test of the dependence tracking in the success case would require async lookups. You could:

  1. Materialize foo, making foo depend on other.
  2. In the caller, verify that the lookup callback for foo has not run (due to the dependence)
  3. Materialize other by looking it up.
  4. In the caller, verify that the lookup callback for foo has now run.

I don't think you need to implement the APIs required to test this though. For this test I would just go ahead and resolve/emit other too, then this should work.

Could you include a FIXME ('write async lookup API, then... ', and the steps listed above) in the test case so that we can go back and do it The Right Way in the future?

vchuravy updated this revision to Diff 358994.Jul 15 2021, 8:49 AM
vchuravy marked an inline comment as not done.

Update test for adding dependencies and clang-format

Good to go from my side

lhames accepted this revision.Jul 17 2021, 9:39 PM

LGTM. This is great stuff -- thanks Valentin!

This revision is now accepted and ready to land.Jul 17 2021, 9:39 PM
This revision was landed with ongoing or failed builds.Jul 18 2021, 3:07 AM
This revision was automatically updated to reflect the committed changes.