This is an archive of the discontinued LLVM Phabricator instance.

[Orc] Fix pending debug object tracking in DebugObjectManagerPlugin
ClosedPublic

Authored by sgraenitz on Mar 17 2021, 8:49 AM.

Details

Summary

There can be multiple MaterializationResponsibilitys in-flight for a single ResourceKey. Hence, pending debug objects must be tracked by MaterializationResponsibility and not by ResourceKey.

Diff Detail

Event Timeline

sgraenitz created this revision.Mar 17 2021, 8:49 AM
sgraenitz requested review of this revision.Mar 17 2021, 8:49 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 17 2021, 8:49 AM

This is hard to reproduce with lli, because single-threaded execution with LLLazyJIT will implicitly sequence materialization. LLJIT instead materializes module dependencies recursively, which makes this issue easy to reproduce. Unfortunately, we have no LLJIT tool that I could use for a test. It's not the first time this limitation comes up for me and I wonder if it's worth adding an orc-greedy kind to lli. What do you think?

llvm/lib/ExecutionEngine/Orc/DebugObjectManagerPlugin.cpp
515

Can we guaranteed that affected in-flight MRs are still alive at this point? Running the code as is works, but I wonder if it's a robust assumption.

523

It's a different issue, but I realized that this might be a dangerous use of resource keys. They are only guaranteed to be valid within the lambda passed into withResourceKeyDo() right?

lhames added inline comments.Mar 17 2021, 6:22 PM
llvm/lib/ExecutionEngine/Orc/DebugObjectManagerPlugin.cpp
515

We can guarantee that no MR is trying to register resources with the key at the point that notifyRemovingResources is called, provided that keys are never leaked from withResourceKeyDo -- see below. :)

523

Yep. I missed this on my first read through.

You should always use MaterializationResponsibility::withResourceKeyDo to create associations between resources and keys, and never leak the key from inside withResourceKeyDo. This system guarantees that the association never ends up in a race with resource removal: Any race becomes ordered as either: ASSOCIATE then REMOVE (normal removal case), or REMOVE then ASSOCIATE (in which case withResourceKey do will return a ResourceTrackerDefunct error and not run your lambda).

This is hard to reproduce with lli, because single-threaded execution with LLLazyJIT will implicitly sequence materialization. LLJIT instead materializes module dependencies recursively, which makes this issue easy to reproduce. Unfortunately, we have no LLJIT tool that I could use for a test. It's not the first time this limitation comes up for me and I wonder if it's worth adding an orc-greedy kind to lli. What do you think?

I'm 100% in favor of an -orc-greedy mode for lli. The only reason it doesn't have one is that I've been short on time.

If it were built I would say that it should just be -jit-kind=orc (with -jit-kind=orc-lazy remaining as it is), and that it should be the new default. In that case all MCJIT test cases would just have to be updated to explicitly specify -jit-kind=mcjit.

sgraenitz updated this revision to Diff 331537.Mar 18 2021, 6:07 AM

Pending resources are irrelevant in both, notifyTransferringResources() and notifyRemovingResources(). They are guaranteed to be either finalizd in notifyEmitted() or discarded in notifyFailed().

sgraenitz marked 2 inline comments as done.Mar 18 2021, 6:33 AM

Thanks for elaborating.

llvm/lib/ExecutionEngine/Orc/DebugObjectManagerPlugin.cpp
515

It turned out that I don't need to worry about pending debug objects here, which renders my question irrelevant.

523

Fixing this in D98863

sgraenitz marked 2 inline comments as done.Mar 18 2021, 6:34 AM
lhames accepted this revision.Mar 22 2021, 9:26 AM

LGTM.

This revision is now accepted and ready to land.Mar 22 2021, 9:26 AM