This is an archive of the discontinued LLVM Phabricator instance.

Add ModulesDidLoad to LanguageRuntime
Needs Review
Public

Authored by domipheus on Apr 13 2015, 7:56 AM.

Details

Reviewers
clayborg
jingham
Summary

Having ModulesDidLoad in the base LanguageRuntime allows for language plugins to be notified of module loading without plugin specifics creeping into the Target project.

Diff Detail

Repository
rL LLVM

Event Timeline

domipheus updated this revision to Diff 23676. Apr 13 2015, 7:56 AM
domipheus retitled this revision to "Add ModulesDidLoad to LanguageRuntime".
domipheus updated this object.
domipheus edited the test plan for this revision.
domipheus added reviewers: jingham, clayborg.
domipheus set the repository for this revision to rL LLVM.
domipheus added a subscriber: Unknown Object (MLST).

Any comments on this?

clayborg edited edge metadata. Apr 15 2015, 11:39 AM

So the main question here is: what are you trying to accomplish by adding this call? Your LanguageRuntime plug-in will have a static CreateInstance() that looks at the target and at the image list of that target to see if a shared library is around for the runtime. See the function named "AppleObjCRuntime::GetObjCVersion()" in AppleObjCRuntime.cpp for an example. AppleObjCRuntimeV1::CreateInstance() and AppleObjCRuntimeV2::CreateInstance() call this function to check the target's image list and see whether any of the images contains certain things.

Currently, you call a function in lldb_private::Process to get your language runtime:

virtual ObjCLanguageRuntime *
GetObjCLanguageRuntime (bool retry_if_null = true);

You would add one of these for RenderScript and it would go on to call "LanguageRuntime *RenderScriptLanguageRuntime::CreateInstance(Process *process, lldb::LanguageType language)" in order to try to instantiate your runtime. You would then look through the shared libraries to see if your RenderScript shared library is loaded and return a valid instance if it is there, and NULL if not.

Let me know what you were thinking you want LanguageRuntime::ModulesDidLoad(...) for. The only reason to add it would be so that your existing language runtime could be kept up to date and know about all shared libraries that the language runtime might want to know about, but I doubt you need that functionality and I would venture to say we don't need this function.
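To illustrate the CreateInstance() pattern described above, here is a minimal, self-contained sketch. It does not use the real LLDB classes: Module, Process and RenderScriptRuntime are hypothetical stand-ins, and the library name "libRS.so" is an assumption made purely for the example.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical stand-ins for the lldb_private types; not the real LLDB API.
struct Module {
    std::string file_name;
};

struct Process {
    std::vector<Module> images;  // stand-in for the target's image list
};

// Assumed name of the runtime's shared library, for illustration only.
const char *const kRuntimeLibraryName = "libRS.so";

class RenderScriptRuntime {
public:
    // Returns a runtime only if the runtime library is already in the
    // image list; otherwise returns null, mirroring "NULL if not".
    static std::unique_ptr<RenderScriptRuntime> CreateInstance(const Process *process) {
        assert(process != nullptr);
        for (const Module &module : process->images) {
            if (module.file_name == kRuntimeLibraryName)
                return std::unique_ptr<RenderScriptRuntime>(new RenderScriptRuntime());
        }
        return nullptr;
    }

private:
    RenderScriptRuntime() = default;
};
```

With this shape, a caller that asks too early simply gets null back and can ask again once more modules have been loaded.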

> So the main question here is: what are you trying to accomplish by adding this call? Your LanguageRuntime plug-in will have a static CreateInstance() that looks at the target and at the image list of that target to see if a shared library is around for the runtime. See the function named "AppleObjCRuntime::GetObjCVersion()" in AppleObjCRuntime.cpp for an example. AppleObjCRuntimeV1::CreateInstance() and AppleObjCRuntimeV2::CreateInstance() call this function to check the target's image list and see whether any of the images contains certain things.
>
> Currently, you call a function in lldb_private::Process to get your language runtime:
>
> virtual ObjCLanguageRuntime *
> GetObjCLanguageRuntime (bool retry_if_null = true);
>
> You would add one of these for RenderScript and it would go on to call "LanguageRuntime *RenderScriptLanguageRuntime::CreateInstance(Process *process, lldb::LanguageType language)" in order to try to instantiate your runtime. You would then look through the shared libraries to see if your RenderScript shared library is loaded and return a valid instance if it is there, and NULL if not.

The issue here is that I'd need to add some sort of RenderScript base language definition to the Target project, just like ObjC. I don't think this is ideal; as more languages are added to LLDB, they should really remain in their own plugin project. It pollutes the Target with language and higher-end specifics which (in my opinion, of course!) ought not to be there. If you look in Target::ModulesDidLoad, it calls into the ObjC runtime to check there for the runtime library (using that Process helper function). I'd envisaged, as a first step, adding something like the following to that function, under the ObjC logic:

LanguageRuntime* rs_runtime = process->GetLanguageRuntime(eLanguageTypeExtRenderScript);
if (rs_runtime)
    rs_runtime->ModulesDidLoad(module_list);

> Let me know what you were thinking you want LanguageRuntime::ModulesDidLoad(...) for. The only reason to add it would be so that your existing language runtime could be kept up to date and know about all shared libraries that the language runtime might want to know about, but I doubt you need that functionality and I would venture to say we don't need this function.

This is actually one use case. RenderScript runs alongside other programming languages, such as C++ or Java. In RenderScript, kernel objects can be loaded dynamically at runtime, and ideally I'd like these tracked as early as possible. I guess there could be another hook into modifications of the module list which language runtimes could use; either way, I do require that functionality.

Additionally, this is also a step in the direction of multiple languages being debugged at once in a single process which is not C++/ObjC, and whilst it's a small step I do think there needs to be more ways for other languages to be supported without adding direct support into the various core modules of LLDB.

If you think there is a better way of achieving this, I'm of course open to having that discussion!

jingham edited edge metadata. Apr 15 2015, 3:05 PM

Yes, ideally there would be some language runtime manager which the plugins would register with, and generic code would call it to dispatch general runtime tasks to all the specific runtimes. ModulesDidLoad would be a good first candidate for this.

The story is complicated by the fact that there are runtime specific things that you want to do before you know that the runtime for that language has actually been loaded into your program or not (so you may not know which specific instance of the runtime is available.) It used to be the case that Apple supported the gcc3 & gcc4 C++ runtimes (they had different mangling schemes and everything) so you could have a vanilla ObjC program that could load a framework that used either one or both of these runtimes. You couldn't necessarily know a priori which one you will encounter, so the runtimes had to come into being during the life of the process, not the target. However, there are clearly some LanguageRuntime jobs you would like to get done without having a particular runtime. So for instance setting the exception breakpoints is complicated because you want to set the breakpoint when you just have a target, but you don't know what actual resolver to use till the language runtime for that language actually shows up.

And in another instance, the LanguageRuntimeManager's ModulesDidLoad call would normally call each loaded language runtime's DidLoad, but if a particular LanguageRuntime wasn't loaded yet it would have to call a static "does this module load indicate that the runtime for this language exists, and if so create it" method. It will probably be necessary to have a LanguageRuntimeTarget in the target which handles all the things you might set up for the language before a specific instance of that runtime shows up, and then the one in the process that corresponds to the actual version of the runtime we've discovered during the course of running.
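As a rough sketch of the dispatch described above (not an existing LLDB class), a manager could notify the runtimes that already exist and run a static probe for the ones that do not. Every name here (LanguageRuntimeManager, RuntimePlugin, MakeCountingPlugin, the string-based module list) is hypothetical and chosen only for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical sketch; none of these names are part of LLDB.
using ModuleList = std::vector<std::string>;

class LanguageRuntime {
public:
    virtual ~LanguageRuntime() = default;
    virtual void ModulesDidLoad(const ModuleList &modules) = 0;
};

// What a plugin registers: a static probe plus a factory.
struct RuntimePlugin {
    std::function<bool(const ModuleList &)> modules_indicate_runtime;
    std::function<std::unique_ptr<LanguageRuntime>()> create;
};

class LanguageRuntimeManager {
public:
    void Register(const std::string &language, RuntimePlugin plugin) {
        assert(plugin.modules_indicate_runtime && plugin.create);
        m_plugins[language] = std::move(plugin);
    }

    void ModulesDidLoad(const ModuleList &modules) {
        for (auto &entry : m_plugins) {
            auto live = m_runtimes.find(entry.first);
            if (live != m_runtimes.end()) {
                // Runtime already exists: just forward the notification.
                live->second->ModulesDidLoad(modules);
            } else if (entry.second.modules_indicate_runtime(modules)) {
                // Probe says the runtime showed up: create it, then notify it.
                std::unique_ptr<LanguageRuntime> runtime = entry.second.create();
                runtime->ModulesDidLoad(modules);
                m_runtimes[entry.first] = std::move(runtime);
            }
        }
    }

    bool IsLoaded(const std::string &language) const {
        return m_runtimes.count(language) != 0;
    }

private:
    std::map<std::string, RuntimePlugin> m_plugins;
    std::map<std::string, std::unique_ptr<LanguageRuntime>> m_runtimes;
};

// Example plugin: comes to life when `library` is loaded and counts
// how many ModulesDidLoad notifications its runtime receives.
RuntimePlugin MakeCountingPlugin(const std::string &library, int *notifications) {
    struct CountingRuntime : LanguageRuntime {
        explicit CountingRuntime(int *n) : count(n) {}
        void ModulesDidLoad(const ModuleList &) override { ++*count; }
        int *count;
    };
    RuntimePlugin plugin;
    plugin.modules_indicate_runtime = [library](const ModuleList &modules) {
        return std::find(modules.begin(), modules.end(), library) != modules.end();
    };
    plugin.create = [notifications]() {
        return std::unique_ptr<LanguageRuntime>(new CountingRuntime(notifications));
    };
    return plugin;
}
```

Note that the probe runs on every module load for every runtime that is not yet alive, so its cost matters; this sketch makes no attempt to address that.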

More generally, there are some places where you will obviously have to talk directly to specific runtimes - for instance the ClangExpressionParser & friends are very C/C++/ObjC specific and ask fairly detailed questions of the runtime. I'm not sure that it's ever going to be easy to push all that into generic interfaces. In other places we could make a general interface and have the runtime manager do the specific runtime iteration.

When we started drafting out lldb, I didn't feel like I knew enough about the way the LanguageRuntimes would work to want to try to generalize the interfaces, so I didn't make the manager class, etc, but left calling into the runtimes fairly ad hoc. The process of supporting a couple more languages will make this clearer and give us the insight to do this piece of work correctly.

Jim

Thanks for the update, Jim. Would progressing with this patch be okay in the meantime, with the end goal being that we use the knowledge gained from getting RenderScript working, and from seeing how it ends up engaging with the other language plugins and the Target/Process, to formalise a LanguageRuntimeManager?

Colin

That sounds fine to me.

Jim

Greg, if you're in agreement, approve and I'll get this committed.

Cheers,
Colin

I just committed a patch:

% svn commit
Sending include/lldb/Target/LanguageRuntime.h
Sending include/lldb/Target/ObjCLanguageRuntime.h
Sending source/Plugins/LanguageRuntime/ObjC/AppleObjCRuntime/AppleObjCRuntime.cpp
Sending source/Plugins/LanguageRuntime/ObjC/AppleObjCRuntime/AppleObjCRuntime.h
Sending source/Target/ObjCLanguageRuntime.cpp
Sending source/Target/Process.cpp
Sending source/Target/Target.cpp
Transmitting file data .......
Committed revision 235118.

It goes a bit further and moves logic down into Process::ModulesDidLoad instead of doing the logic in Target::ModulesDidLoad. Let me know if my patch doesn't solve your problems. Test suite ran clean on MacOSX with this patch, and since only the Objective C runtime is affected, it shouldn't affect anyone else.


Hi Greg,

This looks good. There is one issue to solve, and that is the first initialization of a given language runtime. ObjC has a special case in that it's already initialized by various calls to GetLanguageRuntime. Another language runtime, especially one that only 'activates' mid-process, needs some way of launching so that it exists in the m_language_runtimes collection.

Colin

Right now all LanguageRuntimes are on demand: we create a language runtime plug-in lazily, and only if someone asks for it.
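A minimal sketch of that on-demand behaviour, assuming a simple per-language factory registry. The Process class below is a hypothetical stand-in rather than lldb_private::Process, and RegisterFactory and the string language keys are invented for the example; only the member name m_language_runtimes is borrowed from the discussion.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <utility>

// Hypothetical stand-ins; not the real LLDB classes.
struct LanguageRuntime {
    std::string language;
};

class Process {
public:
    using Factory = std::function<std::unique_ptr<LanguageRuntime>()>;

    void RegisterFactory(const std::string &language, Factory factory) {
        assert(static_cast<bool>(factory));
        m_factories[language] = std::move(factory);
    }

    // Nothing is created until the first request. A successful result is
    // cached; a null result is not, so a later call can try again.
    LanguageRuntime *GetLanguageRuntime(const std::string &language) {
        auto cached = m_language_runtimes.find(language);
        if (cached != m_language_runtimes.end())
            return cached->second.get();

        auto factory = m_factories.find(language);
        if (factory == m_factories.end())
            return nullptr;

        std::unique_ptr<LanguageRuntime> runtime = factory->second();
        if (!runtime)
            return nullptr;

        LanguageRuntime *raw = runtime.get();
        m_language_runtimes[language] = std::move(runtime);
        return raw;
    }

private:
    std::map<std::string, Factory> m_factories;
    std::map<std::string, std::unique_ptr<LanguageRuntime>> m_language_runtimes;
};
```

The consequence raised earlier in the thread follows directly: a runtime that nobody asks for never lands in m_language_runtimes, so it cannot receive notifications.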

Then we need to think about if there is ever anything a language runtime plug-in should be able to do without being asked. Jim and I talked about having the Process::ModulesDidLoad() run through all language runtime plug-ins and try to load them each time modules are loaded. I would like to avoid this if possible and do things lazily as these kinds of things have a tendency to slow things down for everyone all the time, just so we can load a plug-in that nobody might use. Many of these language runtime plug-ins will need to look through the sections or look for symbols with specific names and that can be costly (going through 150 shared libraries and parsing all of their symbol tables and making the by name lookup accelerator tables).

So let me know if you truly have the need for your language runtime plug-in to be always created and why before we move on making any changes.

Greg

There is no need for it to be always created. But at present, I can't see anywhere for the user to load a language runtime explicitly. Would adding a language load command to the plugins group be suitable?

I'm thinking C++ going into other language bindings and back again. For the user, having these plugins automatically loaded on some sort of discovery event would be ideal. However, having a 'plugin language load <language name>' command so the user can set up their environment is the next best thing. I can add this command if you agree (will simply GetLanguageRuntime() on the process, ensuring non-null is returned) and put up a patch.

Colin

Why would we need to manually load a LanguageRuntime plug-in? The expression parser just knows what language it is dealing with and the RenderScript expression parser should just request the language runtime plug-in when and only when it needs it.

What are you trying to accomplish? Can you give me a workflow of what you are thinking?

Greg

There is a variety of things that a language runtime can expose. Scripting languages can expose versions and various other things such as JIT/interpreter status and object statuses. RenderScript offers details on object allocations, device capabilities and workload flow. Breakpoints on certain actions could be accomplished, too.

Here is an example RenderScript workflow:

  1. Connect to a target which is running a process with several different languages, say C++ and RenderScript.
  2. User stops in C++ code.
  3. User knows the program contains RenderScript and loads the RenderScript language runtime, which scans the process and produces a model of the environment.
  4. Using new lldb commands from the language runtime, the user inspects the state of the RenderScript runtime and sees a loaded kernel function they want to inspect.
  5. User sets a breakpoint on this RenderScript kernel function or on some other exception event.
  6. Process stops at the breakpoint in RenderScript code; user requests information on a parameter.
  7. RenderScriptRuntime uses information from the process and the inspected runtime state to get the allocation details and kernel usage, as well as the standard 'lldb value'.
  8. User does something else...

Maybe LanguageRuntime isn't the best place for this - but is there somewhere else where this makes sense? When I read language runtime, I thought it was anything to do with a language at 'run-time', so if a language actually has a runtime/manager library, interactions with it would occur there. The ability to have lldb collect details and create a model of the environment helps, especially in the cases of scripting languages with relevant C++ bindings.

Thanks for explaining what your thoughts are. Sounds like we should add a new top-level command like "language" whose form is "language <language-name> ...".

So we could do:

(lldb) language renderscript ...

This command would take the second parameter "renderscript", look up the enumeration eLanguageRenderscript, and then ask the process for the language runtime:

Language language;
if (language.SetLanguageFromCString(arg2))
{
    LanguageRuntime *language_runtime = process->GetLanguageRuntime(language.GetLanguage());
    if (language_runtime)
    {
        Error error = language_runtime->HandleCommand(remaining_args);
        ....
    }
}

Then each language runtime can vend its own commands for inspection and anything else you want to do.
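To make the proposed dispatch concrete, here is a self-contained sketch with mock types. It assumes the command line is split on whitespace and that HandleCommand reports failure through a non-empty error string; RecordingRuntime, AddRuntime and HandleLanguageCommand are hypothetical names, and nothing here reflects how LLDB's command interpreter is actually structured.

```cpp
#include <cassert>
#include <map>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-ins; not the real LLDB classes.
class LanguageRuntime {
public:
    virtual ~LanguageRuntime() = default;
    // Returns an error message; an empty string means success.
    virtual std::string HandleCommand(const std::vector<std::string> &args) = 0;
};

// Example runtime that just remembers the arguments it was given.
struct RecordingRuntime : LanguageRuntime {
    std::string HandleCommand(const std::vector<std::string> &args) override {
        last_args = args;
        return "";
    }
    std::vector<std::string> last_args;
};

class Process {
public:
    void AddRuntime(const std::string &language, LanguageRuntime *runtime) {
        assert(runtime != nullptr);
        m_runtimes[language] = runtime;
    }
    LanguageRuntime *GetLanguageRuntime(const std::string &language) const {
        auto found = m_runtimes.find(language);
        return found == m_runtimes.end() ? nullptr : found->second;
    }

private:
    std::map<std::string, LanguageRuntime *> m_runtimes;  // not owned
};

// Parses "language <language-name> ..." and forwards the remaining
// words to the runtime for that language.
std::string HandleLanguageCommand(const Process &process, const std::string &command_line) {
    std::istringstream stream(command_line);
    std::vector<std::string> words;
    for (std::string word; stream >> word;)
        words.push_back(word);

    if (words.size() < 2 || words[0] != "language")
        return "usage: language <language-name> ...";

    LanguageRuntime *runtime = process.GetLanguageRuntime(words[1]);
    if (!runtime)
        return "no language runtime available for '" + words[1] + "'";

    std::vector<std::string> remaining(words.begin() + 2, words.end());
    return runtime->HandleCommand(remaining);
}
```

The top-level command only knows how to find a runtime; what the remaining arguments mean is left entirely to each plug-in.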

We also will need a special language runtime breakpoint that we can set that uses a LanguageRuntime to implement special breakpoints with their own set of unique options to allow for complex language runtime breakpoints as you specified below...

How does that sound?

That sounds great - I can create another patch for this if you'd like.

Colin

Yes a new patch would be great.

If we get this in place, it might be useful to add a

(lldb) language <Language> set-exception-breakpoint

as well as:

(lldb) break set -E <Language>

so that we can pass language specific options for filtering the exception breakpoints. Right now we don't have this capability, but presumably different languages will allow different filters and trying to make them all pass through generic breakpoint setting options is going to be a real pain.

We could leave the original "break set -E" in place just so it wouldn't be hard to find. But then we could use:

(lldb) language <Language> modify-exception-breakpoint <BKPT_ID> <Language-Specific-Option>

to set the language specific options on the exception breakpoint...

Jim