This is an archive of the discontinued LLVM Phabricator instance.

LLVMRunStaticConstructors can be called before object is finalized, #24028
ClosedPublic

Authored by Dav1d on Jan 14 2016, 8:02 AM.

Details

Summary

The C API offers no way to call finalizeObject manually, and other C API functions call it implicitly. LLVMRunStaticConstructors should therefore call it as well; otherwise you cannot run the static constructors without first calling a workaround function (that is, any other C API function which implicitly finalizes the object).

Diff Detail

Repository
rL LLVM

Event Timeline

Dav1d updated this revision to Diff 44882.Jan 14 2016, 8:02 AM
Dav1d retitled this revision from to LLVMRunStaticConstructors can be called before object is finalized, #24028.
Dav1d updated this object.
Dav1d added reviewers: deadalnix, echristo.
Dav1d added a subscriber: llvm-commits.
deadalnix edited edge metadata.Jan 14 2016, 8:07 AM

What happens if the EE object is finalized several times?

What happens if the EE object is finalized several times?

Good question, the documentation doesn't explicitly state that you can call it more than once.

http://llvm.org/docs/doxygen/html/classllvm_1_1ExecutionEngine.html#ab2973f596de3b640bdc087bf2d46bcfa
It is the user-level function for completing the process of making the object usable for execution. It should be called after sections within an object have been relocated using mapSectionAddress. When this method is called the MCJIT execution engine will reapply relocations for a loaded object. This method has no effect for the interpreter.

So judging by the documentation, the current C bindings, and the implementation of MCJIT (e.g. LLVMRunFunction calls finalizeObject() before everything else), you can call it more than once. It should be idempotent if no changes were made to the module since the last call, and if changes were made, calling it again is what you want anyway.
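To make the argument above concrete, here is a minimal, self-contained sketch of the pattern. It is a toy model, not LLVM code: `ToyEngine`, `toyRunStaticConstructors`, and `finalizeWork` are hypothetical names invented for illustration, and the real MCJIT reports an unfinalized object far less politely than by throwing an exception.

```cpp
#include <functional>
#include <stdexcept>
#include <utility>
#include <vector>

// Toy stand-in for an execution engine. NOT the LLVM class; every name
// here is an assumption made for illustration.
class ToyEngine {
public:
  // Adding new code invalidates the finalized state.
  void addConstructor(std::function<void()> ctor) {
    pending_.push_back(std::move(ctor));
    finalized_ = false;
  }

  // Idempotent: a repeat call with nothing new to finalize does no work.
  void finalizeObject() {
    if (finalized_)
      return;
    ready_.insert(ready_.end(), pending_.begin(), pending_.end());
    pending_.clear();
    finalized_ = true;
    ++finalizeWork_;
  }

  // Requires a finalized object, like the method this review is about.
  void runStaticConstructors() {
    if (!finalized_)
      throw std::logic_error("object not finalized");
    for (auto &ctor : ready_)
      ctor();
  }

  // Number of finalizeObject calls that actually did work.
  int finalizeWork() const { return finalizeWork_; }

private:
  std::vector<std::function<void()>> pending_;
  std::vector<std::function<void()>> ready_;
  bool finalized_ = false;
  int finalizeWork_ = 0;
};

// Shape of the proposed fix: the binding finalizes first, then runs the
// constructors, so callers need no workaround call beforehand.
void toyRunStaticConstructors(ToyEngine &ee) {
  ee.finalizeObject();
  ee.runStaticConstructors();
}
```

In this model an extra finalizeObject call is harmless when nothing changed and necessary when something did, which is the property the bindings rely on.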

deadalnix accepted this revision.Jan 14 2016, 12:10 PM
deadalnix edited edge metadata.

LGTM then

This revision is now accepted and ready to land.Jan 14 2016, 12:10 PM
mehdi_amini edited edge metadata.Jan 14 2016, 3:24 PM

It is not clear to me why LLVMGetPointerToGlobal calls it but not LLVMGetGlobalValueAddress (or LLVMAddGlobalMapping or ...)

In D16188#327370, @joker.eph wrote:

It is not clear to me why LLVMGetPointerToGlobal calls it but not LLVMGetGlobalValueAddress (or LLVMAddGlobalMapping or ...)

As I mentioned in the issue (https://llvm.org/bugs/show_bug.cgi?id=24028), I am not happy with the C API: stuff is missing (less and less with every new version, yay!) and stuff happens implicitly without being mentioned (like finalizeObject). Changing the API now will break a lot of code, even if you only want to add a function that calls finalizeObject explicitly (which is, by the way, marked with a 'TODO rename me') and make the API more consistent (add the function and remove the implicit calls).

I am not sure how far you want to go with breaking the API. In my opinion, a 'dumb' binding that simply exposes the C++ functionality to C would be best, *but* this would break existing code in a really bad manner (LLVMRunFunction suddenly segfaults for no obvious reason).

This patch only focuses on the problem I ran into. It could be extended to call finalizeObject before every call that requires a finalized object, but I don't think this is a good solution. If you want me to, I can (probably, if I figure out how to with Phabricator) update the patch to call finalizeObject in all the other functions that require a finalized object.

PS: My current workaround:
I have an empty function in my module, so before calling the constructors I run the function:

		// workaround which calls ee->finalizeObject, which makes
		// LLVMRunStaticConstructors not segfault
		LLVMDisposeGenericValue(LLVMRunFunction(ee, "vmain", []));
		LLVMRunStaticConstructors(ee);
lhames edited edge metadata.Jan 14 2016, 4:21 PM

This looks good to me, and yes - repeat calls to finalizeObject should be idempotent (except where new code is added, where they're required anyway).

<shameless plug> You should try the new ORC C API. It's a chance for us to start fresh and do this right. </shameless plug>

lhames accepted this revision.Jan 14 2016, 4:22 PM
lhames edited edge metadata.

@lhames , I have plans to move to it when 3.8 is out. Expect feedback.

This revision was automatically updated to reflect the committed changes.
lhames added a subscriber: lhames.Jan 15 2016, 6:04 PM

Hi David,

I'm happy to review and accept improvements to the MCJIT APIs, but have you
checked out the ORC C bindings (llvm/include/llvm-c/OrcCBindings.h) ?

The ORC bindings are relatively new. They're implemented on top of the ORC
JIT library, which is a modular re-implementation of the MCJIT concept.
Being new the bindings are fairly bare-bones at the moment, but I think
they represent a good opportunity for us to re-think our JIT C API design
in light of our experiences with MCJIT. I'll be doing the best I can with
them, time permitting, but I'm not a client of the C API, and I'd certainly
appreciate input from people who have used the existing API and who care
about C API quality in general.

To reach feature parity with MCJIT's C API we'd need to plumb through
support for custom memory managers and constructor and destructor execution
(all of which are already supported by the underlying implementation). I
don't think this plumbing would take too much work though. Notably, though
I don't know whether this will be of any use to you personally, the ORC C
APIs already support features that MCJIT is missing, such as lazy
compilation.

Cheers,
Lang.

Hey,

I have not heard of 'ORC' before. ORC in general appears to be a bunch of classes helping you build a JIT.
There seems to be an OrcMCJITReplacement, but there is also an OrcCBindingsStack, which seems to be used for the C bindings.
Is OrcMCJITReplacement an EE based on the old concept (so there could be a LLVMCreateOrcMCJITCompilerForModule)? And OrcCBindingsStack is the new 'stuff'? Will this be a new common interface for all ORC-based JITs (similar to the EE)?

But aside from all these implementation details, I would really like an extremely simple C interface: every C++ function/method gets an equivalent in C, with no magic involved (such as implicitly calling finalizeObject) and no utilities. In my opinion the C interface should just be there as a (complete) interface for other languages; these languages can then build their own abstractions on top of the C interface.
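As a rough illustration of what such a 'dumb' binding could look like, the sketch below exposes each method of a toy C++ class through exactly one C function behind an opaque handle, including an explicit finalize call. All names (`toy::Engine`, `ToyEngineRef`, `ToyFinalizeObject`, and so on) are hypothetical and not part of LLVM; the toy returns an error code where, going by this thread, the real engine would crash.

```cpp
// Toy C++ class standing in for an execution engine (hypothetical).
namespace toy {
class Engine {
public:
  void finalizeObject() { finalized_ = true; }
  // Returns 0 on success, -1 if the object was never finalized.
  int runFunction() const { return finalized_ ? 0 : -1; }

private:
  bool finalized_ = false;
};
} // namespace toy

// Opaque handle seen by C callers.
typedef struct ToyOpaqueEngine *ToyEngineRef;

// Conversions between the opaque C handle and the C++ object.
static toy::Engine *unwrap(ToyEngineRef EE) {
  return reinterpret_cast<toy::Engine *>(EE);
}
static ToyEngineRef wrap(toy::Engine *E) {
  return reinterpret_cast<ToyEngineRef>(E);
}

extern "C" {
// One C function per C++ method, no implicit behaviour.
ToyEngineRef ToyCreateEngine(void) { return wrap(new toy::Engine()); }
void ToyFinalizeObject(ToyEngineRef EE) { unwrap(EE)->finalizeObject(); }
int ToyRunFunction(ToyEngineRef EE) { return unwrap(EE)->runFunction(); }
void ToyDisposeEngine(ToyEngineRef EE) { delete unwrap(EE); }
}
```

The trade-off is visible here: the caller gets full control and a predictable mapping to the C++ API, but forgetting `ToyFinalizeObject` is now the caller's bug rather than something the binding papers over.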

Hi David,

ORC in general appears to be a bunch of classes helping you build a JIT.

That sums it up perfectly.

OrcMCJITReplacement is built with ORC classes, and aims to exactly
reproduce MCJIT's behaviour. It's intended as both a proof-of-concept and,
if we decide we're going to deprecate the MCJIT code, an upgrade path for
people who want to stick with MCJIT's behaviour and the ExecutionEngine
interface. We haven't added support for OrcMCJITReplacement to the C API
yet, but I expect we will in the future.

OrcCBindingsStack is, as you guessed, a new JIT implementation that is also
built with ORC components. Since we can't easily/efficiently express the
combination of ORC components in C, the intent is to build a stack with
"the lot", and design a C interface that allows the various pieces to be
accessed or disabled. I'm deliberately not going for 100% compatibility
with MCJIT's behaviour in OrcCBindingsStack, but I think it should come
close enough for 99% of MCJIT users. There is no common interface for all
ORC based JITs, but this is the only one that will be in-tree (apart from
the MCJIT replacement), so I think that's moot. It might be nice to have a
common interface between the OrcCBindingsStack and the interpreter, but I
expect that interface would be very limited.

As I mentioned - there's no nice way to express the combination of ORC
components that is possible in C++ in other languages. My aim is just to
provide all the functionality that users need within the OrcCBindingsStack,
then expose that class's interface as directly as possible.

Cheers,
Lang.

@lhames As far as I'm concerned, I'd like to have something closer to the old JIT. As I'm doing a lot of JIT/update module/reJIT cycles, moving to MCJIT was actually quite a step backward for the project I work on.

Is behavior close to that of the old JIT doable with Orc? If so, is it doable from C? If not, what would it take to make it happen?