This is an archive of the discontinued LLVM Phabricator instance.

[Host] Add LibraryLoader abstraction around dlopen/LoadLibrary
AbandonedPublic

Authored by JDevlieghere on Mar 26 2019, 4:41 PM.

Details

Summary

Since there's no dlopen on Windows, we need a cross platform abstraction. The LibraryLoader is exactly that.

I've kept it as simple as possible for now.

As per usual I don't have access to a Windows machine. If anybody could test this before we check this in (looks at @zturner and @stella.stamenova) that would be great! ;-)

Diff Detail

Event Timeline

JDevlieghere created this revision.Mar 26 2019, 4:41 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 26 2019, 4:41 PM
Herald added a subscriber: mgorny. · View Herald Transcript

The Windows code looks correct to me, but I've added a couple inline questions/comments.

I can patch it in and test it for you tomorrow morning if nobody else has done it by then. I think zturner's still on vacation for another day or two.

lldb/include/lldb/Host/LibraryLoader.h
18

I'm curious why anyone would choose close = false

18

Why const char * here when the GetSymbol method takes a StringRef?

21

Probably should be a const method.

This will work on Windows, since we know this handle is from LoadLibrary. Some other Windows handles can signal an error with INVALID_HANDLE_VALUE rather than a null pointer, but LoadLibrary doesn't. If you wanted to be extra careful, you could delegate the validity check to the concrete type, but that doesn't seem necessary and it looks like we could change it later without affecting the interface.

lldb/source/Host/windows/LibraryLoader.cpp
18

Since you're passing a char string to LoadLibrary, you might want to explicitly call LoadLibraryA (note the -A suffix).

I know that we generally don't support Unicode well in filenames throughout the LLVM system. If you know the encoded string is UTF-8, you could convert it to UTF-16 and call LoadLibraryW. If you leave it to LoadLibraryA, it'll use the user's current default code page, which almost certainly is not UTF-8.

It isn't safe to pass "data()" of a StringRef to dlsym.

Also, it's a little weird that the library name is a "const char *" but the symbol name is a StringRef?

JDevlieghere marked 4 inline comments as done.

Thanks for the review, Adrian!

lldb/include/lldb/Host/LibraryLoader.h
18

I changed it so I could pass a nullptr in the unittest.

18

I'm not sure about the semantics of "closing" the library. For the python plugin we call initialize once, so it's fine that the symbol is gone, as long as the library remains in memory.

It isn't safe to pass "data()" of a StringRef to dlsym.

Why?

jingham added a comment.EditedMar 26 2019, 5:20 PM

It isn't safe to pass "data()" of a StringRef to dlsym.

Why?

StringRef's aren't guaranteed to be NULL terminated. Or rather, the Data that gets returned by the data() method is just the raw pointer in the StringRef, which isn't guaranteed to be NULL terminated.

It isn't safe to pass "data()" of a StringRef to dlsym.

Why?

StringRef's aren't guaranteed to be NULL terminated. Or rather, the Data that gets returned by the data() method is just the raw pointer in the StringRef, which isn't guaranteed to be NULL terminated.

While true, I don't think this is a good reason to make the interface accept a const char*, because that means that the interface is exposing an implementation detail. Instead, the interface should accept a StringRef, and the implementation can pass foo.str().c_str(). It might be slightly less efficient (almost to the point of minutia), but this isn't really performance sensitive code, as loading the library will dominate the call to malloc / memcpy.

It isn't safe to pass "data()" of a StringRef to dlsym.

Why?

StringRef's aren't guaranteed to be NULL terminated. Or rather, the Data that gets returned by the data() method is just the raw pointer in the StringRef, which isn't guaranteed to be NULL terminated.

While true, I don't think this is a good reason to make the interface accept a const char*, because that means that the interface is exposing an implementation detail. Instead, the interface should accept a StringRef, and the implementation can pass foo.str().c_str(). It might be slightly less efficient (almost to the point of minutia), but this isn't really performance sensitive code, as loading the library will dominate the call to malloc / memcpy.

I don't actually care one way or the other about using a StringRef here (though I still think it is odd to have the library be a char * and the symbol a StringRef...). I was only concerned that you can't pass the data() of a StringRef to anything that expects a c-string.

JDevlieghere marked an inline comment as done.

data() -> str().c_str()

Is there a reason you can't reuse/extend DynamicLibrary from LLVMSupport?

labath requested changes to this revision.Mar 27 2019, 12:38 AM

Is there a reason you can't reuse/extend DynamicLibrary from LLVMSupport?

+1. It sounds like you don't need anything fancy here, and this functionality should already be available in llvm.

Also, this is not relevant here, but as I think I know where this is heading, I feel like I should point out that the API surface shared with the loaded library will need to be annotated with __declspec(dllimport/export) in order for things to work on windows. If you're going to be dlopening libpython (that's how I'd probably do it, but it wouldn't give us real "plugins"), then this should already be taken care of. However, if you want to dlopen some lldb custom library, then any stuff that crosses the dll boundary will probably need to be anotated (this may include llvm classes like StringRef etc.)...

This revision now requires changes to proceed.Mar 27 2019, 12:38 AM
JDevlieghere abandoned this revision.Mar 27 2019, 7:55 AM

Apparently Phabricator is again not importing e-mail replies. As I said per e-mail, I didn't know about the existence of DynamicLibrary in LLVM and only checked LLDB for prior art.