This is an archive of the discontinued LLVM Phabricator instance.

[llvm-c] Improve IR Introspection: Add LLVM{Get,Set}ModuleIdentifier
ClosedPublic

Authored by ubsan on Apr 2 2016, 9:49 PM.

Diff Detail

Repository
rL LLVM

Event Timeline

ubsan updated this revision to Diff 52490.Apr 2 2016, 9:49 PM
ubsan retitled this revision from to [llvm-c] Improve IR Introspection: Add LLVM{Get,Set}ModuleIdentifier.
ubsan updated this object.
ubsan added a reviewer: whitequark.
ubsan added a subscriber: llvm-commits.
whitequark added inline comments.Apr 3 2016, 3:44 AM
tools/llvm-c-test/echo.cpp
221 ↗(On Diff #52490)

Why is this in clone_constant? It should be in llvm_echo below (and only executed once...)

whitequark requested changes to this revision.Apr 3 2016, 4:20 AM
whitequark edited edge metadata.
This revision now requires changes to proceed.Apr 3 2016, 4:20 AM
Wallbraker requested changes to this revision.Apr 3 2016, 4:47 AM
Wallbraker added a reviewer: Wallbraker.
Wallbraker added a subscriber: Wallbraker.
Wallbraker added inline comments.
include/llvm-c/Core.h
483–496 ↗(On Diff #52490)

Please add a length parameter for both functions. Not all languages keep their string zero terminated and keep track of the length. For C StringRef will call strlen so the cost is the same.

char *LLVMGetModuleIdentifier(LLVMModuleRef M, size_t *Length);
void LLVMSetModuleIdentifier(LLVMModuleRef M, const char *ID, size_t Length);
whitequark added inline comments.Apr 3 2016, 4:51 AM
include/llvm-c/Core.h
483–496 ↗(On Diff #52490)

No other function in llvm-c currently does this. I don't see how the nonexistent benefit of being able to add a non-zero-terminated string for a human-readable identifier overweighs the drawback of making the API inconsistent.

Wallbraker accepted this revision.Apr 3 2016, 5:08 AM
Wallbraker edited edge metadata.
Wallbraker added inline comments.
include/llvm-c/Core.h
483–496 ↗(On Diff #52490)

Nitpicking: LLVMConstString* functions does include a Length argument, as does the new functions in D18727.

I have to agree it would be inconsistent, I will retract my comment and refer the issue to other stake holders in the C API.

deadalnix added inline comments.
include/llvm-c/Core.h
483–496 ↗(On Diff #52490)

Actually, several function in C API do that (for instance LLVMGetAsString and LLVMGetMDString), while other use zero terminated strings. The current state is rather inconsistent. If we are going to make it consistent, I'd go for the Length as out parameter.

deadalnix requested changes to this revision.Apr 3 2016, 2:29 PM
deadalnix added a reviewer: deadalnix.
deadalnix added inline comments.
tools/llvm-c-test/echo.cpp
221 ↗(On Diff #52490)

This has nothing to do here.

whitequark added inline comments.Apr 3 2016, 2:31 PM
include/llvm-c/Core.h
483–496 ↗(On Diff #52490)

That's reasonable. Nicole, can you please change the API as suggested?

ubsan added inline comments.Apr 3 2016, 2:47 PM
include/llvm-c/Core.h
483–496 ↗(On Diff #52490)

So, both zero terminated string and string with length as out param? Or just the latter?

whitequark added inline comments.Apr 3 2016, 2:48 PM
include/llvm-c/Core.h
483–496 ↗(On Diff #52490)

No, just the latter, it doesn't make a lot of sense to have both versions of this API when we agree that the string+length variant is the way it should all look eventually.

ubsan added inline comments.Apr 3 2016, 2:51 PM
include/llvm-c/Core.h
483–496 ↗(On Diff #52490)

Great, will change.

ubsan added inline comments.Apr 3 2016, 2:59 PM
include/llvm-c/Core.h
483–496 ↗(On Diff #52490)

Should it take an unsigned, or a size_t? A size_t would be more in keeping with what is actually meant by "length", but unsigned is more consistent with the rest of llvm.

ubsan updated this revision to Diff 52517.Apr 3 2016, 4:31 PM
ubsan edited edge metadata.
deadalnix added inline comments.Apr 3 2016, 6:42 PM
include/llvm-c/Core.h
483–505 ↗(On Diff #52517)

I think you are right, size_t is the correct type here, while I don't think it makes a big difference in practice (I don't expect anyone to get anywhere near 4G of description for a module).

deadalnix added inline comments.Apr 3 2016, 7:00 PM
lib/IR/Core.cpp
161–162 ↗(On Diff #52517)

A string should know its length already, so taking the pointer and then computing strlen is wasteful. It is also not needed to strdup. It is common practice in the C API to return string the caller don't have ownership of. If caller has ownership of something, then a LLVMDisposeXXX method is provided. In this case, I don't think this is necessary.

auto &Str = unwrap(M)->getModuleIdentifier()
*Length = Str.length();
return Str.c_str();

Also, just take a size_t for length so yu don't need to assert. I see no reason why someone would have such a large identifier that it matters, but, on the other hand, I see no reason to prevent it.

deadalnix added inline comments.Apr 3 2016, 7:04 PM
tools/llvm-c-test/echo.cpp
881 ↗(On Diff #52517)

What happens if you read the module identifier from src and copy it to dst ? That would avoid having to hardcode a module name as it is done now.

ubsan added inline comments.Apr 3 2016, 7:21 PM
tools/llvm-c-test/echo.cpp
881 ↗(On Diff #52517)

How do you mean?

ubsan added inline comments.Apr 3 2016, 7:22 PM
include/llvm-c/Core.h
483–505 ↗(On Diff #52517)

However, everything else uses unsigned for length.

lib/IR/Core.cpp
161–162 ↗(On Diff #52517)

Will do.

ubsan updated this revision to Diff 52520.EditedApr 3 2016, 9:26 PM
ubsan edited edge metadata.

Return a const string instead.

deadalnix added inline comments.Apr 3 2016, 10:12 PM
lib/IR/Core.cpp
161–162 ↗(On Diff #52520)

As said, just use size_t and get rid of the assert.

tools/llvm-c-test/echo.cpp
881 ↗(On Diff #52520)

This test take a module as input, read it, and write a copy in another module. That allow to test read and write operations.

Reading the identifier from the Src module writing it in the Dst module seems like the way to do that.

ubsan updated this revision to Diff 52576.Apr 4 2016, 10:14 AM

Follow deadalnix's suggestions.

deadalnix accepted this revision.Apr 4 2016, 10:26 AM
deadalnix edited edge metadata.
whitequark accepted this revision.Apr 4 2016, 10:31 AM
whitequark edited edge metadata.
This revision is now accepted and ready to land.Apr 4 2016, 10:31 AM
whitequark added inline comments.Apr 4 2016, 12:58 PM
lib/IR/Core.cpp
167 ↗(On Diff #52576)

Actually, can you call StringRef constructor here? I'm pretty sure I've never seen brace-initializer style anywhere else in LLVM.

ubsan added inline comments.Apr 4 2016, 3:27 PM
lib/IR/Core.cpp
167 ↗(On Diff #52576)

Yeah, I can. I will change it when I can.

ubsan updated this revision to Diff 52658.Apr 4 2016, 11:47 PM
ubsan edited edge metadata.

Switch to explicit constructor.

Looks good, thanks for all the work!

This revision was automatically updated to reflect the committed changes.