This is an archive of the discontinued LLVM Phabricator instance.

LLVM C DIBuilder Creation APIs
ClosedPublic

Authored by harlanhaskins on Apr 21 2017, 12:40 PM.

Details

Summary

This is an initial set of LLVM DIBuilder APIs that can create builders, compile units, files, and scopes, and introspect the debug info inside modules.

Diff Detail

Repository
rL LLVM

Event Timeline

harlanhaskins created this revision.Apr 21 2017, 12:40 PM

arc...picked some wrong branches here. I'll fix this diff ๐Ÿ˜…

Fix merge head

Fix 80-column limit for LLVMDWARFSourceLanguage

Add back remanining files (sorry for the email spam!)

deadalnix edited edge metadata.Apr 21 2017, 12:48 PM

Hi, you can put "HEAD^" in the file ".git/arc/default-relative-commit" , this ensure that only the comit you are currently on is sent to phabricator.

whitequark added inline comments.Apr 21 2017, 12:50 PM
lib/IR/DebugInfo.cpp
694 โ†—(On Diff #96208)

I believe our policy for all new LLVM-C code is to use pointer/length pairs for all array arguments, including strings.

harlanhaskins added inline comments.Apr 21 2017, 12:51 PM
lib/IR/DebugInfo.cpp
694 โ†—(On Diff #96208)

Ah! Yes, I had forgotten. Thanks!

  • Fix 80-column limit for LLVMDWARFSourceLanguage
  • Pass string parameters as pointer/length pairs

Ideally we'd like to have some test with this. The way it is usually done is to use the echo test: it reads a module using the C API and then recreate another identical module using the same C API. This ensure end to end testing. However this doesn't provide any reading facility so that me be more work than you are willing to put into this.

include/llvm-c/DebugInfo.h
63 โ†—(On Diff #96208)

Maybe having 2 function rather than a bool in the API that end up being a uint8_t ?

65 โ†—(On Diff #96208)

The doc should specify if it finalize or not when disposing.

90 โ†—(On Diff #96208)

DWOId yields notthing relevant on Google. Maybe this needs to be explained a bit more.

106 โ†—(On Diff #96208)

It hasn't been enforced consistently int he past, so that may not be immediately obvious, but we usually pass a pointer and a length for string in the newer API. This avoid useless calls to strlen and works even if the string isn't zero terminated which is a plus. Keep in mind that the C API is often used to bind to other languages, so while this may not be the most idiomatic thing to do in C, this ends up fitting our use case better.

lib/IR/DebugInfo.cpp
668 โ†—(On Diff #96208)

There is a facility to unwrap subclasses. You can see LLVMValueRef for how it is done. You shouldn't need to write this.

harlanhaskins added inline comments.Apr 21 2017, 1:17 PM
include/llvm-c/DebugInfo.h
65 โ†—(On Diff #96208)

This begs the question: Should it just call finalize itself?

106 โ†—(On Diff #96208)

This has been fixed in a later revision ๐Ÿ‘

lib/IR/DebugInfo.cpp
668 โ†—(On Diff #96208)

Thanks! I was going off of the Rust implementation initially.

harlanhaskins added inline comments.Apr 21 2017, 1:21 PM
include/llvm-c/DebugInfo.h
90 โ†—(On Diff #96208)

I'll be completely honest here, this has stumped me for a while. I don't think there's even a facility within the wrappers to create a "split skeleton" compile unit...

aprantl added inline comments.Apr 21 2017, 1:30 PM
include/llvm-c/DebugInfo.h
10 โ†—(On Diff #96213)

Can you doxygenify this comment, too?

52 โ†—(On Diff #96213)

We have autobrief enabled in the doxygen configuration. Please don't use \brief any more.

aprantl edited edge metadata.Apr 21 2017, 1:37 PM

Could you also add some unit tests for the new interface?

Wallbraker requested changes to this revision.Apr 23 2017, 2:24 PM

Hey cool that you are doing this work, finally!

See inline comments (also click the done tag on the comments when those are fixed, please).

Regarding the tests, only since this API is only about creating debug info I'm okay with just a creation test, this API already have a special status.

include/llvm-c/DebugInfo.h
106 โ†—(On Diff #96208)

Please use size_t for the the string sizes, no need to go to 64bit on 32bit systems.

This revision now requires changes to proceed.Apr 23 2017, 2:24 PM
harlanhaskins edited edge metadata.
  • Fix 80-column limit for LLVMDWARFSourceLanguage
  • Address comments
harlanhaskins marked 9 inline comments as done.May 19 2017, 11:00 AM
  • uint64_t -> size_t for string lengths
harlanhaskins marked an inline comment as done.May 19 2017, 11:13 AM
  • Add simple debuginfo tests
  • Fix unused variable warning

This is looking good and I like where this is going. I have a few nits in comments.

Where I'm more doubtful is for the test. The way we've tested the C API in the past is by reading a module and then re-emiting a clone of it on the output. This patch doesn't have the ability to read debug infos, so that's a bit compromised. One way forward is obviously to add reading support for the elements we add emission support from. I'm not sure how much work that would be, so I'm not sure how practical this is within the scope of that diff. If that's too hard then an alternative should be chosen, like emitting a given module and checking its output. This is where this patch is going, however, it doesn't looks like this patch actually checks the output so that's a problem.

include/llvm-c/DebugInfo.h
30 โ†—(On Diff #99612)

These id needs to persist across versions, and this is not ensuring this. You can look at other enums in Core.h to see how it is done. The basic idea is to list all the enum values here, and then use the macro option to generate the mapping between exposed values and internal llvm values.

57 โ†—(On Diff #99612)

Please use LLVMBool

108 โ†—(On Diff #99612)

LLVMBool for flags.

whitequark requested changes to this revision.May 19 2017, 12:39 PM
whitequark added inline comments.
include/llvm-c/DebugInfo.h
103 โ†—(On Diff #99612)

size_t ProducerLen

104 โ†—(On Diff #99612)

LLVMBool IsOptimized

107 โ†—(On Diff #99612)

Are these last two also booleans? LLVMBool too then

128 โ†—(On Diff #99612)

What's the value for "line/column not available"?

lib/IR/DebugInfo.cpp
708 โ†—(On Diff #99612)

Ditto

709 โ†—(On Diff #99612)

LLVMBool IsOptimized

This revision now requires changes to proceed.May 19 2017, 12:39 PM

(Having unit tests will avoid

include/llvm-c/DebugInfo.h
128 โ†—(On Diff #99612)

Compiler-generated locations (which are different from empty locations) are using line 0.

harlanhaskins edited edge metadata.
  • uint8_t -> LLVMBool for flags
  • ProducerLen should be size_t
harlanhaskins marked 7 inline comments as done.May 20 2017, 11:52 AM
harlanhaskins added inline comments.
include/llvm-c/DebugInfo.h
128 โ†—(On Diff #99612)

Should this be documented?

  • Fix 80-column limit for LLVMDWARFSourceLanguage
  • Address comments
  • uint64_t -> size_t for string lengths
  • Add simple debuginfo tests
  • Fix unused variable warning
  • uint8_t -> LLVMBool for flags
  • ProducerLen should be size_t
  • Add test and update CMakeLists to build debuginfo.c
whitequark added inline comments.Jun 2 2017, 11:39 PM
include/llvm-c/DebugInfo.h
50 โ†—(On Diff #101308)

Why is LLVMDebugMetadataVersion returning uint32_t and LLVMGetModuleDebugMetadataVersion returning unsigned?

harlanhaskins added inline comments.Jun 3 2017, 1:56 PM
include/llvm-c/DebugInfo.h
50 โ†—(On Diff #101308)

Just because of the types of the things they call into. Should they both be unsigned?

deadalnix added inline comments.Jun 4 2017, 2:21 PM
include/llvm-c/DebugInfo.h
50 โ†—(On Diff #101308)

I think it is better to use well defined types in the C API, so other language are not exposed to the C madness where nobody knows what an integral really is. It wasn't done consistently in the API in the past, but I think this is a good habit we should take.

whitequark added inline comments.Jun 4 2017, 2:26 PM
include/llvm-c/DebugInfo.h
50 โ†—(On Diff #101308)

That works

50 โ†—(On Diff #101308)

I disagree. Anyone building a binding already has to know what unsigned is, and using uint32_t here just makes it easier to accidentally break ABI.

deadalnix added inline comments.Jun 5 2017, 10:39 AM
include/llvm-c/DebugInfo.h
50 โ†—(On Diff #101308)

Can you elaborate ?

whitequark added inline comments.Jun 5 2017, 10:59 AM
include/llvm-c/DebugInfo.h
50 โ†—(On Diff #101308)

The C API uses unsigned practically everywhere, and that's not changing, so anything that binds to it will have to know what ABI width unsigned has. And having some functions use unsigned and some uint32_t for parameters that have the same underlying semantics is just a recipe for a footgun when going between 32 and 64-bit builds...

deadalnix added inline comments.Jun 5 2017, 11:14 AM
include/llvm-c/DebugInfo.h
50 โ†—(On Diff #101308)

This 32 bits on windows, linux and OSX both in 32bits build and 64bits builds. It is 64bits on solaris and some other obscures UNIXes. I think it's very much worth it moving everything to uint32_t (unless we have a lot of solaris 64bits users of the C API).

whitequark added inline comments.Jun 5 2017, 11:21 AM
include/llvm-c/DebugInfo.h
50 โ†—(On Diff #101308)
  1. This should be done in a separate patch.
  2. I am not as convinced as you that it is completely safe and I think this should be discussed on ML. This has the potential to screw binding authors quite badly if it turns out that they *do* indeed have to support those systems.
deadalnix added inline comments.Jun 5 2017, 11:23 AM
include/llvm-c/DebugInfo.h
50 โ†—(On Diff #101308)

I won't block the diff for this. unsigned it is then.

  • Fix 80-column limit for LLVMDWARFSourceLanguage
  • Address comments
  • uint64_t -> size_t for string lengths
  • Add simple debuginfo tests
  • Fix unused variable warning
  • uint8_t -> LLVMBool for flags
  • ProducerLen should be size_t
  • Add test and update CMakeLists to build debuginfo.c
  • Fix test
  • Use unsigned instead of explicitly sized unsigned integer types
harlanhaskins added inline comments.Jun 21 2017, 10:25 PM
include/llvm-c/DebugInfo.h
128 โ†—(On Diff #99612)

Really, the question is what _is_ the canonical value? Should we explicitly state 0 for line/column in the documentation or let implementers decide their sentinels?

harlanhaskins edited the summary of this revision. (Show Details)Jun 21 2017, 10:30 PM

I have one more comment. It's looking good.

include/llvm-c/DebugInfo.h
30 โ†—(On Diff #103521)

These 2 needs to have stable values. You can look in Core.h how it is done for instruction types for instance.

harlanhaskins marked 10 inline comments as done.Jun 22 2017, 10:26 AM
harlanhaskins added inline comments.
include/llvm-c/DebugInfo.h
30 โ†—(On Diff #103521)

I'm still unsure about this. Should they mirror the values defined in DINode::DIFlags? What happens if that gains or loses a case? What happens if their raw bit mask values change?

echristo added inline comments.Jun 22 2017, 2:02 PM
include/llvm-c/DebugInfo.h
128 โ†—(On Diff #99612)

From the dwarf standard:
"The compiler may emit the value 0 in cases where
an instruction cannot be attributed to any source
line."
So yes, that's the canonical "I have no idea where this is" value.

harlanhaskins added inline comments.Jun 22 2017, 2:52 PM
include/llvm-c/DebugInfo.h
128 โ†—(On Diff #99612)

Thank you! Alright, I'll add that to the comment..

  • Add comment explaining 0 for debug locations
harlanhaskins marked 6 inline comments as done.Jun 22 2017, 3:46 PM
deadalnix added inline comments.Jun 22 2017, 4:13 PM
include/llvm-c/DebugInfo.h
30 โ†—(On Diff #103521)

The way it is done in the C API is that you make the enum in C without using macro codegen. Then you write a function that internally map the C enum value to the C++ enum value. To do so, it uses a switch which cases are generated with the macro codegen technique. The the API implementation uses this function internally.

See https://reviews.llvm.org/diffusion/L/browse/llvm/trunk/lib/IR/Core.cpp;306068$1085-1103

CodaFi added inline comments.Sep 29 2017, 7:26 PM
include/llvm-c/DebugInfo.h
30 โ†—(On Diff #103521)

Are you sure this is a good idea? For the other enums, this makes sense because they have values that aren't subject to change and have a manageable set of cases (less than 20). DWARF is going to continue to evolve over the years and LLVM's coverage of the API includes 96 tags and 200+ attributes, each of which needs to be handled by the requested mapping functions. This doesn't seem like a reasonable thing to try to maintain by hand.

harlanhaskins updated this revision to Diff 117266.EditedSep 30 2017, 3:12 PM
  • Added explicit enum definitions for C API DWARF values (Change made by @CodaFi)
aprantl added inline comments.Oct 2 2017, 11:06 AM
include/llvm-c/DebugInfo.h
30 โ†—(On Diff #103521)

I get that we don't want macros in the installed header files to make it easier for FFIs and other interface generators, but why can't we run the preprocessor on the file as a custom build step?

@whitequark Think this is good to merge in its current state? I have another patch for more functions Moreno or less ready to go once this is in.

include/llvm-c/DebugInfo.h
30 โ†—(On Diff #103521)

So I've updated this revision to explicitly lay out these enums and provide a mapping. Thoughts, @deadalnix?

Sure, go ahead and commit.

I don't have commit access -- would you mind merging it, @whitequark?

Please rebase the patch, it doesn't apply as-is.

wq - Fix 80-column limit for LLVMDWARFSourceLanguage

  • Address comments
  • uint64_t -> size_t for string lengths
  • Add simple debuginfo tests
  • Fix unused variable warning
  • uint8_t -> LLVMBool for flags
  • ProducerLen should be size_t
  • Add test and update CMakeLists to build debuginfo.c
  • Fix test
  • Use unsigned instead of explicitly sized unsigned integer types
  • Add comment explaining 0 for debug locations
  • Expand enum definitions and translate to DI (#1)

Changes rebased on master.

This revision was automatically updated to reflect the committed changes.