This is an initial set of LLVM DIBuilder APIs that can create builders, compile units, files, and scopes, and introspect the debug info inside modules.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
Hi, you can put "HEAD^" in the file ".git/arc/default-relative-commit" , this ensure that only the comit you are currently on is sent to phabricator.
lib/IR/DebugInfo.cpp | ||
---|---|---|
694 โ | (On Diff #96208) | I believe our policy for all new LLVM-C code is to use pointer/length pairs for all array arguments, including strings. |
lib/IR/DebugInfo.cpp | ||
---|---|---|
694 โ | (On Diff #96208) | Ah! Yes, I had forgotten. Thanks! |
- Fix 80-column limit for LLVMDWARFSourceLanguage
- Pass string parameters as pointer/length pairs
Ideally we'd like to have some test with this. The way it is usually done is to use the echo test: it reads a module using the C API and then recreate another identical module using the same C API. This ensure end to end testing. However this doesn't provide any reading facility so that me be more work than you are willing to put into this.
include/llvm-c/DebugInfo.h | ||
---|---|---|
63 โ | (On Diff #96208) | Maybe having 2 function rather than a bool in the API that end up being a uint8_t ? |
65 โ | (On Diff #96208) | The doc should specify if it finalize or not when disposing. |
90 โ | (On Diff #96208) | DWOId yields notthing relevant on Google. Maybe this needs to be explained a bit more. |
106 โ | (On Diff #96208) | It hasn't been enforced consistently int he past, so that may not be immediately obvious, but we usually pass a pointer and a length for string in the newer API. This avoid useless calls to strlen and works even if the string isn't zero terminated which is a plus. Keep in mind that the C API is often used to bind to other languages, so while this may not be the most idiomatic thing to do in C, this ends up fitting our use case better. |
lib/IR/DebugInfo.cpp | ||
668 โ | (On Diff #96208) | There is a facility to unwrap subclasses. You can see LLVMValueRef for how it is done. You shouldn't need to write this. |
include/llvm-c/DebugInfo.h | ||
---|---|---|
90 โ | (On Diff #96208) | I'll be completely honest here, this has stumped me for a while. I don't think there's even a facility within the wrappers to create a "split skeleton" compile unit... |
Hey cool that you are doing this work, finally!
See inline comments (also click the done tag on the comments when those are fixed, please).
Regarding the tests, only since this API is only about creating debug info I'm okay with just a creation test, this API already have a special status.
include/llvm-c/DebugInfo.h | ||
---|---|---|
106 โ | (On Diff #96208) | Please use size_t for the the string sizes, no need to go to 64bit on 32bit systems. |
This is looking good and I like where this is going. I have a few nits in comments.
Where I'm more doubtful is for the test. The way we've tested the C API in the past is by reading a module and then re-emiting a clone of it on the output. This patch doesn't have the ability to read debug infos, so that's a bit compromised. One way forward is obviously to add reading support for the elements we add emission support from. I'm not sure how much work that would be, so I'm not sure how practical this is within the scope of that diff. If that's too hard then an alternative should be chosen, like emitting a given module and checking its output. This is where this patch is going, however, it doesn't looks like this patch actually checks the output so that's a problem.
include/llvm-c/DebugInfo.h | ||
---|---|---|
30 โ | (On Diff #99612) | These id needs to persist across versions, and this is not ensuring this. You can look at other enums in Core.h to see how it is done. The basic idea is to list all the enum values here, and then use the macro option to generate the mapping between exposed values and internal llvm values. |
57 โ | (On Diff #99612) | Please use LLVMBool |
108 โ | (On Diff #99612) | LLVMBool for flags. |
include/llvm-c/DebugInfo.h | ||
---|---|---|
103 โ | (On Diff #99612) | size_t ProducerLen |
104 โ | (On Diff #99612) | LLVMBool IsOptimized |
107 โ | (On Diff #99612) | Are these last two also booleans? LLVMBool too then |
128 โ | (On Diff #99612) | What's the value for "line/column not available"? |
lib/IR/DebugInfo.cpp | ||
708 โ | (On Diff #99612) | Ditto |
709 โ | (On Diff #99612) | LLVMBool IsOptimized |
(Having unit tests will avoid
include/llvm-c/DebugInfo.h | ||
---|---|---|
128 โ | (On Diff #99612) | Compiler-generated locations (which are different from empty locations) are using line 0. |
include/llvm-c/DebugInfo.h | ||
---|---|---|
128 โ | (On Diff #99612) | Should this be documented? |
- Fix 80-column limit for LLVMDWARFSourceLanguage
- Address comments
- uint64_t -> size_t for string lengths
- Add simple debuginfo tests
- Fix unused variable warning
- uint8_t -> LLVMBool for flags
- ProducerLen should be size_t
- Add test and update CMakeLists to build debuginfo.c
include/llvm-c/DebugInfo.h | ||
---|---|---|
50 โ | (On Diff #101308) | Why is LLVMDebugMetadataVersion returning uint32_t and LLVMGetModuleDebugMetadataVersion returning unsigned? |
include/llvm-c/DebugInfo.h | ||
---|---|---|
50 โ | (On Diff #101308) | Just because of the types of the things they call into. Should they both be unsigned? |
include/llvm-c/DebugInfo.h | ||
---|---|---|
50 โ | (On Diff #101308) | I think it is better to use well defined types in the C API, so other language are not exposed to the C madness where nobody knows what an integral really is. It wasn't done consistently in the API in the past, but I think this is a good habit we should take. |
include/llvm-c/DebugInfo.h | ||
---|---|---|
50 โ | (On Diff #101308) | Can you elaborate ? |
include/llvm-c/DebugInfo.h | ||
---|---|---|
50 โ | (On Diff #101308) | The C API uses unsigned practically everywhere, and that's not changing, so anything that binds to it will have to know what ABI width unsigned has. And having some functions use unsigned and some uint32_t for parameters that have the same underlying semantics is just a recipe for a footgun when going between 32 and 64-bit builds... |
include/llvm-c/DebugInfo.h | ||
---|---|---|
50 โ | (On Diff #101308) | This 32 bits on windows, linux and OSX both in 32bits build and 64bits builds. It is 64bits on solaris and some other obscures UNIXes. I think it's very much worth it moving everything to uint32_t (unless we have a lot of solaris 64bits users of the C API). |
include/llvm-c/DebugInfo.h | ||
---|---|---|
50 โ | (On Diff #101308) |
|
include/llvm-c/DebugInfo.h | ||
---|---|---|
50 โ | (On Diff #101308) | I won't block the diff for this. unsigned it is then. |
- Fix 80-column limit for LLVMDWARFSourceLanguage
- Address comments
- uint64_t -> size_t for string lengths
- Add simple debuginfo tests
- Fix unused variable warning
- uint8_t -> LLVMBool for flags
- ProducerLen should be size_t
- Add test and update CMakeLists to build debuginfo.c
- Fix test
- Use unsigned instead of explicitly sized unsigned integer types
include/llvm-c/DebugInfo.h | ||
---|---|---|
128 โ | (On Diff #99612) | Really, the question is what _is_ the canonical value? Should we explicitly state 0 for line/column in the documentation or let implementers decide their sentinels? |
I have one more comment. It's looking good.
include/llvm-c/DebugInfo.h | ||
---|---|---|
30 โ | (On Diff #103521) | These 2 needs to have stable values. You can look in Core.h how it is done for instruction types for instance. |
include/llvm-c/DebugInfo.h | ||
---|---|---|
30 โ | (On Diff #103521) | I'm still unsure about this. Should they mirror the values defined in DINode::DIFlags? What happens if that gains or loses a case? What happens if their raw bit mask values change? |
include/llvm-c/DebugInfo.h | ||
---|---|---|
128 โ | (On Diff #99612) | From the dwarf standard: |
include/llvm-c/DebugInfo.h | ||
---|---|---|
128 โ | (On Diff #99612) | Thank you! Alright, I'll add that to the comment.. |
include/llvm-c/DebugInfo.h | ||
---|---|---|
30 โ | (On Diff #103521) | The way it is done in the C API is that you make the enum in C without using macro codegen. Then you write a function that internally map the C enum value to the C++ enum value. To do so, it uses a switch which cases are generated with the macro codegen technique. The the API implementation uses this function internally. See https://reviews.llvm.org/diffusion/L/browse/llvm/trunk/lib/IR/Core.cpp;306068$1085-1103 |
include/llvm-c/DebugInfo.h | ||
---|---|---|
30 โ | (On Diff #103521) | Are you sure this is a good idea? For the other enums, this makes sense because they have values that aren't subject to change and have a manageable set of cases (less than 20). DWARF is going to continue to evolve over the years and LLVM's coverage of the API includes 96 tags and 200+ attributes, each of which needs to be handled by the requested mapping functions. This doesn't seem like a reasonable thing to try to maintain by hand. |
include/llvm-c/DebugInfo.h | ||
---|---|---|
30 โ | (On Diff #103521) | I get that we don't want macros in the installed header files to make it easier for FFIs and other interface generators, but why can't we run the preprocessor on the file as a custom build step? |
@whitequark Think this is good to merge in its current state? I have another patch for more functions Moreno or less ready to go once this is in.
include/llvm-c/DebugInfo.h | ||
---|---|---|
30 โ | (On Diff #103521) | So I've updated this revision to explicitly lay out these enums and provide a mapping. Thoughts, @deadalnix? |
wq - Fix 80-column limit for LLVMDWARFSourceLanguage
- Address comments
- uint64_t -> size_t for string lengths
- Add simple debuginfo tests
- Fix unused variable warning
- uint8_t -> LLVMBool for flags
- ProducerLen should be size_t
- Add test and update CMakeLists to build debuginfo.c
- Fix test
- Use unsigned instead of explicitly sized unsigned integer types
- Add comment explaining 0 for debug locations
- Expand enum definitions and translate to DI (#1)