This is an archive of the discontinued LLVM Phabricator instance.

Add userData to the diagnostic handler C API
ClosedPublic

Authored by GeorgeLyon on Nov 18 2020, 1:07 PM.

Details

Summary

Previously, there was no way to add context to the diagnostic engine via the C API

Diff Detail

Event Timeline

GeorgeLyon created this revision.Nov 18 2020, 1:07 PM
GeorgeLyon requested review of this revision.Nov 18 2020, 1:07 PM
GeorgeLyon edited the summary of this revision. (Show Details)Nov 18 2020, 1:22 PM
GeorgeLyon added reviewers: ftynse, stellaraccident.
mehdi_amini accepted this revision.Nov 18 2020, 5:24 PM
mehdi_amini added inline comments.
mlir/include/mlir-c/Diagnostics.h
83

Please document the userData here and above

This revision is now accepted and ready to land.Nov 18 2020, 5:24 PM

Added deleteUserData callback

@mehdi_amini I added one more bit of functionality here, so you may want to take another look.

Can you expand a bit on the use-case for deleteUserData? When is the user not in control of the lifetime? This seems all kind of tied to the context in the end?

If a host language is registering dialect handlers that may need to be cleaned up (for instance, via a reference count decrement), it theoretically _could_ keep a list of attached handlers and manually release them on detach or when the context is destroyed. deleteUserData is a simple way to eliminate this bit of bookkeeping and make "ownership" a bit more formal in this case, namely the context is responsible for telling you when it is done using the handler (and thus, the userData). This way, if we hypothetically add a setHandlers method which takes an array of handles and _replaces_ the current handler stack, then the host language wouldn't need to account for this, deleteUserData would just be called when the handler is no longer in use.

@stellaraccident / @ftynse can you double check on the deleteUserData part and the shared_ptr?

Shared pointer logic looks fine: the lambda takes it by-copy and is stored internally in std::function, so the data remains live.

I am not entirely convinced that we want to burden the C API with additional memory management facilities though. It's C after all, so I don't see a problem with requiring the caller to manage their data. MLIR will not remove the handler by itself. So the user of C API can be reasonably expected to clean their data if they remove the handler, or after calling mlirContextFree.

mlir/lib/CAPI/IR/Diagnostics.cpp
60

Nit: elide trivial braces

mlir/test/CAPI/ir.c
424–425

This change is irrelevant and weakens the test (CHECK only looks at the current line of the comment).

915–922

This change is irrelevant. Consider using git clang-format that would only touch the diff.

Address feedback

GeorgeLyon added a comment.EditedNov 20 2020, 10:25 AM

I have worked with many C APIs in a number of languages and the attitude of "just do the memory management bookkeeping in the host language" has created some very unfortunate complexity in the bindings and more than a few bugs. This is a very low-cost change for the C bindings, and doesn't add much complexity for language bindings that don't care about this functionality (they can just pass NULL to the extra arguments). I think this is also a win from a self-documentation standpoint: before, I would need to go and figure out what are the different ways diagnostic handlers can be unregistered to understand the semantics. I would also need to worry about concurrency, specifically whether or not my handler is guaranteed to not be called after detach returns. I could read the code to figure this out (it turns out, the answer is "no") but I think it is better to minimize this sort of spelunking and just say "we will call your handler and when we are done calling your handler we will call your callback".

Also, if I remember correctly the // CHECK lines used to be surrounded by // clang-format off/on. Is there a reason this changed? I used to run clang-format on save in my editor and it doesn't look like git clang-format has support for this type of flow.

ftynse accepted this revision.Nov 23 2020, 5:07 AM

I have worked with many C APIs in a number of languages and the attitude of "just do the memory management bookkeeping in the host language" has created some very unfortunate complexity in the bindings and more than a few bugs. This is a very low-cost change for the C bindings, and doesn't add much complexity for language bindings that don't care about this functionality (they can just pass NULL to the extra arguments). I think this is also a win from a self-documentation standpoint: before, I would need to go and figure out what are the different ways diagnostic handlers can be unregistered to understand the semantics. I would also need to worry about concurrency, specifically whether or not my handler is guaranteed to not be called after detach returns. I could read the code to figure this out (it turns out, the answer is "no") but I think it is better to minimize this sort of spelunking and just say "we will call your handler and when we are done calling your handler we will call your callback".

Works for me as long as you put some version of this argument in the commit message.

Also, if I remember correctly the // CHECK lines used to be surrounded by // clang-format off/on. Is there a reason this changed? I used to run clang-format on save in my editor and it doesn't look like git clang-format has support for this type of flow.

clang-format used to mess up CHECK comments, but does not anymore, at least for me. So I assumed it has been taught about FileCheck lines.

This revision was automatically updated to reflect the committed changes.