Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/test/Bindings/OCaml/debuginfo.ml | ||
---|---|---|
293 | When I ran the tests with assertions enabled, the OCaml debuginfo test at llvm/test/Bindings/OCaml/debuginfo.ml failed with the message: File "/home/alan/llvm-project/debug/test/Bindings/OCaml/Output/debuginfo.ml.tmp/Testsuite.ml", line 1: Warning 70 [missing-mli]: Cannot find interface file. File "/home/alan/llvm-project/debug/test/Bindings/OCaml/Output/debuginfo.ml.tmp/debuginfo.ml", line 1: Warning 70 [missing-mli]: Cannot find interface file. executable: /home/alan/llvm-project/llvm/lib/IR/DIBuilder.cpp:809: llvm::DILocalVariable* llvm::DIBuilder::createParameterVariable(llvm::DIScope*, llvm::StringRef, unsigned int, llvm::DIFile*, unsigned int, llvm::DIType*, bool, llvm::DINode::DIFlags, llvm::DINodeArray): Assertion `ArgNo && "Expected non-zero argument number for parameter"' failed. FileCheck error: '<stdin>' is empty. FileCheck command line: /home/alan/llvm-project/debug/bin/FileCheck /home/alan/llvm-project/llvm/test/Bindings/OCaml/debuginfo.ml The docs for createParameterVariable (https://llvm.org/doxygen/classllvm_1_1DIBuilder.html#afa7c4efcb5c40c8527c13759f10ef5ed) state that the index is 1-based, so the test code with argno:0 seems to be wrong? |
Oh no, I've always struggled with Phabricator and now I've really messed something up. Patch application failed... How do I get back to a clean state in my working copy so I can fix this?
I see, somebody made a commit to the OCaml bindings on main, and I did not realize and did not pull the latest code.
llvm/bindings/ocaml/llvm/llvm_ocaml.c | ||
---|---|---|
12–13 | A high-level question about this diff is whether you really want to change the bindings in this way. The current situation is that they intentionally take a low-level view of the FFI and have been written with that in mind and reviewed from that perspective. There is probably more chance of introducing bugs by changing this than by just adding the encoding of naked pointers and retaining the use of the low-level FFI. My personal preference would be to keep the existing low level FFI. But if such a change from the low-level to the high-level FFI is done, it should be done as a separate diff that does only that. | |
42 | It is common practice elsewhere in the LLVM codebase to add an informative message to assertions, in a way that does not affect the meaning: | |
592–604 | To take this function as a specific example, I think that the suggested change would be a much more minimal change that yields code that does not suffer from naked pointers, is safe in a sequential context, and is probably safe (up to the best of my understanding) in a parallel context. | |
llvm/test/Bindings/OCaml/debuginfo.ml | ||
293 | I think you're right. |
I was brought here from https://discuss.ocaml.org/t/proposal-care-more-about-ocaml-bindings-for-popular-libraries/11451 and https://discourse.llvm.org/t/rfc-moving-ocaml-bindings-to-peripheral-tier-and-disabling-them-by-default/68290/10.
I hope my comments will be helpful. There is some weird stuff here, but if author's code started to pass the tests after these changes, we could address other issues with binding in separate PRs (this is already rather long)
llvm/bindings/ocaml/llvm/llvm_ocaml.c | ||
---|---|---|
51 | Could you add a comment why these two functions are needed? I always though that out-of-heap values should be wrapped into Abstract_tag. At least the manual says that https://v2.ocaml.org/manual/intfc.html#ss:c-outside-head Sorry for asking stupid questions, I was teleported there from https://discuss.ocaml.org/t/proposal-care-more-about-ocaml-bindings-for-popular-libraries/11451 | |
58 | Am I understanding right... LLVM uses non-zero-terminated strings under the hood, so this function is need? In case of zero terminated string it would be better to use caml_copy_string https://github.com/ocaml/ocaml/blob/5.0/runtime/alloc.c#L208 | |
126 | I'm not 100% follow the intent of this function. It looks like it tries to convert OCaml array of OCaml values to C array of OCaml values. But if it is true, I'm not sure that llvm_struct_element_types works as expected. In I understood intent correctly, it should be renamed to array_of_ocaml_array or something..... | |
660 | Could you double check that there is no bug here? See my comment about alloc_temp. | |
702 | This code looks suspicious. If we are going to return an array, why not to try to use caml_alloc_array function? | |
937 | Here we again allocate a tuple and return an array. Looks weird. | |
1402 | I'm wondering... Is it a good idea to define a macro for more short definitions of binary operations above? | |
2911 | Isn't that strange that argn is not used here? | |
3123–3124 | It looks like the 'value' is not required. |
Hi, I've went over your comments and answered them. Please let me know if you have questions or concerns still.
llvm/bindings/ocaml/llvm/llvm_ocaml.c | ||
---|---|---|
51 | When I first wrote this patch, my original approach was to wrap LLVM values in an OCaml block of Abstract_tag. In a discussion with @jberdine, he said that I could assume that all pointers from LLVM are 2-byte aligned, and set the low bit to 1 when exposing them to OCaml (https://github.com/llvm/llvm-project/issues/58134#issuecomment-1271544642). The OCaml GC distinguishes pointers from integers from the low bit; the low bit of integers is 1. Therefore, the OCaml GC will treat out-of-OCaml-heap LLVM pointers as integers. The OCaml documentation that you link also recommends this representation:
This approach avoids an extra allocation. This design choice is documented in the comments and assert messages. Is there anything else I need to add to make it clear for people unfamiliar with the code? | |
58 | This code is what exists on the main branch and my patch does not touch it. The code passes the tests; changing this function is not the business of my patch. It seems that this function accepts NULL pointers and handles the NULL case by allocating a zero-length OCaml-heap-allocated string. | |
126 | Yes, it converts an OCaml array to a C array. The code on the main branch uses the fact that pointers are naked to directly cast OCaml blocks to C arrays, which is no longer possible now that OCaml 5 disallows naked pointers. The x_of_y naming style is an OCaml quirk. I don't want to name this helper function that. | |
660 | This code allocates a temporary, intermediate array as a buffer to store the output of LLVMGetStructElementTypes. Then, it copies them over to an OCaml-heap-allocated array (which internally have the same representation as OCaml tuples) and copies over the elements, wrapping them by setting the low bit to 1. | |
702 | The original code uses caml_alloc_tuple, so that's what I did as well. I think caml_alloc_array (which takes a callback that converts the array elements to OCaml values) would work as well. Thanks for suggesting it. | |
1402 | For this patch, I just did things by-hand. I think this is simpler than a macro for now, as a lot of this code wraps functions by hand. If you really want a macro, I can change it to one. | |
2911 | No. FFI functions of arity greater than five should have two functions, one for native which simply takes all the parameters, and one for bytecode which takes an arg vector and the vector's length (https://v2.ocaml.org/manual/intfc.html#ss:c-prim-impl). It's the way that interface is defined that the bytecode wrapper function receives the number of arguments, but here, we know that the function has an arity of 7 and just forward them from the bytecode wrapper function to the native wrapper function | |
3123–3124 | By putting value, I define a new Hd local variable that shadows the outer one. This code is fine, but reusing the outer variable will also probably be fine. It just depends on your style preference. |
llvm/bindings/ocaml/llvm/llvm_ocaml.c | ||
---|---|---|
12–13 | I thought about this a long time, even before you left this comment. The OCaml manual instructs you to use the CAMLparam, CAMLlocal, and CAMLreturn macros for all FFI functions (https://v2.ocaml.org/manual/intfc.html#ss:c-simple-gc-harmony). It does not give any nuances, like "You don't need to register unboxed integers with CAMLparam" or "You don't need to use the macros if you never call the OCaml runtime." When I first wrote this patch, I just added all the macros to be extra safe. I've been going through the code locally now and removing the macros, and so far the tests still pass, but there are several spots where I was unsure if removing the macro would be unsafe. (Example: Where one of the arguments is a pointer into the OCaml-managed heap, such as an OCaml string or an Abstract_tagged LLVM value for purposes of custom finalizer, and the function calls the OCaml runtime by allocating something.) I don't want a heisenbug where the code usually passes the tests, but mysteriously segfaults in some user's program if the GC kicks in at the wrong time and deallocates memory that shouldn't, because the binding code plays fast and loose with the macros. Then, we have the issue with the diagnostic handler, meaning that the OCaml runtime could kick in at many points that may not be obvious, but the diagnostic handler could violate so many assumptions about C and C++ resource cleanup, as we've discussed, that I don't know if it should factor in to the decision whether to use the macros. When I was thinking about whether to use the macros or drop them, my view was mainly about efficiency versus readability, where I viewed the macros as sacrificing slight efficiency (because the code spends instructions registering values with the GC, when this may not be necessary) for readability (because the code follows the rules of the OCaml FFI manual to the letter). I do not share your view that using the macros would *introduce* bugs or make the code harder to review; I always thought the macros make the code safer and easier to review. It's better to just use the macro than to think, "Do we need to use a macro here? Does the GC need to know about this value before we call caml_alloc_whatever?" |
It looks like most of my comments were false alarms. Although, two places are left, which bother me.
llvm/bindings/ocaml/llvm/llvm_ocaml.c | ||
---|---|---|
51 | Sorry, I forgot that there is an optimization to avoid Abstract_tag. It would be great to have a comment explicitly saying that. | |
58 | Yeah, you are right. The caml_alloc_string+memcpy could be probably be replaced by caml_copy_string, but it could be a topic for a distinct patch.... | |
660 | Yes, but it looks like alloc_temp accepts array as an argument, but you pass unsigned Length here... | |
1402 | Nonono, this patch is OK without a macro. | |
3123–3124 | Your explanation is fine, if we talk about C++, but OCaml manual explicitly says |
llvm/bindings/ocaml/llvm/llvm_ocaml.c | ||
---|---|---|
51 | Pointers are not integers. |
Thanks all, it is great to get more eyes on this!
llvm/bindings/ocaml/llvm/llvm_ocaml.c | ||
---|---|---|
12–13 | My main point is not so much about whether or not to most desirable final situation involves abiding by the simplified FFI rules, but that the changes to do so are large and orthogonal to the goal of changing the bindings to avoid naked pointers. Changing to the simplified FFI rules is a judgement call that has pros and cons, while eliminating naked pointers is a hard requirement going forward. That is why I think the two things should be done as two separate diffs. As for potentially introducing bugs, maybe I'm just paranoid, but large changes are hard to review and not as precise when bisecting issues that are discovered later. More specifically, adding CAMLlocal and associated macros publish more locations as GC roots, and the associated changes involve more allocations. Going all the way to strictly following the simplified rules should be fine, but if there are any oversights and we only get part of the way there, it seems entirely possible to introduce bugs. The diagnostic handler API is limited in the sense that the handler functions must not access the OCaml runtime system. But following the simplified FFI rules will not be enough to enable supporting arbitrary handler functions. That would be another orthogonal change. | |
51 |
From the perspective of the FFI, bit patterns with the lsb set are not interpreted as pointers that the runtime system should follow. It happens that integers in OCaml are represented with bit patterns where the lsb is set, so it is common, perhaps imprecise, usage to refer to pointers where their lsb has been set as "integers". | |
58 | Yes, this code is old, and the same as the implementation of caml_copy_string except for also treating NULL as an empty string. | |
660 | Yes, alloc_temp should be replaced by a call to malloc here. I made a similar comment on D136537 a while ago but didn't notice that I hadn't submitted it. :-( |
I think only minor quirks are left here. @alan , if you are going to make other patches about OCaml, don't hesitate to ping me.
llvm/bindings/ocaml/llvm/llvm_ocaml.c | ||
---|---|---|
659 | In the function above we can see the usage of size_t instead of unsigned. I'm far from being expert about C++, but heard that size_t should be preferred nowadays. Is it? If you are going to change, than in other functions probably too. |
llvm/bindings/ocaml/llvm/llvm_ocaml.c | ||
---|---|---|
12–13 | I can respect the view that a smaller diff is better and that we can add the macros in a separate patch and focus this one on making the code work on OCaml 5. I've removed most of the CAML macros I added. There were some functions that took a pointer into the OCaml heap (e.g. an OCaml string) and called the OCaml runtime (e.g. to allocate a block or throw an exception), and I kept the macros in those cases because I think that not using the macros would be incorrect. Hopefully the diff should be smaller and easier to review now. |
llvm/bindings/ocaml/llvm/llvm_ocaml.c | ||
---|---|---|
702 | I investigated more into [caml_alloc_array](https://github.com/ocaml/ocaml/blob/92adb0ff67dea9c14643a1c2eb6eeaf9d48629ef/runtime/alloc.c#L218) and found out that it expects the array to have a 0 sentinel value to mark the end rather than taking the length as an argument. This code can be adapted to use caml_alloc_array if it allocates Length + 1 elements instead and sets the last element to NULL. Do the reviewers think this is a change I should make? |
llvm/bindings/ocaml/llvm/llvm_ocaml.c | ||
---|---|---|
702 | Using caml_alloc_array is indeed cumbersome. It should be better to left it as it is. I would move allocation of OCaml array closer to the for-loop, to make more obvious that no other allocations are between these two things, but it is minor improvement. |
llvm/bindings/ocaml/llvm/llvm_ocaml.c | ||
---|---|---|
51 | From the perspective of program analysis, pointers are not integers because pointers have provenance. From the perspective of the runtime system, everything is a word. To be pedantic: A word is a sequence of bits; pointers, integers, Booleans, and nullary data constructors are all possible meanings ascribed to words. OCaml's GC, being a precise GC, distinguishes between pointers, which it may follow and relocate, and non-pointers, through the low bit. Setting the low bit of a pointer originating from LLVM causes the OCaml GC to treat it as a word that the GC does not follow. |
Thanks! I was thinking that a significantly smaller diff would be possible. Did you see the suggested version of llvm_param_types I made in a comment?
I saw that, but I have some mixed feelings about this suggestion. I don't think it would make the diff significantly smaller, and I think it's a little strangely written. The original code allocates from the OCaml-managed heap and casts the pointer because it uses the naked pointer assumption to freely cast between OCaml arrays and C arrays. Now that this patch gets rid of naked pointers, I think that allocating from the OCaml heap to create an array to pass to LLVM is tackier than to just allocate an array with malloc. That being said, I'm going to go and make the change because you want it. If I view the suggested code from the perspective of "mutating each element to make it appropriate for OCaml," instead of from the perspective of "reusing the same allocation for a new array," which is how I first viewed the code, the code becomes less weird in my eyes.
I have some comments to add about the latest revision. The caml_alloc_tuple_uninit is actually not an OCaml runtime function, but a helper function that the LLVM binding code on the main branch defines as follows:
value caml_alloc_tuple_uninit(mlsize_t wosize) { if (wosize <= Max_young_wosize) { return caml_alloc_small(wosize, 0); } else { return caml_alloc_shr(wosize, 0); } }
The documentation for caml_alloc_small and caml_alloc_shr state the following: https://v2.ocaml.org/manual/intfc.html#ss:c-low-level-gc-harmony
We now give the GC rules corresponding to the low-level allocation functions caml_alloc_small and caml_alloc_shr. You can ignore those rules if you stick to the simplified allocation function caml_alloc.
Rule 5 After a structured block (a block with tag less than No_scan_tag) is allocated with the low-level functions, all fields of this block must be filled with well-formed values before the next allocation operation. If the block has been allocated with caml_alloc_small, filling is performed by direct assignment to the fields of the block:Field(v, n) = vn;If the block has been allocated with caml_alloc_shr, filling is performed through the caml_initialize function:
caml_initialize(&Field(v, n), vn);The next allocation can trigger a garbage collection. The garbage collector assumes that all structured blocks contain well-formed values. Newly created blocks contain random data, which generally do not represent well-formed values.
The binding code on the main branch does not follow these rules; when it calls caml_alloc_tuple_uninit, it always uses Field(v, n) to initialize the members, without checking if the amount allocated is greater than Max_young_wosize. Therefore, any code that hits the caml_alloc_shr case uses the OCaml runtime API incorrectly. I'm also not a fan of the caml_ naming scheme, which misleads the reader into thinking the function comes from the OCaml runtime API.
Instead of keeping this function and checking Max_young_wosize everywhere it's used to determine whether to use Field(v, n) or caml_initialize(&Field(v, n), vn), I just removed it and used caml_alloc_tuple everywhere, which I think results in the less intrusive diff. Allocations using caml_alloc_tuple should initialize their members with Store_field(v, n, vn) instead of Field(v, n) = vn.
I agree that the caml_ prefix on caml_alloc_tuple_uninit is probably not a good idea.
Note that the intent of my comment about llvm_param_types was to show one concrete example of a pattern that occurs in several other functions, whose ml types return arrays, such as llvm_struct_element_types, llvm_subtypes, llvm_indices, etc. The added allocation seems unavoidable for the functions that accept an array and need to create an analogue where the lsbs have been cleared before passing it to an LLVM function. In a sequential setting, it would be possible to clear the lsbs of elements of the argument OCaml array in place, pass it to LLVM, and then re-set the lsbs. I am not suggesting to do that, it would be unsafely racy in a concurrent setting.
I don't know the history of what the FFI docs said when, but the OCaml bindings are *old*. The difference between caml_alloc_small and caml_alloc_shr is that the former allocates on the minor heap and the latter on the major heap. The FFI rules distinguish them since if a pointer to the/a minor heap is stored into the major heap, then the GC needs to be informed or else it may incorrectly consider the object in the minor heap to be dead. This is what caml_initialize and Store_field do. The key here is that in the case of the LLVM bindings, a crucial point of the design that sets the low bit of pointers to LLVM objects is that the memory allocated by either caml_alloc_small or caml_alloc_shr is initialized with values that the OCaml GC considers to be non-pointers. In particular, they cannot possibly be old-to-young pointers. This is the invariant across the whole of the LLVM bindings that makes the current code safe.
Replacing the existing calls to caml_alloc_tuple_uninit with caml_alloc_tuple will result in much of the memory allocated by the bindings being needlessly double-initialized. That is a change that is orthogonal to removing naked pointers, and I think that it should be considered separately. (It is not a change that I would myself like to see made, but it is a judgement call with pros and cons.)
llvm/bindings/ocaml/llvm/llvm_ocaml.c | ||
---|---|---|
258 | Is this CAMLparam/CAMLreturn left over from combining the two diffs, or necessary for a reason I am overlooking? | |
372 | I don't see why adding this CAMLparam is necessary | |
470–471 | Left over from combining the 2 diffs? | |
819–820 | Left over? | |
888–889 | Left over? | |
2391–2395 | I am not sure why this change is needed, rather than changing the (value) casts in the original to to_val. In particular, if V and BB are computed after Hd is allocated, they do not need to survive any allocations, so do not need to be roots. | |
2397–2400 | List does not need to be a root since it is not live across any allocations. | |
2441 | Left over? | |
2454 | Left over? | |
3112–3113 | Left over? | |
3242 | Left over? | |
3266 | Left over? |
Thank you for your code review. I didn't realize the reasoning behind the Field/caml_initialize distinction and was just following the OCaml documentation's rules to the letter. So, you taught me something new about the OCaml runtime, Perhaps this information could be included in a comment (an issue for another patch, of course).
I went and added back caml_alloc_tuple_uninit and got rid of the malloc/frees in the wrapper functions that return an array. I also removed all the CAMLparam/CAMLreturns you pointed out. (Some of the CAMLparams I had left were by accident, and others were because the wrapper function called the OCaml runtime API (e.g. by throwing an exception) and I wanted to be cautious. You're right that the roots aren't necessary.) There were some minor parts where I made the diff smaller as well.
I really appreciate your time and effort looking at my patch.
Thanks for persisting with this and tolerating all my comments! This looks like it is getting very close, nice! I'm sorry for the piecemeal reviewing, but making a full pass takes longer than I can usually spend at once and I figure it is better to send comments intermittently rather than save them until the end. So far I have spent most time looking at llvm_ocaml.{c,ml,mli} and not so much on the other files, but I think that llvm_ocaml is just about there.
For info I found https://github.com/ocaml/ocaml/blob/f1550145d4ebe8ee22853d13f97fca288aaef51f/runtime/memory.c#L203 which notes that normal assignments are ok to initialize freshly allocated blocks with value that are not pointers to young blocks.
llvm/bindings/ocaml/llvm/llvm_ocaml.c | ||
---|---|---|
45 | It might not hurt to add a comment before to_val and from_val saying that they encode pointers to LLVM objects as OCaml tagged integers, following the 3rd encoding suggested by the OCaml FFI documentation at https://ocaml.org/releases/5.0/htmlman/intfc.html#ss:c-outside-head . I don't know how stable such urls are expected to be though. | |
54 | ||
126 | This function could probably use a comment explaining its usage since its functionality is quite specific and has a general sounding name. I don't feel strongly, but there might be a better name possible. It is basically from_val lifted to operate on arrays, so perhaps from_val_array. | |
127 | Note that Wosize_val returns type mlsize_t, which is defined in a platform-specific way because it needs to be different on 32 vs 64 bit systems (it will be uint64_t on 64-bit systems, so usually wider than unsigned). It would probably be better to use mlsize_t instead of unsigned where the values come from OCaml and keep unsigned where the values come from LLVM. In cases where both are involved, I think but am not sure that it would generally be clearer to have the local variables with the unconverted types, and leave the casts between unsigned and mlsize_t implicit when passing arguments to functions. | |
132 | Just a note to ourselves for maybe later: If we want to make the bindings safe for concurrent clients, we may want something stronger than a non-atomic store here. (The load from Elements is volatile.) | |
391–395 | No big deal, but I don't know why this function was moved up a few lines | |
560 | I think this CAMLparam is not needed since alloc_temp allocates off the OCaml heap. And if this one is needed, then RetTy should also be a root. | |
570 | Same as function above | |
610 | here too | |
620 | and here | |
636 | here | |
637 | Related to the comment on alloc_temp about unsigned, I think it would be clearer if this unsigned was instead mlsize_t, leaving the case to unsigned implicit in the call to LLVMStructSetBody below. There are several other similar uses, I would look around each call to alloc_temp and at each call to Wosize_val. | |
2077 | Unless reordering the next several definitions is needed, it would be easier to review if they were left in the same order. They can be reordered if desired in a separate diff that only reorders things. | |
2250–2262 | Could you double-check this one. Temp is allocated on the caml heap unlike other functions, and as written F and Index will need to be registered as roots. | |
2454 | Still left over I think |
Oh no, I did something wrong with Arcanist. I ran arc diff and thought it would push a revision to this diff, but it instead made a new patch. I feel very frustrated with Arcanist, because it isn't explicit with diff IDs, and sometimes it surprises me about what patch it pushes to. I need to figure out what's wrong with the repo state on my computer and how I can fix this problem.
llvm/bindings/ocaml/llvm/llvm_ocaml.c | ||
---|---|---|
45 | I added comments to the header file explaining how these functions work. Should I put additional comments in the source file? | |
2250–2262 | F is an llvalue, which is represented using the low bit tagging scheme, and Index is an int, so both are unboxed and neither is a root. Is this reasoning correct? |
llvm/bindings/ocaml/llvm/llvm_ocaml.c | ||
---|---|---|
45 | The comment in the header is good but I think since it discusses the implementation details of the encoding, I would have put it in the .c and used a comment in the header that just said that the functions implement an encoding without specifying which one. But I have not checked if the uses in other files need to know about details of the encoding. Those are the things I think of, but go with your preference. | |
2250–2262 | You're right about F and Index. I would expect an implementation more like in the suggested edit, similar to llvm_struct_element_types. In particular, Temp is an ocaml block with enough space for only 1 word, and I think LLVMGetCallSiteAttributes expects a pointer to an array with space for Length pointers. | |
llvm/bindings/ocaml/llvm/llvm_ocaml.h | ||
39 | ||
46 | maybe |
llvm/bindings/ocaml/debuginfo/debuginfo_ocaml.c | ||
---|---|---|
364 ↗ | (On Diff #500352) | several unsigned ought to be mlsize_t in this file |
llvm/bindings/ocaml/executionengine/executionengine_ocaml.c | ||
103–104 ↗ | (On Diff #500352) | left over |
llvm/bindings/ocaml/llvm/llvm_ocaml.c | ||
888–889 | still left over | |
llvm/bindings/ocaml/llvm/llvm_ocaml.h | ||
2 | intended? | |
llvm/bindings/ocaml/target/target_ocaml.c | ||
57 ↗ | (On Diff #500352) | left over |
64–65 ↗ | (On Diff #500352) | left over |
148 ↗ | (On Diff #500352) | left over |
159–160 ↗ | (On Diff #500352) | left over |
179 ↗ | (On Diff #500352) | left over |
185 ↗ | (On Diff #500352) | left over |
250–251 ↗ | (On Diff #500352) | left over |
319 ↗ | (On Diff #500352) | left over |
There were some parts where you had commented about CAMLparam and which I had previously chosen to keep the macro because the function called the OCaml runtime, such as by raising an OCaml exception or allocating from the OCaml heap, and I wanted to be careful. However, I went and accepted your suggestions of removing them because I don't think the roots are necessary for one of two reasons:
- The local variable only exists to store the result of an OCaml heap allocation, so before the call to the OCaml runtime, it does not contain a root
- The local variable only exists to store data to pass to an LLVM API call, and by the point that the OCaml runtime is called, the local variable no longer needs to be live.
However, there is one point where I kept the CAMLparam and left a comment at, in which the function must register one of its parameters as a root as it makes an OCaml heap allocation and the argument of course needs to be live throughout the function call.
I feel a little nervous about removing the CAMLparam macros because I'm worried about mistakes.
llvm/bindings/ocaml/llvm/llvm_ocaml.c | ||
---|---|---|
45 | The reason I would rather have the comments in the header is that if the functions were to be used for some LLVM type that is not 2-byte aligned, or the implementation of some type changed so that it is no longer 2-byte aligned, the code will break. (In that case, the specific type that is not 2-byte aligned should have a dedicated alloc_ function, just like the ones that already exist for types that need custom finalizers.) Therefore, I think that the 2-byte alignment aspect is a contract that users of the functions need to know about, not an implementation detail. | |
llvm/bindings/ocaml/target/target_ocaml.c | ||
148 ↗ | (On Diff #500352) | This root is necessary. DataLayout.t is an OCaml custom block, not a pointer with the low bit set. Int64.t is also boxed. So, this function allocates on the OCaml heap and the GC needs to know that DL is live. |
llvm/bindings/ocaml/llvm/llvm_ocaml.c | ||
---|---|---|
45 | Yes, that makes sense. | |
2250–2251 | these no longer need to be added | |
llvm/bindings/ocaml/target/target_ocaml.c | ||
148 ↗ | (On Diff #500352) | I understand that DL is a custom block, but its value is only used before the allocation in caml_copy_int64. This is the same before and after this diff, or am I overlooking something? |
llvm/bindings/ocaml/target/target_ocaml.c | ||
---|---|---|
148 ↗ | (On Diff #500352) | DL is a parameter, so it needs to be live when the function ends. Otherwise, a caller could pass a DataLayout.t value, and that value could be invalidated by the GC during the function call. |
I feel a little bit apprehensive about the lack of CAML macros in some places. After this patch is done, I think it might be worth it to review the code and make a patch that:
- Adds CAML macros if there are any places we left them out but shouldn't have
- Add comments documenting the justification for being loose with the OCaml runtime API rules (e.g. that using Field to initialize an allocation from caml_alloc_tuple_uninit is safe because you are never storing a pointer in it, or that omitting CAMLlocal is safe when the variable is only used to store the result of calling the OCaml runtime API).
Do you think this is necessary or advisable?
If the reason why this patch doesn't add CAML macros is just to keep the diff small and focus on one issue (removing naked pointers), it might be worth it to have another patch that just adds CAML macros everywhere. It's a question of whether the CAML macros add any overhead that one would want to avoid, and whether the cost of being loose with the rules and adding comments to convey the rationale to other people reading the code is worth the maintenance effort versus just following the (oversimplified) rules of the OCaml docs.
Adding comments about the reasoning behind e.g. avoiding the write barrier seems like a good idea. I don't know if it would be better to have one comment block near the top of the file, or many comments sprinkled around. There might be a lot of redundancy which would be noisy and easy to get out of sync.
There is some overhead to the CAMLparam, etc. macros. Each one involves some memory operations itself, checks the thread local state of the runtime system, and also increases the number of roots that need to be scanned each time the GC is run. I would be hesitant to add them uniformly without a good benchmark suite to measure the impact and see that it doesn't matter.
llvm/bindings/ocaml/target/target_ocaml.c | ||
---|---|---|
148 ↗ | (On Diff #500352) | I'm sorry but I still don't understand. Suppose the GC runs when caml_copy_int64 allocates, and happens to move the value DL points to. Then note that DL is no longer needed since it is used only to compute the argument of LLVMOffsetOfElement and in particular is dead after the allocation in caml_copy_int64. Also, if the caller of llvm_datalayout_offset_of_element needs what it passed as the DL argument, then it will be live and a root in the caller, and the GC will update the caller's pointer when moving the block pointed to by DL. So I don't see the issue, but I am not sure I understand the point you are making. Do you see what I mean? |
llvm/bindings/ocaml/target/target_ocaml.c | ||
---|---|---|
148 ↗ | (On Diff #500352) | You're right, I made a mistake. I didn't consider that if the caller still needed to keep DL around, the caller stack frame would already have DL as a root, so this function doesn't need to register it as a root. Thank you for reviewing my code and pointing this out. |
I have redone my own testing with the latest version of this diff plus changes for the inline comments in this last iteration of review. This test translates each of the bitcode files in llvm-project/llvm/test to a different IR, as well as translating and doing some static analysis on several test programs. This testing did not uncover any issues with this diff.
Additionally, the test was repeated where the translation of the functions of each llvm module was done in parallel (using Domainslib.Task.parallel_for). This is not much of a parallel stress test, but it did not encounter any issues.
So these inline comments are the last minor changes, then this can be rebased, and I can commit it if you need.
llvm/bindings/ocaml/llvm/llvm_ocaml.c | ||
---|---|---|
936 | It is pre-existing, but while you are at it, it may be clearer and more uniform to change to ++I. | |
1098–1099 | This CAMLparam needs to be kept since Name needs to survive the allocation of Nodes. | |
1805–1806 | Removing these parens triggers warning: using the result of an assignment as a condition without parentheses [-Wparentheses] |
What "rebasing" does this need? As far as I can tell, it is compatible with the main branch. The last commit to the OCaml bindings is my patch removing the PassManager functions, which this patch is already rebased on. If this weren't ready to merge, Phabricator would say "patch application failed."
Please do commit this for me.
intended?