This is an archive of the discontinued LLVM Phabricator instance.

[llvm-ocaml] Assume pointers are at least 2-bit aligned

Authored by alan on Oct 22 2022, 12:55 PM.



This diff assumes that all pointers that LLVM gives to OCaml are at least 2-bit aligned, removing the branch that allocates/unwraps the OCaml Abstract_tag block.

The current binding code on the main branch skips a lot of CAMLparamX and CAMLreturn boilerplate because the wrapped code did not allocate. My parent patch added the CAMLparamX and CAMLreturn boilerplate because I assumed the code might allocate. This patch undoes that, since it again assumes that there is no allocation.

Furthermore, if the code does allocate, I only register boxed values (such as strings) with CAMLlocalX, skipping unboxed values (such as ints).

Diff Detail

Event Timeline

alan created this revision.Oct 22 2022, 12:55 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 22 2022, 12:55 PM
alan requested review of this revision.Oct 22 2022, 12:55 PM
Herald added a project: Restricted Project. · View Herald TranscriptOct 22 2022, 12:55 PM
alan removed a subscriber: llvm-commits.Oct 22 2022, 1:09 PM
alan updated this revision to Diff 469937.Oct 22 2022, 3:15 PM


alan updated this revision to Diff 469941.Oct 22 2022, 3:28 PM

Add missing return

alan edited the summary of this revision. (Show Details)Oct 22 2022, 4:16 PM
alan added a comment.Oct 22 2022, 6:51 PM

I'm extremely puzzled by When I test on OCaml 4.14, all tests pass, but when I test on OCaml 5.0~beta1, this test hangs forever. All tests on the parent patch pass on OCaml 5.0~beta1.

When I look at the backtrace with gdb, the test hangs in the function LLVMCreateMemoryBufferWithSTDIN. I think that the code is reading from stdin? Is the code supposed to signal an error instead?

alan updated this revision to Diff 469950.Oct 22 2022, 7:13 PM

Don't use CAML macros in iterator functions

alan updated this revision to Diff 469956.Oct 22 2022, 9:13 PM

Add missing CAMLreturn

alan updated this revision to Diff 469964.Oct 22 2022, 11:05 PM

Fix missing CAMLreturn causing hanging tests

alan updated this revision to Diff 469966.Oct 23 2022, 12:37 AM

Convert the rest of the files

alan added a comment.Oct 24 2022, 5:18 AM seems to imply that the shortcuts that this diff makes are still safe.

alan updated this revision to Diff 470318.Oct 24 2022, 4:23 PM

Assume all functions may allocate on the OCaml heap

alan updated this revision to Diff 470637.Oct 25 2022, 4:06 PM

Rebase onto parent patch

jberdine requested changes to this revision.Feb 8 2023, 3:28 PM

Just a few initial comments so far, I have read and tested this some, but need to do more. Also, I think it would be easier to review if this diff and D136400 were combined into a single diff.


It may not matter, but I would be more comfortable to use bitwise operations to set and clear the low bit of pointers. And I think it would be better to add an assertion to test that the low bit is clear, as a way to more explicitly document the assumption, and to give us a chance to catch it in a debug build if it is ever violated.


CAMLparam and CAMLreturn do not need to be added (by the composition of this diff and its parent) here, right? There are other functions that seem to be in a similar situation, but asking about one example to check my understanding.


alloc_temp expects an ocaml block, not a size

This revision now requires changes to proceed.Feb 8 2023, 3:28 PM
alan added a comment.Feb 10 2023, 11:26 AM

I tried to figure this out myself, but I'm inexperienced with Phabricator and only messed things up. I've edited the parent diff, How do I rebase this patch so that it applies to the latest version of the parent? I somehow accidentally rebased the parent patch instead.

alan added inline comments.Feb 14 2023, 11:43 AM

In this diff, I decided to take some liberties by not registering a value with CAMLparam if it is known to not be a GC'd block (for example, if it is an int, or a pointer from LLVM that is represented by setting the low bit). The original bindings on main take this type of liberty a lot by omitting CAMlparam entirely in many places. In my parent diff, I added the CAMlparams and registered all values with CAMlparam, including ones that the GC doesn't need to track, to be safe, but here, I decided to relax which values I registered.

Is this a good choice? Registering all values probably makes the code clearer and reduces room for error (e.g. forgetting to register something), choosing not to register things like ints and 2-bit aligned LLVM pointers with the low bit set probably slightly increases performance.

alan abandoned this revision.Feb 14 2023, 6:33 PM

I've manually applied the main change of this diff (to assume that all pointers from LLVM are 2-bit aligned) to manually.

Argh, I made these comments some time ago and failed to submit them.


Here and in a number of other cases the result of malloc is not checked for NULL.