This is a rough prototype of the tool described in https://lists.llvm.org/pipermail/llvm-dev/2021-June/151321.html. The tl;dr is that busybox merges all llvm tools into a single binary instead of building separate binaries. This will primarily be useful for toolchains that distribute a suite of llvm tools. This prototype only ports llvm-objdump and llvm-objcopy, but the technique can be applied to any combination of llvm tools.
Design:
- Individual llvm tools are "librarified" into static libraries that get linked together into the busybox tool, which itself only dispatches to the "main" function of the requested tool.
- The busybox binary is just called "llvm", but other llvm tool names can be symlinked to it (e.g. ln -s llvm llvm-objdump) and busybox should dispatch to the appropriate tool (by checking argv[0]).
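The argv[0] check described above can be sketched roughly as follows. This is a minimal illustration and not the code in this patch: the stub entry points, toolNameFromArgv0, and dispatchByInvokedName are hypothetical names chosen for the example, and the real tool would call the librarified main-like functions instead of stubs.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-ins for the main-like entry points of librarified tools.
int objdump_main_stub(int, char **) { return 10; }
int objcopy_main_stub(int, char **) { return 20; }

// Strip any directory prefix from argv[0],
// e.g. "/usr/bin/llvm-objdump" becomes "llvm-objdump".
std::string toolNameFromArgv0(const std::string &Argv0) {
  std::string::size_type Slash = Argv0.find_last_of("/\\");
  return Slash == std::string::npos ? Argv0 : Argv0.substr(Slash + 1);
}

// Dispatch on the name the binary was invoked as (for example via a symlink).
// Returns -1 when the name does not match a known tool.
int dispatchByInvokedName(int argc, char **argv) {
  assert(argc > 0 && "expected argv[0] to be present");
  const std::string Name = toolNameFromArgv0(argv[0]);
  if (Name == "llvm-objdump")
    return objdump_main_stub(argc, argv);
  if (Name == "llvm-objcopy")
    return objcopy_main_stub(argc, argv);
  return -1;
}
```

A real main would presumably forward to a function like this first and fall back to inspecting its first argument when the invoked name is plain "llvm".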
Usage:
- Various llvm tools can be invoked via busybox by passing a shortened tool name as its first argument (llvm objdump [objdump_args]).
- Symlinked tools to busybox should "just work" out of the box with no implementation differences.
- This is accounted for and all LLVM tests currently pass with busybox enabled.
- Busybox is enabled via the cmake flag LLVM_ENABLE_EXPERIMENTAL_BUSYBOX which is OFF by default.
Measurements:
Each of these binary measurements is for the stripped release build, with --gc-sections enabled.
- The busybox binary is ~20.5 MB.
- The statically compiled llvm-objdump and llvm-objcopy are ~20 MB and ~4 MB respectively (~24 MB combined).
- Size savings are likely from deduped symbols that were statically linked into the final binary.
- The dynamically compiled llvm-objdump and llvm-objcopy that depend on libLLVM are ~695 KB and ~563 KB respectively. libLLVM is ~90 MB. The combined size of these is ~92 MB.
- Size savings in this case are likely because, when everything is linked statically, --gc-sections can remove the large chunk of libLLVM that these tools never use.
Implementation issues:
- Ideally we wouldn't need to have so many cmake changes, but I might not have enough cmake mastery to reduce the complexity.
- One noticeable issue is that cmake code for creating symlinks from the original binary is duplicated. See the inlined cmake comment below for an explanation of this.
- Each tool seems to have different runtime configurations depending on what the symlinked tool is. These are all controlled via a similarly operating Is function that I just copied into the busybox implementation.
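For readers unfamiliar with this kind of check, the sketch below shows one way a name-matching helper of this sort can work. It is an assumption about the general shape of such a function, written with only the standard library, and it is not the code copied into this patch; stemIs and toLower are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cctype>
#include <string>

// Return a lower-cased copy so the comparison is case-insensitive.
std::string toLower(std::string S) {
  std::transform(S.begin(), S.end(), S.begin(), [](unsigned char C) {
    return static_cast<char>(std::tolower(C));
  });
  return S;
}

// True if the last occurrence of Tool in Stem is either at the very end of
// Stem or is followed by a non-alphanumeric character. This accepts names
// such as "llvm-strip" and "llvm-strip-13" but rejects "llvm-stripper".
bool stemIs(const std::string &Stem, const std::string &Tool) {
  assert(!Tool.empty() && "tool name must not be empty");
  const std::string S = toLower(Stem);
  const std::string T = toLower(Tool);
  const std::string::size_type I = S.rfind(T);
  if (I == std::string::npos)
    return false;
  const std::string::size_type End = I + T.size();
  return End == S.size() || !std::isalnum(static_cast<unsigned char>(S[End]));
}
```

Matching on a suffix-like stem rather than an exact name lets prefixed or versioned symlinks select the same runtime configuration.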
Adding a tool to the busybox (ideally we would have as few steps as possible):
In the tool's directory:
- Inside the tool's CMakeLists.txt, "librarify" the llvm binary when the busybox cmake flag is ON. This can be done by abstracting out the source files/arguments and passing them to a similarly named LLVM{ToolName} via add_llvm_library. See how this patch does it for llvm-objcopy/dump.
- Note that symlinks should not be added within the tool's CMakeLists.txt if busybox is enabled since the tool target hasn't been made yet. Those will be added to busybox's CMakeLists.txt.
- Add a macro check around the tool's main function to call an externally available main-like function. Something like:
    +#ifdef LLVM_ENABLE_EXPERIMENTAL_BUSYBOX
    +int llvm_objdump_main(int argc, char **argv) {
    +#else
     int main(int argc, char **argv) {
    +#endif
In busybox's directory:
- Inside busybox's CMakeLists.txt, add the newly created LLVM{ToolName} library as a dependency to the llvm tool and add {ToolName} to LLVM_LINK_COMPONENTS.
- At the end of CMakeLists.txt, add any symlinks that would've pointed to the original tool so that they point to the busybox tool (llvm) instead. All symlinks will now point to the busybox tool.
- Note that while it is possible to have symlinks point to other symlinks, I ran into some cmake errors when doing this. See the long FIXME comment in llvm-busybox/CMakeLists.txt.
- In llvm/tools/llvm-busybox/Tools.def, add the appropriate TOOL macros for the various tools that busybox should dispatch to.
- BusyboxName is the tool name that would be passed as the first argument to the llvm binary and what busybox compares against when attempting to dispatch.
- LLVMName is the name of the original tool that would've been created (or symlinked) without busybox.
- MainFunc is the main-like function added to the tool's source in the tool's directory steps above (e.g. llvm_objdump_main).
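As a rough sketch of how TOOL entries with these three fields could be turned into a dispatch table: the example below inlines a stand-in list instead of including Tools.def, since the actual file contents are not shown here. HYPOTHETICAL_TOOLS, ToolEntry, findTool, and the stub entry points are assumptions made for illustration only.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-ins for the main-like functions of librarified tools.
int objdump_main_stub(int, char **) { return 10; }
int objcopy_main_stub(int, char **) { return 20; }

// Stand-in for the entries a Tools.def-style file might contain:
// X(BusyboxName, LLVMName, MainFunc)
#define HYPOTHETICAL_TOOLS(X)                                                  \
  X("objdump", "llvm-objdump", objdump_main_stub)                              \
  X("objcopy", "llvm-objcopy", objcopy_main_stub)

using MainFn = int (*)(int, char **);

struct ToolEntry {
  const char *BusyboxName; // short name passed as the first argument
  const char *LLVMName;    // name of the original tool (or symlink)
  MainFn MainFunc;         // main-like entry point to call
};

// Expand every entry into one row of the table.
#define TOOL(BusyboxName, LLVMName, MainFunc) {BusyboxName, LLVMName, MainFunc},
static const ToolEntry Tools[] = {HYPOTHETICAL_TOOLS(TOOL)};
#undef TOOL

// Look a tool up by either its short name or its original LLVM name.
// Returns nullptr when no entry matches.
const ToolEntry *findTool(const char *Name) {
  assert(Name && "tool name must not be null");
  for (const ToolEntry &T : Tools)
    if (std::strcmp(Name, T.BusyboxName) == 0 ||
        std::strcmp(Name, T.LLVMName) == 0)
      return &T;
  return nullptr;
}
```

Keeping the list in one place means adding a tool to the dispatcher is a single new TOOL line rather than a change to the lookup code.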
Followups:
- The cmake machinery for creating "install" symlinks to an installed version of busybox has not been implemented yet. Under busybox, the only stripped tool we'll need to make is the busybox tool, and all other "stripped" tools should just be symlinks to this stripped busybox.
- Find a way to remove duplicate cmake code in busybox and the original tool directory.
- Figure out Windows support, where creating symlinks is mostly limited to administrators and developer mode. (Maybe we just don't support Windows for now?)
I think we should avoid the term busybox to prevent confusion with https://busybox.net/. Instead, we should probably use the term multiplexing (or muxing for short), which has also been used by https://wdtz.org/files/oopsla18-allmux-dietz.pdf.
We could name the option as LLVM_ENABLE_EXPERIMENTAL_MULTIPLEXING (or LLVM_ENABLE_EXPERIMENTAL_MUXING) and name the tool as llvm-mux.