This interface allows clients to write multiple modules to a single
bitcode file. Also introduce the llvm-join utility which can be used
to create a bitcode file containing multiple modules.
Details
Diff Detail
- Build Status
Buildable 930 Build 930: arc lint + arc unit
Event Timeline
It'd be nice to add the opposite feature of llvm-join to llvm-extract: being able to list/extract the modules in a single file.
Yes, that's blocked on an appropriate reader interface though (waiting on a response to my RFC).
llvm-combine, llvm-fuse, llvm-merge are alternative names I can think about in the same idea, but that does not distinguish from llvm-link either.
What about llvm-append?
Can we just add an llvm-bcutil and add subcommands to it? This could probably subsume llvm-extract and llvm-bcanalyzer someday too (and we could switch on argv[0] to imitate them if necessary). Maybe we could grow one of those two into llvm-bcutil?
Maybe something like
llvm-bcutil cat (or concat as peter suggested)
llvm-bcutil uncat
llvm-bcutil split-module (current llvm-split?)
llvm-bcutil extract-function (current llvm-extract?)
llvm-bcutil extract-module (for D26778/llvm-modextract?)
llvm-bcutil analyze (current llvm-bcanalyzer?)
llvm-bcutil link-modules (current llvm-link?)
...
We don't have to get there overnight, but having a nice place to put these things with low boilerplate would be good IMO. We shouldn't have significant overhead or boilerplate for adding testing utilities or utilities that are handy for us as developers.
In your list, there are some that are IR level tools, it seems strange to me to put them under a "bcutil" umbrella.
I thought about that, but I think the only one that doesn't actually materialize a Module in memory is llvm-bcanalyzer. So in some sense they are all about more than just bitcode. Also, colloquially, "bc file" is often used when IR is meant ("can you send me a bc file of that so I can debug the issue?"; often, textual IR would be fine in those contexts), so I don't think it's that confusing. I agree it's slightly confusing though.
I'd like llvm-modextract and llvm-join (or llvm-concat) to operate without materializing IR. It is blocked on updating the bitcode to not have absolute reference from the beginning of the file.
- http://clang.llvm.org/demo/cathead.png has decreed that the tool should be renamed llvm-cat
llvm/include/llvm/Bitcode/BitcodeWriter.h | ||
---|---|---|
35 โ | (On Diff #79502) | \brief is only useful if there is more than one sentence. |
llvm/tools/llvm-cat/llvm-cat.cpp | ||
46 โ | (On Diff #79502) | So, if I followed correctly, we should be able to do now as well: { SmallVector<char, 0> Buffer; BitcodeWriter Writer(Buffer); } for (std::string InputFilename : InputFilenames) { std::unique_ptr<MemoryBuffer> MB = ExitOnErr(errorOrToExpected(MemoryBuffer::getFileOrSTDIN(InputFilename))); std::vector<BitcodeModule> Mods = ExitOnErr(getBitcodeModuleList(*MB)); for (auto &BitcodeMod : Mods) Buffer.insert(Buffer.end(), BitcodeMod.getBuffer().begin(), BitcodeMod.getBuffer().end()); } Can we put this behind an option in llvm-cat? I'm not sure sure it is worth creating llvm-binarycat for this. (Also it makes me notice that your current implementation wouldn't handle input files that contains multiple modules) |
llvm/tools/llvm-cat/llvm-cat.cpp | ||
---|---|---|
46 โ | (On Diff #79502) | Seems reasonable enough for now. Let's split it out into llvm-bcutil binarycat once we get around to that though. |
Now that we have D26778 you could update this test I think!