Add basic binary support for chained fixups. This allows basic tests
with chained fixups without trying to create a format for them until the
work on the Object library is considered finished.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
Looping in @lhames in case he has any thoughts from Apple.
From a design perspective I'm not sure this is the approach I'd take, but I also don't know that I'm the best person to judge since I'm not going to be the user here.
When I added the initial mach-o support the emphasis was on representing the binary accurately enough to reproduce binary-accurate transformations, but to also have the YAML representation be human readable wherever it could be. This patch really just treats it as binary data, which makes it pretty unreadable.
llvm/lib/ObjectYAML/MachOEmitter.cpp | ||
---|---|---|
603 | ||
610 | ||
617 | ||
llvm/tools/obj2yaml/macho2yaml.cpp | ||
630 |
Agree 100%. The need from this come from D133974. I had the need of testing that the data was moved correctly during objcopy, but I don't have the need to inspect the data itself. My hope is that whenever someone has the need to check that the data is actually representing the right thing, they will have the knowledge of what the data represents and how better transform it into YAML. I prefer not to check-in executables in the source tree, but if we don't even want this code as a temporary solution, I think I can work around the need of adding this support.
Amusingly the only documentation I'm aware of that documents the format of the export trie... is in a file you modified in this change: https://github.com/llvm/llvm-project/blob/main/llvm/tools/obj2yaml/macho2yaml.cpp#L473
llvm/lib/ObjectYAML/MachOYAML.cpp | ||
---|---|---|
171 | Should this be LinkEditData.DyldExportsTrie? | |
llvm/tools/obj2yaml/macho2yaml.cpp | ||
646 | So the exports trie is a little weird. It used to live in the LC_DYLD_INFO/LC_DYLD_INFO_ONLY along with bind opcodes. If the binary is using opcode based fixups, then it is required that the trie come from there, not the LC_DYLD_EXPORTS_TRIE. If the binary is newer, and uses LC_DYLD_CHAINED_FIXUPS then it is required that the export trie (if there is one) comes from the LC_DYLD_EXPORTS_TRIE. I don't know if that impacts this code or not, but roughly i'd expect the export trie to be dumped either next to the bind opcodes as part of the LC_DYLD_INFO, or next to the chained fixups as a LC_DYLD_EXPORTS_TRIE. Does this dump method handle both those cases, or just the LC_DYLD_EXPORTS_TRIE case? |
😆 I think you wanted to use "As usual", and not "Amunsigly". Lacking format documentation seems to be a constant with Mach-O 😆
Do you happen to know if the contents of the newer LC_DYLD_EXPORTS_TRIE have the same format as the previous trie? It seems to me that they look very similar. I wonder if I can reuse a little bit of the code and improve this to have a more human-friendly dump (I will need to modify some code of the Object library first, which seems to not deal with this old place/new place duality).
llvm/lib/ObjectYAML/MachOYAML.cpp | ||
---|---|---|
171 | Thanks. Good eyes. I did not realize that the initial yaml dump had the same values in both places. | |
llvm/tools/obj2yaml/macho2yaml.cpp | ||
646 | Thanks for the explanation. It clarifies why I could not get one with both LC_DYLD_INFO and LC_DYLD_CHAINED_FIXUPS. I was confuse why there was the design had two structures named almost the same ("exports trie" and a "dyld exports trie"). As you pointed out above, this was doing the wrong thing. I think autocomplete of dyld… found getDyldInfoExportsTrie and I did not realize that I was looking for the wrong piece of the binary. I changed the code to find the LC and dump the contents directly. This code also only deals with LC_DYLD_EXPORTS_TRIE, since it was the one we started having problems (for the problems see D133974). It seems that the code in dumpExportsTrie and processExportNode dump the other. |
I'm not familiar with Mach-O object files. I can only give some minor suggestions on your codes.
llvm/lib/ObjectYAML/MachOEmitter.cpp | ||
---|---|---|
603 | Since Obj.LinkEdit.ChainedFixups is an array of bytes. You can simply serialize it using if (Obj.LinkEdit.ChainedFixups.size() > 0) OS.write(reinterpret_case<const char *>(Obj.LinkEdit.ChainedFixups.data()), Obj.LinkEdit.ChainedFixups.size()); | |
610 | Ditto. | |
617 | Ditto. | |
llvm/tools/obj2yaml/macho2yaml.cpp | ||
671 | Why not using reinterpret_case<> here? |
llvm/tools/obj2yaml/macho2yaml.cpp | ||
---|---|---|
671 | void* to uint8_t* doesn’t need a reinterpret_cast so it is better to use a static_cast. |
llvm/tools/obj2yaml/macho2yaml.cpp | ||
---|---|---|
671 | Yes, I mean why not just uint8_t *Bytes = reinterpret_cast<uint8_t *>(&DICE); |
llvm/tools/obj2yaml/macho2yaml.cpp | ||
---|---|---|
671 | Ah, I see. The single reinterpret_cast is probably clearer here. Sorry for the noise. |