This diff is similar to what D71394 did for llvm-objdump -- it avoids
trying to look up a section name for STABS symbols, since some STABS
symbol types (like N_OSO) use the n_sect field to store other data
instead of a section index.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
llvm/tools/llvm-readobj/MachODumper.cpp | ||
---|---|---|
274 | type -> Type |
llvm/test/tools/llvm-readobj/MachO/Inputs/stabs.obj.macho-x86_64.yaml | ||
---|---|---|
1 ↗ | (On Diff #294899) | Do you need a separate input file? I think you can inline it to the test file. |
250 ↗ | (On Diff #294899) | I am not an expert in MachO, but It feels that this YAML contains excessive symbols, sections, strings etc. |
llvm/test/tools/llvm-readobj/MachO/stabs-macho.test | ||
1 ↗ | (On Diff #294899) | All of our llvm-readobj tests for ELF normally have a header with a description that explains Also, since your new test is in the MachO folder, you probably don't need a -macho suffix |
3 ↗ | (On Diff #294899) | Please separate CHECK line from RUN line with an empty line. |
llvm/test/tools/llvm-readobj/MachO/Inputs/stabs.obj.macho-x86_64.yaml | ||
---|---|---|
250 ↗ | (On Diff #294899) | it's a bit of a pain to reduce yaml by hand. A lot of the extra symbols are things that get inserted into a trivial Mach-O binary by clang and/or the linker. I've removed a bunch of extra load commands but left the symbols as-is (symbols are harder to remove) |
A nit from me. Leaving the discussion on the Mach-O yaml to others. One of these days somebody's going to break and actually make yaml2obj more usable for Mach-O!
llvm/test/tools/llvm-readobj/MachO/stabs.yaml | ||
---|---|---|
2 | (I wasn't initially sure if the "--" had some special meaning in the context of stabs - it was just easier to change the punctuation) |
llvm/test/tools/llvm-readobj/MachO/stabs.yaml | ||
---|---|---|
121 | I was able to quickly reduce the number of commands here to 2 by simple removing of unused parts: ncmds: 2 ... LoadCommands: - cmd: LC_SEGMENT_64 cmdsize: 232 segname: __TEXT vmaddr: 4294967296 vmsize: 4096 fileoff: 0 filesize: 4096 maxprot: 5 initprot: 5 nsects: 1 flags: 0 Sections: - sectname: __text segname: __TEXT addr: 0x0000000100000FA0 size: 15 offset: 0x00000FA0 align: 4 reloff: 0x00000000 nreloc: 0 flags: 0x80000400 reserved1: 0x00000000 reserved2: 0x00000000 reserved3: 0x00000000 content: 554889E531C0C745FC000000005DC3 - cmd: LC_SYMTAB cmdsize: 24 symoff: 4152 nsyms: 11 stroff: 4328 strsize: 104 | |
210 | I was able to remove the whole ExportTrie description. | |
llvm/tools/llvm-readobj/MachODumper.cpp | ||
287 | I was able to comment out few cases: switch (Type) { case MachO::N_FUN: //case MachO::N_STSYM: //case MachO::N_LCSYM: case MachO::N_BNSYM: //case MachO::N_SLINE: case MachO::N_ENSYM: case MachO::N_SO: //case MachO::N_SOL: //case MachO::N_ENTRY: //case MachO::N_ECOMM: //case MachO::N_ECOML: return true; default: return false; And the test case passed for me. I think it means these cases are not tested properly. |
llvm/test/tools/llvm-readobj/MachO/stabs.yaml | ||
---|---|---|
121 | the resulting binary won't execute successfully though. I kept them because I thought it would be nice to have a somewhat representative input to the linker... | |
llvm/tools/llvm-readobj/MachODumper.cpp | ||
287 | I got these cases from reading stabs.h in the macOS SDK, but I'm not actually sure how to get the linker to generate them. Would you prefer I not print out the section name for all stabs sections, as was done for llvm-objdump in D71394? Another option would be to edit the YAML to create these stabs symbols, but as above, it seemed a bit iffy to be creating something that might not match a realistic input. |
llvm/test/tools/llvm-readobj/MachO/stabs.yaml | ||
---|---|---|
121 | I don't think we do this normally. E.g, usually any`llvm-readobj (ELF)` test cases are as minimal as possible and we don't care about feeding them to a linker and executing. It is significantly easier to read, mantain and expand tests that are little and contain only important parts. | |
llvm/tools/llvm-readobj/MachODumper.cpp | ||
287 |
I am - not an expert in MachO, so this is probably a question to someone who is more familar with MachO. I.e. what output is expected. Perhaps it is fine, since was accepted for llvm-objdump.
For testing - I think this would be a right approach. For ELF dumper part of llvm-readelf we handle many unrealistic cases to check we print a reasonable output and/or warnings and don't crash etc. llvm-readelf is a tool that can be used to inspect broken inputs, so I believe it is fine to feed it with unusual things and check it shows a resonable output. Another point is that if you don't want to create "something that might not match a realistic input", then probably you don't need to write an untested code that supports this first of all? To summarize, I see following solutions here:
|
llvm/tools/llvm-readobj/MachODumper.cpp | ||
---|---|---|
287 | I think keeping the code only for cases covered by test would make for pretty surprising behavior to someone who hasn't seen the corresponding test. I could see some future user being confused as to why an arbitrary subset of STABS symbols get their section names displayed. I think matching llvm-objdump's behavior (and adding a TODO) is easiest for now. |
llvm/test/tools/llvm-readobj/MachO/stabs.yaml | ||
---|---|---|
181 | 2 points:
You've previously said that "symbols are harder to remove", but I've tried and found it is actually very easy: all you need is to remove a symbol and adjust the nsyms value. E.g in the piece below I've removed duplicated symbols with type == 0x64: nsyms: 7 stroff: 4328 strsize: 104 LinkEditData: NameList: - n_strx: 45 n_type: 0x64 n_sect: 0 n_desc: 0 n_value: 0 - n_strx: 72 n_type: 0x66 n_sect: 3 n_desc: 1 n_value: 1601361378 - n_strx: 1 n_type: 0x2E n_sect: 1 n_desc: 0 n_value: 4294971296 - n_strx: 96 n_type: 0x24 n_sect: 1 n_desc: 0 n_value: 4294971296 - n_strx: 1 n_type: 0x24 n_sect: 0 n_desc: 0 n_value: 15 - n_strx: 1 n_type: 0x4E n_sect: 1 n_desc: 0 n_value: 15 - n_strx: 22 n_type: 0x0F n_sect: 1 n_desc: 0 n_value: 4294971296
- n_strx: 72 n_type: 0x66 ## N_OSO n_sect: 3 n_desc: 1 n_value: 1601361378 |
You also need to update the patch title. It says about "symbols of type N_OSO", but the code changes the behavior for all stabs symbols.
You've previously said that "symbols are harder to remove", but I've tried and found it is actually very easy: all you need is to remove a symbol and adjust the nsyms value.
Most of the hassle is in cleaning up the corresponding strings in the string table, but I guess I could have just removed the symbol entries and left the strings in. Anyway, I've cleaned up both now...
I have no more comments (except the one minor nit).
Someone else, who is more famailar with MachO should give a final approve I think.
llvm/test/tools/llvm-readobj/MachO/stabs.yaml | ||
---|---|---|
101 | ## -> # here and below. (We should be consistent in marking comments). |
I think it could be OK to have a single string, e.g "foo". Probably it is not a problem to use it for all symbols in tests like this,
as all you need to check is that something is dumped or not.
Or, it could be a set of strings of the same sizes, e.g. "foo", "bar", "zed" what would allow to computate string index much easier.
Or, a single large string, like "abcdefghijhklm", but you could use different n_strx indices (N...1) so that each symbol got a different name in result ("m". "lm", "klm" etc).
Sure, it is just a problem of yaml2elf COFF that it seems doesn't allow to assing names to symbols easily.
Looks reasonable aside from @grimar's comments, but needs a person with more Mach-O experience to review too before this gets committed.
llvm/test/tools/llvm-readobj/MachO/stabs.yaml | ||
---|---|---|
101 | Did you mean # -> ##? |
llvm/test/tools/llvm-readobj/MachO/stabs.yaml | ||
---|---|---|
101 | I think that's what he meant - comments should use ##. |
llvm/test/tools/llvm-readobj/MachO/stabs.yaml | ||
---|---|---|
101 | Yes :) |
Adding @davide who seems to have contributed significantly to MachODumper.cpp in the past
I'm sorry, I haven't touched this code in a long time and I don't feel qualified to review it.
llvm/tools/llvm-readobj/MachODumper.cpp | ||
---|---|---|
631 | maybe I'm missing something, but it looks like on the lines 631-639 only the case of non-stab symbols is handled, so for a stab symbol SectionName is set to "", is there another place where it's updated ? (to cover the "unless we know that .. " part of the comment). |
llvm/tools/llvm-readobj/MachODumper.cpp | ||
---|---|---|
631 | It's not handled. That's what the TODO comment is for :) |
(I wasn't initially sure if the "--" had some special meaning in the context of stabs - it was just easier to change the punctuation)