Implement several XCOFF hooks to get '-r' option working for llvm-objdump -r.
Details
Diff Detail
Event Timeline
llvm/lib/BinaryFormat/XCOFF.cpp | ||
---|---|---|
41 | Most architectures use the names R_<ARCH>_... for their relocation names (e.g. R_X86_64_NONE). Is that not the case for XCOFF? | |
65 | This seems fishy. If I was to create a XCOFF object file with a relocation with an unknown type value, would I get this error when e.g. dumping relocations? My feeling is that anything that uses report_fatal_error is a bug waiting to happen. Better would be to return some string indicating an unknown relocation type. | |
llvm/lib/Object/XCOFFObjectFile.cpp | ||
323–331 | Does other code further up the stack check to make sure the file is not 64 bits? If not, this isn't a valid assertion, since the program could reach this point. What would happen then if assertions are disabled? Same comment goes for all the other instances of this. | |
365 | Is this really unreachable? Could you get to here with an object with some invalid relocation or section property? | |
376 | clang-format. But should this really be an assertion rather than an llvm::Error or similar? | |
llvm/test/CodeGen/PowerPC/aix-xcoff-reloc.ll | ||
450 | Nit: remove the second blank line. |
This looks great, thanks Jason!
llvm/lib/BinaryFormat/XCOFF.cpp | ||
---|---|---|
41 | according this document these names are correct as they currently are. https://www.ibm.com/support/knowledgecenter/ssw_aix_72/filesreference/XCOFF.html#XCOFF__sua3i125jbau | |
65 | ELF returns the string "Unknown". Will make it less painful in case this code falls out of sync with new relocs. | |
llvm/lib/Object/XCOFFObjectFile.cpp | ||
365 | Agree. I think a relocation with an invalid r_vaddr could fall through this. |
llvm/lib/BinaryFormat/XCOFF.cpp | ||
---|---|---|
65 | for enum type , if your switch case has enumerate all the case, | |
llvm/tools/llvm-objdump/XCOFFDump.cpp | ||
26 | I think you can use function unwrapOrError() unwrapOrError(Obj.getFileName(), Obj.getSymbolName(SymbolDRI)); |
llvm/lib/BinaryFormat/XCOFF.cpp | ||
---|---|---|
65 |
please do not use the default, otherwise it will cause a compiler error with clang Does this apply for all enum variations? It's trivially easy to cast something such that it isn't a recognised value. Regardless, in this context, you can easily return outside the switch. | |
llvm/lib/Object/XCOFFObjectFile.cpp | ||
323–331 | I consider report_fatal_error to be as bad as an assertion for code that can get to here, as the error is useless and implies an internal problem in llvm, which isn't true if it's only hit due to malformed input. If at all possible, I'd recommend using llvm::Expected to report errors, if it's possible within the interface (what do other ObjectFile variations do?). | |
365 | See my earlier comment. Please use llvm::Expected if possible, not report_fatal_error. |
llvm/lib/Object/XCOFFObjectFile.cpp | ||
---|---|---|
323–331 | I think the report_fatal_error is appropriate here. In here, we do imply there is an internal problem in llvm for XCOFF object file, as our tooling do not support parsing 64 bit object yet (not that the input object file is malformed input). |
llvm/lib/Object/XCOFFObjectFile.cpp | ||
---|---|---|
376 | Ideally, we should return llvm::Expect<symbol_iterator> for this function. But if we do that, it would be a very big interface change across the board. We need to return llvm::Expect<symbol_iterator> for RelocationRef's getSymbol() as well, and there are several dozens of places used RelocationRef's getSymbol() method. For other object files, they would return symbol_end(), and I think it would make more sense to follow suit in this case. |
llvm/lib/Object/XCOFFObjectFile.cpp | ||
---|---|---|
365 | As mentioned below, I find it a bit challenge to change the existing interface to return llvm::Expected. Although this one might not be as hard as getRelocationSymbol, it is still a non-trivial task after some investigation (i.e. there are python script actually depending on existing interface if I'm not mistaken). So I'm thinking to proceed with some less resistant way. Other target for this function will return the virtual address stored in the relocation entry without checking and calculation. |
Address comments.
llvm/tools/llvm-objdump/XCOFFDump.cpp | ||
---|---|---|
26 | unwrapOrError is not suitable for this interface when we actually want to propagate the llvm::Error to the caller. |
@jhenderson @davidb @DiggerLin
Thanks for all the comments. I have updated the patch, and you are welcome to review again anytime.
Thanks again.
llvm/tools/llvm-objdump/XCOFFDump.cpp | ||
---|---|---|
22 | what happen if the SymI point to symbol_end() ? |
llvm/tools/llvm-objdump/XCOFFDump.cpp | ||
---|---|---|
22 | We will return error for it. Thanks for point it out. |
llvm/lib/Object/XCOFFObjectFile.cpp | ||
---|---|---|
325 | Please replace Typename* binding with Declarator *binding. |
llvm/lib/BinaryFormat/XCOFF.cpp | ||
---|---|---|
36 | This does not follow the convention of placing the backslash on the same column for all lines of a macro definition. | |
llvm/lib/Object/XCOFFObjectFile.cpp | ||
21–26 | This should be in an anonymous namespace if a drive-by fix is okay. | |
25 | The types associated with these constants do not appear to be uniform. It does not appear that these should be all enumerators of the same enumeration type. |
llvm/lib/BinaryFormat/XCOFF.cpp | ||
---|---|---|
65 | Sounds good to me. | |
llvm/lib/Object/XCOFFObjectFile.cpp | ||
323–331 | I had a look at what other object files do. In ELF, an error is simply thrown away and a null iterator is constructed. In COFF, a report_fatal_error is reached. Mach-O and wasm both don't have any errors. I'm not happy about the current state, but I suppose there's prior art and it needs a bigger cleanup anyway. Plus, once 64-bit is implemented, this goes away. | |
324 | Here and all report_fatal_error calls: remove the trailing full stop. | |
325 | Isn't this handled by clang-format? | |
365 | I had a play around with this. I think a change in the interface is viable, and not all that invasive, with an issue in the llvm-c level being the only one that stopped me trying to push the change further for now. UINT64_MAX is probably a better value than zero, I suppose. I'm not massively happy with it, since technically, there's nothing stopping a single-byte-patching relocation patching the very last value in a massive ELF. However, in practice that's never going to happen. Could you put the value as a static constant in the XCOFF object file, and then refer to it elsewhere? That would at least ensure the meaning of the value is clear. You'd want to mention that it returns the constant in the function's doxygen comment. | |
367–368 | If you pull the value into a named constant, you won't need this comment. | |
llvm/tools/llvm-objdump/XCOFFDump.cpp | ||
25–26 | Errors usually have lower case first letter and no trailing full stop. |
llvm/lib/Object/XCOFFObjectFile.cpp | ||
---|---|---|
324 | We have been trying to full stops for XCOFF work for invocations of report_fatal_error, llvm_unreachable, and assert in order to be consistent within our patches and after having observed that such uses are not uncommon. For example: llvm/lib/Object/COFFObjectFile.cpp: report_fatal_error("Section was outside of section table."); llvm/lib/Object/COFFObjectFile.cpp: report_fatal_error("Aux Symbol data was outside of symbol table."); llvm/lib/Object/MachOObjectFile.cpp: report_fatal_error("Malformed MachO file."); llvm/lib/Object/MachOObjectFile.cpp: report_fatal_error("Requested symbol index is out of range."); | |
325 | It does for me now. I'm not certain it always did so in the past. | |
llvm/tools/llvm-objdump/XCOFFDump.cpp | ||
25–26 | Thanks for pointing this out. We will correct this. |
llvm/lib/Object/XCOFFObjectFile.cpp | ||
---|---|---|
337 | clang-format here too. Please check throughout the patch. | |
360 | I believe it is more idiomatic to place the range boundaries on either end of the comparison: Sec32->VirtualAddress <= RelocAddress && RelocAddress < (Sec32->VirtualAddress + Sec32->SectionSize) | |
364 | I believe it is customary to use prefix increment when possible: https://llvm.org/docs/CodingStandards.html#prefer-preincrement | |
389 | Minor nit: Functions that can be "simple one-liners" should be written as such. The overhead of the declaration for Reloc is a large proportion of this function, especially considering its (lack of) utility here. | |
llvm/tools/llvm-objdump/XCOFFDump.cpp | ||
2 | Line exceeds 80 columns. |
llvm/tools/llvm-objdump/XCOFFDump.cpp | ||
---|---|---|
25 | Suggestion: |
@jhenderson @hubert.reinterpretcast
Thanks, James and Hubert. I believe I addressed all your comments.
llvm/include/llvm/Object/XCOFFObjectFile.h | ||
---|---|---|
280 ↗ | (On Diff #250939) | Please add a blank line before the comment block. |
281 ↗ | (On Diff #250939) | Suggestion: |
282 ↗ | (On Diff #250939) | Suggestion: |
llvm/lib/Object/XCOFFObjectFile.cpp | ||
22 | My understanding of @jhenderson's comment is that this should be an interface constant in a header. |
llvm/include/llvm/Object/XCOFFObjectFile.h | ||
---|---|---|
252 ↗ | (On Diff #251074) | Will UINT64_C(-1) be clearer? |
llvm/lib/Object/XCOFFObjectFile.cpp | ||
---|---|---|
22 | I think it's good for internal linkage, and make it unreachable from other translation units. |
llvm/include/llvm/Object/XCOFFObjectFile.h | ||
---|---|---|
252 ↗ | (On Diff #251074) | I would use std::numeric_limits<uint64_t>::max(), since it's clearer to me than a literal. It's meaning seems obvious. |
284–287 ↗ | (On Diff #251473) | Please reflow this comment so that it isn't weirdly spaced (especially the end of the third line). I'd actually recommend another change too: "\returns the relocation offset with the base address of the containing section as zero, or InvalidRelocOffset on error (for example for a relocation that does not refer to an address in any section)." The "\returns" style is a doxygen thing. |
llvm/lib/Object/XCOFFObjectFile.cpp | ||
19–21 | At this point would it just make more sense to make these static constant literals? // TODO: Use appropriate naming convention - are these spec-defined, or just local names? static const int RELOC_OVERFLOW = 65535; // Use an appropriate type here. static const uint8_t FUNCTION_SYM = 0x20; static const uint8_t SYM_TYPE_MASK = 0x07; static const uint16_t NO_REL_MASK = 0x0001; | |
324 | They may not be uncommon, because people haven't been particularly careful with how they write error messages in the past. Whereever possible, I've tried to encourage people to adopt a single consistent style - note that most (all?) of our error messages in clang, LLD and various tools like llvm-readobj and llvm-objdump use lower-case and no trailing full stop. report_fatal_error calls should be no different. | |
361 | I'm pretty certain you don't need the parentheses here. | |
llvm/test/tools/llvm-objdump/XCOFF/disassemble-all.test | ||
1–5 ↗ | (On Diff #251473) | Please don't test -r + -D instead of -D on its own. -D should be tested on its own. -r should be tested on its own in a separate test. I'd also recommend -r + -d be tested there too (or that could even be a different test file if you want). |
llvm/include/llvm/Object/XCOFFObjectFile.h | ||
---|---|---|
252 ↗ | (On Diff #251074) | This was my original implementation, I don't have anything against it. @hubert.reinterpretcast Any comments? |
llvm/lib/Object/XCOFFObjectFile.cpp | ||
19–21 | This was a drive-by fix because I touched this line when I want to put in InvalidRelocOffset. Now, it's irrelevant to the current patch, and I will put it to its original state. |
llvm/include/llvm/Object/XCOFFObjectFile.h | ||
---|---|---|
252 ↗ | (On Diff #251074) | I was hoping to not add an extra header inclusion to an interface file, but readability concerns are more important. @jasonliu, sorry for the churn. |
llvm/lib/Object/XCOFFObjectFile.cpp | ||
324 | @jhenderson, I agree that a single consistent style would be great. I note that the Coding Standards do not document this style. It is also unfortunate that this style is not consistent, in respect to the capitalization of the first word, with the examples in the Coding Standards for strings in assert and llvm_unreachable. If you find consensus to update the Coding Standards to document this style, it would probably help to improve the consistency in the project. |
llvm/lib/Object/XCOFFObjectFile.cpp | ||
---|---|---|
324 | Please stop capitalizing and stop using a trailing full stop. For the common infrastructure of LLVMObject and binary utilities where James and several other contributors spending a lot of time on, we have made lots of efforts improving the diagnostics and their consistency. As a relatively new user of these infrastructure, it'd be much appreciated if XCOFF could follow suite. I hope this XCOFF development in Object won't be sort of code dump. |
llvm/lib/Object/XCOFFObjectFile.cpp | ||
---|---|---|
324 | We are good with the consistent style. And I will make the change. We are just asking if it make sense to update Coding Standards to document the style as right now it's not all consistent through out the project, and it's easy to follow the wrong precedent.
I don't find this comment to be very constructive. |
llvm/lib/Object/XCOFFObjectFile.cpp | ||
---|---|---|
324 | I believe that this is the first patch for which we were told about this convention. No one had said we were not changing this after James indicated that he believes it to be an important point. I find the insinuation that a "code dump" is being attempted to be rather extreme. |
Remove trailing stop on report_fatal_error and various comment addressed.
llvm/lib/Object/XCOFFObjectFile.cpp | ||
---|---|---|
19–21 | Sorry, just realized that NO_REL_MASK is the reason I need to do the drive-by fix and it's still needed. So I will use the static const instead. |
Thanks for all the constructive comments.
@jhenderson @MaskRay Let me know if current revision looks good to you.
llvm/test/tools/llvm-objdump/XCOFF/disassemble-all.test | ||
---|---|---|
66 ↗ | (On Diff #251700) | It may look slightly better if you add enough spaces after : to make the first letter on the right of CHECK-D-r-NEXT: |
llvm/lib/Object/XCOFFObjectFile.cpp | ||
---|---|---|
324 | Thanks for making the change. I'll post a question on llvm-dev about whether we should update the coding standard. I guess assert and llvm_unreachable are a little different, because of how they are printed. I'll do some investigation and write a full proposal up. | |
llvm/test/tools/llvm-objdump/XCOFF/disassemble-all.test | ||
2 ↗ | (On Diff #251700) | Indent continuation lines by two extra spaces to show they are clearly continuations: # RUN: llvm-objdump -D %p/Inputs/xcoff-section-headers.o | \ # RUN: FileCheck %s I'm suggesting it for all the checks. |
10–11 ↗ | (On Diff #251700) | As mentioned, I think this case should be moved to a different test file, as it's really testing something unrelated. |
66 ↗ | (On Diff #251700) | I'd recommended using FileCheck's --check-prefixes option to fold these two sequences together, as it will more clearly show the differences. Example: # RUN: llvm-objdump -D %p/Inputs/xcoff-section-headers.o | \ # RUN: FileCheck %s # RUN: llvm-objdump -D -r %p/Inputs/xcoff-section-headers.o | \ # RUN: FileCheck %s --check-prefixes=CHECK,WITH-R CHECK: Inputs/xcoff-section-headers.o: file format aixcoff-rs6000 CHECK: Disassembly of section .text: CHECK: 00000000 <.text>: CHECK-NEXT: 0: 80 62 00 04 lwz 3, 4(2) WITH-R-NEXT: 00000002: R_TOC CHECK-NEXT: 4: 80 63 00 00 lwz 3, 0(3) |
Address comments on test.
llvm/lib/Object/XCOFFObjectFile.cpp | ||
---|---|---|
324 | Thanks for the effort. Appreciate it. |
llvm/test/tools/llvm-objdump/XCOFF/disassemble-all.test | ||
---|---|---|
23 ↗ | (On Diff #252040) | I think additional prefixes can be added for the .text part to common up the -D -r and the -d -r. |
llvm/test/tools/llvm-objdump/XCOFF/disassemble-all.test | ||
---|---|---|
23 ↗ | (On Diff #252040) | Common up for the two cases make sense as they are identical for 95% of the time right now, and does not need much effort to parse both of the them. |
llvm/test/tools/llvm-objdump/XCOFF/disassemble-all.test | ||
---|---|---|
23 ↗ | (On Diff #252040) | Just for sketching out the effect, the end result would be using: With: TEXT: Inputs/xcoff-section-headers.o: file format aixcoff-rs6000 TEXT: Disassembly of section .text: TEXT: 00000000 <.text>: TEXT-NEXT: 0: 80 62 00 04 lwz 3, 4(2) TEXT-R-NEXT: 00000002: R_TOC a TEXT-NEXT: 4: 80 63 00 00 lwz 3, 0(3) TEXT-NEXT: 8: 4e 80 00 20 blr TEXT-NEXT: c: 00 00 00 00 <unknown> TEXT-NEXT: 10: 00 00 20 40 <unknown> TEXT-NEXT: 14: 00 00 00 01 <unknown> TEXT-NEXT: 18: 00 00 00 0c <unknown> TEXT-NEXT: 1c: 00 04 66 75 <unknown> TEXT-NEXT: 20: 6e 63 00 00 xoris 3, 19, 0 TEXT-NEXT: ... ALL: Disassembly of section .data: ALL: 00000080 <func>: ALL-NEXT: 80: 00 00 00 94 <unknown> ALL: 00000084 <a>: ALL-NEXT: 84: 00 00 00 a4 <unknown> ALL: 00000088 <b>: ALL-NEXT: 88: 00 00 00 a0 <unknown> ALL: 0000008c <c>: ALL-NEXT: 8c: 00 00 00 08 <unknown> ALL: 00000090 <d>: ALL-NO-R-NEXT: 90: 00 00 00 00 <unknown> ALL-R-NEXT: ... ALL: 00000094 <func>: ALL-NEXT: 94: 00 00 00 00 <unknown> ALL-NEXT: 98: 00 00 00 80 <unknown> ALL-NEXT: 9c: 00 00 00 00 <unknown> ALL: 000000a0 <b>: ALL-NEXT: a0: 00 00 30 39 <unknown> ALL: Disassembly of section .bss: ALL: 000000a4 <a>: ALL-NEXT: ... ALL: Disassembly of section .tdata: ALL: 00000000 <d>: ALL-NEXT: 0: 40 09 21 f9 bdnzfl 9, .+8696 ALL-NEXT: 4: f0 1b 86 6e <unknown> ALL: Disassembly of section .tbss: ALL: 00000008 <c>: ALL-NEXT: ... |
llvm/test/tools/llvm-objdump/XCOFF/print-reloc.test | ||
2 ↗ | (On Diff #252040) | Minor nit: No need to use a --check-prefix now that this is a separate file. |
llvm/test/tools/llvm-objdump/XCOFF/disassemble-all.test | ||
---|---|---|
23 ↗ | (On Diff #252040) | Thanks for generating the proposed result. |
llvm/test/tools/llvm-objdump/XCOFF/print-reloc.test | ||
2 ↗ | (On Diff #252040) | Thanks. I will address it in the commit if I don't receive other comments. |
Just confirming that this LGTM (with the minor comment posted before). Please wait to see if @jhenderson agrees that his comments have been addressed.
llvm/test/tools/llvm-objdump/XCOFF/disassemble-all.test | ||
---|---|---|
23 ↗ | (On Diff #252040) | Okay, thanks for looking. |
Thanks for the effort. This is nearly there.
llvm/test/tools/llvm-objdump/XCOFF/disassemble-all.test | ||
---|---|---|
7–8 ↗ | (On Diff #252040) | -d -r should be tested in a test for the --disassemble option, not --disassemble-all. Arguably in fact, you don't need it tested at all, as the -D -r combination is probably sufficient. |
23 ↗ | (On Diff #252040) | So I think the question is really, what is the purpose of the test? Is it to allow easy comparison of two outputs or something else? As things stand, it isn't easy to compare said two outputs. This is of course slightly moot, since there really should only be two test cases in this file anyway (see my comment above). |
llvm/test/tools/llvm-objdump/XCOFF/print-reloc.test | ||
2 ↗ | (On Diff #252040) | You should probably add --match-full-lines and --strict-whitespace to this FileCheck call, especially the former of these. This might require you to do some extra indentation stuff to make things look okay (see below - it ensures that all output lines up): RELOC:Inputs/xcoff-section-headers.o: file format aixcoff-rs6000 RELOC:RELOCATION RECORDS FOR [.text]: RELOC-NEXT:OFFSET TYPE VALUE RELOC-NEXT:00000002 R_TOC a RELOC:RELOCATION RECORDS FOR [.data]: RELOC-NEXT:OFFSET TYPE VALUE ... |
4–14 ↗ | (On Diff #252040) | Use '##' for comments. |
16 ↗ | (On Diff #252040) | ';' -> '#' for consistency with other lit directives. Also, this should be at the top of the file as is common for REQUIRES directives. |
20 ↗ | (On Diff #252040) | Add RELOC-EMPTY here and after the last relocation in the other block to show that all relocations have been printed. |
LGTM, once my last comment has been addressed and question answered (assuming that the question doesn't highlight an issue in the patch itself). Thanks for putting up with my lengthy review!
llvm/test/tools/llvm-objdump/XCOFF/disassemble-all.test | ||
---|---|---|
45–46 ↗ | (On Diff #252597) | I find this behaviour difference confusing. What is causing it? |
llvm/test/tools/llvm-objdump/XCOFF/print-reloc.test | ||
30 ↗ | (On Diff #252597) | Still need a CHECK-EMPTY: at the end here. |
llvm/test/tools/llvm-objdump/XCOFF/disassemble-all.test | ||
---|---|---|
45–46 ↗ | (On Diff #252597) | This block of code kicks in: if (!DisassembleZeroes) { uint64_t MaxOffset = End - Index; // For -reloc: print zero blocks patched by relocations, so that // relocations can be shown in the dump. if (RelCur != RelEnd) MaxOffset = RelCur->getOffset() - Index; if (size_t N = countSkippableZeroBytes(Bytes.slice(Index, MaxOffset))) { outs() << "\t\t..." << '\n'; Index += N; continue; } } The MaxOffset is larger when we have relocation. So we start to print '...' for that. In this file, you might think there is no relocation showing up after line 25. But it's actually there if we check the new print-reloc.test. The reason llvm-objdump is not printing them is because of this patch: https://reviews.llvm.org/D73531. Since we could not disassemble those instructions in the data section, we decide not to analysis them and skipped the relocation printing. This skip printing relocation is not ideal for XCOFF as you could see, but I would like to address it in a different patch. |
llvm/test/tools/llvm-objdump/XCOFF/print-reloc.test | ||
30 ↗ | (On Diff #252597) | Thanks. I will address it in the commit if there is no other comment. |
llvm/test/tools/llvm-objdump/XCOFF/disassemble-all.test | ||
---|---|---|
45–46 ↗ | (On Diff #252597) | To elaborate why MaxOffset is larger when we have relocation: |
Suggestion to match the other BinaryFormats getRelocationTypeString -> getRelocationTypeName