This is an archive of the discontinued LLVM Phabricator instance.

[AIX] llvm-readobj support a new option --exception-section for xcoff object file.
ClosedPublic

Authored by DiggerLin on Aug 31 2022, 8:28 AM.

Diff Detail

Event Timeline

DiggerLin created this revision.Aug 31 2022, 8:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 31 2022, 8:28 AM
DiggerLin requested review of this revision.Aug 31 2022, 8:28 AM
Herald added a project: Restricted Project. · View Herald TranscriptAug 31 2022, 8:28 AM
pscoro added inline comments.Aug 31 2022, 9:53 AM
llvm/tools/llvm-readobj/XCOFFDumper.cpp
145

On the example I am running this just prints "Symbol Index: )", although I checked the individual SymIdx and SymName variables and they have the right values.

fixed Paul Scoropan's report bug. thanks for Paul

DiggerLin marked an inline comment as done.Aug 31 2022, 12:38 PM

I did wonder whether this should really just be reusing the existing --unwind option, but as I understand it, the XCOFF exception section isn't really about unwinding the stack or anything along those lines. Is that correct?

Rather than canned binaries, I think it wouldn't be too hard to add yaml2obj support for exception sections, so that you can create the input at test time.

Your commit message should be a grammatically correct sentence, with leading capital letter (i.e. "Add a new ..."), much like comments.

I've not really looked at the XCOFFObjectFile code changes or the testing just yet. Please don't forget to test error/warning paths, to show that the code can handle malformed inputs.

llvm/docs/CommandGuide/llvm-readobj.rst
335

I think you can simplify to the inline suggestion.

llvm/lib/Object/XCOFFObjectFile.cpp
94

This file hasn't been clang-formatted. Please fix.

1036–1039

I recommend folding this if into the assertions, possibly like the inline suggestion. Otherwise you have an empty if without assertions, which seems messy (although the optimizer should get rid of it in this case).

llvm/test/tools/llvm-readobj/XCOFF/exception-section.test
10

I think it would be more similar to other dumping formats to print this row as:
Symbol: .bar (12).

llvm/tools/llvm-readobj/XCOFFDumper.cpp
143

unwrapOrError should be considered deprecated in llvm-readobj, as it stops llvm-readobj from continuing, which is not useful for dumping tools, especially given that the error in this case won't prevent other files or sections from being dumped. Instead, prefer reporting problems as warnings (via reportUniqueWarning or equivalent). See the ELF dumper for good examples.

152

Ditto: don't use unwrapOrError.

llvm/tools/llvm-readobj/llvm-readobj.cpp
510–511

Is there a reasson you're not putting this up with the other XCOFF-specific dumping option?

DiggerLin edited the summary of this revision. (Show Details)Sep 1 2022, 6:32 AM
DiggerLin marked an inline comment as done.Sep 1 2022, 7:01 AM

I did wonder whether this should really just be reusing the existing --unwind option, but as I understand it, the XCOFF exception section isn't really about unwinding the stack or anything along those lines. Is that correct?

yes, you are correct, the exception section is not for unwind. so we can not use the --unwind for it.

Rather than canned binaries, I think it wouldn't be too hard to add yaml2obj support for exception sections, so that you can create the input at test time.

yes, if I do the yaml2obj first , I need a tools to decode the object file generated by yaml2obj when I add a test for the yaml2obj, but there is not now. My propose is three steps:

  1. using canned binaries in this patch,
  2. write a second patch to yaml2obj support for exception sections(using the llvm-readobj --exception-section to test the second patch).
  3. have another patch to modify the llvm/test/tools/llvm-readobj/XCOFF/exception-section.test which use the yaml2obj to generate xcoff object with exception sections and replace the canned binaries in the test case.

Your commit message should be a grammatically correct sentence, with leading capital letter (i.e. "Add a new ..."), much like comments.

I've not really looked at the XCOFFObjectFile code changes or the testing just yet. Please don't forget to test error/warning paths, to show that the code can handle malformed inputs.

DiggerLin updated this revision to Diff 457337.Sep 1 2022, 11:33 AM
DiggerLin marked 4 inline comments as done.

address James's comment.

llvm/lib/Object/XCOFFObjectFile.cpp
1036–1039

thanks for explain. fixed

llvm/test/tools/llvm-readobj/XCOFF/exception-section.test
10

according to https://www.ibm.com/docs/en/aix/7.2?topic=formats-xcoff-object-file-format#XCOFF__iua3i23ajbau

e_addr.e_symndx+
Symbol table index for function

the value of the field is for symbol index, so putting symbol index at first and putting symbol name in brackets are reasonable.

llvm/tools/llvm-readobj/llvm-readobj.cpp
510–511

If there are several options in the command line at the same time. I want to keep the output as order of content xcoff object file as much as possible.
If you do not agree with this, I can put the all the XCOFF-specific dumping option together.

jhenderson added a comment.EditedSep 2 2022, 12:40 AM

Rather than canned binaries, I think it wouldn't be too hard to add yaml2obj support for exception sections, so that you can create the input at test time.

yes, if I do the yaml2obj first , I need a tools to decode the object file generated by yaml2obj when I add a test for the yaml2obj, but there is not now. My propose is three steps:

  1. using canned binaries in this patch,
  2. write a second patch to yaml2obj support for exception sections(using the llvm-readobj --exception-section to test the second patch).
  3. have another patch to modify the llvm/test/tools/llvm-readobj/XCOFF/exception-section.test which use the yaml2obj to generate xcoff object with exception sections and replace the canned binaries in the test case.

I think my preferred option would be to use yaml2bj/obj2yaml to test each other, with an additional manual step for now:

  1. Write yaml2obj code for the desired behaviour, and obj2yaml code which can convert the object back into YAML, which you can then check with FileCheck.
  2. After running the test, use your system tools to manually check the temporary object file created by the test looks correct. If so, land this change.
  3. In a second patch (this one), use yaml2obj to create your test inputs and verify that llvm-readobj can dump them correctly.
  4. You could also manually verify using your system tools that the input looks as you would expect again.
  5. Add llvm-readobj checks to your yaml2obj tests.
llvm/test/tools/llvm-readobj/XCOFF/exception-section.test
10

This isn't any different to how ELF is usually dumped. For example, when dumping ELF relocations (see for example https://github.com/llvm/llvm-project/blob/main/llvm/test/tools/llvm-readobj/ELF/relocations.test), the ELF format refers to a symbol index, but the printout displays the symbol name. In general, if a user is trying to find out information, it is more likely that they are interested in the symbol not the index of that symbol, so printing the name first is the more useful thing.

llvm/test/tools/llvm-readobj/XCOFF/invalid-exception-section.test
2
4

dd isn't used anywhere else in the tests as far as I can tell, which means adding it might break some users who do not have that tool. Instead, you can use python to achieve the same effect, by reading in the file and modifying the specific bytes.

(Although this is where yaml2obj is more useful)

5

Use echo not printf.

8
llvm/tools/llvm-readobj/XCOFFDumper.cpp
144

I don't think you've tested this warning?

llvm/tools/llvm-readobj/llvm-readobj.cpp
510–511

Okay, your explanation makes sense, thanks. Please put it in comments somewhere in the file, e.g. "this data appears early in XCOFF files so display it first" etc.

jhenderson added inline comments.Sep 2 2022, 1:06 AM
llvm/include/llvm/Object/XCOFFObjectFile.h
231

Why the public directive? It makes it look like the fields are private, except they aren't because this is a struct.

238

I'd also consider replacing Reason with Reason != 0. This makes it a mirror of the getSymbolIndex function.

249–250

Why the duplication? In fact, why the extern template at all?

llvm/lib/Object/XCOFFObjectFile.cpp
407

I think it would help the understandability of these changes if the refactoring was split into a separate prerequisite patch. You'd thus have two patches (well three if you include the yaml2obj patch discussed out of line):

  1. NFC refactoring to make the loader section code reusable.
  2. Implementation of the exception section.
430
436–437

I'd use the slightly shorter names SectionOffset and SectionSize.

442

Don't start blocks with a blank line.

446–451

All the magic numbers make this code completely opaque. Why "3", why "16", why "1" etc. I think you'd be better off assigning them to const variables so you can name them.

Also, I think the consensus is that lambdas should have name style like variables, because they are function objects rather than pure functions. In other words, I'd name this GetSectionName (no need to abbreviate).

794–798

Could a template function (or auto lambda) allow you to avoid this duplication (passing in the sections32/sections64 functions as parameters)?

1030

Probably delete this blank line.

1036–1037

This assertion is independent of finding the section, so should probably the first line of the function.

DiggerLin updated this revision to Diff 458744.Sep 8 2022, 8:04 AM
DiggerLin marked 19 inline comments as done.

address comment.

llvm/include/llvm/Object/XCOFFObjectFile.h
231

just for readable to separate the data member and function.

249–250

"extern template" means the template is initiated somewhere.
for the template has member function. without the "extern template". there maybe member function maybe be initiate in different compile unit. it waste the the code.

llvm/lib/Object/XCOFFObjectFile.cpp
407

the function getLoaderSectionAddress is mapped into getSectionFileOffsetToRawData(XCOFF::STYP_LOADER).

and the function getSectionFileOffsetToRawData has more functionality than only get getLoaderSectionAddress. if split into two patch. I do not think it is NFC patch.

446–451

the functionality of the mapping the value SectionTypeFlags into string.

https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/BinaryFormat/XCOFF.h

794–798

good idea.

jhenderson added inline comments.Sep 9 2022, 1:00 AM
llvm/include/llvm/Object/XCOFFObjectFile.h
231

just for readable to separate the data member and function.

The addition of an explicit and unnecessary public makes things less readable for me, not more. If you wish to explicitly call out that the method is public, put the method before the members and label the whole struct with public.

Also nit: add a blank line between your methods, where your methods are multi line.

249–250

We don't bother with extern templates in many other places, so I'm not sure why you felt that this particular case needed it?

Actually, after writing that, I see it's the same for other XCOFFObjectFile declarations, so it's probably not worth me worrying about in this patch, although it's still the case that there's limited use of this feature outside this file. I'd still like to know why you feel like these structs need it specifically, when in most cases within LLVM we don't bother.

llvm/lib/Object/XCOFFObjectFile.cpp
407

All the more reason to split this up then: if the first patch isn't a pure refactor, and adds new functionality, that functionality should be added and tested for the loader section. If on the other hand it is not used, the additional functionality can be added in the second patch.

446–451

Correct me if I'm wrong, but it looks to me like this is just an over-complicated way of mapping the section type values to section names. Wouldn't a switch statement be significantly clearer? Something like:

StringRef getSectionName(int32_t SectType) {
  switch(SectType) {
  case STYP_PAD:
    return "pad";
  case STYPE_DWARF:
    return "dwarf";
  ...
  default:
    return "<unknown: " + SectType + ">";
  }
}

(The default case covers the situation where a user provides a type that isn't a known type, but could be replaced by something else appropriate).

llvm/test/tools/llvm-readobj/XCOFF/exception-section.test
19

It would probably help readers/maintainers if this had some small comments in the lines immediately below with arrows pointing up at the start of each field in the data. There are several examples of this in the yaml2obj DWARF tests.

Something like:

SectionData: "0000000000000000003400030000005c0002000000010000000001140002000001400002"
              ^        ^-FieldName
              +-SymIndex

(add more labels with field names as appropriate).

26

Nit: get rid of double blank line. Same in the other test.

llvm/test/tools/llvm-readobj/XCOFF/invalid-exception-section.test
8

Not addressed, and now duplicated in the invalid_sym case too.

21–25

Do you need any symbols at all for this test case? If so, would 1 suffice?

36

Do you really need this much data for this test case? A single entry would surely be sufficient, and you wouldn't even need any symbols at all.

DiggerLin marked 10 inline comments as done.

address comment.

llvm/include/llvm/Object/XCOFFObjectFile.h
249–250

in the patch, it only llvm-readobj/XCOFFDumper.cpp used the member functions of ExceptionSectionEntry. but I instanced the member functions of ExceptionSectionEntry in the XCOFFObjectFile.cpp instead other files. just because we still want to implement the decode the Exception Section in llvm-objdump later, I am not sure whether we use use the member functions of ExceptionSectionEntry in XCOFFObjectFile.cpp later. I only instanced the ExceptionSectionEntry in the XCOFFObjectFile.cpp explicitly and extern instanced the ExceptionSectionEntry explicitly, It will guarantee the members functions of ExceptionSectionEntry only be instantiated once.

llvm/lib/Object/XCOFFObjectFile.cpp
407

for the in the patch https://reviews.llvm.org/D110320 and https://reviews.llvm.org/D106643 , in the getLoaderSectionAddress(). there is no test case for the code "return createError(toString(std::move(E)) +

": loader section with offset 0x" +
Twine::utohexstr(OffsetToLoaderSection) +
" and size 0x" + Twine::utohexstr(SizeOfLoaderSection) +
" goes past the end of the file");

"
I do not think we will used a canned invalid xcoff object file to test invalid loader Section.

I will not touch the code the getLoaderSectionAddress() in current patch.
and After the current patch landed, I will create a new NFC patch to refactor the getLoaderSectionAddress().(which will delete the function getLoaderSectionAddress())

446–451

yes, it is mapping the section type values to section names, I think your way is easy to understand but much more code(at lease 30 lines codes, at least 20 lines more codes). and if the enum of SectType has 100 different value, that means I have to write about 200 lines?

llvm/test/tools/llvm-readobj/XCOFF/invalid-exception-section.test
21–25

yes,agree.

36

agree, thanks

jhenderson added inline comments.Sep 13 2022, 12:09 AM
llvm/include/llvm/Object/XCOFFObjectFile.h
249–250

Right, I understand what the extern template achieves, but your answer doesn't explain why this struct is special compared to many other classes and structs in LLVM that don't use extern template.

Anyway, like I said, this is a discussion for another time.

llvm/lib/Object/XCOFFObjectFile.cpp
407

Sounds reasonable to me.

446–451

If the code is simpler to understand, you should always prefer that approach to the opaque option, even if the opaque option is many fewer lines of code. The switch/case approach may well be more efficient too, since compilers can easily optimize a switch case into a single jump.

Listing them out, the advantages of switch/case are:

  • Easy to follow.
  • Clear mapping of value to name (doesn't require maintaining two parallel but intrinsically linked arrays/enums).
  • Improved compiler diagnostics (e.g. compilers can warn if no default case and not all cases in an enum are covered).
  • Potentially improved performance.

The advantage of the existing approach:

  • Fewer lines of code.

SectType doesn't have 100 values, so your argument is a false comparison. Regardless, even if it did, you'd still have to have a 100 element array and make sure that the order in that array exactly matched the order of the enum values, so it's not exactly like it's any more maintainable.

457
DiggerLin marked 3 inline comments as done.
jhenderson added inline comments.Sep 14 2022, 12:29 AM
llvm/lib/Object/XCOFFObjectFile.cpp
454

SectionName is unitialized if this function is called with an unknown section type. Either add a default case to the switch, or, preferably (in my opinion), assign it to some other initial value, preferably that includes the section type value, so that a user can see what the input is that is passed in but has gone wrong.

If there's a straightforward way to test this, e.g. using a gtest unit test, than that would be good, but I accept that this might not be the case at the moment.

472
DiggerLin marked an inline comment as done.

address comment

DiggerLin added inline comments.Sep 14 2022, 10:19 AM
llvm/lib/Object/XCOFFObjectFile.cpp
454

For in the code, I have covered all the enumeration values.
if I use a default label, there will be compile error as "error in switch which covers all enumeration values [-Werror,-Wcovered-switch-default]"

jhenderson added inline comments.Sep 15 2022, 12:38 AM
llvm/lib/Object/XCOFFObjectFile.cpp
454

256 is unnecessarily large, given the maximum possible size of any of these names. Reduce it so that the typical max size is the maximum size of one of the normal cases.

455

The <unknown:0x1234> style is used in other places, so I think it's reasonable to use it here. On ELF, we use hex numbers for the section type too. You may wish to do that here too.

DiggerLin updated this revision to Diff 460401.Sep 15 2022, 7:29 AM
jhenderson accepted this revision.Sep 16 2022, 12:18 AM

One remaining nit, otherwise LGTM.

llvm/lib/Object/XCOFFObjectFile.cpp
455

Do you need to add the 0x explicitly? I haven't looked at uthexstr, so it might not need it, but I think we should have the 0x to make it clear it is in hex.

This revision is now accepted and ready to land.Sep 16 2022, 12:18 AM
DiggerLin marked an inline comment as done.Sep 19 2022, 6:48 AM
DiggerLin added inline comments.
llvm/lib/Object/XCOFFObjectFile.cpp
455

Twine::utohexstr will have "0x" as prefix.