This is an archive of the discontinued LLVM Phabricator instance.

[XCOFF][AIX] Generate symbol table entries with llvm-readobj
ClosedPublic

Authored by jasonliu on Jul 24 2019, 12:56 PM.

Details

Summary

This patch implements main entry and auxiliary entries of symbol table generation for llvm-readobj on AIX.
The source code of aix_xcoff_xlc_test8.o (compile with xlc) is:

-bash-4.2$ cat test8.c
extern int i;
extern int TestforXcoff;
extern int fun(int i);
static int static_i;
char* p="abcd";
int fun1(int j) {
  static_i++;
  j++;
  j=j+*p;
  return j;
}
int main() {
  i++;
  fun(i);
  return fun1(i);
}

Patch provided by DiggerLin. I'm posting this on his behalf as he is currently on vacation.

Diff Detail

Event Timeline

jasonliu created this revision.Jul 24 2019, 12:56 PM
jasonliu edited the summary of this revision. (Show Details)Jul 24 2019, 12:59 PM
jasonliu edited the summary of this revision. (Show Details)Jul 24 2019, 1:03 PM
sfertile added inline comments.Jul 29 2019, 8:20 AM
llvm/include/llvm/Object/XCOFFObjectFile.h
129

minor nit: Was this supposed to be NamePad?

151

minor nit: IndexofNextEntry --> IndexOfNextEntry.

llvm/lib/Object/XCOFFObjectFile.cpp
462

I think this is the wrong interface for this function. Taking a pointer to an XCOFFSymbolEntry ends up making us cast a bunch of symbol table entries to a type they clearly are not to get their Symbol index. Instead we should take a unitptr_t.

615

minor nit: place the assert as the first statement in the function.

616

Use a reinterpret_cast instead of a c-style cast.

A though about the implementation of this function:
We are converting the address in the DataRefImpl to an XCOFFSymbolEntry and doing the pointer arithmetic on that type and it works because all symbol table entries have the same size as this structiure. But why convert in the first place? Instead we can calculate the address using getWithOffset. That code would be equally valid for 64-bit and 32-bit XCOFF objects.

assert(hasCSectAuxEnt());
uintptr_t AuxAddr = getWithOffset(SymEntDataRef.p, SYM_TAB_ENTRY_SIZE * getNumberOfAuxEntries());
return reiterpret_cast<const XCOFFCSectAuxEnt *>(AuxAddr);
649

If we fail to find a section, then should a StorageMappingClass of XMC_PR be enough to determine the symbol is a function?

llvm/tools/llvm-readobj/XCOFFDumper.cpp
124

When we fix the interface to getSymbolIndex we can get rid of this function.

131

This is the cast I find confusing. We have a pointer to a structure of type XCOFFFileAuxEnt and we have to cast it to something it is not to get the symbol table entries index.

abrachet added inline comments.
llvm/include/llvm/Object/XCOFFObjectFile.h
162โ€“167

How do these and the other structs work? They have seemingly random padding that align these on non normal boundaries. Assuming this is the file format, why is there no need for #pragma pack or aligans? If we take just this type as an example 18 bytes worth of members, wont this be aligned by the compiler to 24 bytes?

sfertile added inline comments.
llvm/include/llvm/Object/XCOFFObjectFile.h
162โ€“167

Assuming this is the file format, why is there no need for #pragma pack or aligans? If we take just this type as an example 18 bytes worth of members, wont this be aligned by the compiler to 24 bytes?

No: because none of the members used have an alignment greater then 1. support::[u]bigN_t types are 1 byte aligned, and the only other types used are char and uint8_t.

How do these and the other structs work? They have seemingly random padding that align these on non normal boundaries.

Most (maybe all) are described here: https://www.ibm.com/support/knowledgecenter/en/ssw_aix_72/filesreference/XCOFF.html#XCOFF__yaa3i18fjbau There is also a set of headers shipped as part of the OS that describes them but they are much harder to decipher. I'm not sure if the headers are available on the internet but hopefully @hubert.reinterpretcast or @cebowleratibm can point you to them if they are.

Each of them being 18 bytes in size is expected, once you know that and that there is no alignment restriction on the 2 and 4 byte types used the padding layout should no longer seem so random.

jasonliu updated this revision to Diff 212356.Jul 30 2019, 8:49 AM
jasonliu updated this revision to Diff 212357.Jul 30 2019, 8:56 AM
jasonliu marked 5 inline comments as done.
jasonliu marked an inline comment as done.Jul 30 2019, 9:09 AM
jasonliu added inline comments.
llvm/lib/Object/XCOFFObjectFile.cpp
649

In a single .o file, then maybe XMC_PR is enough to determine the symbol is a function.
But I'm not sure if that's the case for a executable/dynamic library where linker could regroup the functions in a different way.
I'm open to suggestions if there is a better way to do this.

Gentle ping.

sfertile added inline comments.Aug 6 2019, 1:48 PM
llvm/include/llvm/Object/XCOFFObjectFile.h
152

We should open a defect against the XCOFF object file docs, it list the pading as length 1 which is incorrect.

llvm/lib/Object/XCOFFObjectFile.cpp
120

We can remove the second argument to generateStringRef and use XCOFF::NameSize directly in the implementation, since all calls use the same value for the argument.

462

Looks good. One suggestion is to have some error checking similar to what we have in checkSectionAddress .

641

label define --> label definition.

649

Sorry, I don't have any helpful suggestions. I'm not familiar enough with XCOFF to know when/how it uses absolute symbols. I think this current implementation is OK for now. It might lead to missing information in llvm-readobj but we can refine it as our understanding grows.

llvm/test/tools/llvm-readobj/xcoff-symbols.test
2

Add the source used, compiler, and options used to genetrate the object file in a comment.

Also it seems that we have added several constructs that are not exercised in the test: Block Aux Entry, Dwarf Aux Entry, Function Aux Ent. If these aren't tested then remove them from this patch and we can add them in a subsequent patch.

llvm/tools/llvm-readobj/XCOFFDumper.cpp
251

We are adding a bunch of cases which aren't tested. Would it be better to add these cases with an assert and implement them in a future patch which also adds the missing test coverage?

llvm/include/llvm/BinaryFormat/XCOFF.h
23

Please commit the removal of SectionNameSize and SymbolNameSize separately from this patch. This conflicts with other Phabricator reviews in flight.

jasonliu marked 7 inline comments as done.Aug 7 2019, 11:11 AM
jasonliu added inline comments.
llvm/include/llvm/Object/XCOFFObjectFile.h
152

Will do.

jasonliu updated this revision to Diff 213954.Aug 7 2019, 11:14 AM
sfertile added inline comments.Aug 9 2019, 6:36 AM
llvm/include/llvm/BinaryFormat/XCOFF.h
23

We should commit this as a NFC patch and then rebase this patch to reflect the change.

llvm/lib/Object/XCOFFObjectFile.cpp
461

Why a pointer to a unitptr_t, shouldn't the argument just be a unitptr_t?

468

Should also check that the address isn't larger then the end of the symbol table.

llvm/test/tools/llvm-readobj/xcoff-symbols.test
2

Sorry I wasn't explicit before: the Run steps should be at the top of the file. Ideally the source and compile command would be at the end of the file.

jasonliu marked an inline comment as done.Aug 9 2019, 11:00 AM
jasonliu added inline comments.
llvm/include/llvm/BinaryFormat/XCOFF.h
23

I was planning to land this first. Then when dust settles(https://reviews.llvm.org/D65159), have a NFC patch to remove those enums.
In this way, we could have less dependencies.
The more pedantic way is of course to land that NFC before this patch and D65159, then rebase these two patches.
Have a slightly preference on my original plan because of less action on both end.
But if you think it's better to land that NFC first, let me know and I could do that.

sfertile added inline comments.Aug 12 2019, 7:17 AM
llvm/include/llvm/BinaryFormat/XCOFF.h
23

I do think its worthwhile to land the enum name change first and rebase the patches. I understand it introduces a bit of extra work but I think its worth the small bit of time it takes.

jasonliu updated this revision to Diff 214638.Aug 12 2019, 8:34 AM
jasonliu marked 3 inline comments as done.
jasonliu marked 2 inline comments as done.
jasonliu marked an inline comment as done.Aug 12 2019, 8:36 AM
jasonliu added inline comments.
llvm/include/llvm/BinaryFormat/XCOFF.h
23

Sure. When the NFC patch is available, I will rebase this patch to reflect that.

sfertile added inline comments.Aug 12 2019, 8:45 AM
llvm/include/llvm/BinaryFormat/XCOFF.h
23
jasonliu updated this revision to Diff 214704.Aug 12 2019, 2:12 PM

Rebased on latest trunk.

jasonliu marked 4 inline comments as done.Aug 12 2019, 2:12 PM
abrachet removed a subscriber: abrachet.Aug 12 2019, 2:18 PM

I believe I addressed the existing comments. Looking for more comments or approval. Thanks!

llvm/include/llvm/Object/XCOFFObjectFile.h
115

XCOFFCSectAuxEnt32.

117

This does not hold the hash value itself, so ParameterHashIndex.

121

StabInfoIndex, like ParameterHashIndex.

122

StabSecNum, like TypeChkSectNum.

hubert.reinterpretcast requested changes to this revision.Aug 17 2019, 9:37 PM
hubert.reinterpretcast added inline comments.
llvm/include/llvm/Object/XCOFFObjectFile.h
136

ReservedZeroes?

137

Add at least a comment to indicate that this field is XCOFF64-specific.

291

Stray top-level const on a parameter declaration not attached to a function definition.

llvm/test/tools/llvm-readobj/xcoff-symbols.test
22

There is no such field defined for XCOFF32.

llvm/lib/Object/XCOFFObjectFile.cpp
461

Top-level const on parameter types are known to cause grief with certain build compilers.

462

Type error. The answer fits into 32-bits after being divided by 18, but not necessarily before that.

462

I suggest moving this to after the error checking.

472

>=. The calculation on the RHS points "one-past-the-end".

This revision now requires changes to proceed.Aug 17 2019, 9:37 PM
llvm/include/llvm/Object/XCOFFObjectFile.h
307

Make both fields const if we don't need to modify them.

311

Parameter variable name does not match the prevalent style.

316

There's no reason to add XCoff to the set of ways we capitalize XCOFF. We already use XCOFF even when we capitalize the next word in names.

319

Newline before this to split the derived-property functions from the field accessors.

jhenderson added inline comments.Aug 20 2019, 2:30 AM
llvm/test/tools/llvm-readobj/xcoff-symbols.test
2

A few test comments from me:

Add a brief comment to the top of the file explaining the purpose of the test.

You don't need to prepend the RUN: lines with a '#' character.

Add a single blank line between the end of the RUN: lines and the start of the CHECK patterns.

I take it yaml2obj doesn't have XCOFF support? Assuming that's the case, is it possible to use something like llvm-mc to build this at test time rather than use the pre-canned binary? In general we've been trying to avoid adding pre-canned binaries to the llvm-readobj tests, as they are opaque and hard to maintain.

sfertile added inline comments.Aug 20 2019, 7:30 AM
llvm/lib/Object/XCOFFObjectFile.cpp
473

getWithOffset?

llvm/test/tools/llvm-readobj/xcoff-symbols.test
2

I take it yaml2obj doesn't have XCOFF support? Assuming that's the case, is it possible to use something like llvm-mc to build this at test time rather than use the pre-canned binary? In general we've been trying to avoid adding pre-canned binaries to the llvm-readobj tests, as they are opaque and hard to maintain.

We are using the pre-canned binaries for testing since we need the tools support first, to be able to write lit tests as we add backend XCOFF functionality. Once we have enough functionality in the XCOFFObjectWriter we can remove the pre-canned binaries and generate them with llvm-mc as you suggest.

jasonliu marked 19 inline comments as done.Aug 20 2019, 8:23 AM
jasonliu added inline comments.
llvm/include/llvm/Object/XCOFFObjectFile.h
122

Changed it to StabSectNum.

llvm/test/tools/llvm-readobj/xcoff-symbols.test
2

Thanks for the comments. Addressed them in the new revision.

Yes, we haven't implement XCOFF support for yaml2obj yet. And llvm-mc do not yet have enough functionality to generate all the symbol entries kind for our testing, although we are working towards that goal. And we are actually going to use llvm-readobj to test the correctness of our llvm-mc obj gen.
So there is a chicken-and-egg problem here for us. And what we are choosing to do is to use a pre-canned binary that we know is correct (generated from xlc) to test the correctness of the tooling (llvm-readobj). Then later use this tooling to test the object file generated by llc/llvm-mc.

22

Removed this for printing XCOFF32.

jasonliu updated this revision to Diff 216160.Aug 20 2019, 8:30 AM
jasonliu marked 2 inline comments as done.
jasonliu marked 2 inline comments as done.

I addressed all the comments I see.
Thanks for the reviewing.

jhenderson added inline comments.Aug 20 2019, 9:00 AM
llvm/test/tools/llvm-readobj/xcoff-symbols.test
2

Fair enough. Another way to resolve this, which was used for minidump support, was implement support in yaml2obj for XCOFF, and test using gtest unit-testing. yaml2obj can then be used to generate objects for llvm-readobj testing, which can in turn verify llvm-mc and other tools.

llvm/include/llvm/Object/XCOFFObjectFile.h
306

I meant this line (re: const) as well.

316

Might as well use Csect here while we are at this for this patch (see D65159).

llvm/lib/Object/XCOFFObjectFile.cpp
123

Please common up some of the code between this and getSymbolName. We might be able to go with one template function implementation for both of these interfaces to use.

462

getSymbolIndex is a strange place for the error checking to occur. We already got a SymbolEntPtr, and we might have used it already. We should move the error checking out to a separate function and call it where appropriate (perhaps at each place where we retrieve a handle to a symbol table entry, or perhaps only when jumping by an index).

664

Global replace CSect with Csect in code (note: not the same for English).

668

To explain the code, add a comment to indicate that the csect auxiliary entry is, for symbols that have them in XCOFF32, the last auxiliary entry.

670

Might make sense to call the error checking function here.

682

s/Sc/SC;

698

Newline before the comment. Clarify the comment as saying that we are expecting function definitions to be label definitions.

llvm/test/tools/llvm-readobj/xcoff-symbols.test
2

s/generate/display the/;
s/for/for a/;

17
TB_C = 0,
TB_CPLUSPLUS = 9,
18
TCPU_PPC64 = 2,
TCPU_COM = 3,
TCPU_970 = 19,
110

Global replace "CSect" with "CSECT" in English.

427

The filename is test8.c.

llvm/tools/llvm-readobj/XCOFFDumper.cpp
28

Given the way it is used, this should be something like SymbolAlignmentBitOffset.

247

Assert XCOFF32. This entry type does not exist for XCOFF64.

llvm/lib/Object/XCOFFObjectFile.cpp
22

This is unused, and this should be something like RELOC_LNNO_OVERFLOW if we were to have it in this patch.

llvm/tools/llvm-readobj/XCOFFDumper.cpp
134

Use static_cast.

237

If the previous line did not need the cast to do the bitwise-and, then neither does this one. Also, remove the redundant parentheses.

239

Use static_cast.

251

Overflow handling is not here, and there is no TODO.

297

Add a period to the end of the sentence.
Replace llvm_unreachable(x) with assert(false && x) here. I guess the assert makes sense to have (in place of just a TODO), so that developers can clue in when they are running this utility on an object with the unimplemented kinds.

hubert.reinterpretcast requested changes to this revision.Aug 21 2019, 4:13 PM
hubert.reinterpretcast added inline comments.
llvm/tools/llvm-readobj/XCOFFDumper.cpp
305

s/unimplmented/unimplemented/;

335

Use static_cast.

344

s/entires/entries/;
Replace the period with an ellipsis. What is here is not a sentence.

345

Yes, the name should be ".file", but I don't think there is sufficient rationale to skip processing the auxiliary entries if the name isn't.

355

s/is Function/is for a function/;

359

Can we have some rationale here for not printing the first auxiliary entry if it is not also the last?
Also, please fix the spacing, capitalization, etc.

362

Remove the " are print out" part. It is clear that it is being printed out.

364

SYM_TAB_ENTRY_SIZE might make more sense (since we are not really talking about an XCOFFSymbolEntry object).

364

Use reinterpret_cast.

375

This is unsafe. There is no statement I could find indicating that such an auxiliary entry is required, and xlc will generate such symbol table entries with no auxiliary entry.

[16]    m   0x00000094     .data     1  unamex                    _$STATIC
[17]    a4  0x00000004       0    0     SD       RW    0    0
[18]    m   0x00000094     .data     0  static                    x
386

See comments above.

388

See comments above.

392

Add a newline between the functions.

This revision now requires changes to proceed.Aug 21 2019, 4:13 PM
llvm/tools/llvm-readobj/XCOFFDumper.cpp
370

printCsectAuxEnt32(XCOFFSymRef.getXCOFFCsectAuxEnt32()).

jasonliu updated this revision to Diff 216666.Aug 22 2019, 10:47 AM
jasonliu marked 38 inline comments as done.
jasonliu added a subscriber: DiggerLin.
jasonliu added inline comments.
llvm/lib/Object/XCOFFObjectFile.cpp
22

Yes, removed it from the patch.

@DiggerLin
Please note this change might be needed for the upcoming relocation patch.

462

Hopefully, I covered all the cases that we needed to call the checker function.

llvm/tools/llvm-readobj/XCOFFDumper.cpp
375

We have

if (NumberOfAuxEntries == 0)
    return;

before this switch statement. If there is no aux entry, we would exit already before getting here.

jasonliu marked an inline comment as done.Aug 22 2019, 11:31 AM
jasonliu added inline comments.
llvm/tools/llvm-readobj/XCOFFDumper.cpp
117

Will add a new line before this function.

jasonliu marked an inline comment as done.Aug 22 2019, 11:47 AM
jasonliu added inline comments.
llvm/tools/llvm-readobj/XCOFFDumper.cpp
251

Discussed offline, the overflow handling is for section auxiliary header, not for symbol auxiliary entry.

Looking through this latest version still (this is a largish patch).

llvm/tools/llvm-readobj/XCOFFDumper.cpp
375

Ah, yes. As a further consideration, I think we should do something about having more than one auxiliary entry here.

Looks like the patch is getting close.

llvm/include/llvm/BinaryFormat/XCOFF.h
167

Use "64-bit".

llvm/include/llvm/Object/XCOFFObjectFile.h
195

This function is sufficiently generic that we can just say that it returns string table entries (and name it appropriately).

llvm/lib/Object/XCOFFObjectFile.cpp
85

Ref.p is a uintptr_t, so this can be moved to before the viewAs.

114

Use "end-of-symbol-table".

146

Remove "or .debug section" as this function does not cover this case (at least not yet).

156

This is more generically "Bad offset for string table entry".

177

generateStringRef got modified in an NFC patch I believe. The uses in this patch make it more obvious that the one-argument version of that function should not be named so generically. Perhaps generateXCOFFFixedNameStringRef?

llvm/tools/llvm-readobj/XCOFFDumper.cpp
166

I think we should treat this as code.

251

I've experimented with XLC, and confirmed that it does not do the overflow handling for these fields. It will populate these with the low bits of the logical value.

llvm/tools/llvm-readobj/XCOFFDumper.cpp
251

I think we can have a comment here that, unlike the corresponding fields in the section header, these fields do not handle values greater than 65535.

jasonliu updated this revision to Diff 217177.Aug 26 2019, 9:04 AM
jasonliu marked 13 inline comments as done.

Addressed all the new comments.

llvm/lib/Object/XCOFFObjectFile.cpp
177

Agree. The name should give more information on what it is trying to do.

llvm/tools/llvm-readobj/XCOFFDumper.cpp
375

Could C_STAT symbol have more than 1 aux entry? I added an assertion for that.

llvm/tools/llvm-readobj/XCOFFDumper.cpp
375

Can we make that a report_fatal_error? It is an error in the user-provided input, so we should report the error.

jasonliu marked an inline comment as done.Aug 26 2019, 10:27 AM
jasonliu added inline comments.
llvm/tools/llvm-readobj/XCOFFDumper.cpp
375

Right. I will make it a report_fatal_error in the commit or next revision.

LGTM with minor comments.

llvm/tools/llvm-readobj/XCOFFDumper.cpp
288

Newline before the comment.

434

s/could/should/;

This revision is now accepted and ready to land.Aug 26 2019, 1:33 PM
This revision was automatically updated to reflect the committed changes.