This is an archive of the discontinued LLVM Phabricator instance.

[XCOFF][AIX] Enable tooling support for 64 bit symbol table parsing
ClosedPublic

Authored by jasonliu on Aug 11 2020, 1:00 PM.

Details

Summary

Add in the ability of parsing symbol table for 64 bit object.

Diff Detail

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
DiggerLin added inline comments.Aug 13 2020, 1:21 PM
llvm/lib/Object/XCOFFObjectFile.cpp
406

can we add new member function as getNumberOfSymbolTableEntries()
{

return is64Bit() is64Bit() ? getNumberOfSymbolTableEntries64()
              : getLogicalNumberOfSymbolTableEntries32();

}

the function can also use in
XCOFFObjectFile::create()
and
getSymbolNameByIndex()

jasonliu added inline comments.Aug 13 2020, 1:31 PM
llvm/lib/Object/XCOFFObjectFile.cpp
198

Please see my other comments regarding combining the 32bit and 64 bit version into 1 function.

202

Please see my other comments regarding combining the 32bit and 64 bit version into 1 function.

406

About all the comments mentioning if we could combining the 32bit and 64 bit version into 1 function.
I don't think it's good idea because people would ignore the fact that they are returning different types underneath.

I just wonder whether we can implement two separate structure XCOFFSymbolEntry32 and XCOFFSymbolEntry64 without so much union be used on currently implement.

Agree that current implementation have many union, and it's hard for people to parse what exactly is inside for the structure.
But separating them into two structures, namely, XCOFFSymbolEntry32 and XCOFFSymbolEntry64, would mean a lot more if (Obj->is64Bit()) check across all tooling, which sacrifice a lot in the usability department. A lot more logic would look duplicated.
One potential solution I thought about is for every data member, we introduce a getter to retrieve the data, and mark the data members private. So that most of the time, user of the structure do not need to look inside of the structure to figure out how to retrieve certain data. But the downside is we are going to introduce a lot of getters for that, and not sure if it would be worth the effort.

It seems to me like this should be using inheritance here. You have a base class that has the common members, and provides pure virtual declarations of the various getters, with the sub-classes defining them to do the right thing. Yes, it would introduce a number of getters, but I feel like it would make everything a bit cleaner from a usability standpoint. In most cases, you then don't need any is64Bit queries, because the getters hide that from you.

On a testing note, there are several places in the new code which detect some kind of error. You need testing for these code paths too.

llvm/lib/Object/XCOFFObjectFile.cpp
406

From my experience working with tools that had to support 32-bit and 64-bit ELF, you don't worry about the underlying type in most cases and always use the larger type. The same probably applies here. Of course, it becomes a bit moot if you add a common getter interface as suggested out-of-line, because those getters will have to return the larger of the two return types anyway.

Is there a strong reason to not use the larger type everywhere?

734

Related to my comments elsewhere - it looks to me like most consumers will need to handle both 32 and 64-bit versions, so they'll always have to do this dance. Thus your concern about how the caller uses them is misplaced - the caller is more likely to do the wrong thing i.e. call the wrong version than have problems with the return types.

770

Better than errorCodeToError(/*some error code*/) is to use createStringError() or createFileError() to provide more context to the failure (how did the parsing fail? where? etc).

785

I don't think you want to use int here. There's always going to be a positive number of entries, and there are no subtractions etc inolving Index here. Better would be an unsigned type of some form (presumably the return type of getNumberOfAuxEntries()).

794

Same comment as before - use createStringError() or createFileError().

llvm/test/tools/llvm-objdump/XCOFF/disassemble-symbol-description64.test
14–15

I'm not going to stop you checking in a pre-compiled object, as I'm not an XCOFF maintainer, but as you are continuing to add more functionality here, I strongly advise you to write a yaml2obj XCOFF port, to avoid pre-canned binaries. You'll find pre-built binaries extremely inconvenient to work with as you maintain things going forward. Not only that, but they are harmful to the git repository size, especially if you have to occasionally rebuild them.

Using yaml2obj may also be about the only way you can test most parse failure paths.

If yaml2obj isn't viable, at least consider llvm-mc or similar, if possible.

llvm/tools/llvm-readobj/XCOFFDumper.cpp
390–399

@grimar has gone to a lot of effort to get rid of unwrapOrError from the ELF dumping code. I'd prefer it if we could avoid using it here too. It is generally better in dumping tools to report a warning and abort dumping the current section than to emit an error and terminate the program, since it gives the user more of the information they've asked for.

llvm/lib/Object/XCOFFObjectFile.cpp
406

Is there a strong reason to not use the larger type everywhere?

I don't know what strength this reason has, but we had noticed that some of the tools do not reflect the width of the 32-bit format fields very well (even for relatively uninterpreted output). Where the producer of the binary is under development, developers are better served if the tools emit the correct width for fields in the format.

jasonliu added inline comments.Aug 14 2020, 12:38 PM
llvm/test/tools/llvm-objdump/XCOFF/disassemble-symbol-description64.test
14–15

I agree that we would want to move away from pre-canned binaries at some point.
When writing a yaml2obj port, we would still require tools such as llvm-readobj and llvm-objdump to make sure our yaml2obj implementation is correct. So we still have a chicken-or-egg problem here.
I think the current plan is to use pre-canned binary to develop the tooling support first. Then use the verified tooling support to verify XCOFF object file generation from llc. Then we could replace the pre-canned binary with llvm-mc/llc.

grimar added inline comments.Aug 17 2020, 3:20 AM
llvm/tools/llvm-readobj/XCOFFDumper.cpp
390–399

Yeah. Having unwrapOrError available is my concern. I am trying to cleanup ELF dumper, but other files (e.g. COFF) are still using it, thought ideally I'd just remove this API from llvm-readobj code, it seems does more harm than good for a long term.
At least I'd be happy if people stop adding more calls to the code.

DiggerLin added inline comments.Aug 17 2020, 6:02 AM
llvm/include/llvm/BinaryFormat/XCOFF.h
296

the name must be consistent with the aix OS file syms.h?
what about the to change to AUX_EXCEPT . it consistent with our current style.

jasonliu added inline comments.Aug 17 2020, 6:31 AM
llvm/include/llvm/BinaryFormat/XCOFF.h
296

The current style in this file seems to be (correct me if I'm wrong):
Use a descriptive name if it's not directly taken from OS header. Otherwise, take the name directly from header and add detailed description/comment to it.
In this case, it follows the latter. Other enums member do not have '_' because OS version do not have it either.

jasonliu added inline comments.Aug 17 2020, 6:33 AM
llvm/tools/llvm-readobj/XCOFFDumper.cpp
390–399

Thanks. Agreed. Will avoid using unwrapOrError in future code.

llvm/include/llvm/BinaryFormat/XCOFF.h
296

The issue here is that the name from the OS header is a reserved name. So by reason of not wanting undefined behaviour, we cannot use the name taken from the OS header. Unfortunately, that means we cannot be consistent in terms of using the name taken from the OS header without switching all the enumerator names to be descriptive and in the LLVM style.

jasonliu added inline comments.Aug 17 2020, 7:58 AM
llvm/include/llvm/BinaryFormat/XCOFF.h
296

hmm... Is the namespace XCOFF not enough to prevent the undefined behavior happening? Or are we afraid of people just use using namespace XCOFF to defeat it?

llvm/include/llvm/BinaryFormat/XCOFF.h
296

The practical cause of undefined behaviour in such a case is usually that the instance of the identifier here is misparsed, or otherwise has surprising behaviour, either because it is defined as a macro or is an extension keyword. Whether the name turns out to be in scope elsewhere is not a factor for such mechanisms.

jasonliu added inline comments.Aug 17 2020, 8:12 AM
llvm/include/llvm/BinaryFormat/XCOFF.h
296

Got it. I will switch the style of SymbolAuxType in the next revision. We will need to come up with a plan to switch the rest of the classes in this file (There are a lot).

jhenderson added inline comments.Aug 18 2020, 2:37 AM
llvm/include/llvm/BinaryFormat/XCOFF.h
296

+1 to dropping the underscore. I think it's okay for that to be the only change, but have no strong opinion either way, so happy with whichever you prefer.

301

Not that it really matters, but it's more traiditional to order enums in ascending numerical order. Any particular reason you've done this in the reverse order?

llvm/lib/Object/XCOFFObjectFile.cpp
406

In think in the context of printing the appropriately formatted output, you'd want to switch on the source type (i.e. is64Bit or whatever), at the formatting time. Certainly, this is how we've done it in our own internal code bases I work on, and there are examples of this in a number of other LLVM utilities. For example in https://github.com/llvm/llvm-project/blob/master/llvm/lib/DebugInfo/DWARF/DWARFCompileUnit.cpp#L17, the dump function dumps the offset with a width according to the DWARF format (i.e. 32 or 64 bit), but the getLength function returns a 64-bit value always. Similarly https://github.com/llvm/llvm-project/blob/master/llvm/tools/llvm-objdump/ELFDump.cpp#L257 identifies the ELF format kind and uses that in determining the width of offset, size and address fields (which are stored as uint64_t) when printing ELF program header tables.

There are certainly plenty of places where this hasn't been done. Sometimes this is a mistake, other times it's for consistency with GNU output, but I think the preferred approach is the "store large, explicitly specify format on output" approach.

jhenderson added inline comments.Aug 18 2020, 3:12 AM
llvm/test/tools/llvm-objdump/XCOFF/disassemble-symbol-description64.test
14–15

Yeah, chicken-or-egg problem is a bit of an issue. I'm not sure there's always a clear answer to this. The one I've encouraged for yaml2obj DWARF support testing is to actually inspect the hex output (with sufficient additional commenting to make it clear what the output represents). By keeping the initial functionality small enough, you can boostrap up from there.

The issue is that a lot of our low-level tool testing (i.e. testing of things like llvm-readobj) has switched over to yaml2obj, but clearly we can't (in theory) then use llvm-readobj to test the basic output of yaml2obj or we end up with a circular test dependency - a bug in a common library might not obviously manifest itself in this context, but would if using a tool from outside the ecosystem.

Another strategy which I've used occasionally for testing DWARF parsing before the yaml2obj support existed was writing assembly using just .byte/.quad etc directives to craft the input format precisely, without relying on the higher-level assembly directives (like .file/.loc etc). This may not work in all situations though.

jasonliu added inline comments.Aug 18 2020, 10:00 AM
llvm/include/llvm/BinaryFormat/XCOFF.h
301

I sort of "copied" the list from the OS header, and that's just the order it appeared in OS header.
I don't think it's particular important to have it in current order, I could change it to ascending order.

jhenderson added inline comments.Aug 19 2020, 12:10 AM
llvm/include/llvm/BinaryFormat/XCOFF.h
301

I haven't got any particular preference, so am happy to defer to whatever you prefer on this one.

jasonliu updated this revision to Diff 290801.Sep 9 2020, 12:47 PM

Addresses comments.
Add in test case to test errors.
Use view/reference class to encapsulate 32-bit and 64-bit differences instead.

It seems to me like this should be using inheritance here. You have a base class that has the common members, and provides pure virtual declarations of the various getters, with the sub-classes defining them to do the right thing. Yes, it would introduce a number of getters, but I feel like it would make everything a bit cleaner from a usability standpoint. In most cases, you then don't need any is64Bit queries, because the getters hide that from you.

@jhenderson I tried to use inheritance as suggested. But inheritance would mean I need to use pointers to enable the runtime polymorphism. Then there is a life time issue that need to be managed when using pointers. The easier way to achieve that is to return via unique_ptr. But using unique_ptr introduced usability issue in the caller/user side, as we would see std::move, SymbolRef.get() before getting to the query we want. Also underneath of the unique_ptr, new/delete is not very efficient as well.
In the end, I tried to solve this in similar manner as COFF does, which is using a view class without inheritance. Although the downside of it is we basically have an if query in every call to the view class to differentiate which version (32/64) we are having right now, the good thing is that caller side is much more cleaner.

DiggerLin added inline comments.Sep 10 2020, 12:56 PM
llvm/include/llvm/Object/XCOFFObjectFile.h
158

change to
return Entry32 ? Entry32->ParameterHashIndex : Entry64->ParameterHashIndex

and change in the following functions too ?

176

change to return reinterpret_cast<uintptr_t>(Entry32 ? Entry32 : Entry64)

401

this means for getSymbolEntryAddressByIndex(uint32_t SymbolTableIndex) const ?

482

assert(OwningObjectPtr != nullptr) here ?

492–525

maybe we can use a macro here.
#define GETVALUE(X) Entry32 ? Entry32->X : Entry64 ->X

int16_t getSectionNumber() const {

return GETVALUE(SectionNumber);

}

uint16_t getSymbolType() const {

   return GETVALUE(SymbolType);
}

and so on

532

getAddress() may confuse with getting the address of the symbol. maybe good to rename to getEntryAddress() ?

llvm/lib/Object/XCOFFObjectFile.cpp
579

several place use above NumberOfSymTableEntries , maybe good to provide a helper function.

735

change to const uint64_t SymbolTableSize ?

797–799

not all the symbol has Csect entry.
what about to return
Optional<XCOFFCsectAuxRef>XCOFFSymbolRef::getXCOFFCsectAuxRef()

801

I think assert(isCsectSymbol()) myabe better.
sometime maybe our developer call getXCOFFCsectAuxRef() at a no CsectSymbol . it is not a object file parse failed.

llvm/tools/llvm-objdump/XCOFFDump.cpp
49

I can not see benefit to change
from
XCOFFSymbolRef SymRef(Sym.getRawDataRefImpl(), Obj);
to
XCOFFSymbolRef SymRef = Obj->toSymbolRef(Sym.getRawDataRefImpl());

jasonliu added inline comments.Sep 10 2020, 1:48 PM
llvm/include/llvm/Object/XCOFFObjectFile.h
158

I was initially worried about MSVC breakage here: https://github.com/llvm/llvm-project/commit/210314ae8c59bc0a8801c2528eda892cd5960c31
But after taking a closer look, it seems to be only a problem for conversion from ubig32_t value to a unit64_t, which does not apply here.
So I will give it a try.

jasonliu added inline comments.Sep 10 2020, 5:27 PM
llvm/lib/Object/XCOFFObjectFile.cpp
797–799

I returned Expected<XCOFFCsectAuxRef> partly because of the original comment on this function:
TODO: The function needs to return an error if there is no csect auxiliary
entry.

I believe that's a change from your previous commit. Is there any reason that you changed your mind?
In 32 bit mode, you could have a csect symbol, but without any auxiliary entry. That should return an error (which I haven't detected here, but I should).
Also, in 64 bit mode, it's possible that you have a csect symbol that has auxiliary entries, but do not have a csect auxiliary entry, that should be an error situation right? So it also makes sense to return error in that case.
I think the interface would be too complicated if we return an Expected wrap with an Optional, or the other way around.

801

Sure.

llvm/tools/llvm-objdump/XCOFFDump.cpp
49

I did it for consistency reason, i.e: always get XCOFFSymbolRef via toSymbolRef.

jasonliu added inline comments.Sep 10 2020, 5:39 PM
llvm/include/llvm/Object/XCOFFObjectFile.h
401

Sorry, what's the difference between Symbol and SymbolEntry?
I'm also seeing getSymbolIndex and getSymbolNameByIndex around this function. Any reason they are not "SymbolEntry"?

DiggerLin added inline comments.Sep 11 2020, 6:24 AM
llvm/include/llvm/Object/XCOFFObjectFile.h
401

the value of symbol maybe a symbol relocation address , I was confused getSymbolAddress with getting the relocation address at my first glance of the code, getSymbolEntryAddress, that means we need the SymbolEntry address not relocation address of a symbol.

jasonliu marked 4 inline comments as done.Sep 11 2020, 10:46 AM
jasonliu added inline comments.
llvm/include/llvm/Object/XCOFFObjectFile.h
401

I don't think we have a symbol relocation address IMO. The address related to relocation could be relocation entry's address, or the virtual address data member inside of a relocation entry. But those addresses are not related to symbol in any ways.
Also, there is a function from base class which we overrides here called getSymbolAddress, which we don't want to change. So it would make sense to keep the same naming style here.

532

Do you still find it confusing after seeing my other comments about Symbol vs SymbolEntry?
I would think it's fine since we don't really have other addresses we could get in here.

jasonliu added inline comments.Sep 11 2020, 11:23 AM
llvm/include/llvm/Object/XCOFFObjectFile.h
401

hmm... Just realized what you meant.
getSymbolAddress actually returns toSymbolRef.getValue() which is a relocatable address.
This function is suppose to return the address of the symbol table entry within the object file.

532

Will change.

jasonliu updated this revision to Diff 291302.Sep 11 2020, 11:54 AM

Address comments from Digger.

MaskRay added inline comments.Oct 2 2020, 12:05 PM
llvm/include/llvm/Object/XCOFFObjectFile.h
145

You probably don't need these assert. The dereference will crash anyway

llvm/include/llvm/Object/XCOFFObjectFile.h
145

That's not true. There are some number of addressable bytes containing 0 starting from address 0x0 on AIX.

DiggerLin added inline comments.Mar 18 2021, 11:12 AM
llvm/include/llvm/Object/XCOFFObjectFile.h
158

thanks let me know.

171

what about return reinterpret_cast<uintptr_t>(Entry32 ? Entry32 : Entry64) ?

204

ruse GETVALUE(SymbolAlignmentAndType) ?

426

not sure we want to Distance to be negative value future? I think change to int32_t Distance, means that we can backward

433

not sure whether we want to define a enum for the LanguageID in this patch.
The values for this field are defined in the e_lang field in "Exception Section"

482

"Symbol table pointer can not be nullptr!" --> "Symbol table entry pointer can not be nullptr!"

490

using GETVALUE(Value) for consistent ?

llvm/lib/Object/XCOFFObjectFile.cpp
216

const int16_t SectNum ?

477

const int16_t SectionNum ?

DiggerLin added inline comments.Mar 18 2021, 1:51 PM
llvm/lib/Object/XCOFFObjectFile.cpp
781

const int16_t SectNum

DiggerLin added inline comments.Mar 18 2021, 2:09 PM
llvm/lib/Object/XCOFFObjectFile.cpp
815

create a static function getSymbolAuxType in a file scope maybe better?
All the Aux symbol of 64bit all has the AuxType . we can use the function for other type too in other place later ?

DiggerLin added inline comments.Mar 19 2021, 8:07 AM
llvm/lib/Object/XCOFFObjectFile.cpp
775

if we not enable -ffunction-sections , function entry is label.

815

there already has a function on XCOFFCsectAuxRef ::getAuxType64()

Esme added a subscriber: Esme.Apr 19 2021, 1:17 AM

Hi Jason, what's your plan about the patch? When will you move forward with it?
Currently D100375 relies on the patch for its 64-bit llvm-readobj. In additional, D97656, D99164 and D98003 should be rebased on it.
Besides, I would like to add support for the line number dump in llvm-objdump if this patch is ready.
It looks like this is a fundamental patch for tools implementation. And it looks good except for some comments haven't been addressed yet.
Please let me know if I can be of any help. Thanks!

@Esme I will rebase and address comments asap.

jasonliu updated this revision to Diff 338892.Apr 20 2021, 9:11 AM
jasonliu marked 3 inline comments as done.

Rebase and Address comments.

llvm/include/llvm/Object/XCOFFObjectFile.h
171

No, you could not do that. It only works if Entry32 and Entry64 are the same type. But they are not here.

426

I don't see a need to jump backward now. If it's needed in the future, we could always change in the future patch.

433

I think we are already doing enum mapping in tools/llvm-readobj/XCOFFDumper.cpp. I don't see a strong need to create an enum for it.

490

I would prefer to be more explicit here because we are doing a conversion to larger value for 32 bit version, which is different from the rest of GETVALUE(Value).

llvm/lib/Object/XCOFFObjectFile.cpp
775

Thanks. I brought back the old behavior and added the FIXME to say that this function does not return a correct value if we have -ffunction-sections enabled.

815

Yes, there is a XCOFFCsectAuxRef ::getAuxType64(), but if you notice, this function is used to create an XCOFFCsectAuxRef object. So you don't have that function available in the creator.
And I don't think a static function is needed, because when you created XCOFFCsectAuxRef object through this function, then you could call the XCOFFCsectAuxRef ::getAuxType64() to get your type. So this lambda should only exists in this function.

jasonliu updated this revision to Diff 338896.Apr 20 2021, 9:16 AM
jasonliu marked 2 inline comments as done.
jasonliu updated this revision to Diff 340857.Apr 27 2021, 8:33 AM

Address clang-tidy comment and added binary file.

jsji added a reviewer: Restricted Project.Apr 27 2021, 9:00 AM
jhenderson added inline comments.Apr 28 2021, 6:28 AM
llvm/lib/Object/XCOFFObjectFile.cpp
805

Is this error user-facing (I'm assuming so)? Assuming it is, you should record here which symbol is causing the problem. Otherwise the user will be faced with an error along the lines of this:

error: this csec symbol contains no auxiliary entry

which is not really actionable (imagine the input had 100000 symbols in - the user can't realistically go through each to find the offending one).

834

Similar to my above comment - which entry was not found? Give the user more context so that they can act on the problem.

llvm/tools/llvm-readobj/XCOFFDumper.cpp
394

This will write the error inline, rather than to stderr. Are you sure that's what you want? it isn't what most dumping tools do on failure.

jasonliu updated this revision to Diff 341230.Apr 28 2021, 9:07 AM

Address comments.

jasonliu marked 3 inline comments as done.Apr 28 2021, 9:07 AM
jhenderson added inline comments.Apr 29 2021, 12:22 AM
llvm/lib/Object/XCOFFObjectFile.cpp
805

I believe this requires std::move(Error);, as you're returning an Expected, not an Error.

809–810

I think it would make more sense to insert the name in the middle of the message to make it a bit more concise. Something like:
"csect symbol name contains no auxiliary entry"

839

I'd suggest quoting somehow the symbol name, so that any whitespace or similar that happens to end up in the name (rare, but possible to write using assembly, at least for other platforms) is easily understood to be part of the name.

llvm/tools/llvm-readobj/XCOFFDumper.cpp
392–393

Use reportError or reportWarning so that the error is reported in a clean manner and consistent with other llvm-readobj varieties, and not report_fatal_error which looks like a crash. General rule of thumb: try to avoid using report_fatal_error, especially in tool code where it is easy to report errors properly.

In llvm-readobj for ELF, we try to avoid even using reportError where possible, as that stops the tool from continuing dumping, which can be problematic occasionally. We prefer reportWarning (or more specifically the local reportUniqueWarning which avoids reporting the same warning multiple times) and bailing out of the current routine. Take a look at ELFDumper.cpp for examples.

jasonliu updated this revision to Diff 341639.Apr 29 2021, 2:21 PM
jasonliu marked 3 inline comments as done.

Address comments.

llvm/tools/llvm-readobj/XCOFFDumper.cpp
392–393

Thanks for the elaboration. That clears up things a lot for me. I will use reportUniqueWarning here.

jhenderson added inline comments.Apr 30 2021, 1:34 AM
llvm/tools/llvm-readobj/XCOFFDumper.cpp
392–393

reportUniqueWarning can take an Error directly, so you can just do:

if (!ErrOrCsectAuxRef)
  reportUniqueWarning(ErrOrCsectAuxRef.takeError());

Also, be careful, as the program continues, so referencing ErrOrCsectAuxRef after this may result in things going wrong...

jasonliu updated this revision to Diff 341894.Apr 30 2021, 7:22 AM
jasonliu marked an inline comment as done.

Address comments.

@jhenderson @DiggerLin @Esme
Any more comments?

I'll try to take another look in the next day or two. The pre-merge bots are failing. Is that an issue with this patch?

@jhenderson @DiggerLin @Esme
Any more comments?

I'll try to take another look in the next day or two. The pre-merge bots are failing. Is that an issue with this patch?

Thanks a lot.
I think it's more of an issue that the binaries did not get into the phabricator probably. Not really an issue with the patch itself.

jhenderson accepted this revision.May 10 2021, 1:14 AM

Only some minor nits, otherwise LGTM. I haven't attempted to review the test coverage for all the new code, as I don't feel like I'm in a good position to do that, not being an XCOFF developer. I assume someone else has though.

llvm/include/llvm/Object/XCOFFObjectFile.h
481
483
llvm/lib/Object/XCOFFObjectFile.cpp
775
784

Basically any time you use consumeError, add a comment explaining why it's justified that we don't report the error to the user.

This revision is now accepted and ready to land.May 10 2021, 1:14 AM
DiggerLin added inline comments.May 10 2021, 12:37 PM
llvm/tools/llvm-readobj/XCOFFDumper.cpp
380

this is only for the 32bits .
" By convention, the csect auxiliary entry in an XCOFF32 file must be the last auxiliary entry for any external symbol that has more than one auxiliary entry"

for 64bit, it maybe look for the x_auxtype ==AUX_CSECT

DiggerLin added inline comments.May 10 2021, 1:03 PM
llvm/tools/llvm-readobj/XCOFFDumper.cpp
185

if (AuxEntPtr->AuxType != XCOFF::AUX_FILE ) , it should not be parsed as
XCOFF::AUX_FILE
it may better to print out raw data in the printSymbol()

234

if (AuxEntPtr->AuxType != XCOFF::AUX_CSECT) , it should not be parsed as
XCOFF::AUX_CSECT above
it may better to print out raw data in the printSymbol()

jasonliu updated this revision to Diff 345227.May 13 2021, 11:15 AM
jasonliu marked 4 inline comments as done.

Address comments.

jasonliu marked 2 inline comments as done.May 13 2021, 11:18 AM
jasonliu added inline comments.
llvm/tools/llvm-readobj/XCOFFDumper.cpp
185

I think it's the caller's responsibility to make sure they are passing in the right auxiliary type. So we should assume when we enter this function, we have the right auxiliary type here. So I modified it and made it an assert instead.

234

Same above. I modified it to be an assertion instead.

380

Good point. Updated the code.

DiggerLin added inline comments.May 13 2021, 12:07 PM
llvm/lib/Object/XCOFFObjectFile.cpp
94

SymbolAuxType is only for the 64 bits.
add assert(is64Bit() ) before return ?

llvm/tools/llvm-readobj/XCOFFDumper.cpp
366

as you mention "I think it's the caller's responsibility to make sure they are passing in the right auxiliary type. "

I think we need to check the AuxType == XCOFF::AUX_FILE for 64 bits. if not , print out the raw data as AUX_CSECT did ?

DiggerLin added inline comments.May 13 2021, 12:22 PM
llvm/tools/llvm-readobj/XCOFFDumper.cpp
394

we have iterated auxiliary entries from line 382~396 and reiterated again in the printCsectAuxEnt(). I think we can improve on it.

And in 64bits, The auxiliary entries maybe be reordered in above implement.
it will print out all no AUX_CSECT auxiliary entries first and then AUX_CSECT entry, even if the AUX_CSECT is in the middle of the auxiliary entries
and if there are two AUX_CSECT on auxiliary entries, we only print out one.

jasonliu marked 2 inline comments as done.May 13 2021, 12:47 PM
jasonliu added inline comments.
llvm/tools/llvm-readobj/XCOFFDumper.cpp
394

I don't think we reiterated again in printCsectAuxEnt() for other auxiliary entries.
We did similar things for 382~396 in getXCOFFCsectAuxRef, but that function is specifically designed to only get XCOFFCsectAuxRef. So it's really not intended to pass information out there.
I don't really think the re-iteration should be a big concern because in theory 382~396 won't be executed all that much. If it does, I would rather have it implement properly (i.e. actually recognize the missing auxiliary entry) instead of just printing out raw datas.

jasonliu updated this revision to Diff 345261.May 13 2021, 1:12 PM
jasonliu marked an inline comment as done.
jasonliu marked an inline comment as done.
DiggerLin accepted this revision.May 13 2021, 1:59 PM

LGTM with address comment.

llvm/tools/llvm-readobj/XCOFFDumper.cpp
367

it need "continue;"
here

and please run clang-format

jasonliu updated this revision to Diff 345283.May 13 2021, 2:12 PM

Address comments.

I took a quick look at the pre-merge output. The message is that the object isn't recognised as a valid object file, not that it wasn't found, which suggests either the object or code is broken in some manner, or you're missing a dependent patch. Does this patch depend on another patch that isn't in main yet?

llvm/tools/llvm-readobj/XCOFFDumper.cpp
358

i -> I

(you're changing most of the loop body - you might as well fix this whilst you're here)

364

Test case?

382–384

My English language ping went off at the word "till" in these two sentences. I'd probably change it to "to". Also, use "first" rather than "1st", I suggest in both places.

Also "skips" -> "skip" for grammatical consistency.

386

i -> I

jasonliu updated this revision to Diff 345492.May 14 2021, 10:46 AM

Address comments.

I took a quick look at the pre-merge output. The message is that the object isn't recognised as a valid object file, not that it wasn't found, which suggests either the object or code is broken in some manner, or you're missing a dependent patch. Does this patch depend on another patch that isn't in main yet?

No this patch does not depend on other patches. I think this is caused by git generated binary for the patch doesn't really assemble to the same one?

I took a quick look at the pre-merge output. The message is that the object isn't recognised as a valid object file, not that it wasn't found, which suggests either the object or code is broken in some manner, or you're missing a dependent patch. Does this patch depend on another patch that isn't in main yet?

No this patch does not depend on other patches. I think this is caused by git generated binary for the patch doesn't really assemble to the same one?

Sounds vaguely plausible, I guess.

llvm/test/tools/llvm-readobj/XCOFF/file-aux-wrong64.test
1–2 ↗(On Diff #345492)

Fix a couple of grammar issues and a premature line-wrap.

4–5 ↗(On Diff #345492)
19 ↗(On Diff #345492)

The output on this line looks incorrect to me. Possibly a bug in the code resulting from a missing newe line?

jasonliu updated this revision to Diff 345903.May 17 2021, 8:50 AM

Address comments.

jasonliu marked 2 inline comments as done.May 17 2021, 8:51 AM
jasonliu added inline comments.
llvm/test/tools/llvm-readobj/XCOFF/file-aux-wrong64.test
19 ↗(On Diff #345492)

I added an newline after all the raw bytes. Other than that, I think the output is good. We have 18 bytes per symbol table entry, and we are printing 18 raw bytes here.

jhenderson added inline comments.May 18 2021, 12:17 AM
llvm/test/tools/llvm-readobj/XCOFF/file-aux-wrong64.test
19 ↗(On Diff #345492)

I don't know if it is, but you probably want 00fb indented to line up nicely with the previous line. You can then confirm that this indentation is maintained by enabling --match-full-lines and --strict-whitespace in FileCheck. (If you do that, you'll need to remove the space after CHECK-NEXT:)

jasonliu updated this revision to Diff 348533.May 28 2021, 8:20 AM

Adjustment the print out.

jasonliu added inline comments.May 28 2021, 8:21 AM
llvm/test/tools/llvm-readobj/XCOFF/file-aux-wrong64.test
19 ↗(On Diff #345492)

Hi James, I adjusted the code to print 00fb in the same line, I think that actually works better because we could have multiply auxiliary entry data to print out. Putting each in the same line is easier to parse.

jhenderson accepted this revision.Jun 7 2021, 1:22 AM

Looks good again. Sorry for the delay - I was off work last week.

This revision was landed with ongoing or failed builds.Jun 7 2021, 10:25 AM
This revision was automatically updated to reflect the committed changes.

Looks good again. Sorry for the delay - I was off work last week.

No worries. Thanks for all the reviews.