Page MenuHomePhabricator

implement of the parsing symbol table for xcoffobjfile and output as yaml format
AcceptedPublic

Authored by DiggerLin on Fri, May 3, 1:12 PM.

Details

Summary

the implement of the parsing symbol table do not support parsing auxiliary entries of symbol. I will create a new issue to implement it.

the xcoff object file (aix_xcoff.o)which used in the test are comed from

-bash-4.2$ cat test.c
extern int i;
extern int TestforXcoff;
int main()
{
i++;
TestforXcoff--;
}

Diff Detail

Repository
rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden. Show Older Changes
hubert.reinterpretcast requested changes to this revision.Tue, May 14, 4:58 PM
hubert.reinterpretcast added inline comments.
llvm/include/llvm/Object/XCOFFObjectFile.h
181

s/Note that return/Returns/;

185

s/Return/Returns a/;

llvm/tools/obj2yaml/xcoff2yaml.cpp
64

Length mismatch.

llvm::yaml::Hex16 Value; // Symbol value; storage class-dependent.
support::ubig32_t Value; // Symbol value; storage class-dependent.
This revision now requires changes to proceed.Tue, May 14, 4:58 PM
llvm/test/tools/obj2yaml/aix_xcoff.test
17

The question for this line is whether printing as a single type field is appropriate. This is a C_FILE entry where n_lang is C (0x00) and n_cputype is COM (0x03). That is, perhaps we should have inspected CFileLanguageIdAndTypeId?

llvm/lib/Object/XCOFFObjectFile.cpp
57

Use Size instead of size to match Name and also the corresponding parameter in checkSize.

107

Hardcoded 0 should be replaced with something that indicates the semantic: XCOFFSymbolEntry::NAME_IN_STR_TBL_MAGIC or at least 0x00000000.

111

By convention, a storage class value with the high-order bit on indicates that this field is a symbolic debugger stabstring.

The above means that the check should be for SectionNumber < 0.

111

If an error recovery path is thought to be necessary, then llvm_unreachable is the wrong answer. Since the error recovery seems reasonable, use assert instead.

118

A byte offset value of 0 is a null or zero-length symbol name.

Is the "less than 4" an invention of this patch? A byte offset of 1, 2, or 3 points into the length field. It is well-formed as claimed by the comment in the patch, or is it erroneous?

123

Check that the Offset is within the string table.

125

It seems that there should be no period in the string:

llvm/lib/Object/WasmObjectFile.cpp:        make_error<StringError>("Bad magic number", object_error::parse_failed);
DiggerLin updated this revision to Diff 199609.Wed, May 15, 7:46 AM
DiggerLin marked 9 inline comments as done.

changed the code based on comment

llvm/include/llvm/Object/XCOFFObjectFile.h
181

fixed. thanks

185

fixed , thanks

llvm/test/tools/obj2yaml/aix_xcoff.test
17

Thanks for Hubert let me I make a wrong answer.

  1. for the question of " Is the intent to always keep dumping the type as a numeric value? For this tool I think that's reasonable but I'd like to clarify."

Answer:
in the xcoff document ,
n_type :
Bits 0-3
Symbol visibility. The SYM_V_MASK macro, with the value 0xF000, can be used to mask off bits in the n_type field that do not specify visibility. The following visibilities are defined:
0x1000 SYM_V_INTERNAL
0x2000 SYM_V_HIDDEN
0x3000 SYM_V_PROTECTED
0x4000 SYM_V_EXPORTED

Bit 10
Optionally set to 1 if the symbol is a function. Otherwise, it is set to 0.

I can not use enum to represent the value. I have to display as a value.

  1. For the question "This is a C_FILE entry where n_lang is C (0x00) and n_cputype is COM (0x03). That is, perhaps we should have inspected CFileLanguageIdAndTypeId?"

Answer: I do not think yaml framework support showing different field name based on the value of the field.

The field name of yaml output are defined in the function, Once it is defined , we can not change the field name based on the value.

void MappingTraits<XCOFFYAML::Symbol>::mapping(IO &IO, XCOFFYAML::Symbol &S) {

IO.mapRequired("Name", S.SymbolName);
IO.mapRequired("Value", S.Value);
IO.mapRequired("Section", S.SectionName);
IO.mapRequired("Type", S.Type);
IO.mapRequired("StorageClass", S.StorageClass);
IO.mapRequired("NumberOfAuxEntries", S.NumberOfAuxEntries);

}

llvm/tools/obj2yaml/xcoff2yaml.cpp
64

thanks for point out. fixed

llvm/lib/Object/XCOFFObjectFile.cpp
98–101

There's still a stray return statement here.

441

Why the extra parentheses around StringTable.Size?

llvm/lib/ObjectYAML/XCOFFYAML.cpp
17

Both of the above includes are included directly by XCOFFYAML.h for which this is the implementation file.

llvm/test/tools/obj2yaml/aix_xcoff.test
17

What happens with something like:

llvm/lib/ObjectYAML/COFFYAML.cpp:  IO.mapOptional("SymbolTableIndex", Rel.SymbolTableIndex);

?

DiggerLin marked 8 inline comments as done.

changed code based the comment-May 15,2019

DiggerLin marked 5 inline comments as done.

changed code based on comment

llvm/lib/Object/XCOFFObjectFile.cpp
57

fixed , thanks

98–101

deleted

107

changed as suggestion

111

we do not support xcoff object file which generated by -g currently. we will support it later, and symbol name in the .debug section later. So I think use llvm_unreachable maybe more reasonable.

118

in the xcoff document, it describe as " A byte offset value of 0 is a null or zero-length symbol name."

A byte offset of 1, 2, or 3 points into the length field , the document do not talk about anything on it. we look it as "zero-length symbol name" . If we can treated the "A byte offset of 1, 2, or 3" as error format, and return as
make_error<GenericBinaryError>("Symbol Name offset error",object_error::parse_failed);
But I think looking "A byte offset of 1, 2, or 3" as "zero-length symbol name" will make obj2yaml more compatible。

123

Good idea, added offset check with the string table size. thanks

125

deleted the period.

441

deleted parentheses

llvm/lib/ObjectYAML/XCOFFYAML.cpp
17

good point, I deleted all redundant header files

llvm/test/tools/obj2yaml/aix_xcoff.test
17

The field name of yaml output are defined in the function, Once it is defined , we can not change the field name based on the value.

void MappingTraits<XCOFFYAML::Symbol>::mapping(IO &IO, XCOFFYAML::Symbol &S) {

IO.mapRequired("Name", S.SymbolName);
IO.mapRequired("Value", S.Value);
IO.mapRequired("Section", S.SectionName);
IO.mapRequired("Type", S.Type);
IO.mapRequired("StorageClass", S.StorageClass);
IO.mapRequired("NumberOfAuxEntries", S.NumberOfAuxEntries);
}

17

when you did the first comment on it , I think over to use IO.mapOptional, But I denied it in final, "n_type" is not optional field in the xcoff object file. if I use optional field for the CFileLanguageIdAndTypeId , it means that CFileLanguageIdAndTypeId is optional field of symbol entry , It may cause confuse, it actually only a different interpret of the n_type.

DiggerLin marked 3 inline comments as done.

changed code based on comment

DiggerLin added inline comments.Wed, May 15, 12:35 PM
llvm/lib/Object/XCOFFObjectFile.cpp
111

changed to assert.

llvm/tools/obj2yaml/xcoff2yaml.cpp
34

if symbol name parse failed .
it return std::error_code which value is 3 . defined in
Error.h

enum class object_error {

// Error code 0 is absent. Use std::error_code() instead.
arch_not_found = 1,
invalid_file_type,
parse_failed,
unexpected_eof,
string_table_non_null_end,
invalid_section_index,
bitcode_section_not_found,
invalid_symbol_index,

};

deleted a #include "llvm/Support/YAMLTraits.h" from llvm/lib/ObjectYAML/XCOFFYAML.cpp

llvm/lib/Object/XCOFFObjectFile.cpp
111

Suggestion: Access to symbolic debugger stabstrings from .debug section not yet implemented!
Also, reminder re: https://reviews.llvm.org/D61532?id=198073#inline-550220.

118

The comment should be clear as to what is part of the specification and what is part of the implementation. Suggestion:
A byte offset value of 0 is a null or zero-length symbol name. A byte offset in the range 1 to 3 (inclusive) points into the length field; as a soft-error recovery mechanism, we treat such cases as having an offset of 0.

125

I still see the period.

llvm/tools/obj2yaml/xcoff2yaml.cpp
34

Sean is pointing out that there is no return statement in the case where there is no error on the call to dumpSymbols.

llvm/lib/Object/XCOFFObjectFile.cpp
110

We still should not reach the next line in release mode.

MaskRay added inline comments.Thu, May 16, 1:10 AM
llvm/tools/obj2yaml/xcoff2yaml.cpp
50

Return Error.

DiggerLin updated this revision to Diff 199886.Thu, May 16, 1:00 PM
DiggerLin marked 4 inline comments as done.

modified code based on comment

llvm/lib/Object/XCOFFObjectFile.cpp
111

thanks for the information, using "By convention, a storage class value with the high-order bit on indicates that this field is a symbolic debugger stabstring." is more exact than assert(SymEntPtr->SectionNumber != XCOFF::N_DEBUG)

llvm/tools/obj2yaml/xcoff2yaml.cpp
34

thanks for clarify the question, fixed it.

50

thanks for the information. But I think I still keep std::error_code XCOFFDumper::dumpSymbols() as it now. I will schedule a new patch after this patch upstream.

the new patch will change the high level interface
static std::error_code dumpObject(const ObjectFile &Obj) {

if (Obj.isCOFF())
  return coff2yaml(outs(), cast<COFFObjectFile>(Obj));

if (Obj.isXCOFF())
  return xcoff2yaml(outs(), cast<XCOFFObjectFile>(Obj));

if (Obj.isELF())
  return elf2yaml(outs(), Obj);
if (Obj.isWasm())
  return wasm2yaml(outs(), cast<WasmObjectFile>(Obj));

return obj2yaml_error::unsupported_obj_file_format;

}

to

Expected<Error> dumpObject(const ObjectFile &Obj) as your suggestion.

if change to Expected<Error> dumpObject(const ObjectFile &Obj). I also need to modified the function
coff2yaml();
elf2yaml();
wasm2yaml();

DiggerLin marked 3 inline comments as done.Thu, May 16, 1:05 PM
DiggerLin added inline comments.
llvm/lib/Object/XCOFFObjectFile.cpp
111

fixed. thanks

118

changed the comment as suggestion

125

the period gone now

I think this is pretty much there. One last issue for this patch, and then I think we can leave further tweaks for later patches.

llvm/lib/Object/XCOFFObjectFile.cpp
115

I'm not sure what asserting on a asserts-enabled build but generating a hard-error on a release build buys us. If we are okay with a hard error, then we can just go with it for all builds. If we are not okay with a hard error, then we don't want the debug build to trip either (and a TODO comment with some recovery path is the right answer).

Note that llvm/test/tools/llvm-readobj/Inputs/xcoff-basic.o has such symbols.

sfertile added inline comments.Thu, May 16, 8:51 PM
llvm/include/llvm/Object/XCOFFObjectFile.h
159

separate the functions which are inherited from objectfile class from the functions api which only in xcoffobjectfile.

Thats a good idea, adding a brief comment to that end would work.

llvm/lib/Object/XCOFFObjectFile.cpp
98

Ok, that's pretty reasonable.

115

Note that llvm/test/tools/llvm-readobj/Inputs/xcoff-basic.o has such symbols.

We need to commit to implementing the debug name handling either in the same patch that enables dumping symbols from readobj or before that. It would be useless if we have to tip-toe around object files with debug info. Until then I think the assert alone is fine.

118

Could we have a valid offset of less than 4 if the name is in the debug section? I'm not suggesting we need to change in this patch, just considering the future.

401

I definitely agree the ELF way is better. The way I extended the parsing for 64-bit support though relies on it being done in the constructor since I rely on the file header accessors and section header accessors to abstract over the structure differences. I'll have a closer look at the ELF handling to see if we can move in that direction.

llvm/tools/obj2yaml/xcoff2yaml.cpp
34

Sorry, I should have been clearer.

llvm/lib/Object/XCOFFObjectFile.cpp
115

Printing unrelated strings or reporting spurious errors in release is not fine in my book. The hard-error alone in fine (no need for an assert).

118

Great question.

In llvm/test/tools/llvm-readobj/Inputs/xcoff-basic.o:

0000620:                          0000 0000 0000
0000630: 0002 0000 0000 fffe 0000 8c00

The offset into the debug section is 2.

daltenty requested changes to this revision.Thu, May 23, 8:08 AM
daltenty added a subscriber: daltenty.
daltenty added inline comments.
llvm/include/llvm/Object/XCOFFObjectFile.h
71

Building with this section with -DLLVM_ENABLE_WERROR=ON results in "error: anonymous types declared in an anonymous union are an extension"

83

Same problem with anonymous types with this union

This revision now requires changes to proceed.Thu, May 23, 8:08 AM
DiggerLin updated this revision to Diff 200990.Thu, May 23, 8:57 AM
DiggerLin marked 4 inline comments as done.

fixed the code based on the comment

LGTM with additional small changes (including the changes requested by @daltenty).

llvm/lib/Object/XCOFFObjectFile.cpp
110

Add a comment:

// A storage class value with the high-order bit on indicates that the name is a
// symbolic debugger stabstring.
111

Missing "return".

DiggerLin updated this revision to Diff 201255.Fri, May 24, 8:18 AM
DiggerLin marked 3 inline comments as done.

fixed the compiler error of
"anonymous types declared in an anonymous union are an extension

[-Wnested-anon-types] "

when compile with clang++

llvm/include/llvm/Object/XCOFFObjectFile.h
71

I have fixed the problem

83

I build success as
bash-4.2$ g++ -std=c++11 -I/scratch/zhijian/llvm/src/llvm/include -I/scratch/zhijian/llvm/build/include -c llvm/include/llvm/Object/XCOFFObjectFile.h

83

I have fixed the problem .

llvm/lib/Object/XCOFFObjectFile.cpp
115
  1. I deleted the assert
  2. print symbol in as "Unimplemented Debug Name" after I discussed with Sean
daltenty accepted this revision.Fri, May 24, 1:42 PM

LGTM (with addition of comment @hubert.reinterpretcast suggested above)

This revision is now accepted and ready to land.Fri, May 24, 1:42 PM

There are still unaddressed comments.

llvm/lib/Object/XCOFFObjectFile.cpp
111

Please fix this.