This is an archive of the discontinued LLVM Phabricator instance.

Output XCOFF object text section header and symbol entry for program code
ClosedPublic

Authored by DiggerLin on Aug 29 2019, 11:44 AM.

Details

Summary

Original form of this patch is provided by Stefan Pintillie.

  1. The patch tries to output the program code section header, the symbol entries for program code (PR), and the instructions into the raw text section.
  2. The patch includes how to align and lay out the csects in the text section.
  3. The patch also reorganizes the code, moving some of it into a function (XCOFFObjectWriter::writeSymbolTableEntryForControlSection).

Additional: We cannot add a test of the raw data of the text section in this patch. To output raw text section data, a function description patch is needed first.

Diff Detail

Repository
rL LLVM

Event Timeline

There are a very large number of changes, so older changes are hidden.
DiggerLin updated this revision to Diff 219365.Sep 9 2019, 8:42 AM
DiggerLin marked 3 inline comments as done.
hubert.reinterpretcast requested changes to this revision.Sep 11 2019, 7:37 PM
hubert.reinterpretcast added inline comments.
llvm/lib/MC/XCOFFObjectWriter.cpp
53

Is this answer in relation to some specific use of Syms or its containing class?

153

Stray top-level const on function parameter declaration on a function declaration.

155

Stray top-level const on function parameter declaration on a function declaration.
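
For illustration, a minimal sketch of why the qualifier is stray on a declaration: a top-level const on a by-value parameter is not part of the function type, so the declaration and the definition below name the same function. The name nextSectionIndex is hypothetical and is not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical function, not from the patch. The declaration omits the
// top-level const; callers could not observe it anyway.
int16_t nextSectionIndex(int16_t SectionIndex);

// The definition may still add const to protect its local copy. It defines
// the same function as the declaration above.
int16_t nextSectionIndex(const int16_t SectionIndex) {
  return static_cast<int16_t>(SectionIndex + 1);
}

// Top-level const on a parameter does not change the function type.
static_assert(std::is_same<void(const int16_t), void(int16_t)>::value,
              "top-level const is dropped from the function type");
```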

296

Please don't copy the ControlSections.

296

We already use Sec and Csect for similar cases. I don't think we need to add CSection into the mix.

301

Is the llvm:: qualification necessary?

325

putNameInStringTable sounds like an action. Maybe use nameShouldBeInStringTable.

326

Names of NameSize length are placed in the symbol table.

Please add appropriate testing. This is a regression that was not caught.
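
A minimal sketch of the boundary being described, assuming a standalone NameSize constant of 8 that mirrors XCOFF::NameSize. The helper below is illustrative only and is not the code from the patch: a name of exactly NameSize characters still fits in the symbol table entry, and only longer names go to the string table.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Assumption: mirrors XCOFF::NameSize (8 bytes for the in-entry name field).
constexpr std::size_t NameSize = 8;

// Returns true only when the name is too long for the symbol table entry.
// Note the strict comparison: a name of exactly NameSize characters stays
// in the symbol table.
bool nameShouldBeInStringTable(const std::string &SymbolName) {
  return SymbolName.size() > NameSize;
}
```

A test for the regression would cover both sides of the boundary, for example an 8-character name and a 9-character name.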

329

This function is misnamed for what it does. It writes the symbol table entry name field, not the symbol name.

461–471

Same question re: llvm:: as before.

463

Please don't copy the ControlSections.

This revision now requires changes to proceed.Sep 11 2019, 7:37 PM
llvm/lib/MC/XCOFFObjectWriter.cpp
331

Minor nit: This was fixed in D65159 to use int32_t (not that there is a real difference). However this is my second comment on this patch regarding its replacement of more correct code with less correct code.

352

Neither GCC nor XL fail to set the function property when debugging is enabled. This is at least a TODO.

353

s/symbol/symbol's/;

360

Since the field is named SectionLen in llvm::object::XCOFFCsectAuxEnt32, a comment is warranted regarding its use also for referencing the containing csect by symbol table index. Please also add a comment in include/llvm/Object/XCOFFObjectFile.h.

366

This field is unsigned; see D65159:

W.write<uint8_t>(getEncodedType(Sec.MCCsect));
368

Ditto.

370

Ditto.

372

Ditto.

llvm/lib/MC/XCOFFObjectWriter.cpp
341

This function is misnamed. It neither handles all variations of the LLVM concept of "symbol" nor the same for the XCOFF concept of "symbol". Maybe use writeSymbolTableEntryForCsectMemberLabel.

343

Top-level const still not fixed here.

376

Ditto re: top-level const.

llvm/test/CodeGen/PowerPC/aix-return55.ll
8

foo: with the colon. .foo: if possible.

llvm/lib/MC/XCOFFObjectWriter.cpp
385

Please add a comment about this value being always zero (for now). Again, this was better in the existing code before this patch.

398

Unsigned.

400

Unsigned.

402

Unsigned.

404

Unsigned.

501

Technically this assert is too late. INT_MAX is allowed to be the same as INT16_MAX, so the addition may overflow even in the promoted type.
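
A minimal sketch of the ordering issue, with hypothetical names (SectionCount and reserveSectionIndex are not from the patch): the range check has to come before the arithmetic, because a check placed after the addition only sees a value that may already have overflowed.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: hands out the next 1-based section index, or returns
// false and leaves SectionCount untouched when no further index fits in
// int16_t.
bool reserveSectionIndex(int16_t &SectionCount, int16_t &IndexOut) {
  // Compare against the limit first, so the increment below cannot overflow.
  if (SectionCount >= std::numeric_limits<int16_t>::max())
    return false;
  SectionCount = static_cast<int16_t>(SectionCount + 1);
  IndexOut = SectionCount;
  return true;
}
```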

519

The string text should probably be about the size not being properly padded to a multiple of the section alignment.

527–528

Ditto regarding the assertion being too late.

The lack of testing with regards to what gets written for the raw data of the .text XCOFF section ought to be addressed.

DiggerLin marked 23 inline comments as done.Sep 17 2019, 7:44 AM
DiggerLin added inline comments.
llvm/lib/MC/XCOFFObjectWriter.cpp
53

The problem happened in the code under // Print out symbol table for the data section.

// Print out symbol table for the program code.
for (const auto CSection : ProgramCodeCsects) {
  // Write out the control section first and then each symbol in it.
  writeSymbolTableEntryForControlSection(CSection, Text.Index,
                                         CSection.MCCsect->getStorageClass());
  for (const auto Sym : CSection.Syms) {
    writeSymbolTableEntryForCsectMemberLabel(Sym, CSection, Text.Index,
                                   Layout.getSymbolOffset(*(Sym.MCSym)));
  }
} 
for (const auto CSection : ProgramCodeCsects) {

it will cause a call to Symbol::operator=. since we will change to for (const auto& CSection : ProgramCodeCsects) in new copy, I will recover "const MCSymbolXCOFF *const MCSym;"
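
A minimal sketch of the difference between the two loop headers, using a hypothetical Csect type that counts its own copies. It illustrates only that const auto copies each element while const auto & does not; it does not reproduce the ControlSection or Symbol classes from the patch.

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for a control section; counts copy constructions.
struct Csect {
  static int Copies;
  int Address = 0;
  Csect() = default;
  Csect(const Csect &Other) : Address(Other.Address) { ++Copies; }
};
int Csect::Copies = 0;

// Iterating by value copy-constructs every element.
int copiesWhenIteratingByValue(const std::vector<Csect> &Csects) {
  Csect::Copies = 0;
  for (const auto C : Csects)
    (void)C;
  return Csect::Copies;
}

// Iterating by const reference binds to the elements without copying.
int copiesWhenIteratingByReference(const std::vector<Csect> &Csects) {
  Csect::Copies = 0;
  for (const auto &C : Csects)
    (void)C;
  return Csect::Copies;
}
```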

153

change the "const int16_t" to "int16_t"

155

change the "const int16_t" to "int16_t"

296

change as suggestion

301

yes, it is not necessary, the file already has using namespace llvm;

325

I prefer to use isNameInStringTable

326

changed, and I have asked Xing to add some test cases for a symbol name whose length is 8 and a symbol name whose length is larger than 8. @xingxue

329

The function writes the symbol name information, so I think writeSymbolName is OK.

331

changed.thanks

352

add a new TODO

353

done

368

changed.

372

changed

376

fixed , thanks

385

added a comment

398

changed.thanks

402

changed.thanks

404

changed.thanks

461–471

deleted

463

changed

501

I deleted the assert; I do not think we need an assert here.

DiggerLin marked 2 inline comments as done.Sep 17 2019, 7:44 AM
DiggerLin updated this revision to Diff 220502.Sep 17 2019, 7:46 AM

new patch which address comment

DiggerLin marked 3 inline comments as done.Sep 17 2019, 1:50 PM
DiggerLin added inline comments.
llvm/lib/MC/XCOFFObjectWriter.cpp
519

changed assert to Address= alignTo(Address, DefaultSectionAlign);

527–528

the assert will be deleted

llvm/test/CodeGen/PowerPC/aix-return55.ll
8

changed .thanks

DiggerLin updated this revision to Diff 220573.Sep 17 2019, 1:54 PM

address comment

xingxue added inline comments.Sep 20 2019, 1:54 PM
llvm/lib/MC/XCOFFObjectWriter.cpp
296

I'd suggest to use Csect to be consistent with what is used in this file for other similar cases.

sfertile added inline comments.Sep 23 2019, 8:07 AM
llvm/lib/MC/XCOFFObjectWriter.cpp
53

since we will change to for (const auto& CSection : ProgramCodeCsects) in new copy, I will recover "const MCSymbolXCOFF *const MCSym;"

👍

356

Minor nit: IIUC We aren't supporting the old XCOFF32 interpretation, but rather setting the bit (which is optional in the new interpretation) when debugging is enabled.

360

I'm not disagreeing with this, but it should be done in a separate patch.

490

Why did you change this, then add an 'Address' in each of the section layout calculations? We can use Text.Address/BSS.Address to calculate the sections size, so no need for an extra variable.

501

we assigned uint32_t Address = 0; at the beginning, so I do not think we need to assert when assigning Text.Address here, because it is always true here.

I'm suggesting we need to assign Text.Address = Address even though we expect Address to be zero. If, for example, we decided there should be a section mapped before the .text section in the object file, then Address may not be zero. I understand it's unlikely for us to make such a change, but that is not a good reason to skip assigning the address.

DiggerLin marked an inline comment as done.Sep 23 2019, 8:17 AM
DiggerLin added inline comments.
llvm/lib/MC/XCOFFObjectWriter.cpp
296

changed as suggestion

DiggerLin updated this revision to Diff 221340.Sep 23 2019, 8:22 AM
DiggerLin marked 2 inline comments as done.EditedSep 24 2019, 2:06 PM
llvm/lib/MC/XCOFFObjectWriter.cpp
490

There is a patch, https://reviews.llvm.org/D67125, which will change Address to SectionStartAddress. We think SectionStartAddress expresses the meaning better than Address.

We need another Address variable, as in uint32_t Address = SectionStartAddress;, where Address is used to store the start address of each csect.

There is one more variable, "Address", as you point out, but I think the code is more readable with the additional variable "Address".

501

added as suggestion

DiggerLin updated this revision to Diff 221597.Sep 24 2019, 2:09 PM

I might have more comments later, but I'm posting what I have currently.

llvm/lib/MC/XCOFFObjectWriter.cpp
325

This really sounds like the name is already in the string table.

351

The old comment is fine here. Please switch back to the shorter version. This is a part of the documentation that requires careful reading in context (and is a bad idea to quote portions from).

356

The second TODO should say:

// TODO Set the function indicator (bit 10, 0x0020) for functions
// when debugging is enabled.
391

Use the shorter version of the comment here too,

461–471

I'm not seeing the change. llvm:: still here.

DiggerLin marked 7 inline comments as done.Sep 25 2019, 12:41 PM

The lack of testing with regards to what gets written for the raw data of the .text XCOFF section ought to be addressed.

In this patch we cannot deal with the DS csect in the function void XCOFFObjectWriter::executePostLayoutBinding. We cannot generate the raw data of the ".text" XCOFF section.
There is an upcoming patch which deals with the DS csect; I think we will add the test in that patch @jasonliu.

llvm/lib/MC/XCOFFObjectWriter.cpp
325

changed as suggestion

341

changed as suggestion

351

changed as suggestion

351

changed as suggestion

356

changed as suggestion

391

changed as suggestion

DiggerLin marked an inline comment as done.Sep 25 2019, 12:42 PM
DiggerLin updated this revision to Diff 221826.Sep 25 2019, 1:25 PM
sfertile added inline comments.Sep 26 2019, 10:39 AM
llvm/lib/MC/XCOFFObjectWriter.cpp
354

Minor nit: Extra space after TODO

490

I don't see how the change helps readability. Originally we have a single variable that represent the next assignable address in the file. When we assign a section start address it simply starts at that next available address. After the change we now have a variable to track the SectionStartAddress, which doesn't seem any different or special compared to any of the other addresses we assign. And we have the extra overhead of creating a local and assigning that with the value of section start address, then modifying that in the loop making sure we have to copy its modified value out to SectionStartAddress at the end of the loop. Why is the sections start address different from any other address we assign in the loop? In what way does having the extra variable increase readability?

501

I'm OK with postponing the overflow checking for now, since there are only 3 sections we might be emitting.

llvm/test/CodeGen/PowerPC/aix-return55.ll
8

Still missing the : in the CHECK-LABEL line.

DiggerLin marked 3 inline comments as done.Sep 26 2019, 1:39 PM
DiggerLin added inline comments.
llvm/lib/MC/XCOFFObjectWriter.cpp
354

changed

490

changed as suggestion.

llvm/test/CodeGen/PowerPC/aix-return55.ll
8

added

DiggerLin updated this revision to Diff 222022.Sep 26 2019, 1:56 PM
llvm/lib/MC/XCOFFObjectWriter.cpp
298

We have code that introduces alignment padding, so I suspect we should have code here to write out the padding.

391

There are some cases here of two consecutive spaces. It is not necessarily wrong, but we should be consistent.

llvm/test/CodeGen/PowerPC/aix-xcoff-common.ll
84

I think this warrants more explanation. Is the ability of the regex to match non-space characters intended?

We cannot generate the raw data of the ".text" XCOFF section.
There is an upcoming patch which deals with the DS csect; I think we will add the test in that patch @jasonliu.

I believe that the .text section testing should be separate. The fact that the patch you mention would enable the ability to generate a non-zero size XCOFF .text section does not mean that it should cover testing beyond its own scope.

From the current patch, we have identified a need for follow-up patches to test .text sections that aren't empty, to test the placement of symbol names, and to address the naming of a field that this patch uses for a different purpose. If the second sounds out-of-place, then that is because the current commit message does not actually describe the scope of the patch well. The patch did a refactoring of interfaces, and that portion of the patch is why the second follow-up patch is tied to this one.

Please update the commit message to describe the scope of the current patch (especially that necessary dependencies for there to be non-empty .text sections have not landed) and to indicate a plan for follow-on patches.

llvm/lib/MC/XCOFFObjectWriter.cpp
528–529

A comment is necessary here regarding alignment padding between the start of the section and the first item allocated within the section (which might require a stricter alignment). A question is why such padding is done "physically" within the section as opposed to doing the padding "virtually" by adjusting the section's address in the address space.

DiggerLin retitled this revision from Output XCOFF object text section to Output XCOFF object text section header and symbol entry for program code.Oct 1 2019, 7:09 AM
DiggerLin marked 7 inline comments as done.Oct 1 2019, 8:45 AM
DiggerLin added inline comments.
llvm/lib/MC/XCOFFObjectWriter.cpp
298

Added new function to support it.

391

delete extras space.

528–529

added a comment

llvm/test/CodeGen/PowerPC/aix-xcoff-common.ll
84

In the output there is

Index: 0
Name: .text

before

Symbol {
Index: 2
Name: a

The original test case would match the .text symbol first, so the value of Index would be zero.

Using ; SYMS: Index: #Index:{{[[:space:]].*}}Name: a
makes sure that it matches

Symbol {
Index: 2
Name: a

and not the .text symbol, so the index value will be 2.

DiggerLin updated this revision to Diff 222633.Oct 1 2019, 8:48 AM
DiggerLin marked 2 inline comments as done.
DiggerLin edited the summary of this revision.
DiggerLin updated this revision to Diff 222649.Oct 1 2019, 10:43 AM

Added zero padding for the section and the csects.

DiggerLin updated this revision to Diff 222678.Oct 1 2019, 12:58 PM

Updated two test cases to use:
SYMS: Index: #Index:{{[[:space:]]*}}Name: a

We need to verify if we need the storage padding or not before finalizing, but otherwise the patch looks pretty good.

llvm/lib/MC/XCOFFObjectWriter.cpp
69

We might be able to skip writing the actual zeros for padding, and instead just pad out the virtual addresses assigned to the sections/csects. I believe @hubert.reinterpretcast was following up with one of the AIX system toolchain devs to verify.

528–529

I'm sorry Digger, but I am having trouble figuring out what the comment is trying to convey.

Is it that each section is padded out to DefaultSectionAlign so that Address must be properly aligned already?

Is it trying to explain how we don't have to align the section's virtual address any more than DefaultSectionAlign (even when contained csects may be more strictly aligned than that) because the section's alignment is immaterial?

DiggerLin updated this revision to Diff 223085.Oct 3 2019, 1:54 PM

The first csect of each section does not need zero padding. We also need to adjust the section's virtual address to the first csect's address.

DiggerLin edited the summary of this revision. Oct 7 2019, 8:40 AM
DiggerLin edited the summary of this revision. Oct 7 2019, 8:42 AM
llvm/lib/MC/XCOFFObjectWriter.cpp
87

Remove this field (see comments on later lines).

100

Remove this field (see comments on later lines).

503

The inter-csect padding is not really a property of the csect requiring alignment. We do not need to store this value here. The amount of padding to write can be determined by tracking the virtual address of the raw section data being written during the serialization into the object file. The next virtual address following the padding should be Csect.Address.

517

The Size field accounts for the padding. We do not need to store this value here. The amount of padding to write can be determined by tracking the virtual address of the raw section data being written during the serialization into the object file. The next virtual address following the padding should be Text.Address + Text.Size.
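
A minimal sketch of the approach described in the two comments above: no padding size is stored, and the writer derives it from the running virtual address while serializing. The Csect and Section types, the Data member, and serializeSection are hypothetical simplifications, not the classes in XCOFFObjectWriter.cpp.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the writer's bookkeeping.
struct Csect {
  uint32_t Address;  // Virtual address assigned during layout.
  std::string Data;  // Raw bytes of the csect.
};
struct Section {
  uint32_t Address;  // Virtual address of the section.
  uint32_t Size;     // Section size, tail padding included.
  std::vector<Csect> Csects;
};

// Serializes the raw section data, deriving all padding from addresses.
std::string serializeSection(const Section &Sec) {
  std::string Out;
  uint32_t CurrentAddressLocation = Sec.Address;
  for (const Csect &C : Sec.Csects) {
    // Inter-csect padding: the gap up to this csect's assigned address.
    if (uint32_t PaddingSize = C.Address - CurrentAddressLocation)
      Out.append(PaddingSize, '\0');
    Out += C.Data;
    CurrentAddressLocation = C.Address + static_cast<uint32_t>(C.Data.size());
  }
  // Tail padding: the gap up to the end address of the section.
  if (uint32_t PaddingSize = Sec.Address + Sec.Size - CurrentAddressLocation)
    Out.append(PaddingSize, '\0');
  return Out;
}
```

The sketch assumes that layout has already assigned non-overlapping, increasing addresses, so the subtractions cannot wrap.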

llvm/lib/MC/XCOFFObjectWriter.cpp
360

@DiggerLin, please post such a patch (perhaps going so far as to rename the field to SectionOrLength) so we do not lose track of this.

DiggerLin marked an inline comment as done.Oct 8 2019, 10:42 AM
DiggerLin added inline comments.
llvm/lib/MC/XCOFFObjectWriter.cpp
360

I created a NFC patch. https://reviews.llvm.org/D68650

DiggerLin marked 5 inline comments as done.Oct 9 2019, 8:56 AM
DiggerLin added inline comments.
llvm/lib/MC/XCOFFObjectWriter.cpp
87

After discussing with Sean, we decided to keep the variable PaddingSize here. The additional variable makes the code more readable and the padding logic easier to understand.

100

After discussing with Sean, we decided to keep the variable PaddingSize here. The additional variable makes the code more readable and the padding logic easier to understand.

503

I understand what you mean, but if we do not calculate the padding size here, we have to calculate it in XCOFFObjectWriter::writeSections, which will complicate the logic of writeSections.

For example:
// Write the program code control sections one at a time.

uint32_t PaddingSize = 0;   // Additional variable here.
for (auto It = ProgramCodeCsects.begin(); It != ProgramCodeCsects.end(); ++It) {
  // Write the padding computed on the previous iteration, if any.
  if (PaddingSize)
    W.OS.write_zeros(PaddingSize);
  // And I think I also need some comment to explain the following code.
  if (std::next(It) != ProgramCodeCsects.end())
    PaddingSize = std::next(It)->Address - It->Address - It->Size;
  Asm.writeSectionData(W.OS, It->MCCsect, Layout);
}
517

Same reason as above.

DiggerLin updated this revision to Diff 224128.Oct 9 2019, 12:50 PM

Changed the padding method per Hubert's suggestion.

DiggerLin updated this revision to Diff 224132.Oct 9 2019, 1:00 PM

Added an if (PaddingSize) guard.

llvm/lib/MC/XCOFFObjectWriter.cpp
296

Suggestion: CurrentAddressLocation

299

I think the code can be made sufficiently self-explanatory that we don't need a comment here.

302

The above write to PaddingSize is only read by the if and the use inside the if.

if (uint32_t PaddingSize = Csect.Address - CurrentAddressLocation)
308

Suggestion:
The size of the tail padding in a section is the end virtual address of the current section minus the end virtual address of the last csect in that section.

310
if (!ProgramCodeCsects.empty())

however, I suggest checking the section and not the group of csects (they aren't the same thing):

if (Text.Index != -1)
312

Same comment about the write to PaddingSize.

505

There's two spaces after the +.

517

Use "csect" instead of "Csect" when using the term in an English context where the word would not be capitalized.

Suggestion:
The first csect of a section can be aligned by adjusting the virtual address of its containing section instead of writing zeroes into the object file.
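
A minimal sketch of that layout rule, under stated assumptions: alignTo here is a local helper, Csect and SectionLayout are hypothetical types, and the caller passes in the default section alignment. The section's virtual address is taken from its first csect, so only gaps between later csects and the tail need zero bytes in the file.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Local helper (assumption): rounds Value up to a multiple of Align.
uint32_t alignTo(uint32_t Value, uint32_t Align) {
  return (Value + Align - 1) / Align * Align;
}

// Hypothetical, simplified csect record.
struct Csect {
  uint32_t Align;
  uint32_t Size;
  uint32_t Address;  // Filled in by layoutSection.
};
struct SectionLayout {
  uint32_t Address;
  uint32_t Size;
};

// Assigns csect addresses starting from the next available address and
// derives the section's address and size from them.
SectionLayout layoutSection(uint32_t NextAddress, std::vector<Csect> &Csects,
                            uint32_t DefaultSectionAlign) {
  uint32_t Address = NextAddress;
  for (Csect &C : Csects) {
    Address = alignTo(Address, C.Align);
    C.Address = Address;
    Address += C.Size;
  }
  // Pad the end of the section out to the default section alignment.
  Address = alignTo(Address, DefaultSectionAlign);
  // The section starts at its first csect, so no leading padding is stored.
  uint32_t Start = Csects.empty() ? NextAddress : Csects.front().Address;
  return {Start, Address - Start};
}
```

With a next available address of 4 and csects aligned to 32 and 8 bytes, the section starts at 32 rather than at 4 followed by 28 bytes of zeroes.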

528–529

The difference in the calculation for the virtual address of the .bss section and that of the .text section might complicate efforts to common up the handling. Note that a change in how the virtual address of .bss is calculated is within the scope of this patch because it changes the value from being always zero.

llvm/test/CodeGen/PowerPC/aix-xcoff-lcomm.ll
56

It seems this file was changed accidentally by today's updates. The (space) character before the * is correct.

DiggerLin marked 9 inline comments as done.Oct 10 2019, 10:40 AM
DiggerLin added inline comments.
llvm/lib/MC/XCOFFObjectWriter.cpp
296

changed as suggestion

299

deleted the comment.

308

changed comment as suggestion

310

changed as suggestion.

312

changed as suggestion.

505

deleted, thanks

517

changed as suggestion

528–529

changed , thanks

llvm/test/CodeGen/PowerPC/aix-xcoff-lcomm.ll
56

changed , thanks

Just some minor comments. I think this is almost ready.

llvm/lib/MC/XCOFFObjectWriter.cpp
150

This should be a static member function or a non-member function.

347

Maybe check for overflow here.

469

Please remove the excess parentheses.

llvm/test/CodeGen/PowerPC/aix-xcoff-common.ll
84

Can this be merged with the previous line?

SYMS:        Symbol {{[{][[:space:]] *}}Index: [[#Index:]]{{[[:space:]] *}}Name: a{{$}}
llvm/test/CodeGen/PowerPC/aix-xcoff-lcomm.ll
36

Same comment re: merging with the previous line.

56

Same comment re: merging with the previous line.

DiggerLin marked 6 inline comments as done.Oct 11 2019, 8:13 AM
DiggerLin added inline comments.
llvm/lib/MC/XCOFFObjectWriter.cpp
150

thanks for your suggestion.

347

added

469

deleted

llvm/test/CodeGen/PowerPC/aix-xcoff-common.ll
84

yes ,your suggestion is more reasonable.

llvm/test/CodeGen/PowerPC/aix-xcoff-lcomm.ll
36

changed

56

changed as suggestion.

DiggerLin updated this revision to Diff 224604.Oct 11 2019, 8:23 AM

LGTM with minor changes that can be made on the check-in.

llvm/lib/MC/XCOFFObjectWriter.cpp
346

This would not be sufficient to avoid overflow if SymbolOffset is less than UINT32_MAX away from UINT64_MAX, use: SymbolOffset <= UINT32_MAX - CSectionRef.Address.
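
A minimal sketch contrasting the two forms of the check. The function names are hypothetical; the parameter types follow the comment above (a 64-bit SymbolOffset and a 32-bit csect address).

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

constexpr uint32_t Max32 = std::numeric_limits<uint32_t>::max();

// Naive form: the 64-bit addition itself can wrap around, so a huge
// SymbolOffset can make the sum look small.
bool fitsNaive(uint64_t SymbolOffset, uint32_t CsectAddress) {
  return SymbolOffset + CsectAddress <= Max32;
}

// Suggested form: the subtraction cannot wrap because CsectAddress is at
// most Max32, and no addition is performed on SymbolOffset.
bool fitsChecked(uint64_t SymbolOffset, uint32_t CsectAddress) {
  return SymbolOffset <= Max32 - CsectAddress;
}
```

For ordinary values both forms agree; they differ only when SymbolOffset is within UINT32_MAX of UINT64_MAX.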

llvm/test/CodeGen/PowerPC/aix-xcoff-common.ll
84

Please use {{[{] instead of {{{ to avoid ambiguity as to where the regular expression starts and, if the regular expression starts with the first {{, to avoid the undefined results indicated by POSIX regarding { as the first character of an ERE.

llvm/test/CodeGen/PowerPC/aix-xcoff-lcomm.ll
56

Same comment.

This revision is now accepted and ready to land.Oct 11 2019, 10:15 AM
This revision was automatically updated to reflect the committed changes.
DiggerLin marked 3 inline comments as done.
DiggerLin edited the summary of this revision. Apr 11 2021, 6:35 PM
DiggerLin edited the summary of this revision. Apr 11 2021, 6:52 PM
DiggerLin edited the summary of this revision. Apr 11 2021, 6:58 PM
DiggerLin edited the summary of this revision. Apr 14 2021, 12:18 PM