This is an archive of the discontinued LLVM Phabricator instance.

[XCOFF] dynamically allocating the section and csect group
AbandonedPublic

Authored by shchenz on Mar 24 2021, 6:01 AM.

Details

Reviewers
jasonliu
jsji
hubert.reinterpretcast
daltenty
Group Reviewers
Restricted Project
Summary

We add DWARF support in https://reviews.llvm.org/D97184. For DWARF sections, because there are many of them especially for DWARF 5, 20 DWARF sections without split-dwarf feature and 30 sections with split-dwarf support, it is not good to predefine them as we do for csect sections(text, data, bss). We should dynamically allocate section entries for DWARF sections.

This patch refactors the generation for csect sections to be dynamically allocated, so csect sections can be handled the same way as DWARF sections.

This patch also refactors the groups to be dynamically allocated in one csect section.

Because of dynamically allocating, the order of the sections/groups is determined by the object writer's caller. As shown in the test, the order of BSSCsects and FuncDSCsects is changed after this patch.

Diff Detail

Event Timeline

shchenz created this revision.Mar 24 2021, 6:01 AM
shchenz requested review of this revision.Mar 24 2021, 6:01 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 24 2021, 6:01 AM
shchenz updated this revision to Diff 332962.Mar 24 2021, 6:29 AM

1: LINT warning fix

Thanks for working on this version of the patch. It's always good to have something concrete to talk about instead of imaging how it's going to look like when everything is in place.
I'm still not convinced that dynamically allocating csect groups is the way to go though.
As you could see in the patch, the code are more complicated with dynamic allocation, and less efficient(doing more loopings to find the correct CsectGroup).
Also, we now lose control of how the CsectGroup are going to layout inside of a section, and how each section is going to layout in the object file (It now comes in a first come first serve order I believe).
This could bring some practical issues for us. For example, we may prefer to have layouts like: text -> data -> bss -> tdata -> tbss -> dwarf raw datas. But now, I think it's possible that we start having text -> data -> tdata -> bss -> tbss -> dwarf raw datas, which is not quite right when you have tdata cut in between of normal datas.
And I don't see a lot of benefit of having a single vector to contain DWARF sections and csect section. In D97184, we actually handles DWARF sections and csect sections quite differently (i.e. a large block of if/else statement for dwarf and csects).
I would still prefer only dynamically allocating DWARF raw datas, and have pre-defined csect sections. It's less of an issue to dynamically allocating DWARF raw datas because we do not really care the layout between different dwarf raw datas (as long as they are at the end of the object files).

@hubert.reinterpretcast @daltenty @DiggerLin What's your opinions on this? Should we have an offline discussion about the direction?

llvm/lib/MC/XCOFFObjectWriter.cpp
95

If we are doing dynamic allocation for CsectGroup, why do we need to store pointers?

232

I don't think it's necessary to have a unique_ptr of UndefinedCsects as well.

shchenz added a comment.EditedApr 5 2021, 8:27 PM

hi @jasonliu , thanks a lot for your review.

Also, we now lose control of how the CsectGroup are going to layout inside of a section, and how each section is going to layout in the object file (It now comes in a first come first serve order I believe).
This could bring some practical issues for us. For example, we may prefer to have layouts like: text -> data -> bss -> tdata -> tbss -> dwarf raw datas. But now, I think it's possible that we start having text -> data -> tdata -> bss -> tbss -> dwarf raw datas, which is not quite right when you have tdata cut in between of normal datas.

I think the layout of the object should not be done in XCOFFObjectWriter. In file XCOFFObjectWriter, we only call executePostLayoutBinding which theoretically should not do something to the layout? Instead, if we have any specific requirement for the layout of the object, does it make more sense to implement it in another virtual function finishLayout() or in MCAssembler::layout()? finishLayout() is before the calling to executePostLayoutBinding in class MCAssembler.

And I don't see a lot of benefit of having a single vector to contain DWARF sections and csect section. In D97184, we actually handles DWARF sections and csect sections quite differently (i.e. a large block of if/else statement for dwarf and csects).

Hmm, I did a prototype to change the patch D97184 to create a virtual function writeSection() in class XCOFFObjectWriter and override this function in class DwarfSectionEntry and CsectSectionEntry, so we can use a single interface to write sections for both control section and DWARF section. We can also do the same change for other interfaces like writing symbols. I gave this up to avoid to big changes and thought maybe it is better to do that in another patch.

I would still prefer only dynamically allocating DWARF raw datas, and have pre-defined csect sections. It's less of an issue to dynamically allocating DWARF raw datas because we do not really care the layout between different dwarf raw datas (as long as they are at the end of the object files).

Yeah, this is also a good way if we decide we don't need this refactoring.

llvm/lib/MC/XCOFFObjectWriter.cpp
95

If we use CsectGroup instead of a pointer, so we need to allocate a CsectGroup object in stack and then insert the std::move-d stack object to the deque? I am not sure the benefit of allocating the object in stack and in heap? Could you help to explain more? Thanks.

shchenz abandoned this revision.Jun 8 2021, 2:49 AM

This is not needed anymore. Now we use different containers for csect and dwarf sections