This is an archive of the discontinued LLVM Phabricator instance.

[CodeGen] Add text section prefix for COFF object file
ClosedPublic

Authored by TaoPan on Nov 24 2020, 11:01 PM.

Details

Summary

Text section prefix is created in CodeGenPrepare, it's file format independent implementation, text section name is written into object file in TargetLoweringObjectFile, it's file format dependent implementation, port code of adding text section prefix to text section name from ELF to COFF.
Different with ELF that use '.' as concatenation character, COFF use '$' as concatenation character. That is, concatenation character is variable, so split concatenation character from text section prefix.
Text section prefix is existing feature of ELF, it can help to reduce icache and itlb misses, it's also make possible aggregate other compilers e.g. v8 created same prefix sections. Furthermore, the recent feature Machine Function Splitter (basic block level text prefix section) is based on text section prefix.

Diff Detail

Event Timeline

TaoPan created this revision.Nov 24 2020, 11:01 PM
Herald added a project: Restricted Project. · View Herald TranscriptNov 24 2020, 11:01 PM
TaoPan requested review of this revision.Nov 24 2020, 11:01 PM

Can you add more detail for the background on summary?

llvm/test/CodeGen/X86/text-section-prefix.ll
2

What's your checking here?

rnk added inline comments.Nov 25 2020, 11:25 AM
llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
623

This is out of scope for the patch, but why is it called a "section prefix" if it goes after the section name? ".text.hot._ZfooEv"

llvm/test/CodeGen/X86/text-section-prefix.ll
2

Please run this test in a few configurations. You can add two RUN lines like:

; RUN: llc -mtriple x86_64-linux-gnu %s -o - | FileCheck %s --check-prefix=ELF
; RUN: llc -mtriple x86_64-linux-gnu -unique-section-names=0 %s -o - | FileCheck %s --check-prefix=ELF-NOUNIQ
; RUN: llc -mtriple x86_64-windows-msvc %s -o - | FileCheck %s --check-prefix=MSVC
; RUN: llc -mtriple x86_64-windows-msvc %s -o - | FileCheck %s --check-prefix=MINGW

Add check lines for the section directives.

MaskRay added inline comments.
llvm/test/CodeGen/X86/text-section-prefix.ll
4

You may want some RUN configurations in Transforms/CodeGenPrepare/X86/section.ll

TaoPan edited the summary of this revision. (Show Details)Nov 25 2020, 6:16 PM
TaoPan marked an inline comment as done.Nov 25 2020, 6:39 PM
TaoPan added inline comments.
llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp
623

Without this modification, ".texthot._ZfooEv", no '.' between .text and hot, as the '.' was removed in CodeGenPrepare.cpp
Please have a look at I just added Summary, key reason is ELF use '.', COFF use '$'.

llvm/test/CodeGen/X86/text-section-prefix.ll
2

I added some description in Summary.

The line of output

.section     .text,xxx

will be changed to

.section     .text**$hot**,xxx

or

.section     .text**$unlikely**,xxx

with this patch on Windows.

2

Thanks for your nice guidance!
I'll try your RUN lines.

4

Thanks! I'll refer to this file.

pengfei added inline comments.Nov 25 2020, 6:52 PM
llvm/test/CodeGen/X86/text-section-prefix.ll
2

I see. Then you just need to add RUNs as Reid suggested and checks like what in section.ll

TaoPan added inline comments.Nov 25 2020, 7:01 PM
llvm/test/CodeGen/X86/text-section-prefix.ll
2

ok, I'll add RUNs and checks as Reid and MaskRay suggested, thanks!

TaoPan updated this revision to Diff 307950.Nov 26 2020, 7:12 PM

Fix lit failures and update test case for checking text section prefix.

pengfei added inline comments.Nov 26 2020, 9:19 PM
llvm/test/CodeGen/X86/text-section-prefix.ll
2

Maybe better to put the test in Transforms/CodeGenPrepare/X86

10

You don't need to distinguish HOT and UNLIKLY since they are in different functions. You just need SECTION-ELF and SECTION-COFF.
You may also need to add

; RUN: llc -mtriple x86_64-linux-gnu -unique-section-names=0 %s -o - | FileCheck %s --check-prefix=ELF-NOUNIQ
; RUN: llc -mtriple x86_64-windows-mingw64 %s -o - | FileCheck %s --check-prefix=MINGW

as Reid suggested.

TaoPan updated this revision to Diff 307958.Nov 26 2020, 10:24 PM

Fix testcase, remove HOT and UNLIKELY, add ELF-NOUNIQ and MINGW

TaoPan added inline comments.Nov 26 2020, 10:27 PM
llvm/test/CodeGen/X86/text-section-prefix.ll
10

Thanks for your review comments! I removed HOT and UNLIKELY, and added ELF-NOUNIQ and MINGW, could you please have a look?

pengfei accepted this revision.Nov 26 2020, 10:34 PM

LGTM. You'd better wait few days to see if others object. Thanks.

llvm/test/CodeGen/X86/text-section-prefix.ll
10

Nit: It seems no difference between MSVC and MINGW, you can use COFF for both to reduce the duplicated check.

This revision is now accepted and ready to land.Nov 26 2020, 10:34 PM
TaoPan updated this revision to Diff 307959.Nov 26 2020, 10:41 PM

Fix testcase: combine MSVC and MINGW into COFF

TaoPan added inline comments.Nov 26 2020, 10:57 PM
llvm/test/CodeGen/X86/text-section-prefix.ll
10

Thanks! It's good idea, I combined MSVC and MINGW into COFF.

TaoPan updated this revision to Diff 308244.Nov 29 2020, 5:01 PM

git rebase

rnk added inline comments.Nov 30 2020, 2:46 PM
llvm/test/CodeGen/X86/text-section-prefix.ll
10

They will differ if you use a linkonce_odr comdat function. These are very common in C++, it is worth testing this case. Try:

$foocomdat = comdat any
define linkonce_odr dso_local i32 @foocomdat(i32 %0, i32 %1) comdat {
  %3 = add nsw i32 %1, %0
  ret i32 %3
}

The sections for foocomdat will have a suffix on mingw.

rnk added inline comments.Nov 30 2020, 4:24 PM
llvm/test/CodeGen/X86/text-section-prefix.ll
10

Perhaps use weak_odr linkage instead, so that it is non-discardable and is not dropped by DCE.

TaoPan updated this revision to Diff 308535.Nov 30 2020, 9:39 PM

Remove redundant blank line

TaoPan added inline comments.Nov 30 2020, 9:51 PM
llvm/test/CodeGen/X86/text-section-prefix.ll
10

Thanks for your review comment!
I tried you suggested case, and replaced linkonce_odr with weak_odr, I didn't find difference between x86_64-windows-msvc and x86_64-windows-mingw64.
Is the difference related to text section name? I think this patch doesn't change anything except text section name.

rnk added inline comments.Dec 1 2020, 8:20 AM
llvm/test/CodeGen/X86/text-section-prefix.ll
10

Yes, there should be a difference. Consider:

$ cat t.cpp 
inline int myadd(int x, int y) { return x + y; }
int main() { return myadd(1, 2); }

$ clang -S -O0 --target=x86_64-windows-msvc t.cpp  -o - | grep .text
        .text
        .section        .text,"xr",discard,"?myadd@@YAHHH@Z"

$ clang -S -O0 --target=x86_64-windows-gnu t.cpp  -o - | grep .text
        .text
        .section        .text$_Z5myaddii,"xr",discard,_Z5myaddii

Note how in the gnu environment LLVM appends the symbol name to the section name. Did you add the comdat as well?

TaoPan updated this revision to Diff 308824.Dec 1 2020, 5:33 PM

Change x86_64-windows-mingw64 to x86_64-windows-gnu for mingw format text section name

TaoPan added inline comments.Dec 1 2020, 5:41 PM
llvm/test/CodeGen/X86/text-section-prefix.ll
10

Thanks for your notice of windows gnu environment and example!
I can make the difference after changing target to x86_64-windows-gnu (referred to your example).
Could you please have a look again?

TaoPan added inline comments.Dec 1 2020, 5:54 PM
llvm/test/CodeGen/X86/text-section-prefix.ll
2

Sorry! I didn't note and reply this comment previously.
The reason I didn't put the test in Transforms/CodeGenPrepare/X86 is the key modification is in TargetLoweringObjectFile of CodeGen, not in CodeGenPrepare.

rnk accepted this revision.Dec 2 2020, 7:17 AM

I think this is an improvement, so let's go ahead and land it, but I'm not sure this will really work for mingw.

One last think I will note is that COFF linkers will sort input sections by the part of the section name after the dollar sign. There isn't any special treatment of .text$hot / .text$unlikely. They are sorted early and late by their ASCII ordering. That seems to achieve the right behavior, but maybe PGO should use more intentional section prefixes to get the desired ordering.

llvm/test/CodeGen/X86/text-section-prefix.ll
20

@mstorsjo This seems a bit more complicated than I originally thought it would be.

llvm/test/Transforms/CodeGenPrepare/X86/section.ll
47

The section name is .text.unlikely., with a trailing dot. I guess that is the existing behavior, but it seems odd to me. Nevermind.

mstorsjo added inline comments.Dec 2 2020, 1:23 PM
llvm/test/CodeGen/X86/text-section-prefix.ll
20

Yeah, but I guess this is what we end up with, if we append things this way. I wonder whether this works properly when interacting with GNU ld?

The patch description has very little context about the what/where/why - so this is something that ends up invoked only if building with PGO?

TaoPan edited the summary of this revision. (Show Details)Dec 3 2020, 5:27 PM
TaoPan added inline comments.Dec 3 2020, 5:59 PM
llvm/test/CodeGen/X86/text-section-prefix.ll
20

Thanks for your review comments!
COFF support "$" character as a special interpretation in section names for grouped sections. If GNU ld support COFF, it should recognize "$" and works properly.
I added some description to the end of Summary. Yes, it will be invoked only if building with PGO, but the benefit is not limited to PGO, it's also make possible aggregate other compilers e.g. v8 created same prefix sections.

TaoPan added inline comments.Dec 3 2020, 10:30 PM
llvm/test/Transforms/CodeGenPrepare/X86/section.ll
47

Yes, the existing code
https://github.com/llvm/llvm-project/blob/master/llvm/lib/CodeGen/TargetLoweringObjectFileImpl.cpp#L633

} else if (HasPrefix)
  Name.push_back('.');

leads to the trailing dot.

mstorsjo added inline comments.Dec 3 2020, 11:26 PM
llvm/test/CodeGen/X86/text-section-prefix.ll
20

GNU ld does support COFF, but it has its own interpretation of some bits regarding to COMDAT sections, deviating a bit from the spec.

I'll try to test this usecase, before and after this patch, with both lld and GNU ld, to see if there are issues - I hope you're ok with waiting until I've had a change to try it.

TaoPan added inline comments.Dec 3 2020, 11:39 PM
llvm/test/CodeGen/X86/text-section-prefix.ll
20

Sure, I'm ok with waiting for your test, and thanks for your test!

mstorsjo added inline comments.Dec 6 2020, 1:56 PM
llvm/test/CodeGen/X86/text-section-prefix.ll
20

Ok, so I've tested this in a few mingw configurations, and it does seem to work, with a few caveats:

If a comdat symbol is marked "unlikely" in one translation unit but not in another one, linking fails with GNU ld (but lld still succeeds). So for GNU ld, it seems like the section suffix ($unlikely$comdatname) needs to be identical in all occurrances of the symbol/section for the comdat deduplication to work. In lld it works as intended via the leader symbol.

For comdats with associated sections, like .text$unlikely$comdatname associated with .xdata$unlikey$comdatname, associativity seems to work fine as long as the suffix, including $unlikely is identical between .text, .pdata and .xdata - which it turns out to be. But maybe it'd be good to have a test explicitly check that it really is the case?

TaoPan updated this revision to Diff 309803.Dec 6 2020, 7:20 PM

Check section order of .text$hot and .text$unlikely

TaoPan added inline comments.Dec 6 2020, 7:25 PM
llvm/test/CodeGen/X86/text-section-prefix.ll
20

I added test case to llvm/test/CodeGen/X86/mingw-comdats.ll for checking the order of .text$hot$comdatname and .text$unlikely$comdatname with existing sections, could you please have a look?

mstorsjo added inline comments.Dec 6 2020, 11:25 PM
llvm/test/CodeGen/X86/mingw-comdats.ll
88 ↗(On Diff #309803)

I'm not quite sure what this test actually is supposed to test; ordering of sections according to the suffixes doesn't happen at the object file stage, that only happens while linking.

What would be useful, though, would be to add a new function similar to the _Z3fooi one above, with e.g. a unlikely attribute, and check that all of .text/.xdata/.pdata get similar suffixes just like the checks right above on lines 80-83.

TaoPan updated this revision to Diff 309818.Dec 7 2020, 12:23 AM

Update mingw-comdats.ll to check .text$unlikely$comdatname,
.xdata$unlikely$comdatname and .pdata$unlikely$comdatname

Update mingw-comdats.ll to check .text$unlikely$comdatname,
.xdata$unlikely$comdatname and .pdata$unlikely$comdatname

Thanks, this aspect looks good to me now, so no more objections from me.

TaoPan added inline comments.Dec 7 2020, 12:30 AM
llvm/test/CodeGen/X86/mingw-comdats.ll
88 ↗(On Diff #309803)

Thanks for your detail guidance!