This is an archive of the discontinued LLVM Phabricator instance.

[LTO] Add SelectionKind to IRSymtab and use it in ld.lld/LLVMgold
ClosedPublic

Authored by MaskRay on Jul 17 2021, 3:41 PM.

Details

Summary

In PGO, a C++ external linkage function foo has a private counter
__profc_foo and a private __profd_foo in a comdat nodeduplicate.

A __attribute__((weak)) function foo has a weak hidden counter __profc_foo
and a private __profd_foo in a comdat nodeduplicate.

In ld.lld a.o b.o, say a.o defines an external linkage foo and b.o
defines a weak foo. Currently we treat comdat nodeduplicate as comdat any,
ld.lld will incorrectly consider b.o:__profc_foo non-prevailing. In the worst
case b.o:__profd_foo is retained and b.o:__profc_foo isn't there will be
dangling reference causing an undefined hidden symbol error.

Add SelectionKind to Comdat in IRSymtab and let linkers ignore nodeduplicate comdat.

Diff Detail

Event Timeline

MaskRay created this revision.Jul 17 2021, 3:41 PM
MaskRay requested review of this revision.Jul 17 2021, 3:41 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 17 2021, 3:41 PM
MaskRay edited the summary of this revision. (Show Details)Jul 17 2021, 3:42 PM

I'm not sure I understand the comdat noduplicates selection kind. The lang ref says it means "The linker requires that only section with this COMDAT key exist.". Does that mean that we should never see more than one section with that COMDAT key, in which case it seems like we shouldn't have the case you describe (2 noduplicates comdat with the key "__profc_foo"), or does it mean something else?

MaskRay added a comment.EditedJul 19 2021, 9:31 AM

I'm not sure I understand the comdat noduplicates selection kind. The lang ref says it means "The linker requires that only section with this COMDAT key exist.". Does that mean that we should never see more than one section with that COMDAT key, in which case it seems like we shouldn't have the case you describe (2 noduplicates comdat with the key "__profc_foo"), or does it mean something else?

"The linker requires that only section with this COMDAT key exist." The sentence looks unclear.

The semantics are: (1) Within a module, a section can be the member of at most one ELF section group or PE/COFF comdat. (2) if two groups in two modules have the same key, they are not deduplicated.

One comdat noduplicates may have multiple members. These members will be retained or discarded as a unit. The ability is used by instrumentation to keep some related sections as a unit.

The failure scenario happened for a PGO+ThinLTO case, where two modules have comdat noduplicates of the same name. We should not perform deduplication.

I'm not sure I understand the comdat noduplicates selection kind. The lang ref says it means "The linker requires that only section with this COMDAT key exist.". Does that mean that we should never see more than one section with that COMDAT key, in which case it seems like we shouldn't have the case you describe (2 noduplicates comdat with the key "__profc_foo"), or does it mean something else?

"The linker requires that only section with this COMDAT key exist." The sentence looks unclear.

The semantics are: (1) Within a module, a section can be the member of at most one ELF section group or PE/COFF comdat. (2) if two groups in two modules have the same key, they are not deduplicated.

One comdat noduplicates may have multiple members. These members will be retained or discarded as a unit. The ability is used by instrumentation to keep some related sections as a unit.

The failure scenario happened for a PGO+ThinLTO case, where two modules have comdat noduplicates of the same name. We should not perform deduplication.

So it sounds like the meaning of "noduplicates" is "allow duplicates" or maybe "do not de-duplicate" ? Is that right?

I'm not sure I understand the comdat noduplicates selection kind. The lang ref says it means "The linker requires that only section with this COMDAT key exist.". Does that mean that we should never see more than one section with that COMDAT key, in which case it seems like we shouldn't have the case you describe (2 noduplicates comdat with the key "__profc_foo"), or does it mean something else?

"The linker requires that only section with this COMDAT key exist." The sentence looks unclear.

The semantics are: (1) Within a module, a section can be the member of at most one ELF section group or PE/COFF comdat. (2) if two groups in two modules have the same key, they are not deduplicated.

One comdat noduplicates may have multiple members. These members will be retained or discarded as a unit. The ability is used by instrumentation to keep some related sections as a unit.

The failure scenario happened for a PGO+ThinLTO case, where two modules have comdat noduplicates of the same name. We should not perform deduplication.

So it sounds like the meaning of "noduplicates" is "allow duplicates" or maybe "do not de-duplicate" ? Is that right?

Right. Other selection kinds (any is the most common; exactmatch/largest/samesize are PE/COFF specific) use the key for duplication.
For noduplicates, the key is just a key to keep several members in the same group; it is not overloaded with the deduplication semantics.

I'm not sure I understand the comdat noduplicates selection kind. The lang ref says it means "The linker requires that only section with this COMDAT key exist.". Does that mean that we should never see more than one section with that COMDAT key, in which case it seems like we shouldn't have the case you describe (2 noduplicates comdat with the key "__profc_foo"), or does it mean something else?

"The linker requires that only section with this COMDAT key exist." The sentence looks unclear.

Yeah. At a minimum, I think a word is missing (should this be "only *one* section ..." ?). But it still doesn't describe the semantics well.

The semantics are: (1) Within a module, a section can be the member of at most one ELF section group or PE/COFF comdat. (2) if two groups in two modules have the same key, they are not deduplicated.

One comdat noduplicates may have multiple members. These members will be retained or discarded as a unit. The ability is used by instrumentation to keep some related sections as a unit.

The failure scenario happened for a PGO+ThinLTO case, where two modules have comdat noduplicates of the same name. We should not perform deduplication.

So it sounds like the meaning of "noduplicates" is "allow duplicates" or maybe "do not de-duplicate" ? Is that right?

Right. Other selection kinds (any is the most common; exactmatch/largest/samesize are PE/COFF specific) use the key for duplication.
For noduplicates, the key is just a key to keep several members in the same group; it is not overloaded with the deduplication semantics.

The name and description unfortunately don't imply this to me at least. Would you be able to update the langref (in a separate patch) to update the description so it is clearer?

Also, what happens in regular LTO when IR modules with two "noduplicates" comdats with the same key are merged? Is the assumption that the symbols within them would have been deduplicated after symbol resolution, so that we don't end up violating your (1) above?

I'm not sure I understand the comdat noduplicates selection kind. The lang ref says it means "The linker requires that only section with this COMDAT key exist.". Does that mean that we should never see more than one section with that COMDAT key, in which case it seems like we shouldn't have the case you describe (2 noduplicates comdat with the key "__profc_foo"), or does it mean something else?

"The linker requires that only section with this COMDAT key exist." The sentence looks unclear.

Yeah. At a minimum, I think a word is missing (should this be "only *one* section ..." ?). But it still doesn't describe the semantics well.
[...]

The name and description unfortunately don't imply this to me at least. Would you be able to update the langref (in a separate patch) to update the description so it is clearer?

Sounds good! Sent D106300 to clarify noduplicates and misc other things.

Also, what happens in regular LTO when IR modules with two "noduplicates" comdats with the same key are merged? Is the assumption that the symbols within them would have been deduplicated after symbol resolution, so that we don't end up violating your (1) above?

The LTO behavior needs to match the linker. I think LTO needs to retain all members and place all of them into the comdat.

For a case like merging two comdats of different selection kinds, I think the compiler may have to issue an error.

I'm not sure I understand the comdat noduplicates selection kind. The lang ref says it means "The linker requires that only section with this COMDAT key exist.". Does that mean that we should never see more than one section with that COMDAT key, in which case it seems like we shouldn't have the case you describe (2 noduplicates comdat with the key "__profc_foo"), or does it mean something else?

"The linker requires that only section with this COMDAT key exist." The sentence looks unclear.

Yeah. At a minimum, I think a word is missing (should this be "only *one* section ..." ?). But it still doesn't describe the semantics well.
[...]

The name and description unfortunately don't imply this to me at least. Would you be able to update the langref (in a separate patch) to update the description so it is clearer?

Sounds good! Sent D106300 to clarify noduplicates and misc other things.

Also, what happens in regular LTO when IR modules with two "noduplicates" comdats with the same key are merged? Is the assumption that the symbols within them would have been deduplicated after symbol resolution, so that we don't end up violating your (1) above?

The LTO behavior needs to match the linker. I think LTO needs to retain all members and place all of them into the comdat.

I assume that any weak symbols of the same name in those comdats would be deduplicated though? For regular LTO what would the resulting comdat group look like in the merged IR? What if the __profc_foo was weak in both modules being linked?

For a case like merging two comdats of different selection kinds, I think the compiler may have to issue an error.

lld/test/ELF/lto/comdat-noduplicates.ll
18 ↗(On Diff #359587)

I think it would be good to test a few more cases: for non-distributed (no --thinlto-index-only) check the resulting symbols/comdats in the IR dumped by save-temps; test regular LTO as well, since that is presumably affected by this change too.

MaskRay updated this revision to Diff 359915.Jul 19 2021, 2:24 PM

improve test

Thanks for the langref and IR patches, that helps my understanding. Have some suggestions below on new/improved comments needed.

lld/COFF/InputFiles.cpp
1069–1070

Can this FIXME be removed now? Or are there more improvements needed?

lld/test/ELF/lto/comdat-noduplicates.ll
19 ↗(On Diff #359915)

Please add a comment - I'm not completely sure what this value is.

58 ↗(On Diff #359915)

Some comments here and in the ABC case below about what specifically we are checking for in the resulting summaries would be helpful.

llvm/tools/gold/gold-plugin.cpp
627

Comment needed on how this achieves NoDuplicates effect.

MaskRay updated this revision to Diff 360193.Jul 20 2021, 10:22 AM
MaskRay marked 5 inline comments as done.

improve comments

MaskRay added inline comments.Jul 20 2021, 10:24 AM
lld/COFF/InputFiles.cpp
1069–1070

Changed the FIXME wording. I'll keep PE/COFF unchanged in this patch.

tejohnson accepted this revision.Jul 20 2021, 11:49 AM

lgtm with a comment follow on request below.

llvm/tools/gold/gold-plugin.cpp
626–629

I guess what I was looking for was more of a comment connecting how not setting comdat_key ensures we don't deduplicate.

This revision is now accepted and ready to land.Jul 20 2021, 11:49 AM
MaskRay updated this revision to Diff 360245.Jul 20 2021, 1:00 PM
MaskRay marked an inline comment as done.
MaskRay edited the summary of this revision. (Show Details)

clarify a comment.

rename noduplicates to the new name nodeduplicate

This revision was landed with ongoing or failed builds.Jul 20 2021, 1:22 PM
This revision was automatically updated to reflect the committed changes.

I'm seeing test failures on i686 with this patch:

Exit Code: 1

Command Output (stderr):

/usr/bin/ld.gold: error: /builddir/build/BUILD/llvm-13.0.0rc1.src/redhat-linux-build/test/tools/gold/X86/Output/comdat-nodeduplicate.ll.tmp/ab.lto.o: incompatible target

I'm seeing test failures on i686 with this patch:

Exit Code: 1

Command Output (stderr):

/usr/bin/ld.gold: error: /builddir/build/BUILD/llvm-13.0.0rc1.src/redhat-linux-build/test/tools/gold/X86/Output/comdat-nodeduplicate.ll.tmp/ab.lto.o: incompatible target

Cherry picked D107020 into release/13.x

Thanks you.

steplong added inline comments.
llvm/test/tools/gold/X86/comdat-nodeduplicate.ll
40

Hi, I'm seeing a test failure on amd64 with this line. The actual output I'm seeing is
DATA: 0x[[#%x,]] 01000000 00000000 02000000 00000000 ......... I don't know enough about the patch to determine what is going on. I wonder if you have any ideas.

MaskRay added inline comments.Aug 19 2021, 12:35 PM
llvm/test/tools/gold/X86/comdat-nodeduplicate.ll
40

I suspect it is a bug in older gold.

My llvm-readelf -x .data output with binutils 2.35.2 and (built from git) 2.36.50.2021.0520

Hex dump of section '.data':
  0x00401128 01000000 00000000                   ........

What's your version? If it's not too much a hassle, I hope we just accept it and don't do anything. The test will be good when your gold is upgraded..

steplong added inline comments.Aug 19 2021, 3:15 PM
llvm/test/tools/gold/X86/comdat-nodeduplicate.ll
40

We are using

$ ld.gold -version
GNU gold (GNU Binutils for Ubuntu 2.24) 1.11

and yup should be ok to just accept it as it is.