This is an archive of the discontinued LLVM Phabricator instance.

[lld] Fix handling of output section selection for unmerged mergeable inputs and relocatable output
AbandonedPublic

Authored by gbreynoo on Mar 1 2018, 7:11 AM.

Details

Summary

Usually mergeable sections, those with the flag SHF_MERGE, can be combined into synthetic sections; however use of the optimisation flag "-O0" does not allow this merging. When "-O0" is called with relocatable object output "-r", these non synthetic sections are used to create output sections. This allows the incorrect output of multiple sections with same name.

The change below ensures only synthetic sections are used to create output sections in this way.

Diff Detail

Event Timeline

gbreynoo created this revision.Mar 1 2018, 7:11 AM
ruiu added a comment.Mar 1 2018, 10:29 AM

As Rafael said, it is not an error or anything to have multiple sections with the same section name in one ELF file. Actually that's pretty common. For example, if you pass -fno-unique-section-names to clang, clang creates multiple ".text" sections. What is a motivation to make this change?

We found that there were problems with DWARF output when using the flags above, consumers were either creating incorrect output or crashing. These consumers where relying on there only being one .debug_str section. The DWARF specification has an attribute that is an offset into the .debug_str section (e.g. DW_form_strp), only one of such section is expected.

Use of -fno-unique-section-names with Clang makes the output of multiple sections with the same name explicit, however this behaviour was not expected with the flags used above.

gbreynoo updated this revision to Diff 137029.Mar 5 2018, 9:49 AM

In response to Rafael on the mailing list:

You are correct that the output looks to be valid, however as DWARF consumers are expecting one ".debug_str" section they can function incorrectly. For example llvm-dwarfdump appears to use any ".debug_str" offsets with the first ".debug_str" section it finds.

A test case with llvm-dwarfdump, the name attributes in one CU will be incorrect:

main.cpp:

int other_value();

int value (){
    return other_value();
}
int main (){
    return value ();
}

other.cpp:

int other_value(){
    return 0;
}

The DWARF Specification appears to implicitly say there is only one ".debug_str" section, meaning the current lld output is not valid.

Also I have updated the diff so this change in behaviour only effects ".debug_str".

I still think you found a bug in dwarfdump/llvm-dwarfdump.

But we do merge non SHF_MERGE sections and the intention of -O0 is to treat SHF_MERGE sections as regular sections, so it seems we got something wrong somewhere.

Making this specific to .debug_str makes it worse as it is a more special case. I am taking a look at what else can we do about it.

gbreynoo abandoned this revision.Sep 25 2019, 5:39 AM

These lines were dropped by D67504:

-  // When control reaches here, mergeable sections have already been merged into
-  // synthetic sections. For relocatable case we want to create one output
-  // section per syntetic section so that they have a valid sh_entsize.
-  if (config->relocatable && (isec->flags & SHF_MERGE))
-    return createSection(isec, outsecName);
-

There is one .rodata.1 and one .debug_str in the -r output. It addresses @gbreynoo's problem. It drops the potential optimization of .rodata.1 with different entry sizes because the two .rodata.1 are now merged. I have explained in D67504 that this is not too bad.