This is an archive of the discontinued LLVM Phabricator instance.

Use local symbols for creating .stack-size
Closed (Public)

Authored by espindola on Mar 19 2018, 7:03 PM.

Details

Summary

Right now, .stack-size ends up with relocations that point to global symbols. If the linker selects a definition of that symbol from another file, the entry then refers to the wrong function.

This is a first step in fixing pr36717.
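
To make the failure mode concrete, here is a minimal Python sketch of a toy link step. Everything in it is a hypothetical illustration rather than LLVM or lld code: the `Entry` record, the `resolve` and `mismatched` helpers, the file names, and the stack sizes are all assumptions. It assumes two object files each define `foo` with a different frame size and that the linker keeps the definition from the first file.

```python
# Toy model: each object file defines `foo` and records its stack size in a
# metadata entry. All names and numbers here are hypothetical.
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    owner: str        # object file that emitted the metadata entry
    target: str       # symbol the relocation refers to
    is_local: bool    # True if the relocation uses a file-local symbol
    stack_size: int


def resolve(entry, chosen):
    """Return the object file whose code the entry ends up describing."""
    if entry.is_local:
        return entry.owner          # local symbols never leave their file
    return chosen[entry.target]     # globals bind to the linker's pick


# The linker keeps the definition of `foo` from a.o and drops the one in b.o.
chosen = {"foo": "a.o"}

global_entries = [Entry("a.o", "foo", False, 16), Entry("b.o", "foo", False, 48)]
local_entries = [Entry("a.o", ".Lfoo", True, 16), Entry("b.o", ".Lfoo", True, 48)]


def mismatched(entries):
    """Entries whose stack size would be attributed to another file's code."""
    return [e for e in entries if resolve(e, chosen) != e.owner]


print(mismatched(global_entries))  # the b.o entry now describes a.o's foo
print(mismatched(local_entries))   # no entry is attributed to the wrong file
```

With global references, the entry emitted by `b.o` silently attaches its size to the copy of `foo` from `a.o`. With local references, each entry can only ever describe code from its own file.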

Event Timeline

espindola created this revision. Mar 19 2018, 7:03 PM

LGTM, as long as this doesn't cause the number of symbols to double up?

LGTM, as long as this doesn't cause the number of symbols to double up?

No, in practice the assembler will convert the relocations to use section symbols instead.
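
As a rough illustration of that conversion, the sketch below rewrites a relocation against a local symbol as a relocation against the containing section's symbol, folding the symbol's offset into the addend. The `Symbol` and `Reloc` types, the `to_section_reloc` helper, and the offset of 16 are hypothetical. A real assembler applies further conditions that this sketch ignores.

```python
# Hypothetical sketch of rewriting "local symbol + addend" as
# "section symbol + (symbol offset + addend)".
from typing import NamedTuple


class Symbol(NamedTuple):
    name: str
    section: str
    offset: int      # offset of the symbol within its section
    is_local: bool


class Reloc(NamedTuple):
    symbol: str      # symbol (or section) the relocation refers to
    addend: int


def to_section_reloc(reloc, symtab):
    """Rewrite a relocation against a local symbol to use its section."""
    sym = symtab[reloc.symbol]
    if not sym.is_local:
        return reloc                     # global references are left alone
    return Reloc(sym.section, sym.offset + reloc.addend)


symtab = {
    ".L_start": Symbol(".L_start", ".text", 16, True),
    "_start": Symbol("_start", ".text", 16, False),
}

print(to_section_reloc(Reloc(".L_start", 0), symtab))
print(to_section_reloc(Reloc("_start", 0), symtab))
```

Once every reference to `.L_start` has been rewritten this way, nothing needs the symbol itself, which is why the symbol count in the object file does not grow.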

This revision is now accepted and ready to land. Mar 26 2018, 1:44 PM

Hi Rafael. The SN-Linker has the following rule for metadata sections:

Relocations from "metadata sections" to global symbols are
treated as discarded if the chosen global was not from the
same object file. This makes sense as the metadata entry
for a function is only applicable to that one particular version.

I have mentioned this rule here:

https://groups.google.com/d/msg/generic-abi/A-1rbP8hFCA/z2xHWFQBCAAJ

It would be an obvious extension to your SHF_LINK_ORDER concept.
If we had this extension in ELF then we could remove this special case
in MC and remove all of these local symbols that are bloating the
assembler output.

What do you think?
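
The metadata-section rule quoted above can be sketched as a simple filter. This is a hypothetical model written for this thread, not actual SN-Linker code: the `MetaReloc` type, the `apply_rule` helper, and the file names are assumptions.

```python
# Hypothetical model of the quoted rule: a relocation from a metadata section
# to a global symbol is kept only if the chosen definition comes from the
# same object file as the metadata section.
from typing import NamedTuple, Optional


class MetaReloc(NamedTuple):
    from_object: str   # object file containing the metadata section
    symbol: str        # global symbol the relocation refers to


def apply_rule(reloc: MetaReloc, chosen: dict) -> Optional[str]:
    """Return the object supplying the target, or None if the reloc is discarded."""
    winner = chosen.get(reloc.symbol)
    return winner if winner == reloc.from_object else None


chosen = {"foo": "a.o"}   # the linker picked a.o's definition of foo
kept = apply_rule(MetaReloc("a.o", "foo"), chosen)
dropped = apply_rule(MetaReloc("b.o", "foo"), chosen)
print(kept, dropped)
```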

Hi Rafael. The SN-Linker has the following rule for metadata sections:

Relocations from "metadata sections" to global symbols are
treated as discarded if the chosen global was not from the
same object file. This makes sense as the metadata entry
for a function is only applicable to that one particular version.

I agree with Cary that it is a bad idea to break sections. What is needed is a linker like lld that is efficient at handling sections.

It would be an obvious extension to your SHF_LINK_ORDER concept.
If we had this extension in ELF then we could remove this special case
in MC and remove all of these local symbols that are bloating the
assembler output.

It is not an obvious extension. The obvious solution is to just use multiple sections. I am sorry I missed this when it went in, but we should never add an extension that requires a section that refers to multiple ones. I will fix llvm to use multiple sections for -ffunction-sections/comdat next.
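
A rough sketch of the multiple-sections idea: each function section gets its own metadata section that names it as its link target, and a metadata section survives only if its target does. The section names, the `Section` type, and the `surviving` helper are hypothetical, and the liveness rule is a simplification of how a linker might treat SHF_LINK_ORDER sections.

```python
# Hypothetical model: one metadata section per function section, dropped
# together with the function section it describes.
from typing import NamedTuple, Optional


class Section(NamedTuple):
    name: str
    obj: str
    link: Optional[str] = None   # name of the section this one describes, if any


def surviving(sections, discarded):
    """Drop discarded sections plus any metadata section linked to one."""
    out = []
    for s in sections:
        key = (s.obj, s.name)
        target = (s.obj, s.link)
        if key in discarded or (s.link is not None and target in discarded):
            continue
        out.append(s)
    return out


sections = [
    Section(".text.foo", "a.o"),
    Section(".stack_sizes.foo", "a.o", link=".text.foo"),
    Section(".text.foo", "b.o"),
    Section(".stack_sizes.foo", "b.o", link=".text.foo"),
]

# Comdat resolution keeps a.o's copy of foo and discards b.o's.
discarded = {("b.o", ".text.foo")}
kept = surviving(sections, discarded)
print([(s.obj, s.name) for s in kept])
```

Because each metadata section describes exactly one function section, no per-relocation special case is needed: the stale entry from `b.o` disappears along with the code it described.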

And there is no local symbol polluting the output. The assembler will convert the reference to use section symbols.

Hi Rafael,

Thanks for the reply. I agree that multiple sections is conceptually a
better approach. Let's keep the discussion to the gabi list as it has
nothing to do with your change here.

I will fix llvm to use multiple sections for -ffunction-sections/comdat

I agree with this and with the rest of your assessment in pr36717.

And there is no local symbol polluting the output. The assembler will convert the reference to use section symbols.

I understand that and I thought your change was good. Apologies,
if "bloating" sounded bad I should have used "extra" instead. What I
meant was - wouldn't the assembly output have doubled up local/global
symbols? My idea is, could we find some way to not have to emit the
additional local symbols in the assembly file and hopefully to
remove the special case code in mc that is needed to emit them?

I understand that and I thought your change was good. Apologies,
if "bloating" sounded bad I should have used "extra" instead. What I
meant was - wouldn't the assembly output have doubled up local/global
symbols? My idea is, could we find some way to not have to emit the
additional local symbols in the assembly file and hopefully to
remove the special case code in mc that is needed to emit them?

Why are extra symbols *in the assembly* a problem? We normally don't print assembly and using local symbols means that we have only one place that has to implement the logic for using section symbols in relocations. I quite like the rule: CodeGen and user written assembly files use local symbols everywhere and the assembler optimizes that when possible.

Having said that, if you really find the assembly verbosity problematic, getFunctionBegin could return a section symbol when the current function is at the start of a section.

Why are extra symbols *in the assembly* a problem? We normally don't print assembly

I suppose I like the assembly to look good :) I think that the first of the
following options reads nicely compared to the others:

    .globl _start
_start:
    ret
    .section ".stack_sizes","",@progbits
    .quad _start
    .uleb128 10

    .globl _start
_start:
.L_start:
    ret
    .section ".stack_sizes","",@progbits
    .quad .L_start
    .uleb128 10

    .globl first
first:
    ret
    .globl _start
_start:
    ret
    .section ".stack_sizes","",@progbits
    .quad .text+10
    .uleb128 10

Using local symbols means that we have only one place that has to implement the logic for using section symbols in relocations. I quite like the rule: CodeGen and user written assembly files use local symbols everywhere and the assembler optimizes that when possible.

This sounds like a sound design choice to me. (Although, I have to say that
as I'm more of a binary tools man than a compiler engineer, I actually hate
it when the assembler doesn't output exactly the object file I expect given
what I typed in.)

Having said that, if you really find the assembly verbosity problematic, getFunctionBegin could return a section symbol when the current function is at the start of a section.

This would certainly remove the extra local symbols. I like the purity of
your design principle, where only the assembler lowers references to section
symbols, though. I'm on the fence here, so unless someone else chimes in I
think leaving MC as-is is fine with me.

joerg added a subscriber: joerg. Mar 27 2018, 12:34 PM

Given that some people like to post-process assembler files, using the section symbol directly is a bad idea. Adding the local symbols is fine.
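
One way to see the post-processing concern: a tool that inserts an instruction shifts every later offset, so a hard-coded section-plus-offset reference goes stale, while a label reference still resolves to the function. The sketch below uses a hypothetical byte-size model of an assembly file. The item encoding, the `label_offsets` helper, and the instruction sizes are assumptions made for illustration.

```python
# Each item is either ("label", name) or ("insn", text, size_in_bytes).
# Sizes are made up for illustration.
def label_offsets(items):
    """Compute the section offset of every label in the item list."""
    offsets, pos = {}, 0
    for item in items:
        if item[0] == "label":
            offsets[item[1]] = pos
        else:
            pos += item[2]
    return offsets


original = [
    ("label", "first"), ("insn", "ret", 1),
    ("label", ".L_start"), ("insn", "ret", 1),
]

# The offset a section-relative reference would bake into the file.
hard_coded = label_offsets(original)[".L_start"]

# A post-processing tool inserts a one-byte nop at the start of `first`.
patched = [original[0], ("insn", "nop", 1)] + original[1:]

# The label moves with the code; the baked-in offset does not.
label_after = label_offsets(patched)[".L_start"]
print(hard_coded, label_after)
```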