This is an archive of the discontinued LLVM Phabricator instance.

[MC][ELF] Accept abbreviated form with sh_flags and sh_entsize
ClosedPublic

Authored by burnus on Nov 24 2020, 11:58 AM.

Details

Summary

D73999 / commit 75af9da755721123e62b45cd0bc0c5e688a9722a
added for LLVM 11 a check that sh_flags and sh_entsize (and sh_type)
changes are an error, in line with GNU assembler.

However, GNU assembler accepts and GCC generates an abbreviated form:
while the first .section contains the flags and entsize, subsequent
sections simply contain the name without repeating entsize or flags.

Do likewise for better compatibility.

See https://bugs.llvm.org/show_bug.cgi?id=48201

Diff Detail

Unit TestsFailed

Event Timeline

burnus created this revision.Nov 24 2020, 11:58 AM
burnus requested review of this revision.Nov 24 2020, 11:58 AM

Correction: The previous change (D73999) was done for "LLVM 11" - not as I wrote for "LLVM 10".

MaskRay edited the summary of this revision. (Show Details)Nov 24 2020, 12:07 PM

This is known. I'd like it to be an error but it is unfortunate that GCC uses it this way. Can you share a code snippet which can trigger the relevant GCC logic?

/* If we have already declared this section, we can use an
   abbreviated form to switch back to it -- unless this section is
   part of a COMDAT groups, in which case GAS requires the full
   declaration every time.  */
if (!(HAVE_COMDAT_GROUP && (flags & SECTION_LINKONCE))
    && (flags & SECTION_DECLARED))
  { 
    fprintf (asm_out_file, "\t.section\t%s\n", name);
    return;
  }

There is a variant where an empty string literal is used (https://sourceware.org/pipermail/binutils/2020-March/110441.html): I haven't receive a reply yet.
Shall we continue erroring for that?

llvm/test/MC/ELF/section-entsize-not-repeated.s
1 ↗(On Diff #307421)

I'd rename it to section-omitted-attributes.s

llvm/test/MC/ELF/section-flags-changed-2.s
4 ↗(On Diff #307421)

Can you merge these tests into section-entsize-changed.s?

This is known. I'd like it to be an error but it is unfortunate that GCC uses it this way. Can you share a code snippet which can trigger the relevant GCC logic?

It looks as if one way to get it is requires the -ffunction-sections flag, example: https://godbolt.org/z/Yfveoh which creates:

  • .section .text.ULtod,"ax",@progbits
  • .section .text.ULtod

-ffunction-sections/-fdata-sections is described as: Place each function or data item into its own section in the output file if the target supports arbitrary sections. ... Only use these options when there are significant benefits from doing so. When you specify these options, the assembler and linker create larger object and executable files and are also slower.

As there is already a huge growth, avoiding the repetition of the flags/entsize helps a bit. Thus, the GCC and GNU assembler short variant makes kind of sense.

There is a variant where an empty string literal is used (https://sourceware.org/pipermail/binutils/2020-March/110441.html): I haven't receive a reply yet.
Shall we continue erroring for that?

No strong opinion, but I don't see how why one would write such code in real-world code; and if there is no good reason for valid code, giving an error should not harm, should it?

This is known. I'd like it to be an error but it is unfortunate that GCC uses it this way. Can you share a code snippet which can trigger the relevant GCC logic?

It looks as if one way to get it is requires the -ffunction-sections flag, example: https://godbolt.org/z/Yfveoh which creates:

  • .section .text.ULtod,"ax",@progbits
  • .section .text.ULtod

-ffunction-sections/-fdata-sections is described as: Place each function or data item into its own section in the output file if the target supports arbitrary sections. ... Only use these options when there are significant benefits from doing so. When you specify these options, the assembler and linker create larger object and executable files and are also slower.

As there is already a huge growth, avoiding the repetition of the flags/entsize helps a bit. Thus, the GCC and GNU assembler short variant makes kind of sense.

OK, thanks for the example: a jump table causes the second .section .text.foobar directive without attributes. This makes sense.

There is a variant where an empty string literal is used (https://sourceware.org/pipermail/binutils/2020-March/110441.html): I haven't receive a reply yet.
Shall we continue erroring for that?

No strong opinion, but I don't see how why one would write such code in real-world code; and if there is no good reason for valid code, giving an error should not harm, should it?

Right, if we can continue rejecting .section .foo,"" relatively easily, I hope we continue rejecting it (even though GNU as allows it). If a bit difficult, I'll not push too hard on losing the diagnostic.

llvm/test/MC/ELF/section-flags-changed-2.s
9 ↗(On Diff #307421)

Please drop tests which are already tested in section-flags-changed.s

burnus updated this revision to Diff 307520.Nov 24 2020, 11:41 PM

Address review comments.

I also added the example from https://sourceware.org/pipermail/binutils/2020-March/110441.html for which the previous patch gave (as with GNU as) no error:
.section .foo,"",@progbits; .byte 2

I fixed that example by using: (Flags || Size || !TypeName.empty()) &&. However, to handle even more, some boolean flag needs to be set during the parsing, which I didn't do (and seems to be overly complex just for diagnostics).

jhenderson added inline comments.Nov 25 2020, 12:27 AM
llvm/test/MC/ELF/section-entsize-changed.s
14 ↗(On Diff #307520)

Nit: delete extra blank line.

burnus updated this revision to Diff 307556.Nov 25 2020, 2:25 AM
burnus edited the summary of this revision. (Show Details)

Fix review nit – two instead of one empty line inside a test case.

MaskRay added inline comments.Nov 25 2020, 9:04 AM
llvm/test/MC/ELF/section-entsize-changed.s
14 ↗(On Diff #307556)

Unused(should be deleted)

28 ↗(On Diff #307556)

.byte directives are unused (should be deleted)

llvm/test/MC/ELF/section-omitted-attributes.s
4

#

12

No need for multiple pushsection .foo

burnus updated this revision to Diff 307637.Nov 25 2020, 9:35 AM
burnus marked an inline comment as done.

Fix review comments (all in the testcases)

Thanks for the reviews!

burnus marked 4 inline comments as done.Nov 26 2020, 2:48 PM
burnus set the repository for this revision to rG LLVM Github Monorepo.Nov 26 2020, 3:04 PM

Review PING – Is there anything else to be changed?

burnus added a comment.Dec 3 2020, 7:11 AM

REVIEW PING

I think all review comments by @jhenderson and @MaskRay have been fixed last week (thanks for the good suggestions!).
However, a (final) review is still pending ... Thus, if have time, please suggest additional fixes (or approve). Thanks!

MaskRay added inline comments.Dec 3 2020, 8:46 AM
llvm/test/MC/ELF/section-entsize-changed.s
14 ↗(On Diff #307637)

The labels are unused. Please remove them.

18 ↗(On Diff #307637)

Why does the additional test duplicate the testing of changed flags which has been covered by section-flags-changed.s?

llvm/test/MC/ELF/section-omitted-attributes.s
14

.bar just duplicates .foo , please drop .bar

And also foo

burnus updated this revision to Diff 309504.Dec 4 2020, 3:16 AM

@MaskRay: Thanks for the review. I hope it now makes sense.

As only '.section .foobar,"",@progbits`' is kind of a new check ("" was not tested before), I kept it but added it to section-flags-changed.s instead of section-entsize-changed.s. In section-omitted-attributes.s, I removed the only slightly different variant.

MaskRay accepted this revision.Dec 4 2020, 9:31 AM

Please wait for @jhenderson

llvm/test/MC/ELF/section-omitted-attributes.s
1

Please add a file-level comment about the purpose of the test:

If section flags and other attributes are omitted, don't error.

This revision is now accepted and ready to land.Dec 4 2020, 9:31 AM
jhenderson added inline comments.Dec 7 2020, 1:37 AM
llvm/test/MC/ELF/section-omitted-attributes.s
3

Is it worth adding something like this after this line:

CHECK-NOT: .section
CHECK-NOT: .pushsection

? I don't know if it's actually what happens, but it wouldn't make sense to repeat the section directives if they aren't needed.

burnus added inline comments.Dec 7 2020, 2:06 AM
llvm/test/MC/ELF/section-omitted-attributes.s
3

Is it worth adding something like this after this line:

CHECK-NOT: .section
CHECK-NOT: .pushsection

? I don't know if it's actually what happens, but it wouldn't make sense to repeat the section directives if they aren't needed.

What happens with llvm-mc -triple=x86_64 -o - is (full output without empty lines):

.text
.section        .foo,"aM",@progbits,1

Thus, all .foo sections are merged and the originally specified flag/entsize is used.

We could add the CHECK-NOT but my feeling is that it is needed.

In the real-world code, the '.section' are repeated (with GCC) with -ffunction-section as the example https://godbolt.org/ shows – as described above in my previous comment ( https://reviews.llvm.org/D92052#2414890 ).

To quote @MaskRay: "a jump table causes the second .section .text.foobar directive without attributes. This makes sense."

@jhenderson – did I answer your questions - or did I misread them? Are the other issues?

jhenderson accepted this revision.Dec 11 2020, 6:15 AM

@jhenderson – did I answer your questions - or did I misread them? Are the other issues?

Apologies, I somehow missed this. LGTM once @MaskRay's outstanding comment has been addressed.

llvm/test/MC/ELF/section-omitted-attributes.s
1

It looks like @MaskRay's comment hasn't been addressed here?

burnus updated this revision to Diff 311236.Dec 11 2020, 8:36 AM

Thanks for the reviews! Only change: added proposed comment string to llvm/test/MC/ELF/section-omitted-attributes.s regarding what the patch does.

This revision was landed with ongoing or failed builds.Dec 11 2020, 8:46 AM
This revision was automatically updated to reflect the committed changes.
abidh added a subscriber: abidh.Dec 11 2020, 8:49 AM

Committed in 1deff4009e0a on request of @burnus.

burnus marked 4 inline comments as done.Dec 11 2020, 9:07 AM
burnus added a subscriber: tstellar.

As it is a regression and affects real-world code (https://bugs.llvm.org/show_bug.cgi?id=48201),
I like to backport it to LLVM 11. – Any reasons not to do so?

@tstellar Can this still go into LLVM 11.0.1? It is a very small fix for a regression and I see that https://bugs.llvm.org/show_bug.cgi?id=release-11.0.1
still lists 18 open bugs.

As it is a regression and affects real-world code (https://bugs.llvm.org/show_bug.cgi?id=48201),
I like to backport it to LLVM 11. – Any reasons not to do so?

@tstellar Can this still go into LLVM 11.0.1? It is a very small fix for a regression and I see that https://bugs.llvm.org/show_bug.cgi?id=release-11.0.1
still lists 18 open bugs.

You may also want to try pinging him on the bug itself.

@jhenderson What's your opinion on backporting this?

@jhenderson What's your opinion on backporting this?

I think it's reasonably safe, because the change just means fewer errors reported by the assembler, and it's a fix for effectively a regression, so we're just going back to older behaviour. However, I'm not the most up-to-speed with the surrounding code, so @MaskRay may be a better person to give their opinion.

@jhenderson What's your opinion on backporting this?

I think it's reasonably safe, because the change just means fewer errors reported by the assembler, and it's a fix for effectively a regression, so we're just going back to older behaviour. However, I'm not the most up-to-speed with the surrounding code, so @MaskRay may be a better person to give their opinion.

@tstellar This makes an integrated assembler error less strict, so it is safe.

This is about assembling GCC's -S output with LLVM's integrated assembler, which isn't an area that I think people frequently use, so I'd also hear from @burnus why this is urgent.

This is about assembling GCC's -S output with LLVM's integrated assembler, which isn't an area that I think people frequently use, so I'd also hear from @burnus why this is urgent.

GCC's AMD GCN (alias Radeon) target (most useful for offloading) uses LLVM's assembler as GNU Binutils does not support GCN (yet?).

Many Linux distributions build GCC with offloading support for nvptx and amd gcn (installable was optional add-on packages).

The issue was found as building GCC was breaking on Debian once LLVM 11.0.0 was imported into Debian unstable. (Happens during the build of newlib's libc for GCN; this uses -ffunction-section). First reported at https://gcc.gnu.org/PR97827

This is about assembling GCC's -S output with LLVM's integrated assembler, which isn't an area that I think people frequently use, so I'd also hear from @burnus why this is urgent.

GCC's AMD GCN (alias Radeon) target (most useful for offloading) uses LLVM's assembler as GNU Binutils does not support GCN (yet?).

Many Linux distributions build GCC with offloading support for nvptx and amd gcn (installable was optional add-on packages).

The issue was found as building GCC was breaking on Debian once LLVM 11.0.0 was imported into Debian unstable. (Happens during the build of newlib's libc for GCN; this uses -ffunction-section). First reported at https://gcc.gnu.org/PR97827

Have any other targets in gcc besides amdgcn run into this problem? If this is something specific to amdgcn, then I'm not sure we should be adding a workaround for it to LLVM.

Have any other targets in gcc besides amdgcn run into this problem? If this is something specific to amdgcn, then I'm not sure we should be adding a workaround for it to LLVM.

The issue is general – but for AMD GCN it shows up first as this one always uses LLVM's linker.

It is also not really a "workaround" for a bug. GCC on purpose does not repeat the flags for .section in order to reduce the huge assembler file size one gets with -ffunction-sections.

MaskRay added a comment.EditedDec 14 2020, 10:19 AM

@burnus Looks like you did not mention https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97827 in the initial review. From the linked thread
https://gcc.gnu.org/pipermail/gcc-patches/2020-November/559572.html

Richard Sandiford's comment makes a lot of sense to me: "(a) doing something that works “everywhere” unconditionally (and keeping things simple) vs. (b) having both code that takes a shortcut and code that doesn't take a shortcut and trying to predict which one we should do."

I am now feeling a bit bad that reviewers did not receive full discussions before making a decision to accept the fix or workaround (depending on how the GCC community reacts). If they do want to fix 11, we might not need this patch or we could say this was for GCC before 11, making a better description.

@burnus Looks like you did not mention https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97827 in the initial review.

The link is in https://bugs.llvm.org/show_bug.cgi?id=48201 from the very beginning, and that bug was linked in here.

@burnus Looks like you did not mention https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97827 in the initial review.

The link is in https://bugs.llvm.org/show_bug.cgi?id=48201 from the very beginning, and that bug was linked in here.

OK my bad (I don't regularly see "See Also"... ) I've sent https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97827#c9 . I'll follow up with GCC folks there. I don't subscribe gcc-patches so it will be a bit inconvenient for me if most people comment on the mailing list instead...

OK my bad (I don't regularly see "See Also"... ) I've sent https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97827#c9 . I'll follow up with GCC folks there. I don't subscribe gcc-patches so it will be a bit inconvenient for me if most people comment on the mailing list instead...

There is nothing GNU as specific on it, all assemblers I'm aware of behave like that.
See e.g. https://docs.huihoo.com/solaris/2.4/801-6649/801-6649.pdf
"If section_name is a non-reserved section, attributes must be included the first time it is specified by the.section directive."
you can search for others. It isn't solely about what GCC emits either, there is a lot of hand-written assembler code in the wild and that also uses the documented .section directive behavior.
I'm strongly against changing what GCC emits here.

OK my bad (I don't regularly see "See Also"... ) I've sent https://gcc.gnu.org/bugzilla/show_bug.cgi?id=97827#c9 . I'll follow up with GCC folks there. I don't subscribe gcc-patches so it will be a bit inconvenient for me if most people comment on the mailing list instead...

There is nothing GNU as specific on it, all assemblers I'm aware of behave like that.
See e.g. https://docs.huihoo.com/solaris/2.4/801-6649/801-6649.pdf
"If section_name is a non-reserved section, attributes must be included the first time it is specified by the.section directive."
you can search for others. It isn't solely about what GCC emits either, there is a lot of hand-written assembler code in the wild and that also uses the documented .section directive behavior.
I'm strongly against changing what GCC emits here.

Thanks for chiming in :) OK, I am happy to keep this logic.

@tstellar It should be a good candidate for backporting.