Page MenuHomePhabricator

[LTO] Discard non-prevailing defined symbols in module-level assembly
AbandonedPublic

Authored by ychen on Feb 17 2021, 8:36 PM.

Details

Summary

by inline assembly string replacement. The alternative is using a ad-hoc
MCStreamer and AsmParser, but it is more
expensive to do that and the use cases are very rare that makes it not
worthwhile.

The motivation is that after D90108, the use case demonstrated by the
test case begins to fail. I find the error message is useful hence
this patch.

The change in MachineModuleInfo.cpp is to make sure the error code 1 is
returned. Without it, the error message is printed but llvm-lto2 still
returns 0.

Diff Detail

Unit TestsFailed

TimeTest
40 msx64 debian > LLVM.Transforms/PhaseOrdering::reassociate-after-unroll.ll
Script: -- : 'RUN: at line 3'; /mnt/disks/ssd0/agent/llvm-project/build/bin/opt -O2 -enable-new-pm=0 -S < /mnt/disks/ssd0/agent/llvm-project/llvm/test/Transforms/PhaseOrdering/reassociate-after-unroll.ll | /mnt/disks/ssd0/agent/llvm-project/build/bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/llvm/test/Transforms/PhaseOrdering/reassociate-after-unroll.ll --check-prefix=OLDPM
80 msx64 windows > LLVM.Transforms/PhaseOrdering::reassociate-after-unroll.ll
Script: -- : 'RUN: at line 3'; c:\ws\w1\llvm-project\premerge-checks\build\bin\opt.exe -O2 -enable-new-pm=0 -S < C:\ws\w1\llvm-project\premerge-checks\llvm\test\Transforms\PhaseOrdering\reassociate-after-unroll.ll | c:\ws\w1\llvm-project\premerge-checks\build\bin\filecheck.exe C:\ws\w1\llvm-project\premerge-checks\llvm\test\Transforms\PhaseOrdering\reassociate-after-unroll.ll --check-prefix=OLDPM

Event Timeline

ychen created this revision.Feb 17 2021, 8:36 PM
ychen requested review of this revision.Feb 17 2021, 8:36 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 17 2021, 8:36 PM

I don't love the custom pattern matching approach added here - it is hard to follow and seems potentially brittle. If this case is rare, presumably it is rare that we would have a non-empty set of non prevailing asm symbols, so using something heavier weight shouldn't be an issue if that would work and be clearer.

There are no comments right now so it isn't clear to me what this is doing. What is the behavior before and after the patch? Is this patch fixing an issue created by LTO, or attempting to fix bad input?

I don't follow this comment from the description:

I find the error message is useful hence this patch.

What happens for ThinLTO?

llvm/lib/CodeGen/MachineModuleInfo.cpp
173 ↗(On Diff #324512)

This seems like a good change but orthogonal to the issue you are trying to solve - please extract into a different patch along with the associated test changes.

llvm/test/LTO/X86/weak-asm.ll
2

Test case needs some comments.

Also, what is the behavior without this patch?

ychen added a comment.EditedFeb 19 2021, 4:58 PM

Thanks for the comment!

I don't love the custom pattern matching approach added here - it is hard to follow and seems potentially brittle. If this case is rare, presumably it is rare that we would have a non-empty set of non prevailing asm symbols, so using something heavier weight shouldn't be an issue if that would work and be clearer.

Me neither. One benefit though is that only one place in the source code needs inspection if anything goes off the rail and it doesn't *pollute* other parts of the LTO code. The AsmParser & AsmStreamer solution would (1) involve much more change in LTO/ModuleSymbolTable/MCAsmStreamer (need to create derived class from MCAsmStreamer which is not exposed now) (2) there would be inevitable code sharing/refactoring with ModuleSymbolTable. I really don't think it is a good idea even if it does not impact compile time.

After some thought, I think a third approach is possible: emit the list of to-be-discarded symbol names to a module metadata node and discard these symbols during the normal AsmPrinter::emitInlineAsm process. This should have a minimal compile-time impact, easy implementation. A fourth approach would be to demote the error introduced by D90108 to a warning.

Please let me know which approach is preferred.

There are no comments right now so it isn't clear to me what this is doing. What is the behavior before and after the patch? Is this patch fixing an issue created by LTO, or attempting to fix bad input?

Before this patch, the test fails with "foo changed binding to STB_GLOBAL" because, in the combined LTO module, there is conceptually .weak foo; .global foo hence triggers the error diagnose introduced by D90108. This patch discards the .weak foo part of .weak foo; .global foo to avoid the error diagnosis. I think it is a borderline LTO issue in that we don't handle the external symbols in module asm block with existing symbol resolutions. Although most of the time (including the test case in this patch), the behavior happens to be correct. ThinLTO and non-LTO do not suffer this issue because two bindings (.weak and .global) end up in two object files, the error diagnosis does not trigger then.

Thanks for the comment!

I don't love the custom pattern matching approach added here - it is hard to follow and seems potentially brittle. If this case is rare, presumably it is rare that we would have a non-empty set of non prevailing asm symbols, so using something heavier weight shouldn't be an issue if that would work and be clearer.

Me neither. One benefit though is that only one place in the source code needs inspection if anything goes off the rail and it doesn't *pollute* other parts of the LTO code. The AsmParser & AsmStreamer solution would (1) involve much more change in LTO/ModuleSymbolTable/MCAsmStreamer (need to create derived class from MCAsmStreamer which is not exposed now) (2) there would be inevitable code sharing/refactoring with ModuleSymbolTable. I really don't think it is a good idea even if it does not impact compile time.

After some thought, I think a third approach is possible: emit the list of to-be-discarded symbol names to a module metadata node and discard these symbols during the normal AsmPrinter::emitInlineAsm process. This should have a minimal compile-time impact, easy implementation. A fourth approach would be to demote the error introduced by D90108 to a warning.

Please let me know which approach is preferred.

There are no comments right now so it isn't clear to me what this is doing. What is the behavior before and after the patch? Is this patch fixing an issue created by LTO, or attempting to fix bad input?

Before this patch, the test fails with "foo changed binding to STB_GLOBAL" because, in the combined LTO module, there is conceptually .weak foo; .global foo hence triggers the error diagnose introduced by D90108. This patch discards the .weak foo part of .weak foo; .global foo to avoid the error diagnosis. I think it is a borderline LTO issue in that we don't handle the external symbols in module asm block with existing symbol resolutions. Although most of the time (including the test case in this patch), the behavior happens to be correct. ThinLTO and non-LTO do not suffer this issue because two bindings (.weak and .global) end up in two object files, the error diagnosis does not trigger then.

Thanks for explaining the situation better. Ideally LTO or MC would mimic the behavior of the linker, which I think would actually be to take the strong def regardless of linking order. When the IR is read and the IRSymbolTable created, the asm symbols are parsed and added so the linker would resolve them. However, I don't have a good idea how to use that when merging the module inline asm together.

@pcc any suggestions?

Is this still needed after D97449 ?

ychen added a comment.Mar 2 2021, 5:20 PM

Is this still needed after D97449 ?

Yes. This is a LTO issue.

ychen added a comment.Mar 3 2021, 6:47 PM

Hi @pcc , any suggestion to address this issue? Personally, I think regex to drop the non-prevailing def should be easiest. I'm not sure it is a complete solution but should be good for most cases. Thoughts?

MaskRay added a comment.EditedMar 4 2021, 8:48 PM

OK, I think this makes sense and likely fixes https://github.com/ClangBuiltLinux/linux/issues/1269.

A STB_GLOBAL definition can override a STB_WEAK definition. However, if we stream them in the order (.local foo) foo: (foo is not used, see parseAssignmentExpression) .weak foo; foo: ...

The binding will change to STB_WEAK, while we should actually respect the prevailing definition's binding: STB_LOCAL.

If the symbol is used by regular object files, we may emit a .globl and get a changed binding error.

It is unfortunate that we don't very elegant way nuking non-prevailing inline asm definitions.

Please fix the few suggestions.

llvm/lib/LTO/LTO.cpp
733

StringRef

735

llvm::join

745

auto causes an unnecessary std::string copy.

751

This is not needed. Just let MC ignore whitespace.

llvm/test/LTO/X86/weak-asm.ll
7

%t1 + %t2 is sufficient and makes more sense to the linker (there is no duplicate definition error).

You can add another test using %t2 + %t3.

MaskRay added inline comments.Mar 4 2021, 8:49 PM
llvm/test/LTO/X86/weak-asm.ll
38

Nit: this can be llvm.compiler.used

llvm.used on ELF get SHF_GNU_RETAIN now. We don't need that stricter semantics.

ychen updated this revision to Diff 329152.Mar 8 2021, 3:13 PM
  • Address comments
ychen marked 5 inline comments as done.Mar 8 2021, 3:22 PM
ychen added inline comments.
llvm/include/llvm/MC/MCRegisterInfo.h
170

This is to avoid creating a real MCRegisterInfo in getTargetAsmSeparator below.

llvm/lib/LTO/LTO.cpp
733

I could not do PrefixPat + P below with this. PrefixPat.str() + P also works but it seems easier just declaring it string?

ychen marked an inline comment as done.Mar 8 2021, 3:22 PM
ychen added inline comments.
llvm/lib/LTO/LTO.cpp
746

This should be able to handle more than one discarded symbols. Verified fixing https://github.com/ClangBuiltLinux/linux/issues/1269

pcc added a comment.Mar 8 2021, 3:24 PM

This is just solving part of the problem, no? Imagine that you have a weak definition and a global definition in two object files. Then you will have the same problem.

I think that some kind of marker that we can use to mark blocks of asm as conceptually coming from different object files would be best. But as a short term solution it seems best to disable the binding change diagnostic when full LTO is being used.

ychen added a comment.EditedMar 8 2021, 4:00 PM
In D96931#2612521, @pcc wrote:

This is just solving part of the problem, no? Imagine that you have a weak definition and a global definition in two object files. Then you will have the same problem.

I would think the global one is chosen by the linker?

I think that some kind of marker that we can use to mark blocks of asm as conceptually coming from different object files would be best.

Encoded into metadata: the map<inlineasm string offset, vector<nonprevail symbols>> Then in the MC layer, discard the non-prevail symbols.

pcc added a comment.Mar 8 2021, 4:32 PM
In D96931#2612521, @pcc wrote:

This is just solving part of the problem, no? Imagine that you have a weak definition and a global definition in two object files. Then you will have the same problem.

I would think the global one is chosen by the linker?

I thought that the problem was with full LTO where the module inline asm blobs are simply concatenated. So if you have one file with

.global foo
foo:
.byte 1

and another with

.weak foo
foo:
.byte 2

then the assembler will see both foo labels and complain about multiple definitions.

I think that some kind of marker that we can use to mark blocks of asm as conceptually coming from different object files would be best.

Encoded into metadata: the map<inlineasm string offset, vector<nonprevail symbols>> Then in the MC layer, discard the non-prevail symbols.

I mean you would be able to say something like

.new_module
.global foo
foo:
.byte 1

.new_module
.weak foo
foo:
.byte 2

and that would be equivalent to

.global foo
foo:
.byte 1
.byte 2

Similarly if the second file did not provide a definition of foo the assembler would ignore the .weak directive.

ychen added a comment.Mar 8 2021, 4:59 PM
In D96931#2612602, @pcc wrote:
In D96931#2612521, @pcc wrote:

This is just solving part of the problem, no? Imagine that you have a weak definition and a global definition in two object files. Then you will have the same problem.

I would think the global one is chosen by the linker?

I thought that the problem was with full LTO where the module inline asm blobs are simply concatenated. So if you have one file with

.global foo
foo:
.byte 1

and another with

.weak foo
foo:
.byte 2

then the assembler will see both foo labels and complain about multiple definitions.

Not sure if I understand you correctly but I'm pretty sure this patch could address this particular example. The newly added test has a similar test case.

I think that some kind of marker that we can use to mark blocks of asm as conceptually coming from different object files would be best.

Encoded into metadata: the map<inlineasm string offset, vector<nonprevail symbols>> Then in the MC layer, discard the non-prevail symbols.

I mean you would be able to say something like

.new_module
.global foo
foo:
.byte 1

.new_module
.weak foo
foo:
.byte 2

and that would be equivalent to

.global foo
foo:
.byte 1
.byte 2

Similarly if the second file did not provide a definition of foo the assembler would ignore the .weak directive.

I don't quite get the part that .byte 2 is following .byte 1. If we don't use LTO, there should be no .byte 2 in the final executable right?

MaskRay added a subscriber: nickdesaulniers.EditedMar 9 2021, 11:31 AM

The regular LTO precodegen bitcode file %to1.0.5.precodegen.bc is

module asm ".weak foo"    ;;;;;;;;; STB_WEAK definition
module asm "\09 .equ foo,bar"

@llvm.compiler.used = appending global [1 x i8*] [i8* bitcast (i32 (i32)* @bar to i8*)], section "llvm.metadata"

; Function Attrs: norecurse nounwind readnone willreturn
define dso_local i32 @foo(i32 %0) local_unnamed_addr #0 {   ;;;;;;;;;;; STB_GLOBAL definition. Conflict!
  %2 = add nsw i32 %0, 2
  ret i32 %2
}

; Function Attrs: norecurse nounwind readnone willreturn
define internal i32 @bar(i32 %0) #0 {
  %2 = add nsw i32 %0, 1
  ret i32 %2
}

The module-level inline asm is emitted before the functions. The call tree is roughly:

AsmPrinter::doInitialization
  emitInlineAsm (AsmPrinter.cpp:319)
    (void)Parser->Run(/*NoInitialTextSection*/ true, (AsmPrinterInlineAsm.cpp:124)
      ...
        MCELFStreamer::emitSymbolAttribute (set binding to STB_WEAK)
AsmPrinter::emitFunctionBody
  AsmPrinter::emitFunctionHeader
    AsmPrinter::emitLinkage
      MCELFStreamer::emitSymbolAttribute (`error: ... changed binding to STB_GLOBAL`)

A more elegant solution is to let the MCELFStreamer ignore non-prevailing attribute changes (.weak foo) and definitions (.equ foo,bar) during emitInlineAsm.

If that is complex to implement, @pcc having this short-term solution may be useful depending on how urgent @nickdesaulniers wants https://github.com/ClangBuiltLinux/linux/issues/1269 to be fixed.

In https://github.com/ClangBuiltLinux/linux/issues/1269#issuecomment-786963047 , @nickdesaulniers mentioned the binding can change from STB_GLOBAL to STB_WEAK, then STB_WEAK to STB_GLOBAL.
I am curious how that can happen:

  LTO     vmlinux.o
__ia32_compat_sys_sysctl: binding: 1
__ia32_compat_sys_sysctl: binding: 2
__ia32_compat_sys_sysctl: binding: 2
<unknown>:0: error: __ia32_compat_sys_sysctl changed binding to STB_GLOBAL
__ia32_compat_sys_sysctl: binding: 1
LLVM ERROR: Error parsing inline asm

PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
ychen added a comment.EditedMar 10 2021, 7:48 PM

@MaskRay Thanks for the suggestion.

I have an internal build config that needs this fix. I'm happy to take whatever approach that reviewers agree upon. To summarise the proposed solutions:

  • (@pcc) Add assembly directive (.newmodule or just .module), handle the linking in the MC layer.
    • Pros: a general solution that maybe useful in the future, no need to encoding information into IR).
    • Cons: new assembly directive? duplicate linker logic but should be fine since it should be complicated for the use cases.
  • (@MaskRay and my initial proposed alternative) discard non-prevailing symbols in MC.
    • Pros: no new assembly directive
    • Cons: through either IR or LLVM API, the symbol list and each symbol's position need to be passed to MC. Better than doing MCAsmParse in LTO phase but still feels like a lot of work for a specific problem.

I would take @pcc 's suggestion. @MaskRay thoughts?

ormris added a subscriber: ormris.Mar 11 2021, 10:44 AM
ychen abandoned this revision.Thu, Mar 18, 10:58 PM

Landed D98762 instead.