This is an archive of the discontinued LLVM Phabricator instance.

[MCA] llvm-mca MCTargetStreamer segfault fix
ClosedPublic

Authored by holland11 on May 18 2021, 12:06 PM.

Details

Summary

EDIT: This submission required a minor tweak that Andrea made and can be seen in his comment at the bottom of the page.

In order to create the code regions for llvm-mca to analyze, llvm-mca creates an AsmCodeRegionGenerator and calls AsmCodeRegionGenerator::parseCodeRegions(). Within this function, both an MCAsmParser and MCTargetAsmParser are created so that MCAsmParser::Run() can be used to create the code regions for us.

These parser classes were created for llvm-mc so they are designed to emit code with an MCStreamer and MCTargetStreamer that are expected to be setup and passed into the MCAsmParser constructor. Because llvm-mca doesn’t want to emit any code, an MCStreamerWrapper class gets created instead and passed into the MCAsmParser constructor. This wrapper inherits from MCStreamer and overrides many of the emit methods to just do nothing. The exception is the emitInstruction() method which calls Regions.addInstruction(Inst).

This works well and allows llvm-mca to utilize llvm-mc’s MCAsmParser to build our code regions, however there are a few directives which rely on the MCTargetStreamer. llvm-mc assumes that the MCStreamer that gets passed into the MCAsmParser’s constructor has a valid pointer to an MCTargetStreamer. Because llvm-mca doesn’t setup an MCTargetStreamer, when the parser encounters one of those directives, a segfault will occur.

In x86, each one of these 7 directives will cause this segfault if they exist in the input assembly to llvm-mca:

.cv_fpo_proc
.cv_fpo_setframe
.cv_fpo_pushreg
.cv_fpo_stackalloc
.cv_fpo_stackalign
.cv_fpo_endprologue
.cv_fpo_endproc

I haven’t looked at other targets, but I wouldn’t be surprised if some of the other ones also have certain directives which could result in this same segfault.

My proposed solution is to simply initialize an MCTargetStreamer after we initialize the MCStreamerWrapper. The MCTargetStreamer requires an ostream object, but we don’t actually want any of these directives to be emitted anywhere, so I use an ostream created with the nulls() function. Since this needs to happen after the MCStreamerWrapper has been initialized, it needs to happen within the AsmCodeRegionGenerator::parseCodeRegions() function. The MCTargetStreamer also needs an MCInstPrinter which is easiest to initialize within the main() function of llvm-mca. So this MCInstPrinter gets constructed within main() then passed into the parseCodeRegions() function as a parameter. (If you feel like it would be appropriate and possible to create the MCInstPrinter within the parseCodeRegions() function, then feel free to modify my solution. That would stop us from having to pass it into the function and would limit its scope / lifetime.)

My solution stops the segfault from happening and still passes all of the current (expected) llvm-mca tests. I also added a new test for x86 that checks for this segfault on an input that includes one of the .cv_fpo directives (this test fails without my solution, but passes with it).

As far as I can tell, all of the functions that I modified are only called from within llvm-mca so there shouldn’t be any worries about breaking other tools.

Assembly input that causes the segfault (relevant directive on line 5):

    .section    __TEXT,__text,regular,pure_instructions
    .build_version macos, 11, 0    sdk_version 11, 3
    .globl    __Z3funi                ## -- Begin function _Z3funi
    .p2align    4, 0x90
    .cv_fpo_pushreg ebx
__Z3funi:                               ## @_Z3funi
    .cfi_startproc
## %bb.0:
    pushq    %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset %rbp, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register %rbp
    movl    %edi, -4(%rbp)
    movl    -4(%rbp), %eax
    shll    $1, %eax
    popq    %rbp
    retq
    .cfi_endproc
                                        ## -- End function
    .globl    _main                   ## -- Begin function main
    .p2align    4, 0x90
_main:                                  ## @main
    .cfi_startproc
## %bb.0:
    pushq    %rbp
    .cfi_def_cfa_offset 16
    .cfi_offset %rbp, -16
    movq    %rsp, %rbp
    .cfi_def_cfa_register %rbp
    subq    $32, %rsp
    movl    $0, -4(%rbp)
    movl    $5, -8(%rbp)
    movl    $3, -12(%rbp)
    movl    -8(%rbp), %eax
    imull    -12(%rbp), %eax
    movl    %eax, -16(%rbp)
    movl    -16(%rbp), %edi
    callq    __Z3funi
    movl    %eax, -20(%rbp)
    movl    -20(%rbp), %eax
    addq    $32, %rsp
    popq    %rbp
    retq
    .cfi_endproc
                                        ## -- End function
.subsections_via_symbols

Output when attempting to run this file through llvm-mca without the fix:

patrick@MacBook-Pro bin % pwd
/Users/patrick/Desktop/llvm/open_llvm/llvm-project/build/release/bin
patrick@MacBook-Pro bin % ./llvm-mca /Users/patrick/Desktop/files/temp_x86_crash.s



/Users/patrick/Desktop/files/temp_x86_crash.s:2:2: warning: .build_version macos used while targeting darwin20.4.0
        .build_version macos, 11, 0     sdk_version 11, 3
        ^
Assertion failed: (getParser().getStreamer().getTargetStreamer() && "do not have a target streamer"), function getTargetStreamer, file /Users/patrick/Desktop/llvm/open_llvm/llvm-project/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp, line 117.
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Stack dump:
0.    Program arguments: ./llvm-mca /Users/patrick/Desktop/files/temp_x86_crash.s
Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it):
0  llvm-mca                 0x0000000103eac5f7 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 39
1  llvm-mca                 0x0000000103eab528 llvm::sys::RunSignalHandlers() + 248
2  llvm-mca                 0x0000000103eacfd0 SignalHandler(int) + 272
3  libsystem_platform.dylib 0x00007fff20511d7d _sigtramp + 29
4  libsystem_platform.dylib 0x00007fb15e00b701 _sigtramp + 18446743739737020833
5  libsystem_c.dylib        0x00007fff20421411 abort + 120
6  libsystem_c.dylib        0x00007fff204207e8 err + 0
7  llvm-mca                 0x0000000103fe2541 (anonymous namespace)::X86AsmParser::ParseDirective(llvm::AsmToken) (.cold.13) + 33
8  llvm-mca                 0x0000000103b7d426 (anonymous namespace)::X86AsmParser::ParseDirective(llvm::AsmToken) + 5238
9  llvm-mca                 0x0000000103e15a69 (anonymous namespace)::AsmParser::parseStatement((anonymous namespace)::ParseStatementInfo&, llvm::MCAsmParserSemaCallback*) + 3513
10 llvm-mca                 0x0000000103e0fddc (anonymous namespace)::AsmParser::Run(bool, bool) + 540
11 llvm-mca                 0x0000000103a215d3 llvm::mca::AsmCodeRegionGenerator::parseCodeRegions() + 243
12 llvm-mca                 0x0000000103a16725 main + 4261
13 libdyld.dylib            0x00007fff204e7f3d start + 1



zsh: abort      ./llvm-mca /Users/patrick/Desktop/files/temp_x86_crash.s
patrick@MacBook-Pro bin %

Output when attempting to run the file after the fix (only relevant part of the output is that it succeeds):

patrick@MacBook-Pro bin % pwd
/Users/patrick/Desktop/llvm/open_llvm/llvm-project/build/release/bin
patrick@MacBook-Pro bin % ./llvm-mca /Users/patrick/Desktop/files/temp_x86_crash.s



/Users/patrick/Desktop/files/temp_x86_crash.s:2:2: warning: .build_version macos used while targeting darwin20.4.0
        .build_version macos, 11, 0     sdk_version 11, 3
        ^
warning: found a return instruction in the input assembly sequence.
note: program counter updates are ignored.
warning: found a call in the input assembly sequence.
note: call instructions are not correctly modeled. Assume a latency of 100cy.
Iterations:        100
Instructions:      2300
Total Cycles:      2231
Total uOps:        3700

Dispatch Width:    6
uOps Per Cycle:    1.66
IPC:               1.03
Block RThroughput: 9.0


Instruction Info:
[1]: #uOps
[2]: Latency
[3]: RThroughput
[4]: MayLoad
[5]: MayStore
[6]: HasSideEffects (U)

[1]    [2]    [3]    [4]    [5]    [6]    Instructions:
 3      2     1.00           *            pushq    %rbp
 1      1     0.25                        movq    %rsp, %rbp
 1      1     1.00           *            movl    %edi, -4(%rbp)
 1      5     0.50    *                   movl    -4(%rbp), %eax
 1      1     0.50                        shll    %eax
 2      6     0.50    *                   popq    %rbp
 3      7     1.00                  U     retq
 3      2     1.00           *            pushq    %rbp
 1      1     0.25                        movq    %rsp, %rbp
 1      1     0.25                        subq    $32, %rsp
 1      1     1.00           *            movl    $0, -4(%rbp)
 1      1     1.00           *            movl    $5, -8(%rbp)
 1      1     1.00           *            movl    $3, -12(%rbp)
 1      5     0.50    *                   movl    -8(%rbp), %eax
 2      8     1.00    *                   imull    -12(%rbp), %eax
 1      1     1.00           *            movl    %eax, -16(%rbp)
 1      5     0.50    *                   movl    -16(%rbp), %edi
 4      3     1.00                        callq    __Z3funi
 1      1     1.00           *            movl    %eax, -20(%rbp)
 1      5     0.50    *                   movl    -20(%rbp), %eax
 1      1     0.25                        addq    $32, %rsp
 2      6     0.50    *                   popq    %rbp
 3      7     1.00                  U     retq


Resources:
[0]   - SKLDivider
[1]   - SKLFPDivider
[2]   - SKLPort0
[3]   - SKLPort1
[4]   - SKLPort2
[5]   - SKLPort3
[6]   - SKLPort4
[7]   - SKLPort5
[8]   - SKLPort6
[9]   - SKLPort7


Resource pressure per iteration:
[0]    [1]    [2]    [3]    [4]    [5]    [6]    [7]    [8]    [9]    
 -      -     3.99   3.99   6.00   6.00   9.00   3.85   4.17   6.00   

Resource pressure by instruction:
[0]    [1]    [2]    [3]    [4]    [5]    [6]    [7]    [8]    [9]    Instructions:
 -      -     0.19   0.01   0.05    -     1.00   0.15   0.65   0.95   pushq    %rbp
 -      -     0.01   0.16    -      -      -     0.83    -      -     movq    %rsp, %rbp
 -      -      -      -     0.94    -     1.00    -      -     0.06   movl    %edi, -4(%rbp)
 -      -      -      -     0.05   0.95    -      -      -      -     movl    -4(%rbp), %eax
 -      -     0.85    -      -      -      -      -     0.15    -     shll    %eax
 -      -     0.16   0.83   0.01   0.99    -      -     0.01    -     popq    %rbp
 -      -     0.16   0.20   0.64   0.36    -     0.64   1.00    -     retq
 -      -      -     0.01    -      -     1.00   0.20   0.79   1.00   pushq    %rbp
 -      -      -     0.22    -      -      -     0.78    -      -     movq    %rsp, %rbp
 -      -     0.18   0.78    -      -      -      -     0.04    -     subq    $32, %rsp
 -      -      -      -     0.34   0.02   1.00    -      -     0.64   movl    $0, -4(%rbp)
 -      -      -      -      -      -     1.00    -      -     1.00   movl    $5, -8(%rbp)
 -      -      -      -      -      -     1.00    -      -     1.00   movl    $3, -12(%rbp)
 -      -      -      -     0.04   0.96    -      -      -      -     movl    -8(%rbp), %eax
 -      -      -     1.00   0.96   0.04    -      -      -      -     imull    -12(%rbp), %eax
 -      -      -      -      -     0.65   1.00    -      -     0.35   movl    %eax, -16(%rbp)
 -      -      -      -     0.04   0.96    -      -      -      -     movl    -16(%rbp), %edi
 -      -     0.94   0.62   0.01    -     1.00   0.23   0.21   0.99   callq    __Z3funi
 -      -      -      -     0.95   0.04   1.00    -      -     0.01   movl    %eax, -20(%rbp)
 -      -      -      -     0.96   0.04    -      -      -      -     movl    -20(%rbp), %eax
 -      -     0.18    -      -      -      -     0.67   0.15    -     addq    $32, %rsp
 -      -     0.67    -     0.04   0.96    -     0.16   0.17    -     popq    %rbp
 -      -     0.65   0.16   0.97   0.03    -     0.19   1.00    -     retq



patrick@MacBook-Pro bin %

Diff Detail

Event Timeline

holland11 created this revision.May 18 2021, 12:06 PM
holland11 requested review of this revision.May 18 2021, 12:06 PM
holland11 edited the summary of this revision. (Show Details)May 18 2021, 12:12 PM
holland11 retitled this revision from llvm-mca MCTargetStreamer segfault fix to [MCA] llvm-mca MCTargetStreamer segfault fix.
andreadb accepted this revision.May 18 2021, 12:23 PM

LGTM

Thanks a lot for the detailed explanation.

This revision is now accepted and ready to land.May 18 2021, 12:23 PM

LGTM

Thanks a lot for the detailed explanation.

@andreadb Would you be able to commit on my behalf?

Sure. I can commit this patch for you.

What email address do you want me to specify in the author property?

Sure. I can commit this patch for you.

What email address do you want me to specify in the author property?

patrickeholland@gmail.com

Thank you!

This revision was landed with ongoing or failed builds.May 19 2021, 10:37 AM
This revision was automatically updated to reflect the committed changes.

@andreadb I seem to have made a mistake with the test file which is causing the commit to fail. I didn't realize I needed to explicitly declare the cpu (-mcpu= argument). I just looked at all of the other tests in the X86 directory and I will be modifying my test file then re-submitting as a new diff. Sorry about that.

holland11 edited the summary of this revision. (Show Details)May 19 2021, 11:57 AM

@andreadb I seem to have made a mistake with the test file which is causing the commit to fail. I didn't realize I needed to explicitly declare the cpu (-mcpu= argument). I just looked at all of the other tests in the X86 directory and I will be modifying my test file then re-submitting as a new diff. Sorry about that.

Hey Patrick,

don't worry about it. It is already fixed here: https://reviews.llvm.org/rG9acabe8b6ff5

I was keeping an eye at the buildbot page, and I saw the failures after your commit. It is all fixed now :)

holland11 edited the summary of this revision. (Show Details)May 19 2021, 1:16 PM

@andreadb I seem to have made a mistake with the test file which is causing the commit to fail. I didn't realize I needed to explicitly declare the cpu (-mcpu= argument). I just looked at all of the other tests in the X86 directory and I will be modifying my test file then re-submitting as a new diff. Sorry about that.

Hey Patrick,

don't worry about it. It is already fixed here: https://reviews.llvm.org/rG9acabe8b6ff5

I was keeping an eye at the buildbot page, and I saw the failures after your commit. It is all fixed now :)

Awesome! Thank you for that. I'm working on llvm-mca currently and I'm sure I could benefit from your insight occasionally if there's a way to reach you directly? (Maybe discord?)