EDIT: This submission required a minor tweak that Andrea made and can be seen in his comment at the bottom of the page.
In order to create the code regions for llvm-mca to analyze, llvm-mca creates an AsmCodeRegionGenerator and calls AsmCodeRegionGenerator::parseCodeRegions(). Within this function, both an MCAsmParser and MCTargetAsmParser are created so that MCAsmParser::Run() can be used to create the code regions for us.
These parser classes were created for llvm-mc so they are designed to emit code with an MCStreamer and MCTargetStreamer that are expected to be setup and passed into the MCAsmParser constructor. Because llvm-mca doesn’t want to emit any code, an MCStreamerWrapper class gets created instead and passed into the MCAsmParser constructor. This wrapper inherits from MCStreamer and overrides many of the emit methods to just do nothing. The exception is the emitInstruction() method which calls Regions.addInstruction(Inst).
This works well and allows llvm-mca to utilize llvm-mc’s MCAsmParser to build our code regions, however there are a few directives which rely on the MCTargetStreamer. llvm-mc assumes that the MCStreamer that gets passed into the MCAsmParser’s constructor has a valid pointer to an MCTargetStreamer. Because llvm-mca doesn’t setup an MCTargetStreamer, when the parser encounters one of those directives, a segfault will occur.
In x86, each one of these 7 directives will cause this segfault if they exist in the input assembly to llvm-mca:
.cv_fpo_proc .cv_fpo_setframe .cv_fpo_pushreg .cv_fpo_stackalloc .cv_fpo_stackalign .cv_fpo_endprologue .cv_fpo_endproc
I haven’t looked at other targets, but I wouldn’t be surprised if some of the other ones also have certain directives which could result in this same segfault.
My proposed solution is to simply initialize an MCTargetStreamer after we initialize the MCStreamerWrapper. The MCTargetStreamer requires an ostream object, but we don’t actually want any of these directives to be emitted anywhere, so I use an ostream created with the nulls() function. Since this needs to happen after the MCStreamerWrapper has been initialized, it needs to happen within the AsmCodeRegionGenerator::parseCodeRegions() function. The MCTargetStreamer also needs an MCInstPrinter which is easiest to initialize within the main() function of llvm-mca. So this MCInstPrinter gets constructed within main() then passed into the parseCodeRegions() function as a parameter. (If you feel like it would be appropriate and possible to create the MCInstPrinter within the parseCodeRegions() function, then feel free to modify my solution. That would stop us from having to pass it into the function and would limit its scope / lifetime.)
My solution stops the segfault from happening and still passes all of the current (expected) llvm-mca tests. I also added a new test for x86 that checks for this segfault on an input that includes one of the .cv_fpo directives (this test fails without my solution, but passes with it).
As far as I can tell, all of the functions that I modified are only called from within llvm-mca so there shouldn’t be any worries about breaking other tools.
Assembly input that causes the segfault (relevant directive on line 5):
.section __TEXT,__text,regular,pure_instructions .build_version macos, 11, 0 sdk_version 11, 3 .globl __Z3funi ## -- Begin function _Z3funi .p2align 4, 0x90 .cv_fpo_pushreg ebx __Z3funi: ## @_Z3funi .cfi_startproc ## %bb.0: pushq %rbp .cfi_def_cfa_offset 16 .cfi_offset %rbp, -16 movq %rsp, %rbp .cfi_def_cfa_register %rbp movl %edi, -4(%rbp) movl -4(%rbp), %eax shll $1, %eax popq %rbp retq .cfi_endproc ## -- End function .globl _main ## -- Begin function main .p2align 4, 0x90 _main: ## @main .cfi_startproc ## %bb.0: pushq %rbp .cfi_def_cfa_offset 16 .cfi_offset %rbp, -16 movq %rsp, %rbp .cfi_def_cfa_register %rbp subq $32, %rsp movl $0, -4(%rbp) movl $5, -8(%rbp) movl $3, -12(%rbp) movl -8(%rbp), %eax imull -12(%rbp), %eax movl %eax, -16(%rbp) movl -16(%rbp), %edi callq __Z3funi movl %eax, -20(%rbp) movl -20(%rbp), %eax addq $32, %rsp popq %rbp retq .cfi_endproc ## -- End function .subsections_via_symbols
Output when attempting to run this file through llvm-mca without the fix:
patrick@MacBook-Pro bin % pwd /Users/patrick/Desktop/llvm/open_llvm/llvm-project/build/release/bin patrick@MacBook-Pro bin % ./llvm-mca /Users/patrick/Desktop/files/temp_x86_crash.s /Users/patrick/Desktop/files/temp_x86_crash.s:2:2: warning: .build_version macos used while targeting darwin20.4.0 .build_version macos, 11, 0 sdk_version 11, 3 ^ Assertion failed: (getParser().getStreamer().getTargetStreamer() && "do not have a target streamer"), function getTargetStreamer, file /Users/patrick/Desktop/llvm/open_llvm/llvm-project/llvm/lib/Target/X86/AsmParser/X86AsmParser.cpp, line 117. PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace. Stack dump: 0. Program arguments: ./llvm-mca /Users/patrick/Desktop/files/temp_x86_crash.s Stack dump without symbol names (ensure you have llvm-symbolizer in your PATH or set the environment var `LLVM_SYMBOLIZER_PATH` to point to it): 0 llvm-mca 0x0000000103eac5f7 llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) + 39 1 llvm-mca 0x0000000103eab528 llvm::sys::RunSignalHandlers() + 248 2 llvm-mca 0x0000000103eacfd0 SignalHandler(int) + 272 3 libsystem_platform.dylib 0x00007fff20511d7d _sigtramp + 29 4 libsystem_platform.dylib 0x00007fb15e00b701 _sigtramp + 18446743739737020833 5 libsystem_c.dylib 0x00007fff20421411 abort + 120 6 libsystem_c.dylib 0x00007fff204207e8 err + 0 7 llvm-mca 0x0000000103fe2541 (anonymous namespace)::X86AsmParser::ParseDirective(llvm::AsmToken) (.cold.13) + 33 8 llvm-mca 0x0000000103b7d426 (anonymous namespace)::X86AsmParser::ParseDirective(llvm::AsmToken) + 5238 9 llvm-mca 0x0000000103e15a69 (anonymous namespace)::AsmParser::parseStatement((anonymous namespace)::ParseStatementInfo&, llvm::MCAsmParserSemaCallback*) + 3513 10 llvm-mca 0x0000000103e0fddc (anonymous namespace)::AsmParser::Run(bool, bool) + 540 11 llvm-mca 0x0000000103a215d3 llvm::mca::AsmCodeRegionGenerator::parseCodeRegions() + 243 12 llvm-mca 0x0000000103a16725 main + 4261 13 libdyld.dylib 0x00007fff204e7f3d start + 1 zsh: abort ./llvm-mca /Users/patrick/Desktop/files/temp_x86_crash.s patrick@MacBook-Pro bin %
Output when attempting to run the file after the fix (only relevant part of the output is that it succeeds):
patrick@MacBook-Pro bin % pwd /Users/patrick/Desktop/llvm/open_llvm/llvm-project/build/release/bin patrick@MacBook-Pro bin % ./llvm-mca /Users/patrick/Desktop/files/temp_x86_crash.s /Users/patrick/Desktop/files/temp_x86_crash.s:2:2: warning: .build_version macos used while targeting darwin20.4.0 .build_version macos, 11, 0 sdk_version 11, 3 ^ warning: found a return instruction in the input assembly sequence. note: program counter updates are ignored. warning: found a call in the input assembly sequence. note: call instructions are not correctly modeled. Assume a latency of 100cy. Iterations: 100 Instructions: 2300 Total Cycles: 2231 Total uOps: 3700 Dispatch Width: 6 uOps Per Cycle: 1.66 IPC: 1.03 Block RThroughput: 9.0 Instruction Info: [1]: #uOps [2]: Latency [3]: RThroughput [4]: MayLoad [5]: MayStore [6]: HasSideEffects (U) [1] [2] [3] [4] [5] [6] Instructions: 3 2 1.00 * pushq %rbp 1 1 0.25 movq %rsp, %rbp 1 1 1.00 * movl %edi, -4(%rbp) 1 5 0.50 * movl -4(%rbp), %eax 1 1 0.50 shll %eax 2 6 0.50 * popq %rbp 3 7 1.00 U retq 3 2 1.00 * pushq %rbp 1 1 0.25 movq %rsp, %rbp 1 1 0.25 subq $32, %rsp 1 1 1.00 * movl $0, -4(%rbp) 1 1 1.00 * movl $5, -8(%rbp) 1 1 1.00 * movl $3, -12(%rbp) 1 5 0.50 * movl -8(%rbp), %eax 2 8 1.00 * imull -12(%rbp), %eax 1 1 1.00 * movl %eax, -16(%rbp) 1 5 0.50 * movl -16(%rbp), %edi 4 3 1.00 callq __Z3funi 1 1 1.00 * movl %eax, -20(%rbp) 1 5 0.50 * movl -20(%rbp), %eax 1 1 0.25 addq $32, %rsp 2 6 0.50 * popq %rbp 3 7 1.00 U retq Resources: [0] - SKLDivider [1] - SKLFPDivider [2] - SKLPort0 [3] - SKLPort1 [4] - SKLPort2 [5] - SKLPort3 [6] - SKLPort4 [7] - SKLPort5 [8] - SKLPort6 [9] - SKLPort7 Resource pressure per iteration: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] - - 3.99 3.99 6.00 6.00 9.00 3.85 4.17 6.00 Resource pressure by instruction: [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] Instructions: - - 0.19 0.01 0.05 - 1.00 0.15 0.65 0.95 pushq %rbp - - 0.01 0.16 - - - 0.83 - - movq %rsp, %rbp - - - - 0.94 - 1.00 - - 0.06 movl %edi, -4(%rbp) - - - - 0.05 0.95 - - - - movl -4(%rbp), %eax - - 0.85 - - - - - 0.15 - shll %eax - - 0.16 0.83 0.01 0.99 - - 0.01 - popq %rbp - - 0.16 0.20 0.64 0.36 - 0.64 1.00 - retq - - - 0.01 - - 1.00 0.20 0.79 1.00 pushq %rbp - - - 0.22 - - - 0.78 - - movq %rsp, %rbp - - 0.18 0.78 - - - - 0.04 - subq $32, %rsp - - - - 0.34 0.02 1.00 - - 0.64 movl $0, -4(%rbp) - - - - - - 1.00 - - 1.00 movl $5, -8(%rbp) - - - - - - 1.00 - - 1.00 movl $3, -12(%rbp) - - - - 0.04 0.96 - - - - movl -8(%rbp), %eax - - - 1.00 0.96 0.04 - - - - imull -12(%rbp), %eax - - - - - 0.65 1.00 - - 0.35 movl %eax, -16(%rbp) - - - - 0.04 0.96 - - - - movl -16(%rbp), %edi - - 0.94 0.62 0.01 - 1.00 0.23 0.21 0.99 callq __Z3funi - - - - 0.95 0.04 1.00 - - 0.01 movl %eax, -20(%rbp) - - - - 0.96 0.04 - - - - movl -20(%rbp), %eax - - 0.18 - - - - 0.67 0.15 - addq $32, %rsp - - 0.67 - 0.04 0.96 - 0.16 0.17 - popq %rbp - - 0.65 0.16 0.97 0.03 - 0.19 1.00 - retq patrick@MacBook-Pro bin %