Page MenuHomePhabricator

RegAllocFast: Do not free later early-clobbered registers.
AbandonedPublic

Authored by FreddyYe on Jan 3 2021, 1:56 AM.

Details

Summary

Consider corner case: MSInlineAsm may contain operands of both
implicit-def $eax and implicit-def early-clobber $eax

Diff Detail

Unit TestsFailed

TimeTest
2,150 msx64 debian > libarcher.races::task-two.c
Script: -- : 'RUN: at line 13'; /mnt/disks/ssd0/agent/llvm-project/build/./bin/clang -fopenmp -pthread -fno-experimental-isel -g -O1 -fsanitize=thread -I /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests -I /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/runtime/src -L /mnt/disks/ssd0/agent/llvm-project/build/lib -Wl,-rpath,/mnt/disks/ssd0/agent/llvm-project/build/lib /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/races/task-two.c -o /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/task-two.c.tmp -latomic && env TSAN_OPTIONS='ignore_noninstrumented_modules=0:ignore_noninstrumented_modules=1' /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/deflake.bash /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/task-two.c.tmp 2>&1 | tee /mnt/disks/ssd0/agent/llvm-project/build/projects/openmp/tools/archer/tests/races/Output/task-two.c.tmp.log | /mnt/disks/ssd0/agent/llvm-project/build/./bin/FileCheck /mnt/disks/ssd0/agent/llvm-project/openmp/tools/archer/tests/races/task-two.c

Event Timeline

FreddyYe created this revision.Jan 3 2021, 1:56 AM
FreddyYe requested review of this revision.Jan 3 2021, 1:56 AM
Herald added a project: Restricted Project. · View Herald TranscriptJan 3 2021, 1:56 AM
FreddyYe added a comment.EditedJan 3 2021, 2:10 AM

This patch is to fix a regression bug (https://gcc.godbolt.org/z/GG44f6) on recent rewrite on RegAllocFast.cpp. The situation that implicit-def $eax and implicit-def early-clobber $eax both occur in MI is rare. But since old implement can handle it, so I add a some condition to check such MI.

LuoYuanke added inline comments.Jan 3 2021, 6:02 AM
llvm/test/CodeGen/X86/phys-msInline-fastregalloc.ll
11 ↗(On Diff #314271)

It seems to be invalid inline assembly. We can't set eax both as output and clobber. The clobbered register and output register should be in different physical register.

[llvm]$ cat t.c

void foo(int a, int b) {
  int c;
  asm volatile("addl %1, %2 \n"
               "movl %2, %0\n"
               :"=a"(c)
               :"r"(a), "r"(b)
               :"eax");
}
[llvm]$ clang -S t.c
t.c:8:17: error: asm-specifier for input or output variable conflicts with asm clobber list
               :"eax");
                ^
1 error generated.
[llvm]$ gcc -S t.c
t.c: In function ‘foo’:
t.c:4:3: error: ‘asm’ operand has impossible constraints
    4 |   asm volatile("addl %1, %2 \n"
      |   ^~~
[llvm]$

This is from https://llvm.org/docs/LangRef.html#output-constraints.

If this is not safe (e.g. if the assembly contains two instructions, where the first writes to one output, and the second reads an input and writes to a second output), then the “&” modifier must be used (e.g. “=&r”) to specify that the output is an “early-clobber” output. Marking an output as “early-clobber” ensures that LLVM will not use the same register for any inputs (other than an input tied to this output).

The eax is an output, so front-end need to mark the early-clobber for eax with "=&{eax}", so it seems to be a front-end issue?

FreddyYe added inline comments.Jan 3 2021, 6:48 PM
llvm/test/CodeGen/X86/phys-msInline-fastregalloc.ll
11 ↗(On Diff #314271)

Thanks for review! Sorry for failing to know this output constraint in IR before. Yes, both GNU inline asm and llvm IR don't allow setting one register both as output and clobber. So the front-end seems to deal wrong with MS inline assembler. Seems like clang front-end always generate ={eax} and set each output register in asm as clobber when dealing with MS inline assembler. So this conflict may happen in clang front-end. I will abandon this review.

FreddyYe added inline comments.Jan 3 2021, 7:24 PM
llvm/test/CodeGen/X86/phys-msInline-fastregalloc.ll
11 ↗(On Diff #314271)

I'll double check on conclusions above.

I'd like an additional MIR test for this

FreddyYe updated this revision to Diff 314589.Jan 5 2021, 6:30 AM

Add MIR test

arsenm added inline comments.Jan 5 2021, 11:10 AM
llvm/lib/CodeGen/RegAllocFast.cpp
1241

I don't understand what's special about MSVC inline asm. Aren't the constraints semantically the same for any asm type?

llvm/test/CodeGen/X86/corner-msInline-fastregalloc.mir
6–25

You should be able to drop the whole IR section

27–63

Can delete most of this (-simplify-mir helps)

FreddyYe added inline comments.Jan 5 2021, 7:07 PM
llvm/lib/CodeGen/RegAllocFast.cpp
1241

MSVC inline asm is wrote in Intel assembly format. So the method isMsInlineAsm() here uses flag inteldialect to distinguish from GNU inline asm. (I don't know why this flag didn't show in this MIR test, but it indeed works.)

From the investigation I've done so far, I have several conclusions:

  1. MSVC inline asm doesn't specify semantically input, output, clobber in source. But LLVM IR has such semantics. So does GNU inline asm.
  2. When clang front-end deals with MS inline asm, it has such rules to set input, output, clobber:
    1. always set eax as output
    2. set the dest register in each asm as a clobber

The rules above will then generate corner case like this patch's IR and MIR test.

Yuanke mentioned about set early clobber in LLVM IR (i.e. =&r). I did a experiment. It indeed can pass this bug without this patch. But seems like this rule is for user side. Becasue I found that clang also cannot set =&r for GNU inline asm when output register is written to before the last asm. But clang can deal with similar semantics GNU inline case, because eax is not set as clobber by front-end. (https://gcc.godbolt.org/z/We3b9c).

FreddyYe updated this revision to Diff 316089.Jan 12 2021, 7:35 AM

simplify the MIR test

FreddyYe marked an inline comment as done.Jan 12 2021, 7:39 AM

THX for review!

llvm/test/CodeGen/X86/corner-msInline-fastregalloc.mir
6–25

I failed to drop the IR. Because if dropped, %stack.0.retval and %stack.1.c will parsed by llc as not defined. Maybe I'm doing the wrong way. I'm not so familiar with MIR. Can you show me an example also drops IR but with stack variable?

I uploaded another patch in front-end D94466. It also can fix this regression bug. Both solutions are reasonable to me, since for now clang is weak to deal with MS inline asm.

arsenm added inline comments.Jan 12 2021, 9:50 AM
llvm/lib/CodeGen/RegAllocFast.cpp
1241

There shouldn't be a semantic difference. Ultimately the MIR should have the same flags/semantics regardless of the syntax used. This is either correct for all asm dialects, or it's generally incorrect

FreddyYe added inline comments.Jan 12 2021, 11:13 PM
llvm/lib/CodeGen/RegAllocFast.cpp
1241

I agree with you that there shouldn't be semantic difference. Maybe the old implementation just happen to cover this issue luckily. I'd like to listen to more front-end's voice for this issue. Welcome to comment there! D94466

FreddyYe abandoned this revision.Wed, Feb 3, 10:47 PM

Patch has solved this regression bug in Front End.

Patch has solved this regression bug in Front End.