Page MenuHomePhabricator

Work around somes register/spill/liveness issues relating to returnTwice aka setjmp
Needs ReviewPublic

Authored by bbrehm on Mar 10 2020, 4:19 PM.
This revision needs review, but there are no reviewers specified.

Details

Reviewers
None
Summary

Related issues: https://bugs.llvm.org/show_bug.cgi?id=28431 https://bugs.llvm.org/show_bug.cgi?id=27190 https://bugs.llvm.org/show_bug.cgi?id=21183 and the original issue that brought me here https://github.com/JuliaLang/julia/issues/17288

Presumably there is a related apple issue at rdar://problem/8007500, since a related work-around refers to it, but I don't work at apple, and the linked issue is not open-access.

Quentin Colombet posted the workaround/hint in 2014, in https://bugs.llvm.org/show_bug.cgi?id=21183: Disable join-liveintervals.

Here, I just formalized his tip (only apply in ExposesReturnsTwice functions). That alone fixes the bug in my reproducers.

Next I went over the other passes that handle this, and noted the old workaround in StackSlotColoring.cpp. Extrapolating from that, I suspect that StackSlotColoring.cpp is no better at handling the problem, so I disabled that out of an overabundance of caution.

The final change is to the inlining heuristic: Inlining perfectly fine code into an ExposesReturnsTwice context is a very bad idea until we get much better at optimizing and not miscompiling code surrounding setjmp. Hence that change. A further advantage is that we should trigger the bug less often: It is related to register spilling, and less inlined code often means less register spills.

The changes are pretty unprincipled. But we have been miscompiling setjmp code for 6 years, and nobody qualified seems motivated to fix the underlying issues in the machine-optimizer passes. So I think a temporary band-aid is warranted. I am not qualified ;)

I compared below example to gcc and msvc: MSVC emits very ugly risk-free correct code, by creating tons of volatile loads/stores. GCC generates super nice correct code; but I am not well-acquainted enough with the gcc code-base to understand what they are doing, and whether we could do the same.

The following reproducer is adapted from https://bugs.llvm.org/show_bug.cgi?id=28431

#include <setjmp.h>
#include <stdio.h>
#include <stdlib.h>

extern jmp_buf env;
extern int ff(int v);
extern int gg();
extern void dojump();



__attribute__((noinline)) int f(int a)
{
    printf("pre longjump: a = %d\n", a);
    int b = gg();
    int c = gg();
    int d = gg();
    int e = gg();
    int f = gg();
    int g = gg();
    int h = gg();
    int i = gg();
    double k = ff(b) + ff(c + ff(d + ff(e + ff(f + ff(g + ff(h + i))))));
    k *= b;
    k -= c;
    k += i;
    if (setjmp(env) == 0) {
        printf("Longjump set: a+4 = %d\n", a + 4);
        dojump();
        b = gg();
        c = gg();
        d = gg();
        e = gg();
        f = gg();
        g = gg();
        h = gg();
        i = gg();
        k = ff(b) + ff(c + ff(d + ff(e + ff(f + ff(g + ff(h + i))))));
        k *= b;
        k -= c;
        k += i;
        printf("%d\n", a + 4);
        b = gg();
        c = gg();
        d = gg();
        e = gg();
        f = gg();
        g = gg();
        h = gg();
        i = gg();
        k = ff(b) + ff(c + ff(d + ff(e + ff(f + ff(g + ff(h + i))))));
        k *= b;
        k -= c;
        k += i;
        printf("%d\n", a + 4);
        b = gg();
        c = gg();
        d = gg();
        e = gg();
        f = gg();
        g = gg();
        h = gg();
        i = gg();
        k = ff(b) + ff(c + ff(d + ff(e + ff(f + ff(g + ff(h + i))))));
        k *= b;
        k -= c;
        k += i;
        printf("%d\n", a + 4);
        printf("%f\n", k);
        
    }
    else {
        printf("Returned from Longjump: a = %d\n", a);
    }
    return -1;
}

int main()
{
    f(0);
    return 0;
}

miscompiles with clang -O3, which becomes evident when linked against

#include <setjmp.h>

jmp_buf env;

int ff(int v){return 0;};
int gg(){return 0;};
void dojump(){longjmp(env, 1);}

I get output:

pre longjump: a = 0
Longjump set: a+4 = 4
Returned from Longjump: a = 4

I don't have proper regression tests yet. Could anyone comment on how to proceed on that (first PR in llvm)? Since this is a machine-IR issue one could just make a .ll fixture, but I'm not sure how to make that reliable.

Diff Detail

Unit TestsFailed

TimeTest
70 msLLVM.CodeGen/X86::Unknown Unit Message ("")
Script: -- : 'RUN: at line 2'; /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/build/bin/llc < /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/llvm/test/CodeGen/X86/speculative-load-hardening-call-and-ret.ll -mtriple=x86_64-unknown-linux-gnu -x86-speculative-load-hardening | /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/build/bin/FileCheck /mnt/disks/ssd0/agent/workspace/amd64_debian_testing_clang8/llvm/test/CodeGen/X86/speculative-load-hardening-call-and-ret.ll --check-prefix=X64-NOPIC

Event Timeline

bbrehm created this revision.Mar 10 2020, 4:19 PM

Was digging around for something else and happened to spot this.

Since this was posted, https://reviews.llvm.org/D77767 made a similar change to RegisterCoalescer.

InlineCost.cpp already has a check for ReturnsTwice.

Maybe worth taking another look at the change to StackColoring.

It appears another subset of this might have been merged in https://reviews.llvm.org/D77767 also?