Page MenuHomePhabricator

Prevent register coalescing in functions whith setjmp
ClosedPublic

Authored by dnsampaio on Apr 8 2020, 6:00 PM.

Details

Summary

In the the given example, a stack slot pointer is merged
between a setjmp and longjmp. This pointer is spilled,
so it does not get correctly restored, addinga undefined
behaviour where it shouldn't.

Change-Id: I60ec010844f2a24ce01ceccf12eb5eba5ab94abb

Diff Detail

Event Timeline

dnsampaio created this revision.Apr 8 2020, 6:00 PM
Herald added a project: Restricted Project. · View Herald TranscriptApr 8 2020, 6:00 PM

I'm not sure I understand the problem; can you give a C testcase?

In general, stack coloring should be okay, as long as we aren't truncating the lifetime of a variable. If a variable which is live at the point setjmp is called is still live at the point of the longjmp, it shouldn't overlap with anything relevant. If the variable isn't live at the point of the longjmp, we don't have any obligation to preserve the value according to the standard (from C99: "All accessible objects have values [...]").

If we truncate the lifetime of a variable, that could be a problem: a variable which the standard requires to be live would be dead, and potentially overlap with something relevant. But I'm not sure how that would happen in practice if we still have an alloca going into isel.

Hi @efriedma,
thanks for looking into this.

I created https://bugs.llvm.org/show_bug.cgi?id=45489 to tackle this bug, with further details.

Bare with me, but the C++ code for this is like this:

file043220.h:#include <setjmp.h>
file043220.h:struct S38 {
file043220.h:  long long M0;
file043220.h:  bool M1;
file043220.h:};
file043220.h:struct S37 {
file043220.h:  char M1;
file043220.h:  S38 M2;
file043220.h:  long long M3[4];
file043220.h:  int M0a;
file043220.h:  unsigned long long M1a;
file043220.h:  int M6;
file043220.h:  S37() : M2(), M3(), M6()  {}
file043220.h:};
file043220.h:struct S23 {
file043220.h:  long long M0[2];
file043220.h:  __fp16 M1;
file043220.h:};
file043220.h:struct S18  {
file043220.h:  int Bx;
file043220.h:  long long BM0;
file043220.h:  long long BM1;
file043220.h:  long long BM2;
file043220.h:  long M0;
file043220.h:  S23 M3;
file043220.h:  int M4;
file043220.h:  S18() : BM0(0), M0(42), M3(), M4() {}
file043220.h:};
file043220.h:void foo(jmp_buf incoming_jb, S18 P0) __attribute__((noreturn));
file043220.h:void bar(jmp_buf incoming_jb, S37 P0) __attribute__((noreturn));
nolto.cpp:#include <assert.h>
nolto.cpp:#include <stdlib.h>
nolto.cpp:#include "file043220.h"
nolto.cpp:void foo(jmp_buf incoming_jb, S18 P0)  {
nolto.cpp:  assert(P0.M0 == 42);
nolto.cpp:  exit(0);
nolto.cpp:}
nolto.cpp:void bar(jmp_buf incoming_jb, S37 P0)  {
nolto.cpp:  longjmp(incoming_jb, 1);
nolto.cpp:}
lto.cpp:#include <assert.h>
lto.cpp:#include <alloca.h>
lto.cpp:#include "file043220.h"
lto.cpp:signed int main(void) {
lto.cpp:  {
lto.cpp:    S37 P0;
lto.cpp:    jmp_buf jb1;
lto.cpp:    if (!setjmp(jb1)) {
lto.cpp:      bar(jb1, P0);
lto.cpp:    }
lto.cpp:  }
lto.cpp:  {
lto.cpp:    char *V6 = (char *)alloca(128);
lto.cpp:    __asm volatile("" : : "r"(V6));
lto.cpp:  }
lto.cpp:  {
lto.cpp:    __attribute((aligned(16))) char V7[16];
lto.cpp:    __asm volatile("" : : "r"(&V7[0]));
lto.cpp:  }
lto.cpp:  __asm volatile("" : : "r"(__builtin_return_address(0)));
lto.cpp:  S18 P1;
lto.cpp:  jmp_buf jb2;
lto.cpp:  if (!setjmp(jb2)) {
lto.cpp:    foo(jb2, P1);
lto.cpp:  }
lto.cpp:  return 0;
lto.cpp:}

From my understanding, when we join variables with disjoint aliveness, one from before and the other after the long, changing the "pointer" to this new joined variable is dangerous. If it is spilled, it won't be restored by the longjmp.
If I understand correctly, in the example, S37 P0 and S18 P1 are selected to share the same stack slot. As the function bar is marked as noreturn, when performing bar(jb1, P0);, the copy of P0 increments the pointer by 72.
bar then calls longjmp, that does not restore the pointer value of P0 as it is spilled. When doing the copy of P1 for foo(jb2, P1);, the read access of P1 is misaligned by 72.

However, the only reason why we see the bug is because the initialization of P1 writes to the correct address, just because it does not write to the start of the container, and after Local Stack Slot Allocation it uses a different "pointer". Thanks to that we hit the assert failure for P1.M0 == 42 inside foo.

efriedma added a subscriber: rnk.Apr 9 2020, 4:30 PM

From my understanding, when we join variables with disjoint aliveness, one from before and the other after the long, changing the "pointer" to this new joined variable is dangerous. If it is spilled, it won't be restored by the longjmp.

You're saying the problem is the address of the local variable, not its contents? In that case, this patch is almost certainly covering up the real problem; an equivalent "merging" transform can be done at the source level. And it's probably feasible to hit related issues in other ways.

My best guess based on your description is that greedy regalloc isn't being conservative enough about its use of spill slots in functions that call setjmp.

rnk added a comment.Apr 10 2020, 1:20 PM

I suspect Eli is right, if this issue has to do with spill slots, it has more to do with the register allocator that does the spilling. I think we need to know more about that to take action. We can't just sprinkle "don't optimize because setjmp" throughout the compiler without a more principled reason.

Hi @rnk and @efriedma, indeed my initial thoughts were as well the register allocator, specially because using the pbqp allocator there are no spills for this example so it does not fail.
From what I see in the code it seems that none of the register allocators know anything about setjmp. There's an ancient llvm-dev post confirming it: https://lists.llvm.org/pipermail/llvm-dev/2011-October/043731.html

So the doubt I have is, what is the correct solution for it? I guess we can't block a variable from being spilled if it's life-range crosses the setjmp function. From my understanding of setjmp/longjmp, for any of such variables, when evicted, needs two spill slots. Spill slot 1 (ss1) holds the current value of the variable. Spill slot 2 (ss2) holds the variable value right before the setjmp, copied from ss1. Right after setfmp we always copy the ss2 to ss1 restoring it. The issue I see with doing it is that it will increase the register pressure at this point of the program by one, as we need it for doing the copying, unless the variable is also live in register at this point, which would avoid the second spill slot all together.

A possible optimization would be finding a lower register pressure point that post-dominates the last may-write to the variable before the setjmp function call and that dominates the function call, if doing so would avoid a second spill.
Does that sound correct to you?

From what I see in the code it seems that none of the register allocators know anything about setjmp

StackSlotColoring checks MF.exposesReturnsTwice(); that's related to register allocation. But yes, the allocators themselves don't have any checks.

Going into the register allocator, distinct SSA values should be in distinct registers. We can take advantage of this to preserve our invariants. If each of those distinct SSA values gets its own spill slot, that should be enough to avoid this class of issue: after the setjmp, the value would always be the same value we computed before the setjmp. This is basically the point of the check in StackSlotColoring.

But I guess the check there isn't enough in some cases. Maybe the problem is register coalescing?

(This is basically the same conclusion as https://lists.llvm.org/pipermail/llvm-dev/2011-October/043734.html .)

Hi @efriedma and @rnk ,

indeed following the log it is register coalescing that finally merges the two distinct stack pointers values into a single virtual register. For fixing it I see 3 options, but perhaps you can come with a better idea.
1 ) A quick fix for that would be simply disable register coalescing if the function exposesReturnsTwice. It is over-conservative, although I wouldn't consider that performance is so critical in such functions anyway due backing-up all register.

  1. The most generic solution I can think of is, after performing spills, to verify every spilled value if they have their value altered between a call to setjmp and a possible call to longjmp (we need to trace the buffer argument).

If it is, then we need to emit extra spill slot and code for performing saving the input-value at the the call to setjmp and restore the value just after the call.

  1. Do not allow to merge instructions if one is alive before the setjmp call and the other is alive after. As well, if one variable is alive both before and after a setjmp call and altered before any potential call to longjmp, that variable also

can't be merged. Although I think this one could be tricky, as we need to follow where the buffer argument can escape of the function.

Do you have any opinion about this, or perhaps a better idea?
Regards

The simplest thing is just disabling coalescing.

The next simplest thing to do would be to specifically disable coalescing registers live across setjmp calls (not worrying about any corresponding longjmp calls). This is probably straightforward to query: register allocation already computes the necessary liveness information, I think. But really, it's probably not worth the effort; disabling coalescing altogether isn't that big of a performance hit.

dnsampaio updated this revision to Diff 260300.Apr 27 2020, 6:13 AM

Do not perform register coalescing

dnsampaio retitled this revision from Prevent stack coloring functions whith setjmp / longjmp to Prevent register coalescing in functions whith setjmp.Apr 27 2020, 7:37 AM
dnsampaio edited the summary of this revision. (Show Details)
dnsampaio updated this revision to Diff 260332.Apr 27 2020, 8:13 AM

Fixed x86 test

Can you fix the MIR testcase to check the register coalescing pass itself, rather than the whole pipeline after stack coloring?

dnsampaio updated this revision to Diff 260527.Apr 27 2020, 6:59 PM

Now testing only simple-register-coalescing

rnk added inline comments.May 5 2020, 4:17 PM
llvm/test/CodeGen/ARM/no-register-coalescing-in-returnsTwice.mir
18

I can't read ARM assembly well enough to tell if this is a good test or to suggest how to make it better, so I'll have to ask Eli to review.

llvm/test/CodeGen/X86/speculative-load-hardening-call-and-ret.ll
376

Seems like we don't want to lose these labels. See rGe3ea164659ff37cb4db623c33de880e91aa29ebb. Please regenerate with --no_x86_scrub_rip.

For a while, I have wanted update_llc_test checks to store this option in the comment at the beginning of the file and then read the options back out when regenerating test cases, but it has not come to pass.

efriedma added inline comments.May 5 2020, 5:32 PM
llvm/test/CodeGen/ARM/no-register-coalescing-in-returnsTwice.mir
18

It should be possible to construct a shorter testcase by artificially forcing up the register pressure using inline asm clobbers.

See also http://llvm.org/docs/MIRLangRef.html?highlight=filecheck#simplifying-mir-files .

dnsampaio updated this revision to Diff 264364.May 15 2020, 2:59 PM
dnsampaio marked 4 inline comments as done.

Addressed requests

llvm/test/CodeGen/ARM/no-register-coalescing-in-returnsTwice.mir
18

Indeed replaced most of the code in my original reproducer by the inline asm clobbering r0 - r14 between the two function calls still gives me the same error.
And I simplified the mir as much as possible.

This revision is now accepted and ready to land.May 15 2020, 3:17 PM
This revision was automatically updated to reflect the committed changes.