Perform the probing in the *cough* correct direction.
I tried the C reproducer from that Rust bug, and it turns out that alloca(0) was already fixed by D88548. But when I changed that to a larger size, the generated code skipped right over the probe loop. I confirmed that your change here does make it go through the loop.
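For anyone who wants to try something similar, here is a minimal sketch of the kind of test case I mean. It is not the actual reproducer from the Rust bug: the function name and the 4 KiB page size are my own assumptions, and I'm assuming the probing in question is the one enabled by -fstack-clash-protection.

```c
#include <alloca.h>
#include <stddef.h>

/* Hypothetical stand-in for the reproducer: allocate n bytes with a
 * dynamic alloca and touch one byte per page, so the whole allocation
 * is actually used. Returns the number of bytes touched. */
size_t touch_dynamic_alloca(size_t n) {
    volatile char *buf = (volatile char *)alloca(n);
    size_t touched = 0;
    /* 4096 is an assumed page size, chosen only for illustration. */
    for (size_t off = 0; off < n; off += 4096) {
        buf[off] = 1;
        touched += buf[off];
    }
    return touched;
}
```

Building this with and without the flag and comparing the assembly around the alloca should show whether the probe loop is emitted and whether control flow actually reaches it.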
It would be nice to have tests that actually execute this stuff, rather than just relying on manual reviews of the expected assembly, but I don't know if that's feasible in the current infrastructure.
Shouldn't this sub come before the mov 0? I think right now, the first iteration is going to clobber the most recent thing on the stack, in this case the value saved by pushq %rbp.
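To illustrate the ordering concern, here is a toy model in C, not the real codegen. The stack is just an array, and the sizes and the function name are hypothetical. It only shows why storing before adjusting the stack pointer overwrites whatever was pushed last.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum { STACK_WORDS = 64, PROBE_WORDS = 8 }; /* hypothetical sizes */

/* Runs `probes` iterations of a simulated probe loop and reports whether
 * the value at the original top of stack (modelling pushq %rbp) survives. */
bool probe_preserves_top(bool sub_before_store, size_t probes) {
    uint64_t stack[STACK_WORDS];
    const uint64_t saved_rbp = 0x1234;
    size_t sp = STACK_WORDS - 1;

    assert(probes * PROBE_WORDS < STACK_WORDS); /* stay inside the array */
    memset(stack, 0xAA, sizeof stack);
    stack[sp] = saved_rbp; /* models the saved frame pointer */

    for (size_t i = 0; i < probes; i++) {
        if (sub_before_store) {
            sp -= PROBE_WORDS; /* sub first ... */
            stack[sp] = 0;     /* ... then probe the new page */
        } else {
            stack[sp] = 0;     /* store first: hits the live slot */
            sp -= PROBE_WORDS;
        }
    }
    return stack[STACK_WORDS - 1] == saved_rbp;
}
```

In this model, probe_preserves_top(true, 4) keeps the saved value intact, while probe_preserves_top(false, 4) zeroes it on the very first iteration.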