Stack slot colouring adds "weight" to a slot whenever a non-DBG_VALUE instruction refers to it. Unfortunately, this means that DBG_PHI instructions can affect codegen. The fix is very simple: replace isDebugValue with isDebugInstr.
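As a rough illustration of why the choice of predicate matters, here is a minimal standalone sketch. The types `Kind` and `Instr` and the helpers `slotWeights` and `sampleInstrs` are hypothetical stand-ins, not the real LLVM classes, and the "weight" is a plain reference count rather than the spill weight the pass actually computes. The only point is that a filter on isDebugValue lets a DBG_PHI contribute to a slot's weight, while a filter on isDebugInstr does not.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins for machine instructions; NOT the real LLVM classes.
enum class Kind { Store, Load, DbgValue, DbgPhi };

struct Instr {
  Kind K;
  std::size_t Slot; // index of the stack slot this instruction refers to

  // True only for DBG_VALUE-like instructions.
  bool isDebugValue() const { return K == Kind::DbgValue; }
  // True for every debug-only instruction in this model, including DBG_PHI.
  bool isDebugInstr() const {
    return K == Kind::DbgValue || K == Kind::DbgPhi;
  }
};

// Simplified weighting (an assumption for illustration): one unit per
// reference to a slot. SkipAllDebug selects the isDebugInstr filter (true)
// or the isDebugValue filter (false).
std::vector<unsigned> slotWeights(const std::vector<Instr> &Instrs,
                                  std::size_t NumSlots, bool SkipAllDebug) {
  std::vector<unsigned> Weights(NumSlots, 0);
  for (const Instr &I : Instrs) {
    if (SkipAllDebug ? I.isDebugInstr() : I.isDebugValue())
      continue; // debug-only instruction: must not influence codegen
    assert(I.Slot < NumSlots && "slot index out of range");
    ++Weights[I.Slot];
  }
  return Weights;
}

// Made-up input: slot 0 has two real uses, slot 1 has one real use plus
// one DBG_VALUE and two DBG_PHIs.
std::vector<Instr> sampleInstrs() {
  return {{Kind::Store, 0},    {Kind::Load, 0},   {Kind::Store, 1},
          {Kind::DbgValue, 1}, {Kind::DbgPhi, 1}, {Kind::DbgPhi, 1}};
}
```

With the isDebugValue filter, slot 1 in this made-up input ends up heavier than slot 0 purely because of the DBG_PHIs. With the isDebugInstr filter, the weights match what a build without debug info would see.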
The test is not so simple: because the failure mode involves both the weighting of stack slots and register allocation, it's really difficult to pin down a small input that replicates it. Thus, the test is the next best thing: an input that replicates the problem but is complicated. In instr-ref mode, the PHI in block 13 gets a DBG_PHI, which then changes the order in which stack slots are merged together. It doesn't actually eliminate any slots; instead, it changes how they're used.
To avoid having an extremely fragile test, I've used a little bit of sed to test both modes across as small a range of passes as possible (regalloc -> stack slot colouring). Without this patch applied, the DBG_PHI ends up referring to %stack.2, and other stores change their location.