Requiring the input of the first shift to be killed is more conservative than necessary; we can just insert an explicit copy instead.
Thinking about it a bit more, I'm not sure it's really worth complicating this code to catch a few more cases; the pattern isn't rare (I found 7 instances in a relatively small codebase), but it isn't really common either. That said, I have another patch that breaks a couple of regression tests without this fix.
Nit: can you clean up the testcase a little?