While I'm quite sure this is correct and optimal,
I'm having trouble coming up with a test case to demonstrate it.
llvm-stress provided a few snippets that showcase codegen differences
(https://godbolt.org/z/7G3T45cxq), but I can't quite reduce them to something reasonable that I'd feel OK using as a codegen test :)