Anton tried this 5 years ago, but it was reverted because it caused extra VMOVs to be emitted. This can be easily fixed with a liberal application of patterns matching loads/stores and extractelts.
There now appears to be reasonably good LIT test coverage ensuring that no extra VMOVs are inserted. I fell foul of a lot of those tests while developing this patch, so I'm reasonably confident it won't regress performance.