Reordering the objects in the stack frame based on their users allows better use of displacement operands. This means fewer Load Address instructions are needed for loading and storing those objects.
This is still a work in progress, but the new test case shows a case where an LAY is eliminated.
(Inspired by X86FrameLowering::orderFrameObjects().)
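To illustrate the idea outside of LLVM, here is a minimal standalone sketch. It is not the code from the patch: `StackObj`, `countOutOfRangeUses`, `orderByUseDensity` and the uses-per-byte heuristic are all hypothetical, and the model assumes offsets start at 0, ignores alignment, and treats any offset above 4095 (the 12-bit unsigned displacement range) as needing an extra address computation.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical model of a stack object; not an LLVM type.
struct StackObj {
  int Index;              // frame index
  uint64_t Size;          // size in bytes, assumed non-zero
  unsigned ShortDispUses; // users that only accept a 12-bit displacement
};

// Largest offset encodable in a 12-bit unsigned displacement.
constexpr uint64_t MaxShortDisp = 4095;

// Lay the objects out back to back in the given order and count the
// users whose object offset does not fit in the short displacement.
unsigned countOutOfRangeUses(const std::vector<StackObj> &Objs) {
  uint64_t Offset = 0;
  unsigned Count = 0;
  for (const StackObj &O : Objs) {
    if (Offset > MaxShortDisp)
      Count += O.ShortDispUses;
    Offset += O.Size;
  }
  return Count;
}

// Assumed heuristic: place objects with more short-displacement users
// per byte first. Cross-multiplication avoids floating point.
std::vector<StackObj> orderByUseDensity(std::vector<StackObj> Objs) {
  std::stable_sort(Objs.begin(), Objs.end(),
                   [](const StackObj &A, const StackObj &B) {
                     return uint64_t(A.ShortDispUses) * B.Size >
                            uint64_t(B.ShortDispUses) * A.Size;
                   });
  return Objs;
}
```

In this model, a large and rarely used object placed first pushes a small, frequently used object out of range; ordering by use density moves the small object back within reach of the short displacement.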
Why two calls to stable_sort here? It seems this could be done in a single call with an appropriate comparison routine.
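As a sketch of what that could look like (the sort keys actually used in the patch are not shown here, so `FrameObj`, `Uses` and `Size` are hypothetical stand-ins): two `std::stable_sort` passes, secondary key first and primary key second, produce the same order as a single pass with a lexicographic comparator.

```cpp
#include <algorithm>
#include <tuple>
#include <vector>

// Hypothetical stand-in for the per-object data being sorted.
struct FrameObj {
  int Index;     // frame index
  unsigned Uses; // assumed primary key, descending
  unsigned Size; // assumed secondary key, ascending
};

// Two passes: sort by the secondary key, then by the primary key.
// The second pass is stable, so ties keep the secondary order.
std::vector<FrameObj> sortTwoPasses(std::vector<FrameObj> Objs) {
  std::stable_sort(Objs.begin(), Objs.end(),
                   [](const FrameObj &A, const FrameObj &B) {
                     return A.Size < B.Size;
                   });
  std::stable_sort(Objs.begin(), Objs.end(),
                   [](const FrameObj &A, const FrameObj &B) {
                     return A.Uses > B.Uses;
                   });
  return Objs;
}

// One pass: compare (Uses descending, Size ascending) lexicographically.
std::vector<FrameObj> sortOnePass(std::vector<FrameObj> Objs) {
  std::stable_sort(Objs.begin(), Objs.end(),
                   [](const FrameObj &A, const FrameObj &B) {
                     return std::tie(B.Uses, A.Size) <
                            std::tie(A.Uses, B.Size);
                   });
  return Objs;
}

// Helper to compare results by frame index only.
std::vector<int> indicesOf(const std::vector<FrameObj> &Objs) {
  std::vector<int> Out;
  for (const FrameObj &O : Objs)
    Out.push_back(O.Index);
  return Out;
}
```

Both functions keep objects that tie on both keys in their original relative order, so the single-call form should be a drop-in replacement whenever the two passes use independent keys like this.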