This sets the latency of stores to 1 in the Cortex-A55 scheduling model, to better match the values given in the software optimization guide.
The latency of a store does not appear to have many uses in normal LLVM scheduling. If the store has no outputs then the latency is largely meaningless (pre/post-increment update operands use the WriteAdr write for those operands instead). The one place it does alter things is the latency between a store and the end of the scheduling region, which can in turn affect the critical path length. As a result, a latency of 1 is more correct and gives slightly better scheduling of instructions near the end of the block.
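As a rough illustration of the critical path effect, here is a toy Python model, not LLVM code. The instruction names, the dependency graph, the non-store latencies and the "old" store latency of 4 are all hypothetical, and the model ignores issue width and resource limits. It only shows that a store at the tail of a dependency chain adds its full latency to the distance to the end of the region.

```python
def critical_path(latency, deps):
    """Longest latency-weighted path from region start to region end.

    latency: dict of instruction name -> latency in cycles, listed in
             program order (which must be a valid topological order).
    deps:    dict of instruction name -> list of names it depends on.
    """
    finish = {}  # cycle at which each instruction's latency has elapsed
    for inst, lat in latency.items():
        # An instruction can start once all of its producers have finished.
        start = max((finish[d] for d in deps.get(inst, [])), default=0)
        finish[inst] = start + lat
    # The end of the region waits for the slowest instruction.
    return max(finish.values())


# Hypothetical block: ldr -> add -> str, plus an independent mul.
deps = {"add": ["ldr"], "str": ["add"]}
base = {"ldr": 3, "add": 1, "mul": 4}  # assumed latencies, for illustration

with_long_store = critical_path({**base, "str": 4}, deps)   # assumed old value
with_short_store = critical_path({**base, "str": 1}, deps)  # latency of 1

print(with_long_store, with_short_store)
```

In this sketch the store's latency is the only thing that changes, yet the computed critical path drops from 8 to 5 cycles, which is the kind of difference a scheduler may see for instructions near the end of a block.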
The stores are marked as RetireOOO to keep llvm-mca from introducing stalls where none would exist.