As discussed before, we aren't optimizing the LLA+Load/Stores (i.e. AUIPC+ADDI+Load/Stores) instruction sequences that we get with medany because:
- the pseudo instruction expansion pass runs late (in pre-emit2, i.e. post-ra);
- the pass to merge the offset of address calculations into the offset of load/stores runs pre-ra, in SSA form.
Ideally, we want to expand the LLA pseudo instruction earlier and extend the offset folding pass to handle the AUIPC case. This patch implements that earlier expansion.
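For reference, here is a minimal sketch of the arithmetic behind the AUIPC+ADDI pair and of what folding a load/store offset into it amounts to. This is not code from this patch: `splitPcrel` and `PcrelParts` are hypothetical names used only for illustration, and `delta` stands for the PC-relative distance from the AUIPC to the symbol. The split assumes the usual RISC-V convention that the low part is a sign-extended 12-bit immediate, so the high part has to compensate whenever bit 11 is set.

```cpp
#include <cstdint>

// Hypothetical helper, for illustration only (not part of the patch).
struct PcrelParts {
  int64_t hi20; // immediate for AUIPC (what %pcrel_hi resolves to)
  int64_t lo12; // signed 12-bit immediate for ADDI or the load/store
};

// Split a PC-relative distance so that (hi20 << 12) + lo12 == delta,
// with lo12 in [-2048, 2047].
PcrelParts splitPcrel(int64_t delta) {
  // Sign-extend the low 12 bits of delta.
  int64_t lo12 = ((delta & 0xfff) ^ 0x800) - 0x800;
  // delta - lo12 is a multiple of 4096, so this division is exact.
  int64_t hi20 = (delta - lo12) / 4096;
  return {hi20, lo12};
}

// Folding a load/store offset into the address computation corresponds to
// splitting (delta + offset) instead of delta, so the memory access can
// carry the low part directly and the ADDI is no longer needed.
PcrelParts splitPcrelWithOffset(int64_t delta, int64_t offset) {
  return splitPcrel(delta + offset);
}
```

Note that the folded split can change the high part as well as the low part, which is why the offset folding has to rewrite the AUIPC operand and not only the load/store immediate.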
Simply performing the same expansion earlier, with virtual registers, runs into problems: it's easy for optimizations to separate the AUIPC instruction from the BB label that is supposed to point to that AUIPC.
Earlier passes don't know about LabelMustBeEmitted. Originally I solved that by making that flag also imply AddressTaken, but creating BBs earlier interfered with various optimizations, so I ended up going for an implementation based on createNamedTempSymbol + setPreInstrSymbol.
This is overly strict in the specific case being addressed by this patch:
can legally be optimised to