The code generated for split fp128 loads and stores was missing a small but important adjustment to the pointer metadata passed to getLoad and getStore, leaving that metadata out of sync with the effective memory address.
This problem often resulted in instructions being scheduled in the wrong order.
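To illustrate why stale pointer metadata can lead to reordering, here is a minimal, self-contained sketch. It does not use LLVM: `PtrInfo`, `Access`, `mayOverlap` and `splitFp128` are hypothetical stand-ins, and the overlap check is only a toy model of the kind of reasoning a scheduler might apply. It assumes the 16-byte access is split into two 8-byte halves, with the second half 8 bytes past the first.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for pointer metadata: a base object id plus a byte offset.
struct PtrInfo {
  int Base;
  int64_t Offset;
  // Returns a copy of this metadata shifted by Delta bytes.
  PtrInfo getWithOffset(int64_t Delta) const { return {Base, Offset + Delta}; }
};

// A memory access as a (toy) scheduler sees it: metadata plus size in bytes.
struct Access {
  PtrInfo Info;
  int64_t Size;
};

// Two accesses may overlap only if they share a base and their byte ranges intersect.
bool mayOverlap(const Access &A, const Access &B) {
  if (A.Info.Base != B.Info.Base)
    return false;
  return A.Info.Offset < B.Info.Offset + B.Size &&
         B.Info.Offset < A.Info.Offset + A.Size;
}

struct SplitAccess {
  Access First;
  Access Second;
};

// Splits a 16-byte access into two 8-byte halves. With AdjustOffset == false
// the second half reuses the original metadata unchanged (the buggy pattern);
// with AdjustOffset == true its metadata is moved along with the address.
SplitAccess splitFp128(const PtrInfo &Info, bool AdjustOffset) {
  Access First{Info, 8};
  Access Second{AdjustOffset ? Info.getWithOffset(8) : Info, 8};
  return {First, Second};
}
```

In this model, an unrelated 8-byte access at offset 8 of the same object is reported as independent of the unadjusted second half, even though both touch the same bytes, so nothing stops the two from being reordered. With the offset applied, the overlap is visible.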
I also took this chance to clean up some "wrong" uses of getAlignment as done in D77687.
Thanks @jrtc27 for finding the problem and providing a patch.
The result of this dyn_cast is only checked by the assert below; the rest of the code dereferences it unconditionally. cast<> would fit better here, since it asserts on a type mismatch itself.