Currently, global SRA uses the GEP structure to determine how to split the global. This patch instead analyses the loads and stores performed on the global, collects which types are used at which offsets, and splits the global accordingly.
This is both more general and works fine with opaque pointers. It is also closer to how ordinary SROA is performed.
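To illustrate the idea, here is a minimal standalone sketch of the "collect types at offsets" step. It is not the code from this patch and does not use LLVM's APIs: `Access`, `Part`, and `collectParts` are hypothetical names, types are plain strings, and the rule that conflicting or overlapping accesses make the global unsplittable is an assumption of this toy model.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for one load or store of the global.
struct Access {
  uint64_t Offset;  // byte offset into the global
  std::string Type; // accessed type, e.g. "i32" (a string in this sketch)
  uint64_t Size;    // size of the accessed type in bytes
};

// One piece the global would be split into.
struct Part {
  std::string Type;
  uint64_t Size;
};

// Collect which type is used at which offset. Returns std::nullopt when
// the accesses disagree (two different types at one offset) or overlap,
// which this sketch assumes means the global cannot be split.
std::optional<std::map<uint64_t, Part>>
collectParts(const std::vector<Access> &Accesses) {
  std::map<uint64_t, Part> Parts;
  for (const Access &A : Accesses) {
    assert(A.Size > 0 && "zero-sized accesses are not modelled");
    auto [It, Inserted] = Parts.try_emplace(A.Offset, Part{A.Type, A.Size});
    if (!Inserted &&
        (It->second.Type != A.Type || It->second.Size != A.Size))
      return std::nullopt; // same offset, different type
  }

  // The map is ordered by offset, so one pass detects overlapping parts.
  uint64_t End = 0;
  for (const auto &[Offset, P] : Parts) {
    if (Offset < End)
      return std::nullopt; // this part starts inside the previous one
    End = Offset + P.Size;
  }
  return Parts;
}
```

In this model, an `i32` access at offset 0 and a `float` access at offset 4 give two parts, whereas an `i64` access at offset 0 together with an `i32` access at offset 4 is rejected as overlapping. No GEP structure is consulted at any point, which is why the approach is unaffected by opaque pointers.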
AFAICT this limit is more aggressive for struct types than the original code, which seems to be causing some code-size regressions. I put up D129525 to make the limit behave more like the original code.