Initialize m0 to the default value for LDS in the entry block,
and remove the initialization around DS instruction uses.
Treat the LDS value as the default value of m0, and insert writes of the default around other m0 uses.
Spills still need to do a save/restore, since we don't know the point where the value is being spilled (and it could be spilled in a sequence involving inline asm).
This isn't an ideal solution. Unfortunately, for now this needs to add m0 as a physreg live-in to every block right after instruction selection, which is discouraged. Inserting a copy from the initial value to m0 in each block works, but misses many of the cases where we want to eliminate m0 usage. The live-ins are added too aggressively, making more defs appear alive than they really are. Better would be to always use save/restore, but the optimizations to eliminate redundant ones are missing. Also missing are optimizations to hoist the same m0 def into predecessor blocks; MachineLICM handles some of these, but it doesn't handle all loops, or diamonds and other simple control flow.

The worst code quality regressions are around SGPR spills at -O0 when using scalar stores, but I'm not sure how much of a concern that is.
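For illustration, here is a rough sketch of the entry-block initialization plus the live-in marking. This is not the actual patch: the pass name SIInitM0 is made up, and details like the subtarget class and header set may differ from what is in tree.

  #include "AMDGPU.h"
  #include "GCNSubtarget.h"
  #include "SIInstrInfo.h"
  #include "llvm/CodeGen/MachineFunctionPass.h"
  #include "llvm/CodeGen/MachineInstrBuilder.h"

  using namespace llvm;

  namespace {
  // Hypothetical pass, run right after instruction selection.
  class SIInitM0 : public MachineFunctionPass {
  public:
    static char ID;
    SIInitM0() : MachineFunctionPass(ID) {}

    bool runOnMachineFunction(MachineFunction &MF) override {
      const SIInstrInfo *TII = MF.getSubtarget<GCNSubtarget>().getInstrInfo();
      MachineBasicBlock &Entry = MF.front();

      // Write the LDS default (all ones, i.e. bounds checking disabled)
      // once at function entry instead of before every DS instruction.
      BuildMI(Entry, Entry.begin(), DebugLoc(),
              TII->get(AMDGPU::S_MOV_B32), AMDGPU::M0)
          .addImm(-1);

      // The discouraged part: m0 becomes a physreg def that has to be
      // marked live-in to every other block in the function.
      for (MachineBasicBlock &MBB : MF)
        if (&MBB != &Entry)
          MBB.addLiveIn(AMDGPU::M0);

      return true;
    }
  };
  } // end anonymous namespace

  char SIInitM0::ID = 0;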
I would say neither. We want to totally disable the bounds checking for normal code, so the MAXINT value seems like the best choice.
An address sanitizer may want to insert range-check code and pay the performance cost, but that would be on a per-variable basis, not on the entire LDS. The hardware ensures that one wave cannot corrupt the LDS of another work-group regardless of the M0 value.
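To make concrete why all ones works as a "disable" value, here is a conceptual model, assuming the hardware treats m0 as an unsigned upper bound on the DS byte address; the function below is illustrative only, not the hardware's actual definition.

  #include <cstdint>

  // Illustrative model only. With M0 = 0xFFFFFFFF (MAXINT as an unsigned
  // 32-bit value), any realistic LDS byte address compares in range, so
  // the check never fires for normal code. A sanitizer would instead emit
  // explicit per-variable range checks rather than shrinking this bound.
  static bool ldsAccessInBounds(uint32_t ByteAddr, uint32_t M0) {
    return ByteAddr < M0; // assumed unsigned compare
  }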