Restrict the bit width of the largest vector type formed by the load/store
vectorizer (LSV) to 128 for the buffer and constant address spaces.
This avoids a potential SGPR pressure increase in shaders where multiple
resources are used. There is not enough context in LSV to determine whether
forming large vectors is beneficial for performance, and currently there is no
late phase in the compiler that would split vectors if register pressure were
too high (it could be argued that one should be added).
The extra-large loads/stores could still be formed late in the backend in
si-load-store-optimizer, which also has some logic to avoid unbounded register
pressure increases, with the exception of s_load_dwordx16, which is not formed
there. s_load_dwordx16 is a tricky instruction to get right anyway, because
it can cause massive register pressure and fragmentation.
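
For illustration only, the cap described above can be sketched as a small
standalone function. This is not the code in this change: the AddrSpace enum,
the maxVectorBitsFor name, and the 512-bit default used in the checks are all
hypothetical, and the real AMDGPU address-space numbering differs.

```cpp
#include <cassert>

// Hypothetical address-space tags (assumption: not the backend's real values).
enum class AddrSpace { Global, Constant, Buffer, Private };

// Hypothetical helper: widest vector, in bits, that the vectorizer may form.
// Buffer and constant accesses are capped at 128 bits; every other address
// space keeps whatever default the caller passes in.
constexpr unsigned maxVectorBitsFor(AddrSpace AS, unsigned DefaultBits) {
  return ((AS == AddrSpace::Constant || AS == AddrSpace::Buffer) &&
          DefaultBits > 128)
             ? 128u
             : DefaultBits;
}

// Compile-time checks, assuming a 512-bit default for the uncapped case.
static_assert(maxVectorBitsFor(AddrSpace::Constant, 512) == 128, "capped");
static_assert(maxVectorBitsFor(AddrSpace::Buffer, 512) == 128, "capped");
static_assert(maxVectorBitsFor(AddrSpace::Global, 512) == 512, "unchanged");
```

A cap like this only limits what the vectorizer forms; as noted above, wider
accesses may still be produced later by si-load-store-optimizer.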