We were assuming a much larger per-wave visible stack allocation
than is actually possible:
https://github.com/RadeonOpenCompute/ROCR-Runtime/blob/master/src/core/runtime/amd_gpu_agent.cpp#L70
Based on this, we can assume the high 15 bits of a frame index or sret
are 0. The frame index value is the per-lane offset, so the maximum
frame index value is MAX_WAVE_SCRATCH / wavesize.
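
As a rough check of the arithmetic above, here is a minimal sketch. It
assumes MAX_WAVE_SCRATCH is 8387584 bytes and the wave size is 64; neither
number is stated in this message, and all names in the snippet are
illustrative rather than taken from the patch.

```cpp
#include <cstdint>

// Assumed inputs (not stated in this message): the per-wave scratch limit
// in bytes and the number of lanes in a wave.
constexpr uint32_t MaxWaveScratch = 8387584; // assumed; equals 8191 * 1024
constexpr uint32_t WaveSize = 64;            // assumed

// Count the leading zero bits of a 32-bit value (returns 32 for zero).
constexpr unsigned countLeadingZeros32(uint32_t V) {
  unsigned N = 0;
  for (uint32_t Mask = 0x80000000u; Mask != 0 && (V & Mask) == 0; Mask >>= 1)
    ++N;
  return N;
}

// The frame index is a per-lane offset, so its maximum value is the
// per-wave limit divided by the number of lanes.
constexpr uint32_t maxFrameIndexValue(uint32_t WaveScratch, uint32_t Lanes) {
  return WaveScratch / Lanes;
}

// Number of high bits of a 32-bit frame index that must be zero.
constexpr unsigned knownHighZeroBits(uint32_t WaveScratch, uint32_t Lanes) {
  return countLeadingZeros32(maxFrameIndexValue(WaveScratch, Lanes));
}

static_assert(maxFrameIndexValue(MaxWaveScratch, WaveSize) == 131056,
              "per-lane maximum under the assumed inputs");
static_assert(knownHighZeroBits(MaxWaveScratch, WaveSize) == 15,
              "high 15 bits are zero under the assumed inputs");
```

Under these assumed inputs the per-lane maximum fits in 17 bits, which
matches the 15 known-zero high bits mentioned above. A different wave size
would give a different count.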
Remove the corresponding subtarget feature and option that made
this configurable.