ds_read_b128 and ds_write_b128 were recently enabled, but only behind the amdgpu-ds128 option, because the performance benefit is still unclear. However, using 128-bit loads/stores for the local address space appears to introduce regressions in tessellation shaders. It is not clear yet what is broken, but since ds_read_b128/ds_write_b128 are not enabled by default, this change simply introduces a global option and enables 128-bit accesses only when explicitly requested (until the issue is fixed or the instructions are used correctly).

Bugzilla: https://bugs.freedesktop.org/show_bug.cgi?id=105464
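To make the behaviour concrete, here is a minimal standalone sketch of this kind of gate, assuming a subtarget-style flag that defaults to off; the names EnableDS128, useDS128 and maxLocalLoadStoreBits are placeholders for illustration, not necessarily what the patch uses.

```cpp
// Standalone sketch (not the actual patch) of the gate described above:
// a per-target flag that defaults to off, consulted by the load/store
// lowering for the local (LDS) address space.
class SubtargetSketch {
  bool EnableDS128 = false; // set only when the new global option is requested

public:
  void setEnableDS128(bool V) { EnableDS128 = V; }

  // True when ds_read_b128 / ds_write_b128 may be selected.
  bool useDS128() const { return EnableDS128; }
};

// Widest DS access the lowering is allowed to form for local memory:
// 128-bit only behind the option, otherwise the existing 64-bit forms.
unsigned maxLocalLoadStoreBits(const SubtargetSketch &ST) {
  return ST.useDS128() ? 128 : 64;
}
```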
Details
- Reviewers
arsenm
Diff Detail
Event Timeline
test/CodeGen/AMDGPU/load-local-f32.ll:6–8
Where? Do you want new tests where the option is disabled?
test/CodeGen/AMDGPU/load-local-f32.ll:6–8
You can either just add another RUN line and checks, or another small test where it gets used or not. The separate test is probably less effort.
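A rough sketch of what that extra RUN line / separate small test could look like; the -mattr spelling (+enable-ds128), the target used, and the exact check lines are assumptions for illustration, not necessarily what the patch ends up adding.

```llvm
; RUN: llc -march=amdgcn -mcpu=tonga -mattr=+enable-ds128 -verify-machineinstrs < %s | FileCheck -check-prefix=DS128 %s
; RUN: llc -march=amdgcn -mcpu=tonga -verify-machineinstrs < %s | FileCheck -check-prefix=NODS128 %s

; With 16-byte alignment, the 128-bit form is expected only when the
; feature is requested; the check lines below are illustrative.
; DS128: ds_read_b128
; NODS128-NOT: ds_read_b128

define amdgpu_kernel void @local_v4f32_load(<4 x float> addrspace(1)* %out, <4 x float> addrspace(3)* %in) {
  %ld = load <4 x float>, <4 x float> addrspace(3)* %in, align 16
  store <4 x float> %ld, <4 x float> addrspace(1)* %out, align 16
  ret void
}
```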
v2:
- use a ternary operator in the return statement
- fix feature format
- add small test load-local-f32-no-ds128.ll
lib/Target/AMDGPU/AMDGPUSubtarget.h:136
Shouldn't this be initialized in the constructor?
lib/Target/AMDGPU/AMDGPUSubtarget.h:136
You are right.
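For reference, the suggested fix amounts to the usual pattern below; EnableDS128 is an assumed member name, not necessarily the declaration at line 136.

```cpp
// Minimal sketch of the suggestion: give the new flag a well-defined value
// in the constructor's initializer list so it is never read uninitialized
// before the feature string has been parsed.
class SubtargetInitSketch {
  bool EnableDS128;

public:
  SubtargetInitSketch() : EnableDS128(false) {}

  bool useDS128() const { return EnableDS128; }
};
```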
Weird formatting/capitalization. "ds_{read|write}_b128" maybe?