- User Since
- Sep 4 2018, 4:49 AM (123 w, 3 d)
Wed, Jan 13
Tue, Jan 12
Mon, Jan 11
Superseded by D94311.
- Address reviewer comments
- Change tests to use more indicative S_AND_B32
Fri, Jan 8
Thu, Jan 7
Fri, Dec 18
Dec 14 2020
Dec 13 2020
Dec 3 2020
Nov 22 2020
Nov 11 2020
Nov 10 2020
- Use/reuse a single virtual register for the live mask. This removes the need for PHIs and live mask register tracking. Assumes WQM is running non-SSA. (Supporting both non-SSA and SSA operation would bloat the code.)
- Live mask tracks all kills and demotes.
- Live mask manipulations terminate the shader if all lanes are killed, even in non-uniform control flow.
- Move all kill lowering to the WQM pass; this simplifies later passes and avoids duplication when updating the live mask. Removes the need for "clean up" operations.
- WQM pass always modifies shader if it has any kills or demotes, even if there is no WQM.
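As a rough illustration of the live mask idea described in this update, here is a minimal conceptual model in plain C++. It is not the WQM pass and uses no LLVM APIs; `LiveMaskModel`, `removeLanes`, and the 64-lane width are hypothetical names and assumptions chosen only for the sketch. It shows a single mutable mask that every kill or demote updates in place, with termination as soon as no lanes remain.

```cpp
#include <cassert>
#include <cstdint>

// Conceptual model only (hypothetical names, not LLVM code):
// one mutable live mask, one bit per lane, updated in place.
struct LiveMaskModel {
  uint64_t Live;           // lanes that have not been killed or demoted
  bool Terminated = false; // set once every lane has been removed

  explicit LiveMaskModel(uint64_t InitialLanes) : Live(InitialLanes) {
    assert(InitialLanes != 0 && "model expects at least one live lane");
  }

  // Models a kill or demote of the lanes set in KillMask.
  // Returns true if no live lanes remain, i.e. the shader should terminate.
  bool removeLanes(uint64_t KillMask) {
    Live &= ~KillMask; // same storage is reused, so no merging of values is needed
    if (Live == 0)
      Terminated = true;
    return Terminated;
  }
};
```

Because the mask lives in one location that is overwritten, the check for "all lanes gone" can follow every update, regardless of which control-flow path performed it.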
Nov 9 2020
- Add tests, but these depend on D91066
Nov 8 2020
- Address reviewer comments.
- Invert condition.
- Add tests against a symbol.
Nov 7 2020
Nov 6 2020
Nov 5 2020
Oct 30 2020
Oct 28 2020
- Remove restrictions on types of shader where early terminate can occur.
Superseded by 419168d9381959ec6850e9e87aff9d062b68ef4b
Oct 27 2020
Oct 26 2020
Oct 21 2020
Oct 20 2020
- Fix markDefs to iterate all operands of MI
- Remove fix up for SI_ELSE as this is no longer required
- Remove elimination of trivial SGPR to SGPR WWM copies (this adds cruft in atomic optimizer tests)
Thanks, I was going to make the same change, but you beat me to it.
Slight nit: since we use Width - 1 three times, for readability I think we should just declare a new variable for it (TableIdx?).
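A small sketch of what the nit suggests, under stated assumptions: the three tables, their contents, and the `lookup` function are hypothetical and only stand in for whatever the reviewed code indexes with `Width - 1`. The point is just that the index is computed once and given a name.

```cpp
#include <array>
#include <cassert>

// Hypothetical lookup tables indexed by (Width - 1), for Width in 1..4.
constexpr std::array<int, 4> SizeInBits = {32, 64, 96, 128};
constexpr std::array<int, 4> NumDwords = {1, 2, 3, 4};
constexpr std::array<int, 4> Alignment = {4, 8, 16, 16};

struct Entry {
  int Bits;
  int Dwords;
  int Align;
};

Entry lookup(unsigned Width) {
  assert(Width >= 1 && Width <= 4 && "Width out of table range");
  // Compute the index once and name it, instead of repeating Width - 1.
  const unsigned TableIdx = Width - 1;
  return {SizeInBits[TableIdx], NumDwords[TableIdx], Alignment[TableIdx]};
}
```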
Oct 19 2020
- Address review comments
Oct 18 2020
- Consistently use Register type.
Oct 17 2020
Oct 16 2020
Oct 15 2020
Oct 14 2020
- Remove peephole
- Pre-commit test
- Use std::array for map array.
Oct 13 2020
- Use std::array and tidy up initialisation.
- Fix number of rows in table.
Oct 12 2020
Having addressed the comments could I get a second quick read before I submit?
Address reviewer comments:
- Fix initialiser to use AMDGPU::NoSubRegister and not memset.
- Add comment on mapping array.
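The initialiser change can be sketched roughly as follows. This is an assumption-laden example, not the patch itself: `NoSubRegisterStandIn` is a made-up sentinel standing in for `AMDGPU::NoSubRegister`, and the table shape and helper names are hypothetical. It shows a `std::array` table filled with a named sentinel rather than cleared with `memset`, which can only write a repeated byte and hides the intended value.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Stand-in sentinel (hypothetical value); the real AMDGPU::NoSubRegister is not used here.
constexpr uint16_t NoSubRegisterStandIn = 0x0102;

constexpr unsigned NumRows = 3; // hypothetical table shape
constexpr unsigned NumCols = 4;
using Row = std::array<uint16_t, NumCols>;
using Table = std::array<Row, NumRows>;

// Fill every entry with the named sentinel; correct whatever the sentinel's value is.
Table makeEmptyTable() {
  Table T;
  for (Row &R : T)
    R.fill(NoSubRegisterStandIn);
  return T;
}

// Bounds-checked write of a single mapping entry.
void setEntry(Table &T, unsigned RowIdx, unsigned ColIdx, uint16_t Value) {
  assert(RowIdx < NumRows && ColIdx < NumCols && "index out of table range");
  T[RowIdx][ColIdx] = Value;
}
```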
Address reviewer comments.
Oct 11 2020
This has been superseded by front-end work in graphics compiler.
To motivate the peephole:
This pattern affects 2% of graphics shaders on GFX9, and nearly 7% on GFX10.
On average we save ~1.5 instructions per affected shader.
On some VulkanCTS tests the savings are much higher.
Given the relatively low gain, I assumed it was not worth introducing a new peephole pass, and took this approach to address the duplicate s_mov instructions at the point of generation (when they are cheapest to spot).
Oct 10 2020
- Merge code generation loops to avoid needing to generate work list
- Fix potential issue when all elements of copy are overwritten
- Fix test file location error
Oct 8 2020
Oct 7 2020
- Fix assumptions about SCC live intervals which are not valid late in compilation.
Oct 6 2020
- Address comments about pass insertion.
- Fix bug in removal of trivial SGPR copies from WWM.