Diff Detail
Event Timeline
lib/Target/AMDGPU/AMDGPUTargetTransformInfo.cpp | ||
---|---|---|
261 | |
The interleaving was disabled based on the SHOC DeviceMemory readLocalMemory test. We asked CQE to do a complete performance
measurement around this, and the results were very positive. The major reason for disabling it was the concern about register usage.
I remember that I re-measured DeviceMemory performance later, when the new waitcnt insertion was introduced, and it turned out that enabling it makes no difference for DeviceMemory readLocalMemory!
Not sure about the other tests that CQE found to benefit when it is disabled.