No compiler can simplify this division, and
the check is done rather often.
Details
- Reviewers
  foad, Joe_Nash
- Commits
  rGe2903abc154a: [AMDGPU] Remove integer division in VOPD checks
Diff Detail
- Repository
  rG LLVM Github Monorepo
Event Timeline
NB: To have any chance of simplifying the division here, a compiler must fully unroll the loop. As the loop grows, that becomes impossible.
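For illustration only, here is a minimal sketch of the idea and not the code from the patch. All names (`Regs`, `BankCounts`, `BankMasks`, `conflictWithModulo`, `conflictWithMask`) are hypothetical, and the bank count of 4 for src0/src1 is an assumption; only the 2 banks for src2 come from this discussion. It shows a per-operand remainder whose divisor changes on every iteration, next to an equivalent mask comparison that is valid when every bank count is a power of two.

```cpp
#include <array>

// Hypothetical register indices for src0, src1, src2 of one instruction.
using Regs = std::array<unsigned, 3>;

// Assumed bank counts per source operand. The value 4 for src0/src1 is an
// assumption for this sketch; 2 for src2 is the value discussed here.
constexpr std::array<unsigned, 3> BankCounts = {4, 4, 2};

// Matching masks (count - 1). Only valid because each count is a power of two.
constexpr std::array<unsigned, 3> BankMasks = {3, 3, 1};

// Remainder-based check: the divisor differs per iteration, so without full
// unrolling the compiler has to emit a real remainder operation.
bool conflictWithModulo(const Regs &X, const Regs &Y) {
  for (unsigned I = 0; I < 3; ++I)
    if (X[I] % BankCounts[I] == Y[I] % BankCounts[I])
      return true; // both operands fall into the same bank
  return false;
}

// Mask-based check: same result, but only XOR and AND per iteration.
bool conflictWithMask(const Regs &X, const Regs &Y) {
  for (unsigned I = 0; I < 3; ++I)
    if (((X[I] ^ Y[I]) & BankMasks[I]) == 0)
      return true; // low bits agree, so the banks are the same
  return false;
}
```

Both functions report a conflict as soon as any operand pair lands in the same bank; the mask version simply avoids the division regardless of whether the loop is unrolled.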
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h | ||
---|---|---|
564 | On a separate note, I do not understand the 2 banks for src2. This should already have been checked while checking vdst parity. If there is a real src2, it must follow the regular rules for the source banks. At the moment this is simply misleading: the loop will actually exit at the first iteration if the parity is the same. It is a time bomb, but that is for a separate patch. |
I am growing to think that the update_*_test_checks scripts are a simulation of testing. Nobody seems to really think about what the tests are doing now.
llvm/lib/Target/AMDGPU/Utils/AMDGPUBaseInfo.h | ||
---|---|---|
564 | Please ignore the src2 part; I just found an additional restriction. |