V_READFIRSTLANE and V_READLANE produce a scalar result from a vector operand. Scalar instructions consuming the result should not be executed if the exec mask is zero.
Tests passed : make llvm-check
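To make the intent of the summary concrete, here is a minimal sketch of the kind of check it describes. It is a toy model, not the code from this patch: the `Op` enum, the `Instr` struct, the function name `mustSkipOnExecZero`, and the `skipThreshold` parameter are all hypothetical names introduced for illustration. The assumption is that a short block is normally left predicated (executed with exec == 0), while a block containing a lane-read instruction is always jumped over.

```cpp
#include <cassert>
#include <vector>

// Hypothetical toy opcodes for illustration; not LLVM's real opcode enum.
enum class Op { VAlu, SAlu, VReadLane, VReadFirstLane };

struct Instr {
  Op op;
};

// Returns true if a branch on exec == 0 should jump over the block instead of
// letting it run predicated. skipThreshold is an assumed tuning knob: blocks
// with at least that many instructions are skipped regardless of contents.
bool mustSkipOnExecZero(const std::vector<Instr>& block, unsigned skipThreshold) {
  assert(skipThreshold > 0 && "threshold must be positive");
  unsigned count = 0;
  for (const Instr& instr : block) {
    // A lane read produces an SGPR value that scalar code may consume, and
    // scalar code is not masked by exec, so never run such a block predicated.
    if (instr.op == Op::VReadLane || instr.op == Op::VReadFirstLane)
      return true;
    // Long blocks are cheaper to jump over than to execute with all lanes off.
    if (++count >= skipThreshold)
      return true;
  }
  return false;
}
```

With a threshold of, say, 12, a two-instruction block of plain vector ALU operations would stay predicated, whereas the same block with a single `VReadFirstLane` in it would be skipped.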
Differential D38293
Avoid predicated execution of the basic blocks containing scalar instructions. Authored by alex-t on Sep 26 2017, 2:03 PM.
Comment Actions Test is needed.
Comment Actions
Is really used by some SALU instruction. This is necessary to avoid de-optimizing those cases where the scalar register is directly the scalar operand of a vector instruction.
Comment Actions It really makes sense to take care of the V_READFIRSTLANE/V_READLANE destination register under the exec == 0 condition when their source VGPR is redefined in the SI_MASK_BRANCH target block. Otherwise we assume that the source VGPR is defined in one of the dominating blocks and contains the correct value.
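The redefinition case can be illustrated with a small simulation. This is a sketch under stated assumptions, not hardware-accurate code: it assumes a 64-lane wave, that a vector write only updates lanes enabled in exec, and that a first-lane read falls back to lane 0 when exec is zero. The names `VGPR`, `vMov`, and `readFirstLane` are hypothetical helpers, not LLVM or ISA identifiers.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

constexpr unsigned kLanes = 64;  // assumed wave size
using VGPR = std::array<uint32_t, kLanes>;

// Toy vector move: only lanes whose exec bit is set receive the new value.
void vMov(VGPR& dst, uint32_t value, uint64_t exec) {
  for (unsigned lane = 0; lane < kLanes; ++lane)
    if (exec & (uint64_t{1} << lane))
      dst[lane] = value;
}

// Toy first-lane read: returns the value in the first active lane, or in
// lane 0 when no lane is active (assumed fallback behaviour).
uint32_t readFirstLane(const VGPR& src, uint64_t exec) {
  unsigned lane = 0;
  if (exec != 0)
    while (!(exec & (uint64_t{1} << lane)))
      ++lane;
  assert(lane < kLanes);
  return src[lane];
}
```

If the source register is redefined inside the predicated block while exec is zero, `vMov` writes nothing, so `readFirstLane` returns whatever lane 0 held before the block. If the register was instead defined in a dominating block with a non-zero exec mask, the value read may still be meaningful, which is the distinction the comment draws.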
Comment Actions I would just bail on any of these instructions without trying to optimize the case, just like you did in the beginning.
Comment Actions I want to replace this pass by always inserting the branches on execz, and have a new pass which optimizes out short jumps. Would that be easier than trying to analyze this?
This comment was removed by alex-t.
The comment is misleading. Scalar instructions are executed even if exec = 0 (contrary to the comment). It is also unclear whether there must be a scalar instruction consuming the result of readlane, since an SGPR can be an operand of a vector instruction.