When requesting a scalar value for a uniform VPDef, we can always return
lane 0. This avoids inserting unnecessary instructions
to duplicate the uniform value across lanes.
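As a rough illustration of the idea (not the code in this patch), the lookup can be modelled with a toy structure. `ToyDef` and `getScalar` are hypothetical names invented for this sketch and are not LLVM APIs; the sketch assumes a uniform definition only ever materializes lane 0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a vectorized definition; hypothetical, not an LLVM class.
struct ToyDef {
  bool IsUniform;              // true if every lane holds the same value
  std::vector<int> LaneValues; // one entry if uniform, VF entries otherwise
};

// Return the scalar value for the requested lane.
// For a uniform definition only lane 0 is materialized, so every lane
// request is answered from lane 0 and no per-lane copies are created.
int getScalar(const ToyDef &Def, std::size_t Lane) {
  if (Def.IsUniform) {
    assert(!Def.LaneValues.empty() && "uniform def must have lane 0");
    return Def.LaneValues[0];
  }
  assert(Lane < Def.LaneValues.size() && "lane out of range");
  return Def.LaneValues[Lane];
}
```

In this model, asking for lane 3 of a uniform definition costs nothing extra, because the answer is read from the single stored value.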
Repository: rG LLVM Github Monorepo
Event Timeline
LGTM - though, please keep in mind I'm not fully familiar with this code. You may want to wait for another reviewer.
Actually, LGTM revoked. This apparently depends on D91500 (which isn't marked in the metadata), and as commented there I don't have context.
Not blocking this review, but I think it's bug-prone to mix lane 0 of scalarized divergent values and truly uniform values that can be kept on a single scalar. Possible examples:
  top-loop:
    if %iv % VF != 0:
      inner-loop:
        %iv = [ 0, inner.ph ], [ %iv.next, inner.latch ] ; Uniform, but lane0 doesn't make much sense since it masked out.
        ...
        divergent exit condition
  bb:
    %sel = select i1 %divergent, 42, %divergent.def ; divergent in general
    use %sel
    br i1 %divergent, label %uni.use.bb, label %bb2
  uni.use.bb:
    %uni.phi = phi [ %sel, %bb ] ; "Conditionally" uniform - all active lanes have the same uniform value
    ; Long compute chain based on %uni.phi that we'd like to keep on a single scalar
In the latter case the correct extract for the uniform value would be from the first *active* lane, not from lane 0. And I believe it's very easy to make a mistake if the same data storage is used both for the scalarized parts of divergent values and for truly uniform values that should be kept on a single scalar def/register.
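The difference between lane 0 and the first active lane can be sketched with a toy model. This is only an illustration under stated assumptions: `firstActiveLane` and `extractUniform` are hypothetical helpers, not LLVM functions, the mask is modelled as a plain vector of booleans, and the value -1 stands in for an undefined lane.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: index of the first active lane, or -1 if every
// lane is masked out.
int firstActiveLane(const std::vector<bool> &Mask) {
  for (std::size_t I = 0; I < Mask.size(); ++I)
    if (Mask[I])
      return static_cast<int>(I);
  return -1;
}

// Hypothetical helper: read a conditionally uniform value from the first
// active lane rather than from lane 0. Returns false and leaves Out
// untouched when no lane is active.
bool extractUniform(const std::vector<int> &Lanes,
                    const std::vector<bool> &Mask, int &Out) {
  assert(Lanes.size() == Mask.size() && "one mask bit per lane");
  int Lane = firstActiveLane(Mask);
  if (Lane < 0)
    return false;
  Out = Lanes[static_cast<std::size_t>(Lane)];
  return true;
}
```

With lanes `{-1, 42, 42, -1}` and mask `{false, true, true, false}`, reading lane 0 yields the placeholder for an undefined value, while `extractUniform` reads lane 1 and yields 42.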
To summarize - I think it's possible to implement everything correctly by repurposing lane0 storage for keeping uniform values, but in the future, once we try to implement more complex optimizations, it might lead to confusion and omissions that cause silent miscompiles (e.g. extracting undef values from lane0 instead of extracting the required uniform values from the first active lane).
clang-tidy: error: no member named 'isUniform' in 'llvm::VPReplicateRecipe' [clang-diagnostic-error]