This is an archive of the discontinued LLVM Phabricator instance.

[Constants] Add "stepvector" to represent the sequence 0,1,2,3... [IR support for SVE scalable vectors 4/4]
AbandonedPublic

Authored by paulwalker-arm on Nov 24 2016, 7:25 AM.

Details

Summary
[Constants] Add "stepvector" to represent the sequence 0,1,2,3...

The sequence "<i32 0, i32 1, i32 2, i32 3...>" is represented by
a ConstantVector containing an ArrayRef of all element values.
For scalable vectors, where the element count is unknown, such a
representation is impossible.

Instead the complex constant "stepvector" can be used wherever an
integer vector is expected. This allows the majority of vector
sequences required for vectorisation to be produced for all vector
types. Example Usage:

  getelementptr i8, i8* %ptr, <n x 4 x i32> stepvector
  getelementptr i8, i8* %ptr, <4 x i32> stepvector

NOTE: The non-scalable vector case will utilise ConstantVector.
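
To make the intended semantics concrete, here is a minimal Python sketch (a model only, not LLVM code) of the lanes such a constant would hold once the hardware vector length is known. The function name `stepvector` and its `min_elts`/`vscale` parameters are illustrative assumptions; `<n x 4 x i32>` is modelled as `4 * vscale` lanes, and `vscale=1` models the fixed-width `<4 x i32>` case.

```python
def stepvector(min_elts, vscale=1):
    """Model of the proposed constant: lanes 0, 1, 2, ... for a vector
    holding min_elts * vscale elements.

    min_elts and vscale are hypothetical names for this sketch; vscale is
    only known at run time on real scalable-vector hardware.
    """
    if min_elts <= 0 or vscale <= 0:
        raise ValueError("min_elts and vscale must be positive")
    return list(range(min_elts * vscale))


if __name__ == "__main__":
    print(stepvector(4))            # fixed-width <4 x i32>
    print(stepvector(4, vscale=2))  # <n x 4 x i32> with n == 2
```

The fixed-width case can be written out element by element, which is why it can fall back to ConstantVector; the scalable case cannot, because the lane count depends on `vscale`.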

Event Timeline

paulwalker-arm retitled this revision from to [Constants] Add "stepvector" to represent the sequence 0,1,2,3... [IR support for SVE scalable vectors 4/4].
paulwalker-arm updated this object.
paulwalker-arm added reviewers: hfinkel, mkuper.
paulwalker-arm added a subscriber: llvm-commits.
fhahn added a subscriber: fhahn. Nov 24 2016, 7:27 AM
deadalnix added inline comments. Nov 29 2016, 1:26 PM
include/llvm-c/Core.h
222

Same as D27103. It is OK to add new enum values there, but existing entries need to keep the same value.

deadalnix requested changes to this revision. Nov 29 2016, 1:26 PM
deadalnix edited edge metadata.
This revision now requires changes to proceed. Nov 29 2016, 1:26 PM
rengolin edited edge metadata. Nov 30 2016, 6:07 AM

Right, this is the one I'm still unsure about. What is the value of having a whole new construct to build <0, 1, 2, ..., n-1>?

I mean, if the new construct was generic enough to represent all possible sequences (reverse, step, multiplicative, etc), then it would be a change that is guaranteed to extend to all future usages. This one is so restricted that we'll have to change again in the near future to add more functionality.

Of course, we could use this construct to build the sequences in IR. Examples:

for (0..N-1) { a[i] = i; }

would increment i (as %index) like:

%a = PHI (getelementptr, %new.a)
%index = i32 0
...
; create a step vector <0, 1, ..., N-1>
%step = constant <n x 4 x i32> stepvector
; create a splat vector with the start of the sequence
%curidx = insertelement <n x 4 x i32> undef, i32 %index, i32 0
%curidx.1 = shufflevector <n x 4 x i32> %curidx, <n x 4 x i32> undef, <n x 4 x i32> zeroinitializer
; add them together to get the current sequence
%ind = add <n x 4 x i32> %step, <n x 4 x i32> %curidx.1
; store the sequence into a[i]
store <n x 4 x i32> %ind, i32* %a
; update the index/address
%index = add i32 %index, i32 vscale
%new.a = toptr( add toint(i32* %a), mul(i32 vscale, i32 4) )

If we had an intrinsic for this, we'd be able to have any scalar evolution. Example:

@llvm.scalable_vector.step(i32 %start, i32 %a, i32 %b) ; as %start + (%a * x + %b)

The above loop would be:

%a = PHI (getelementptr, %new.a)
%index = i32 0
...
; create a step vector <0, 1, ..., N-1>
%step = @llvm.scalable_vector.step.4.i32(i32 %index, i32 1, i32 1)
; store the sequence into a[i]
store <n x 4 x i32> %step, i32* %a
; update the index/address
%index = add i32 %index, i32 vscale
%new.a = toptr( add toint(i32* %a), mul(i32 vscale, i32 4) )

And this would work for all possible induction evolutions: positive, negative, multiples, etc.
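
As a rough illustration of that claim, the following Python sketch models the proposed intrinsic by taking the formula in the comment literally: lane x holds start + (a * x + b). The function name `scalable_step` and the explicit `lanes` argument are assumptions made for this sketch only; in IR the lane count would come from the vector type and `vscale`.

```python
def scalable_step(start, a, b, lanes):
    """Model of the suggested intrinsic: lane x holds start + (a * x + b).

    All names here are hypothetical; `lanes` stands in for the run-time
    element count of the scalable vector.
    """
    if lanes <= 0:
        raise ValueError("lanes must be positive")
    return [start + (a * x + b) for x in range(lanes)]


if __name__ == "__main__":
    print(scalable_step(0, 1, 0, 4))    # positive unit step
    print(scalable_step(10, -1, 0, 4))  # negative step
    print(scalable_step(0, 2, 1, 4))    # multiple of the lane index plus an offset
```

Note that under this literal reading `start` and `b` are both lane-independent offsets, so a start-plus-step form carries the same information.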

Maybe going with ax+b is a bit too far, but certainly a constant step (plus/minus) and a starting point would be a great advantage to the notation, and it wouldn't (for now) change the IR to introduce a concept that is far too localised to be worth breaking compatibility.

Hope this makes sense...

cheers,
--renato

After reading the SVE docs, I realised that what I requested here (start + step) is exactly what SVE has for the INDEX instruction. I don't think that having a constant step in this way makes sense, even for SVE.

I mean, one could write:

INDEX z0.s, #1, #2

as

%a = splat <n x 4 x i32>, i32 2
%b = mul <n x 4 x i32> stepvector, %a

But what about when start != 1?

I retain the position that an intrinsic here would be much better with start/step and would minimise the changes to IR, at least for now.

cheers,
--renato

An arbitrary start value can be expressed by adding a splat of the start value to the step vector sequence. Composing that with the multiplication of the stepvector to achieve arbitrary step values covers all use cases of seriesvector.
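
A minimal Python sketch of that composition, assuming vectors are modelled as plain lists and that `lanes` stands in for the run-time element count: the helper names `splat`, `stepvector` and `index_via_stepvector` are hypothetical and only mirror the add/mul/splat steps described above.

```python
def splat(value, lanes):
    """Broadcast a scalar to every lane."""
    return [value] * lanes


def stepvector(lanes):
    """Lanes 0, 1, 2, ..., lanes - 1."""
    return list(range(lanes))


def index_via_stepvector(start, step, lanes):
    """Build start, start + step, start + 2*step, ... from a step vector.

    Mirrors: mul(stepvector, splat(step)) followed by add(splat(start), ...).
    """
    scaled = [s * e for s, e in zip(splat(step, lanes), stepvector(lanes))]
    return [a + b for a, b in zip(splat(start, lanes), scaled)]


if __name__ == "__main__":
    # Same lanes as "INDEX z0.s, #1, #2" on a 4-lane vector.
    print(index_via_stepvector(1, 2, 4))
```

With start 1 and step 2 this produces the same lanes as the INDEX example quoted earlier in the thread, which is the case the reviewer asked about.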

paulwalker-arm abandoned this revision. May 11 2017, 4:21 AM