[Constants][SVE] Represent the runtime length of a scalable vector
Needs ReviewPublic

Authored by huntergr on May 2 2017, 2:40 AM.

Details

Summary

The length of a scalable vector is unknown during compilation. When
vectorising loops the runtime length is required to update induction
variables and thus a representation is required within the IR.

This patch introduces the 'vscale' identifier to represent the scaling
factor 'n' of a scalable vector of the form '<n x #elements x ty>'.

In use, induction variable updates for scalable vectorisation become:

; i += number_of_elements(<n x 4 x i32>)
%i.next = add i32 %i, mul (i32 vscale, i32 4)
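
As a rough illustration of the update above, the following Python sketch models a loop whose step is `vscale * 4`, where `vscale` is fixed for the whole run but not known when the loop is written. The function name `scalable_sum` and the slice-based tail handling are assumptions made for this sketch; they are not part of the patch.

```python
def scalable_sum(data, vscale):
    """Sum `data` in chunks of vscale * 4 elements, mimicking <n x 4 x i32>."""
    if vscale < 1:
        raise ValueError("vscale must be at least 1")
    step = vscale * 4              # models: mul (i32 vscale, i32 4)
    total = 0
    i = 0
    while i < len(data):
        chunk = data[i:i + step]   # slicing clamps at the end of the data
        total += sum(chunk)
        i += step                  # models: %i.next = add i32 %i, step
    return total
```

The result is the same for every valid `vscale`; only the number of iterations changes.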

Picking up where Paul left off with https://reviews.llvm.org/D27103

Diff Detail

Event Timeline

huntergr created this revision. May 2 2017, 2:40 AM
rengolin added inline comments. May 2 2017, 11:05 AM
include/llvm-c/Core.h
1188

Not being in the same order as above may confuse people looking for it?

include/llvm/IR/Value.def
92

I don't get this change...

lib/IR/Constants.cpp
812

So, in theory, you can have vscale constants of different integer types, and this would only clear the ones that are the same as this one?

This sounds confusing.

aemerson added inline comments.
lib/IR/Constants.cpp
802

Indentation.

812

Yes, in the same way you can have i32 undef, i64 undef etc.

huntergr added inline comments. May 3 2017, 1:24 AM
include/llvm-c/Core.h
1188

These are already in a different order than the enum above -- Undef appears before ConstantInt there, but after it here. The indentation and the comment above it suggest a hierarchy, so I tried following that.

include/llvm/IR/Value.def
92

I added it at the end, after 'ConstantTokenNone', so needed to adjust the last marker. Since this isn't part of the C interface, I could add it before this and not need to change the markers.

rengolin edited edge metadata.

Hi Graham,

I have no more questions, but again, I'll let this one sit for other people to chime in, as it's a core change to IR.

cheers,
--renato

include/llvm-c/Core.h
1188

Makes sense.

include/llvm/IR/Value.def
92

Right, I see.

lib/IR/Constants.cpp
812

Right, makes sense.

sanjoy added a subscriber: sanjoy. May 7 2017, 3:05 PM

@echristo @chandlerc @lattner @majnemer Ping.

This is a trivial change, discussed in the past, and I'm inclined to approve.

But given that it changes IR behaviour, I want to make sure everyone is on the same page.

thanks,
--renato

chandlerc edited edge metadata. May 20 2017, 2:08 PM

@echristo @chandlerc @lattner @majnemer Ping.

This is a trivial change, discussed in the past, and I'm inclined to approve.

Uh, where was it discussed? It's entirely possible I missed it, but I can't find any consensus on any of the threads that we actually want to support runtime vector width in LLVM's IR.

The most recent thread I find on llvm-dev is from Mar 7: "[llvm-dev][RFC][SVE] Extend vector types to support SVE registers."

That thread exclusively talks about MVT and the code generator. I don't see its relevance to the IR.

Before that we have the big RFC for SVE. And there, I had suggested in November of last year to get a fresh RFC which I don't see having happened yet.

And in response to that you indicated the patches under review were just examples, not planning to be committed.

If I missed the RFC, totally my bad, but I did search for 'SVE llvm-dev' and was unable to find it, so I suspect I may not be the only one that has continued to wait for an actual follow-up RFC.

For the record, I remain unconvinced that LLVM's IR should support non-constant vector widths. I understand why some CPU vendors are interested in this, but the motivation for LLVM to support it so far is very weak, and the cost in terms of complexity to the IR and every vector-aware optimization is, in my opinion, far too high. However, I'm trying to remain open about this subject and have been awaiting a fresh RFC on llvm-dev to really dig into the motivation.

Uh, where was it discussed? It's entirely possible I missed it, but I can't find any consensus on any of the threads that we actually want to support runtime vector width in LLVM's IR.

Hi Chandler,

This has been discussed in the list and phab, and I invited people from different targets and core devs to discuss. As usual, "consensus" is formed from the people that have actually cared, but there have been enough threads and discussions. We can always have more discussions, sure, but pulling the handbrake without concrete proposals at this stage is really not fair.

For the record, I remain unconvinced that LLVM's IR should support non-constant vector widths.

Do you have an alternative implementation to scalable vectors?

I understand why some CPU vendors are interested in this, but the motivation for LLVM to support it so far is very weak, and the cost in terms of complexity to the IR and every vector-aware optimization is, in my opinion, far too high. However, I'm trying to remain open about this subject and have been awaiting a fresh RFC on llvm-dev to really dig into the motivation.

"Some CPU vendors"? Seriously?

This is not a thought experiment or an academic theoretical paper, ARM's SVE is out, many manufacturers were involved and the hardware is being developed right now, and we need to implement it somehow. RISC-V will have support for scalable vectors in the near future, not to mention GCC's ongoing implementation of the spec.

Not implementing SVE, in my opinion, is not an option. It's like not doing vector support because the new types will break scalar optimisations.

Given that this will be more common in the future, having a simple (and hopefully elegant) implementation will make it a lot easier for passes to infer semantics and keep the existing optimisations working.

The MVT types are out already, other changes are under way, and I'm really surprised at such comments at this stage.

--renato

lattner edited edge metadata. May 20 2017, 3:35 PM

@echristo @chandlerc @lattner @majnemer Ping.

This is a trivial change, discussed in the past, and I'm inclined to approve.

Uh, where was it discussed? It's entirely possible I missed it, but I can't find any consensus on any of the threads that we actually want to support runtime vector width in LLVM's IR.

...

For the record, I remain unconvinced that LLVM's IR should support non-constant vector widths

I agree with Chandler on this. Making something a first class type in LLVM has wide reaching effects, including requiring the ability to load/store/phi the value, pass it by argument, etc. Vikram had a research project many many years ago to do the exact same sort of thing. It failed because of these and other reasons.

What are the semantics of select when the two vectors have different width? Does store do a memory allocation?

-Chris

What are the semantics of select when the two vectors have different width? Does store do a memory allocation?

Maybe I misunderstood, but won't those selects be ill-typed?

lib/IR/Constants.cpp
812

Is there a minimum width, or is (say) an i1 vscale allowed? If there isn't a minimum, I presume the semantics is that the runtime value of vscale will be truncated to the type width?

hfinkel edited edge metadata. May 20 2017, 5:15 PM

What are the semantics of select when the two vectors have different width? Does store do a memory allocation?

Maybe I misunderstood, but won't those selects be ill-typed?

That is my understanding as well; the model is that there is one underlying vector width, we just don't know at compile time what it is. In any case, is there an overall LangRef patch? It might be best to clear up the semantics with an overall patch (even if we commit the changes in pieces along with the implementation).

I've only lightly read the spec, but it looks like the vector length can be controlled by writing to the ZCR_ELn registers (so, e.g. user code could make a syscall to change the vector length)? If that's accurate, I think a constant vscale is not sufficient.

I agree with Chandler on this. Making something a first class type in LLVM has wide reaching effects, including requiring the ability to load/store/phi the value, pass it by argument, etc.

I don't think anyone disagrees with that point.

Vikram had a research project many many years ago to do the exact same sort of thing. It failed because of these and other reasons.

This is not a research project, it's a spec that is being made into hardware by a large number of large corporations, and it has been in development for a number of years.

There are already a number of production compilers (including LLVM based) that have SVE implemented in them.

I just want to make sure that we take the right *technical* decision now and evolve as needed. From my point of view, making scalable vectors a native and core type of IR is the only way forward, because the semantics needs to be ingrained in the language to make sense. AFAICS, at the IR level, the differences between vectors and scalable ones is not that big a deal, certainly not bigger than vectors versus scalar.

What are the semantics of select when the two vectors have different width?

As Sanjoy said, it's illegal behaviour. Vector length is a CPU runtime property, not a vector property. Vectors don't have length.

The IR should consider a scalable vector as a "promise of at least one iteration, but potentially more" of the same computation. The mask vectors do the rest of the job of making sure the semantics stays the same.

Does store do a memory allocation?

I'm not sure what you mean by that. Loads and stores do nothing more than what AVX512 (with scatter/gather) already does.

The main difference is that all operations (even non-scatter/gather) are predicated, so that there is absolute control over undefined behaviour. The predicate vector is updated from the scalar evolution of the iteration ranges and they control what the actual operations can and can't do, irrespective of the vector size.

cheers,
--renato

rengolin added inline comments. May 21 2017, 4:38 AM
lib/IR/Constants.cpp
812

The vscale does not define the vector length. That is defined by the CPU (via a status register) at runtime.

The *exact* same code can run in one process with length = 10 and another with length = 1. In theory, the same binary could run one instruction with 10 and the very next with 1 (that'd be crazy, but valid).

However, one instruction being executed by the unit will operate on identical lengths. I.e. you can't have two vectors of different sizes on the same "add". AFAIK this is not just illegal, it's theoretically impossible, given where that information comes from.

What's illegal (and probably traps) is if you set the status register to a value that is larger than the actual physical length, but that will never be generated by the compiler (which has no business setting the length at all), so it's not something the compiler should worry about.

I've only lightly read the spec, but it looks like the vector length can be controlled by writing to the ZCR_ELn registers (so, e.g. user code could make a syscall to change the vector length)?

As I explained to Hal on his comment, that is correct but doesn't have the effect you're expecting.

Vectors don't have length, they have the "idea that they may have length", and it's up to the CPU to control that.

Just to be clear, the example you propose has no effect on the notion of length:

// SVE length defined at boot time to be 4
...
add z0.s, p0/m, z0.s, z1.s // z0+=z1 only where the predicate p0 is valid, which here is "up-to" 4 vector lengths
...
svc ... // Try to change vector length to 8, assuming this works
...
add z0.s, p0/m, z0.s, z1.s // z0+=z1 only where the predicate p0 is valid, which here is "up-to" 8 vector lengths

In the case above, the change in length applies to both z0 and z1, as well as p0 and all the other SVE vectors uniformly.

Using other SVE instructions, the predicate p0 will be built to go "up-to" the end of the array/memory as the semantics allows in IR, either compiler-constant or runtime check, so there's no compiler-generated undefined behaviour.
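
A minimal Python sketch of this predication idea follows. The helper `while_lt` is loosely modelled on SVE's WHILELT instruction, and both function names and the `lanes` parameter are assumptions made for illustration rather than anything defined in this patch.

```python
def while_lt(start, limit, lanes):
    """Build a predicate: lane k is active only if start + k < limit."""
    return [start + k < limit for k in range(lanes)]


def predicated_add(a, b, lanes):
    """Return a copy of `a` with b added element-wise, `lanes` elements at a time."""
    if lanes < 1:
        raise ValueError("lanes must be at least 1")
    if len(b) < len(a):
        raise ValueError("b must be at least as long as a")
    out = list(a)
    n = len(a)
    i = 0
    while i < n:
        p0 = while_lt(i, n, lanes)      # predicate covers only in-bounds lanes
        for k, active in enumerate(p0):
            if active:                  # inactive lanes are never read or written
                out[i + k] += b[i + k]
        i += lanes
    return out
```

For a five-element array, `lanes = 4` and `lanes = 8` give the same result, because the predicate masks off every lane past the end of the array.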

If that's accurate, I think a constant vscale is not sufficient.

The main problem here is one of representation. In the ARM implementation, SVE vectors alias with SIMD vectors, so you need to be careful on how you write to them.

If you don't have a way to separate SVE from SIMD, you'll have trouble generating either code. If you separate them completely, you'll have trouble worrying about the aliasing.

Having a flag (even as boolean "i1 vscale") is enough. It needs to be a constant because of how scalar evolution will work on the predicate vectors, but I'll let Graham explain that in more depth, as I'm only "familiar" at this point.

cheers,
--renato

This is not a research project, it's a spec that is being made into hardware by a large number of large corporations, and it has been in development for a number of years.

I'm familiar with it: I was involved in the work at apple (years ago) which contributed to SVE happening with this design.

I just want to make sure that we take the right *technical* decision now and evolve as needed.

This is obviously correct.

From my point of view, making scalable vectors a native and core type of IR is the only way forward, because the semantics needs to be ingrained in the language to make sense. AFAICS, at the IR level, the differences between vectors and scalable ones is not that big a deal, certainly not bigger than vectors versus scalar.

This is much more up for debate. Details matter. Compilers are a set of engineering tradeoffs. You need to explain and defend your position carefully, not just state it as though it were "obviously true".

Do you have a discussion somewhere of the overall design of the extensions you're proposing? This is presumably only a small piece of it.

Does store do a memory allocation?

I'm not sure what you mean by that. Loads and stores do nothing more than what AVX512 (with scatter/gather) already does.

You haven't proposed a set of IR type system extensions. If you're proposing a new first class type, one which represents a vector of an unknown length, then you'll need to be able to load and store it, e.g. to spill a virtual register. How much stack space is required for that spill?
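
To make the spill question concrete, here is a small Python sketch. It assumes that a spill slot for a scalable vector needs `vscale` times the size of the fixed part of the type, so the size is a runtime quantity; the helper name `spill_bytes` is hypothetical.

```python
def spill_bytes(min_elements, element_bits, vscale):
    """Bytes needed to spill one <n x min_elements x iN> value for a given vscale."""
    if vscale < 1:
        raise ValueError("vscale must be at least 1")
    fixed_bits = min_elements * element_bits   # size of the part known at compile time
    return vscale * fixed_bits // 8            # the multiplier is only known at run time
```

Under that assumption, `<n x 4 x i32>` needs 16 bytes when `vscale` is 1 and 256 bytes when `vscale` is 16, so the frame layout cannot be a compile-time constant.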

The IR should consider a scalable vector as a "promise of at least one iteration, but potentially more" of the same computation. The mask vectors do the rest of the job of making sure the semantics stays the same.
The main difference is that all operations (even non-scatter/gather) are predicated, so that there is absolute control over undefined behaviour. The predicate vector is updated from the scalar evolution of the iteration ranges and they control what the actual operations can and can't do, irrespective of the vector size.

Seriously, I happen to be very familiar with the hardware/implementation model of this instruction set extension.

The thing that matters here is the specific set of IR extensions you're proposing. In this patch you're proposing introducing a vscale constant. This doesn't make sense, because it cannot be used in (e.g.) a global variable initializer, and otherwise doesn't fit the model for constants. Why not use an intrinsic to return this value?

-Chris

From my point of view, making scalable vectors a native and core type of IR is the only way forward, because the semantics needs to be ingrained in the language to make sense. AFAICS, at the IR level, the differences between vectors and scalable ones is not that big a deal, certainly not bigger than vectors versus scalar.

This is much more up for debate. Details matter. Compilers are a set of engineering tradeoffs. You need to explain and defend your position carefully, not just state it as though it were "obviously true".

I thought that was clear from my previous response to you: "I don't think anyone disagrees with that point."

I'm certainly not asking people to "trust me, I'm an engineer". We had previous discussions in the list, other people chimed in. This is not even my patch and I have nothing to do with their work... "I just work here".

I'm also certainly not asserting that I know all the answers and that this change is worry free, by any means, and that's precisely why I pinged more people to give their opinions, because I was uncomfortable with the low amount of reviews. But it was certainly not "just me".

Do you have a discussion somewhere of the overall design of the extensions you're proposing? This is presumably only a small piece of it.

There were discussions on the list, a "whole-patch approach" in Github and some previous patches. I can't find any of it (my mail client - gmail - is a mess). I'll let Graham cover that part.

You haven't proposed a set of IR type system extensions. If you're proposing a new first class type, one which represents a vector of an unknown length, then you'll need to be able to load and store it, e.g. to spill a virtual register. How much stack space is required for that spill?

Ah, right. I'll let Graham take that one, as they have done this already in LLVM.

Seriously, I happen to be very familiar with the hardware/implementation model of this instruction set extension.

I meant no offence.

The thing that matters here is the specific set of IR extensions you're proposing. In this patch you're proposing introducing a vscale constant.

s/me/Graham/, this is not my patch and I had no part in any of this.

This doesn't make sense, because it cannot be used in (e.g.) a global variable initializer, and otherwise doesn't fit the model for constants. Why not use an intrinsic to return this value?

IIUC, it can be used for global variable initialiser via SVE splats using another new construct (stepvector) they want to introduce (about which I had my own concerns).

I think the "stepvector" idea is limiting and could possibly start as an intrinsic, but the vscale is not really the actual scale, just a flag that it is scalable, so it wouldn't have more problems than using some intrinsic.

It'll still need a new register class in ISel and the register allocation, and it will still need to understand the aliasing rules, in the same way as we currently have for VFP and NEON.

If the vscale could make the stack size variable, so would an intrinsic. Or maybe I'm just not understanding the problem.

cheers,
--renato

Just to be clear, I'm not *against* the idea of an intrinsic, nor am I pushing this patch for any personal/professional agenda. I hope I have made that perfectly clear in my previous reviews of the same patches.

I just want the best technical overall solution, and this particular one seems fine to me. I may be absolutely wrong, and that's perfectly fine, but we need a solution for this, even if we have to start with intrinsics and move to IR changes.

--renato

I've only lightly read the spec, but it looks like the vector length can be controlled by writing to the ZCR_ELn registers (so, e.g. user code could make a syscall to change the vector length)?

As I explained to Hal on his comment, that is correct but doesn't have the effect you're expecting.

Which comment? FWIW, I didn't see a particular response.

I've only lightly read the spec, but it looks like the vector length can be controlled by writing to the ZCR_ELn registers (so, e.g. user code could make a syscall to change the vector length)? If that's accurate, I think a constant vscale is not sufficient.

I'm also under the impression that this won't work because it would interfere with any ongoing vector calculations, spill code, etc. The point being that it is fixed for a particular process once the process begins (at least in practice).

I've only lightly read the spec, but it looks like the vector length can be controlled by writing to the ZCR_ELn registers (so, e.g. user code could make a syscall to change the vector length)?

As I explained to Hal on his comment, that is correct but doesn't have the effect you're expecting.

Vectors don't have length, they have the "idea that they may have length", and it's up to the CPU to control that.

Just to be clear, the example you propose has no effect on the notion of length:

// SVE length defined at boot time to be 4
...
add z0.s, p0/m, z0.s, z1.s // z0+=z1 only where the predicate p0 is valid, which here is "up-to" 4 vector lengths
...
svc ... // Try to change vector length to 8, assuming this works
...
add z0.s, p0/m, z0.s, z1.s // z0+=z1 only where the predicate p0 is valid, which here is "up-to" 8 vector lengths

Let me try to give an example. Say we have code like (quasi-llvm syntax):

// vector length is 4
%v0 = load <4 x vscale x i8>, <4 x vscale x i8>* %ptr0
svc ... // Try to change vector length to 8, assuming this works
%ptr1 = %ptr0 + vscale
%v1 = load <4 x vscale x i8>, <4 x vscale x i8>* %ptr1
%v2 = add %v0, %v1

I have two questions here:

[edit: I just saw Hal's reply -- if we *disallow* mid-process changes to the vector length, then things are much simpler, but that needs to be documented.]

  • What are the semantics of add %v0, %v1? As far as I can tell, the two vectors have "different" vector lengths, since one was created before the resize and the other was created after. I know the *registers* will have the same size, but as I understand it, one of the 32-byte registers will have 16 elements, while the other one will have 32. The bit that's worrying me here is that if we allow resizing operations then things like shufflevector (say) will have to be ordered with respect to unknown calls.
  • Whether %ptr1 is computed before or after the syscall gives the program different semantics since it will either be 4 or 8 bytes after %ptr0. Does this mean we will have to order %ptr1 (a GEP) with respect to unknown function calls?
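
The ordering concern in the second question can be sketched in Python as follows. The `Machine` class, its `set_vscale` method and the `ptr1_offset` helper are hypothetical stand-ins for a CPU whose vector length could change mid-process; nothing here is an API from the patch.

```python
class Machine:
    """Toy CPU state holding a vector scaling factor that a call may change."""

    def __init__(self, vscale):
        self.vscale = vscale

    def set_vscale(self, new_vscale):
        # Stands in for the syscall in the example above.
        self.vscale = new_vscale


def ptr1_offset(machine, new_vscale, read_before_call):
    """Return the offset used for %ptr1, depending on when vscale is read."""
    if read_before_call:
        offset = machine.vscale          # vscale sampled before the resize
        machine.set_vscale(new_vscale)
    else:
        machine.set_vscale(new_vscale)
        offset = machine.vscale          # vscale sampled after the resize
    return offset
```

Starting from a `vscale` of 4 and resizing to 8, the two orderings yield offsets of 4 and 8 respectively. If `vscale` is a constant for the lifetime of the process, the ordering stops mattering.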

If that's accurate, I think a constant vscale is not sufficient.

The main problem here is one of representation. In the ARM implementation, SVE vectors alias with SIMD vectors, so you need to be careful on how you write to them.

If you don't have a way to separate SVE from SIMD, you'll have trouble generating either code. If you separate them completely, you'll have trouble worrying about the aliasing.

I'm not sure how SIMD etc. is related to what I asked.

Having a flag (even as boolean "i1 vscale") is enough. It needs to be a constant because of how scalar evolution will work on the predicate vectors, but I'll let Graham explain that in more depth, as I'm only "familiar" at this point.

Hm? I was under the impression that vscale was supposed to help offset induction variables (and things like that) by the right amount. How would you do that with an i1 vscale?

lib/IR/Constants.cpp
812

What I meant to say is, say I have code like:

for (iN i = 0; i < L; i += (iN vscale)) {
  load scaled vector from &a[i];
  ...
}

Does N have to be greater than some value for the loop above to make sense? For instance, if the vector length in the CPU is set to 32 then N = 2 clearly does not make sense -- i2 32 is just i2 0. If there is such a restriction, then it needs to be documented.
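
The truncation concern can be checked with a short Python sketch. It assumes that an `iN vscale` simply keeps the low N bits of the runtime value, which is the semantics being asked about rather than something this patch documents; both function names are made up for illustration.

```python
def truncate(value, bits):
    """Keep only the low `bits` bits of `value`, like truncating to iN."""
    if bits < 1:
        raise ValueError("bits must be at least 1")
    return value & ((1 << bits) - 1)


def loop_makes_progress(vscale, bits):
    """True if `i += (iN vscale)` advances the induction variable at all."""
    return truncate(vscale, bits) != 0
```

With a runtime value of 32 and N = 2 the step becomes 0, so the loop would never advance, whereas N = 8 leaves the step at 32.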

As I explained to Hal on his comment, that is correct but doesn't have the effect you're expecting.

Which comment? FWIW, I didn't see a particular response.

Sorry, not yours, Sanjoy's:

https://reviews.llvm.org/D32737#inline-290542

I'm also under the impression that this won't work because it would interfere with any ongoing vector calculations, spill code, etc. The point being that it is fixed for a particular process once the process begins (at least in practice).

Range calculations won't have to bother with the scale of the vector, as I'll try to explain on Sanjoy's reply later on.

Spill code may be problematic (variable stack), but that's a problem that I'm not sure we can fix with any notation.

There is a lot of discussion here that I really don't think should be on a patch review. It should be an llvm-dev thread. See below.

From my point of view, making scalable vectors a native and core type of IR is the only way forward, because the semantics needs to be ingrained in the language to make sense. AFAICS, at the IR level, the differences between vectors and scalable ones is not that big a deal, certainly not bigger than vectors versus scalar.

This is much more up for debate. Details matter. Compilers are a set of engineering tradeoffs. You need to explain and defend your position carefully, not just state it as though it were "obviously true".

I thought that was clear from my previous response to you: "I don't think anyone disagrees with that point."

I'm certainly not asking people to "trust me, I'm an engineer". We had previous discussions in the list, other people chimed in. This is not even my patch and I have nothing to do with their work... "I just work here".

I'm also certainly not asserting that I know all the answers and that this change is worry free, by any means, and that's precisely why I pinged more people to give their opinions, because I was uncomfortable with the low amount of reviews. But it was certainly not "just me".

Do you have a discussion somewhere of the overall design of the extensions you're proposing? This is presumably only a small piece of it.

There were discussions on the list, a "whole-patch approach" in Github and some previous patches. I can't find any of it (my mail client - gmail - is a mess). I'll let Graham cover that part.

Yes, but in that discussion, I specifically asked for a new, fresh RFC thread that reflected substantial changes made to the design presented in the first RFC email through the discussion. That thread, AFAICT, never happened. If I missed it, I'm happy to get a pointer to it. If not, the author of this patch should start it. Either way, that is where this discussion should take place. I think there remain a lot of important technical issues here. I would like to dig into them, but I *don't* want to do it here where I don't even have the complete design.

I would encourage everyone (Hal, Sanjoy, etc) to hold off on debating "how does X work" here and redirect that to the (either existing or eventual) llvm-dev thread with an updated overall design.

Put differently, this patch makes sense *once* we clearly have consensus on llvm-dev. So far, the only thread I can find did not reach any meaningful consensus. Notably, none of the code generator people were heavily contributing to that thread, and there remain large unmentioned technical concerns (stack spills, alloca size, etc)

I've only lightly read the spec, but it looks like the vector length can be controlled by writing to the ZCR_ELn registers (so, e.g. user code could make a syscall to change the vector length)? If that's accurate, I think a constant vscale is not sufficient.

@rengolin is correct. The only sensible way to model this feature is with the vector length being a (load time) constant. Changing while a process is executing is not a useful thing to model or worry about. I still think this is better modeled with an intrinsic that returns the value, rather than an llvm::Constant.

Put differently, this patch makes sense *once* we clearly have consensus on llvm-dev. So far, the only thread I can find did not reach any meaningful consensus. Notably, none of the code generator people were heavily contributing to that thread, and there remain large unmentioned technical concerns (stack spills, alloca size, etc)

I agree. I'm glad I pushed this further, and I think we should get a proper discussion in the list.

In my mind, the previous discussion was "good enough", because all my questions were answered and I saw that other people weren't complaining much (so I assumed everyone was happy).

When you reported on the actual status I could see that I was probably inside a bubble. Initially, the idea was to start slow and "cross bridges when we get there", but changing the IR is really serious.

Graham,

I think we'll need an RFC that goes beyond vscale. We need to understand how constants are handled, as well as scalar evolution, spills, stacks etc. Not to work on the patches or even publish them now, just to understand a few cases in IR.

It would be easier to see the proposal, and then have a counter-proposal using intrinsics, to see what things will look like.

I know ARM was initially reluctant to use intrinsics, but as I said before on previous reviews, they allow us to model the behaviour before any hard changes to IR. If we can't reach a consensus now, it'd probably be better to go that way for now and change as it becomes clearer what to do in IR.

In the long run, I still think we should support scalable vectors native in IR, but that can wait until everyone understands the actual semantics.

cheers,
--renato

Hi all,

There are good questions here which we'll enumerate and answer individually once we send out a new RFC to llvm-dev.

The reason that we didn't send out a new RFC to the list and instead sent patches was because the basic idea of our SVE implementation was fundamentally unchanged in terms of the modified llvm::VectorType. The other elements of our implementation didn't affect this core concept, e.g. whether we used an instruction like elementcount or our new vscale constant to deal with runtime vector lengths, likewise with the stepvector constant vs seriesvector instruction.

To clarify again, from the compiler's perspective we can assume that the VL is constant but unknown. If you have any other questions ping them to us and we'll try to answer them as part of the new RFC.

Amara

fhahn added a subscriber: fhahn. May 22 2017, 2:28 AM
pekka added a subscriber: pekka. May 23 2017, 7:04 AM