This is an archive of the discontinued LLVM Phabricator instance.

Differential D32530

[SVE][IR] Scalable Vector IR Type
ClosedPublic

Authored by huntergr on Apr 26 2017, 4:12 AM.

Details

Reviewers

rengolin
lattner
echristo
chandlerc
hfinkel
rkruppe
samparker
SjoerdMeijer
greened
sebpop
efriedma

Commits

rGf4fc01f8dd3a: [SVE][IR] Scalable Vector IR Type
rL361953: [SVE][IR] Scalable Vector IR Type

Summary

Adds a 'scalable' flag to VectorType
Adds an 'ElementCount' class to VectorType to pass (possibly scalable) vector lengths, with overloaded operators.
Modifies existing helper functions to use ElementCount
Adds support for serializing/deserializing to/from both textual and bitcode IR formats
Extends the verifier to reject global variables of scalable types
Adds unit tests
Updates documentation

See the latest version of the RFC here: http://lists.llvm.org/pipermail/llvm-dev/2018-July/124396.html
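
To make the 'ElementCount' idea in the summary concrete, here is a minimal standalone sketch. It is not the code from the patch: the field names Min and Scalable are taken from the review comments below, while the exact set of operators and the exact-division assertion are assumptions made for illustration.

```cpp
#include <cassert>

// Hypothetical stand-in for the ElementCount described in the summary.
// Only the field names (Min, Scalable) come from the review thread.
struct ElementCount {
  unsigned Min;  // Minimum number of elements.
  bool Scalable; // If true, the runtime count is Min * vscale.

  ElementCount(unsigned Min, bool Scalable) : Min(Min), Scalable(Scalable) {}

  // Scaling the count never changes whether the vector is scalable.
  ElementCount operator*(unsigned RHS) const { return {Min * RHS, Scalable}; }

  ElementCount operator/(unsigned RHS) const {
    // Assumption of this sketch: only exact divisions are meaningful.
    assert(RHS != 0 && Min % RHS == 0 && "division must be exact");
    return {Min / RHS, Scalable};
  }

  bool operator==(const ElementCount &RHS) const {
    return Min == RHS.Min && Scalable == RHS.Scalable;
  }
  bool operator!=(const ElementCount &RHS) const { return !(*this == RHS); }
};
```

With something like this, helpers that halve or double the number of elements can carry the scalable flag along without special-casing it.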

Event Timeline

huntergr created this revision. Apr 26 2017, 4:12 AM

Herald added subscribers: tschuett, mgorny, mehdi_amini. Apr 26 2017, 4:12 AM

Hi Graham, thanks for this work.

I think you have addressed all points in the previous review and I'm happy with the patch. It's very concise and simple.

We should let it simmer for a while to make sure no one else has any comments, as this is a core change in IR, and we should get as many eyes on it as possible.

I'm adding a few more people as reviewers and I hope the CC'd people can also have a look.

cheers,
--renato

lib/IR/Type.cpp
603	Why not `: SequentialType(VectorTyID, ElType, EC.Min), Scalable(EC.Scalable) { }`?

Improved constructor based on Renato's suggestion. Thanks.

huntergr marked an inline comment as done. May 2 2017, 2:50 AM

sanjoy added a subscriber: sanjoy. May 7 2017, 3:44 PM

Changed textual IR format to match Chris's suggestion from the mailing list.

I have also changed vscale and stepvector to be intrinsics, but as that makes those patches just an addition to Intrinsics.td I won't post them yet.

Rough plan is to create a patch series to get minimal legalization and codegen working after enough of Sander de Smalen's MC patches have been committed; hopefully that will provide more context.

Alongside that I'll be looking at all the places the size of a value is queried to see if there's any problems using a pair of scaled and unscaled byte sizes.

Updated RFC including a list of patches for simple codegen using this extension has been posted: http://lists.llvm.org/pipermail/llvm-dev/2018-June/123780.html

Herald added subscribers: steven_wu, rkruppe. Jun 5 2018, 6:36 AM

rogfer01 added a subscriber: rogfer01. Jun 8 2018, 2:35 AM

I think the discussion on the mailing list has reached agreement on this type being suitable for both SVE and RVV; any review comments on the code or tests?

Herald added a subscriber: dexonsmith. Jul 12 2018, 3:11 AM

I just realized the updated RFC doesn't touch on the issue at all, but I think it's safe to say we won't support globals of scalable vector type? Those seem impossible to implement in a sensible way for RISC-V, and if my memory and quick skim-reading are correct, they aren't part of the SVE C language extensions either. If that's correct, I'd expect the verifier to reject global variables whose type is a scalable vector.

Indeed, we shouldn't allow scalable vectors to be globals. I've added a check for that in the verifier, plus unit tests and a small update to the langref. Thanks.

fhahn added a subscriber: fhahn. Jul 16 2018, 4:22 AM

Thank you! I took another look and found two nits, sorry for not pointing them out earlier.

Other than those nits I'd say "LGTM", but note that I currently have neither commit access nor even any accepted upstream patches, so a LGTM from me probably isn't enough to commit.

include/llvm/IR/DerivedTypes.h
513	Nit: This restriction on the range of NumElements is very reasonable, but we should make it a proper invariant of the type and enforce it in `VectorType::get` rather than post-hoc in some accessors.
lib/IR/LLVMContextImpl.h
1313	Nit: have you considered `std::pair<bool, unsigned>` instead of manually bit-packing it into 64 bits? DenseMap should support nested pairs, the size should be the same (except if unsigned is 64 bit, which I don't believe we support and which is extremely niche anyway), and it would simplify `VectorType::get` a little bit.

dcaballe added a subscriber: dcaballe. Jul 17 2018, 9:26 AM

greened added a subscriber: greened. Jul 18 2018, 10:33 AM

huntergr added reviewers: samparker, SjoerdMeijer. Jul 19 2018, 7:47 AM

huntergr added inline comments. Jul 19 2018, 8:00 AM

include/llvm/IR/DerivedTypes.h
513	VectorType::get already has this restriction, at least on most platforms -- the argument to it is 'unsigned', which is usually 32 bits. The ElementCount struct also uses 'unsigned'. It may be worth changing it to an explicit uint32_t. I think VectorType originally stored the number of elements as a 32 bit field so was consistent with the interface, but at some point the different SequentialType variants were changed to unify with a single 64 bit size field in the parent class. I will create an updated patch to check that we don't overflow when using methods like getDoubleElementsVectorType; good catch.

samparker added a subscriber: kristof.beyls. Jul 20 2018, 2:41 AM

rkruppe added inline comments. Jul 20 2018, 4:34 AM

include/llvm/IR/DerivedTypes.h
513	Oh, right, I was under the mistaken impression that VectorType::get took uint64_t. So the potential overflow in getDoubleElementsVectorType is pre-existing, but if you want to take the time to harden against it now, that's great. I don't think changing unsigned to uint32_t is worth the churn, at least not in this patch.
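
For readers following the overflow point, a minimal sketch of the kind of hardening being discussed. The helper names are hypothetical and this is not the check that ended up in the patch; it only illustrates guarding the doubling of an 'unsigned' element count.

```cpp
#include <cassert>
#include <limits>

// Hypothetical helper: true if doubling NumElts would wrap around in 'unsigned'.
bool doublingOverflows(unsigned NumElts) {
  return NumElts > std::numeric_limits<unsigned>::max() / 2;
}

// Hypothetical helper: doubles an element count, asserting that it fits.
unsigned doubleElementCount(unsigned NumElts) {
  assert(!doublingOverflows(NumElts) && "too many elements in vector type");
  return NumElts * 2;
}
```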

huntergr added inline comments. Jul 26 2018, 4:55 AM

lib/IR/LLVMContextImpl.h
1313	So I tried this, and couldn't compile it -- there's no implementation of getHashValue, getEmptyKey, getTombstoneKey, etc. for the nested pair. In our downstream compiler this is actually implemented as a dense map of a 3-element tuple against the Type*, for which we have implemented the appropriate extensions to DenseMapInfo. If that approach is preferred to bitpacking, I'll make a separate patch to implement the DenseMap extensions.

fhahn added a child revision: D47770: [MVT][SVE] Add EVT strings and Type mapping. Jul 28 2018, 8:44 AM

rkruppe added inline comments. Aug 31 2018, 10:29 AM

lib/IR/LLVMContextImpl.h
1313	I finally got a chance to look into the error you're seeing and it turns out the root cause is not nested pairs but a missing implementation of DenseMapInfo for bool. We could add that implementation, but in the future other code may also want to hash an ElementCount, e.g. VPlan may migrate the vectorization factor VF from `unsigned` to ElementCount and there are some DenseMaps with VF as key. With this in mind I'm leaning towards implementing `DenseMapInfo<VectorType::ElementCount>`, using the bit fiddling that's currently open-coded here. What do you think?

huntergr added inline comments. Sep 4 2018, 2:38 AM

lib/IR/LLVMContextImpl.h
1313	Ah, I spotted the same bug you did (missing implementation for bool), but it seems I hadn't submitted the comment I wrote. I tried it with nesting a pair of unsigned ints and that worked, but making it work directly with ElementCount seems a nicer idea, thanks.

huntergr added a child revision: D53137: Scalable vector core instruction support + size queries. Oct 11 2018, 7:03 AM

Added checks in verifier to prevent scalable vectors being included in structs or arrays
Changed lookup to use ElementCount
Moved ElementCount to a new file
More unit tests

huntergr marked an inline comment as done. Nov 2 2018, 5:23 PM

greened added a reviewer: greened. Mar 7 2019, 1:54 PM

Herald added a subscriber: jdoerfert. Mar 7 2019, 1:54 PM

This all LGTM after addressing the various comments.

simoll added a subscriber: simoll. Mar 8 2019, 9:00 AM

hsaito added a subscriber: hsaito. Mar 8 2019, 2:27 PM

willlovett added a subscriber: willlovett. Mar 20 2019, 3:23 AM

Herald added a subscriber: psnobl. Mar 20 2019, 3:23 AM

Seems fine in general, just some nits.

include/llvm/IR/DerivedTypes.h
446	Nit: Punctuation (comments should end with .)
510	Nit: Punctuation.
523	Nit: I'd like to see a similar comment in SequentialType::getNumElements.
include/llvm/Support/ScalableSize.h
27	Nit: Punctuation and capitalization (If [...])
lib/IR/Verifier.cpp
311	Nitpick: I would call this "containsScalableVectorValue", to make it clear that it doesn't just look at the top level type.
665	Could do an early return here instead of aggregating the result.
4928	Nitpick 1: This comment is going to become stale as soon as someone comes up with a non-scalable type they'd like to check. Nitpick 2: Any reason why this is called here and not in Verifier::verify?
unittests/IR/VectorTypesTest.cpp
35	I'd also check the number of elements and element size here, just to cover the ElementCount operators fully.
70	Ditto.
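
The Verifier.cpp nits above (look through aggregates, return early) can be illustrated with a toy sketch. The type model here is invented for the example and is not LLVM's Type hierarchy; only the function name follows the reviewer's suggestion.

```cpp
#include <cassert>
#include <vector>

// Toy type model for illustration only.
struct Ty {
  enum Kind { Scalar, Vector, Array, Struct };
  Kind K;
  bool Scalable;                 // Only meaningful when K == Vector.
  std::vector<const Ty *> Elems; // Element type(s) for Array and Struct.
};

// Returns true if T is, or transitively contains, a scalable vector.
bool containsScalableVectorValue(const Ty &T) {
  switch (T.K) {
  case Ty::Scalar:
    return false;
  case Ty::Vector:
    return T.Scalable;
  case Ty::Array:
  case Ty::Struct:
    for (const Ty *E : T.Elems) {
      assert(E && "aggregate element type must not be null");
      if (containsScalableVectorValue(*E))
        return true; // Early return instead of aggregating a result.
    }
    return false;
  }
  return false;
}
```

A verifier built on such a predicate could reject globals, structs and arrays with one check, since nesting is handled by the recursion.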

greened added inline comments. Mar 20 2019, 11:10 AM

docs/LangRef.rst
678	Add a similar comment about scalable vector types in aggregates?
include/llvm/Support/ScalableSize.h
7	Needs updated license.
18	Should be `LLVM_SUPPORT_SCALABLESIZE_H`.

Rebased, incorporated fixes from reviews.

Thanks for taking a look, Diana and David.

huntergr marked 12 inline comments as done. Mar 21 2019, 10:24 AM

I am accepting the scalable vector types based on the comments in
http://lists.llvm.org/pipermail/llvm-dev/2019-March/131137.html

Let's move forward with SVE support in LLVM. Thanks!

This revision is now accepted and ready to land. Mar 29 2019, 11:50 AM

In D32530#1448135, @sebpop wrote:

I am accepting the scalable vector types based on the comments in
http://lists.llvm.org/pipermail/llvm-dev/2019-March/131137.html

Let's move forward with SVE support in LLVM. Thanks!

Hi Sebastian,

Thanks!

I'll wait a couple more days for last minute feedback before committing.

In D32530#1451494, @huntergr wrote:

In D32530#1448135, @sebpop wrote:

I am accepting the scalable vector types based on the comments in
http://lists.llvm.org/pipermail/llvm-dev/2019-March/131137.html

Let's move forward with SVE support in LLVM. Thanks!

Hi Sebastian,

Thanks!

I'll wait a couple more days for last minute feedback before committing.

I'd advise caution here, it's really significant/impactful change, and a single sign-off is a bit worrying.
In particular, even regardless of the feature itself, has the implementation itself been reviewed?

In D32530#1451505, @lebedev.ri wrote:

I'd advise caution here, it's really significant/impactful change, and a single sign-off is a bit worrying.

I agree that this is a significant change and I can understand why people are a bit nervous about merging it. Would it help if we had more middle-end patches reviewed before committing, so people could have a better understanding of the impact? Off the top of my head, the rework of D35137 would be interesting, and also constant handling.

Can you also tell us what the plan is regarding all the places in the optimizer that may need updating to handle the new vectors? AFAICT, code that deals only with fixed-width vectors shouldn't be affected, but I would imagine a lot of passes would break IR using scalable vectors (e.g. by building fixed-width vectors instead of scalable ones).

In particular, even regardless of the feature itself, has the implementation itself been reviewed?

I can't speak for Sebastian, but there have been a few pairs of eyes on the implementation itself, as you can see from the comments.

In D32530#1452773, @rovka wrote:

In D32530#1451505, @lebedev.ri wrote:

I'd advise caution here, it's really significant/impactful change, and a single sign-off is a bit worrying.

There were several people who voted for a scalable vector type in the llvm-dev thread that I referred to
http://lists.llvm.org/pipermail/llvm-dev/2019-March/131137.html
Maybe those people can also sign-off to accept this change.

I am somewhat against extending the IR with the SV type
and I recognize that as we stand today adding the SV type is a good way to get ARM-SVE support in LLVM.

I agree that this is a significant change and I can understand why people are a bit nervous about merging it. Would it help if we had more middle-end patches reviewed before committing, so people could have a better understanding of the impact? Off the top of my head, the rework of D35137 would be interesting, and also constant handling.

Can you also tell us what the plan is regarding all the places in the optimizer that may need updating to handle the new vectors?

A patch series has been posted for review, see section 7. in
http://lists.llvm.org/pipermail/llvm-dev/2018-June/123780.html

In particular, even regardless of the feature itself, has the implementation itself been reviewed?

I can't speak for Sebastian, but there have been a few pairs of eyes on the implementation itself, as you can see from the comments.

Yes, I have reviewed the current patch and I have looked at the patch series.

In D32530#1453182, @sebpop wrote:

In D32530#1452773, @rovka wrote:

In D32530#1451505, @lebedev.ri wrote:

I'd advise caution here, it's really significant/impactful change, and a single sign-off is a bit worrying.

There were several people who voted for a scalable vector type in the llvm-dev thread that I referred to
http://lists.llvm.org/pipermail/llvm-dev/2019-March/131137.html
Maybe those people can also sign-off to accept this change.

I am somewhat against extending the IR with the SV type
and I recognize that as we stand today adding the SV type is a good way to get ARM-SVE support in LLVM.

I think this patch is a good start and I don't mind having it merged as is, I was just suggesting a way forward since it seems people are still hesitating. While there seems to be some consensus on moving forward with native types, I'm sure many of the details are still somewhat fuzzy. In my opinion having more patches reviewed would make those details clear, and it would thus make it easier to get more sign-offs on this patch as well.

I agree that this is a significant change and I can understand why people are a bit nervous about merging it. Would it help if we had more middle-end patches reviewed before committing, so people could have a better understanding of the impact? Off the top of my head, the rework of D35137 would be interesting, and also constant handling.

Can you also tell us what the plan is regarding all the places in the optimizer that may need updating to handle the new vectors?

A patch series has been posted for review, see section 7. in
http://lists.llvm.org/pipermail/llvm-dev/2018-June/123780.html

My understanding is that Graham is reworking some of those. Some were only RFC to begin with, some have been abandoned since, and in any case there has been some back-and-forth on the mailing list since then. I think now would be a good time to rebase/update all the patches that are still relevant so people can see the current version of things.

Also, my question regarding updating the optimizer still stands, I don't think I've seen patches related to updating the existing passes to take the 'scalable' property into account. While in some places it's likely to have no impact, I'd be very surprised if you could pass an IR file with scalable types through the optimizer and not get a soup of fixed-width and scalable vectors out the other end (or the corresponding assertions about mismatched types). I'd be happy to help deal with that churn, by the way :) that's rather why I was asking about the plan.

hfinkel added inline comments. Apr 4 2019, 9:12 AM

docs/LangRef.rst
2749	This doesn't seem strong enough. We need the unknown multiple to be the same for any given type (at least within a given function). We also need a relationship between vectors of different underlying types (so that zext/sext/etc. make sense). Otherwise, you can't even sensibly add them together (for example). I realize that it says something about an unknown vector length above, but we need to translate that statement into semantics that make sense for the vectors themselves.
lib/IR/Verifier.cpp
4927	Remove unneeded whitespace change.

rengolin added inline comments. Apr 5 2019, 1:52 AM

docs/LangRef.rst
2749	It's not that simple. Both SVE and RISC-V can have vector multiplier changes in the middle of a function (via system register or similar). Neither of them want that to be the norm, but IIRC, RISC-V doesn't want it to change inside a function and SVE wants it to be the same for the whole program. I totally agree with you that leaving it open is a huge can of worms, and wanting a per-function change would probably need new annotation on functions, which if ever done, should be orthogonal to this change (or would lead us into madness). I second your proposal that we fix the semantics in LLVM, for now, that the "unknown width" is the same throughout the program and that the existing relationship between fixed vectors extends to scalable vectors. If you look at the changes in this patch series, it assumes that behaviour already, by getting new vector types of half-size with double-elements and so on. IFF RISC-V wants to extend the logic to be per-function, then we will need to do a much more extensive analysis on the passes, especially around inlining and function calls. I strongly propose we don't look at it right now and fix the semantics as proposed above. In my analysis, with that semantics, I don't see a big impact on any existing non-scalable optimisations. With vectorisation passes being run at the end of the pipeline, even for scalable code, most of the existing pipeline will still be relevant, too.

I'd advise caution here, it's really significant/impactful change, and a single sign-off is a bit worrying.
In particular, even regardless of the feature itself, has the implementation itself been reviewed?

Noted; I'll hold off for now.

I have scheduled a roundtable at EuroLLVM next week, so if interested people are attending we can perhaps make more progress there.

I think this patch is a good start and I don't mind having it merged as is, I was just suggesting a way forward since it seems people are still hesitating. While there seems to be some consensus on moving forward with native types, I'm sure many of the details are still somewhat fuzzy. In my opinion having more patches reviewed would make those details clear, and it would thus make it easier to get more sign-offs on this patch as well.

Ok. I'll see if I can get some time to work on updating other core parts for review.

I agree that this is a significant change and I can understand why people are a bit nervous about merging it. Would it help if we had more middle-end patches reviewed before committing, so people could have a better understanding of the impact? Off the top of my head, the rework of D35137 would be interesting, and also constant handling.

Can you also tell us what the plan is regarding all the places in the optimizer that may need updating to handle the new vectors?

A patch series has been posted for review, see section 7. in
http://lists.llvm.org/pipermail/llvm-dev/2018-June/123780.html

My understanding is that Graham is reworking some of those. Some were only RFC to begin with, some have been abandoned since, and in any case there has been some back-and-forth on the mailing list since then. I think now would be a good time to rebase/update all the patches that are still relevant so people can see the current version of things.

Indeed. We are considering a github repo with a more complete implementation split into sensible patches (unlike the current repo with a single megapatch that covers far more than just SVE support). It will take some time to prepare that, though.

Also, my question regarding updating the optimizer still stands, I don't think I've seen patches related to updating the existing passes to take the 'scalable' property into account. While in some places it's likely to have no impact, I'd be very surprised if you could pass an IR file with scalable types through the optimizer and not get a soup of fixed-width and scalable vectors out the other end (or the corresponding assertions about mismatched types). I'd be happy to help deal with that churn, by the way :) that's rather why I was asking about the plan.

There aren't that many changes (imo), but I understand that you'd like to see for yourself what they are. I'll see what the opinion is on the repo vs. lots of phabricator patches at EuroLLVM.

rkruppe added inline comments. Apr 5 2019, 2:41 AM

docs/LangRef.rst
2749	So first of all, I agree that this patch does (and should) only implement a single "constant" (at runtime) `vscale` value. The current wording here in LangRef is ambiguous about this, it doesn't make clear at which scope the "unknown integer multiple" is fixed. It should be made clear that this factor is the same for all vector types and does not change while the program executes (and, once the `vscale` intrinsic is added, this section here should also point at it). Second, since you mentioned it I should say: the RISC-V vector extension has now changed to a point where (at least in its standard incarnation, without further extensions on top of it) there is no need for vscale to change during runtime, let alone between individual functions. All the changes to the "vector register size" now happen in such a way that it's easy to express the longer vectors in IR by just using different vector types with higher `ElementCount::Min`, e.g. `<scalable 8 x double>` instead of `<scalable 1 x double>`. So from the RVV side, there's no need any more for `vscale` to vary e.g. function-by-function. I don't know whether the SVE folks want to take a shot at it regardless, but in past discussion it sounded like the vscale-per-function model wasn't a good fit for the programming model they envisioned, so maybe they'd come up with a different solution if and when they tackle that problem.
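
A small sketch of the model described in this comment: one program-wide vscale, with longer vectors expressed through a larger minimum element count. The struct and function are hypothetical, and treating vscale as a positive integer is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in; only the field names come from the thread.
struct ElementCount {
  unsigned Min;
  bool Scalable;
};

// Number of elements present at runtime for a given program-wide vscale.
// Fixed-width vectors ignore vscale entirely.
std::uint64_t runtimeElementCount(ElementCount EC, unsigned VScale) {
  assert(VScale >= 1 && "sketch assumes vscale is a positive integer");
  return EC.Scalable ? std::uint64_t(EC.Min) * VScale : EC.Min;
}
```

Under this reading, `<scalable 8 x double>` always holds eight times as many elements as `<scalable 1 x double>`, whatever vscale turns out to be, so no per-function vscale is needed to express the longer vector.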

huntergr added inline comments. Apr 5 2019, 2:43 AM

docs/LangRef.rst
2749	For SVE at least, we can consider changing the vector length during execution to be undefined behaviour. The compiler is not expected to be able to handle it. For RVV, given the new restrictions on how vlmul is handled, I think they won't need to change the multiple at runtime either -- just increase the minimum number of lanes. I'm hoping to discuss this with Robin at EuroLLVM, assuming time permits. I'll come up with some stricter wording.

Clarified that the runtime multiple is constant across all scalable vector types, even if the constant value isn't known at compile time.
Removed extra whitespace.

huntergr marked 2 inline comments as done. Apr 5 2019, 5:48 AM

In D32530#1456113, @huntergr wrote:

Clarified that the runtime multiple is constant across all scalable vector types, even if the constant value isn't known at compile time.

Removed extra whitespace.

LGTM now, thanks for all the hard work! I won't approve, as this is a larger discussion, but I'm happy with this patch, and the native support as is.

hfinkel added inline comments. Apr 5 2019, 11:18 AM

docs/LangRef.rst
2749	I definitely agree that we should not deal with changing the vscale during program execution. I think that the model is: There is an underlying vector length. vscale = round(vector length in bits / primitive size in bits). Can we specify it like that? We do also need to define what the rounding is. What does <scalable 4 x i3> do? Or is it not allowed?

hsaito added inline comments. Apr 5 2019, 4:01 PM

docs/LangRef.rst
2749	"I definitely agree that we should not deal with changing the vscale during program execution."
I agree that this will make things a lot simpler than allowing it to change per function or in the middle of a function. However, I don't quite agree that changing vscale per function is an orthogonal issue. What are we going to do when function foo() with vscale=X calls function bar() with vscale=Y using a scalable vector parameter?
Having said that, since I don't expect the discussions to converge anytime soon if we talk about vscale changing within a compilation unit, I agree we should move forward with vscale not changing within a compilation unit (we say program execution, but the compiler's visibility is always limited to the compilation unit). It should be sufficient to say that if multiple compilation units with different vscale are linked, unspecified behavior will result.
@hfinkel, I think the model is "#elements * elementtype" fits in one or more "units of vector" and then apply vscale to it. I don't think a scalable vector needs to fit in one physical register of the HW. Vector type legalization should kick in.
@huntergr, please correct me if my mental model is wrong.

rengolin added inline comments. Apr 6 2019, 3:01 AM

docs/LangRef.rst
2749	"However, I don't quite agree that changing vscale per function is an orthogonal issue."
I didn't mean the implementation, but the discussion. I think a per-function vscale implementation will be very different from the current one, no matter which course we take now. It won't matter much whether we have a native or an intrinsic implementation; we'll still need function attributes, to teach the optimisation passes, etc.
"Having said that, since I don't expect the discussions to converge anytime soon if we talk about vscale changing within a compilation unit"
If the scope is the compilation unit, then we'd need it to be fixed on the target string, or we won't be able to link two units together. I think even this discussion is too soon, and we should push the scope to the whole program. Any change in vscale throughout the program should be undefined, or we'd have to encode the necessary logic in the compiler, which is the biggest worry I see from the feedback. So far, the benefits of doing so are on edge cases and the actual costs are unknown (but very likely large). In my view, this is definitely not a subject we should raise right now, and restricting the current implementation to whole-program scope is the only way we can go forward for now in any sensible way.

hfinkel added inline comments. Apr 7 2019, 7:36 AM

docs/LangRef.rst
2749	"I didn't mean the implementation, but the discussion."
As I've said in a previous thread, I don't believe that we can sensibly model a changing vscale without some SSA dependence, and that will require significant changes to the overall scheme.
"restricting the current implementation to whole-program scope is the only way we can go forward for now in any sensible way."
+1
"I think the model is "#elements * elementtype" fits in one or more "units of vector" and then apply vscale to it. I don't think scalable vector needs to fit one physical register of HW. Vector type legalization should kick-in."
Indeed, I believe you're correct. We need to account for this in the definition too.

hfinkel added inline comments. Apr 7 2019, 7:39 AM

docs/LangRef.rst
2749	"Indeed, I believe you're correct. We need to account for this in the definition too."
Either by having a model that includes legalization, or by restricting the size of the base vector type?

huntergr marked an inline comment as done. Apr 9 2019, 12:46 AM

huntergr added inline comments.

docs/LangRef.rst
2749	We perform legalization for scalable vectors with the same mechanisms fixed-length vectors do (splitting for too large, promoting/extending for too small). Should this be documented in this description (it isn't for fixed vectors), or is there a better place in the docs for that explanation? (A side note; for 'unpacked' float vector types (e.g. <scalable 2 x float>) we do declare them as legal for SVE then generate predicates to mask off the unused lanes in AArch64 specific code. Since there are more predicated architectures being added to the codebase, perhaps this could be generalized as a new legalization mechanism for fp vector types)

hfinkel added inline comments. Apr 9 2019, 1:42 AM

docs/LangRef.rst
2749	"We perform legalization for scalable vectors with the same mechanisms fixed-length vectors do (splitting for too large, promoting/extending for too small)."
What defines too big (what size is used for splitting)? If `<scalable 8 x double>` fits in the vector register depends on the runtime vector size, no?
"Should this be documented in this description"
No, unless it's part of the IR-level model. What we need here is a model defined, at the IR level, that explains why:
I can add two <scalable 4 x float> vectors together.
I cannot add a <scalable 4 x float> to a <scalable 2 x float>.
I can sext a <scalable 4 x i32> to a <scalable 4 x i64>, and this can be bitcast to a <scalable 8 x i32>.
Also, we should address what happens to vectors with an odd number of lanes or with non-power-of-two-sized primitive types (both of which are defined at the IR level).
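
One possible reading of the IR-level rules asked for above, written as a toy sketch. This is an illustration only, not wording from LangRef: the type model ignores the element kind (integer versus floating point), all names are hypothetical, and the rule that a bitcast cannot cross between fixed and scalable types is an assumption of the sketch.

```cpp
#include <cassert>
#include <cstdint>

// Toy vector type: minimum lane count, lane width, and the scalable flag.
struct VecTy {
  unsigned MinElts;
  unsigned EltBits;
  bool Scalable;
};

// Minimum size in bits; a scalable vector's runtime size is this times vscale.
std::uint64_t minSizeInBits(VecTy T) {
  assert(T.MinElts > 0 && T.EltBits > 0 && "empty vector type");
  return std::uint64_t(T.MinElts) * T.EltBits;
}

// Element-wise binary operations need identically shaped operands.
bool sameShape(VecTy A, VecTy B) {
  return A.MinElts == B.MinElts && A.EltBits == B.EltBits &&
         A.Scalable == B.Scalable;
}

// sext/zext keep the lane count and scalability and widen each lane.
bool canExtend(VecTy From, VecTy To) {
  return From.Scalable == To.Scalable && From.MinElts == To.MinElts &&
         From.EltBits < To.EltBits;
}

// Bitcasts preserve total size; with one shared vscale, equal minimum sizes
// imply equal runtime sizes.
bool canBitcast(VecTy From, VecTy To) {
  return From.Scalable == To.Scalable &&
         minSizeInBits(From) == minSizeInBits(To);
}
```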

huntergr added inline comments. Apr 9 2019, 5:56 AM

docs/LangRef.rst
2749	"What defines too big (what size is used for splitting)? If <scalable 8 x double> fits in the vector register depends on the runtime vector size, no?"
No. For SVE, the legal types are those where the minimum size is equal to 128 bits, since that's the minimum size for hardware registers (and the granularity of increments of register size). So an operation using <scalable 8 x double> values would need to be split into 4 <scalable 2 x double> operations for SVE during legalization. (I'm ignoring the predicated unpacked float forms for a moment.)
I wonder if the change in syntax from <n x 8 x double> to <scalable 8 x double> makes that less obvious. The basis of the model can be grounded in '1' being a valid value for vscale, which effectively makes the types equivalent to fixed length vectors. I'll try coming up with a description based on that.
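
The splitting arithmetic in this comment can be checked with a short sketch. The function name is hypothetical; the 128-bit default reflects the SVE minimum mentioned above, and the promote/unpacked cases are deliberately not modelled.

```cpp
#include <cassert>

// Hypothetical helper: how many legal operations a scalable vector operation
// splits into, given the legal minimum size in bits (128 for SVE per the
// comment above). Only exact multiples are handled here.
unsigned numLegalParts(unsigned MinElts, unsigned EltBits,
                       unsigned LegalMinBits = 128) {
  unsigned MinBits = MinElts * EltBits;
  assert(LegalMinBits != 0 && MinBits % LegalMinBits == 0 &&
         "only the exact-multiple (splitting) case is modelled");
  return MinBits / LegalMinBits;
}
```

For `<scalable 8 x double>` the minimum size is 8 * 64 = 512 bits, so the sketch yields 512 / 128 = 4 parts, matching the four `<scalable 2 x double>` operations described. The result does not depend on vscale, because both the original and the legal type scale by the same factor.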

steleman added a subscriber: steleman. Apr 11 2019, 1:34 PM

joelkevinjones added a subscriber: joelkevinjones. Apr 11 2019, 6:14 PM

I think this is a coherent set of changes. Given the current top of trunk, this expands support from just assembly/disassembly of machine instructions to include LLVM IR, right? Such being the case, I think this patch should go in. I have some ideas on how to structure passes so SV IR supporting optimizations can be added incrementally. If anyone thinks such a proposal would help, let me know.

In D32530#1463735, @joelkevinjones wrote:

I think this is a coherent set of changes. Given the current top of trunk, this expands support from just assembly/disassembly of machine instructions to include LLVM IR, right? Such being the case, I think this patch should go in. I have some ideas on how to structure passes so SV IR supporting optimizations can be added incrementally. If anyone thinks such a proposal would help, let me know.

I think there is one more thing we still have to do. Does scalable vector type apply to all Instructions where non-scalable vector is allowed? If the answer is no, we need to identify which ones are not allowed to take scalable vector type operand/result. Some of the Instructions are not plain element-wise operation. Do we have agreed upon semantics for all those that are allowed?

If we are allowing just element-wise Instructions, we should explicitly say that in LangRef, warn at LLVM-DEV mailing list that new scalable vector types are coming, wait a little to let last minute screams to happen, assure them by saying scalable vector on element-wise Instructions won't cause any mess beyond non-scalable vector, and then commit. This would be the quickest route, and it still enables other interesting follow-up patches to be reviewed/committed. I think we are ready enough to do this if we choose to take this route.

If we are going for more than element-wise Instructions, we need to have well defined and agreed semantics for each of those, and that should be part of the LangRef for each such Instruction.
Also, have we thought about Intrinsics? Can all Intrinsics that take/return non-scalable vector handle scalable vector?

We can certainly let the element-wise stuff go in first and then extend to non-element-wise stuff later. Any thoughts in this regard?

In D32530#1464722, @hsaito wrote:

If we are going for more than element-wise Instructions, we need to have well defined and agreed semantics for each of those, and that should be part of the LangRef for each such Instruction.

Agreed.

Also, have we thought about Intrinsics? Can all Intrinsics that take/return non-scalable vector handle scalable vector?

Same for intrinsics. If any of the vector ones apply to scalable vectors (ex. reduce), it needs to be documented.

We can certainly let the element-wise stuff go in first and then extend to non-element-wise stuff later. Any thoughts in this regard?

Precisely, we should go in baby-steps, with each step making sure we don't touch/break *anything* that isn't scalable, updating LangRef as we go.

Doing a single review on everything would be painful and counter-productive.

In D32530#1464722, @hsaito wrote:

In D32530#1463735, @joelkevinjones wrote:

I think this is a coherent set of changes. Given the current top of trunk, this expands support from just assembly/disassembly of machine instructions to include LLVM IR, right? Such being the case, I think this patch should go in. I have some ideas on how to structure passes so SV IR supporting optimizations can be added incrementally. If anyone thinks such a proposal would help, let me know.

I think there is one more thing we still have to do. Does scalable vector type apply to all Instructions where non-scalable vector is allowed? If the answer is no, we need to identify which ones are not allowed to take scalable vector type operand/result. Some of the Instructions are not plain element-wise operation. Do we have agreed upon semantics for all those that are allowed?

I don't recall this being an issue, but I agree, if there are instructions that currently take vector types that don't take scalable vectors, that certainly needs to be documented.

If we are allowing just element-wise Instructions, we should say so explicitly in LangRef, warn the llvm-dev mailing list that new scalable vector types are coming, wait a little to let any last-minute screams happen, reassure people that scalable vectors on element-wise Instructions won't cause any mess beyond non-scalable vectors, and then commit.

I think that there's been plenty of traffic on this subject on llvm-dev. We do send warnings for potentially-downstream-disruptive changes, as a general best practice. I'm not really sure if this falls into this category, but I'm sure a note to llvm-dev will be sent to let everyone know when this lands, if nothing else, as an FYI to review other related patches.

This would be the quickest route, and it still enables other interesting follow-up patches to be reviewed/committed. I think we are ready enough to do this if we choose to take this route.

If we are going for more than element-wise Instructions, we need to have well defined and agreed semantics for each of those, and that should be part of the LangRef for each such Instruction.
Also, have we thought about Intrinsics? Can all Intrinsics that take/return non-scalable vector handle scalable vector?

FWIW, I took a quick look through the intrinsics and nothing jumped out at me as an intrinsic that currently has vector types on its interface that shouldn't take scalable vectors. So, if there are things which don't work, we should certainly note that.

We can certainly let the element-wise stuff go in first and then extend to non-element-wise stuff later. Any thoughts in this regard?

We should indeed review patches in small/medium-sized units (so long as the changes are testable).

PkmX added a subscriber: PkmX.Apr 14 2019, 11:18 PM

In D32530#1464722, @hsaito wrote:

In D32530#1463735, @joelkevinjones wrote:

I think this is a coherent set of changes. Given the current top of trunk, this expands support from just assembly/disassembly of machine instructions to include LLVM IR, right? Such being the case, I think this patch should go in. I have some ideas on how to structure passes so SV IR supporting optimizations can be added incrementally. If anyone thinks such a proposal would help, let me know.

I think there is one more thing we still have to do. Does scalable vector type apply to all Instructions where non-scalable vector is allowed? If the answer is no, we need to identify which ones are not allowed to take scalable vector type operand/result. Some of the Instructions are not plain element-wise operation. Do we have agreed upon semantics for all those that are allowed?

The main difference is for 'shufflevector'. For a splat it's simple, since you just use a zeroinitializer mask. For anything else, though, you currently need a constant vector with immediate values; this obviously won't work if you don't know how many elements there are.

Downstream, we solve this by allowing shufflevector masks to be ConstantExprs, and then using 'vscale' and 'stepvector' as symbolic Constant nodes. With those and a few arithmetic and logical ops, we can synthesize the usual set of shuffles (reverse, top/bottom half, odd/even, interleaves, zips, etc). Would also work for fixed-length vectors. There's been some pushback on introducing them as symbolic constants though, and the initial demo patch set has them as intrinsics.
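As a rough illustration of how shuffle masks could be built from `vscale` and `stepvector`, here is a scalar sketch that evaluates such mask expressions for a concrete vscale. The formulas (reverse as `(vscale * n - 1) - stepvector`, even as `2 * stepvector`, odd as `2 * stepvector + 1`) are assumptions chosen for illustration; the thread does not spell out the downstream encoding, and all function names here are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

using Mask = std::vector<uint64_t>;

// stepvector: <0, 1, 2, ...> with VScale * MinElts lanes.
Mask stepVector(uint64_t VScale, uint64_t MinElts) {
  Mask M(VScale * MinElts);
  for (uint64_t I = 0; I < M.size(); ++I)
    M[I] = I;
  return M;
}

// Reverse mask: (vscale * n - 1) - stepvector.
Mask reverseMask(uint64_t VScale, uint64_t MinElts) {
  assert(VScale * MinElts > 0 && "need at least one lane");
  Mask M = stepVector(VScale, MinElts);
  const uint64_t Last = VScale * MinElts - 1;
  for (uint64_t &Lane : M)
    Lane = Last - Lane;
  return M;
}

// Even-lane mask over the concatenation of two inputs: 2 * stepvector.
Mask evenMask(uint64_t VScale, uint64_t MinElts) {
  Mask M = stepVector(VScale, MinElts);
  for (uint64_t &Lane : M)
    Lane = 2 * Lane;
  return M;
}

// Odd-lane mask over the concatenation of two inputs: 2 * stepvector + 1.
Mask oddMask(uint64_t VScale, uint64_t MinElts) {
  Mask M = stepVector(VScale, MinElts);
  for (uint64_t &Lane : M)
    Lane = 2 * Lane + 1;
  return M;
}
```

With vscale fixed at 1 these reduce to the constant masks already used for fixed-length vectors, which is consistent with the remark that the approach would also work for them.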

So if we wanted to keep them as intrinsics for now, I think we have one of three options:

Leave discussion on more complicated shuffles until later, and only use scalable autovectorization on loops which don't need anything more than splats.
Introduce additional intrinsics for the other shuffle variants as needed
Allow shufflevector to accept arbitrary masks so that intrinsics can be used (though possibly only if the output vector is scalable).

The discussion at the EuroLLVM roundtable was leaning towards option 1, with an action on me to provide a set of canonical shuffle examples using vscale and stepvector for community consideration.

In D32530#1466746, @huntergr wrote:

So if we wanted to keep them as intrinsics for now, I think we have one of three options:

Leave discussion on more complicated shuffles until later, and only use scalable autovectorization on loops which don't need anything more than splats.

Given the current state, this is the easiest path.

Introduce additional intrinsics for the other shuffle variants as needed

Allow shufflevector to accept arbitrary masks so that intrinsics can be used (though possibly only if the output vector is scalable).

This warrants a larger discussion, which would hinder the current progress.

In D32530#1466762, @rengolin wrote:

In D32530#1466746, @huntergr wrote:

So if we wanted to keep them as intrinsics for now, I think we have one of three options:

Leave discussion on more complicated shuffles until later, and only use scalable autovectorization on loops which don't need anything more than splats.

Given the current state, this is the easiest path.

I agree, although this is an important part of the model, so we should start having this discussion in parallel (sooner rather than later). I had been under the impression that a set of intrinsics was being proposed for this, but extending shufflevector is also an option worth considering. If these are first-class types, then having first-class instruction support is probably the right path. This deserves its own RFC.

Introduce additional intrinsics for the other shuffle variants as needed

Allow shufflevector to accept arbitrary masks so that intrinsics can be used (though possibly only if the output vector is scalable).

This warrants a larger discussion, which would hinder the current progress.

I agree. We should have a separate RFC on this.

In D32530#1467094, @hfinkel wrote:

In D32530#1466762, @rengolin wrote:

In D32530#1466746, @huntergr wrote:

So if we wanted to keep them as intrinsics for now, I think we have one of three options:

Leave discussion on more complicated shuffles until later, and only use scalable autovectorization on loops which don't need anything more than splats.

Given the current state, this is the easiest path.

I agree, although this is an important part of the model, so we should start having this discussion in parallel (sooner rather than later). I had been under the impression that a set of intrinsics was being proposed for this, but extending shufflevector is also an option worth considering. If these are first-class types, then having first-class instruction support is probably the right path. This deserves its own RFC.

Introduce additional intrinsics for the other shuffle variants as needed

Allow shufflevector to accept arbitrary masks so that intrinsics can be used (though possibly only if the output vector is scalable).

This warrants a larger discussion, which would hinder the current progress.

I agree. We should have a separate RFC on this.

I agree on both points.

In D32530#1466746, @huntergr wrote:

In D32530#1464722, @hsaito wrote:

In D32530#1463735, @joelkevinjones wrote:

I think this is a coherent set of changes. Given the current top of trunk, this expands support from just assembly/disassembly of machine instructions to include LLVM IR, right? Such being the case, I think this patch should go in. I have some ideas on how to structure passes so SV IR supporting optimizations can be added incrementally. If anyone thinks such a proposal would help, let me know.

I think there is one more thing we still have to do. Does scalable vector type apply to all Instructions where non-scalable vector is allowed? If the answer is no, we need to identify which ones are not allowed to take scalable vector type operand/result. Some of the Instructions are not plain element-wise operation. Do we have agreed upon semantics for all those that are allowed?

The main difference is for 'shufflevector'. For a splat it's simple, since you just use a zeroinitializer mask. For anything else, though, you currently need a constant vector with immediate values; this obviously won't work if you don't know how many elements there are.

We need to clarify on insertelement/extractelement. Maybe already done in some other patches, but that clarification should be part of this patch.
Under these semantics, is the "length of val" "scalable * n" for <scalable n x ElemTy>, or is it still n?

Going with length of val being "scalable * n", if there is an optimization that takes advantage of a poison value being returned by evaluating "idx >= n" at compile time, we need to disable that for scalable vectors. Also, if any verifier is checking whether a constant idx value is less than n, we need to disable that for scalable vectors.

In D32530#1467094, @hfinkel wrote:

In D32530#1466762, @rengolin wrote:

In D32530#1466746, @huntergr wrote:

So if we wanted to keep them as intrinsics for now, I think we have one of three options:

Leave discussion on more complicated shuffles until later, and only use scalable autovectorization on loops which don't need anything more than splats.

Given the current state, this is the easiest path.

I agree, although this is an important part of the model, so we should start having this discussion in parallel (sooner rather than later). I had been under the impression that a set of intrinsics was being proposed for this, but extending shufflevector is also an option worth considering. If these are first-class types, then having first-class instruction support is probably the right path. This deserves its own RFC.

That's pretty much what LLVM-VP is about: https://reviews.llvm.org/D57504

We are proposing compress/expand and lane shift as intrinsics.
I suggest you add any shuffle intrinsics to the same namespace to avoid fragmentation.

IMHO, the redesigned reduction intrinsics should go there as well (https://reviews.llvm.org/D60261 and/or https://reviews.llvm.org/D60262).

Introduce additional intrinsics for the other shuffle variants as needed

Allow shufflevector to accept arbitrary masks so that intrinsics can be used (though possibly only if the output vector is scalable).

Even if shuffle masks don't need to be constants anymore, it will be awkward to encode compress/expand without additional intrinsics (you would need a prefix sum vector over the mask bits or similar).

This warrants a larger discussion, which would hinder the current progress.

I agree. We should have a separate RFC on this.

We need to clarify on insertelement/extractelement. Maybe already done in some other patches, but that clarification should be part of this patch.
Under these semantics, is the "length of val" "scalable * n" for <scalable n x ElemTy>, or is it still n?

I am not sure how it could be anything but n. If you don't know how long the vector is, you can't correctly generate an index beyond n. I assume for vectors of length > n one would have to use shufflevector or something similar (using vscale and stepvector as Graham mentioned) to get the elements you want to the lower n positions.

Either way, the semantics and any restrictions certainly need to be documented.

Going with length of val being "scalable * n", if there is an optimization that takes advantage of a poison value being returned by evaluating "idx >= n" at compile time, we need to disable that for scalable vectors. Also, if any verifier is checking whether a constant idx value is less than n, we need to disable that for scalable vectors.

If the index is still restricted by n then this shouldn't be an issue.

docs/LangRef.rst
2753	Just want to double-check: there is nothing about scalable vectors that assumes all vector types have the same bit width, correct? Can <scalable 1 x float> have a different bit width from <scalable 1 x double>?

In D32530#1470773, @greened wrote:

We need to clarify on insertelement/extractelement. Maybe already done in some other patches, but that clarification should be part of this patch.
Under these semantics, is the "length of val" "scalable * n" for <scalable n x ElemTy>, or is it still n?

I am not sure how it could be anything but n. If you don't know how long the vector is, you can't correctly generate an index beyond n.

But you know at runtime... there has to be a way to determine, at runtime, vscale. And the index doesn't need to be a constant. I'm not sure that we need to restrict non-constant n to only values valid for vscale == 1.

troyj added a subscriber: troyj.Apr 17 2019, 6:47 PM

PkmX added inline comments.Apr 17 2019, 8:57 PM

docs/LangRef.rst
2753	I believe the intention is that `<scalable 1 x double>` should have twice as many bits as `<scalable 1 x float>`, or the same number of bits as `<scalable 2 x float>`.

In D32530#1468145, @simoll wrote:

That's pretty much what LLVM-VP is about: https://reviews.llvm.org/D57504

We are proposing compress/expand and lane shift as intrinsics.
I suggest you add any shuffle intrinsics to the same namespace to avoid fragmentation.

I'm hoping we can use an extended shufflevector for this still, but if we do end up going down the extra intrinsics route then I'll certainly use the same namespace.

IMHO, the redesigned reduction intrinsics should go there as well (https://reviews.llvm.org/D60261 and/or https://reviews.llvm.org/D60262).

Introduce additional intrinsics for the other shuffle variants as needed

Allow shufflevector to accept arbitrary masks so that intrinsics can be used (though possibly only if the output vector is scalable).

Even if shuffle masks don't need to be constants anymore, it will be awkward to encode compress/expand without additional intrinsics (you would need a prefix sum vector over the mask bits or similar).

Agreed, and it also extends to predicated inserts and extracts -- using SVE's 'lasta/lastb' instructions in a vectorized loop tail, for example -- they don't map well to the current instructions, so would need intrinsics.

This warrants a larger discussion, which would hinder the current progress.

I agree. We should have a separate RFC on this.

+1

Agreed. Handling shuffles was part of the original RFC, but that RFC was far too large to discuss all at once. I've started reworking that section based on current feedback, and I'll send it out once ready.

In D32530#1470773, @greened wrote:

We need to clarify on insertelement/extractelement. Maybe already done in some other patches, but that clarification should be part of this patch.
Under these semantics, is the "length of val" "scalable * n" for <scalable n x ElemTy>, or is it still n?

I am not sure how it could be anything but n. If you don't know how long the vector is, you can't correctly generate an index beyond n. I assume for vectors of length > n one would have to use shufflevector or something similar (using vscale and stepvector as Graham mentioned) to get the elements you want to the lower n positions.

Either way, the semantics and any restrictions certainly need to be documented.

For now, I suspect keeping the limit as 'n' is sufficient, as we only really need to insert into the first element to be able to generate a splat. If we need more later we can discuss it then.

Going with length of val being "scalable * n", if there is an optimization that takes advantage of a poison value being returned by evaluating "idx >= n" at compile time, we need to disable that for scalable vectors. Also, if any verifier is checking whether a constant idx value is less than n, we need to disable that for scalable vectors.

If the index is still restricted by n then this shouldn't be an issue.

Agreed.

huntergr added inline comments.Apr 18 2019, 2:45 AM

docs/LangRef.rst
2753	That's correct. I think that a clearer syntax might be `<vscale x 1 x float>` to indicate that the number of elements is being multiplied by the same `vscale` term for all scalable vector types. The intent is that we should be able to reason about the relative sizes of different scalable vector types based on the element size and minimum number of lanes alone.
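The size reasoning in this inline comment can be sketched with a small model. It assumes only what the comment states, namely that a single `vscale` multiplies the minimum element count of every scalable vector type; `ScalableVecTy` and the helper functions are hypothetical names, not part of the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for <vscale x MinElts x elt>, EltBits wide per element.
struct ScalableVecTy {
  uint64_t MinElts; // element count when vscale == 1
  uint64_t EltBits; // width of one element in bits
};

// Size in bits when vscale == 1.
constexpr uint64_t minSizeInBits(ScalableVecTy Ty) {
  return Ty.MinElts * Ty.EltBits;
}

// Size in bits for a given runtime vscale (vscale >= 1).
constexpr uint64_t sizeInBits(ScalableVecTy Ty, uint64_t VScale) {
  return VScale * minSizeInBits(Ty);
}

// Since every type is scaled by the same vscale, comparing minimum sizes
// is enough to compare runtime sizes.
constexpr bool sameSize(ScalableVecTy A, ScalableVecTy B) {
  return minSizeInBits(A) == minSizeInBits(B);
}

constexpr ScalableVecTy V1F32{1, 32};
constexpr ScalableVecTy V2F32{2, 32};
constexpr ScalableVecTy V1F64{1, 64};

static_assert(sameSize(V1F64, V2F32),
              "<vscale x 1 x double> matches <vscale x 2 x float>");
static_assert(minSizeInBits(V1F64) == 2 * minSizeInBits(V1F32),
              "<vscale x 1 x double> is twice <vscale x 1 x float>");
```

The static assertions hold for any vscale because the factor cancels on both sides, which is why relative sizes can be decided at compile time.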

In D32530#1470784, @hfinkel wrote:

In D32530#1470773, @greened wrote:

We need to clarify on insertelement/extractelement. Maybe already done in some other patches, but that clarification should be part of this patch.
Under these semantics, is the "length of val" "scalable * n" for <scalable n x ElemTy>, or is it still n?

I am not sure how it could be anything but n. If you don't know how long the vector is, you can't correctly generate an index beyond n.

But you know at runtime... there has to be a way to determine, at runtime, vscale. And the index doesn't need to be a constant. I'm not sure that we need to restrict non-constant n to only values valid for vscale == 1.

Yes; in our downstream compiler we allow inserts and extracts at arbitrary offsets in IR (though we admittedly haven't found too much use for it) using vscale (as a constant, though obtaining the Value via an intrinsic will still work) to give us the required element index.

e.g. (vscale * n) - 1 for the last element.

I don't think we need to support this initially, though.

The SVE ISA doesn't allow arbitrary indices for inserts or extracts, but we can generate an appropriate predicate quite easily in the backend.
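A minimal sketch of the index arithmetic mentioned above, assuming vscale is available as a runtime value. The helper names are hypothetical; in IR the value would come from a constant or an intrinsic, as the comment says.

```cpp
#include <cassert>
#include <cstdint>

// Runtime element count of <scalable MinElts x ty> for a given vscale.
constexpr uint64_t runtimeNumElts(uint64_t VScale, uint64_t MinElts) {
  return VScale * MinElts;
}

// Index of the last element: (vscale * n) - 1.
// Assumes VScale >= 1 and MinElts >= 1, so the subtraction cannot wrap.
constexpr uint64_t lastElementIndex(uint64_t VScale, uint64_t MinElts) {
  return runtimeNumElts(VScale, MinElts) - 1;
}

// True when Idx addresses an element that exists at this vscale.
constexpr bool isIndexInRange(uint64_t Idx, uint64_t VScale, uint64_t MinElts) {
  return Idx < runtimeNumElts(VScale, MinElts);
}
```

For example, index 4 is out of range for <scalable 4 x i32> when vscale is 1 but valid when vscale is 2, which is why a purely static check against n is too strict for non-constant indices.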

In D32530#1470784, @hfinkel wrote:

In D32530#1470773, @greened wrote:

I am not sure how it could be anything but n. If you don't know how long the vector is, you can't correctly generate an index beyond n.

But you know at runtime... there has to be a way to determine, at runtime, vscale. And the index doesn't need to be a constant. I'm not sure that we need to restrict non-constant n to only values valid for vscale == 1.

Good point. 100% agree. I was only considering the constant case.

In D32530#1474555, @greened wrote:

In D32530#1470784, @hfinkel wrote:

In D32530#1470773, @greened wrote:

I am not sure how it could be anything but n. If you don't know how long the vector is, you can't correctly generate an index beyond n.

But you know at runtime... there has to be a way to determine, at runtime, vscale. And the index doesn't need to be a constant. I'm not sure that we need to restrict non-constant n to only values valid for vscale == 1.

Good point. 100% agree. I was only considering the constant case.

Ok, so do we have agreement that constant literal indices should be limited to 0..n-1 for now, but non-constant indices can potentially exceed n so that expressions featuring vscale can be used?

In D32530#1475356, @huntergr wrote:

In D32530#1474555, @greened wrote:

In D32530#1470784, @hfinkel wrote:

In D32530#1470773, @greened wrote:

I am not sure how it could be anything but n. If you don't know how long the vector is, you can't correctly generate an index beyond n.

But you know at runtime... there has to be a way to determine, at runtime, vscale. And the index doesn't need to be a constant. I'm not sure that we need to restrict non-constant n to only values valid for vscale == 1.

Good point. 100% agree. I was only considering the constant case.

Ok, so do we have agreement that constant literal indices should be limited to 0..n-1 for now, but non-constant indices can potentially exceed n so that expressions featuring vscale can be used?

I want to say yes here, but that leaves a bit of oddness. It's true that nobody can really police truly variable values. However, some variable values can become constant through optimization.
What's the behavior when a constant >= n is exposed?

In D32530#1475854, @hsaito wrote:

In D32530#1475356, @huntergr wrote:

In D32530#1474555, @greened wrote:

In D32530#1470784, @hfinkel wrote:

In D32530#1470773, @greened wrote:

I am not sure how it could be anything but n. If you don't know how long the vector is, you can't correctly generate an index beyond n.

But you know at runtime... there has to be a way to determine, at runtime, vscale. And the index doesn't need to be a constant. I'm not sure that we need to restrict non-constant n to only values valid for vscale == 1.

Good point. 100% agree. I was only considering the constant case.

Ok, so do we have agreement that constant literal indices should be limited to 0..n-1 for now, but non-constant indices can potentially exceed n so that expressions featuring vscale can be used?

I want to say yes here, but that leaves a bit of oddness. It's true that nobody can really police truly variable values. However, some variable values can become constant through optimization.
What's the behavior when a constant >= n is exposed?

Exactly. Non-constant values can become constant. Constant values can be guarded by vscale-dependent runtime guards (both hand-written and compiler generated). My preference is to leave this not restricted to vscale == 1 values, but rather allow all values that can be supported at runtime, and have it be UB if, at runtime, the relevant index is not available.

In D32530#1477041, @hfinkel wrote:

Exactly. Non-constant values can become constant. Constant values can be guarded by vscale-dependent runtime guards (both hand-written and compiler generated). My preference is to leave this not restricted to vscale == 1 values, but rather allow all values that can be supported at runtime, and have it be UB if, at runtime, the relevant index is not available.

In D32530#1477164, @rengolin wrote:

In D32530#1477041, @hfinkel wrote:

Exactly. Non-constant values can become constant. Constant values can be guarded by vscale-dependent runtime guards (both hand-written and compiler generated). My preference is to leave this not restricted to vscale == 1 values, but rather allow all values that can be supported at runtime, and have it be UB if, at runtime, the relevant index is not available.

+1

That means a need for a warning to general developers: If there is a check for constant_index >= n to see if the result is a poison value, that code has to be changed so that it applies to non-scalable vector only. Hopefully not too many instances of that. I'm fine with this as long as the communication to the rest of LLVM community is clear about it.

lewahlig added a subscriber: lewahlig.Apr 24 2019, 4:48 PM

In D32530#1477261, @hsaito wrote:

In D32530#1477164, @rengolin wrote:

In D32530#1477041, @hfinkel wrote:

Exactly. Non-constant values can become constant. Constant values can be guarded by vscale-dependent runtime guards (both hand-written and compiler generated). My preference is to leave this not restricted to vscale == 1 values, but rather allow all values that can be supported at runtime, and have it be UB if, at runtime, the relevant index is not available.

+1

That means a need for a warning to general developers: If there is a check for constant_index >= n to see if the result is a poison value, that code has to be changed so that it applies to non-scalable vector only. Hopefully not too many instances of that. I'm fine with this as long as the communication to the rest of LLVM community is clear about it.

Ok; I'll update the patch for that.

I think this works out fine, since this behaviour matches our downstream implementation, but I can summarize the extra semantic decisions in this review in a post to llvm-dev as well.

bryanpkc added a subscriber: bryanpkc.Apr 25 2019, 1:14 PM

Noted the change in semantics for extractelement and insertelement in the langref.

In D32530#1484027, @huntergr wrote:

Noted the change in semantics for extractelement and insertelement in the langref.

Why implementation defined and not UB for the case where the index exceeds the runtime length? How do you intend to define this for SVE?

greened added inline comments.Apr 30 2019, 12:30 PM

docs/LangRef.rst
2753	+1 for `<vscale x 1 x float>`.

In D32530#1484668, @hfinkel wrote:

In D32530#1484027, @huntergr wrote:

Noted the change in semantics for extractelement and insertelement in the langref.

Why implementation defined and not UB for the case where the index exceeds the runtime length? How do you intend to define this for SVE?

SVE uses a predicate for indexed inserts and extracts. We generate that predicate by comparing a splat of the index against a stepvector (0,1,2,3...); if the index is out of range then the predicate will be all false.

For a mov (insert), that results in an unmodified vector.

For a lastb (extract), that extracts the last lane in the vector if no predicate bits are true.

I don't know if RVV or SX-Aurora have similarly defined semantics. If the preference is to make it UB, that's fine.
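The predicate-based behaviour described above can be modelled on plain arrays. This is a scalar sketch of the semantics as stated in this comment, not of the actual SVE instructions; the function names are hypothetical and lanes are modelled as `int`.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Predicate from comparing a splat of Idx against stepvector <0, 1, 2, ...>:
// lane I is active iff I == Idx. An out-of-range Idx gives an all-false result.
std::vector<bool> indexPredicate(std::size_t NumLanes, std::size_t Idx) {
  std::vector<bool> Pred(NumLanes);
  for (std::size_t I = 0; I < NumLanes; ++I)
    Pred[I] = (I == Idx);
  return Pred;
}

// Predicated mov (insert): active lanes take Elt, inactive lanes are unchanged.
std::vector<int> predicatedInsert(std::vector<int> Vec, std::size_t Idx,
                                  int Elt) {
  const std::vector<bool> Pred = indexPredicate(Vec.size(), Idx);
  for (std::size_t I = 0; I < Vec.size(); ++I)
    if (Pred[I])
      Vec[I] = Elt;
  return Vec;
}

// lastb-style extract: the last active lane, or the final lane of the vector
// when no predicate bit is true.
int predicatedExtract(const std::vector<int> &Vec, std::size_t Idx) {
  assert(!Vec.empty() && "vector must have at least one lane");
  const std::vector<bool> Pred = indexPredicate(Vec.size(), Idx);
  std::size_t Lane = Vec.size() - 1;
  for (std::size_t I = 0; I < Vec.size(); ++I)
    if (Pred[I])
      Lane = I;
  return Vec[Lane];
}
```

In this model an out-of-range insert returns the input unchanged and an out-of-range extract returns the final lane, so extracting at an index just inserted at does not return the inserted value when that index is out of range.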

In D32530#1485818, @huntergr wrote:

In D32530#1484668, @hfinkel wrote:

Why implementation defined and not UB for the case where the index exceeds the runtime length? How do you intend to define this for SVE?

SVE uses a predicate for indexed inserts and extracts. We generate that predicate by comparing a splat of the index against a stepvector (0,1,2,3...); if the index is out of range then the predicate will be all false.

For a mov (insert), that results in an unmodified vector.

For a lastb (extract), that extracts the last lane in the vector if no predicate bits are true.

I don't know if RVV or SX-Aurora have similarly defined semantics. If the preference is to make it UB, that's fine.

I think this should be UB, or at least return poison. However, making it UB has the unfortunate side effect of making it illegal to hoist these operations out of conditionals (which currently isn't the case), so maybe poison is better.

For me the main reason for making it UB (aside from generally being conservative) is that element insert/extract can be usefully legalized by putting the vector into a stack slot and accessing the affected element with scalar loads/stores -- in fact, that's the default legalization strategy in trunk. But in that setting an out-of-bounds lane index would become a wild read/write (= clear-cut UB), unless the legalization code jumps through extra hoops to add a bounds check or clamp the lane index into range. This wouldn't affect SVE or RVV since they would implement custom lowering anyway, but for other targets (especially if we eventually generalize insert/extract to accept variable lane positions and do so for all vector types for consistency) it could become an unnecessary headache. Though if we make insertelement with out-of-range lane index return poison instead of being UB, then at minimum clamping in insertelement is necessary to avoid the wild write.

Aside: your proposed semantics for SVE would break the property extractelt(insertelt(vec, i, elt), i) = elt which instcombine (and others) most likely assumes. If we make the insertelt return poison (or even just undef, FWIW), that issue goes away.
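To make the stack-slot legalization argument concrete, here is a sketch in which the vector is copied to memory and the lane index is clamped before the scalar access. It is an assumption-laden model (a `std::vector<int>` stands in for the stack slot, and the function names are hypothetical), not the code SelectionDAG emits.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Insert Elt at lane Idx by going through a memory copy of the vector.
// The index is clamped so an out-of-range Idx never writes outside the slot;
// the result in that case is arbitrary, which is acceptable if the IR-level
// result is poison.
std::vector<int> insertViaStackSlot(const std::vector<int> &Vec,
                                    std::size_t Idx, int Elt) {
  assert(!Vec.empty() && "vector must have at least one lane");
  std::vector<int> Slot = Vec; // spill to the "stack slot"
  const std::size_t Clamped = std::min(Idx, Slot.size() - 1);
  Slot[Clamped] = Elt; // scalar store at the clamped lane
  return Slot;         // reload
}

// Extract lane Idx through memory, with the same clamping to avoid a wild read.
int extractViaStackSlot(const std::vector<int> &Vec, std::size_t Idx) {
  assert(!Vec.empty() && "vector must have at least one lane");
  return Vec[std::min(Idx, Vec.size() - 1)];
}
```

Without the `std::min` clamp, an out-of-range `Idx` would index past the end of `Slot`, which is the wild read/write the comment warns about.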

In D32530#1485862, @rkruppe wrote:

In D32530#1485818, @huntergr wrote:

In D32530#1484668, @hfinkel wrote:

Why implementation defined and not UB for the case where the index exceeds the runtime length? How do you intend to define this for SVE?

SVE uses a predicate for indexed inserts and extracts. We generate that predicate by comparing a splat of the index against a stepvector (0,1,2,3...); if the index is out of range then the predicate will be all false.

For a mov (insert), that results in an unmodified vector.

For a lastb (extract), that extracts the last lane in the vector if no predicate bits are true.

I don't know if RVV or SX-Aurora have similarly defined semantics. If the preference is to make it UB, that's fine.

I think this should be UB, or at least return poison. However, making it UB has the unfortunate side effect of making it illegal to hoist these operations out of conditionals (which currently isn't the case), so maybe poison is better.

For me the main reason for making it UB (aside from generally being conservative) is that element insert/extract can be usefully legalized by putting the vector into a stack slot and accessing the affected element with scalar loads/stores -- in fact, that's the default legalization strategy in trunk. But in that setting an out-of-bounds lane index would become a wild read/write (= clear-cut UB), unless the legalization code jumps through extra hoops to add a bounds check or clamp the lane index into range. This wouldn't affect SVE or RVV since they would implement custom lowering anyway, but for other targets (especially if we eventually generalize insert/extract to accept variable lane positions and do so for all vector types for consistency) it could become an unnecessary headache. Though if we make insertelement with out-of-range lane index return poison instead of being UB, then at minimum clamping in insertelement is necessary to avoid the wild write.

Aside: your proposed semantics for SVE would break the property extractelt(insertelt(vec, i, elt), i) = elt which instcombine (and others) most likely assumes. If we make the insertelt return poison (or even just undef, FWIW), that issue goes away.

I think that making the return be a poison value is best.

In D32530#1486007, @hfinkel wrote:

In D32530#1485862, @rkruppe wrote:

In D32530#1485818, @huntergr wrote:

In D32530#1484668, @hfinkel wrote:

Why implementation defined and not UB for the case where the index exceeds the runtime length? How do you intend to define this for SVE?

SVE uses a predicate for indexed inserts and extracts. We generate that predicate by comparing a splat of the index against a stepvector (0,1,2,3...); if the index is out of range then the predicate will be all false.

For a mov (insert), that results in an unmodified vector.

For a lastb (extract), that extracts the last lane in the vector if no predicate bits are true.

I don't know if RVV or SX-Aurora have similarly defined semantics. If the preference is to make it UB, that's fine.

I think this should be UB, or at least return poison. However, making it UB has the unfortunate side effect of making it illegal to hoist these operations out of conditionals (which currently isn't the case), so maybe poison is better.

For me the main reason for making it UB (aside from generally being conservative) is that element insert/extract can be usefully legalized by putting the vector into a stack slot and accessing the affected element with scalar loads/stores -- in fact, that's the default legalization strategy in trunk. But in that setting an out-of-bounds lane index would become a wild read/write (= clear-cut UB), unless the legalization code jumps through extra hoops to add a bounds check or clamp the lane index into range. This wouldn't affect SVE or RVV since they would implement custom lowering anyway, but for other targets (especially if we eventually generalize insert/extract to accept variable lane positions and do so for all vector types for consistency) it could become an unnecessary headache. Though if we make insertelement with out-of-range lane index return poison instead of being UB, then at minimum clamping in insertelement is necessary to avoid the wild write.

Aside: your proposed semantics for SVE would break the property extractelt(insertelt(vec, i, elt), i) = elt which instcombine (and others) most likely assumes. If we make the insertelt return poison (or even just undef, FWIW), that issue goes away.

I think that making the return be a poison value is best.

Ok, will update again. Thanks.
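For reference, the predicate-based SVE behaviour quoted above, and the way it breaks the extractelt(insertelt(vec, i, elt), i) = elt identity, can be modelled in a few lines of C++. This is a sketch under assumptions: the function names are invented for illustration, and `std::vector` stands in for a vector whose lane count is only known at run time.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Predicate = (splat(idx) == stepvector(0, 1, 2, ...)).
// Every bit is false when idx is not a valid lane.
std::vector<bool> index_predicate(std::size_t lanes, std::size_t idx) {
  std::vector<bool> pred(lanes);
  for (std::size_t lane = 0; lane < lanes; ++lane)
    pred[lane] = (lane == idx);
  return pred;
}

// Model of the predicated mov: only active lanes are overwritten, so an
// all-false predicate leaves the vector unmodified.
std::vector<int> insert_predicated(std::vector<int> vec, std::size_t idx,
                                   int elt) {
  const std::vector<bool> pred = index_predicate(vec.size(), idx);
  for (std::size_t lane = 0; lane < vec.size(); ++lane)
    if (pred[lane])
      vec[lane] = elt;
  return vec;
}

// Model of the lastb-style extract: the last active lane, or the final lane
// of the vector when no predicate bit is set.
int extract_lastb(const std::vector<int>& vec, std::size_t idx) {
  assert(!vec.empty() && "vector types have at least one element");
  const std::vector<bool> pred = index_predicate(vec.size(), idx);
  std::size_t chosen = vec.size() - 1;
  for (std::size_t lane = 0; lane < vec.size(); ++lane)
    if (pred[lane])
      chosen = lane;
  return vec[chosen];
}
```

On a four-lane vector {10, 20, 30, 40}, inserting 99 at index 2 and extracting index 2 gives 99, but doing the same at index 7 gives 40: the insert is a no-op and the extract falls back to the last lane. Returning poison for the out-of-range case sidesteps that mismatch.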

sdesmalen mentioned this in D61435: [AArch64] NFC: Add generic StackOffset to describe scalable offsets. May 2 2019, 5:38 AM

Sorry to be late to the party, but I have a quick question:

Ok, so do we have agreement that constant literal indices should be limited to 0..n-1 for now, but non-constant indices can potentially exceed n so that expressions featuring vscale can be used?

What if we know the width is fixed on our target machine? Let's say it's fixed at 512b. So a full width scalable double vector would be:

<vscale x 2 x double>

Since our width is fixed, we know that vscale=4 here and there are 8 elements in this vector.

I'd like to be able to do a really fast insert at say index 3. I.e. insert_element(<vscale x 2 x double> res, double elt, int 3). Would that insert be as fast with the vscale method as it would be for an index < n-1?
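The arithmetic behind "vscale=4 and 8 elements" can be written down as a small C++ sketch. The helper names are hypothetical, and it assumes the hardware vector length is an exact multiple of the type's minimum size (2 x 64 = 128 bits here).

```cpp
// Hypothetical helpers; this is plain arithmetic, not an LLVM API.
// vscale is the hardware vector length divided by the minimum size of the
// type, where the minimum size is min_elts * elt_bits.
constexpr unsigned vscale_for(unsigned register_bits, unsigned min_elts,
                              unsigned elt_bits) {
  return register_bits / (min_elts * elt_bits);
}

// Number of lanes actually present at run time: vscale * min_elts.
constexpr unsigned runtime_lanes(unsigned register_bits, unsigned min_elts,
                                 unsigned elt_bits) {
  return vscale_for(register_bits, min_elts, elt_bits) * min_elts;
}

// <vscale x 2 x double> on a 512-bit implementation.
static_assert(vscale_for(512, 2, 64) == 4, "vscale is 4");
static_assert(runtime_lanes(512, 2, 64) == 8, "8 doubles per vector");

// The same type on a 128-bit implementation has only 2 lanes.
static_assert(runtime_lanes(128, 2, 64) == 2, "2 doubles per vector");
```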

In D32530#1487868, @cameron.mcinally wrote:

Sorry to be late to the party, but I have a quick question:

Ok, so do we have agreement that constant literal indices should be limited to 0..n-1 for now, but non-constant indices can potentially exceed n so that expressions featuring vscale can be used?

What if we know the width is fixed on our target machine? Let's say it's fixed at 512b. So a full width scalable double vector would be:

<vscale x 2 x double>

Since our width is fixed, we know that vscale=4 here and there are 8 elements in this vector.

I'd like to be able to do a really fast insert at say index 3. I.e. insert_element(<vscale x 2 x double> res, double elt, int 3). Would that insert be as fast with the vscale method as it would be for an index < n-1?

If you know the exact size of your vectors at compile time, then I believe fixed length vector types should be used, at least at the IR level.

The performance of insert/extract is target dependent. For SVE you almost always need a predicate for a single element insert or extract, so you're not going to gain anything by knowing the size ahead of time.

If you know the exact size of your vectors at compile time, then I believe fixed length vector types should be used, at least at the IR level.

Ah, ok. So let's say we're targeting SVE and I know my vector length is 512b. I was imagining building a vector like this:

%r1 = insertelement <scalable 2 x double> undef, double %x, i32 0
%r2 = insertelement <scalable 2 x double> %r1, double %x1, i32 1
%r3 = insertelement <scalable 2 x double> %r2, double %x2, i32 2
%r4 = insertelement <scalable 2 x double> %r3, double %x3, i32 3
%r5 = insertelement <scalable 2 x double> %r4, double %x4, i32 4
%r6 = insertelement <scalable 2 x double> %r5, double %x5, i32 5
%r7 = insertelement <scalable 2 x double> %r6, double %x6, i32 6
%r8 = insertelement <scalable 2 x double> %r7, double %x7, i32 7

But it sounds like I should rather just build a <8 x double> vector instead. Would that <8 x double> vector legalize to SVE vector instructions? Or would it be split into four 128b NEON vectors?

The performance of insert/extract is target dependent. For SVE you almost always need a predicate for a single element insert or extract, so you're not going to gain anything by knowing the size ahead of time.

So back to my scalable vector example above, it sounds like even if the above IR was valid, we'd still have to produce predicates at the hardware instruction level. So maybe my concern is moot...

In D32530#1488001, @cameron.mcinally wrote:
If you know the exact size of your vectors at compile time, then I believe fixed length vector types should be used, at least at the IR level.

Ah, ok. So let's say we're targeting SVE and I know my vector length is 512b. I was imagining building a vector like this:
%r1 = insertelement <scalable 2 x double> undef, double %x, i32 0
%r2 = insertelement <scalable 2 x double> %r1, double %x1, i32 1
%r3 = insertelement <scalable 2 x double> %r2, double %x2, i32 2
%r4 = insertelement <scalable 2 x double> %r3, double %x3, i32 3
%r5 = insertelement <scalable 2 x double> %r4, double %x4, i32 4
%r6 = insertelement <scalable 2 x double> %r5, double %x5, i32 5
%r7 = insertelement <scalable 2 x double> %r6, double %x6, i32 6
%r8 = insertelement <scalable 2 x double> %r7, double %x7, i32 7

So you could do that (judging by the rest of the discussion), but you'd have poison values (effectively) if you exceeded the runtime length. I guess if you're treating scalable vectors as pseudo-fixed-length then you don't care about using the same binary on different hardware.

But it sounds like I should rather just build a <8 x double> vector instead. Would that <8 x double> vector legalize to SVE vector instructions? Or would it be split into four 128b NEON vectors?

In the current code, you'd get 4 NEON vectors. In future we'd implement fixed length SVE support as well (for SLP autovec without introducing extra predicate generation/branching), but the recommended method of loop autovec for SVE would be VLA. For your own work you'd be able to use fixed-length types then, but we're still figuring out the design (to minimize the number of ISel patterns).

The performance of insert/extract is target dependent. For SVE you almost always need a predicate for a single element insert or extract, so you're not going to gain anything by knowing the size ahead of time.

So back to my scalable vector example above, it sounds like even if the above IR was valid, we'd still have to produce predicates at the hardware instruction level. So maybe my concern is moot...

Yes, though if the index is known to be within a certain range (5bit signed immediate), then you can skip generating a splat and just compare against the stepvector directly; so for your 512b example cpu, you'd be able to generate one fewer instruction when indexing 32b or 64b elements. If you're building up an entire vector (as in your IR), the insr instruction will shift all elements along by one and insert into the first lane, so no predicates would be needed -- but we would need to pattern match and optimize for this case.
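Two of the points in this reply are simple range checks, sketched below in C++ with hypothetical helper names. The first shows which constant indices from the IR example are in range for a given hardware vector length; the second checks when every valid lane index fits a 5-bit signed immediate, assumed here to be two's complement (-16 to 15).

```cpp
// Hypothetical helpers; plain arithmetic, not an LLVM or SVE API.
constexpr unsigned lanes(unsigned vector_bits, unsigned elt_bits) {
  return vector_bits / elt_bits;
}

// An index at or beyond the runtime lane count yields poison.
constexpr bool index_in_range(unsigned idx, unsigned vector_bits,
                              unsigned elt_bits) {
  return idx < lanes(vector_bits, elt_bits);
}

// Range of a 5-bit two's-complement immediate.
constexpr bool fits_simm5(long value) { return value >= -16 && value <= 15; }

// True when the largest valid lane index can be encoded as such an immediate.
constexpr bool every_lane_index_fits_simm5(unsigned vector_bits,
                                           unsigned elt_bits) {
  return fits_simm5(static_cast<long>(lanes(vector_bits, elt_bits)) - 1);
}

// Index 7 of a double vector: fine at 512 bits, poison at 128 bits.
static_assert(index_in_range(7, 512, 64), "in range on 512-bit hardware");
static_assert(!index_in_range(7, 128, 64), "poison on 128-bit hardware");

// At 512 bits, 64-bit and 32-bit elements have 8 and 16 lanes, so every
// index fits; 16-bit elements have 32 lanes, so indices 16..31 do not.
static_assert(every_lane_index_fits_simm5(512, 64), "max index 7");
static_assert(every_lane_index_fits_simm5(512, 32), "max index 15");
static_assert(!every_lane_index_fits_simm5(512, 16), "max index 31");
```

This matches the "32b or 64b elements" remark for the 512b example: those are the element sizes whose full index range stays within the immediate.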

In D32530#1488083, @huntergr wrote:
In D32530#1488001, @cameron.mcinally wrote:
If you know the exact size of your vectors at compile time, then I believe fixed length vector types should be used, at least at the IR level.

Ah, ok. So let's say we're targeting SVE and I know my vector length is 512b. I was imagining building a vector like this:
%r1 = insertelement <scalable 2 x double> undef, double %x, i32 0
%r2 = insertelement <scalable 2 x double> %r1, double %x1, i32 1
%r3 = insertelement <scalable 2 x double> %r2, double %x2, i32 2
%r4 = insertelement <scalable 2 x double> %r3, double %x3, i32 3
%r5 = insertelement <scalable 2 x double> %r4, double %x4, i32 4
%r6 = insertelement <scalable 2 x double> %r5, double %x5, i32 5
%r7 = insertelement <scalable 2 x double> %r6, double %x6, i32 6
%r8 = insertelement <scalable 2 x double> %r7, double %x7, i32 7
So you could do that (judging by the rest of the discussion), but you'd have poison values (effectively) if you exceeded the runtime length. I guess if you're treating scalable vectors as pseudo-fixed-length then you don't care about using the same binary on different hardware.

But it sounds like I should rather just build a <8 x double> vector instead. Would that <8 x double> vector legalize to SVE vector instructions? Or would it be split into four 128b NEON vectors?

In the current code, you'd get 4 NEON vectors. In future we'd implement fixed length SVE support as well (for SLP autovec without introducing extra predicate generation/branching), but the recommended method of loop autovec for SVE would be VLA. For your own work you'd be able to use fixed-length types then, but we're still figuring out the design (to minimize the number of ISel patterns).

That's great.

The performance of insert/extract is target dependent. For SVE you almost always need a predicate for a single element insert or extract, so you're not going to gain anything by knowing the size ahead of time.

So back to my scalable vector example above, it sounds like even if the above IR was valid, we'd still have to produce predicates at the hardware instruction level. So maybe my concern is moot...

Yes, though if the index is known to be within a certain range (5bit signed immediate), then you can skip generating a splat and just compare against the stepvector directly; so for your 512b example cpu, you'd be able to generate one fewer instruction when indexing 32b or 64b elements. If you're building up an entire vector (as in your IR), the insr instruction will shift all elements along by one and insert into the first lane, so no predicates would be needed -- but we would need to pattern match and optimize for this case.

Ok. Thanks for the clarification.
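As an illustration of the insr idea mentioned above (shift every element along by one lane and insert into the first lane), here is a small C++ model. It is a sketch, not the instruction's definition: `insr` and `build_with_insr` are illustrative names, and the zero initial value is an arbitrary choice for the demo. Because each step pushes earlier values towards higher lanes, the scalars have to be inserted in reverse order.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Model of the insr behaviour described in the thread: every element moves
// up one lane (the top lane is dropped) and `elt` lands in lane 0.
std::vector<double> insr(std::vector<double> vec, double elt) {
  assert(!vec.empty() && "vector types have at least one element");
  for (std::size_t lane = vec.size() - 1; lane > 0; --lane)
    vec[lane] = vec[lane - 1];
  vec[0] = elt;
  return vec;
}

// Build a vector whose lanes 0..elts.size()-1 hold `elts` in order by
// inserting the scalars back to front. No predicate is needed.
std::vector<double> build_with_insr(std::size_t lanes,
                                    const std::vector<double>& elts) {
  std::vector<double> vec(lanes, 0.0);
  for (auto it = elts.rbegin(); it != elts.rend(); ++it)
    vec = insr(vec, *it);
  return vec;
}
```

A compiler would have to recognise the chain of insertelement instructions at consecutive constant indices to use this form, which is the pattern matching the reply refers to.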

What's the status of this? It seems like discussion has died down a bit. I think Graham's idea to change from <scalable 2 x float> to <vscale x 2 x float> will make the IR more readable/understandable but it's not a show-stopper for me.

Are there any other outstanding issues to address before this lands?

In D32530#1496945, @greened wrote:

What's the status of this? It seems like discussion has died down a bit. I think Graham's idea to change from <scalable 2 x float> to <vscale x 2 x float> will make the IR more readable/understandable but it's not a show-stopper for me.

Are there any other outstanding issues to address before this lands?

I think @huntergr still needs to update the insertelement/extractelement doc. For me, the last remaining thing is the definition of shufflevector behavior that his last revision did not address. Do we allow scalable vector? Do we allow anything other than zeroinitializer mask? I'm fine not allowing for the time being or restricting to zeroinitializer mask for the time being. We still need to write it down since generally supporting shufflevector of scalable vectors is outside the scope of this patch.

In D32530#1496960, @hsaito wrote:

In D32530#1496945, @greened wrote:

What's the status of this? It seems like discussion has died down a bit. I think Graham's idea to change from <scalable 2 x float> to <vscale x 2 x float> will make the IR more readable/understandable but it's not a show-stopper for me.

Are there any other outstanding issues to address before this lands?

I think @huntergr still needs to update the insertelement/extractelement doc. For me, the last remaining thing is the definition of shufflevector behavior that his last revision did not address. Do we allow scalable vector? Do we allow anything other than zeroinitializer mask? I'm fine not allowing for the time being or restricting to zeroinitializer mask for the time being. We still need to write it down since generally supporting shufflevector of scalable vectors is outside the scope of this patch.

+1 - Once we have these documentation updates, I think that we'll be good to go.

Changed (extract|insert)element semantics to return poison values at runtime if the index exceeds hardware vector length.
Changed shufflevector semantics to note that only zeroinitializer and undef can be used as mask values for scalable vectors for now.

(Sorry about the delay in updating, I've been out at a conference this week)
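A compact way to read the two updated rules is the following C++ model. It is only a sketch with invented names: `MaybePoison` is a stand-in for "a value or poison", and `std::vector` models a vector whose length is only known at run time. A zeroinitializer mask selects lane 0 of the first operand for every result lane, so it behaves as a splat.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Stand-in for an IR value that may be poison. `value` is meaningless when
// `poison` is true.
struct MaybePoison {
  bool poison;
  int value;
};

// extractelement on a scalable vector: poison once the index reaches the
// runtime length, the selected lane otherwise.
MaybePoison extract_element(const std::vector<int>& vec, std::size_t idx) {
  if (idx >= vec.size())
    return {true, 0};
  return {false, vec[idx]};
}

// shufflevector with a zeroinitializer mask: every result lane reads lane 0
// of the first operand, i.e. a splat of that element.
std::vector<int> shuffle_zero_mask(const std::vector<int>& vec) {
  assert(!vec.empty() && "vector types have at least one element");
  return std::vector<int>(vec.size(), vec[0]);
}
```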

In D32530#1496945, @greened wrote:

What's the status of this? It seems like discussion has died down a bit. I think Graham's idea to change from <scalable 2 x float> to <vscale x 2 x float> will make the IR more readable/understandable but it's not a show-stopper for me.

While I don't want to hold up this patch, a name change to the type will be difficult to get through once the type is in and the terminology has settled, so it may be worth getting it right from the start. Personally I think <vscale x 16 x i8> is more explicit (and therefore easier to understand for those new to the type) than <scalable 16 x i8>. And now that we all have a better understanding and clear definition of scalable types, I wonder if <n x 16 x i8> instead of <vscale x 16 x i8> is worth considering again, since it is much easier to read in tests.

Other than that, great to see this discussion on scalable types converging and so close to agreement!

In D32530#1498190, @sdesmalen wrote:

I wonder if <n x 16 x i8> instead of <vscale x 16 x i8> is worth considering again, since it is much easier to read in tests.

I kind of like <n x 16 x i8>. It's concise. I don't think vscale really adds a lot of value, but it does eat up characters on a line.

I know very well how annoying it can be to read and write (and say) the scalable prefix all the time and wish for something shorter sometimes, but I also prefer <vscale x ...> for the reasons Sander gave. I'll add that <vscale x 4 x i32> feels a bit lighter than <scalable 4 x i32> even though it's the same number of characters (maybe because there's more whitespace?).

The <n x ...> syntax is shorter but doesn't have the mnemonic aspect and also clashes with the pre-existing use of "n" as a metavariable standing for some fixed vector length (as in, <N x i1> for example), so I'd rather have <scalable ...> if those are the two options.

In D32530#1498469, @rkruppe wrote:

I know very well how annoying it can be to read and write (and say) the scalable prefix all the time and wish for something shorter sometimes, but I also prefer <vscale x ...> for the reasons Sander gave. I'll add that <vscale x 4 x i32> feels a bit lighter than <scalable 4 x i32> even though it's the same number of characters (maybe because there's more whitespace?).

The <n x ...> syntax is shorter but doesn't have the mnemonic aspect and also clashes with the pre-existing use of "n" as a metavariable standing for some fixed vector length (as in, <N x i1> for example), so I'd rather have <scalable ...> if those are the two options.

+1. I like vscale over n because it directly references a term documented in the IR with this patch.

sdesmalen mentioned this in D61437: [AArch64] Static (de)allocation of SVE stack objects. May 16 2019, 8:05 AM

c-rhodes added a subscriber: c-rhodes. May 17 2019, 2:55 AM

It seems like there's enough support for changing to <vscale x as the prefix, so I'll revise the patch.

tstellar added a subscriber: tstellar. May 20 2019, 9:11 AM

Changed textual IR format to <vscale x n x <ty>>
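For readers who want the new spelling spelled out, here is a small C++ sketch of how a vector type is written under this format. `vector_type_string` is a hypothetical helper, not the AsmWriter code from the patch.

```cpp
#include <cassert>
#include <string>

// Hypothetical formatter: "<N x ty>" for fixed-length vectors and
// "<vscale x N x ty>" for scalable ones.
std::string vector_type_string(unsigned min_elts, bool scalable,
                               const std::string& elt_type) {
  assert(min_elts > 0 && "vectors of size zero are not allowed");
  std::string text = "<";
  if (scalable)
    text += "vscale x ";
  text += std::to_string(min_elts) + " x " + elt_type + ">";
  return text;
}
```

For example, `vector_type_string(2, true, "double")` produces `<vscale x 2 x double>`, and `vector_type_string(8, false, "double")` produces `<8 x double>`.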

hfinkel added inline comments. May 22 2019, 4:25 PM

docs/LangRef.rst
8190	I believe that all you need to say is: For a scalable vector, if the value of ``idx`` exceeds the runtime length of the vector, the result is a poison value. This part about the "result in the IR" is confusing, and is just stating a basic consistency fact about LLVM's type system. I recommend removing that.

Simplified wording for extract/insertelement semantics

huntergr marked an inline comment as done. May 23 2019, 5:10 AM

In D32530#1513529, @huntergr wrote:

Simplified wording for extract/insertelement semantics

Thanks. LGTM.

Closed by commit rL361953: [SVE][IR] Scalable Vector IR Type (authored by huntergr). May 29 2019, 5:20 AM

This revision was automatically updated to reflect the committed changes.

Herald added a project: Restricted Project. May 29 2019, 5:20 AM

Herald added a subscriber: kristina.

Hi, this slowed down thinlto links 3-4x, and other things probably too. PR42210 has details. I've reverted this for now in r362913.

In D32530#1535665, @thakis wrote:

Hi, this slowed down thinlto links 3-4x, and other things probably too. PR42210 has details. I've reverted this for now in r362913.

Ok, thanks. I'll investigate -- I guess having a persistent map for the duration of verification might resolve it.
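To illustrate what a "persistent map for the duration of verification" could look like, here is a C++ sketch of a memoized check for "does this type contain a scalable vector". It is not the fix that was committed: `Ty`, `ScalableCheckCache` and `demo_visits` are all hypothetical, the type graph is a toy, and types are assumed to be acyclic.

```cpp
#include <cassert>
#include <unordered_map>
#include <vector>

// Toy type node: either a scalable vector or an aggregate of other types.
struct Ty {
  bool scalable_vector = false;
  std::vector<const Ty*> members;
};

// Cache that lives for the whole verification run, so each distinct type is
// walked at most once no matter how many globals use it.
class ScalableCheckCache {
public:
  bool contains_scalable(const Ty* ty) {
    assert(ty != nullptr);
    auto found = cache_.find(ty);
    if (found != cache_.end())
      return found->second;  // already answered for this type
    ++visits_;
    bool result = ty->scalable_vector;
    for (const Ty* member : ty->members) {
      if (contains_scalable(member)) {
        result = true;
        break;
      }
    }
    cache_.emplace(ty, result);
    return result;
  }

  unsigned visits() const { return visits_; }

private:
  std::unordered_map<const Ty*, bool> cache_;
  unsigned visits_ = 0;
};

// Demo: querying the same aggregate twice walks its three types only once.
inline unsigned demo_visits() {
  Ty vec;
  vec.scalable_vector = true;
  Ty inner;
  inner.members = {&vec};
  Ty outer;
  outer.members = {&inner, &inner};
  ScalableCheckCache cache;
  const bool first = cache.contains_scalable(&outer);
  const bool second = cache.contains_scalable(&outer);
  return (first && second) ? cache.visits() : 0;
}
```

The trade-off is the usual one for memoization: extra memory for the map in exchange for not re-walking large nested types for every global that refers to them.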

@huntergr do you have an account on bugzilla? I couldn't CC you on that bug.

In D32530#1535948, @rengolin wrote:

@huntergr do you have an account on bugzilla? I couldn't CC you on that bug.

No. I'll sign up for one.

Ka-Ka added a subscriber: Ka-Ka. Jun 11 2019, 6:46 AM

huntergr mentioned this in D63321: [SVE][IR] Scalable Vector IR Type with pr42210 fix. Jun 14 2019, 12:56 AM

huntergr mentioned this in rL363658: [SVE][IR] Scalable Vector IR Type with pr42210 fix. Jun 18 2019, 3:10 AM

huntergr mentioned this in rG43854e3ccc7f: [SVE][IR] Scalable Vector IR Type with pr42210 fix.

hans mentioned this in rL364543: Revert r363658 "[SVE][IR] Scalable Vector IR Type with pr42210 fix". Jun 27 2019, 6:55 AM

hansw mentioned this in rG408fc0849ea1: Revert r363658 "[SVE][IR] Scalable Vector IR Type with pr42210 fix". Jun 27 2019, 6:56 AM

huntergr mentioned this in D64079: Scalable Vector IR Type (Try 3). Jul 2 2019, 7:09 AM

huntergr mentioned this in rL365203: Scalable Vector IR Type with further LTO fixes. Jul 5 2019, 5:48 AM

huntergr mentioned this in rG957c40db6aeb: Scalable Vector IR Type with further LTO fixes.

Diffusion mentioned this in rL368024: [AArch64] NFC: Add generic StackOffset to describe scalable offsets. Aug 6 2019, 6:06 AM

sdesmalen mentioned this in rG612b03896610: [AArch64] NFC: Add generic StackOffset to describe scalable offsets. Aug 6 2019, 6:07 AM

Link to llvm presentation for reference: https://llvm.org/devmtg/2019-04/slides/TechTalk-Kruppe-Espasa-RISC-V_Vectors_and_LLVM.pdf

Herald added a reviewer: efriedma. Jan 29 2023, 10:18 AM

Herald added a project: Restricted Project.

Herald added subscribers: alextsao1999, ormris.

dexonsmith removed a subscriber: dexonsmith. Jan 29 2023, 10:20 AM

Revision Contents

Path                                      Size
docs/LangRef.rst                          35 lines
include/llvm/ADT/DenseMapInfo.h           16 lines
include/llvm/IR/DerivedTypes.h            68 lines
include/llvm/IR/Type.h                    1 line
include/llvm/Support/ScalableSize.h       43 lines
lib/AsmParser/LLLexer.cpp                 1 line
lib/AsmParser/LLParser.cpp                11 lines
lib/AsmParser/LLToken.h                   1 line
lib/Bitcode/Reader/BitcodeReader.cpp      6 lines
lib/Bitcode/Writer/BitcodeWriter.cpp      5 lines
lib/IR/AsmWriter.cpp                      5 lines
lib/IR/LLVMContextImpl.h                  2 lines
lib/IR/Type.cpp                           12 lines
lib/IR/Verifier.cpp                       67 lines
test/Bitcode/compatibility.ll             4 lines
test/Verifier/scalable-aggregates.ll      31 lines
test/Verifier/scalable-global-vars.ll     24 lines
unittests/IR/CMakeLists.txt               1 line
unittests/IR/VectorTypesTest.cpp          164 lines

Diff 191733

docs/LangRef.rst

	Show First 20 Lines • Show All 669 Lines • ▼ Show 20 Lines
	Globals can also have a :ref:`DLL storage class <dllstorageclass>`,			Globals can also have a :ref:`DLL storage class <dllstorageclass>`,
	an optional :ref:`runtime preemption specifier <runtime_preemption_model>`,			an optional :ref:`runtime preemption specifier <runtime_preemption_model>`,
	an optional :ref:`global attributes <glattrs>` and			an optional :ref:`global attributes <glattrs>` and
	an optional list of attached :ref:`metadata <metadata>`.			an optional list of attached :ref:`metadata <metadata>`.

	Variables and aliases can have a			Variables and aliases can have a
	:ref:`Thread Local Storage Model <tls_model>`.			:ref:`Thread Local Storage Model <tls_model>`.

				:ref:`Scalable vectors <t_vector>` cannot be global variables or members of
				greened (inline comment, Done): Add a similar comment about scalable vector types in aggregates?
				structs or arrays because their size is unknown at compile time.

	Syntax::			Syntax::

	@<GlobalVarName> = [Linkage] [PreemptionSpecifier] [Visibility]			@<GlobalVarName> = [Linkage] [PreemptionSpecifier] [Visibility]
	[DLLStorageClass] [ThreadLocal]			[DLLStorageClass] [ThreadLocal]
	[(unnamed_addr\|local_unnamed_addr)] [AddrSpace]			[(unnamed_addr\|local_unnamed_addr)] [AddrSpace]
	[ExternallyInitialized]			[ExternallyInitialized]
	<global \| constant> <Type> [<InitializerConstant>]			<global \| constant> <Type> [<InitializerConstant>]
	[, section "name"] [, comdat [($name)]]			[, section "name"] [, comdat [($name)]]
	▲ Show 20 Lines • Show All 2,037 Lines • ▼ Show 20 Lines
	Vector Type			Vector Type
	"""""""""""			"""""""""""

	:Overview:			:Overview:

	A vector type is a simple derived type that represents a vector of			A vector type is a simple derived type that represents a vector of
	elements. Vector types are used when multiple primitive data are			elements. Vector types are used when multiple primitive data are
	operated in parallel using a single instruction (SIMD). A vector type			operated in parallel using a single instruction (SIMD). A vector type
	requires a size (number of elements) and an underlying primitive data			requires a size (number of elements), an underlying primitive data type,
	type. Vector types are considered :ref:`first class <t_firstclass>`.			and a scalable property to represent vectors where the exact hardware
				vector length is unknown at compile time. Vector types are considered
				:ref:`first class <t_firstclass>`.

	:Syntax:			:Syntax:

	::			::

	< <# elements> x <elementtype> >			< <# elements> x <elementtype> > ; Fixed-length vector
				< scalable <# elements> x <elementtype> > ; Scalable vector

	The number of elements is a constant integer value larger than 0;			The number of elements is a constant integer value larger than 0;
	elementtype may be any integer, floating-point or pointer type. Vectors			elementtype may be any integer, floating-point or pointer type. Vectors
	of size zero are not allowed.			of size zero are not allowed. For scalable vectors, the number of
				elements is an unknown integer multiple of the number of elements.
hfinkel (inline comment, Done):

This doesn't seem strong enough. We need the unknown multiple to be the same for any given type (at least within a given function). We also need a relationship between vectors of different underlying types (so that zext/sext/etc. make sense). Otherwise, you can't even sensibly add them together (for example).

I realize that it says something about an unknown vector length above, but we need to translate that statement into semantics that make sense for the vectors themselves.
rengolin (inline comment, Not Done):

It's not that simple. Both SVE and RISC-V can have vector multiplier changes in the middle of a function (via system register or similar). Neither of them want that to be the norm, but IIRC, RISC-V doesn't want it to change inside a function and SVE wants it to be the same for the whole program.

I totally agree with you that leaving it open is a huge can of worms, and wanting a per-function change would probably need new annotation on functions, which if ever done, should be orthogonal to this change (or would lead us into madness).

I second your proposal that we fix the semantics in LLVM, for now, that the "unknown width" is the same throughout the program and that the existing relationship between fixed vectors extends to scalable vectors. If you look at the changes in this patch series, it assumes that behaviour already, by getting new vector types of half-size with double-elements and so on.

IFF RISC-V wants to extend the logic to be per-function, then we will need to do a much more extensive analysis on the passes, especially around inlining and function calls. I strongly propose we don't look at it right now and fix the semantics as proposed above.

In my analysis, with that semantics, I don't see a big impact on any existing non-scalable optimisations. With vectorisation passes being run at the end of the pipeline, even for scalable code, most of the existing pipeline will still be relevant, too.
rkruppe (inline comment, Not Done):

So first of all, I agree that this patch does (and should) only implement a single "constant" (at runtime) `vscale` value. The current wording here in LangRef is ambiguous about this, it doesn't make clear at which scope the "unknown integer multiple" is fixed. It should be made clear that this factor is the same for all vector types and does not change while the program executes (and, once the `vscale` intrinsic is added, this section here should also point at it).

Second, since you mentioned it I should say: the RISC-V vector extension has now changed to a point where (at least in its standard incarnation, without further extensions on top of it) there is no need for vscale to change during runtime, let alone between individual functions. All the changes to the "vector register size" now happen in such a way that it's easy to express the longer vectors in IR by just using different vector types with higher `ElementCount::Min`, e.g. `<scalable 8 x double>` instead of `<scalable 1 x double>`. So from the RVV side, there's no need any more for `vscale` to vary e.g. function-by-function. I don't know whether the SVE folks want to take a shot at it regardless, but in past discussion it sounded like the vscale-per-function model wasn't a good fit for the programming model they envisioned, so maybe they'd come up with a different solution if and when they tackle that problem.
huntergr (author, inline comment, Not Done):

For SVE at least, we can consider changing the vector length during execution to be undefined behaviour. The compiler is not expected to be able to handle it.

For RVV, given the new restrictions on how vlmul is handled, I think they won't need to change the multiple at runtime either -- just increase the minimum number of lanes. I'm hoping to discuss this with Robin at EuroLLVM, assuming time permits.

I'll come up with some stricter wording.
hfinkel (inline comment, Not Done):

I definitely agree that we should not deal with changing the vscale during program execution.

I think that the model is: There is an underlying vector length. vscale = round(vector length in bits / primitive size in bits). Can we specify it like that? We do also need to define what the rounding is. What does <scalable 4 x i3> do? Or is it not allowed?
hsaito (inline comment, Not Done):

> I definitely agree that we should not deal with changing the vscale during program execution.

I agree that this will make things a lot simpler than allowing it to change per function or in the middle of a function. However, I don't quite agree that changing vscale per function is an orthogonal issue. What are we going to do when function foo() with vscale=X calls function bar() with vscale=Y using a scalable vector parameter?

Having said that, since I don't expect the discussions to converge anytime soon if we talk about vscale changing within a compilation unit, I agree we should move forward with vscale not changing within a compilation unit (we say program execution, but compiler's visibility is always limited to compilation unit). It should be sufficient to say that if multiple compilation units with different vscale are linked, unspecified behavior will result.

@hfinkel, I think the model is "#elements * elementtype" fits in one or more "units of vector" and then apply vscale to it. I don't think scalable vector needs to fit one physical register of HW. Vector type legalization should kick-in. @huntergr, please correct me if my mental model is wrong.
rengolin (inline comment, Not Done):

> However, I don't quite agree that changing vscale per function is an orthogonal issue.

I didn't mean the implementation, but the discussion. I think a per-function vscale implementation will be very different from the current one, no matter which course we take now. It won't matter much if we have native or intrinsic implementation, we'll still need function attributes and teach the optimisation passes, etc.

> Having said that, since I don't expect the discussions to converge anytime soon if we talk about vscale changing within a compilation unit

If the scope is the compilation unit, then we'd need it to be fixed on the target string, or we won't be able to link two units together. I think even this discussion is too soon, and we should push the scope to the whole program.

Any change in vscale throughout the program should be undefined, or we'd have to encode the necessary logic in the compiler, which is the biggest worry I see from the feedback. So far, the benefits of doing so are on edge cases and the actual costs are unknown (but very likely large).

In my view, this is definitely not a subject we should raise right now and restricting the current implementation to whole-program scope is the only way we can go forward for now in any sensible way.
hfinkel (inline comment, Not Done):

> I didn't mean the implementation, but the discussion.

As I've said in previous thread, I don't believe that we can sensibly model a changing vscale without some SSA dependence, and that will require significant changes to the overall scheme.

> restricting the current implementation to whole-program scope is the only way we can go forward for now in any sensible way.

+1

> I think the model is "#elements * elementtype" fits in one or more "units of vector" and then apply vscale to it. I don't think scalable vector needs to fit one physical register of HW. Vector type legalization should kick-in.

Indeed, I believe you're correct. We need to account for this in the definition too.
hfinkel (inline comment, Not Done):

> Indeed, I believe you're correct. We need to account for this in the definition too.

Either by having a model that includes legalization, or by restricting the size of the base vector type?
huntergr (author, inline comment, Done):

We perform legalization for scalable vectors with the same mechanisms fixed-length vectors do (splitting for too large, promoting/extending for too small). Should this be documented in this description (it isn't for fixed vectors), or is there a better place in the docs for that explanation?

(A side note; for 'unpacked' float vector types (e.g. <scalable 2 x float>) we do declare them as legal for SVE then generate predicates to mask off the unused lanes in AArch64 specific code. Since there are more predicated architectures being added to the codebase, perhaps this could be generalized as a new legalization mechanism for fp vector types)
				hfinkel:
				> We perform legalization for scalable vectors with the same mechanisms fixed-length vectors do (splitting for too large, promoting/extending for too small).
				What defines too big (what size is used for splitting)? Whether `<scalable 8 x double>` fits in the vector register depends on the runtime vector size, no?
				> Should this be documented in this description
				No, unless it's part of the IR-level model. What we need here is a model defined, at the IR level, that explains why:
				- I can add two <scalable 4 x float> vectors together.
				- I cannot add a <scalable 4 x float> to a <scalable 2 x float>.
				- I can sext a <scalable 4 x i32> to a <scalable 4 x i64>, and this can be bitcast to a <scalable 8 x i32>.
				Also, we should address what happens to vectors with an odd number of lanes or with non-power-of-two-sized primitive types (both of which are defined at the IR level).
				huntergr (author):
				> What defines too big (what size is used for splitting)? If <scalable 8 x double> fits in the vector register depends on the runtime vector size, no?
				No. For SVE, the legal types are those where the minimum size is equal to 128 bits, since that's the minimum size for hardware registers (and the granularity of increments of register size). So an operation using <scalable 8 x double> values would need to be split into 4 <scalable 2 x double> operations for SVE during legalization. (I'm ignoring the predicated unpacked float forms for a moment.)
				I wonder if the change in syntax from <n x 8 x double> to <scalable 8 x double> makes that less obvious.
				The basis of the model can be grounded in '1' being a valid value for vscale, which effectively makes the types equivalent to fixed-length vectors. I'll try coming up with a description based on that.
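The splitting rule described in the comment above can be sketched in a few lines of standalone C++. This is only an illustration, not code from the patch: the `ScalableVecTy` struct and the `numLegalParts` helper are hypothetical names, and the 128-bit granule is the SVE value quoted in the discussion.

```cpp
#include <cassert>

// Hypothetical model of <scalable MinElts x T>, where T is EltBits wide.
struct ScalableVecTy {
  unsigned MinElts; // element count when vscale == 1
  unsigned EltBits; // element width in bits
};

// Assumption: a legal scalable type has a minimum size of exactly
// GranuleBits (128 for SVE, per the discussion). Returns the number of
// legal-sized parts an operation on Ty is split into, or 0 when Ty is
// not a whole multiple of the granule (such types would need
// promotion/extension rather than splitting).
unsigned numLegalParts(ScalableVecTy Ty, unsigned GranuleBits = 128) {
  unsigned MinBits = Ty.MinElts * Ty.EltBits;
  if (MinBits < GranuleBits || MinBits % GranuleBits != 0)
    return 0;
  return MinBits / GranuleBits;
}

// Usage: <scalable 8 x double> splits into 4 x <scalable 2 x double>.
void checkSplitExample() {
  assert(numLegalParts({8, 64}) == 4);
  assert(numLegalParts({2, 64}) == 1);
}
```

The computation uses only the minimum size, which is what allows the split count to be decided at compile time even though the runtime register width is unknown.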

	:Examples:			:Examples:

	+-------------------+--------------------------------------------------+			+------------------------+----------------------------------------------------+
				greened: Just want to double-check: there is nothing about scalable vectors that assumes all vector types have the same bit width, correct? Can <scalable 1 x float> have a different bit width from <scalable 1 x double>?
				PkmX: I believe the intention is that `<scalable 1 x double>` should have twice as many bits as `<scalable 1 x float>`, or the same number of bits as `<scalable 2 x float>`.
				huntergr (author): That's correct. I think that a clearer syntax might be `<vscale x 1 x float>` to indicate that the number of elements is being multiplied by the same `vscale` term for all scalable vector types. The intent is that we should be able to reason about the relative sizes of different scalable vector types based on the element size and minimum number of lanes alone.
				greened: +1 for `<vscale x 1 x float>`.
	\| ``<4 x i32>`` \| Vector of 4 32-bit integer values. \|			\| ``<4 x i32>`` \| Vector of 4 32-bit integer values. \|
	+-------------------+--------------------------------------------------+			+------------------------+----------------------------------------------------+
	\| ``<8 x float>`` \| Vector of 8 32-bit floating-point values. \|			\| ``<8 x float>`` \| Vector of 8 32-bit floating-point values. \|
	+-------------------+--------------------------------------------------+			+------------------------+----------------------------------------------------+
	\| ``<2 x i64>`` \| Vector of 2 64-bit integer values. \|			\| ``<2 x i64>`` \| Vector of 2 64-bit integer values. \|
	+-------------------+--------------------------------------------------+			+------------------------+----------------------------------------------------+
	\| ``<4 x i64*>`` \| Vector of 4 pointers to 64-bit integer values. \|			\| ``<4 x i64*>`` \| Vector of 4 pointers to 64-bit integer values. \|
	+-------------------+--------------------------------------------------+			+------------------------+----------------------------------------------------+
				\| ``<scalable 4 x i32>`` \| Vector with a multiple of 4 32-bit integer values. \|
				+------------------------+----------------------------------------------------+
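The size relationship discussed in the review comments within this table (one shared `vscale` multiplier for every scalable type) can be illustrated with a small standalone sketch. The names `ScalableVecTy`, `totalBits` and `sameSize` are hypothetical and do not appear in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of <scalable MinElts x T>, where T is EltBits wide.
struct ScalableVecTy {
  unsigned MinElts;
  unsigned EltBits;
};

// Assumption: total size = vscale * MinElts * EltBits, with a single
// vscale value shared by all scalable vector types in the program.
std::uint64_t totalBits(ScalableVecTy Ty, unsigned VScale) {
  return std::uint64_t(VScale) * Ty.MinElts * Ty.EltBits;
}

// Because vscale cancels out, relative sizes follow from the minimum
// sizes alone.
bool sameSize(ScalableVecTy A, ScalableVecTy B) {
  return A.MinElts * A.EltBits == B.MinElts * B.EltBits;
}

void checkSizeExample() {
  ScalableVecTy F1{1, 32}, D1{1, 64}, F2{2, 32};
  // <scalable 1 x double> has twice the bits of <scalable 1 x float>
  // for any vscale, and the same number of bits as <scalable 2 x float>.
  assert(totalBits(D1, 3) == 2 * totalBits(F1, 3));
  assert(sameSize(D1, F2));
}
```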

	.. _t_label:			.. _t_label:

	Label Type			Label Type
	^^^^^^^^^^			^^^^^^^^^^

	:Overview:			:Overview:

	▲ Show 20 Lines • Show All 5,410 Lines • ▼ Show 20 Lines

	The result is a vector of the same type as ``val``. Its element values			The result is a vector of the same type as ``val``. Its element values
	are those of ``val`` except at position ``idx``, where it gets the value			are those of ``val`` except at position ``idx``, where it gets the value
	``elt``. If ``idx`` exceeds the length of ``val``, the result			``elt``. If ``idx`` exceeds the length of ``val``, the result
	is a :ref:`poison value <poisonvalues>`.			is a :ref:`poison value <poisonvalues>`.

	Example:			Example:
	""""""""			""""""""

				hfinkel: I believe that all you need to say is: For a scalable vector, if the value of ``idx`` exceeds the runtime length of the vector, the result is a poison value. This part about the "result in the IR" is confusing, and is just stating a basic consistency fact about LLVM's type system. I recommend removing that.
	.. code-block:: text			.. code-block:: text

	<result> = insertelement <4 x i32> %vec, i32 1, i32 0 ; yields <4 x i32>			<result> = insertelement <4 x i32> %vec, i32 1, i32 0 ; yields <4 x i32>

	.. _i_shufflevector:			.. _i_shufflevector:

	'``shufflevector``' Instruction			'``shufflevector``' Instruction
	^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^			^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
	▲ Show 20 Lines • Show All 8,678 Lines • Show Last 20 Lines

include/llvm/ADT/DenseMapInfo.h

	Show All 11 Lines

	#ifndef LLVM_ADT_DENSEMAPINFO_H			#ifndef LLVM_ADT_DENSEMAPINFO_H
	#define LLVM_ADT_DENSEMAPINFO_H			#define LLVM_ADT_DENSEMAPINFO_H

	#include "llvm/ADT/ArrayRef.h"			#include "llvm/ADT/ArrayRef.h"
	#include "llvm/ADT/Hashing.h"			#include "llvm/ADT/Hashing.h"
	#include "llvm/ADT/StringRef.h"			#include "llvm/ADT/StringRef.h"
	#include "llvm/Support/PointerLikeTypeTraits.h"			#include "llvm/Support/PointerLikeTypeTraits.h"
				#include "llvm/Support/ScalableSize.h"
	#include <cassert>			#include <cassert>
	#include <cstddef>			#include <cstddef>
	#include <cstdint>			#include <cstdint>
	#include <utility>			#include <utility>

	namespace llvm {			namespace llvm {

	template<typename T>			template<typename T>
	▲ Show 20 Lines • Show All 235 Lines • ▼ Show 20 Lines

	template <> struct DenseMapInfo<hash_code> {			template <> struct DenseMapInfo<hash_code> {
	static inline hash_code getEmptyKey() { return hash_code(-1); }			static inline hash_code getEmptyKey() { return hash_code(-1); }
	static inline hash_code getTombstoneKey() { return hash_code(-2); }			static inline hash_code getTombstoneKey() { return hash_code(-2); }
	static unsigned getHashValue(hash_code val) { return val; }			static unsigned getHashValue(hash_code val) { return val; }
	static bool isEqual(hash_code LHS, hash_code RHS) { return LHS == RHS; }			static bool isEqual(hash_code LHS, hash_code RHS) { return LHS == RHS; }
	};			};

				template <> struct DenseMapInfo<ElementCount> {
				static inline ElementCount getEmptyKey() { return {~0U, true}; }
				static inline ElementCount getTombstoneKey() { return {~0U - 1, false}; }
				static unsigned getHashValue(const ElementCount& EltCnt) {
				if (EltCnt.Scalable)
				return (EltCnt.Min * 37U) - 1U;

				return EltCnt.Min * 37U;
				}

				static bool isEqual(const ElementCount& LHS, const ElementCount& RHS) {
				return LHS == RHS;
				}
				};

	} // end namespace llvm			} // end namespace llvm

	#endif // LLVM_ADT_DENSEMAPINFO_H			#endif // LLVM_ADT_DENSEMAPINFO_H
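To see how the hash in `DenseMapInfo<ElementCount>` keeps fixed and scalable counts apart, here is a standalone sketch that mirrors `getHashValue`. It is an approximation under stated assumptions: `std::unordered_map` stands in for `DenseMap`, and `ElementCountHash`, `NameMap` and `makeNames` are hypothetical names introduced only for this illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>

// Simplified stand-in for the ElementCount class added by the patch.
struct ElementCount {
  unsigned Min;
  bool Scalable;
  bool operator==(const ElementCount &RHS) const {
    return Min == RHS.Min && Scalable == RHS.Scalable;
  }
};

// Same arithmetic as getHashValue in the diff: a scalable count hashes
// to one less than the fixed count with the same minimum.
struct ElementCountHash {
  std::size_t operator()(const ElementCount &EC) const {
    if (EC.Scalable)
      return (EC.Min * 37U) - 1U;
    return EC.Min * 37U;
  }
};

using NameMap = std::unordered_map<ElementCount, std::string, ElementCountHash>;

// Both keys have Min == 4 but remain distinct entries.
NameMap makeNames() {
  NameMap M;
  M[ElementCount{4, false}] = "<4 x i32>";
  M[ElementCount{4, true}] = "<scalable 4 x i32>";
  return M;
}
```

Equality, not the hash, is what ultimately distinguishes keys, so the hash only needs to spread the two variants into different buckets in the common case.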

include/llvm/IR/DerivedTypes.h

Show All 17 Lines
#define LLVM_IR_DERIVEDTYPES_H		#define LLVM_IR_DERIVEDTYPES_H

#include "llvm/ADT/ArrayRef.h"		#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
#include "llvm/IR/Type.h"		#include "llvm/IR/Type.h"
#include "llvm/Support/Casting.h"		#include "llvm/Support/Casting.h"
#include "llvm/Support/Compiler.h"		#include "llvm/Support/Compiler.h"
		#include "llvm/Support/ScalableSize.h"
#include <cassert>		#include <cassert>
#include <cstdint>		#include <cstdint>

namespace llvm {		namespace llvm {

class Value;		class Value;
class APInt;		class APInt;
class LLVMContext;		class LLVMContext;
▲ Show 20 Lines • Show All 348 Lines • ▼ Show 20 Lines	SequentialType(TypeID TID, Type *ElType, uint64_t NumElements)
ContainedTys = &ContainedType;		ContainedTys = &ContainedType;
NumContainedTys = 1;		NumContainedTys = 1;
}		}

public:		public:
SequentialType(const SequentialType &) = delete;		SequentialType(const SequentialType &) = delete;
SequentialType &operator=(const SequentialType &) = delete;		SequentialType &operator=(const SequentialType &) = delete;

		/// For scalable vectors, this will return the minimum number of elements
		/// in the vector.
uint64_t getNumElements() const { return NumElements; }		uint64_t getNumElements() const { return NumElements; }
Type *getElementType() const { return ContainedType; }		Type *getElementType() const { return ContainedType; }

/// Methods for support type inquiry through isa, cast, and dyn_cast.		/// Methods for support type inquiry through isa, cast, and dyn_cast.
static bool classof(const Type *T) {		static bool classof(const Type *T) {
return T->getTypeID() == ArrayTyID \|\| T->getTypeID() == VectorTyID;		return T->getTypeID() == ArrayTyID \|\| T->getTypeID() == VectorTyID;
}		}
};		};
Show All 19 Lines
};		};

uint64_t Type::getArrayNumElements() const {		uint64_t Type::getArrayNumElements() const {
return cast<ArrayType>(this)->getNumElements();		return cast<ArrayType>(this)->getNumElements();
}		}

/// Class to represent vector types.		/// Class to represent vector types.
class VectorType : public SequentialType {		class VectorType : public SequentialType {
VectorType(Type *ElType, unsigned NumEl);		/// A fully specified VectorType is of the form <scalable n x Ty>. 'n' is the
		/// minimum number of elements of type Ty contained within the vector, and
		/// 'scalable' indicates that the total element count is an integer multiple
		/// of 'n', where the multiple is either guaranteed to be one, or is
		/// statically unknown at compile time.
		///
		/// If the multiple is known to be 1, then the extra term is discarded in
		/// textual IR:
		///
		/// <4 x i32> - a vector containing 4 i32s
		/// <scalable 4 x i32> - a vector containing an unknown integer multiple
		/// of 4 i32s

		VectorType(Type *ElType, unsigned NumEl, bool Scalable = false);
		VectorType(Type *ElType, ElementCount EC);

		// If true, the total number of elements is an unknown multiple of the
		// minimum 'NumElements' from SequentialType. Otherwise the total number
		// of elements is exactly equal to 'NumElements'.
		rovka: Nit: Punctuation (comments should end with .)
		bool Scalable;

public:		public:
VectorType(const VectorType &) = delete;		VectorType(const VectorType &) = delete;
VectorType &operator=(const VectorType &) = delete;		VectorType &operator=(const VectorType &) = delete;

/// This static method is the primary way to construct an VectorType.		/// This static method is the primary way to construct an VectorType.
static VectorType *get(Type *ElementType, unsigned NumElements);		static VectorType *get(Type *ElementType, ElementCount EC);
		static VectorType *get(Type *ElementType, unsigned NumElements,
		bool Scalable = false) {
		return VectorType::get(ElementType, {NumElements, Scalable});
		}

/// This static method gets a VectorType with the same number of elements as		/// This static method gets a VectorType with the same number of elements as
/// the input type, and the element type is an integer type of the same width		/// the input type, and the element type is an integer type of the same width
/// as the input element type.		/// as the input element type.
static VectorType *getInteger(VectorType *VTy) {		static VectorType *getInteger(VectorType *VTy) {
unsigned EltBits = VTy->getElementType()->getPrimitiveSizeInBits();		unsigned EltBits = VTy->getElementType()->getPrimitiveSizeInBits();
assert(EltBits && "Element size must be of a non-zero size");		assert(EltBits && "Element size must be of a non-zero size");
Type *EltTy = IntegerType::get(VTy->getContext(), EltBits);		Type *EltTy = IntegerType::get(VTy->getContext(), EltBits);
return VectorType::get(EltTy, VTy->getNumElements());		return VectorType::get(EltTy, VTy->getElementCount());
}		}

/// This static method is like getInteger except that the element types are		/// This static method is like getInteger except that the element types are
/// twice as wide as the elements in the input type.		/// twice as wide as the elements in the input type.
static VectorType *getExtendedElementVectorType(VectorType *VTy) {		static VectorType *getExtendedElementVectorType(VectorType *VTy) {
unsigned EltBits = VTy->getElementType()->getPrimitiveSizeInBits();		unsigned EltBits = VTy->getElementType()->getPrimitiveSizeInBits();
Type *EltTy = IntegerType::get(VTy->getContext(), EltBits * 2);		Type *EltTy = IntegerType::get(VTy->getContext(), EltBits * 2);
return VectorType::get(EltTy, VTy->getNumElements());		return VectorType::get(EltTy, VTy->getElementCount());
}		}

/// This static method is like getInteger except that the element types are		/// This static method is like getInteger except that the element types are
/// half as wide as the elements in the input type.		/// half as wide as the elements in the input type.
static VectorType *getTruncatedElementVectorType(VectorType *VTy) {		static VectorType *getTruncatedElementVectorType(VectorType *VTy) {
unsigned EltBits = VTy->getElementType()->getPrimitiveSizeInBits();		unsigned EltBits = VTy->getElementType()->getPrimitiveSizeInBits();
assert((EltBits & 1) == 0 &&		assert((EltBits & 1) == 0 &&
"Cannot truncate vector element with odd bit-width");		"Cannot truncate vector element with odd bit-width");
Type *EltTy = IntegerType::get(VTy->getContext(), EltBits / 2);		Type *EltTy = IntegerType::get(VTy->getContext(), EltBits / 2);
return VectorType::get(EltTy, VTy->getNumElements());		return VectorType::get(EltTy, VTy->getElementCount());
}		}

/// This static method returns a VectorType with half as many elements as the		/// This static method returns a VectorType with half as many elements as the
/// input type and the same element type.		/// input type and the same element type.
static VectorType *getHalfElementsVectorType(VectorType *VTy) {		static VectorType *getHalfElementsVectorType(VectorType *VTy) {
unsigned NumElts = VTy->getNumElements();		auto EltCnt = VTy->getElementCount();
assert ((NumElts & 1) == 0 &&		assert ((EltCnt.Min & 1) == 0 &&
"Cannot halve vector with odd number of elements.");		"Cannot halve vector with odd number of elements.");
return VectorType::get(VTy->getElementType(), NumElts/2);		return VectorType::get(VTy->getElementType(), EltCnt/2);
}		}

/// This static method returns a VectorType with twice as many elements as the		/// This static method returns a VectorType with twice as many elements as the
/// input type and the same element type.		/// input type and the same element type.
static VectorType *getDoubleElementsVectorType(VectorType *VTy) {		static VectorType *getDoubleElementsVectorType(VectorType *VTy) {
unsigned NumElts = VTy->getNumElements();		auto EltCnt = VTy->getElementCount();
return VectorType::get(VTy->getElementType(), NumElts*2);		assert((VTy->getNumElements() * 2ull) <= UINT_MAX &&
		"Too many elements in vector");
		return VectorType::get(VTy->getElementType(), EltCnt*2);
}		}

/// Return true if the specified type is valid as a element type.		/// Return true if the specified type is valid as a element type.
static bool isValidElementType(Type *ElemTy);		static bool isValidElementType(Type *ElemTy);

/// Return the number of bits in the Vector type.		/// Return an ElementCount instance to represent the (possibly scalable)
		/// number of elements in the vector.
		rovka: Nit: Punctuation.
		ElementCount getElementCount() const {
		uint64_t MinimumEltCnt = getNumElements();
		assert(MinimumEltCnt <= UINT_MAX && "Too many elements in vector");
		rkruppe: Nit: This restriction on the range of NumElements is very reasonable, but we should make it a proper invariant of the type and enforce it in `VectorType::get` rather than post-hoc in some accessors.
		huntergr (author): VectorType::get already has this restriction, at least on most platforms -- the argument to it is 'unsigned', which is usually 32 bits. The ElementCount struct also uses 'unsigned'. It may be worth changing it to an explicit uint32_t. I think VectorType originally stored the number of elements as a 32-bit field, so it was consistent with the interface, but at some point the different SequentialType variants were changed to unify on a single 64-bit size field in the parent class. I will create an updated patch to check that we don't overflow when using methods like getDoubleElementsVectorType; good catch.
		rkruppe: Oh, right, I was under the mistaken impression that VectorType::get took uint64_t. So the potential overflow in getDoubleElementsVectorType is pre-existing, but if you want to take the time to harden against it now, that's great. I don't think changing unsigned to uint32_t is worth the churn, at least not in this patch.
		return { (unsigned)MinimumEltCnt, Scalable };
		}

		/// Returns whether or not this is a scalable vector (meaning the total
		/// element count is a multiple of the minimum).
		bool isScalable() const {
		return Scalable;
		}

		/// Return the minimum number of bits in the Vector type.
		rovka: Nit: I'd like to see a similar comment in SequentialType::getNumElements.
/// Returns zero when the vector is a vector of pointers.		/// Returns zero when the vector is a vector of pointers.
unsigned getBitWidth() const {		unsigned getBitWidth() const {
return getNumElements() * getElementType()->getPrimitiveSizeInBits();		return getNumElements() * getElementType()->getPrimitiveSizeInBits();
}		}

/// Methods for support type inquiry through isa, cast, and dyn_cast.		/// Methods for support type inquiry through isa, cast, and dyn_cast.
static bool classof(const Type *T) {		static bool classof(const Type *T) {
return T->getTypeID() == VectorTyID;		return T->getTypeID() == VectorTyID;
}		}
};		};

unsigned Type::getVectorNumElements() const {		unsigned Type::getVectorNumElements() const {
return cast<VectorType>(this)->getNumElements();		return cast<VectorType>(this)->getNumElements();
}		}

		bool Type::getVectorIsScalable() const {
		return cast<VectorType>(this)->isScalable();
		}

/// Class to represent pointers.		/// Class to represent pointers.
class PointerType : public Type {		class PointerType : public Type {
explicit PointerType(Type *ElType, unsigned AddrSpace);		explicit PointerType(Type *ElType, unsigned AddrSpace);

Type *PointeeTy;		Type *PointeeTy;

public:		public:
PointerType(const PointerType &) = delete;		PointerType(const PointerType &) = delete;
Show All 36 Lines

include/llvm/IR/Type.h

Show First 20 Lines • Show All 360 Lines • ▼ Show 20 Lines	public:

inline uint64_t getArrayNumElements() const;		inline uint64_t getArrayNumElements() const;

Type *getArrayElementType() const {		Type *getArrayElementType() const {
assert(getTypeID() == ArrayTyID);		assert(getTypeID() == ArrayTyID);
return ContainedTys[0];		return ContainedTys[0];
}		}

		inline bool getVectorIsScalable() const;
inline unsigned getVectorNumElements() const;		inline unsigned getVectorNumElements() const;
Type *getVectorElementType() const {		Type *getVectorElementType() const {
assert(getTypeID() == VectorTyID);		assert(getTypeID() == VectorTyID);
return ContainedTys[0];		return ContainedTys[0];
}		}

Type *getPointerElementType() const {		Type *getPointerElementType() const {
assert(getTypeID() == PointerTyID);		assert(getTypeID() == PointerTyID);
▲ Show 20 Lines • Show All 108 Lines • Show Last 20 Lines

include/llvm/Support/ScalableSize.h

This file was added.

				//===- ScalableSize.h - Scalable vector size info ---------------*- C++ -*-===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				greened: Needs updated license.
				//
				// This file provides a struct that can be used to query the size of IR types
				// which may be scalable vectors. It provides convenience operators so that
				// it can be used in much the same way as a single scalar value.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_SUPPORT_SCALABLESIZE_H
				#define LLVM_SUPPORT_SCALABLESIZE_H

				namespace llvm {
				greened: Should be `LLVM_SUPPORT_SCALABLESIZE_H`.

				class ElementCount {
				public:
				unsigned Min; // Minimum number of vector elements.
				bool Scalable; // If true, NumElements is a multiple of 'Min' determined
				// at runtime rather than compile time.

				ElementCount(unsigned Min, bool Scalable)
				: Min(Min), Scalable(Scalable) {}
				rovka: Nit: Punctuation and capitalization (If [...])

				ElementCount operator*(unsigned RHS) {
				return { Min * RHS, Scalable };
				}
				ElementCount operator/(unsigned RHS) {
				return { Min / RHS, Scalable };
				}

				bool operator==(const ElementCount& RHS) const {
				return Min == RHS.Min && Scalable == RHS.Scalable;
				}
				};

				} // end namespace llvm

				#endif // LLVM_SUPPORT_SCALABLESIZE_H
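The following standalone sketch shows how the `ElementCount` operators are meant to be used by helpers such as `getHalfElementsVectorType` and `getDoubleElementsVectorType`, including the overflow check raised in the review. It is not the patch's code: the free functions `halve` and `doubled` are hypothetical, and the operators are marked `const` here, which the version in the diff does not do.

```cpp
#include <cassert>
#include <climits>

// Simplified copy of the class from ScalableSize.h.
class ElementCount {
public:
  unsigned Min;  // Minimum number of vector elements.
  bool Scalable; // If true, the real count is a runtime multiple of Min.

  ElementCount(unsigned Min, bool Scalable) : Min(Min), Scalable(Scalable) {}

  // Scaling the count only changes the minimum; scalability is preserved.
  ElementCount operator*(unsigned RHS) const { return {Min * RHS, Scalable}; }
  ElementCount operator/(unsigned RHS) const { return {Min / RHS, Scalable}; }

  bool operator==(const ElementCount &RHS) const {
    return Min == RHS.Min && Scalable == RHS.Scalable;
  }
};

// Hypothetical helper: halve the count, rejecting odd minimums.
ElementCount halve(ElementCount EC) {
  assert((EC.Min & 1) == 0 && "Cannot halve an odd element count");
  return EC / 2;
}

// Hypothetical helper: double the count, guarding against overflow of
// the 'unsigned' minimum by doing the check in 64-bit arithmetic.
ElementCount doubled(ElementCount EC) {
  assert(EC.Min * 2ull <= UINT_MAX && "Too many elements in vector");
  return EC * 2;
}
```

Because the arithmetic acts on `Min` only, the same helper works unchanged for `<8 x i32>` and `<scalable 8 x i32>`.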

lib/AsmParser/LLLexer.cpp

Show First 20 Lines • Show All 700 Lines • ▼ Show 20 Lines	#define KEYWORD(STR) \
KEYWORD(eq); KEYWORD(ne); KEYWORD(slt); KEYWORD(sgt); KEYWORD(sle);		KEYWORD(eq); KEYWORD(ne); KEYWORD(slt); KEYWORD(sgt); KEYWORD(sle);
KEYWORD(sge); KEYWORD(ult); KEYWORD(ugt); KEYWORD(ule); KEYWORD(uge);		KEYWORD(sge); KEYWORD(ult); KEYWORD(ugt); KEYWORD(ule); KEYWORD(uge);
KEYWORD(oeq); KEYWORD(one); KEYWORD(olt); KEYWORD(ogt); KEYWORD(ole);		KEYWORD(oeq); KEYWORD(one); KEYWORD(olt); KEYWORD(ogt); KEYWORD(ole);
KEYWORD(oge); KEYWORD(ord); KEYWORD(uno); KEYWORD(ueq); KEYWORD(une);		KEYWORD(oge); KEYWORD(ord); KEYWORD(uno); KEYWORD(ueq); KEYWORD(une);

KEYWORD(xchg); KEYWORD(nand); KEYWORD(max); KEYWORD(min); KEYWORD(umax);		KEYWORD(xchg); KEYWORD(nand); KEYWORD(max); KEYWORD(min); KEYWORD(umax);
KEYWORD(umin);		KEYWORD(umin);

		KEYWORD(scalable);
KEYWORD(x);		KEYWORD(x);
KEYWORD(blockaddress);		KEYWORD(blockaddress);

// Metadata types.		// Metadata types.
KEYWORD(distinct);		KEYWORD(distinct);

// Use-list order directives.		// Use-list order directives.
KEYWORD(uselistorder);		KEYWORD(uselistorder);
▲ Show 20 Lines • Show All 407 Lines • Show Last 20 Lines

lib/AsmParser/LLParser.cpp


Show First 20 Lines • Show All 2,688 Lines • ▼ Show 20 Lines	bool LLParser::ParseStructBody(SmallVectorImpl<Type*> &Body) {
return ParseToken(lltok::rbrace, "expected '}' at end of struct");		return ParseToken(lltok::rbrace, "expected '}' at end of struct");
}		}

/// ParseArrayVectorType - Parse an array or vector type, assuming the first		/// ParseArrayVectorType - Parse an array or vector type, assuming the first
/// token has already been consumed.		/// token has already been consumed.
/// Type		/// Type
/// ::= '[' APSINTVAL 'x' Types ']'		/// ::= '[' APSINTVAL 'x' Types ']'
/// ::= '<' APSINTVAL 'x' Types '>'		/// ::= '<' APSINTVAL 'x' Types '>'
		/// ::= '<' 'scalable' APSINTVAL 'x' Types '>'
bool LLParser::ParseArrayVectorType(Type *&Result, bool isVector) {		bool LLParser::ParseArrayVectorType(Type *&Result, bool isVector) {
		bool Scalable = false;

		if (isVector && Lex.getKind() == lltok::kw_scalable) {
		Lex.Lex(); // consume the 'scalable'

		Scalable = true;
		}

if (Lex.getKind() != lltok::APSInt \|\| Lex.getAPSIntVal().isSigned() \|\|		if (Lex.getKind() != lltok::APSInt \|\| Lex.getAPSIntVal().isSigned() \|\|
Lex.getAPSIntVal().getBitWidth() > 64)		Lex.getAPSIntVal().getBitWidth() > 64)
return TokError("expected number in address space");		return TokError("expected number in address space");

LocTy SizeLoc = Lex.getLoc();		LocTy SizeLoc = Lex.getLoc();
uint64_t Size = Lex.getAPSIntVal().getZExtValue();		uint64_t Size = Lex.getAPSIntVal().getZExtValue();
Lex.Lex();		Lex.Lex();

Show All 10 Lines	bool LLParser::ParseArrayVectorType(Type *&Result, bool isVector) {

if (isVector) {		if (isVector) {
if (Size == 0)		if (Size == 0)
return Error(SizeLoc, "zero element vector is illegal");		return Error(SizeLoc, "zero element vector is illegal");
if ((unsigned)Size != Size)		if ((unsigned)Size != Size)
return Error(SizeLoc, "size too large for vector");		return Error(SizeLoc, "size too large for vector");
if (!VectorType::isValidElementType(EltTy))		if (!VectorType::isValidElementType(EltTy))
return Error(TypeLoc, "invalid vector element type");		return Error(TypeLoc, "invalid vector element type");
Result = VectorType::get(EltTy, unsigned(Size));		Result = VectorType::get(EltTy, unsigned(Size), Scalable);
} else {		} else {
if (!ArrayType::isValidElementType(EltTy))		if (!ArrayType::isValidElementType(EltTy))
return Error(TypeLoc, "invalid array element type");		return Error(TypeLoc, "invalid array element type");
Result = ArrayType::get(EltTy, Size);		Result = ArrayType::get(EltTy, Size);
}		}
return false;		return false;
}		}

▲ Show 20 Lines • Show All 5,819 Lines • Show Last 20 Lines

lib/AsmParser/LLToken.h

Show All 31 Lines	enum Kind {
less,		less,
greater, // < >		greater, // < >
lparen,		lparen,
rparen, // ( )		rparen, // ( )
exclaim, // !		exclaim, // !
bar, // \|		bar, // \|
colon, // :		colon, // :

		kw_scalable,
kw_x,		kw_x,
kw_true,		kw_true,
kw_false,		kw_false,
kw_declare,		kw_declare,
kw_define,		kw_define,
kw_global,		kw_global,
kw_constant,		kw_constant,

▲ Show 20 Lines • Show All 412 Lines • Show Last 20 Lines

lib/Bitcode/Reader/BitcodeReader.cpp

Show First 20 Lines • Show All 1,751 Lines • ▼ Show 20 Lines	while (true) {
case bitc::TYPE_CODE_ARRAY: // ARRAY: [numelts, eltty]		case bitc::TYPE_CODE_ARRAY: // ARRAY: [numelts, eltty]
if (Record.size() < 2)		if (Record.size() < 2)
return error("Invalid record");		return error("Invalid record");
ResultTy = getTypeByID(Record[1]);		ResultTy = getTypeByID(Record[1]);
if (!ResultTy \|\| !ArrayType::isValidElementType(ResultTy))		if (!ResultTy \|\| !ArrayType::isValidElementType(ResultTy))
return error("Invalid type");		return error("Invalid type");
ResultTy = ArrayType::get(ResultTy, Record[0]);		ResultTy = ArrayType::get(ResultTy, Record[0]);
break;		break;
case bitc::TYPE_CODE_VECTOR: // VECTOR: [numelts, eltty]		case bitc::TYPE_CODE_VECTOR: // VECTOR: [numelts, eltty] or
		// [numelts, eltty, scalable]
if (Record.size() < 2)		if (Record.size() < 2)
return error("Invalid record");		return error("Invalid record");
if (Record[0] == 0)		if (Record[0] == 0)
return error("Invalid vector length");		return error("Invalid vector length");
ResultTy = getTypeByID(Record[1]);		ResultTy = getTypeByID(Record[1]);
if (!ResultTy \|\| !StructType::isValidElementType(ResultTy))		if (!ResultTy \|\| !StructType::isValidElementType(ResultTy))
return error("Invalid type");		return error("Invalid type");
ResultTy = VectorType::get(ResultTy, Record[0]);		bool Scalable = Record.size() > 2 ? Record[2] : false;
		ResultTy = VectorType::get(ResultTy, Record[0], Scalable);
break;		break;
}		}

if (NumRecords >= TypeList.size())		if (NumRecords >= TypeList.size())
return error("Invalid TYPE table");		return error("Invalid TYPE table");
if (TypeList[NumRecords])		if (TypeList[NumRecords])
return error(		return error(
"Invalid TYPE table: Only named structs can be forward referenced");		"Invalid TYPE table: Only named structs can be forward referenced");
▲ Show 20 Lines • Show All 4,423 Lines • Show Last 20 Lines

lib/Bitcode/Writer/BitcodeWriter.cpp

Show First 20 Lines • Show All 923 Lines • ▼ Show 20 Lines	case Type::ArrayTyID: {
Code = bitc::TYPE_CODE_ARRAY;		Code = bitc::TYPE_CODE_ARRAY;
TypeVals.push_back(AT->getNumElements());		TypeVals.push_back(AT->getNumElements());
TypeVals.push_back(VE.getTypeID(AT->getElementType()));		TypeVals.push_back(VE.getTypeID(AT->getElementType()));
AbbrevToUse = ArrayAbbrev;		AbbrevToUse = ArrayAbbrev;
break;		break;
}		}
case Type::VectorTyID: {		case Type::VectorTyID: {
VectorType *VT = cast<VectorType>(T);		VectorType *VT = cast<VectorType>(T);
// VECTOR [numelts, eltty]		// VECTOR [numelts, eltty] or
		// [numelts, eltty, scalable]
Code = bitc::TYPE_CODE_VECTOR;		Code = bitc::TYPE_CODE_VECTOR;
TypeVals.push_back(VT->getNumElements());		TypeVals.push_back(VT->getNumElements());
TypeVals.push_back(VE.getTypeID(VT->getElementType()));		TypeVals.push_back(VE.getTypeID(VT->getElementType()));
		if (VT->isScalable())
		TypeVals.push_back(VT->isScalable());
break;		break;
}		}
}		}

// Emit the finished record.		// Emit the finished record.
Stream.EmitRecord(Code, TypeVals, AbbrevToUse);		Stream.EmitRecord(Code, TypeVals, AbbrevToUse);
TypeVals.clear();		TypeVals.clear();
}		}
▲ Show 20 Lines • Show All 3,593 Lines • Show Last 20 Lines
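The reader and writer changes together keep old bitcode readable: the `scalable` field is appended only when set, and a missing third field is read as `false`. A minimal standalone sketch of that round trip follows. `VecRecord`, `encodeVector` and `decodeVector` are hypothetical names, and a plain `std::vector` stands in for the bitcode record.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical in-memory form of a TYPE_CODE_VECTOR record.
struct VecRecord {
  std::uint64_t NumElts;
  std::uint64_t EltTyID;
  bool Scalable;
};

// Writer side: [numelts, eltty], plus a trailing flag only when scalable,
// so fixed-length vectors produce exactly the old two-field record.
std::vector<std::uint64_t> encodeVector(const VecRecord &V) {
  std::vector<std::uint64_t> Rec{V.NumElts, V.EltTyID};
  if (V.Scalable)
    Rec.push_back(1);
  return Rec;
}

// Reader side: reject short records and zero lengths; treat a missing
// third field as "not scalable". Returns false on a malformed record.
bool decodeVector(const std::vector<std::uint64_t> &Rec, VecRecord &Out) {
  if (Rec.size() < 2 || Rec[0] == 0)
    return false;
  Out = {Rec[0], Rec[1], Rec.size() > 2 ? Rec[2] != 0 : false};
  return true;
}
```

A two-field record produced before this change decodes to a fixed-length vector, which is the backward-compatibility property the optional field is designed to give.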

lib/IR/AsmWriter.cpp

Show First 20 Lines • Show All 614 Lines • ▼ Show 20 Lines	case Type::ArrayTyID: {
ArrayType *ATy = cast<ArrayType>(Ty);		ArrayType *ATy = cast<ArrayType>(Ty);
OS << '[' << ATy->getNumElements() << " x ";		OS << '[' << ATy->getNumElements() << " x ";
print(ATy->getElementType(), OS);		print(ATy->getElementType(), OS);
OS << ']';		OS << ']';
return;		return;
}		}
case Type::VectorTyID: {		case Type::VectorTyID: {
VectorType *PTy = cast<VectorType>(Ty);		VectorType *PTy = cast<VectorType>(Ty);
OS << "<" << PTy->getNumElements() << " x ";		OS << "<";
		if (PTy->isScalable())
		OS << "scalable ";
		OS << PTy->getNumElements() << " x ";
print(PTy->getElementType(), OS);		print(PTy->getElementType(), OS);
OS << '>';		OS << '>';
return;		return;
}		}
}		}
llvm_unreachable("Invalid TypeID");		llvm_unreachable("Invalid TypeID");
}		}

▲ Show 20 Lines • Show All 3,749 Lines • Show Last 20 Lines

lib/IR/LLVMContextImpl.h

Show First 20 Lines • Show All 1,304 Lines • ▼ Show 20 Lines	#include "llvm/IR/Metadata.def"
using FunctionTypeSet = DenseSet<FunctionType *, FunctionTypeKeyInfo>;		using FunctionTypeSet = DenseSet<FunctionType *, FunctionTypeKeyInfo>;
FunctionTypeSet FunctionTypes;		FunctionTypeSet FunctionTypes;
using StructTypeSet = DenseSet<StructType *, AnonStructTypeKeyInfo>;		using StructTypeSet = DenseSet<StructType *, AnonStructTypeKeyInfo>;
StructTypeSet AnonStructTypes;		StructTypeSet AnonStructTypes;
StringMap<StructType*> NamedStructTypes;		StringMap<StructType*> NamedStructTypes;
unsigned NamedStructTypesUniqueID = 0;		unsigned NamedStructTypesUniqueID = 0;

DenseMap<std::pair<Type *, uint64_t>, ArrayType *> ArrayTypes;		DenseMap<std::pair<Type *, uint64_t>, ArrayType *> ArrayTypes;
DenseMap<std::pair<Type *, unsigned>, VectorType *> VectorTypes;		DenseMap<std::pair<Type *, ElementCount>, VectorType *> VectorTypes;
		rkruppe (Not Done): Nit: have you considered `std::pair<bool, unsigned>` instead of manually bit-packing it into 64 bits? DenseMap should support nested pairs, the size should be the same (except if unsigned is 64 bit, which I don't believe we support and which is extremely niche anyway), and it would simplify `VectorType::get` a little bit.
		huntergr (Author, Not Done): So I tried this, and couldn't compile it -- there's no implementation of getHashValue, getEmptyKey, getTombstoneKey, etc. for the nested pair. In our downstream compiler this is actually implemented as a dense map of a 3-element tuple against the Type, for which we have implemented the appropriate extensions to DenseMapInfo. If that approach is preferred to bitpacking, I'll make a separate patch to implement the DenseMap extensions.
		rkruppe (Done): I finally got a chance to look into the error you're seeing and it turns out the root cause is not nested pairs but a missing implementation of DenseMapInfo for bool. We could add that implementation, but in the future other code may also want to hash an ElementCount, e.g. VPlan may migrate the vectorization factor VF from `unsigned` to ElementCount and there are some DenseMaps with VF as key. With this in mind I'm leaning towards implementing `DenseMapInfo<VectorType::ElementCount>`, using the bit fiddling that's currently open-coded here. What do you think?
		huntergr (Author, Not Done): Ah, I spotted the same bug you did (missing implementation for bool), but it seems I hadn't submitted the comment I wrote. I tried it with nesting a pair of unsigned ints and that worked, but making it work directly with ElementCount seems a nicer idea, thanks.
DenseMap<Type *, PointerType *> PointerTypes; // Pointers in AddrSpace = 0		DenseMap<Type *, PointerType *> PointerTypes; // Pointers in AddrSpace = 0
DenseMap<std::pair<Type *, unsigned>, PointerType *> ASPointerTypes;		DenseMap<std::pair<Type *, unsigned>, PointerType *> ASPointerTypes;

/// ValueHandles - This map keeps track of all of the value handles that are		/// ValueHandles - This map keeps track of all of the value handles that are
/// watching a Value*. The Value::HasValueHandle bit is used to know		/// watching a Value*. The Value::HasValueHandle bit is used to know
/// whether or not a value has an entry in this map.		/// whether or not a value has an entry in this map.
using ValueHandlesTy = DenseMap<Value *, ValueHandleBase *>;		using ValueHandlesTy = DenseMap<Value *, ValueHandleBase *>;
ValueHandlesTy ValueHandles;		ValueHandlesTy ValueHandles;
▲ Show 20 Lines • Show All 82 Lines • Show Last 20 Lines
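
The inline discussion on `VectorTypes` above settles on giving `ElementCount` its own key traits rather than open-coding the bit packing. The sketch below shows roughly what such traits could look like. It is standalone and does not use LLVM's `DenseMapInfo` template; `ElementCountKeyInfo`, `pack`, the choice of sentinel keys, and the hash mixing are all assumptions made for illustration, not the code that landed.

```cpp
#include <cassert>
#include <cstdint>

// Minimal stand-in for the ElementCount used in the patch.
struct ElementCount {
  unsigned Min;  // minimum number of elements
  bool Scalable; // true if the real count is a runtime multiple of Min
};

// Hypothetical key traits with the four operations the discussion mentions.
struct ElementCountKeyInfo {
  // Pack both fields into one 64-bit value: Min in the upper bits, the
  // scalable flag in bit 0, so distinct (Min, Scalable) pairs never collide.
  static uint64_t pack(const ElementCount &EC) {
    return (uint64_t(EC.Min) << 1) | uint64_t(EC.Scalable);
  }

  // Two reserved sentinel values that no valid vector type is expected to use.
  static ElementCount getEmptyKey() { return {~0U, true}; }
  static ElementCount getTombstoneKey() { return {~0U - 1, false}; }

  // Fold the packed value down to 32 bits and scramble it slightly.
  static unsigned getHashValue(const ElementCount &EC) {
    uint64_t P = pack(EC);
    return unsigned(P ^ (P >> 32)) * 37U;
  }

  static bool isEqual(const ElementCount &LHS, const ElementCount &RHS) {
    return LHS.Min == RHS.Min && LHS.Scalable == RHS.Scalable;
  }
};
```

The point of the design is that `<4 x i32>` and `<scalable 4 x i32>` must map to different keys, which the flag bit guarantees.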

lib/IR/Type.cpp

Show First 20 Lines • Show All 593 Lines • ▼ Show 20 Lines	return !ElemTy->isVoidTy() && !ElemTy->isLabelTy() &&
!ElemTy->isMetadataTy() && !ElemTy->isFunctionTy() &&		!ElemTy->isMetadataTy() && !ElemTy->isFunctionTy() &&
!ElemTy->isTokenTy();		!ElemTy->isTokenTy();
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// VectorType Implementation		// VectorType Implementation
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

VectorType::VectorType(Type *ElType, unsigned NumEl)		VectorType::VectorType(Type *ElType, ElementCount EC)
: SequentialType(VectorTyID, ElType, NumEl) {}		: SequentialType(VectorTyID, ElType, EC.Min), Scalable(EC.Scalable) {}
		rengolin (Done): why not: `: SequentialType(VectorTyID, ElType, EC.Min), Scalable(EC.Scalable) { }`

VectorType *VectorType::get(Type *ElementType, unsigned NumElements) {		VectorType *VectorType::get(Type *ElementType, ElementCount EC) {
assert(NumElements > 0 && "#Elements of a VectorType must be greater than 0");		assert(EC.Min > 0 && "#Elements of a VectorType must be greater than 0");
assert(isValidElementType(ElementType) && "Element type of a VectorType must "		assert(isValidElementType(ElementType) && "Element type of a VectorType must "
"be an integer, floating point, or "		"be an integer, floating point, or "
"pointer type.");		"pointer type.");

LLVMContextImpl *pImpl = ElementType->getContext().pImpl;		LLVMContextImpl *pImpl = ElementType->getContext().pImpl;
VectorType *&Entry = ElementType->getContext().pImpl		VectorType *&Entry = ElementType->getContext().pImpl
->VectorTypes[std::make_pair(ElementType, NumElements)];		->VectorTypes[std::make_pair(ElementType, EC)];

if (!Entry)		if (!Entry)
Entry = new (pImpl->TypeAllocator) VectorType(ElementType, NumElements);		Entry = new (pImpl->TypeAllocator) VectorType(ElementType, EC);
return Entry;		return Entry;
}		}

bool VectorType::isValidElementType(Type *ElemTy) {		bool VectorType::isValidElementType(Type *ElemTy) {
return ElemTy->isIntegerTy() || ElemTy->isFloatingPointTy() ||		return ElemTy->isIntegerTy() || ElemTy->isFloatingPointTy() ||
ElemTy->isPointerTy();		ElemTy->isPointerTy();
}		}

Show All 38 Lines

lib/IR/Verifier.cpp

Show All 37 Lines
// only by the unwind edge of an invoke instruction.		// only by the unwind edge of an invoke instruction.
// * A landingpad instruction must be the first non-PHI instruction in the		// * A landingpad instruction must be the first non-PHI instruction in the
// block.		// block.
// * Landingpad instructions must be in a function with a personality function.		// * Landingpad instructions must be in a function with a personality function.
// * All other things that are tested by asserts spread about the code...		// * All other things that are tested by asserts spread about the code...
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

		#include "LLVMContextImpl.h"
#include "llvm/IR/Verifier.h"		#include "llvm/IR/Verifier.h"
#include "llvm/ADT/APFloat.h"		#include "llvm/ADT/APFloat.h"
#include "llvm/ADT/APInt.h"		#include "llvm/ADT/APInt.h"
#include "llvm/ADT/ArrayRef.h"		#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/DenseMap.h"		#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/MapVector.h"		#include "llvm/ADT/MapVector.h"
#include "llvm/ADT/Optional.h"		#include "llvm/ADT/Optional.h"
#include "llvm/ADT/STLExtras.h"		#include "llvm/ADT/STLExtras.h"
▲ Show 20 Lines • Show All 248 Lines • ▼ Show 20 Lines	class Verifier : public InstVisitor<Verifier>, VerifierSupport {
SmallPtrSet<const Value *, 32> GlobalValueVisited;		SmallPtrSet<const Value *, 32> GlobalValueVisited;

// Keeps track of duplicate function argument debug info.		// Keeps track of duplicate function argument debug info.
SmallVector<const DILocalVariable *, 16> DebugFnArgs;		SmallVector<const DILocalVariable *, 16> DebugFnArgs;

TBAAVerifier TBAAVerifyHelper;		TBAAVerifier TBAAVerifyHelper;

void checkAtomicMemAccessSize(Type *Ty, const Instruction *I);		void checkAtomicMemAccessSize(Type *Ty, const Instruction *I);
		static bool containsScalableVectorValue(const Type *Ty);
		rovka (Done): Nitpick: I would call this "containsScalableVectorValue", to make it clear that it doesn't just look at the top level type.

public:		public:
explicit Verifier(raw_ostream *OS, bool ShouldTreatBrokenDebugInfoAsError,		explicit Verifier(raw_ostream *OS, bool ShouldTreatBrokenDebugInfoAsError,
const Module &M)		const Module &M)
: VerifierSupport(OS, M), LandingPadResultTy(nullptr),		: VerifierSupport(OS, M), LandingPadResultTy(nullptr),
SawFrameEscape(false), TBAAVerifyHelper(this) {		SawFrameEscape(false), TBAAVerifyHelper(this) {
TreatBrokenDebugInfoAsError = ShouldTreatBrokenDebugInfoAsError;		TreatBrokenDebugInfoAsError = ShouldTreatBrokenDebugInfoAsError;
}		}

bool hasBrokenDebugInfo() const { return BrokenDebugInfo; }		bool hasBrokenDebugInfo() const { return BrokenDebugInfo; }

		bool verifyTypes(const Module &M) {
		LLVMContext &Ctx = M.getContext();
		for (auto &Entry : Ctx.pImpl->ArrayTypes) {
		ArrayType *ATy = Entry.second;
		if (containsScalableVectorValue(ATy)) {
		CheckFailed("Arrays cannot contain scalable vectors", ATy, &M);
		Broken = true;
		}
		}

		for (StructType* STy : Ctx.pImpl->AnonStructTypes)
		if (containsScalableVectorValue(STy)) {
		CheckFailed("Structs cannot contain scalable vectors", STy, &M);
		Broken = true;
		}

		for (auto &Entry : Ctx.pImpl->NamedStructTypes) {
		StructType *STy = Entry.second;
		if (containsScalableVectorValue(STy)) {
		CheckFailed("Structs cannot contain scalable vectors", STy, &M);
		Broken = true;
		}
		}

		return !Broken;
		}

bool verify(const Function &F) {		bool verify(const Function &F) {
assert(F.getParent() == &M &&		assert(F.getParent() == &M &&
"An instance of this class only works with a specific module!");		"An instance of this class only works with a specific module!");

// First ensure the function is well-enough formed to compute dominance		// First ensure the function is well-enough formed to compute dominance
// information, and directly compute a dominance tree. We don't rely on the		// information, and directly compute a dominance tree. We don't rely on the
// pass manager to provide this as it isolates us from a potentially		// pass manager to provide this as it isolates us from a potentially
// out-of-date dominator tree and makes it significantly more complex to run		// out-of-date dominator tree and makes it significantly more complex to run
▲ Show 20 Lines • Show All 53 Lines • ▼ Show 20 Lines	for (const StringMapEntry<Comdat> &SMEC : M.getComdatSymbolTable())
visitComdat(SMEC.getValue());		visitComdat(SMEC.getValue());

visitModuleFlags(M);		visitModuleFlags(M);
visitModuleIdents(M);		visitModuleIdents(M);
visitModuleCommandLines(M);		visitModuleCommandLines(M);

verifyCompileUnits();		verifyCompileUnits();

		verifyTypes(M);

verifyDeoptimizeCallingConvs();		verifyDeoptimizeCallingConvs();
DISubprogramAttachments.clear();		DISubprogramAttachments.clear();
return !Broken;		return !Broken;
}		}

private:		private:
// Verification methods...		// Verification methods...
void visitGlobalValue(const GlobalValue &GV);		void visitGlobalValue(const GlobalValue &GV);
▲ Show 20 Lines • Show All 210 Lines • ▼ Show 20 Lines	if (const Instruction *I = dyn_cast<Instruction>(V)) {
CheckFailed("Global is used by function in a different module", &GV, &M,		CheckFailed("Global is used by function in a different module", &GV, &M,
F, F->getParent());		F, F->getParent());
return false;		return false;
}		}
return true;		return true;
});		});
}		}

		// Check for a scalable vector type, making sure to look through arrays and
		// structs. Pointers to scalable vectors don't count, since we know what the
		// size of a pointer is.
		static bool containsScalableVectorValueRecursive(const Type *Ty,
		SmallVectorImpl<const Type*> &Visited) {
		if (is_contained(Visited, Ty))
		return false;

		Visited.push_back(Ty);

		if (auto *VTy = dyn_cast<VectorType>(Ty))
		return VTy->isScalable();

		if (auto *ATy = dyn_cast<ArrayType>(Ty))
		return containsScalableVectorValueRecursive(ATy->getElementType(), Visited);

		if (auto *STy = dyn_cast<StructType>(Ty))
		for (Type *EltTy : STy->elements())
		if (containsScalableVectorValueRecursive(EltTy, Visited))
		rovka (Done): Could do an early return here instead of aggregating the result.
		return true;

		return false;
		}

		bool Verifier::containsScalableVectorValue(const Type *Ty) {
		SmallVector<const Type*, 16> VisitedList = {};
		return containsScalableVectorValueRecursive(Ty, VisitedList);
		}

void Verifier::visitGlobalVariable(const GlobalVariable &GV) {		void Verifier::visitGlobalVariable(const GlobalVariable &GV) {
if (GV.hasInitializer()) {		if (GV.hasInitializer()) {
Assert(GV.getInitializer()->getType() == GV.getValueType(),		Assert(GV.getInitializer()->getType() == GV.getValueType(),
"Global variable initializer type does not match global "		"Global variable initializer type does not match global "
"variable type!",		"variable type!",
&GV);		&GV);
// If the global has common linkage, it must have a zero initializer and		// If the global has common linkage, it must have a zero initializer and
// cannot be constant.		// cannot be constant.
▲ Show 20 Lines • Show All 62 Lines • ▼ Show 20 Lines	void Verifier::visitGlobalVariable(const GlobalVariable &GV) {
for (auto *MD : MDs) {		for (auto *MD : MDs) {
if (auto *GVE = dyn_cast<DIGlobalVariableExpression>(MD))		if (auto *GVE = dyn_cast<DIGlobalVariableExpression>(MD))
visitDIGlobalVariableExpression(*GVE);		visitDIGlobalVariableExpression(*GVE);
else		else
AssertDI(false, "!dbg attachment of global variable must be a "		AssertDI(false, "!dbg attachment of global variable must be a "
"DIGlobalVariableExpression");		"DIGlobalVariableExpression");
}		}

		// Scalable vectors cannot be global variables, since we don't know
		// the runtime size. Need to look inside structs/arrays to find the
		// underlying element type as well.
		if (containsScalableVectorValue(GV.getValueType()))
		CheckFailed("Globals cannot contain scalable vectors", &GV);

if (!GV.hasInitializer()) {		if (!GV.hasInitializer()) {
visitGlobalValue(GV);		visitGlobalValue(GV);
return;		return;
}		}

// Walk any aggregate initializers looking for bitcasts between address spaces		// Walk any aggregate initializers looking for bitcasts between address spaces
visitConstantExprsRecursively(GV.getInitializer());		visitConstantExprsRecursively(GV.getInitializer());

▲ Show 20 Lines • Show All 4,151 Lines • ▼ Show 20 Lines	bool llvm::verifyModule(const Module &M, raw_ostream *OS,

bool Broken = false;		bool Broken = false;
for (const Function &F : M)		for (const Function &F : M)
Broken |= !V.verify(F);		Broken |= !V.verify(F);

Broken |= !V.verify();		Broken |= !V.verify();
if (BrokenDebugInfo)		if (BrokenDebugInfo)
*BrokenDebugInfo = V.hasBrokenDebugInfo();		*BrokenDebugInfo = V.hasBrokenDebugInfo();

		hfinkel (Done): Remove unneeded whitespace change.
// Note that this function's return value is inverted from what you would		// Note that this function's return value is inverted from what you would
		rovka (Done): Nitpick 1: This comment is going to become stale as soon as someone comes up with a non-scalable type they'd like to check. Nitpick 2: Any reason why this is called here and not in Verifier::verify?
// expect of a function called "verify".		// expect of a function called "verify".
return Broken;		return Broken;
}		}

namespace {		namespace {

struct VerifierLegacyPass : public FunctionPass {		struct VerifierLegacyPass : public FunctionPass {
static char ID;		static char ID;
▲ Show 20 Lines • Show All 427 Lines • Show Last 20 Lines
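
The verifier comment above says the check looks through arrays and structs but not through pointers. A toy model can make that rule concrete. `ToyType` and `containsScalable` are hypothetical and far simpler than LLVM's `Type` hierarchy; the sketch only mirrors the shape of `containsScalableVectorValueRecursive`, including the visited list.

```cpp
#include <algorithm>
#include <utility>
#include <vector>

// Hypothetical, heavily simplified type model (not LLVM's Type classes).
struct ToyType {
  enum Kind { Scalar, Vector, Array, Struct, Pointer };
  Kind TheKind;
  bool Scalable;                         // only meaningful for Vector
  std::vector<const ToyType *> Elements; // element, field, or pointee types

  explicit ToyType(Kind K, bool Scalable = false,
                   std::vector<const ToyType *> Elements = {})
      : TheKind(K), Scalable(Scalable), Elements(std::move(Elements)) {}
};

static bool containsScalableRec(const ToyType *Ty,
                                std::vector<const ToyType *> &Visited) {
  // Skip types that were already examined on this walk.
  if (std::find(Visited.begin(), Visited.end(), Ty) != Visited.end())
    return false;
  Visited.push_back(Ty);

  switch (Ty->TheKind) {
  case ToyType::Vector:
    return Ty->Scalable;
  case ToyType::Array:
  case ToyType::Struct:
    // Aggregates hold their elements by value, so look inside them.
    for (const ToyType *Elt : Ty->Elements)
      if (containsScalableRec(Elt, Visited))
        return true; // early return, as suggested in review
    return false;
  case ToyType::Scalar:
  case ToyType::Pointer:
    // A pointer has a known size regardless of its pointee.
    return false;
  }
  return false;
}

bool containsScalable(const ToyType *Ty) {
  std::vector<const ToyType *> Visited;
  return containsScalableRec(Ty, Visited);
}
```

In this model a struct holding an array of scalable vectors is rejected, while a pointer to a scalable vector is accepted, matching the cases exercised by the two new Verifier tests.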

test/Bitcode/compatibility.ll

Show First 20 Lines • Show All 865 Lines • ▼ Show 20 Lines	define void @typesystem() {
%t5 = alloca x86_fp80		%t5 = alloca x86_fp80
; CHECK: %t5 = alloca x86_fp80		; CHECK: %t5 = alloca x86_fp80
%t6 = alloca ppc_fp128		%t6 = alloca ppc_fp128
; CHECK: %t6 = alloca ppc_fp128		; CHECK: %t6 = alloca ppc_fp128
%t7 = alloca x86_mmx		%t7 = alloca x86_mmx
; CHECK: %t7 = alloca x86_mmx		; CHECK: %t7 = alloca x86_mmx
%t8 = alloca %opaquety*		%t8 = alloca %opaquety*
; CHECK: %t8 = alloca %opaquety*		; CHECK: %t8 = alloca %opaquety*
		%t9 = alloca <4 x i32>
		; CHECK: %t9 = alloca <4 x i32>
		%t10 = alloca <scalable 4 x i32>
		; CHECK: %t10 = alloca <scalable 4 x i32>

ret void		ret void
}		}

declare void @llvm.token(token)		declare void @llvm.token(token)
; CHECK: declare void @llvm.token(token)		; CHECK: declare void @llvm.token(token)

;; Inline Assembler Expressions		;; Inline Assembler Expressions
▲ Show 20 Lines • Show All 881 Lines • Show Last 20 Lines

test/Verifier/scalable-aggregates.ll

This file was added.

				; RUN: not opt -S -verify < %s 2>&1 | FileCheck %s

				;; Arrays and Structs cannot contain scalable vectors, since we don't
				;; know the size at compile time and the container types need to have
				;; a known size.

				; CHECK-DAG: Arrays cannot contain scalable vectors
				; CHECK-DAG: [2 x { i32, <scalable 1 x i32> }]; ModuleID = '<stdin>'
				; CHECK-DAG: Arrays cannot contain scalable vectors
				; CHECK-DAG: [4 x <scalable 256 x i1>]; ModuleID = '<stdin>'
				; CHECK-DAG: Arrays cannot contain scalable vectors
				; CHECK-DAG: [2 x <scalable 4 x i32>]; ModuleID = '<stdin>'
				; CHECK-DAG: Structs cannot contain scalable vectors
				; CHECK-DAG: { i64, [4 x <scalable 256 x i1>] }; ModuleID = '<stdin>'
				; CHECK-DAG: Structs cannot contain scalable vectors
				; CHECK-DAG: { i32, <scalable 1 x i32> }; ModuleID = '<stdin>'
				; CHECK-DAG: Structs cannot contain scalable vectors
				; CHECK-DAG: { <scalable 16 x i8>, <scalable 2 x double> }; ModuleID = '<stdin>'
				; CHECK-DAG: Structs cannot contain scalable vectors
				; CHECK-DAG: %sty = type { i64, <scalable 32 x i16> }; ModuleID = '<stdin>'

				%sty = type { i64, <scalable 32 x i16> }

				define void @scalable_aggregates() {
				%array = alloca [2 x <scalable 4 x i32>]
				%struct = alloca { <scalable 16 x i8>, <scalable 2 x double> }
				%named_struct = alloca %sty
				%s_in_a = alloca [2 x { i32, <scalable 1 x i32> } ]
				%a_in_s = alloca { i64, [4 x <scalable 256 x i1> ] }
				ret void
				}
				No newline at end of file

test/Verifier/scalable-global-vars.ll

This file was added.

				; RUN: not opt -S -verify < %s 2>&1 | FileCheck %s

				;; Global variables cannot be scalable vectors, since we don't
				;; know the size at compile time.

				; CHECK: Globals cannot contain scalable vectors
				; CHECK-NEXT: <scalable 4 x i32>* @ScalableVecGlobal
				@ScalableVecGlobal = global <scalable 4 x i32> zeroinitializer

				; CHECK: Globals cannot contain scalable vectors
				; CHECK-NEXT: [64 x <scalable 2 x double>]* @ScalableVecGlobalArray
				@ScalableVecGlobalArray = global [64 x <scalable 2 x double>] zeroinitializer

				; CHECK: Globals cannot contain scalable vectors
				; CHECK-NEXT: { <scalable 16 x i64>, <scalable 16 x i1> }* @ScalableVecGlobalStruct
				@ScalableVecGlobalStruct = global { <scalable 16 x i64>, <scalable 16 x i1> } zeroinitializer

				; CHECK: Globals cannot contain scalable vectors
				; CHECK-NEXT: { [4 x i32], [2 x { <scalable 4 x i64>, <scalable 32 x i8> }] }* @ScalableVecMixed
				@ScalableVecMixed = global { [4 x i32], [2 x { <scalable 4 x i64>, <scalable 32 x i8> }]} zeroinitializer

				;; Global _pointers_ to scalable vectors are fine
				; CHECK-NOT: Globals cannot contain scalable vectors
				@ScalableVecPtr = global <scalable 8 x i16>* zeroinitializer

unittests/IR/CMakeLists.txt

Show All 31 Lines	add_llvm_unittest(IRTests
PatternMatch.cpp		PatternMatch.cpp
TimePassesTest.cpp		TimePassesTest.cpp
TypesTest.cpp		TypesTest.cpp
UseTest.cpp		UseTest.cpp
UserTest.cpp		UserTest.cpp
ValueHandleTest.cpp		ValueHandleTest.cpp
ValueMapTest.cpp		ValueMapTest.cpp
ValueTest.cpp		ValueTest.cpp
		VectorTypesTest.cpp
VerifierTest.cpp		VerifierTest.cpp
WaymarkTest.cpp		WaymarkTest.cpp
)		)

target_link_libraries(IRTests PRIVATE LLVMTestingSupport)		target_link_libraries(IRTests PRIVATE LLVMTestingSupport)

unittests/IR/VectorTypesTest.cpp

This file was added.

				//===--- llvm/unittest/IR/VectorTypesTest.cpp - vector types unit tests ---===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/IR/DerivedTypes.h"
				#include "llvm/IR/LLVMContext.h"
				#include "llvm/Support/ScalableSize.h"
				#include "gtest/gtest.h"
				using namespace llvm;

				namespace {
				TEST(VectorTypesTest, FixedLength) {
				LLVMContext Ctx;

				Type *Int16Ty = Type::getInt16Ty(Ctx);
				Type *Int32Ty = Type::getInt32Ty(Ctx);
				Type *Int64Ty = Type::getInt64Ty(Ctx);
				Type *Float64Ty = Type::getDoubleTy(Ctx);

				VectorType *V8Int32Ty = VectorType::get(Int32Ty, 8);
				ASSERT_FALSE(V8Int32Ty->isScalable());
				EXPECT_EQ(V8Int32Ty->getNumElements(), 8U);
				EXPECT_EQ(V8Int32Ty->getElementType()->getScalarSizeInBits(), 32U);

				VectorType *V8Int16Ty = VectorType::get(Int16Ty, {8, false});
				ASSERT_FALSE(V8Int16Ty->isScalable());
				EXPECT_EQ(V8Int16Ty->getNumElements(), 8U);
				EXPECT_EQ(V8Int16Ty->getElementType()->getScalarSizeInBits(), 16U);

				ElementCount EltCnt(4, false);
				VectorType *V4Int64Ty = VectorType::get(Int64Ty, EltCnt);
				rovka (Done): I'd also check the number of elements and element size here, just to cover the ElementCount operators fully.
				ASSERT_FALSE(V4Int64Ty->isScalable());
				EXPECT_EQ(V4Int64Ty->getNumElements(), 4U);
				EXPECT_EQ(V4Int64Ty->getElementType()->getScalarSizeInBits(), 64U);

				VectorType *V2Int64Ty = VectorType::get(Int64Ty, EltCnt/2);
				ASSERT_FALSE(V2Int64Ty->isScalable());
				EXPECT_EQ(V2Int64Ty->getNumElements(), 2U);
				EXPECT_EQ(V2Int64Ty->getElementType()->getScalarSizeInBits(), 64U);

				VectorType *V8Int64Ty = VectorType::get(Int64Ty, EltCnt*2);
				ASSERT_FALSE(V8Int64Ty->isScalable());
				EXPECT_EQ(V8Int64Ty->getNumElements(), 8U);
				EXPECT_EQ(V8Int64Ty->getElementType()->getScalarSizeInBits(), 64U);

				VectorType *V4Float64Ty = VectorType::get(Float64Ty, EltCnt);
				ASSERT_FALSE(V4Float64Ty->isScalable());
				EXPECT_EQ(V4Float64Ty->getNumElements(), 4U);
				EXPECT_EQ(V4Float64Ty->getElementType()->getScalarSizeInBits(), 64U);

				VectorType *ExtTy = VectorType::getExtendedElementVectorType(V8Int16Ty);
				EXPECT_EQ(ExtTy, V8Int32Ty);
				ASSERT_FALSE(ExtTy->isScalable());
				EXPECT_EQ(ExtTy->getNumElements(), 8U);
				EXPECT_EQ(ExtTy->getElementType()->getScalarSizeInBits(), 32U);

				VectorType *TruncTy = VectorType::getTruncatedElementVectorType(V8Int32Ty);
				EXPECT_EQ(TruncTy, V8Int16Ty);
				ASSERT_FALSE(TruncTy->isScalable());
				EXPECT_EQ(TruncTy->getNumElements(), 8U);
				EXPECT_EQ(TruncTy->getElementType()->getScalarSizeInBits(), 16U);

				VectorType *HalvedTy = VectorType::getHalfElementsVectorType(V4Int64Ty);
				EXPECT_EQ(HalvedTy, V2Int64Ty);
				ASSERT_FALSE(HalvedTy->isScalable());
				EXPECT_EQ(HalvedTy->getNumElements(), 2U);
				rovka (Done): Ditto.
				EXPECT_EQ(HalvedTy->getElementType()->getScalarSizeInBits(), 64U);

				VectorType *DoubledTy = VectorType::getDoubleElementsVectorType(V4Int64Ty);
				EXPECT_EQ(DoubledTy, V8Int64Ty);
				ASSERT_FALSE(DoubledTy->isScalable());
				EXPECT_EQ(DoubledTy->getNumElements(), 8U);
				EXPECT_EQ(DoubledTy->getElementType()->getScalarSizeInBits(), 64U);

				VectorType *ConvTy = VectorType::getInteger(V4Float64Ty);
				EXPECT_EQ(ConvTy, V4Int64Ty);
				ASSERT_FALSE(ConvTy->isScalable());
				EXPECT_EQ(ConvTy->getNumElements(), 4U);
				EXPECT_EQ(ConvTy->getElementType()->getScalarSizeInBits(), 64U);

				EltCnt = V8Int64Ty->getElementCount();
				EXPECT_EQ(EltCnt.Min, 8U);
				ASSERT_FALSE(EltCnt.Scalable);
				}

				TEST(VectorTypesTest, Scalable) {
				LLVMContext Ctx;

				Type *Int16Ty = Type::getInt16Ty(Ctx);
				Type *Int32Ty = Type::getInt32Ty(Ctx);
				Type *Int64Ty = Type::getInt64Ty(Ctx);
				Type *Float64Ty = Type::getDoubleTy(Ctx);

				VectorType *ScV8Int32Ty = VectorType::get(Int32Ty, 8, true);
				ASSERT_TRUE(ScV8Int32Ty->isScalable());
				EXPECT_EQ(ScV8Int32Ty->getNumElements(), 8U);
				EXPECT_EQ(ScV8Int32Ty->getElementType()->getScalarSizeInBits(), 32U);

				VectorType *ScV8Int16Ty = VectorType::get(Int16Ty, {8, true});
				ASSERT_TRUE(ScV8Int16Ty->isScalable());
				EXPECT_EQ(ScV8Int16Ty->getNumElements(), 8U);
				EXPECT_EQ(ScV8Int16Ty->getElementType()->getScalarSizeInBits(), 16U);

				ElementCount EltCnt(4, true);
				VectorType *ScV4Int64Ty = VectorType::get(Int64Ty, EltCnt);
				ASSERT_TRUE(ScV4Int64Ty->isScalable());
				EXPECT_EQ(ScV4Int64Ty->getNumElements(), 4U);
				EXPECT_EQ(ScV4Int64Ty->getElementType()->getScalarSizeInBits(), 64U);

				VectorType *ScV2Int64Ty = VectorType::get(Int64Ty, EltCnt/2);
				ASSERT_TRUE(ScV2Int64Ty->isScalable());
				EXPECT_EQ(ScV2Int64Ty->getNumElements(), 2U);
				EXPECT_EQ(ScV2Int64Ty->getElementType()->getScalarSizeInBits(), 64U);

				VectorType *ScV8Int64Ty = VectorType::get(Int64Ty, EltCnt*2);
				ASSERT_TRUE(ScV8Int64Ty->isScalable());
				EXPECT_EQ(ScV8Int64Ty->getNumElements(), 8U);
				EXPECT_EQ(ScV8Int64Ty->getElementType()->getScalarSizeInBits(), 64U);

				VectorType *ScV4Float64Ty = VectorType::get(Float64Ty, EltCnt);
				ASSERT_TRUE(ScV4Float64Ty->isScalable());
				EXPECT_EQ(ScV4Float64Ty->getNumElements(), 4U);
				EXPECT_EQ(ScV4Float64Ty->getElementType()->getScalarSizeInBits(), 64U);

				VectorType *ExtTy = VectorType::getExtendedElementVectorType(ScV8Int16Ty);
				EXPECT_EQ(ExtTy, ScV8Int32Ty);
				ASSERT_TRUE(ExtTy->isScalable());
				EXPECT_EQ(ExtTy->getNumElements(), 8U);
				EXPECT_EQ(ExtTy->getElementType()->getScalarSizeInBits(), 32U);

				VectorType *TruncTy = VectorType::getTruncatedElementVectorType(ScV8Int32Ty);
				EXPECT_EQ(TruncTy, ScV8Int16Ty);
				ASSERT_TRUE(TruncTy->isScalable());
				EXPECT_EQ(TruncTy->getNumElements(), 8U);
				EXPECT_EQ(TruncTy->getElementType()->getScalarSizeInBits(), 16U);

				VectorType *HalvedTy = VectorType::getHalfElementsVectorType(ScV4Int64Ty);
				EXPECT_EQ(HalvedTy, ScV2Int64Ty);
				ASSERT_TRUE(HalvedTy->isScalable());
				EXPECT_EQ(HalvedTy->getNumElements(), 2U);
				EXPECT_EQ(HalvedTy->getElementType()->getScalarSizeInBits(), 64U);

				VectorType *DoubledTy = VectorType::getDoubleElementsVectorType(ScV4Int64Ty);
				EXPECT_EQ(DoubledTy, ScV8Int64Ty);
				ASSERT_TRUE(DoubledTy->isScalable());
				EXPECT_EQ(DoubledTy->getNumElements(), 8U);
				EXPECT_EQ(DoubledTy->getElementType()->getScalarSizeInBits(), 64U);

				VectorType *ConvTy = VectorType::getInteger(ScV4Float64Ty);
				EXPECT_EQ(ConvTy, ScV4Int64Ty);
				ASSERT_TRUE(ConvTy->isScalable());
				EXPECT_EQ(ConvTy->getNumElements(), 4U);
				EXPECT_EQ(ConvTy->getElementType()->getScalarSizeInBits(), 64U);

				EltCnt = ScV8Int64Ty->getElementCount();
				EXPECT_EQ(EltCnt.Min, 8U);
				ASSERT_TRUE(EltCnt.Scalable);
				}

				} // end anonymous namespace
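
The tests above rely on `EltCnt/2` and `EltCnt*2` producing a new count that keeps the scalable flag. A minimal standalone sketch of such overloaded operators follows. It is not the contents of `ScalableSize.h`, which is not shown in this part of the page; the divisibility assertion and `operator==` are assumptions added for illustration.

```cpp
#include <cassert>

// Standalone sketch of an element count with a scalable flag.
struct ElementCount {
  unsigned Min;  // minimum number of elements
  bool Scalable; // true if the real count is a runtime multiple of Min

  ElementCount(unsigned Min, bool Scalable) : Min(Min), Scalable(Scalable) {}

  // Scaling changes only the minimum count; scalability is preserved.
  ElementCount operator*(unsigned RHS) const { return {Min * RHS, Scalable}; }

  ElementCount operator/(unsigned RHS) const {
    // Assumed precondition: the division must be exact.
    assert(RHS != 0 && Min % RHS == 0 && "element count must divide evenly");
    return {Min / RHS, Scalable};
  }

  bool operator==(const ElementCount &RHS) const {
    return Min == RHS.Min && Scalable == RHS.Scalable;
  }
};
```

With this shape, halving `<scalable 4 x i64>` gives a count of `{2, true}` and doubling gives `{8, true}`, which is what `getHalfElementsVectorType` and `getDoubleElementsVectorType` are checked against in the tests.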

Revision Contents

Diff 191733