This is an archive of the discontinued LLVM Phabricator instance.

[AArch64][SVE][WIP] Add support for vscale constants (?)
AbandonedPublic

Authored by efriedma on Dec 17 2019, 3:56 PM.

Details

Summary

This patch allows the following IR to compile:

define i32 @vscale() {
  ret i32 ptrtoint (<vscale x 1 x i8>* getelementptr (<vscale x 1 x
i8>, <vscale x 1 x i8>* null, i64 1) to i32)
}

to the following, which produces a value equal to the runtime value of vscale:

cntd    x8
lsr     x0, x8, #1
                                // kill: def $w0 killed $w0 killed $x0
ret

Posting this patch mainly to pose the question of whether the input IR is supposed to be valid. Some people have expressed the opinion that vscale should not be an llvm::Constant.

Diff Detail

Event Timeline

efriedma created this revision.Dec 17 2019, 3:56 PM

This is a very clever approach, I agree it should work - nice job! That said, I don't think it follows that we should accept these constantexprs in a ShuffleVector mask.

My current thinking for shuffles is that we shouldn't represent the shuffle mask of shufflevector as a Constant at all. Currently, we basically treat it as an ArrayRef<int> anyway (and in fact, that's the only representation at the SelectionDAG level). And once we're not using Constants, we can pick an appropriate representation for vscale'ed shuffles without worrying about shoving the result into a Constant. I think Sander is planning to write up a proposal along these lines.

My current thinking for shuffles is that we shouldn't represent the shuffle mask of shufflevector as a Constant at all. Currently, we basically treat it as an ArrayRef<int> anyway

I completely agree, this was a mistake in the representation and it would be really great to fix it! I'd recommend moving the existing functionality before proposing extending shufflevector to vscale though. One step at a time. Thanks :-)

Given this establishes that adding the scalable vector type has inadvertently created a way to represent the runtime constant of vscale as a constant, what's the downside to having a clearer representation in the form of an explicit constant?

What's the advantage of a redundant way to represent the same thing? LLVM could have a 'sizeof' ConstantExpr for example, but never did (people use the same gep trick). It is generally good to have fewer more canonical ways to represent a thing if possible.

Can we guarantee "sizeof(<vscale x 1 x i8>) == vscale" for all future targets? much like how "sizeof(<vscale x 1 x i1>) == vscale" doesn't hold true for SVE. If not then this isn't a canonical form as the exact pattern becomes target specific. Or to put another, why should vscale be linked to data layout?

Can we guarantee "sizeof(<vscale x 1 x i8>) == vscale" for all future targets?

I don't think the current datalayout code actually prevents a target from specifying some sort of excessive alignment for vectors, like "v8:128:128". Granted, I can't think of any reason you'd want to do that.

simoll added a subscriber: arsenm.Jan 7 2020, 1:48 AM
define i32 @vscale() {
  ret i32 ptrtoint (<vscale x 1 x i8>* getelementptr (<vscale x 1 x
i8>, <vscale x 1 x i8>* null, i64 1) to i32)
}

Whether this returns vscale or something else depends on the ptrtoint mapping of the default address space.. and <vscale x 1 x i8>* null might not be i32 0 for some targets. You might get away with subtracting the integer value of null ;-) (.. i guess @arsenm is more familiar with the topic of address spaces and what may be assumed about them given his recent DevMtg talk).

Personally, I'd prefer to have an explicit and well-defined vscale constant. Having scalable vector types without vscale, to me, seems like stopping half way.

My current thinking for shuffles is that we shouldn't represent the shuffle mask of shufflevector as a Constant at all.

If shuffle masks are no longer IR entities, this means ruling out computed shuffle masks in the future, which are available on some targets (X86). Whether we want computed shuffle masks at all is a different question.

define i32 @vscale() {
  ret i32 ptrtoint (<vscale x 1 x i8>* getelementptr (<vscale x 1 x
i8>, <vscale x 1 x i8>* null, i64 1) to i32)
}

Whether this returns vscale or something else depends on the ptrtoint mapping of the default address space.. and <vscale x 1 x i8>* null might not be i32 0 for some targets. You might get away with subtracting the integer value of null ;-) (.. i guess @arsenm is more familiar with the topic of address spaces and what may be assumed about them given his recent DevMtg talk).

My understanding was that for the default address space, nullptr always has the bit value 0. I don't really understand this topic enough to know the intricacies around the ptrtoint mapping of the default address space, but if this is used for sizeof, I don't see why it couldn't be used for vscale (taking into account the check for larger alignment).

The main argument for wanting vscale as a constant was to avoid copy propagation (because it is inherently a constant value) to benefit ISel and to use VScale in shuffle masks to e.g. represent zips or concats. We can drop the latter requirement if we come up with alternative ways to address shuffles for scalable vectors. With the right safe-guards with respect to datalayout, we can still use Eli's suggestion to implement the copy propagation and still benefit from the constant expression with some help from PatternMatch. I've updated my original vscale patch to reflect that: D68203.

My current thinking for shuffles is that we shouldn't represent the shuffle mask of shufflevector as a Constant at all.

If shuffle masks are no longer IR entities, this means ruling out computed shuffle masks in the future, which are available on some targets (X86). Whether we want computed shuffle masks at all is a different question.

Yes, this is kind of a decision point: either we allow computed shufflemasks, or we get rid of the operand altogether. Or we can introduce some intrinsics and put off the decision, I guess.

I don't think there's much incentive to support computed shuffle masks. Yes, x86 has pshufb, but that doesn't generalize to other element widths/two source vectors/etc. easily. And there aren't very many practical use cases for computed shuffles in automatic vectorization.

If we allow computed shuffle masks, there's also the minor complication that we'd have to change shufflevector to produce poison, not undef, for undef indexes. But we can likely change that with some work...

If shuffle masks are no longer IR entities, this means ruling out computed shuffle masks in the future, which are available on some targets (X86). Whether we want computed shuffle masks at all is a different question.

Yes, this is kind of a decision point: either we allow computed shufflemasks, or we get rid of the operand altogether. Or we can introduce some intrinsics and put off the decision, I guess.

I don't think there's much incentive to support computed shuffle masks. Yes, x86 has pshufb, but that doesn't generalize to other element widths/two source vectors/etc. easily. And there aren't very many practical use cases for computed shuffles in automatic vectorization.

If we allow computed shuffle masks, there's also the minor complication that we'd have to change shufflevector to produce poison, not undef, for undef indexes. But we can likely change that with some work...

Altivec also supports computed shuffles with vperm IIRC. The problem with this in a shufflevector context is that each of these are very different in how they model a computed shuffle. I would personally rather (keep!) shufflevector to static shuffle masks as a target independent concept. If there becomes a good abstraction for computed shuffles, we can introduce a new intrinsic or instruction at that time, which is separate from it.

The main thrust of this patch was committed in D68203. But we still need the changes to SelectionDAGBuilder for general GEPs over vscale'ed types.

efriedma abandoned this revision.Jan 28 2020, 7:37 PM

Remaining changes are now in https://reviews.llvm.org/D73602 .