This is an archive of the discontinued LLVM Phabricator instance.

[SVE] Make ElementCount members private
ClosedPublic

Authored by david-arm on Aug 17 2020, 5:41 AM.

Details

Summary

This patch changes ElementCount so that the Min and Scalable
members are now private and can only be accessed via the get
functions getKnownMinValue() and isScalable(). This is now inline
with the TypeSize class.

In addition I've added some other member functions for more
commonly used operations. Hopefully this makes the class more
useful and will reduce the need for calling getKnownMinValue().

Diff Detail

Event Timeline

david-arm created this revision.Aug 17 2020, 5:41 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptAug 17 2020, 5:41 AM
david-arm requested review of this revision.Aug 17 2020, 5:41 AM
fpetrogalli added inline comments.Aug 17 2020, 8:53 AM
llvm/include/llvm/Support/TypeSize.h
61

I think that @ctetreau is right on https://reviews.llvm.org/D85794#inline-793909. We should not overload a comparison operator on this class because the set it represent it cannot be ordered.

Chris suggests an approach of writing a static function that can be used as a comparison operator, so that we can make it explicit of what kind of comparison we are doing.

Perhaps now would be a good time to combine TypeSize and ElementCount into a single Polynomial type? We don't have to implement the whole abstraction of c*x^n (since we currently don't use the exponent, and don't distinguish between X's) but if it's ever needed in the future it will be obvious where to add it, and it will Just Work.

llvm/include/llvm/Support/TypeSize.h
61

In C++, it's common to overload the comparison operators for the purposes of being able to std::sort and use ordered sets. Normally, I would be OK with such usages. However, since ElementCount is basically a numeric type, and they only have a partial ordering, I think this is dangerous. I'm concerned that this will result in more bugs whereby somebody didn't remember that vectors can be scalable.

I don't have a strong opinion what the comparator function should be called, but I strongly prefer that it not be a comparison operator.

323–325

NIT: this can be rewritten without duplicating EltCnt.getKnownMinValue() * 37U

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
3524

What is this >> thing? Some indicator of whitespace change, or is this a hard tab?

vkmr added a subscriber: vkmr.Aug 18 2020, 4:21 AM

Perhaps now would be a good time to combine TypeSize and ElementCount into a single Polynomial type? We don't have to implement the whole abstraction of c*x^n (since we currently don't use the exponent, and don't distinguish between X's) but if it's ever needed in the future it will be obvious where to add it, and it will Just Work.

Even if the types are structurally similar, I'd prefer to keep them separate. The uses are distinct, the usage in a function signature indicates what kind of value we actually expect. (It's particularly easy to perform incorrect conversions with types representing size and alignment.)

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp
3524

It's an indicator of a whitespace change; I think they started showing up with the recent Phabricator upgrade

Hi @ctetreau, I agree with @efriedma that keeping the two classes distinct for now seems best. The reason is I spent quite a lot of time trying to unify these classes already and I hit a stumbling block - TypeSize has the ugly uint64_t() cast operator, which makes unifying difficult. I didn't want to introduce a templated cast operator that ElementCount would then have too. I also tried making TypeSize derive from a templated parent, but that was pretty ugly too. Perhaps once we've removed the TypeSize -> uint64_t we might be better able to consider it?

llvm/include/llvm/Support/TypeSize.h
61

Hi @ctetreau, yeah I understand. The reason I chose to use operators was simply to be consistent with what we have already in TypeSize. Also, we have existing "==" and "!=" operators in ElementCount too, although these are essentially testing that two ElementCounts are identically the same or not, i.e. for 2 given polynomials (a + bx) and (c + dx) we're essentially asking if both a==c and b==d.

If I introduce a new comparison function, I'll probably keep the asserts in for now, but in general we can do better than simply asserting if something is scalable or not. For example, we know that (vscale * 4) is definitely >= 4 because vscale is at least 1. I'm just not sure if we have that need yet.

david-arm updated this revision to Diff 287003.Aug 21 2020, 4:45 AM
paulwalker-arm added inline comments.Aug 21 2020, 5:36 AM
llvm/include/llvm/Support/TypeSize.h
61

I think we should treat the non-equality comparison functions more like floating point. What we don't want is somebody writing !GreaterThan when they actually mean LessThan.

Perhaps we should name the functions accordingly (i.e. ogt for OrderedAndGreaterThan). We will also need matching less than functions since I can see those being useful when analysing constant insert/extract element indices which stand a good chance to be a known comparison (with 0 being the most common index).

fpetrogalli added inline comments.Aug 24 2020, 10:09 AM
llvm/include/llvm/Support/TypeSize.h
61

May I suggest the following name scheme? (my 2 c, will not hold the patch for not addressing this comment)

static bool [Non]Total<cmp>(...)

with <cmp> being

  • LT -> less than, aka <
  • LToE -> less than or equal, aka "<="
  • GT -> greater than, aka ">"
  • GToE -> greater than or equal, aka ">="

and Total , NonTotal being the prefix that gives information about the behavior on the value of scalable:

Total -> for example, all scalable ECs are bigger than fixed ECs.
NonTotal -> asserting on (LHS.Scalable == RHS.Scalable) before returning LHS.Min <cmp> RHS.Min.

Taking it further: it could also be a template on an enumeration that list the type of comparisons?

enum CMPType {
  TotalGT,
  NonTotalLT,
  fancy_one
};

...

template <unsigned T>
static bool cmp(ElementCount &LHS, ElementCount &RHS );

...

static bool ElementCount::cmp<ElementCount::CMPType::TotalGT>(ElementCount &LHS, ElementCount &RHS ) {
 /// implementation
}
static bool ElementCount::cmp<ElementCount::CMPType::fancy_one>(ElementCount &LHS, ElementCount &RHS ) {
 /// implementation
}

Hi @fpetrogalli, if you don't mind I think I'll stick with Paul's idea for ogt because this matches the IR neatly, i.e. "fcmp ogt". Also, for me personally it's much simpler and more intuitive.

Hi @fpetrogalli, if you don't mind I think I'll stick with Paul's idea for ogt because this matches the IR neatly, i.e. "fcmp ogt". Also, for me personally it's much simpler and more intuitive.

I don't mind at all, please follow Paul's suggestion, mine was just an alternative. Thanks!

david-arm updated this revision to Diff 287672.Aug 25 2020, 8:14 AM
  • Changed comparison function from gt to ogt and added a olt (less than) comparison function too.
  • Instead of adding the ">>=" operator I've added "/=" instead as I think this is more common. In places where ">>= 1" was used we now do "/= 2".
  • After rebasing it was necessary to add a "*=" operator too for the Loop Vectorizer.
david-arm marked 2 inline comments as done.Aug 25 2020, 8:15 AM
ctetreau added inline comments.Aug 25 2020, 12:22 PM
llvm/include/llvm/Support/TypeSize.h
61

Honestly, I think this is actually worse. My issue is the fact that, from a mathematical perspective, vscale_1 * min_1 < vscale_2 * min_2 is a function of vscale_1 and vscale_2. In principle, we can know some ordering relationships between certain element counts (such as vscale * min_1 >= min_2 = true), but in general, this function does not make sense. However, an operator< is useful because it allows you to put an ElementCount into an ordered set, and it will just work. Renaming the function to olt just makes it so that you can't put ElementCounts into an ordered set, but still implies that ElementCounts are comparable in general. This will also blow up if you actually try to mix fixed width and scalable ElementCounts in an ordered set, which should work IMO.

Here is what I propose:

  1. Add a predicate for establishing an arbitrary ordering. The predicate would be completely arbitrary, because it's only useful for establishing an ordering for an ordered set or in a sorting algorithm. It could look something like this:
static bool orderedBefore(const ElementCount &LHS, const ElementCount &RHS) {
  auto l = std::tie(LHS.Scalable, LHS.Min);
  auto r = std::tie(RHS.Scalable, RHS.Min);
  return l < r;
}
  1. don't add any sort of mathematical comparison functions. Code working with ElementCount almost certainly either inspects the Scalable field and does something with the Min:
if (EC.isScalable()) {
    unsigned min = EC.getKnownMinValue();
    ... // do stuff with min

... or just uses it as a unit:

auto * VecTy = VectorType::get(SomeTy, EC);

I do not think that having the relation operators on ElementCount would simplify very much code. However, it is very easy to use incorrectly, and if it is ever extended in the future (one machine with two different vscale values? vscale == 0 becoming valid?), it would become even worse. Best just not open that door.

Hi @ctetreau, ok for now I'm going to completely remove the operators and revert the code using those operators to how it was before. I'm not sure what you mean about the predicate functions so I've left those for now, since they aren't needed for this patch. The purpose of this patch was originally supposed to be mechanical anyway - just making members private. I only added the operators as an after-thought really, just to be consistent with how TypeSize dealt with the identical problem. For what it's worth, I believe that GCC solved this exact same problem by adding two types of comparison functions - one set that absolutely wanted an answer to ">,<,>=,<=" and asserted if it wasn't known at compile time, and another set of comparison functions that returned an additional boolean value indicating whether the answer was known or not. Perhaps my knowledge is out of date, but I believe this was the accepted solution and seemed to work well.

david-arm updated this revision to Diff 288260.Aug 27 2020, 3:15 AM
david-arm edited the summary of this revision. (Show Details)
paulwalker-arm added inline comments.Aug 27 2020, 4:19 AM
llvm/include/llvm/Support/TypeSize.h
90

I don't believe this is safe. For example we know SVE supported vector lengths only have to be a multiple of 128bits. So for scalable vectors we cannot know the element count is a power of 2 unless we perform a runtime check.

david-arm added inline comments.Aug 27 2020, 4:33 AM
llvm/include/llvm/Support/TypeSize.h
90

Ok, but if that's true how is code in llvm/lib/CodeGen/TargetLoweringBase.cpp ever safe for scalable vectors? I thought that the question being asked wasn't that the total size was a power of 2, but whether or not it was safe to split the vector. The answer should be the same even if vscale is 3, for example. I thought the problem here is that the legaliser simply needs to know in what way it should break down different types, and that whatever approach it took would work when scaled up. The vector breakdown algorithm relies upon having an answer here - perhaps this is just a case of changing the question and name of function?

I cannot say whether such questions make sense without a deeper investigation, but I can say for certain that EC.isPowerOf2 is a question we cannot answer at compile time. Given this is a mechanical change I would just remove the member function and leave the code as is (well change EC.Min to EC.getKnownMinValue()). We already know that we'll need to visit the places where getKnownMinValue() is used to ensure the question makes sense in the face of scalable vectors.

david-arm updated this revision to Diff 288370.Aug 27 2020, 9:42 AM
  • Removed isPowerOf2() function since this is potentially misleading - it's only the known minimum value that we're checking.
  • Renamed isEven to isKnownEven to try and make it clear that returning true indicates we know definitely that the total number of elements is even, whereas returning false could mean either the element count is odd or that we don't know.
paulwalker-arm accepted this revision.Aug 27 2020, 10:34 AM

There's probably a few .Min to .getKnownMinValue() conversions where the .Min could be dropped (calls to Builder.CreateVectorSplat for example) but they can be tidied up as part of a proper activity to reduce the places where getKnownMinValue is called. So other than my suggested updated to EC::operator/ the patch looks good to my eye. Please give other reviewers a little more time to provide other insights.

llvm/include/llvm/Support/TypeSize.h
66

If you add an assert that the divide is lossless (i.e. MIN % RHS == 0) then asserts like:

assert(EltCnt.isKnownEven() && "Splitting vector, but not in half!");

are no longer required. Plus those places which are not checking for lossless division will be automatically protected. This feels like a sensible default to me. If somebody wants a truncated result, they can do the maths using getKnownMinValue().

This revision is now accepted and ready to land.Aug 27 2020, 10:34 AM

Hi @ctetreau, ok for now I'm going to completely remove the operators and revert the code using those operators to how it was before. ...

This is probably for the best.

... I'm not sure what you mean about the predicate functions ...

I'm referring to providing some built in way to std::sort a collection of ElementCount or have a std::set<ElementCount>. By default, C++ wants to use operator< for this, which I believe was the original motivation for the operator being here in the first place. I think it's reasonable for ElementCount to provide a built-in function to establish an ordering for these purposes, but the function should be named such that nobody thinks the function is intended to be the mathematical relation.

ctetreau accepted this revision.Aug 27 2020, 2:38 PM

I think this is good to go as is. Assuming @paulwalker-arm is satisfied with leaving operator/ as is, then LGTM.

llvm/include/llvm/Support/TypeSize.h
66

I would prefer that this not be done. This would make this function non-total in an unrecoverable way, and would force everybody to write a bunch of tedious error handling code, even if the normal integer division behavior would have been fine:

ElementCount res = LHS.getKnownMinValue() % RHS.getKnownMinValue() == 0 ? LHS / RHS : SomeOtherThing;

Everybody knows how integer division works, so I think the lossy behavior will not surprise anybody. An assert might.

Can't say I agree since people are already writing the ugly code, because the result typically demands different handling or they're asserting the divide doesn't truncate in the first place. That said I'm happy for there to be no assert as long as operator% is implemented so users can calculate the remainder in the expected way.

I'm retracting my operator% request. After thinking about it and speaking with Dave I just cannot see how allowing a total divide is safe for scalable vectors. If you are relying on a truncating divide then special handling is require anyway, which is likely to be different between fixed-length and scalable vectors.

To be more clear, I'm happy to defer the divide conversation for if/when we run into issues so my previous acceptance still stands. It'll be good to get the intent of the patch in (i.e. stoping access to internal class members) asap, plus any follow up work will be a smaller more manageable patch. It's worth talking this through during the next sync call to see it we can get some consensus regarding what maths is and isn't allowed.

This revision was automatically updated to reflect the committed changes.

Hi @ctetreau, ok for now I'm going to completely remove the operators and revert the code using those operators to how it was before. I'm not sure what you mean about the predicate functions so I've left those for now, since they aren't needed for this patch. The purpose of this patch was originally supposed to be mechanical anyway - just making members private. I only added the operators as an after-thought really, just to be consistent with how TypeSize dealt with the identical problem. For what it's worth, I believe that GCC solved this exact same problem by adding two types of comparison functions - one set that absolutely wanted an answer to ">,<,>=,<=" and asserted if it wasn't known at compile time, and another set of comparison functions that returned an additional boolean value indicating whether the answer was known or not. Perhaps my knowledge is out of date, but I believe this was the accepted solution and seemed to work well.

FWIW, the GCC scheme is to have one set of functions maybe_<cond> that are true if a condition *might* hold (i.e. would hold for one possible value of the runtime indeterminates), and another set of functions known_<cond> (originally must_<cond>) that are true if a condition *always* holds (i.e. would hold for all possible values of the runtime indeterminates). known_le is a partial order but maybe_le is not (because it isn't antisymmetric). Having both is redundant with !, since e.g. known_le is the opposite of maybe_gt, but it seemed more readable to allow every condition to be expressed positively.

Like you say, there is also a test for whether two values are ordered by known_le, and there are also some operations like ordered_min and ordered_max that assert if the values aren't ordered by known_le.

[Sorry for the post-commit comment, but it's related to something that wasn't part of the commit.]