This is an archive of the discontinued LLVM Phabricator instance.

[globalisel][legalizer] Adapt LegalizerInfo to support inter-type dependencies and other things.
ClosedPublic

Authored by dsanders on Jan 18 2018, 10:52 AM.

Details

Summary

As discussed in D42244, we have difficulty describing the legality of some
operations. We're not able to specify relationships between types.
For example, declaring the following

setAction({..., 0, s32}, Legal)
setAction({..., 0, s64}, Legal)
setAction({..., 1, s32}, Legal)
setAction({..., 1, s64}, Legal)

currently declares these type combinations as legal:

{s32, s32}
{s64, s32}
{s32, s64}
{s64, s64}

but we currently have no means to say that, for example, {s64, s32} is
not legal. Some operations such as G_INSERT/G_EXTRACT/G_MERGE_VALUES/
G_UNMERGE_VALUES have relationships between the types that are currently
described incorrectly.

Additionally, G_LOAD/G_STORE currently have no means to legalize non-atomics
differently to atomics. The necessary information is in the MMO but we have no
way to use this in the legalizer. Similarly, there is currently no way for the
register type and the memory type to differ so there is no way to cleanly
represent extending-load/truncating-store in a way that can't be broken by
optimizers (resulting in illegal MIR).

It's also difficult to control the legalization strategy. We've added support
for legalizing non-power of 2 types but there's still some hardcoded assumptions
about the strategy. The main one I've noticed is that type0 is always legalized
before type1 which is not a good strategy for type0 = G_EXTRACT type1, ... if
you need to widen the container. It will converge on the same result eventually
but it will take a much longer route when legalizing type0 than if you legalize
type1 first.

Lastly, the definition of legality and the legalization strategy is kept
separate which is not ideal. It's helpful to be able to look at a one piece of
code and see both what is legal and the method the legalizer will use to make
illegal MIR more legal.

This patch adds a layer onto the LegalizerInfo (to be removed when all targets
have been migrated) which resolves all these issues.

Here are the rules for shift and division:

for (unsigned BinOp : {G_LSHR, G_ASHR, G_SDIV, G_UDIV})
  getActionDefinitions(BinOp)
      .legalFor({s32, s64})     // If type0 is s32/s64 then it's Legal
      .clampScalar(0, s32, s64) // If type0 is <s32 then WidenScalar to s32
                                // If type0 is >s64 then NarrowScalar to s64
      .widenScalarToPow2(0)     // Round type0 scalars up to powers of 2
      .unsupported();           // Otherwise, it's unsupported

This describes everything needed to both define legality and describe how to
make illegal things legal.

Here's an example of a complex rule:

getActionDefinitions(G_INSERT)
    .unsupportedIf([=](const LegalityQuery &Query) {
      // If type0 is smaller than type1 then it's unsupported
      return Query.Types[0].getSizeInBits() <= Query.Types[1].getSizeInBits();
    })
    .legalIf([=](const LegalityQuery &Query) {
      // If type0 is s32/s64/p0 and type1 is a power of 2 other than 2 or 4 then it's legal
      // We don't need to worry about large type1's because unsupportedIf caught that.
      const LLT &Ty0 = Query.Types[0];
      const LLT &Ty1 = Query.Types[1];
      if (Ty0 != s32 && Ty0 != s64 && Ty0 != p0)
        return false;
      return isPowerOf2_32(Ty1.getSizeInBits()) &&
             (Ty1.getSizeInBits() == 1 || Ty1.getSizeInBits() >= 8);
    })
    .clampScalar(0, s32, s64)
    .widenScalarToPow2(0)
    .maxScalarIf(typeInSet(0, {s32}), 1, s16) // If type0 is s32 and type1 is bigger than s16 then NarrowScalar type1 to s16
    .maxScalarIf(typeInSet(0, {s64}), 1, s32) // If type0 is s64 and type1 is bigger than s32 then NarrowScalar type1 to s32
    .widenScalarToPow2(1)                     // Round type1 scalars up to powers of 2
    .unsupported();

This uses a lambda to say that G_INSERT is unsupported when type0 is bigger than
type1 (in practice, this would be a default rule for G_INSERT). It also uses one
to describe the legal cases. This particular predicate is equivalent to:

.legalFor({{s32, s1}, {s32, s8}, {s32, s16}, {s64, s1}, {s64, s8}, {s64, s16}, {s64, s32}})

In terms of performance, I saw a slight (~6%) performance improvement when
AArch64 was around 30% ported but it's pretty much break even right now.
I'm going to take a look at constexpr as a means to reduce the initialization
cost.

Future work:

  • Make it possible for opcodes to share rulesets. There's no need for G_LSHR/G_ASHR/G_SDIV/G_UDIV to have separate rule and ruleset objects. There's no technical barrier to this, it just hasn't been done yet.
  • Replace the type-index numbers with an enum to get .clampScalar(Type0, s32, s64)
  • Better names for things like .maxScalarIf() (clampMaxScalar?) and the vector rules.
  • Improve initialization cost using constexpr

Possible future work:

  • It's possible to make these rulesets change the MIR directly instead of returning a description of how to change the MIR. This should remove a little overhead caused by parsing the description and routing to the right code, but the real motivation is that it removes the need for LegalizeAction::Custom. With Custom removed, there's no longer a requirement that Custom legalization change the opcode to something that's considered legal.

Diff Detail

Repository
rL LLVM

Event Timeline

dsanders created this revision.Jan 18 2018, 10:52 AM

I forgot to mention: I haven't tried to move the implementations in the headers somewhere more reasonable yet. They're currently included in a lot of places they don't need to be.

bogner added a subscriber: bogner.Jan 18 2018, 1:21 PM

As far as the diff goes it does look like we end up being much more verbose for simple cases (like, IMPLICIT_DEF), but OTOH the definitions we're adding are doing quite a bit more, so that's probably fine. I think the approach is reasonable.

include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
173–174 ↗(On Diff #130454)

inline is redundant here - since we're defining this within the class definition it's implicitly inline.

194–195 ↗(On Diff #130454)

If we've already spelled out all of "cartesian" are we really saving anything by abbreviating product to "prod"?

196–198 ↗(On Diff #130454)

Probably worth simplifying with a using namespace:

using namespace LegalityPredicates;
return actionIf(Action, all(typeInSet(0, Types), typeInSet(1, Types)));
246–254 ↗(On Diff #130454)

Remove or finish implementing?

371–374 ↗(On Diff #130454)

I'm not sure we want this here - this would allow someone to write some new style rules for an op but then fallback as well, which is just confusing. I think we should enforce either new style rules or fallback, and not allow weird combinations of both.

lib/Target/AArch64/AArch64LegalizerInfo.cpp
53 ↗(On Diff #130454)

Would it make more sense to have the end of the list be implicitly unsupported? I think currently it would go to fallback instead, but as I pointed out earlier I don't think that's the right thing to do.

As far as the diff goes it does look like we end up being much more verbose for simple cases (like, IMPLICIT_DEF), but OTOH the definitions we're adding are doing quite a bit more, so that's probably fine. I think the approach is reasonable.

We can probably compress it a bit more. For example:

getActionDefinitions(G_IMPLICIT_DEF)
    .legalFor({p0, s1, s8, s16, s32, s64})
    .clampScalar(0, s1, s64)
    .widenScalarToPow2(0, 8)
    .unsupported();

down to:

getActionDefinitions(G_IMPLICIT_DEF)
    .legalFor({p0, s1, s8, s16, s32, s64})
    .changeScalarToNextBiggest(0, {s1, s8, s16, s32, s64});

this particular case is likely to be handled the same way on all targets so we could potentially have:

getActionDefinitions(G_IMPLICIT_DEF)
    .legalFor({p0, s1})
    .default(/*BiggestType*/ s64);

where .default() adds rules for {s8, s16, s32, s64}.

include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
246–254 ↗(On Diff #130454)

Finish implementing. Lower needs a bit more information than I had expected and I need to figure out what it's doing with that information.

371–374 ↗(On Diff #130454)

This was intended to be a migration tool but it's not actually in use anymore. I'll remove it.

At the moment, if you fall off the end of the ruleset (which is prevented by the .unsupported() calls) then it will fall back on the old implementation. This lets you implement the .legal*() part of the rule first and then figure out how to write the legalization rules. I want to push towards deleting the old implementation as soon as possible.

lib/Target/AArch64/AArch64LegalizerInfo.cpp
53 ↗(On Diff #130454)

In the long run, yes. Once we've all moved to this then we'll be able to make it the default. Until then, we need a way to fallback on the existing implementation when there are no rules specified or when they are incomplete

bogner added inline comments.Jan 18 2018, 3:47 PM
include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
371–374 ↗(On Diff #130454)

Right. I'd rather falling off the end of the ruleset mean unsupported, and I don't really see a reason to allow fallback for an op once you've started writing the rules for that op.

lib/Target/AArch64/AArch64LegalizerInfo.cpp
53 ↗(On Diff #130454)

I don't understand when having incomplete rules that lead to fallback would be useful. I'm reasonably sure it would make more sense to say "no rules specified implies fallback, but any rules specified implies that the rules are complete". People would still be able to write the new rules for an opcode at a time, so we still wouldn't have to do an all or nothing switch.

dsanders added inline comments.Jan 18 2018, 5:16 PM
include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
371–374 ↗(On Diff #130454)

Ok, I can switch it over to that

lib/Target/AArch64/AArch64LegalizerInfo.cpp
53 ↗(On Diff #130454)

It's not useful for very long but it's handy when writing the rules. The way I've been porting these is:

  • Add the legal cases and test it
  • Add the scalar legalization cases and test it
  • Add the vector legalization cases and test it
  • Tack unsupported() on the end, delete the old code and test it once more.

At each step, the old code handles any cases I haven't dealt with yet

dsanders updated this revision to Diff 130978.Jan 22 2018, 3:33 PM
dsanders marked 4 inline comments as done.

Rebase.
Make falling off the end of the ruleset mean Unsupported unless the ruleset is empty in which case it automatically
falls back to the legacy rules. Other than that fallback legacy rules needs to be explicitly added

I generally like the direction of the API. Again, not really a qualified reviewer, but I added a bunch of small comments on readability of the API choices from a non-expert. Making this stuff more discoverable would be a major plus and this feels like a potential major step in that direction.

include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
150 ↗(On Diff #130978)

Might call this "and" and have an "or" as well.

166 ↗(On Diff #130978)

"more" is vague. "roundUp"?

221 ↗(On Diff #130978)

You could spell this with a lazily evaluated CartesianProduct class. If the interface took an arbitrary generator object, it would be powerful with less code duplication.

241 ↗(On Diff #130978)

I suspect a legalIf(Type) and actionIf(Action, Type) override might save some typing.

256 ↗(On Diff #130978)

Honestly, having the cover functions here just adds complexity. Might be better to spell out the rules in the actionIf form explicitly.

348 ↗(On Diff #130978)

You might want to choose a different verb than "pick" here. Or at least, it isn't obvious from just reading the code what this does.

363 ↗(On Diff #130978)

min and max feel like the wrong names. They sound like predicates, not actions. Maybe increaseNumElements? Or clampMinNumElements?

dsanders marked 9 inline comments as done.Jan 24 2018, 11:41 AM
dsanders added inline comments.
include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
150 ↗(On Diff #130978)

Unfortunately and is a reserved keyword. It's equivalent to && but can be represented in 7-bit ASCII.

I haven't needed an any yet, but we can add it when needed.

166 ↗(On Diff #130978)

I've named it after the LegalizeActions::MoreElements action but I agree the end result is weirdly named. I've changed them to widenScalarToNextPow2 and moreElementsToNextPow2 to begin with but the latter would make more sense as increaseElementsToNextPow2. If we do that, then we should also change MoreElements to IncreaseElements (and similarly FewerElements to ReduceElements).

221 ↗(On Diff #130978)

I'm not sure what you mean here. This function is avoiding computing the cartesian product by testing each type index against the same list.

241 ↗(On Diff #130978)

I'm reluctant to make it take a single type. I'd like to discourage something like:

.legalIf(s8)
.legalIf(s16)
.legalIf(s32)
.legalIf(s64)

since that pays the memory, loop, and function call overhead four times per instruction being legalized whereas:

.legalFor({s8, s16, s32, s64})

pays it once.

256 ↗(On Diff #130978)

This particular one is for consistency with the other *If() functions and isn't being used at the moment. For the cover functions in general: Aside from aesthetics in the target LegalizerInfo implementations, the main reason the cover functions are there is to present the right interface to the right action. Some action values for actionIf are only required to provide an action and predicate function, but others also need to provide a mutation function. I could make it an assertion but I want it to fail at compile-time.

What I originally wanted to do was to declare templates and then name particular instantiations. Something like:

template<LegalizeAction Action> LegalizeRuleSet &actionIf(LegalityPredicate Predicate) { ... }
template<LegalizeAction Action> LegalizeRuleSet &actionIfAndMutate(LegalityPredicate Predicate, LegalizeMutation Mutate) { ... }
using legalIf = actionIf<Legal>;
using unsupportedIf = actionIf<Unsupported>;
using widenScalarIf = actionIfAndMutate<WidenScalar>;
using narrowScalarIf = actionIfAndMutate<WidenScalar>;

but using can only create aliases for types.

348 ↗(On Diff #130978)

One of my colleagues has said the same thing in person a couple days ago but I haven't come up with anything better yet. The intent is that it returns a mutation function that always returns the given typeidx and type.

363 ↗(On Diff #130978)

I've gone with clampMinNumElements

371–374 ↗(On Diff #130454)

Kept fallback() but made falling off the end of the rules mean Unsupported

371–374 ↗(On Diff #130454)

I've kept fallback() because falling of the end of the ruleset now means unsupported

bogner added inline comments.Jan 24 2018, 12:07 PM
include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
166 ↗(On Diff #130978)

FWIW, as vague as "more" is here, I actually do find it clearer than "increase". Increase sounds like it could mean some kind of elementwise operation, whereas more says "we want more elements". I think this can be left as is.

241 ↗(On Diff #130978)

If you take an ArrayRef instead of an initializer list then ArrayRef's singleton constructor would mean that we could write .legalFor(s8) instead of .legalFor({s8}) and such. I'm not sure that's worth it though, given the potential for confusion that constructor opens up.

dsanders updated this revision to Diff 131343.Jan 24 2018, 1:49 PM
dsanders marked 2 inline comments as done.

Fix various review comments

It's now possible to declare multiple-opcodes together:

getActionDefinitions({G_ADD, G_SUB, G_MUL, G_AND, G_OR, G_XOR, G_SHL})
    .legalFor({s32, s64, v2s32, v4s32, v2s64})
    .clampScalar(0, s32, s64)
    .widenScalarToNextPow2(0)
    .clampNumElements(0, v2s32, v4s32)
    .clampNumElements(0, v2s64, v2s64)
    .moreElementsToNextPow2(0);

Aside from convenience, this is also a (very) slight improvement to
initialization time and doesn't measurably change run-time. In terms of
semantics, doing this is a promise that these opcodes are functionally
equivalent and will remain so.
Assertions check the following:

  • The second opcode onwards are not already equivalent to an opcode other than the first one specified.
  • The second opcode onwards do not already have definitions in case they are different from those of the first opcode.
  • The second opcode onwards are not modified beyond that point.

It's now possible to declare multiple-opcodes together:

getActionDefinitions({G_ADD, G_SUB, G_MUL, G_AND, G_OR, G_XOR, G_SHL})
    .legalFor({s32, s64, v2s32, v4s32, v2s64})
    .clampScalar(0, s32, s64)
    .widenScalarToNextPow2(0)
    .clampNumElements(0, v2s32, v4s32)
    .clampNumElements(0, v2s64, v2s64)
    .moreElementsToNextPow2(0);

I think it'd make for a slightly safer API to separate this out so it's done in two steps. Maybe something like:

auto ArithOps = createAliasedActionDefinitions({G_ADD, G_SUB, ...});
ArithOps.legalFor({s32, s64, ...}).clampScalar(...)...

This way you can avoid the complexity that happens if someone tries to pass different but overlapping sets of opcodes in subsequent calls to getActionDefinitions.

dsanders updated this revision to Diff 131374.Jan 24 2018, 4:08 PM

Along the lines of Justin's suggestion, getActionDefinitions() no longer has a
non-const version and cannot be used to build definitions.
Instead getActionDefinitionsBuilder() has been added and will assert if any
opcode you request is an alias or is aliased by something else.
The end result is that for the multi-opcode case, you must hang onto the
object you're given the first time until you're done adding rules because
getActionDefinitionsBuilder() won't provide it again. This protects against the
issues where you accidentally include slightly different opcodes in different
calls to getActionDefinitionsBuilder().

bogner accepted this revision.Jan 26 2018, 6:35 PM

This is looking pretty good. Let's get it in tree so people can start using it and we can iterate more based on feedback.

include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
348 ↗(On Diff #130978)

I'd probably just go with "identity". It's not great, but I think it's clearer/more precise than pick.

291–301 ↗(On Diff #131374)

Remove this for now, we can add it back when it's actually implemented.

This revision is now accepted and ready to land.Jan 26 2018, 6:35 PM
dsanders marked an inline comment as done.Jan 29 2018, 11:43 AM
dsanders added inline comments.
include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
348 ↗(On Diff #130978)

Ok, I'll go with identity for now but we ought to come up with a better name still. identity() makes it sound like you get a mutator that doesn't change anything but that's not correct. You get a mutator that changes the type to a specific type.

For example, given:

LegalityQuery({G_ADD, {s32})

the rule:

.widenScalarIf(typeInSet(0, {s32}), identity(0, s64))

will result in the action:

{WidenScalar, /*TypeIdx*/0, s64}

which will widen the G_ADD to work on s64's.

This revision was automatically updated to reflect the committed changes.
dsanders added inline comments.Jan 29 2018, 12:04 PM
include/llvm/CodeGen/GlobalISel/LegalizerInfo.h
348 ↗(On Diff #130978)

I'm not sure why this didn't occur to me until after I committed but changeTo would be a better name than either pick or identity.

rovka added a comment.Jan 31 2018, 7:14 AM

Hi,

I've been playing a bit with this for ARM (r323876) and it's been pretty straightforward to update. I have a few observations that I thought I could share, but feel free to ignore them depending on how you intend to evolve this in the future.

First of all, there's an inconsistency between getActionDefinitionsBuilder(Opcode) and getActionDefinitionsBuilder({Opcode}) - the latter marks Opcode as an alias even though there's nothing to alias with. Because of this, we get the somewhat surprising behaviour that using the former it's possible to add more rules even without saving the builder, whereas using the latter asserts. I'm not a huge fan of extra characters, but maybe we should just get rid of the first variant? This would be consistent with the rest of the API, since you can say legalFor({Type}), but not legalFor(Type). Of course, we could also go the other way and get rid of even more {} by using variadic templates instead of initializer lists (except for places where it looks good, like the CartesianProduct APIs). I haven't thought too much about it, so I'm not sure if it's worth the overhead, I'm just throwing it out as a potential alternative.

Also related to aliasing, I think some of the discussions from this review should move into comments in the code, because just by looking around it's a bit unclear if it's just an implementation detail / optimization or something users of the API should care about. In particular, the behaviour relative to progressively adding new rules to one or more opcodes should probably be documented somehow.

Hope that helps,
Diana

llvm/trunk/lib/Target/AArch64/AArch64LegalizerInfo.cpp
243

This doesn't look like it belongs here...

Hi,

I've been playing a bit with this for ARM (r323876) and it's been pretty straightforward to update. I have a few observations that I thought I could share, but feel free to ignore them depending on how you intend to evolve this in the future.

First of all, there's an inconsistency between getActionDefinitionsBuilder(Opcode) and getActionDefinitionsBuilder({Opcode}) - the latter marks Opcode as an alias even though there's nothing to alias with. Because of this, we get the somewhat surprising behaviour that using the former it's possible to add more rules even without saving the builder, whereas using the latter asserts. I'm not a huge fan of extra characters, but maybe we should just get rid of the first variant? This would be consistent with the rest of the API, since you can say legalFor({Type}), but not legalFor(Type). Of course, we could also go the other way and get rid of even more {} by using variadic templates instead of initializer lists (except for places where it looks good, like the CartesianProduct APIs). I haven't thought too much about it, so I'm not sure if it's worth the overhead, I'm just throwing it out as a potential alternative.

We could also fix that inconsistency by checking for a single element initializer list. I'll see what effect moving everything to the initializer_list has. I doubt it's a measurable difference and if that's the case I'll resolve the inconsistency that way. Otherwise, I'll make it so the initializer_list variant only marks it an alias if you give it multiple opcodes.

Also related to aliasing, I think some of the discussions from this review should move into comments in the code, because just by looking around it's a bit unclear if it's just an implementation detail / optimization or something users of the API should care about. In particular, the behaviour relative to progressively adding new rules to one or more opcodes should probably be documented somehow.

Sure, I'll add something to getActionDefinitionsBuilder() about it.

Hope that helps,
Diana

llvm/trunk/lib/Target/AArch64/AArch64LegalizerInfo.cpp
243

Well spotted. This used to have a .unsupported() on the end. I must have missed it when I switched falling off the end of the rules to mean unsupported.