As discussed in D42244, we have difficulty describing the legality of some
operations. We're not able to specify relationships between types.
For example, declaring the following
setAction({..., 0, s32}, Legal) setAction({..., 0, s64}, Legal) setAction({..., 1, s32}, Legal) setAction({..., 1, s64}, Legal)
currently declares these type combinations as legal:
{s32, s32} {s64, s32} {s32, s64} {s64, s64}
but we currently have no means to say that, for example, {s64, s32} is
not legal. Some operations such as G_INSERT/G_EXTRACT/G_MERGE_VALUES/
G_UNMERGE_VALUES have relationships between the types that are currently
described incorrectly.
Additionally, G_LOAD/G_STORE currently have no means to legalize non-atomics
differently to atomics. The necessary information is in the MMO but we have no
way to use this in the legalizer. Similarly, there is currently no way for the
register type and the memory type to differ so there is no way to cleanly
represent extending-load/truncating-store in a way that can't be broken by
optimizers (resulting in illegal MIR).
It's also difficult to control the legalization strategy. We've added support
for legalizing non-power of 2 types but there's still some hardcoded assumptions
about the strategy. The main one I've noticed is that type0 is always legalized
before type1 which is not a good strategy for type0 = G_EXTRACT type1, ... if
you need to widen the container. It will converge on the same result eventually
but it will take a much longer route when legalizing type0 than if you legalize
type1 first.
Lastly, the definition of legality and the legalization strategy is kept
separate which is not ideal. It's helpful to be able to look at a one piece of
code and see both what is legal and the method the legalizer will use to make
illegal MIR more legal.
This patch adds a layer onto the LegalizerInfo (to be removed when all targets
have been migrated) which resolves all these issues.
Here are the rules for shift and division:
for (unsigned BinOp : {G_LSHR, G_ASHR, G_SDIV, G_UDIV}) getActionDefinitions(BinOp) .legalFor({s32, s64}) // If type0 is s32/s64 then it's Legal .clampScalar(0, s32, s64) // If type0 is <s32 then WidenScalar to s32 // If type0 is >s64 then NarrowScalar to s64 .widenScalarToPow2(0) // Round type0 scalars up to powers of 2 .unsupported(); // Otherwise, it's unsupported
This describes everything needed to both define legality and describe how to
make illegal things legal.
Here's an example of a complex rule:
getActionDefinitions(G_INSERT) .unsupportedIf([=](const LegalityQuery &Query) { // If type0 is smaller than type1 then it's unsupported return Query.Types[0].getSizeInBits() <= Query.Types[1].getSizeInBits(); }) .legalIf([=](const LegalityQuery &Query) { // If type0 is s32/s64/p0 and type1 is a power of 2 other than 2 or 4 then it's legal // We don't need to worry about large type1's because unsupportedIf caught that. const LLT &Ty0 = Query.Types[0]; const LLT &Ty1 = Query.Types[1]; if (Ty0 != s32 && Ty0 != s64 && Ty0 != p0) return false; return isPowerOf2_32(Ty1.getSizeInBits()) && (Ty1.getSizeInBits() == 1 || Ty1.getSizeInBits() >= 8); }) .clampScalar(0, s32, s64) .widenScalarToPow2(0) .maxScalarIf(typeInSet(0, {s32}), 1, s16) // If type0 is s32 and type1 is bigger than s16 then NarrowScalar type1 to s16 .maxScalarIf(typeInSet(0, {s64}), 1, s32) // If type0 is s64 and type1 is bigger than s32 then NarrowScalar type1 to s32 .widenScalarToPow2(1) // Round type1 scalars up to powers of 2 .unsupported();
This uses a lambda to say that G_INSERT is unsupported when type0 is bigger than
type1 (in practice, this would be a default rule for G_INSERT). It also uses one
to describe the legal cases. This particular predicate is equivalent to:
.legalFor({{s32, s1}, {s32, s8}, {s32, s16}, {s64, s1}, {s64, s8}, {s64, s16}, {s64, s32}})
In terms of performance, I saw a slight (~6%) performance improvement when
AArch64 was around 30% ported but it's pretty much break even right now.
I'm going to take a look at constexpr as a means to reduce the initialization
cost.
Future work:
- Make it possible for opcodes to share rulesets. There's no need for G_LSHR/G_ASHR/G_SDIV/G_UDIV to have separate rule and ruleset objects. There's no technical barrier to this, it just hasn't been done yet.
- Replace the type-index numbers with an enum to get .clampScalar(Type0, s32, s64)
- Better names for things like .maxScalarIf() (clampMaxScalar?) and the vector rules.
- Improve initialization cost using constexpr
Possible future work:
- It's possible to make these rulesets change the MIR directly instead of returning a description of how to change the MIR. This should remove a little overhead caused by parsing the description and routing to the right code, but the real motivation is that it removes the need for LegalizeAction::Custom. With Custom removed, there's no longer a requirement that Custom legalization change the opcode to something that's considered legal.
This doesn't look like it belongs here...