There aren't very many requirements on the legalization rules but we should
document them.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
llvm/docs/GlobalISel.rst | ||
---|---|---|
464–465 ↗ | (On Diff #201323) | This should also probably say "selected" to copy, since IIRC COPY is not allowed to convert the type with generic virtual registers |
llvm/docs/GlobalISel.rst | ||
---|---|---|
405 ↗ | (On Diff #201323) | "GMIR" is used elsewhere in the document, probably best to use that instead of "gMIR" |
408–413 ↗ | (On Diff #201323) | Is it possible to use the fancy definition list thing here, or use some formatting to separate Producer Type Set/Consumer Type Set from the rest of the paragraph? (e.g. bold or italics) It's not terribly, long, but if I was skimming for the definitions, I think it would help. |
420 ↗ | (On Diff #201323) | four |
426 ↗ | (On Diff #201323) | Since the point of the section is to get the minimum rule set across I think this flow makes a bit more sense:
I think it would be good to have the mandatory instructions in a list rather than a paragraph too, for the sake of skimmability. |
450 ↗ | (On Diff #201323) | "LLVM IR" is more idiomatic, I think |
llvm/docs/GlobalISel.rst | ||
---|---|---|
426 ↗ | (On Diff #201323) | That makes sense. I'll re-work it to that |
456–457 ↗ | (On Diff #201323) | I'm trying to convey that even if the target doesn't use scalarize() to eliminate the vector, there's multiple choices on how to handle G_BUILD_VECTOR and no requirement to pick a specific method. In particular there's no requirement for the end point of legalization to be a .legal*() rule that marks the G_BUILD_VECTOR legal. An alternative is .custom*() that replaces it with a sequence of target instructions. The overall effect is that there aren't any legal G_BUILD_VECTOR's but vectors are still supported. I guess something like this would be clearer: There are no legality requirements for G_BUILD_VECTOR, or G_BUILD_VECTOR_TRUNC since these can be scalarized, lowered to other instructions, or lowered to target instructions. |
464–465 ↗ | (On Diff #201323) |
How are those represented in gMIR? I haven't had to deal with them
That's a good point, it needs to be constrained to a register class otherwise the types have to match. |
Update to account for review comments so far except for the non-integral
pointers one for which I need more information.
Producer/Consumer type set and minimum rules for scalars are assuming that set of available integer scalars is the same as set of all available floating point scalars. This does not have to be the case.
Could we define Producer/Consumer type set in terms of available sizes in register banks?
That is Producer/Consumer type set for G_ANYEXT, G_SEXT, G_ZEXT, G_TRUNC corresponds to available sizes in GPR register bank (integer instructions).
That is Producer/Consumer type set for G_FPEXT, G_FPTRUNC corresponds to available sizes in FPR register bank (floating point instructions).
This is the problem with GMIR in general since type (integer/float) is not available in low level type, but in most cases it is possible to figure out the type from opcode (float opcodes start with G_F , integer start with G_).
There are also ambiguous opcodes that are used for both integer and floating point instructions assuming they are available for same sizes (G_PHI, G_LOAD, G_STORE, G_IMPLICIT_DEF).
llvm/docs/GlobalISel.rst | ||
---|---|---|
422 ↗ | (On Diff #201379) | Here we are assuming that set of available integer scalars is the same as set of all available floating point scalars. This does not have to be the case. |
Hi Daniel,
Thanks for writing this down. I think the description is accurate (modulo the bit cast stuff) of what we do today.
That said, the vision (that we don't implement currently) for the legalizer was actually much simpler:
- The target needs to provide a way to legalize exts from producer/consumer sets.
- The target needs to support loading and storing from producer/consumer sets.
From there, supporting truncates, merge/unmerge, build vector and so on is a matter of issuing the right loads and stores. That's not very efficient thus we encourage targets to do what you wrote down, but the idea was that it is more an optimization than anything else.
Cheers,
-Quentin
llvm/docs/GlobalISel.rst | ||
---|---|---|
455 ↗ | (On Diff #201379) | Just a remark. |
llvm/docs/GlobalISel.rst | ||
---|---|---|
464–465 ↗ | (On Diff #201323) | They're the same as any other, it's just a field in the datalayout. The target defines these, so some set of operations on them must be legal. G_PTRTOINT/G_INTTOPTR is not allowed for them |
Scalars are a bag of bits that do not have an associated floating-point/integer interpretation. That interpretation is provided by the operation
Could we define Producer/Consumer type set in terms of available sizes in register banks?
That is Producer/Consumer type set for G_ANYEXT, G_SEXT, G_ZEXT, G_TRUNC corresponds to available sizes in GPR register bank (integer instructions).
That is Producer/Consumer type set for G_FPEXT, G_FPTRUNC corresponds to available sizes in FPR register bank (floating point instructions).
No, because register banks aren't a concept that legalization operates on. They're mostly introduced by the Register Bank Allocator
This is the problem with GMIR in general since type (integer/float) is not available in low level type, but in most cases it is possible to figure out the type from opcode (float opcodes start with G_F , integer start with G_).
You shouldn't be trying to figure out the data format of a scalar in GMIR as you'll get contradictory results (e.g. G_ADD feeding into G_FADD). The operations impart an interpretation on the bits in a register w.r.t that operation but the values in registers are just abstract bundles of bits.
There are also ambiguous opcodes that are used for both integer and floating point instructions assuming they are available for same sizes (G_PHI, G_LOAD, G_STORE, G_IMPLICIT_DEF).
They're not really ambiguous, the data format just has no meaning to them. G_PHI/G_LOAD/G_STORE are just moving bundles of bits around without interpreting them. G_IMPLICIT_DEF just invents a bundle of bits with unspecified values.
Those rules work when there's a load/store for every type but they run into unable-to-legalize if that's not the case. For example, suppose you have 8, 16, and 32-bit load/store but you also have 64-bit add. The s64 would need to narrowScalar to be usable with the s32 load/stores but that's not possible if narrowScalar needs an s64 store to narrow it. You'd need either G_TRUNC + G_LSHR (the G_LSHR can be a libcall) or G_UNMERGE_VALUES (which itself can legalize to G_TRUNC+G_LSHR) to legalize that case.
llvm/docs/GlobalISel.rst | ||
---|---|---|
422 ↗ | (On Diff #201379) |
No, the interpretation of the bits as being integer or floating point is irrelevant to this. An s64 containing a floating point value is still required to be truncatable with G_TRUNC. For example, this is necessary to feed it into an s32 store instruction. Similarly, an s32 containing half of a floating point value is still required to be any-extendable with G_ANYEXT in order to compose it with the other half to construct a full value.
G_TRUNC must be legal from s64 to s32 for a target to be guaranteed a means to store the s64 to memory (MIPS would implement it using mfc1). You're unlikely to actually use that path since MIPS you also have legal operations that make the guaranteed path redundant (e.g. sdc1) |
455 ↗ | (On Diff #201379) | Yep, I should clarify that. The intent was that because some targets can cope without it there's no requirement in the minimal set. I'll update the wording on this shortly |
Those rules work when there's a load/store for every type but they run into unable-to-legalize if that's not the case.
Indeed, and that was the intent. For the case that are not true, like in the example you describe, the target has to support those cases with custom expansion or other. In other words, what you describing get into specifics we didn't want to cover initially in the generic code.
llvm/docs/GlobalISel.rst | ||
---|---|---|
422 ↗ | (On Diff #201379) | OK, let's add rules defined here to ones that already exist. Where could we perform desired legalization/combine ? Assume that we are able to find out what needs to be combined/legalized with narrow scalar. |
llvm/docs/GlobalISel.rst | ||
---|---|---|
422 ↗ | (On Diff #201379) | You seem to be folding the job of RegBankSelect (or a custom pass) into the legalizer. Using your G_LOAD example, the availability of ldc1 makes s64 G_LOAD legal because it's possible to handle it any s64 G_LOAD that occurs. It might not be the most efficient thing to do in a given context but efficiency is not the legalizers job. The legalizer is merely tasked with eliminating things that are (or might be) impossible to handle in subsequent passes. As a side note, the artifact combiner doesn't really belong in the legalizer but exists as a concession to prevent the legalizer blowing up the size of the GMIR too much. It's intentionally limited to combining cases that the legalizer often emits around widenScalar/narrowScalar and we don't want the scope of the artifact combiner to increase because it shouldn't really be there at all. Binding the s64 G_LOAD to a register bank in the most efficient way is the job of the RegBankAllocator. Note that this isn't purely a matter of assigning a register bank to a vreg, it can also change the instructions.
In summary, trying to legalize i64 G_LOAD and f64 G_LOAD differently is bad because it's not the type of the data that determines what the most efficient instructions are and whichever decision you make you're locking the code generator out of selecting the best code. It's really the operations you perform on the data that are important as in some cases, i64 is best handled with floating point instructions and in others f64 is best handled with integer instructions. |
llvm/docs/GlobalISel.rst | ||
---|---|---|
422 ↗ | (On Diff #201379) | Thanks for the explanation. |
Hi @Petar.Avramovic ,
I second Daniel's comment, this is RegBankSelect's job to do this choice and the Legalizer shouldn't need to know about f64 vs. s64.
This way we duplicate some of the code from Legalizer and legalize/combine some instructions manually.
The way it was thought was that RegBankSelect should be able to call any Legalizer helper or whatever way you want to share the code between the legalizer and RegBankSelect.
Cheers,
-Quentin
llvm/docs/GlobalISel.rst | ||
---|---|---|
422 ↗ | (On Diff #201379) | As Quentin mentioned, the LegalizerHelper is available to all passes to enable the sharing of that code with any pass that needs it and there's no requirement that its usage is driven by consulting a LegalizerInfo. So after deciding to bind to GPRB, your code would just call narrowScalar(MI, 0, s32) to break it into two G_LOAD's. I'm not sure off-hand if the artifact combiner code is available in that interface at the moment but it should be so we can add it if it's not available today. |
llvm/docs/GlobalISel.rst | ||
---|---|---|
429 ↗ | (On Diff #205166) | Do G_LOAD and G_STORE deserve a mention here? They might have some complications depending on values in MemDesc, but should be legal for all inputs from producer and consumer type set in combination with some values in MemDesc. |
llvm/docs/GlobalISel.rst | ||
---|---|---|
429 ↗ | (On Diff #205166) | There's no minimal requirement for G_LOAD/G_STORE aside from being able to load/store _somehow_. You need at least one legal G_LOAD and G_STORE but there's no common subset that all targets can agree on. For example, while it's very common for targets to support all their producer/consumer types it's still possible to have targets that have s64 in their producer type set but don't support any s64 loads/stores. One case I know of is MIPS32 with the DSP ASE has 64-bit results from madd but doesn't have any 64-bit load/store instructions. I've also worked on a proprietary target in the past that had a restriction like this. Focusing more on the MemDesc side of things, it's very common for targets to support 1-byte accesses but it's still possible to not support that so long as at least one size is supported (you have to do the smaller sizes as a read-modify-write operation). I don't have an example to hand on this but there have been occasional threads on llvm-dev asking about a target like this. It's also possible to not support atomics (of course, you can't use language/library features that depend on it in that case) |