This is an archive of the discontinued LLVM Phabricator instance.

Paths

Table of Contentst

-
llvm/trunk/docs/
-
trunk/
-
docs/
-
GlobalISel.rst

Differential D62423

[globalisel][legalizer] Attempt to write down the minimal legalization rules
ClosedPublic

Authored by dsanders on May 24 2019, 1:33 PM.

Download Raw Diff

Details

Reviewers

aditya_nandakumar
volkan
bogner
paquette
aemerson
rovka
arsenm
Petar.Avramovic
qcolombet

Commits

rG5b40eb057225: [globalisel][legalizer] Attempt to write down the minimal legalization rules
rL368321: [globalisel][legalizer] Attempt to write down the minimal legalization rules

Summary

There aren't very many requirements on the legalization rules but we should
document them.

Diff Detail

Repository: rL LLVM

Event Timeline

dsanders created this revision.May 24 2019, 1:33 PM

Herald added a project: Restricted Project. · View Herald TranscriptMay 24 2019, 1:33 PM

Herald added subscribers: kristof.beyls, wdng. · View Herald Transcript

dsanders added a reviewer: qcolombet.May 24 2019, 1:34 PM

Harbormaster completed remote builds in B32477: Diff 201323.May 24 2019, 1:35 PM

arsenm added inline comments.May 24 2019, 1:45 PM

llvm/docs/GlobalISel.rst
456–457 ↗	(On Diff #201323)	I don't follow this. How do you know they can be lowered to target instructions?
464–465 ↗	(On Diff #201323)	This ignores the existence of non-integral pointers

arsenm added inline comments.May 24 2019, 1:47 PM

llvm/docs/GlobalISel.rst
464–465 ↗	(On Diff #201323)	This should also probably say "selected" to copy, since IIRC COPY is not allowed to convert the type with generic virtual registers

paquette added inline comments.May 24 2019, 2:27 PM

llvm/docs/GlobalISel.rst
405 ↗	(On Diff #201323)	"GMIR" is used elsewhere in the document, probably best to use that instead of "gMIR"
408–413 ↗	(On Diff #201323)	Is it possible to use the fancy definition list thing here, or use some formatting to separate Producer Type Set/Consumer Type Set from the rest of the paragraph? (e.g. bold or italics) It's not terribly, long, but if I was skimming for the definitions, I think it would help.
420 ↗	(On Diff #201323)	four
426 ↗	(On Diff #201323)	Since the point of the section is to get the minimum rule set across I think this flow makes a bit more sense: G_ANYEXT and G_TRUNC must have legalization rules Introduce the other instructions Explanations for why you don't have mandatory rules for the other instructions I think it would be good to have the mandatory instructions in a list rather than a paragraph too, for the sake of skimmability.
450 ↗	(On Diff #201323)	"LLVM IR" is more idiomatic, I think

dsanders marked 3 inline comments as done.May 24 2019, 3:51 PM

dsanders added inline comments.

llvm/docs/GlobalISel.rst
426 ↗	(On Diff #201323)	That makes sense. I'll re-work it to that
456–457 ↗	(On Diff #201323)	I'm trying to convey that even if the target doesn't use `scalarize()` to eliminate the vector, there's multiple choices on how to handle G_BUILD_VECTOR and no requirement to pick a specific method. In particular there's no requirement for the end point of legalization to be a `.legal()` rule that marks the G_BUILD_VECTOR legal. An alternative is `.custom()` that replaces it with a sequence of target instructions. The overall effect is that there aren't any legal G_BUILD_VECTOR's but vectors are still supported. I guess something like this would be clearer: There are no legality requirements for G_BUILD_VECTOR, or G_BUILD_VECTOR_TRUNC since these can be scalarized, lowered to other instructions, or lowered to target instructions.
464–465 ↗	(On Diff #201323)	This ignores the existence of non-integral pointers How are those represented in gMIR? I haven't had to deal with them This should also probably say "selected" to copy, since IIRC COPY is not allowed to convert the type with generic virtual registers That's a good point, it needs to be constrained to a register class otherwise the types have to match.

Update to account for review comments so far except for the non-integral
pointers one for which I need more information.

Harbormaster completed remote builds in B32491: Diff 201379.May 24 2019, 7:20 PM

Producer/Consumer type set and minimum rules for scalars are assuming that set of available integer scalars is the same as set of all available floating point scalars. This does not have to be the case.
Could we define Producer/Consumer type set in terms of available sizes in register banks?
That is Producer/Consumer type set for G_ANYEXT, G_SEXT, G_ZEXT, G_TRUNC corresponds to available sizes in GPR register bank (integer instructions).
That is Producer/Consumer type set for G_FPEXT, G_FPTRUNC corresponds to available sizes in FPR register bank (floating point instructions).

This is the problem with GMIR in general since type (integer/float) is not available in low level type, but in most cases it is possible to figure out the type from opcode (float opcodes start with G_F , integer start with G_).
There are also ambiguous opcodes that are used for both integer and floating point instructions assuming they are available for same sizes (G_PHI, G_LOAD, G_STORE, G_IMPLICIT_DEF).

llvm/docs/GlobalISel.rst
422 ↗	(On Diff #201379)	Here we are assuming that set of available integer scalars is the same as set of all available floating point scalars. This does not have to be the case. e.g. "-mtriple=mipsel-linux-gnu" "-mtriple=i386-linux-gnu -mattr=+sse2"(http://lists.llvm.org/pipermail/llvm-dev/2017-July/114930.html) have i32, float/double available but i64 is not available. G_FPTRUNC is legal for {s32, s64} but G_TRUNC is not (for -mtriple=mipsel-linux-gnu I would say that it is unsupported since there are no register classes that hold less then 32 bits, and would expect artifact combiner to deal with G_TRUNC).

Hi Daniel,

Thanks for writing this down. I think the description is accurate (modulo the bit cast stuff) of what we do today.
That said, the vision (that we don't implement currently) for the legalizer was actually much simpler:

The target needs to provide a way to legalize exts from producer/consumer sets.
The target needs to support loading and storing from producer/consumer sets.

From there, supporting truncates, merge/unmerge, build vector and so on is a matter of issuing the right loads and stores. That's not very efficient thus we encourage targets to do what you wrote down, but the idea was that it is more an optimization than anything else.

Cheers,
-Quentin

llvm/docs/GlobalISel.rst
455 ↗	(On Diff #201379)	Just a remark. Bitcasts can actually be tricky to lower (from langref: `The conversion is done as if the value had been stored to memory and read back as type ty2`) because it can shuffle bits around. Therefore we should add that if the bistcast for the dst and src type is not a no-op, then the target should know how to expand it.

arsenm added inline comments.May 27 2019, 5:50 PM

llvm/docs/GlobalISel.rst
464–465 ↗	(On Diff #201323)	They're the same as any other, it's just a field in the datalayout. The target defines these, so some set of operations on them must be legal. G_PTRTOINT/G_INTTOPTR is not allowed for them

In D62423#1517863, @Petar.Avramovic wrote:

Producer/Consumer type set and minimum rules for scalars are assuming that set of available integer scalars is the same as set of all available floating point scalars. This does not have to be the case.

Scalars are a bag of bits that do not have an associated floating-point/integer interpretation. That interpretation is provided by the operation

Could we define Producer/Consumer type set in terms of available sizes in register banks?
That is Producer/Consumer type set for G_ANYEXT, G_SEXT, G_ZEXT, G_TRUNC corresponds to available sizes in GPR register bank (integer instructions).
That is Producer/Consumer type set for G_FPEXT, G_FPTRUNC corresponds to available sizes in FPR register bank (floating point instructions).

No, because register banks aren't a concept that legalization operates on. They're mostly introduced by the Register Bank Allocator

This is the problem with GMIR in general since type (integer/float) is not available in low level type, but in most cases it is possible to figure out the type from opcode (float opcodes start with G_F , integer start with G_).

You shouldn't be trying to figure out the data format of a scalar in GMIR as you'll get contradictory results (e.g. G_ADD feeding into G_FADD). The operations impart an interpretation on the bits in a register w.r.t that operation but the values in registers are just abstract bundles of bits.

There are also ambiguous opcodes that are used for both integer and floating point instructions assuming they are available for same sizes (G_PHI, G_LOAD, G_STORE, G_IMPLICIT_DEF).

They're not really ambiguous, the data format just has no meaning to them. G_PHI/G_LOAD/G_STORE are just moving bundles of bits around without interpreting them. G_IMPLICIT_DEF just invents a bundle of bits with unspecified values.

In D62423#1518267, @qcolombet wrote:

Hi Daniel,

Thanks for writing this down. I think the description is accurate (modulo the bit cast stuff) of what we do today.
That said, the vision (that we don't implement currently) for the legalizer was actually much simpler:

The target needs to provide a way to legalize exts from producer/consumer sets.

The target needs to support loading and storing from producer/consumer sets.

From there, supporting truncates, merge/unmerge, build vector and so on is a matter of issuing the right loads and stores. That's not very efficient thus we encourage targets to do what you wrote down, but the idea was that it is more an optimization than anything else.

Those rules work when there's a load/store for every type but they run into unable-to-legalize if that's not the case. For example, suppose you have 8, 16, and 32-bit load/store but you also have 64-bit add. The s64 would need to narrowScalar to be usable with the s32 load/stores but that's not possible if narrowScalar needs an s64 store to narrow it. You'd need either G_TRUNC + G_LSHR (the G_LSHR can be a libcall) or G_UNMERGE_VALUES (which itself can legalize to G_TRUNC+G_LSHR) to legalize that case.

llvm/docs/GlobalISel.rst
422 ↗	(On Diff #201379)	Here we are assuming that set of available integer scalars is the same as set of all available floating point scalars. This does not have to be the case. e.g. "-mtriple=mipsel-linux-gnu" "-mtriple=i386-linux-gnu -mattr=+sse2"(http://lists.llvm.org/pipermail/llvm-dev/2017-July/114930.html) have i32, float/double available but i64 is not available. No, the interpretation of the bits as being integer or floating point is irrelevant to this. An s64 containing a floating point value is still required to be truncatable with G_TRUNC. For example, this is necessary to feed it into an s32 store instruction. Similarly, an s32 containing half of a floating point value is still required to be any-extendable with G_ANYEXT in order to compose it with the other half to construct a full value. G_FPTRUNC is legal for {s32, s64} but G_TRUNC is not (for -mtriple=mipsel-linux-gnu I would say that it is unsupported since there are no register > classes that hold less then 32 bits, and would expect artifact combiner to deal with G_TRUNC). G_TRUNC must be legal from s64 to s32 for a target to be guaranteed a means to store the s64 to memory (MIPS would implement it using mfc1). You're unlikely to actually use that path since MIPS you also have legal operations that make the guaranteed path redundant (e.g. sdc1)
455 ↗	(On Diff #201379)	Yep, I should clarify that. The intent was that because some targets can cope without it there's no requirement in the minimal set. I'll update the wording on this shortly

Clarify the legality of G_BITCAST where non-trivial bitcasts appear.

Harbormaster completed remote builds in B33506: Diff 205166.Jun 17 2019, 1:27 PM

Those rules work when there's a load/store for every type but they run into unable-to-legalize if that's not the case.

Indeed, and that was the intent. For the case that are not true, like in the example you describe, the target has to support those cases with custom expansion or other. In other words, what you describing get into specifics we didn't want to cover initially in the generic code.

llvm/docs/GlobalISel.rst
422 ↗	(On Diff #201379)	OK, let's add rules defined here to ones that already exist. MIPS will then often have to select sequences of move instructions because Legalizer did not perform desired combine/narrow scalar because type information was not available. Where could we perform desired legalization/combine ? Assume that we are able to find out what needs to be combined/legalized with narrow scalar. Maybe target custom postLegalizer Legalizer pass or target custom postRegBankSelect Legalizer pass. It would narrow scalar i64 G_LOAD but would not touch f64 G_LOAD. Or to let targets to put together a InstList/ArtifactList during RegBankSelect and forward these lists to legalize (function that would hold main do/while loop from Legalizer.cpp) at the end of regBankSelect, this way we don't have to process all instructions just ones that need to be re-legalized. Here we need different LegalizerInfo object than one used in Legalizer pass.

dsanders marked an inline comment as done.Jun 18 2019, 11:29 AM

dsanders added inline comments.

llvm/docs/GlobalISel.rst
422 ↗	(On Diff #201379)	You seem to be folding the job of RegBankSelect (or a custom pass) into the legalizer. Using your G_LOAD example, the availability of ldc1 makes s64 G_LOAD legal because it's possible to handle it any s64 G_LOAD that occurs. It might not be the most efficient thing to do in a given context but efficiency is not the legalizers job. The legalizer is merely tasked with eliminating things that are (or might be) impossible to handle in subsequent passes. As a side note, the artifact combiner doesn't really belong in the legalizer but exists as a concession to prevent the legalizer blowing up the size of the GMIR too much. It's intentionally limited to combining cases that the legalizer often emits around widenScalar/narrowScalar and we don't want the scope of the artifact combiner to increase because it shouldn't really be there at all. Binding the s64 G_LOAD to a register bank in the most efficient way is the job of the RegBankAllocator. Note that this isn't purely a matter of assigning a register bank to a vreg, it can also change the instructions. You have two main implementation choices in RegBankAllocator and the choice will depend on the cost of the alternatives. The simplest option is to bind it to the only register bank capable of doing the s64 G_LOAD (FPRB on MIPS). This will encourage the RegBankAllocator to bind the data dependent operations to FPRB and emit a copy when it has to use GPRB (or doing so would require fewer copies). This isn't the ideal option though as it forces all 64-bit loads through the FPU and the mfhc1/mfc1/ldc1 costs more than lw/lw. The better option is to give the RegBankAllocator a choice on how to bind s64 G_LOAD so that if it chooses GPRB it costs two loads plus any cross bank move cost caused by data dependencies and if it chooses FPRB it costs one load plus any cross bank move cost. This gives it the freedom to make the right decision for the context: If you're just copying data from one memory location to another then (assuming s64 G_STORE is handled similarly) then it will bind to FPRB and use ldc1/sdc1 to copy the data which beats lw/sw/lw/sw. If you're doing floating point operations on the data then it will generally bind to FPRB and use something like ldc1/fadd.d which beats lw/lw/mthc1/mtc1/fadd.d If you're doing bit twiddling on floating point data (e.g. fneg vs xor 0x800000000000000, or fabs vs and 0x7fffffffffffffff) then it will have a choice. It will most likely pick FPRB and ldc1/fneg.d rather than lw/lw/xor 0x80000000 if there's other data-dependent floating point operations nearby but the costs are fairly close so the decision can go with GPRB in the right context If you're doing integer operations on the data then it will generally bind to GPRB and use something like lw/lw/setlt/add/add/add (for a narrowScalar'd s64 G_ADD). In summary, trying to legalize i64 G_LOAD and f64 G_LOAD differently is bad because it's not the type of the data that determines what the most efficient instructions are and whichever decision you make you're locking the code generator out of selecting the best code. It's really the operations you perform on the data that are important as in some cases, i64 is best handled with floating point instructions and in others f64 is best handled with integer instructions.

Petar.Avramovic added inline comments.Jun 19 2019, 6:12 AM

llvm/docs/GlobalISel.rst
422 ↗	(On Diff #201379)	Thanks for the explanation. Following up comment above, let's assume that we figured out that most efficient solution is to modify instruction (e.g. turn s64 G_LOAD into two s32 G_LOADs). What I meant by legalize is that changing instruction in e.g. `applyMappingImpl' is equivalent to applying some of legalization / artifact combine rules. This way we duplicate some of the code from Legalizer and legalize/combine some instructions manually. If e.g. G_LOAD needs to be turned into lw/lw/G_MERGE there is a G_UNMERGE somewhere waiting, dealing with them is equivalent to narrowScalar for G_LOAD + MERGE/UNMERGE combine. It looks nicer to collect all instructions that have to be "relegalized" and forward them to legalizer with RegBankLegalizerInfo (this one has narrowScalar for G_LOAD from s64 to s32, the one from Legalizer has G_LOAD legal for both s64 and s32 because of Minimum Rule Set). Is this job for RegBankSelect or Target custom Pass (Legalizer2) ?

Hi @Petar.Avramovic ,

I second Daniel's comment, this is RegBankSelect's job to do this choice and the Legalizer shouldn't need to know about f64 vs. s64.

This way we duplicate some of the code from Legalizer and legalize/combine some instructions manually.

The way it was thought was that RegBankSelect should be able to call any Legalizer helper or whatever way you want to share the code between the legalizer and RegBankSelect.

Cheers,
-Quentin

atanasyan added a subscriber: atanasyan.Jun 20 2019, 6:02 AM

dsanders marked an inline comment as done.Jun 20 2019, 10:38 AM

dsanders added inline comments.

llvm/docs/GlobalISel.rst
422 ↗	(On Diff #201379)	As Quentin mentioned, the LegalizerHelper is available to all passes to enable the sharing of that code with any pass that needs it and there's no requirement that its usage is driven by consulting a LegalizerInfo. So after deciding to bind to GPRB, your code would just call `narrowScalar(MI, 0, s32)` to break it into two G_LOAD's. I'm not sure off-hand if the artifact combiner code is available in that interface at the moment but it should be so we can add it if it's not available today.

arsenm mentioned this in D64726: AMDGPU/GlobalISel: Fix not constraining result reg of copies to VCC.Jul 16 2019, 6:58 AM

Ping. I think we reached a conclusion on this but I lack an LGTM :-)

Petar.Avramovic added inline comments.Aug 8 2019, 3:16 AM

llvm/docs/GlobalISel.rst
429 ↗	(On Diff #205166)	Do G_LOAD and G_STORE deserve a mention here? They might have some complications depending on values in MemDesc, but should be legal for all inputs from producer and consumer type set in combination with some values in MemDesc.

LGTM.

This revision is now accepted and ready to land.Aug 8 2019, 3:16 AM

dsanders marked an inline comment as done.Aug 8 2019, 10:51 AM

dsanders added inline comments.

llvm/docs/GlobalISel.rst
429 ↗	(On Diff #205166)	There's no minimal requirement for G_LOAD/G_STORE aside from being able to load/store _somehow_. You need at least one legal G_LOAD and G_STORE but there's no common subset that all targets can agree on. For example, while it's very common for targets to support all their producer/consumer types it's still possible to have targets that have s64 in their producer type set but don't support any s64 loads/stores. One case I know of is MIPS32 with the DSP ASE has 64-bit results from madd but doesn't have any 64-bit load/store instructions. I've also worked on a proprietary target in the past that had a restriction like this. Focusing more on the MemDesc side of things, it's very common for targets to support 1-byte accesses but it's still possible to not support that so long as at least one size is supported (you have to do the smaller sizes as a read-modify-write operation). I don't have an example to hand on this but there have been occasional threads on llvm-dev asking about a target like this. It's also possible to not support atomics (of course, you can't use language/library features that depend on it in that case)

Closed by commit rL368321: [globalisel][legalizer] Attempt to write down the minimal legalization rules (authored by dsanders). · Explain WhyAug 8 2019, 10:54 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path

Size

llvm/

trunk/

docs/

GlobalISel.rst

95 lines

Diff 214191

llvm/trunk/docs/GlobalISel.rst

	Show First 20 Lines • Show All 531 Lines • ▼ Show 20 Lines
	potentially useful project.			potentially useful project.

	.. rubric:: Footnotes			.. rubric:: Footnotes

	.. [#legalizer-legacy-footnote] An API is broadly similar to			.. [#legalizer-legacy-footnote] An API is broadly similar to
	SelectionDAG/TargetLowering is available but is not recommended as a more			SelectionDAG/TargetLowering is available but is not recommended as a more
	powerful API is available.			powerful API is available.

				.. _min-legalizerinfo:

				Minimum Rule Set
				^^^^^^^^^^^^^^^^

				GlobalISel's legalizer has a great deal of flexibility in how a given target
				shapes the GMIR that the rest of the backend must handle. However, there are
				a small number of requirements that all targets must meet.

				Before discussing the minimum requirements, we'll need some terminology:

				Producer Type Set
				The set of types which is the union of all possible types produced by at
				least one legal instruction.

				Consumer Type Set
				The set of types which is the union of all possible types consumed by at
				least one legal instruction.

				Both sets are often identical but there's no guarantee of that. For example,
				it's not uncommon to be unable to consume s64 but still be able to produce it
				for a few specific instructions.

				Minimum Rules For Scalars
				"""""""""""""""""""""""""

				* G_ANYEXT must be legal for all inputs from the producer type set and all larger
				outputs from the consumer type set.
				* G_TRUNC must be legal for all inputs from the producer type set and all
				smaller outputs from the consumer type set.

				G_ANYEXT, and G_TRUNC have mandatory legality since the GMIR requires a means to
				connect operations with different type sizes. They are usually trivial to support
				since G_ANYEXT doesn't define the value of the additional bits and G_TRUNC is
				discarding bits. The other conversions can be lowered into G_ANYEXT/G_TRUNC
				with some additional operations that are subject to further legalization. For
				example, G_SEXT can lower to:
				%1 = G_ANYEXT %0
				%2 = G_CONSTANT ...
				%3 = G_SHL %1, %2
				%4 = G_ASHR %3, %2
				and the G_CONSTANT/G_SHL/G_ASHR can further lower to other operations or target
				instructions. Similarly, G_FPEXT has no legality requirement since it can lower
				to a G_ANYEXT followed by a target instruction.

				G_MERGE_VALUES and G_UNMERGE_VALUES do not have legality requirements since the
				former can lower to G_ANYEXT and some other legalizable instructions, while the
				latter can lower to some legalizable instructions followed by G_TRUNC.

				Minimum Legality For Vectors
				""""""""""""""""""""""""""""

				Within the vector types, there aren't any defined conversions in LLVM IR as
				vectors are often converted by reinterpreting the bits or by decomposing the
				vector and reconstituting it as a different type. As such, G_BITCAST is the
				only operation to account for. We generally don't require that it's legal
				because it can usually be lowered to COPY (or to nothing using
				replaceAllUses()). However, there are situations where G_BITCAST is non-trivial
				(e.g. little-endian vectors of big-endian data such as on big-endian MIPS MSA and
				big-endian ARM NEON, see `_i_bitcast`). To account for this G_BITCAST must be
				legal for all type combinations that change the bit pattern in the value.

				There are no legality requirements for G_BUILD_VECTOR, or G_BUILD_VECTOR_TRUNC
				since these can be handled by:
				* Declaring them legal.
				* Scalarizing them.
				* Lowering them to G_TRUNC+G_ANYEXT and some legalizable instructions.
				* Lowering them to target instructions which are legal by definition.

				The same reasoning also allows G_UNMERGE_VALUES to lack legality requirements
				for vector inputs.

				Minimum Legality for Pointers
				"""""""""""""""""""""""""""""

				There are no minimum rules for pointers since G_INTTOPTR and G_PTRTOINT can
				be selected to a COPY from register class to another by the legalizer.

				Minimum Legality For Operations
				"""""""""""""""""""""""""""""""

				The rules for G_ANYEXT, G_MERGE_VALUES, G_BITCAST, G_BUILD_VECTOR,
				G_BUILD_VECTOR_TRUNC, G_CONCAT_VECTORS, G_UNMERGE_VALUES, G_PTRTOINT, and
				G_INTTOPTR have already been noted above. In addition to those, the following
				operations have requirements:
				* At least one G_IMPLICIT_DEF must be legal. This is usually trivial as it
				requires no code to be selected.
				* G_PHI must be legal for all types in the producer and consumer typesets. This
				is usually trivial as it requires no code to be selected.
				* At least one G_FRAME_INDEX must be legal
				* At least one G_BLOCK_ADDR must be legal

				There are many other operations you'd expect to have legality requirements but
				they can be lowered to target instructions which are legal by definition.

	.. _regbankselect:			.. _regbankselect:

	RegBankSelect			RegBankSelect
	-------------			-------------

	This pass constrains the :ref:`gmir-gvregs` operands of generic			This pass constrains the :ref:`gmir-gvregs` operands of generic
	instructions to some :ref:`gmir-regbank`.			instructions to some :ref:`gmir-regbank`.

	▲ Show 20 Lines • Show All 295 Lines • Show Last 20 Lines

This is an archive of the discontinued LLVM Phabricator instance.

[globalisel][legalizer] Attempt to write down the minimal legalization rulesClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 214191

llvm/trunk/docs/GlobalISel.rst

[globalisel][legalizer] Attempt to write down the minimal legalization rules
ClosedPublic