This is an archive of the discontinued LLVM Phabricator instance.

[GlobalISel] Introduce G_BUILD_VECTOR and G_CONCAT_VECTOR opcodes
ClosedPublic

Authored by aemerson on Oct 23 2018, 11:27 AM.

Download Raw Diff

Details

Reviewers

qcolombet
dsanders
ab
aditya_nandakumar
volkan
bogner
tstellar
aivchenk
Petar.Avramovic
• jpaquette
rtereshin
paquette

Commits

rGa0b15d8f3e61: [GlobalISel] Introduce G_BUILD_VECTOR, G_BUILD_VECTOR_TRUNC and G_CONCAT_VECTOR…
rL348430: [GlobalISel] Introduce G_BUILD_VECTOR, G_BUILD_VECTOR_TRUNC and G_CONCAT_VECTOR…

Summary

Introduce G_BUILD_VECTOR and G_CONCAT_VECTOR opcodes.

These opcodes are intended to subsume some of the capability of G_MERGE_VALUES, as it was too powerful and thus complex to add deal with throughout the GISel pipeline.

G_BUILD_VECTOR creates a vector value from a sequence of uniformly typed scalar values.

G_CONCAT_VECTOR creates a vector by concatenating smaller, uniformly typed, vectors together.

These will be used in a subsequent commit. This commit just adds the initial infrastructure.

Diff Detail

Repository: rL LLVM

Event Timeline

aemerson created this revision.Oct 23 2018, 11:27 AM

Herald added a reviewer: paquette. · View Herald TranscriptOct 23 2018, 11:27 AM

Herald added subscribers: kristina, javed.absar, kristof.beyls, rovka. · View Herald Transcript

aemerson added a parent revision: D53592: [GlobalISel] Use the target preferred type for G_EXTRACT_VECTOR_ELT index.Oct 23 2018, 11:28 AM

aemerson added a child revision: D53629: [GlobalISel] Restrict G_MERGE_VALUES capability and replace with new opcodes.Oct 23 2018, 6:10 PM

Are two different opcodes for this really needed? Why not a single opcode with uniformly typed operands, either scalar or vector, which concatenates the operands as an appropriately sized vector? I know there's precedent for the separation elsewhere in LLVM, but it has always seemed redundant to me personally. The code for handling the BUILD vs. CONCAT cases tends to be very similar.

In D53594#1273910, @nhaehnle wrote:

Are two different opcodes for this really needed? Why not a single opcode with uniformly typed operands, either scalar or vector, which concatenates the operands as an appropriately sized vector? I know there's precedent for the separation elsewhere in LLVM, but it has always seemed redundant to me personally. The code for handling the BUILD vs. CONCAT cases tends to be very similar.

My view here is that legalization, combine optimizations and selection for the operations are almost always different. I would disagree that BUILD is a similar operation, there are optimization opportunities available for BUILD that don't apply for CONCAT_VECTORS, and vice versa, especially target specific ones at selection time. So with that, I think it's better that we have separate opcodes. G_MERGE is simply too flexible, and that makes legalization rules embed a lot of complexity. (A side benefit of this is that we may be able to re-use SelectionDAG patterns automatically, but that's not the motivating reason for this change);

+ @huntergr On a tangential note, G_BUILD_VECTOR for expressing broadcasts of scalars won't work for SVE and other variable vector width targets. We do however still leave the door open here for a new G_BROADCAST opcode in future.

In D53594#1273910, @nhaehnle wrote:

Are two different opcodes for this really needed? Why not a single opcode with uniformly typed operands, either scalar or vector, which concatenates the operands as an appropriately sized vector? I know there's precedent for the separation elsewhere in LLVM, but it has always seemed redundant to me personally. The code for handling the BUILD vs. CONCAT cases tends to be very similar.

It doesn't make much difference if we don't add both, we would need the check the types to figure out if the inputs are scalars or vectors. Having separate opcodes is going to simplify the legalization strategies and make it easier to match patterns when writing combines.

Updated patch to allow G_BUILD_VECTOR to have scalar operands not identical to the result vector element types. This allows implicit truncation.

This allows implicit truncation.

That sounds dangerous to me.

In D53594#1280672, @qcolombet wrote:

This allows implicit truncation.

That sounds dangerous to me.

This lets targets with say 4 x s8 legal but not s8 on it's own. Instead of writing
a(<4 x s32>) = G_BUILD_VECTOR p(s32), q, r, s
b(<4 x s8>) = G_TRUNC a

We can write this in one instruction (similar to the DAG)
b(<4 x s8>) = G_BUILD_VECTOR p, q, r, s

Without implicit truncations, it's necessary to make G_BUILD_VECTOR of <4 x s32> also legal just to make G_BUILD_VECTOR < 4 x s8> legal (though probably most targets will have 4 x s32 already legal).
Implicit truncation feels like a nice to have - I'm not strongly in favor of it. If there are good reasons to not have it, we should forbid it.

Thanks @aditya_nandakumar for the context.

I understand where you're coming from.

What I meant is that I would rather have one opcode of build_vector and one for build_vector with trunc. The rationale is that it makes it clear what you want to achieve and it makes for a stronger type verifier.

In D53594#1280811, @qcolombet wrote:

Thanks @aditya_nandakumar for the context.

I understand where you're coming from.

What I meant is that I would rather have one opcode of build_vector and one for build_vector with trunc. The rationale is that it makes it clear what you want to achieve and it makes for a stronger type verifier.

+1 for having explicit opcodes for truncation.

In D53594#1280811, @qcolombet wrote:

Thanks @aditya_nandakumar for the context.

I understand where you're coming from.

What I meant is that I would rather have one opcode of build_vector and one for build_vector with trunc. The rationale is that it makes it clear what you want to achieve and it makes for a stronger type verifier.

I'm not sure how this is any better. If we had an opcode G_BUILD_VECTOR_TRUNC, it would have a superset of the functionality of the G_BUILD_VECTOR. You'd still have this powerful opcode, except now you have potential code duplication in handling the two, as well as a hindered G_BUILD_VECTOR.

Unless you're say that G_BUILD_VECTOR _TRUNC can *only* deal with oversize scalars, in which case it's an odd opcode for a corner case (and I would suggest here that we just deal with the inconvenient types, but I'm not married to the idea).

In D53594#1281508, @aemerson wrote:

In D53594#1280811, @qcolombet wrote:

Thanks @aditya_nandakumar for the context.

I understand where you're coming from.

What I meant is that I would rather have one opcode of build_vector and one for build_vector with trunc. The rationale is that it makes it clear what you want to achieve and it makes for a stronger type verifier.

I'm not sure how this is any better. If we had an opcode G_BUILD_VECTOR_TRUNC, it would have a superset of the functionality of the G_BUILD_VECTOR. You'd still have this powerful opcode, except now you have potential code duplication in handling the two, as well as a hindered G_BUILD_VECTOR.

Unless you're say that G_BUILD_VECTOR _TRUNC can *only* deal with oversize scalars, in which case it's an odd opcode for a corner case (and I would suggest here that we just deal with the inconvenient types, but I'm not married to the idea).

I was under the impression that the G_BUILD_VECTOR_TRUNC would only deal with the special case of oversize scalars (and not the regular BUILD_VECTOR). You're right that we're just pushing complexity from checking types in G_BUILD_VECTOR to checking opcodes for truncating behavior. However because the trunc version probably (a big assumption) shows up a fraction of the time, it might be easier to write legalization/combines for G_BUILD_VECTOR without checking for corner cases in most places, and only deal with the corner case explicitly. That said, if there are patterns using G_BUILD_VECTOR from the DAG that rely on truncating behavior, importing it might be difficult.

I'm not sure how this is any better. If we had an opcode G_BUILD_VECTOR_TRUNC, it would have a superset of the functionality of the G_BUILD_VECTOR. You'd still have this powerful opcode, except now you have potential code duplication in handling the two, as well as a hindered G_BUILD_VECTOR.

Unless you're say that G_BUILD_VECTOR _TRUNC can *only* deal with oversize scalars, in which case it's an odd opcode for a corner case (and I would suggest here that we just deal with the inconvenient types, but I'm not married to the idea).

I was suggesting the later.

Generally speaking, I am not a fan of implicit stuff, thus the proposal (and that's easy to have a isBuildVectorLike or something). That said, I'd like to go back as why we need to do that.

So the problem was that we wanted to represent:
a(<4 x s32>) = G_BUILD_VECTOR p(s32), q, r, s
b(<4 x s8>) = G_TRUNC a

into
b(< 4 x s8>) = G_BUILD_VECTOR p(s32), q, r, s

Why can't we keep the "verbose" representation?
I would expect it would be easy to handle that in ISel.

Also, just throwing that here, I haven't really thought about it: should we rethink our legality model to check patterns instead of one instruction at a time?
I.e., maybe

a(s8) = ...

is not legal but

a(s8) = ...
b(<4 x s8>) = build_vector a(s8)

is.

This actually ties back to one question that I don't think we answered. Do we legalize on the type (i.e., s8 to s32) or the size and type can be different (i.e., s8(8b) to s8(32b))?
It seems to me we are trying to write <4 x s8> = s8(32b), ...., but maybe we want <4 x s8> = s32, ....

Anyhow, to be concrete, I am fine with the implicit truncate, we know this is not too bad based on SDISel experience, I want to make sure we consider all our options here.

Cheers,
-Quentin

In D53594#1283016, @qcolombet wrote:

I'm not sure how this is any better. If we had an opcode G_BUILD_VECTOR_TRUNC, it would have a superset of the functionality of the G_BUILD_VECTOR. You'd still have this powerful opcode, except now you have potential code duplication in handling the two, as well as a hindered G_BUILD_VECTOR.

Unless you're say that G_BUILD_VECTOR _TRUNC can *only* deal with oversize scalars, in which case it's an odd opcode for a corner case (and I would suggest here that we just deal with the inconvenient types, but I'm not married to the idea).

I was suggesting the later.

Generally speaking, I am not a fan of implicit stuff, thus the proposal (and that's easy to have a isBuildVectorLike or something). That said, I'd like to go back as why we need to do that.

So the problem was that we wanted to represent:
a(<4 x s32>) = G_BUILD_VECTOR p(s32), q, r, s
b(<4 x s8>) = G_TRUNC a

into
b(< 4 x s8>) = G_BUILD_VECTOR p(s32), q, r, s

Why can't we keep the "verbose" representation?
I would expect it would be easy to handle that in ISel.

As long as <4 x s32> = the G_BUILD_VECTOR could be legalized I think the worst that will happen is that we just get poor codegen.

Also, just throwing that here, I haven't really thought about it: should we rethink our legality model to check patterns instead of one instruction at a time?
I.e., maybe
a(s8) = ...
is not legal but
a(s8) = ...
b(<4 x s8>) = build_vector a(s8)
is.

I think the issue of sequences becoming illegal because of things like copies being introduced is still a deal breaker for that, and it would also introduce a code motion barrier since we can't select sequences spanning blocks. E.g. it wouldn't be safe to move the insn in the predecessor due to other uses.

This actually ties back to one question that I don't think we answered. Do we legalize on the type (i.e., s8 to s32) or the size and type can be different (i.e., s8(8b) to s8(32b))?
It seems to me we are trying to write <4 x s8> = s8(32b), ...., but maybe we want <4 x s8> = s32, ....

I've had a think about what having separate type and container sizes would actually mean. For brevity, let’s call values which have distinct type and container sizes “partial values”. A partial value s8:s32 means that the upper 24 bits of this 32 bit register is always undef. For some operations, it should always be possible to ignore the type size and just use the container size for selection. E.g. non-flag setting G_ADD could select a 32 bit add and satisfy the semantics of an s8 add. But for other operations like shifts, or count-leading-zeros, we can't use a wider operation type, so the entire chain of partial value operations would be restricted to selecting the narrower type, rather than the container. And at that point, the container type may as well not be there. There may be a way around this issue I haven't seen though.

Anyhow, to be concrete, I am fine with the implicit truncate, we know this is not too bad based on SDISel experience, I want to make sure we consider all our options here.

Sure. Given that we already have this implicit truncating behavior on account of G_MERGE being an omnipotent operation, should we allow the implicit truncation with G_BUILD_VECTOR for now? If we have problems with it later we can revisit the idea of adding a separate opcode/something else.

Cheers,
-Quentin

I think the issue of sequences becoming illegal because of things like copies being introduced is still a deal breaker for that, and it would also introduce a code motion barrier since we can't select sequences spanning blocks. E.g. it wouldn't be safe to move the insn in the predecessor due to other uses.

Interesting, I thought we would be able to recover from this kind of situation by 1) calling the legalizer helper in the pass that introduces the illegal code (plus looking through copy is must have IMHO), 2) looking through basic blocks.
Do we actually check the basic blocks when doing combines? I thought we get these "combines" for free in GISel.

so the entire chain of partial value operations would be restricted to selecting the narrower type, rather than the container

At this point, I would have expected the value to get z|sext appropriately for consumption. Basically the idea being we pay for legalitizing to a wider type only if we need to. But anyway, completely out of scope here :).

Sure. Given that we already have this implicit truncating behavior on account of G_MERGE being an omnipotent operation, should we allow the implicit truncation with G_BUILD_VECTOR for now? If we have problems with it later we can revisit the idea of adding a separate opcode/something else.

Agree!

In D53594#1283016, @qcolombet wrote:

I'm not sure how this is any better. If we had an opcode G_BUILD_VECTOR_TRUNC, it would have a superset of the functionality of the G_BUILD_VECTOR. You'd still have this powerful opcode, except now you have potential code duplication in handling the two, as well as a hindered G_BUILD_VECTOR.

Unless you're say that G_BUILD_VECTOR _TRUNC can *only* deal with oversize scalars, in which case it's an odd opcode for a corner case (and I would suggest here that we just deal with the inconvenient types, but I'm not married to the idea).

I was suggesting the later.

Generally speaking, I am not a fan of implicit stuff, thus the proposal (and that's easy to have a isBuildVectorLike or something). That said, I'd like to go back as why we need to do that.

So the problem was that we wanted to represent:
a(<4 x s32>) = G_BUILD_VECTOR p(s32), q, r, s
b(<4 x s8>) = G_TRUNC a

into
b(< 4 x s8>) = G_BUILD_VECTOR p(s32), q, r, s

Why can't we keep the "verbose" representation?

Hi Quentin,

The verbose representation may not be legal for the given target. Let's take Mips MSA as a concrete example. MSA has <16 x s8> vectors and <4 x s32> vectors (among others but they're not relevant) but doesn't have <16 x s32> (its 512-bits and the registers are 128-bit) or <4 x s8> (although we can potentially emulate it with <16 x s8> but let's stick to the real types the ISA can handle for now). For an example similar to yours, the verbose representation might be:

%0(<16 x s32>) = G_BUILD_VECTOR %a, %b, %c, %d, %e, %f, %g, %h, %i, %j, %k, %l, %m, %n, %o, %p
%z(<16 x s8>) = G_TRUNC %0

but <16 x s8> G_TRUNC <16 x s32> isn't legal so we start by expanding that:

%0(<16 x s32>) = G_BUILD_VECTOR %a, %b, %c, %d, %e, %f, %g, %h, %i, %j, %k, %l, %m, %n, %o, %p
%r0(<4 x s32>), %y1, %y2, %y3 = G_UNMERGE_VALUES %0
%z0(<4 x s8>) = G_TRUNC %r0
%z1(<4 x s8>) = G_TRUNC %r1
%z2(<4 x s8>) = G_TRUNC %r2
%z3(<4 x s8>) = G_TRUNC %r3
%z(16 x s8) = G_MERGE_VALUES %z0, %z1, %z2, %z3

the new <4 x s8> G_TRUNC <4 x s32> isn't legal either so we expand again to:

%0(<16 x s32>) = G_BUILD_VECTOR %a, %b, %c, %d, %e, %f, %g, %h, %i, %j, %k, %l, %m, %n, %o, %p
%r0(<4 x s32>), %r1, %r2, %r3 = G_UNMERGE_VALUES %0
%t0(s32), %t1, %t2, %t3 = G_UNMERGE_VALUES %r0
%t4(s32), %t5, %t6, %t7 = G_UNMERGE_VALUES %r1
%t8(s32), %t9, %t10, %t11 = G_UNMERGE_VALUES %r2
%t12(s32), %t13, %t14, %t15 = G_UNMERGE_VALUES %r3
%y0(s8) = G_TRUNC %t0
%y1(s8) = G_TRUNC %t1
%y2(s8) = G_TRUNC %t2
%y3(s8) = G_TRUNC %t3
%y4(s8) = G_TRUNC %t4
%y5(s8) = G_TRUNC %t5
%y6(s8) = G_TRUNC %t6
%y7(s8) = G_TRUNC %t7
%y8(s8) = G_TRUNC %t8
%y9(s8) = G_TRUNC %t9
%y10(s8) = G_TRUNC %t10
%y11(s8) = G_TRUNC %t11
%y12(s8) = G_TRUNC %t12
%y13(s8) = G_TRUNC %t13
%y14(s8) = G_TRUNC %t14
%y15(s8) = G_TRUNC %t15
%z(<16 x s8>) = G_BUILD_VECTOR %y0, %y1, %y2, %y3, %y4, %y5, %y6, %y7, %y8, %y9, %y10, %y11, %y12, %y13, %y14, %y15

Next, <16 x 32> G_BUILD_VECTOR isn't legal so we expand to:

%v0(<4 x s32>) = G_BUILD_VECTOR %a, %b, %c, %d
%v1(<4 x s32>) = G_BUILD_VECTOR %e, %f, %g, %h
%v2(<4 x s32>) = G_BUILD_VECTOR %i, %j, %k, %l
%v3(<4 x s32>) = G_BUILD_VECTOR %m, %n, %o, %p
%u(<16 x s32>) = G_MERGE_VALUES %v0, %v1, %v2, %v3
%r0(<4 x s32>), %r1, %r2, %r3 = G_UNMERGE_VALUES %u
%t0(s32), %t1, %t2, %t3 = G_UNMERGE_VALUES %r0
%t4(s32), %t5, %t6, %t7 = G_UNMERGE_VALUES %r0
%t8(s32), %t9, %t10, %t11 = G_UNMERGE_VALUES %r0
%t12(s32), %t13, %t14, %t15 = G_UNMERGE_VALUES %r0
%y0(s8) = G_TRUNC %t0
%y1(s8) = G_TRUNC %t1
%y2(s8) = G_TRUNC %t2
%y3(s8) = G_TRUNC %t3
%y4(s8) = G_TRUNC %t4
%y5(s8) = G_TRUNC %t5
%y6(s8) = G_TRUNC %t6
%y7(s8) = G_TRUNC %t7
%y8(s8) = G_TRUNC %t8
%y9(s8) = G_TRUNC %t9
%y10(s8) = G_TRUNC %t10
%y11(s8) = G_TRUNC %t11
%y12(s8) = G_TRUNC %t12
%y13(s8) = G_TRUNC %t13
%y14(s8) = G_TRUNC %t14
%y15(s8) = G_TRUNC %t15
%z(<16 x s8>) = G_BUILD_VECTOR %y0, %y1, %y2, %y3, %y4, %y5, %y6, %y7, %y8, %y9, %y10, %y11, %y12, %y13, %y14, %y15

and finally, we clean up the artefacts:

%v0(<4 x s32>) = G_BUILD_VECTOR %a, %b, %c, %d
%v1(<4 x s32>) = G_BUILD_VECTOR %e, %f, %g, %h
%v2(<4 x s32>) = G_BUILD_VECTOR %i, %j, %k, %l
%v3(<4 x s32>) = G_BUILD_VECTOR %m, %n, %o, %p
%t0(s32), %t1, %t2, %t3 = G_UNMERGE_VALUES %v0
%t4(s32), %t5, %t6, %t7 = G_UNMERGE_VALUES %v1
%t8(s32), %t9, %t10, %t11 = G_UNMERGE_VALUES %v2
%t12(s32), %t13, %t14, %t15 = G_UNMERGE_VALUES %v3
%y0(s8) = G_TRUNC %t0
%y1(s8) = G_TRUNC %t1
%y2(s8) = G_TRUNC %t2
%y3(s8) = G_TRUNC %t3
%y4(s8) = G_TRUNC %t4
%y5(s8) = G_TRUNC %t5
%y6(s8) = G_TRUNC %t6
%y7(s8) = G_TRUNC %t7
%y8(s8) = G_TRUNC %t8
%y9(s8) = G_TRUNC %t9
%y10(s8) = G_TRUNC %t10
%y11(s8) = G_TRUNC %t11
%y12(s8) = G_TRUNC %t12
%y13(s8) = G_TRUNC %t13
%y14(s8) = G_TRUNC %t14
%y15(s8) = G_TRUNC %t15
%z(<16 x s8>) = G_BUILD_VECTOR %y0, %y1, %y2, %y3, %y4, %y5, %y6, %y7, %y8, %y9, %y10, %y11, %y12, %y13, %y14, %y15

Meanwhile, the compact representation would be:

%z(<16 x s8>) = G_BUILD_VECTOR %a(s32), %b, %c, %d, %e, %f, %g, %h, %i, %j, %k, %l, %m, %n, %o, %p

and is legal from the start.

I would expect it would be easy to handle that in ISel.

Also, just throwing that here, I haven't really thought about it: should we rethink our legality model to check patterns instead of one instruction at a time?
I.e., maybe
a(s8) = ...
is not legal but
a(s8) = ...
b(<4 x s8>) = build_vector a(s8)
is.

The main problem with this is optimization. If anything breaks that pattern apart, then it ceases to be legal. That was the reason we moved away from G_SEXT+G_LOAD and to G_SEXTLOAD. We had problems with optimizations on G_SEXT making related G_LOAD's illegal.

Another issue there is that you can't look at an instruction and decide if it's legal. You may need to walk the uses looking for a pattern that covers it.

This actually ties back to one question that I don't think we answered. Do we legalize on the type (i.e., s8 to s32) or the size and type can be different (i.e., s8(8b) to s8(32b))?
It seems to me we are trying to write <4 x s8> = s8(32b), ...., but maybe we want <4 x s8> = s32, ....

Those look the same to me, it's just that one is implicitly converting s32 to s8.

Anyhow, to be concrete, I am fine with the implicit truncate, we know this is not too bad based on SDISel experience, I want to make sure we consider all our options here.

Cheers,
-Quentin

In D53594#1280815, @aditya_nandakumar wrote:

In D53594#1280811, @qcolombet wrote:

Thanks @aditya_nandakumar for the context.

I understand where you're coming from.

What I meant is that I would rather have one opcode of build_vector and one for build_vector with trunc. The rationale is that it makes it clear what you want to achieve and it makes for a stronger type verifier.

+1 for having explicit opcodes for truncation.

Either way we need to check if there is a truncate or not, having separate opcodes makes it easier to write combines/legality checks compared to dealing with types.

Updated patch to introduce a separate G_BUILD_VECTOR_TRUNC opcode.

aditya_nandakumar added inline comments.Nov 5 2018, 9:46 AM

lib/CodeGen/GlobalISel/MachineIRBuilder.cpp
554 ↗	(On Diff #172608)	What's your take on just passing this to buildBuildVector vs asserting here?

aemerson added inline comments.Nov 5 2018, 9:56 AM

lib/CodeGen/GlobalISel/MachineIRBuilder.cpp
554 ↗	(On Diff #172608)	I chose to redirect to buildBuildVector because I assumed that when you're using MachineIRBuilder you're interested in the overall semantics rather than with specific opcode is generated.

aditya_nandakumar added inline comments.Nov 5 2018, 9:57 AM

lib/CodeGen/GlobalISel/MachineIRBuilder.cpp
554 ↗	(On Diff #172608)	Fair enough. Please update the documentation for the Opcode as well to reflect this.

Hi Amara, can you please add test case for the machine verifier?

Added comment for G_BUILD_VECTOR_TRUNC and tests for the verifier. I haven't added all the different combinations because we can only write one test per file, so I tested the most useful of the verifier errors.

This looks good to me assuming we go with the extra opcode..

This revision was not accepted when it landed; it landed in state Needs Review.Dec 5 2018, 3:57 PM

Closed by commit rL348430: [GlobalISel] Introduce G_BUILD_VECTOR, G_BUILD_VECTOR_TRUNC and G_CONCAT_VECTOR… (authored by aemerson). · Explain Why

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path

Size

llvm/

trunk/

include/

llvm/

CodeGen/

GlobalISel/

MachineIRBuilder.h

40 lines

Support/

TargetOpcodes.def

11 lines

Target/

GenericOpcodes.td

22 lines

lib/

CodeGen/

GlobalISel/

MachineIRBuilder.cpp

58 lines

MachineVerifier.cpp

57 lines

test/

CodeGen/

AArch64/

GlobalISel/

legalizer-info-validation.mir

9 lines

Verifier/

gisel-g_build_vector.mir

27 lines

gisel-g_build_vector_trunc.mir

27 lines

gisel-g_concat_vector.mir

29 lines

Diff 176890

llvm/trunk/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h

Show First 20 Lines • Show All 597 Lines • ▼ Show 20 Lines	public:
/// \pre setBasicBlock or setMI must have been called.		/// \pre setBasicBlock or setMI must have been called.
/// \pre The entire register \p Res (and no more) must be covered by the input		/// \pre The entire register \p Res (and no more) must be covered by the input
/// registers.		/// registers.
/// \pre The type of all \p Res registers must be identical.		/// \pre The type of all \p Res registers must be identical.
///		///
/// \return a MachineInstrBuilder for the newly created instruction.		/// \return a MachineInstrBuilder for the newly created instruction.
MachineInstrBuilder buildUnmerge(ArrayRef<unsigned> Res, unsigned Op);		MachineInstrBuilder buildUnmerge(ArrayRef<unsigned> Res, unsigned Op);

		/// Build and insert \p Res = G_BUILD_VECTOR \p Op0, ...
		///
		/// G_BUILD_VECTOR creates a vector value from multiple scalar registers.
		/// \pre setBasicBlock or setMI must have been called.
		/// \pre The entire register \p Res (and no more) must be covered by the
		/// input scalar registers.
		/// \pre The type of all \p Ops registers must be identical.
		///
		/// \return a MachineInstrBuilder for the newly created instruction.
		MachineInstrBuilder buildBuildVector(unsigned Res, ArrayRef<unsigned> Ops);

		/// Build and insert \p Res = G_BUILD_VECTOR_TRUNC \p Op0, ...
		///
		/// G_BUILD_VECTOR_TRUNC creates a vector value from multiple scalar registers
		/// which have types larger than the destination vector element type, and
		/// truncates the values to fit.
		///
		/// If the operands given are already the same size as the vector elt type,
		/// then this method will instead create a G_BUILD_VECTOR instruction.
		///
		/// \pre setBasicBlock or setMI must have been called.
		/// \pre The type of all \p Ops registers must be identical.
		///
		/// \return a MachineInstrBuilder for the newly created instruction.
		MachineInstrBuilder buildBuildVectorTrunc(unsigned Res,
		ArrayRef<unsigned> Ops);

		/// Build and insert \p Res = G_CONCAT_VECTORS \p Op0, ...
		///
		/// G_CONCAT_VECTORS creates a vector from the concatenation of 2 or more
		/// vectors.
		///
		/// \pre setBasicBlock or setMI must have been called.
		/// \pre The entire register \p Res (and no more) must be covered by the input
		/// registers.
		/// \pre The type of all source operands must be identical.
		///
		/// \return a MachineInstrBuilder for the newly created instruction.
		MachineInstrBuilder buildConcatVectors(unsigned Res, ArrayRef<unsigned> Ops);

MachineInstrBuilder buildInsert(unsigned Res, unsigned Src,		MachineInstrBuilder buildInsert(unsigned Res, unsigned Src,
unsigned Op, unsigned Index);		unsigned Op, unsigned Index);

/// Build and insert either a G_INTRINSIC (if \p HasSideEffects is false) or		/// Build and insert either a G_INTRINSIC (if \p HasSideEffects is false) or
/// G_INTRINSIC_W_SIDE_EFFECTS instruction. Its first operand will be the		/// G_INTRINSIC_W_SIDE_EFFECTS instruction. Its first operand will be the
/// result register definition unless \p Reg is NoReg (== 0). The second		/// result register definition unless \p Reg is NoReg (== 0). The second
/// operand will be the intrinsic's ID.		/// operand will be the intrinsic's ID.
///		///
▲ Show 20 Lines • Show All 491 Lines • Show Last 20 Lines

llvm/trunk/include/llvm/Support/TargetOpcodes.def

	Show First 20 Lines • Show All 252 Lines • ▼ Show 20 Lines
	/// Generic instruction to insert blocks of bits from the registers given into			/// Generic instruction to insert blocks of bits from the registers given into
	/// the source.			/// the source.
	HANDLE_TARGET_OPCODE(G_INSERT)			HANDLE_TARGET_OPCODE(G_INSERT)

	/// Generic instruction to paste a variable number of components together into a			/// Generic instruction to paste a variable number of components together into a
	/// larger register.			/// larger register.
	HANDLE_TARGET_OPCODE(G_MERGE_VALUES)			HANDLE_TARGET_OPCODE(G_MERGE_VALUES)

				/// Generic instruction to create a vector value from a number of scalar
				/// components.
				HANDLE_TARGET_OPCODE(G_BUILD_VECTOR)

				/// Generic instruction to create a vector value from a number of scalar
				/// components, which have types larger than the result vector elt type.
				HANDLE_TARGET_OPCODE(G_BUILD_VECTOR_TRUNC)

				/// Generic instruction to create a vector by concatenating multiple vectors.
				HANDLE_TARGET_OPCODE(G_CONCAT_VECTORS)

	/// Generic pointer to int conversion.			/// Generic pointer to int conversion.
	HANDLE_TARGET_OPCODE(G_PTRTOINT)			HANDLE_TARGET_OPCODE(G_PTRTOINT)

	/// Generic int to pointer conversion.			/// Generic int to pointer conversion.
	HANDLE_TARGET_OPCODE(G_INTTOPTR)			HANDLE_TARGET_OPCODE(G_INTTOPTR)

	/// Generic bitcast. The source and destination types must be different, or a			/// Generic bitcast. The source and destination types must be different, or a
	/// COPY is the relevant instruction.			/// COPY is the relevant instruction.
	▲ Show 20 Lines • Show All 254 Lines • Show Last 20 Lines

llvm/trunk/include/llvm/Target/GenericOpcodes.td

	Show First 20 Lines • Show All 669 Lines • ▼ Show 20 Lines
	// %0:(s32) = G_MERGE_VALUES %bits_0_7:(s8), %bits_8_15:(s8),			// %0:(s32) = G_MERGE_VALUES %bits_0_7:(s8), %bits_8_15:(s8),
	// %bits_16_23:(s8), %bits_24_31:(s8)			// %bits_16_23:(s8), %bits_24_31:(s8)
	def G_MERGE_VALUES : GenericInstruction {			def G_MERGE_VALUES : GenericInstruction {
	let OutOperandList = (outs type0:$dst);			let OutOperandList = (outs type0:$dst);
	let InOperandList = (ins type1:$src0, variable_ops);			let InOperandList = (ins type1:$src0, variable_ops);
	let hasSideEffects = 0;			let hasSideEffects = 0;
	}			}

				/// Create a vector from multiple scalar registers.
				def G_BUILD_VECTOR : GenericInstruction {
				let OutOperandList = (outs type0:$dst);
				let InOperandList = (ins type1:$src0, variable_ops);
				let hasSideEffects = 0;
				}

				/// Like G_BUILD_VECTOR, but truncates the larger operand types to fit the
				/// destination vector elt type.
				def G_BUILD_VECTOR_TRUNC : GenericInstruction {
				let OutOperandList = (outs type0:$dst);
				let InOperandList = (ins type1:$src0, variable_ops);
				let hasSideEffects = 0;
				}

				/// Create a vector by concatenating vectors together.
				def G_CONCAT_VECTORS : GenericInstruction {
				let OutOperandList = (outs type0:$dst);
				let InOperandList = (ins type1:$src0, variable_ops);
				let hasSideEffects = 0;
				}

	// Intrinsic without side effects.			// Intrinsic without side effects.
	def G_INTRINSIC : GenericInstruction {			def G_INTRINSIC : GenericInstruction {
	let OutOperandList = (outs);			let OutOperandList = (outs);
	let InOperandList = (ins unknown:$intrin, variable_ops);			let InOperandList = (ins unknown:$intrin, variable_ops);
	let hasSideEffects = 0;			let hasSideEffects = 0;
	}			}

	// Intrinsic with side effects.			// Intrinsic with side effects.
	▲ Show 20 Lines • Show All 66 Lines • Show Last 20 Lines

llvm/trunk/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp

Show First 20 Lines • Show All 513 Lines • ▼ Show 20 Lines	#endif

MachineInstrBuilder MIB = buildInstr(TargetOpcode::G_UNMERGE_VALUES);		MachineInstrBuilder MIB = buildInstr(TargetOpcode::G_UNMERGE_VALUES);
for (unsigned i = 0; i < Res.size(); ++i)		for (unsigned i = 0; i < Res.size(); ++i)
MIB.addDef(Res[i]);		MIB.addDef(Res[i]);
MIB.addUse(Op);		MIB.addUse(Op);
return MIB;		return MIB;
}		}

		MachineInstrBuilder
		MachineIRBuilderBase::buildBuildVector(unsigned Res, ArrayRef<unsigned> Ops) {
		#ifndef NDEBUG
		assert((!Ops.empty() \|\| Ops.size() < 2) && "Must have at least 2 operands");
		assert(getMRI()->getType(Res).isVector() && "Res type must be a vector");
		LLT Ty = getMRI()->getType(Ops[0]);
		for (auto Reg : Ops)
		assert(getMRI()->getType(Reg) == Ty && "type mismatch in input list");
		assert(Ops.size() * Ty.getSizeInBits() ==
		getMRI()->getType(Res).getSizeInBits() &&
		"input scalars do not exactly cover the outpur vector register");
		#endif
		MachineInstrBuilder MIB = buildInstr(TargetOpcode::G_BUILD_VECTOR);
		MIB.addDef(Res);
		for (auto Op : Ops)
		MIB.addUse(Op);
		return MIB;
		}

		MachineInstrBuilder
		MachineIRBuilderBase::buildBuildVectorTrunc(unsigned Res,
		ArrayRef<unsigned> Ops) {
		#ifndef NDEBUG
		assert((!Ops.empty() \|\| Ops.size() < 2) && "Must have at least 2 operands");
		LLT Ty = getMRI()->getType(Ops[0]);
		for (auto Reg : Ops)
		assert(getMRI()->getType(Reg) == Ty && "type mismatch in input list");
		#endif
		if (getMRI()->getType(Ops[0]).getSizeInBits() ==
		getMRI()->getType(Res).getElementType().getSizeInBits())
		return buildBuildVector(Res, Ops);
		MachineInstrBuilder MIB = buildInstr(TargetOpcode::G_BUILD_VECTOR_TRUNC);
		MIB.addDef(Res);
		for (auto Op : Ops)
		MIB.addUse(Op);
		return MIB;
		}

		MachineInstrBuilder
		MachineIRBuilderBase::buildConcatVectors(unsigned Res, ArrayRef<unsigned> Ops) {
		#ifndef NDEBUG
		assert((!Ops.empty() \|\| Ops.size() < 2) && "Must have at least 2 operands");
		LLT Ty = getMRI()->getType(Ops[0]);
		for (auto Reg : Ops) {
		assert(getMRI()->getType(Reg).isVector() && "expected vector operand");
		assert(getMRI()->getType(Reg) == Ty && "type mismatch in input list");
		}
		assert(Ops.size() * Ty.getSizeInBits() ==
		getMRI()->getType(Res).getSizeInBits() &&
		"input vectors do not exactly cover the outpur vector register");
		#endif
		MachineInstrBuilder MIB = buildInstr(TargetOpcode::G_CONCAT_VECTORS);
		MIB.addDef(Res);
		for (auto Op : Ops)
		MIB.addUse(Op);
		return MIB;
		}

MachineInstrBuilder MachineIRBuilderBase::buildInsert(unsigned Res,		MachineInstrBuilder MachineIRBuilderBase::buildInsert(unsigned Res,
unsigned Src, unsigned Op,		unsigned Src, unsigned Op,
unsigned Index) {		unsigned Index) {
assert(Index + getMRI()->getType(Op).getSizeInBits() <=		assert(Index + getMRI()->getType(Op).getSizeInBits() <=
getMRI()->getType(Res).getSizeInBits() &&		getMRI()->getType(Res).getSizeInBits() &&
"insertion past the end of a register");		"insertion past the end of a register");

if (getMRI()->getType(Res).getSizeInBits() ==		if (getMRI()->getType(Res).getSizeInBits() ==
▲ Show 20 Lines • Show All 319 Lines • Show Last 20 Lines

llvm/trunk/lib/CodeGen/MachineVerifier.cpp

Show First 20 Lines • Show All 1,049 Lines • ▼ Show 20 Lines	case TargetOpcode::G_FPTRUNC: {
case TargetOpcode::G_FPTRUNC:		case TargetOpcode::G_FPTRUNC:
if (DstSize >= SrcSize)		if (DstSize >= SrcSize)
report("Generic truncate has destination type no smaller than source",		report("Generic truncate has destination type no smaller than source",
MI);		MI);
break;		break;
}		}
break;		break;
}		}
		case TargetOpcode::G_BUILD_VECTOR: {
		// Source types must be scalars, dest type a vector. Total size of scalars
		// must match the dest vector size.
		LLT DstTy = MRI->getType(MI->getOperand(0).getReg());
		LLT SrcEltTy = MRI->getType(MI->getOperand(1).getReg());
		if (!DstTy.isVector() \|\| SrcEltTy.isVector())
		report("G_BUILD_VECTOR must produce a vector from scalar operands", MI);
		for (unsigned i = 2; i < MI->getNumOperands(); ++i) {
		if (MRI->getType(MI->getOperand(1).getReg()) !=
		MRI->getType(MI->getOperand(i).getReg()))
		report("G_BUILD_VECTOR source operand types are not homogeneous", MI);
		}
		if (DstTy.getSizeInBits() !=
		SrcEltTy.getSizeInBits() * (MI->getNumOperands() - 1))
		report("G_BUILD_VECTOR src operands total size don't match dest "
		"size.",
		MI);
		break;
		}
		case TargetOpcode::G_BUILD_VECTOR_TRUNC: {
		// Source types must be scalars, dest type a vector. Scalar types must be
		// larger than the dest vector elt type, as this is a truncating operation.
		LLT DstTy = MRI->getType(MI->getOperand(0).getReg());
		LLT SrcEltTy = MRI->getType(MI->getOperand(1).getReg());
		if (!DstTy.isVector() \|\| SrcEltTy.isVector())
		report("G_BUILD_VECTOR_TRUNC must produce a vector from scalar operands",
		MI);
		for (unsigned i = 2; i < MI->getNumOperands(); ++i) {
		if (MRI->getType(MI->getOperand(1).getReg()) !=
		MRI->getType(MI->getOperand(i).getReg()))
		report("G_BUILD_VECTOR_TRUNC source operand types are not homogeneous",
		MI);
		}
		if (SrcEltTy.getSizeInBits() <= DstTy.getElementType().getSizeInBits())
		report("G_BUILD_VECTOR_TRUNC source operand types are not larger than "
		"dest elt type",
		MI);
		break;
		}
		case TargetOpcode::G_CONCAT_VECTORS: {
		// Source types should be vectors, and total size should match the dest
		// vector size.
		LLT DstTy = MRI->getType(MI->getOperand(0).getReg());
		LLT SrcTy = MRI->getType(MI->getOperand(1).getReg());
		if (!DstTy.isVector() \|\| !SrcTy.isVector())
		report("G_CONCAT_VECTOR requires vector source and destination operands",
		MI);
		for (unsigned i = 2; i < MI->getNumOperands(); ++i) {
		if (MRI->getType(MI->getOperand(1).getReg()) !=
		MRI->getType(MI->getOperand(i).getReg()))
		report("G_CONCAT_VECTOR source operand types are not homogeneous", MI);
		}
		if (DstTy.getNumElements() !=
		SrcTy.getNumElements() * (MI->getNumOperands() - 1))
		report("G_CONCAT_VECTOR num dest and source elements should match", MI);
		break;
		}
case TargetOpcode::COPY: {		case TargetOpcode::COPY: {
if (foundErrors)		if (foundErrors)
break;		break;
const MachineOperand &DstOp = MI->getOperand(0);		const MachineOperand &DstOp = MI->getOperand(0);
const MachineOperand &SrcOp = MI->getOperand(1);		const MachineOperand &SrcOp = MI->getOperand(1);
LLT DstTy = MRI->getType(DstOp.getReg());		LLT DstTy = MRI->getType(DstOp.getReg());
LLT SrcTy = MRI->getType(SrcOp.getReg());		LLT SrcTy = MRI->getType(SrcOp.getReg());
if (SrcTy.isValid() && DstTy.isValid()) {		if (SrcTy.isValid() && DstTy.isValid()) {
▲ Show 20 Lines • Show All 1,295 Lines • Show Last 20 Lines

llvm/trunk/test/CodeGen/AArch64/GlobalISel/legalizer-info-validation.mir

	Show First 20 Lines • Show All 63 Lines • ▼ Show 20 Lines
	# DEBUG: .. type index coverage check SKIPPED: user-defined predicate detected			# DEBUG: .. type index coverage check SKIPPED: user-defined predicate detected
	#			#
	# DEBUG-NEXT: G_INSERT (opcode {{[0-9]+}}): 2 type indices			# DEBUG-NEXT: G_INSERT (opcode {{[0-9]+}}): 2 type indices
	# DEBUG: .. type index coverage check SKIPPED: user-defined predicate detected			# DEBUG: .. type index coverage check SKIPPED: user-defined predicate detected
	#			#
	# DEBUG-NEXT: G_MERGE_VALUES (opcode {{[0-9]+}}): 2 type indices			# DEBUG-NEXT: G_MERGE_VALUES (opcode {{[0-9]+}}): 2 type indices
	# DEBUG: .. type index coverage check SKIPPED: user-defined predicate detected			# DEBUG: .. type index coverage check SKIPPED: user-defined predicate detected
	#			#
				# DEBUG-NEXT: G_BUILD_VECTOR (opcode {{[0-9]+}}): 2 type indices
				# DEBUG: .. type index coverage check SKIPPED: no rules defined
				#
				# DEBUG-NEXT: G_BUILD_VECTOR_TRUNC (opcode {{[0-9]+}}): 2 type indices
				# DEBUG: .. type index coverage check SKIPPED: no rules defined
				#
				# DEBUG-NEXT: G_CONCAT_VECTORS (opcode {{[0-9]+}}): 2 type indices
				# DEBUG: .. type index coverage check SKIPPED: no rules defined
				#
	# DEBUG-NEXT: G_PTRTOINT (opcode {{[0-9]+}}): 2 type indices			# DEBUG-NEXT: G_PTRTOINT (opcode {{[0-9]+}}): 2 type indices
	# DEBUG: .. the first uncovered type index: 2, OK			# DEBUG: .. the first uncovered type index: 2, OK
	#			#
	# DEBUG-NEXT: G_INTTOPTR (opcode {{[0-9]+}}): 2 type indices			# DEBUG-NEXT: G_INTTOPTR (opcode {{[0-9]+}}): 2 type indices
	# DEBUG: .. the first uncovered type index: 2, OK			# DEBUG: .. the first uncovered type index: 2, OK
	#			#
	# DEBUG-NEXT: G_BITCAST (opcode {{[0-9]+}}): 2 type indices			# DEBUG-NEXT: G_BITCAST (opcode {{[0-9]+}}): 2 type indices
	# DEBUG: .. the first uncovered type index: 2, OK			# DEBUG: .. the first uncovered type index: 2, OK
	▲ Show 20 Lines • Show All 248 Lines • Show Last 20 Lines

llvm/trunk/test/Verifier/gisel-g_build_vector.mir

				#RUN: not llc -o - -global-isel -run-pass=none -verify-machineinstrs %s 2>&1 \| FileCheck %s
				# REQUIRES: global-isel, aarch64-registered-target
				--- \|
				target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
				target triple = "aarch64-unknown-unknown"

				define i32 @g_build_vector() {
				ret i32 0
				}

				...
				---
				name: g_build_vector
				legalized: true
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				registers:
				- { id: 0, class: _, preferred-register: '' }
				liveins:
				body: \|
				bb.0:
				; CHECK: Bad machine code: G_BUILD_VECTOR src operands total size don't match dest size

				%0(s32) = IMPLICIT_DEF
				%1:_(<2 x s32>) = G_BUILD_VECTOR %0, %0, %0, %0
				...

llvm/trunk/test/Verifier/gisel-g_build_vector_trunc.mir

				#RUN: not llc -o - -global-isel -run-pass=none -verify-machineinstrs %s 2>&1 \| FileCheck %s
				# REQUIRES: global-isel, aarch64-registered-target
				--- \|
				target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
				target triple = "aarch64-unknown-unknown"

				define i32 @g_build_vector_trunc() {
				ret i32 0
				}

				...
				---
				name: g_build_vector_trunc
				legalized: true
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				registers:
				- { id: 0, class: _, preferred-register: '' }
				liveins:
				body: \|
				bb.0:
				; CHECK: Bad machine code: G_BUILD_VECTOR_TRUNC source operand types are not larger than dest elt type

				%0(s32) = IMPLICIT_DEF
				%1:_(<2 x s32>) = G_BUILD_VECTOR_TRUNC %0, %0
				...

llvm/trunk/test/Verifier/gisel-g_concat_vector.mir

				#RUN: not llc -o - -global-isel -run-pass=none -verify-machineinstrs %s 2>&1 \| FileCheck %s
				# REQUIRES: global-isel, aarch64-registered-target
				--- \|
				target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
				target triple = "aarch64-unknown-unknown"

				define i32 @g_concat_vectors() {
				ret i32 0
				}

				...
				---
				name: g_concat_vectors
				legalized: true
				regBankSelected: false
				selected: false
				tracksRegLiveness: true
				registers:
				- { id: 0, class: _, preferred-register: '' }
				- { id: 1, class: _, preferred-register: '' }
				liveins:
				body: \|
				bb.0:
				; CHECK: Bad machine code: G_CONCAT_VECTOR num dest and source elements should match

				%0(<2 x s32>) = IMPLICIT_DEF
				%1(<2 x s32>) = IMPLICIT_DEF
				%2:_(<2 x s32>) = G_CONCAT_VECTORS %0, %1
				...

This is an archive of the discontinued LLVM Phabricator instance.

[GlobalISel] Introduce G_BUILD_VECTOR and G_CONCAT_VECTOR opcodesClosedPublic

Details

Diff Detail

Event Timeline

Revision Contents

Diff 176890

llvm/trunk/include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h

llvm/trunk/include/llvm/Support/TargetOpcodes.def

llvm/trunk/include/llvm/Target/GenericOpcodes.td

llvm/trunk/lib/CodeGen/GlobalISel/MachineIRBuilder.cpp

llvm/trunk/lib/CodeGen/MachineVerifier.cpp

llvm/trunk/test/CodeGen/AArch64/GlobalISel/legalizer-info-validation.mir

llvm/trunk/test/Verifier/gisel-g_build_vector.mir

llvm/trunk/test/Verifier/gisel-g_build_vector_trunc.mir

llvm/trunk/test/Verifier/gisel-g_concat_vector.mir

[GlobalISel] Introduce G_BUILD_VECTOR and G_CONCAT_VECTOR opcodes
ClosedPublic