This is an archive of the discontinued LLVM Phabricator instance.

[GISel]: Allow G_EXTRACT_VEC_ELT's result to be larger the source element type
Needs ReviewPublic

Authored by aditya_nandakumar on Feb 19 2019, 5:17 PM.

Details

Summary

It's quite inconvenient when dealing with G_EXTRACT_VEC_ELT of <4xs8>,0 where s8 is not legal in the target - typically one ends up with G_UNMERGE_VALUES + G_LSHR by 8 which expands to three instructions.
Instead allow the destination type to be larger than source scalar type and it's higher bits are any extended. This is similar to DAG's opcode.
Note - I couldn't really find a use in tree - so I added a unit test.

Diff Detail

Repository
rL LLVM

Event Timeline

Herald added a project: Restricted Project. · View Herald TranscriptFeb 19 2019, 5:18 PM
Herald added a subscriber: wdng. · View Herald Transcript

I always thought this behavior in SelectionDAG was error prone. I think I broke this on every combine I ever wrote for extract_vector_elt. Should there be a separate extending opcode instead? Although that would probably make some combines more painful. In GlobalISel we already have an explicit G_BUILD_VECTOR_TRUNC instead of the old ISD::BUILD_VECTOR truncate behavior, so that would match. However I don't think G_BUILD_VECTOR_TRUNC is well developed enough to declare that was definitely a good decision yet.

lib/CodeGen/MachineVerifier.cpp
1276

braces

1281

braces

I always thought this behavior in SelectionDAG was error prone.

Personally in an out of tree backend, I find BUILD_VEC_TRUNC to be very elegant when trying to legalize/optimize vector types whose scalar types are illegal in the architecture otherwise needing several unmerges + shifts + ands.
I'm fine with adding a separate opcode (say G_EXTRACT_VECTOR_ELT_ANYEXT - suggestions accepted).
We need to do something similar for INSERT_VEC_ELT which can implicitly truncate.

I always thought this behavior in SelectionDAG was error prone.

Personally in an out of tree backend, I find BUILD_VEC_TRUNC to be very elegant when trying to legalize/optimize vector types whose scalar types are illegal in the architecture otherwise needing several unmerges + shifts + ands.
I'm fine with adding a separate opcode (say G_EXTRACT_VECTOR_ELT_ANYEXT - suggestions accepted).
We need to do something similar for INSERT_VEC_ELT which can implicitly truncate.

One question I've been thinking about is whether implicit extension is needed for G_INSERT/G_EXTRACT. I figured extract_vector_elt/insert_vector_elt with a constant could always be implemented as a G_EXTRACT/G_INSERT.

I always thought this behavior in SelectionDAG was error prone.

Personally in an out of tree backend, I find BUILD_VEC_TRUNC to be very elegant when trying to legalize/optimize vector types whose scalar types are illegal in the architecture otherwise needing several unmerges + shifts + ands.
I'm fine with adding a separate opcode (say G_EXTRACT_VECTOR_ELT_ANYEXT - suggestions accepted).
We need to do something similar for INSERT_VEC_ELT which can implicitly truncate.

One question I've been thinking about is whether implicit extension is needed for G_INSERT/G_EXTRACT. I figured extract_vector_elt/insert_vector_elt with a constant could always be implemented as a G_EXTRACT/G_INSERT.

G_EXTRACT and G_INSERT are way more powerful instructions IMO and going from EXTRACT_VEC_ELT to EXTRACT seems to adding complexity (where as I'd prefer the other way if possible as sort of a canonicalization). I'm curious to hear others thoughts on this as well.

Missing braces.

I think this should be symmetric with G_BUILD_VECTOR_TRUNC, so either a separate opcode or re-merge BUILD_VECTOR_TRUNC with BUILD_VECTOR

I think this should be symmetric with G_BUILD_VECTOR_TRUNC, so either a separate opcode or re-merge BUILD_VECTOR_TRUNC with BUILD_VECTOR

Seems reasonable. Let's go ahead with this.