Page MenuHomePhabricator

GlobalISel: Reimplement moreElementsVectorDst
Needs ReviewPublic

Authored by arsenm on Feb 3 2020, 7:40 PM.

Details

Summary

Use pad with undef and unmerge with unused results. This is annoyingly
similar to several other places in LegalizerHelper, but they're all
slightly different.

Diff Detail

Event Timeline

arsenm created this revision.Feb 3 2020, 7:40 PM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 3 2020, 7:40 PM

Maybe I'm missing something obvious here, but why is this preferable to the original implementation?

Maybe I'm missing something obvious here, but why is this preferable to the original implementation?

merge/unmerge with even sized pieces is nicer to deal with than insert/extract with arbitrary, odd sized pieces. The legalization and artifact combiner for them are also way better developed than the handling for extract and inserts. We currently have multiple different strategies of the more elements/fewer elements. Some of them only handle even breakdowns some awkwardly handle odd cases, and some do the general widening technique like this. I'm trying to consolidate on the one strategy, since it's the most flexible and least likely to break (although all the extra padding instructions and registers seems a bit wasteful in the extreme cases)

Maybe I'm missing something obvious here, but why is this preferable to the original implementation?

merge/unmerge with even sized pieces is nicer to deal with than insert/extract with arbitrary, odd sized pieces. The legalization and artifact combiner for them are also way better developed than the handling for extract and inserts. We currently have multiple different strategies of the more elements/fewer elements. Some of them only handle even breakdowns some awkwardly handle odd cases, and some do the general widening technique like this. I'm trying to consolidate on the one strategy, since it's the most flexible and least likely to break (although all the extra padding instructions and registers seems a bit wasteful in the extreme cases)

Also when I posted the original legalizations for most of these, I got comments preferring this way

arsenm updated this revision to Diff 242768.Feb 5 2020, 3:30 PM

Rebase tests

This looks like a good candidate for new LegalizeAction ( MoreElementsLCM or something more appropriate ) that could be preferable more elements option for targets that support G_CONCAT_VECTORS and G_UNMERGE_VALUES for large vector sizes (large enough to hold LCM(WideTy, OrigTy) ).
Current algorithm seems to be designed for targets that prefer not to deal with vector types larger then WideTy. It would be difficult for them to deal with G_CONCAT_VECTORS and G_UNMERGE_VALUES that are not legal and likely hard to combine since they have different small LLTs.

arsenm added a comment.Feb 6 2020, 8:15 AM

This looks like a good candidate for new LegalizeAction ( MoreElementsLCM or something more appropriate ) that could be preferable more elements option for targets that support G_CONCAT_VECTORS and G_UNMERGE_VALUES for large vector sizes (large enough to hold LCM(WideTy, OrigTy) ).

I think we should limit the number of legalize actions. Have two separate actions that accomplish the same thing, but one sometimes works or not is a confusing state to be in.

Current algorithm seems to be designed for targets that prefer not to deal with vector types larger then WideTy. It would be difficult for them to deal with G_CONCAT_VECTORS and G_UNMERGE_VALUES that are not legal and likely hard to combine since they have different small LLTs.

I'm not sure what you mean. Do you have an example? The artifact combiner already handles unmerging from a concat_vector with a mixed number of elements

If not a new LegalizeAction maybe some other way ( like one more argument ) for targets to chose whether to create G_EXTRACT or CONCAT_VECTORS + G_UNMERGE_VALUES ?
I am not that familiar with the problem we are dealing here, but shouldn't insert + extract that are created during moreElements combine away like anyext + trunc for widen scalar ?
So this would be extract + insert into implicit def -> copy extract source if it had same type like implicit def (this is 3 instruction combine). Might that fix some of the problems?

llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp
1447

nit: there should be one more dead def here

// %0:_(<3 x s32>), %4:_(<3 x s32>), %5:_(<3 x s32>), %6:_(<3 x s32>) = G_UNMERGE_VALUES %3
llvm/test/CodeGen/AMDGPU/GlobalISel/legalize-and.mir
360–361

The artifact combiner already handles unmerging from a concat_vector with a mixed number of elements

In some nice cases yes, here however

if (NumDefs % NumMergeRegs != 0)
  return false;

4 % 3 != 0

Anyway this is probably equivalent to <3xs16> G_EXTRACT from first <4xs16> in G_CONCAT_VECTORS since unmerge created 3 more dead defs. That is what we had before this patch. This works because combiner didn't combine artifacts and they are legal.
If combiner was able to combine G_UNMERGE_VALUES + CONCAT_VECTORS it would ruin work done by moreElements. There was some discussion in some patch review that targets should also be able to chose which combines to use like they chose how to legalize. That might be useful here.