This is an archive of the discontinued LLVM Phabricator instance.

GlobalISel: first outline of legalization interface.
AbandonedPublic

Authored by t.p.northover on Jun 20 2016, 1:53 PM.

Details

Reviewers
qcolombet
Summary

This implements a preliminary interface for targets to specify how a given operation should be legalized, with a fewproof of concept legalizations for integer addition.

ISelDAG seems to have overloaded "Expand" to mean quite a few different possible legalization strategies, which I'd like to avoid for GlobalISel. So at the moment, the possible generic options are:

  • Legal (nothing to do)
  • NarrowScalar (e.g. i64 -> i32 twice, <4 x i64> -> <4 x i32>).
  • WidenScalar (e.g. i8 -> i32, <8 x i1> -> <8 x i8>).
  • FewerElements (e.g. <4 x i64> -> <2 x i64> twice).
  • MoreElements (e.g. <2 x i8> -> <8 x i8>).
  • Libcall (self-explanatory).
  • Custom (as with ISelDAG, target will handle it).

I am hoping scalarization can be considered a special case of "FewerElements" where the legal number of elements is 1 since the distinction between "<1 x %type>" and "%type" that exists now is a bit sketchy.

For an operation that is illegal on a vector type, we first legalize the underlying scalar (with a separate "setScalarInVectorAction" interface specifying how because, for example, "G_ADD <8 x i8>" may be fine but "G_ADD i8" only representable by promoting to i64). After that we know that the operation is legal for *some* number of vector elements and the second step is to find that.

Some extension will be needed to handle operations with more than one type (for example "i1 = G_ICMP i8 %a, %b"). In this and some other cases what should happen to the result is independent of the operands. I have some vague ideas on this, but nothing concrete.

One other thought, not on the wider picture but the specific G_ADD case in this patch was: just how many G_ADDs do we need/want? For now I've stuck with ADD/ADDC/ADDE, but I'm increasingly convinced that the last two are redundant (ADDC is just "ADDE a, b, 0") and I'm starting to think that it might be better to just have ADDE at the MIR level (with G_ADD being "res, thing<dead> = G_ADDE a, b, 0"). We could still have convenience queries and MIRBuilder methods to produce simple adds.

Any bright ideas or other comments welcome!

Cheers.

Tim.

Diff Detail

Repository
rL LLVM

Event Timeline

t.p.northover retitled this revision from to GlobalISel: first outline of legalization interface..
t.p.northover updated this object.
t.p.northover added a reviewer: qcolombet.
t.p.northover set the repository for this revision to rL LLVM.
t.p.northover added a subscriber: llvm-commits.
arsenm added a subscriber: arsenm.Jun 20 2016, 2:00 PM
arsenm added inline comments.
include/llvm/CodeGen/GlobalISel/MachineLegalizeInfo.h
25–28

Alphabetize

33–40

These need descriptive comments. Can we also add an unsupported / error on option?

45

Typo ancillary

include/llvm/Target/GenericOpcodes.td
24

Isn't this the default?

30

I think this should be trunc to match the IR instruction name

66

0 indexed src name? Should named operand tables somehow work with the generic opcodes?

75

EXTRACT_SUBREG?

97

REG_SEQUENCE?

All good suggestions (the ones that I understood), Matt. Thanks! I'll
wait around for more comments and fold them into the next iteration.

But I didn't quite get this one:

Should named operand tables somehow work with the generic opcodes?

Do you mean the MCInstrDesc information? Otherwise I'm in the dark, I'm afraid.

Tim.

All good suggestions (the ones that I understood), Matt. Thanks! I'll
wait around for more comments and fold them into the next iteration.

But I didn't quite get this one:

Should named operand tables somehow work with the generic opcodes?

Do you mean the MCInstrDesc information? Otherwise I'm in the dark, I'm afraid.

Tim.

I mean UseNamedOperandTable, to generate enums for accessing the instruction with the TII::getNamedOperandIdx(opcode, Foo::OpName::srcN)

MatzeB added a subscriber: MatzeB.Jun 20 2016, 2:59 PM

ISelDAG seems to have overloaded "Expand" to mean quite a few different possible legalization strategies, which I'd like to avoid for GlobalISel. So at the moment, the possible generic options are:

Legal (nothing to do)
NarrowScalar (e.g. i64 -> i32 twice, <4 x i64> -> <4 x i32>).
WidenScalar (e.g. i8 -> i32, <8 x i1> -> <8 x i8>).
FewerElements (e.g. <4 x i64> -> <2 x i64> twice).
MoreElements (e.g. <2 x i8> -> <8 x i8>).
Libcall (self-explanatory).
Custom (as with ISelDAG, target will handle it).

I always found the SelectionDAG interfaces that basically force you to create a table indexed by type and operation and only give you N pre-selected choices for the table entries. The amount of code we have in our *ISelLowering.cpp files in the custom lowering parts may be an indication that it is...

So how about a radical change here and simply let the targets handle this all in code. This doesn't need to become too tedious in the targets when we provide the right set of convenience functions in generic codegen code. Some pseudocode of what I imagine:

In generic CodeGen/GlobalISel code:

/* LegalizeHelper provides a set of "standard" legalizations in the form of different methods: */
void LegalizeHelper::narrowScalar(MI), LegalizeHelper::libcall(MI) ...
/* LegalizeHelper provides a set of standard query function to categorize operations into different "classes" that are often handled uniformly: */
boolean isIntArithmetic() { return isScalarIntType() && (is_Add || is_Sub || isMul) }, similar things like isVectorArithmetic(), isLoadStore(), ...

In target specific code:

/* No need to inialize a big table with legalization actions in the constructor, just do the case-by-case decisions inside legalize(). The custom legalization code also becomes part of legalize(): */
MyTargetLegalizer::legalize(MI) {
  if (isIntArithmetic(MI)) {
    if (MVTSize < 32)
      widenScalar(MI);
    else if (MVTSize > 32)
      splitScalar(MI, 32);
  }
  // ...
}

While we possible waste some extra cycles in this scheme because of extra if() checks compared to a table lookup. However this scheme:

  • Frees us from arguing semantics of the legalization actions enum; Just see what the function call is doing, if it doesn't suit you, add a new function or add a parameter to the legalization function for variations...
  • No artificial split between custom and "normal" lowering anymore.
  • Adding, renaming, adding parameters to the shared legalization functions keeps us more flexible in the future.

I'm not sure ScalarInVectorActions covers all the possibilities for legalization... for example, x86 prefers to legalize <4 x i16> to <4 x i32> rather than <8 x i16>.

It isn't clear how you plan on dealing with soft-float; for certain operations, like FABS, you probably want a separate "soften" action (so you can convert f64 FABS -> i64 AND -> i32 AND).

You really need to work out what to do with operations with multiple types sooner rather than later, given how common they are. The existing legalization infrastructure basically splits it into multiple steps: first legalize the result type, then legalize each operand type.

For <1 x i64> vs. i64, the distinction is a bit fuzzy... but I'm not sure we can really afford to throw it away. We're currently pretty comfortable with the distinction that <1 x i64> uses vector registers if possible, and i64 doesn't. Doing something different gets very complicated on x86-32 and other similar architectures because we would need to "legalize" i64 operations after register bank assignment. Maybe it's worth doing so we can put values into vector registers more aggressively, but it's a lot more complicated.

You need some legalization action equivalent to "LegalizeDAG", I think. Do you have something planned?

qcolombet edited edge metadata.Jun 21 2016, 11:58 AM

Hi Tim,

Thanks for working on this!

I believe this is work in progress so I did not go into great details to comment on each part of the patch. For instance, for a detailed review, I would have expected the patch to be split into smaller one.

First some general comment:

  • I would split the file with the legalization APIs in two:
    • one for the machine pass
    • one for the helper

The rationale is to be able to include them independently.

  • I would split the LegalizeHelper in two:
    • One with all the methods you added (narrow, widen, etc), but legalize
    • One with the legalize API and the set action stuff

Put differently, I would have the legalizer info being the actual legalizer. We have one per sub target and it knows how to legalize stuff. It uses the LegalizeHelper for the generic action.

  • Replace the Type/EVT by something lower level (Cf. my email)

Basically, we do not need to know what are the types, we only need to know their size and there number of lanes, i.e., I would go with a different types representation for the MI level. It is a much bigger change than just for legalization and we might postpone it, but it is something to keep in mind.

  • Add unit test cases for the legalize helper.

I think this is it for the high level comments.

More comments below.

Cheers,
-Quentin

include/llvm/CodeGen/GlobalISel/LegalizeMachineIR.h
39

I think we need to say something about what happens to the definition and arguments of MI.
My goal was to have transformations that take legal mir and produce legal mir. For that to be possible, we need to preserve the definition and arguments (with what I called the extract and build_sequence operations in the talk.)
We need to say that somewhere. Moreover, we need to say what happen to the extract and build_sequence operations that are created. Do we legalize them now, do we return a list of them, etc.?
I think the latter is preferable but maybe we want to also provide an option for the former.

Also, please comment on what is the boolean supposed to mean.

include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h
199

I can see how this method will be useful to handle vectors.
Just one comment on the extract instruction. In my mind, the extract instruction was more a extract bitfield like instruction and could produce several definitions.
In other words, the instruction supports the following generic syntax:
<def1>, …, <defN> = extract <use>, <startBitIdx1>, <lengthDef1>, …, <startBitIdxN>, <lengthDefN>

The only requirement, again in my mind, was that the extracted bits do not overlap. The reason why I bring that here is because I think it is easier for the optimizers to reason about only one instruction for things that work on the same input (regbankselect in mind), e.g., gpr, gpr = movrrd fpr naturally fit with this semantic of extract instruction and vice versa. Therefore if we try to legalize only “half” of the extract, we may miss that this pattern is available.

To give something more concret with respect to legalization, say we have:
def1 = extract val, ...
def2 = extract val, …

Let say this is illegal and needs to be lowered in loads and stores. We will likely have:
store val
def1 = load @offset1
store val ; <— redundant store
def2 = load @offset2

And things needs to be cleaned up later.
Now, we the other representation:
def1, def2 = extract val, …
we get
store val
def1 = load @offset1 ; <— loads may be combined
def2 = load @offset2

include/llvm/CodeGen/GlobalISel/MachineLegalizeInfo.h
30

Copy paste :)

51

We’ll actually need more setAction kind of thing.
E.g., right now, there is no way to specify that "fptoint float to i32" is legal whereas "fptoint float to i64" is not, same for truncate, extensions, etc.
What I am saying is that some instructions need both their input and output types to set an action.
Note: This is something broken in SDISel and we need to fix that while we are here. I believe some of the custom lowering happens because of that.

lib/CodeGen/GlobalISel/IRTranslator.cpp
1

Unrelated change.

lib/CodeGen/GlobalISel/MachineIRBuilder.cpp
69

The fact that we were not moving the cursor was actually by design.
If we change that we need to check what is the impact on the existing code.

lib/CodeGen/LLVMTargetMachine.cpp
163

I’d say we need a addPreLegalize hook as well.

arsenm added inline comments.Jun 21 2016, 12:09 PM
include/llvm/CodeGen/GlobalISel/MachineLegalizeInfo.h
51

For memory operations, we really need to be able to specify legality per address space

t.p.northover marked 6 inline comments as done.

OK, I've split off the what-should-I-do decision class into a separate review D22074 (as suggested by Quentin) and marked all comments relating to it as done in this review. Feel free to raise them again there if they're not adequately addressed.

Tim.

t.p.northover marked 7 inline comments as done.Jul 7 2016, 2:18 PM

I'm not uploading an actual diff yet (I'll want to split out more bits into separate reviews I think). But some comment replies:

include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h
199

OK, I'll give it a go. For now I'll limit it to a single (implicit) size since otherwise we'll have to deal with multiple types.

include/llvm/Target/GenericOpcodes.td
66

Existing instructions are indexed by 1. Could do that in a separate refactoring though, I don't particularly feel strongly either way.

lib/CodeGen/GlobalISel/MachineIRBuilder.cpp
69

Well, all the tests still pass. ;)

I think we definitely need to do something so that the default use doesn't end up inserting instructions in reverse order. Maybe make setInstr default to Before = true instead?

t.p.northover abandoned this revision.Jul 18 2016, 1:20 PM
t.p.northover marked 2 inline comments as done.

I'll upload more modular patches for what's in this diff.