Page MenuHomePhabricator

[GISel][Legalizer][WIP][RFC] Retry legalization for failing instrs when artifacts are around
Needs ReviewPublic

Authored by qcolombet on Mar 13 2019, 4:19 PM.

Details

Summary

This is a prototype based on the discussion I had with @arsenm in D59227.

Basically what the patch does is when we cannot legalize an instruction but there are still legalization artifacts around, push it into a retry list instead of aborting right away. The rationale is that the legalization may have been prevented because the legalization of the instruction couldn't cope with the artifacts in the middle.

This is an RFC/WIP patch, in particular, there are no check whether the element in the retry list got removed, no check for loops (e.g., failed legalization actually inserts a bunch of artifacts and will keep being reapplied over and over) and so on and so forth.

The initial motivation for this patch was to simplify something like opToLegalize (zext(trunc cst)) into opToLegalize cst, i.e., the zext(trunc) chain (which is a legalization artifact) is in the way of the constant whereas opToLegalize may rely on those not being here.

Unfortunately this is not quite enough because legalization artifacts are turned into copies: opToLegalize zext(trunc cst) => opToLegalize (copy cst), so the legalization of opToLegalize still has to look through something.

@arsenm @volkan @aditya_nandakumar, just adding you as reviewers so that you are aware of this trial. I don't plan to push that.

Diff Detail

Repository
rL LLVM

Event Timeline

qcolombet created this revision.Mar 13 2019, 4:19 PM
Herald added a project: Restricted Project. · View Herald TranscriptMar 13 2019, 4:19 PM

I agree that this is an important issue and that it should be implemented. I have already done some work in this direction and you can find a description of my work in the following text. Any comment is very welcome.

In short: I think that we should eliminate all copies and artifacts between two instructions in Legalizer. After such a change we should not need to look through copies and other artifacts.

My understanding of Instruction/Artifact

Instruction:
Gets legalized, legalization does not depend on other instructions/artifacts. Instructions can be legalized in any order (currently are legalized bottom up for efficiency and use of stack data structure). During legalization Instruction can be legal (unchanged), or legalized.
Legalization can be done:
"in place" e.g. widen scalar. We mutate instruction and add few artifacts.
Replaced with a few new instructions and artifacts, original instruction gets deleted e.g narrow scalar.
New instructions and artifacts get pushed on stack where they wait to be legalized.

Artifact: Not an Instruction.
Some properties: First we try to combine artifacts (during combine, artifact gets deleted and replaced with some instruction or with instruction+artifact). In case we fail to combine, Artifact becomes instruction and gets pushed into Instruction stack where it waits to get legalized.
Currently, there is only one chance for artifact to get combined.
It can happen that Artifact1 needs to wait another Artifact2 to become instruction and to get legalized producing new Artifact3. Artifact1 would want to combine with Artifact3.
It can not happen for Artifact4 to wait for an Instruction to produce Artifact5 during its legalization because we first legalize Instructions and then combine artifacts.

The rationale is that the legalization may have been prevented because the legalization of the instruction couldn't cope with the artifacts in the middle.

This should be done for artifacts, if instruction is not legal, nothing can help it.
Generally :
If all uses of an Artifact that failed to combine are Instructions, then the Artifact should be turned into an Instruction.
If one of the uses is an Artifact, then wait, maybe it could be combined later.

Only G_CONSTANTS have to be handled differently (although G_CONSTANTS are formally Instructions now, this should be changed to allow combining).

I would like to propose and push some changes to legalizer, and add few new opcodes to improve artifact combine. This changes should allow support for all types (that is for all big types that cannot fit in register and Opcodes that work with them should be narrow scalared, since small types are easier to handle). The descriptions of changes are given below.

(1) Instead of directly generating shifts/and when we combine G_SEXT/G_ZEXT with G_TRUNC we should generate instructions that represent explicit sign/zero extend in register (G_SEXT_EXPLICIT/G_ZEXT_EXPLICIT), and allow targets to legalize them.

(2) Add few new opcodes that represent implicit trunc of sign/zero(G_IMPLICIT_SIGN_TRUNC/G_IMPLICIT_ZERO_TRUNC). Could be used for function arguments that are implicitly sign/zero extended (in ll this is indicated by signext/zeroext), or for legalization of instructions. For example widen scalar of G_ICMP on mips32 could return s32 that is G_IMPLICIT_ZERO_TRUNC-ated to s1 since actual instruction that is going to be selected does implicit zero-extend to full register. This should improve generation, that is avoid generation, of G_SEXT_EXPLICIT/G_ZEXT_EXPLICIT when a simple copy would do.

(3) Copies that are in the way of some artifact legalization are products of some other artifact combines. We may want to generate G_COPY instead of COPY (e.g. when we combine G_ANYEXT and G_TRUNC), push them into instruction legalize stack, and eliminate them when they are about to be legalized. G_COPY because COPY is sometimes used for cross register class copy/move that gets expanded later. If Targets would not use COPY for such purposes we could get away with COPY, but just to be sure and allow targets to do what they want I would go for G_COPY. G_COPY should not be allowed to leave legalizer and can only be created in artifact combine. This change would address looking through copies and other opcodes.

(4) Combining Artifacts (as already described)

(5) Improve the elimination of dead instructions, as they are present at the moment. (This should get removed once we are sure that we do not leave dead instructions after artifact combines.)

After these changes we should be able to handle constants in much better way. For example we could figure out if constant should be sign or zero extended and generate proper value instead of currently always sign-extending. We could try to implement some constant elimination once they are properly legalized.

Hi Petr,

Instructions:
Gets legalized, legalization does not depend on other instructions/artifacts.

That's unfortunately not true. For instance, for some out-of-tree target, we can legalize DIV by constant and that's all.
Remember that the legalization process can introduce target specific instruction, thus, it is entirely possible that an instruction is legal only in a given sequence of instruction. What this means is that it will likely be legalized directly into a pseudo or an actual instruction, in other words, something target specific.

This should be done for artifacts, if instruction is not legal, nothing can help it.

I don't get that part.
To go back to my example, I cannot legalize my DIV until I see the constant, however the constant may be hidden behind artifacts and it will become legal only when they are removed.
E.g.,

DIV ... (ext(trunc cst)) => illegal
DIV ... (cst) => legal

If all uses of an Artifact that failed to combine are Instructions, then the Artifact should be turned into an Instruction.

Agree.

If one of the uses is an Artifact, then wait, maybe it could be combined later.

Do you see cases where we indeed have to wait in the current scheme where we deal with every instruction before looking at cleaning at the artifacts?
Basically, do you have an example?

Only G_CONSTANTS have to be handled differently (although G_CONSTANTS are formally Instructions now, this should be changed to allow combining).

After my conversation with @arsenm I have mixed feeling about that. On one hand we could treat G_CONSTANT differently because it seems more efficient to me, on the other hands, if we deal with artifacts properly, we don't have to deal with them differently and I like the consistence here.

#1 and #2 sounds sensible to me.

#3 seems unnecessary in the sense that I am pretty sure we could just not insert the copy in the first place.

I am not sure I see the need for changing #4. This goes back to my question of do you have an example?

#5 somewhat agree, though they are not really in the way of anything right now.

We could try to implement some constant elimination once they are properly legalized.

Don't we have that for free already? If something is not dead we keep it, otherwise we remove it. What am I missing?

Cheers,
-Quentin

I did not consider the case where sequence of instruction is legal, but individual instructions are not.
Condition when we throw an error might become
if (ArtifactList.empty() && TryLaterArtifactList.empty())
but other then that, there is no conflicts with artifact combining so it should not be a problem.

here is a small example:
i64 is held as two i32s.

define i64 @f(i32 %x) {
entry:
  %conv = zext i32 %x to i64
  ret i64 %conv
}

IRTranslator gives:

%0:_(s32) = COPY $a0
%1:_(s64) = G_ZEXT %0(s32)
%2:_(s32), %3:_(s32) = G_UNMERGE_VALUES %1(s64)
$v0 = COPY %2(s32)
$v1 = COPY %3(s32)
RetRA implicit $v0, implicit $v1

In Legalizer, ArtifactList contains:

%1:_(s64) = G_ZEXT %0:_(s32)
%2:_(s32), %3:_(s32) = G_UNMERGE_VALUES %1:_(s64)

and we go bottom up.

Here
Artifact %2:_(s32), %3:_(s32) = G_UNMERGE_VALUES %1:_(s64)
Has nothing to combine with at the moment, but it has artifact use (zext) and goes to TryLaterArtifactList
Artifact %1:_(s64) = G_ZEXT %0:_(s32) had nothing to combine with, and is moved to InstList

Instruction %1:_(s64) = G_ZEXT %0:_(s32) gets narrowScalar-ed:

%0:_(s32) = COPY $a0
%1:_(s64) = G_ZEXT %0:_(s32)

Into:

%4:_(s32) = G_CONSTANT i32 0
%1:_(s64) = G_MERGE_VALUES %0:_(s32), %4:_(s32)

ArtifactList contains:

%1:_(s64) = G_MERGE_VALUES %0:_(s32), %4:_(s32)

G_MERGE_VALUES gets turned into instruction and legalized (it is legal).

now we move TryLaterArtifactList to ArtifactList
ArtifactList contains:
%2:_(s32), %3:_(s32) = G_UNMERGE_VALUES %1:_(s64)
it gets combined with
%1:_(s64) = G_MERGE_VALUES %0:_(s32), %4:_(s32)

we get:

%0:_(s32) = COPY $a0
%4:_(s32) = G_CONSTANT i32 0
$v0 = COPY %0(s32)
$v1 = COPY %4(s32)
RetRA implicit $v0, implicit $v1

We could try to implement some constant elimination once they are properly legalized.

Don't we have that for free already? If something is not dead we keep it, otherwise we remove it. What am I missing?

Constant combining is probably a better term.
For example when we want to store i1 true/false, target decides if it wants it zero/sign extended to size that it specifies. Here we can combine away explicit sign/zero extend + G_CONSTANT and create appropriate constant.
I am not completely sure how to treat constants, this is just an idea.

Btw, we might want to perform combine of (-O0 in clang frontend)

%0:_(s32) = COPY $a0
%1:_(s32) = G_CONSTANT i32 2
%2:_(s32) = G_CONSTANT i32 3
%3:_(s32) = G_ADD %0, %1
%4:_(s32) = G_ADD %3, %2

into

%0:_(s32) = COPY $a0
%5:_(s32) = G_CONSTANT i32 5
%6:_(s32) = G_ADD %0, %5

There are passes in middle-end that perform this kind of optimization so we might want to skip this.
On the other hand SDAG can perform this kind of optimization.