This is an archive of the discontinued LLVM Phabricator instance.

[X86][AMX] Lower tile copy instruction.
ClosedPublic

Authored by LuoYuanke on Feb 19 2021, 11:05 PM.

Details

Summary

Since there is no tile copy instruction, we need to store the tile
register to the stack and load it from the stack into another tile
register. We need an extra GR to hold the stride, and a stack slot
to hold the tile data. We run this pass after copy propagation so
that we don't miss copy optimizations, and before prolog/epilog
insertion so that we can still allocate the stack slot.
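
The expansion described above can be sketched roughly as follows. This is a simplified, hypothetical model that emits instructions as plain strings rather than LLVM MachineInstrs; it is not the code in X86LowerTileCopy.cpp. The names `TileCopy`, `lowerTileCopy`, `tileSlot`, and `grSlot` are made up for illustration, and the choice of `rax` as the stride register, the stride value of 64, and the save/restore of the GR around the sequence are all assumptions.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical model of a tile-to-tile COPY (not an LLVM type).
struct TileCopy {
  std::string dst; // destination tile register, e.g. "tmm1"
  std::string src; // source tile register, e.g. "tmm0"
};

// Expand one tile copy into a store/load pair that goes through a stack slot.
// tileSlot: assumed name of the stack slot that holds the tile data.
// grSlot:   assumed name of the stack slot used to preserve the GR.
std::vector<std::string> lowerTileCopy(const TileCopy &copy,
                                       const std::string &tileSlot,
                                       const std::string &grSlot,
                                       int strideBytes = 64) {
  const std::string gr = "rax"; // assumption: any GR could hold the stride
  return {
      // Assumption: preserve the GR, since it may hold a live value.
      "mov [" + grSlot + "], " + gr,
      // Put the stride into the GR.
      "mov " + gr + ", " + std::to_string(strideBytes),
      // Store the source tile to the stack slot.
      "tilestored [" + tileSlot + " + " + gr + "*1], " + copy.src,
      // Load the destination tile from the same stack slot.
      "tileloadd " + copy.dst + ", [" + tileSlot + " + " + gr + "*1]",
      // Restore the GR.
      "mov " + gr + ", [" + grSlot + "]",
  };
}
```

Calling `lowerTileCopy({"tmm1", "tmm0"}, "slot0", "slot1")` yields five pseudo-instructions, with the `tilestored`/`tileloadd` pair in the middle replacing the original copy.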

Event Timeline

LuoYuanke created this revision. Feb 19 2021, 11:05 PM
LuoYuanke requested review of this revision. Feb 19 2021, 11:05 PM
Herald added a project: Restricted Project. Feb 19 2021, 11:05 PM

Remove useless code.

pengfei added inline comments. Feb 20 2021, 1:15 AM
llvm/lib/Target/X86/X86LowerTileCopy.cpp
2

Comment is wrong.

10

instructions

llvm/lib/Target/X86/X86RegisterInfo.cpp
878

Is it possible to define a special COPY for AMX which can implicitly define a register for stride?

llvm/lib/Target/X86/X86TargetMachine.cpp
588

This is much like how X87 register copies are handled in the "X86 FP Stackifier" pass, so I think we can add this pass in addPostRegAlloc, the same way that pass is added.

llvm/test/CodeGen/X86/AMX/amx-lower-tile-copy.ll
38

As we discussed, tilezero should be rematerialized instead of spilled. For non-tilezero cases, we still need to treat the spill as loop invariant and hoist it out of the loop. Anyway, these are optimization thoughts that don't affect the functionality here.

LuoYuanke added inline comments. Feb 20 2021, 2:11 AM
llvm/lib/Target/X86/X86RegisterInfo.cpp
878

Not sure. The COPY instruction is common to all targets.

llvm/lib/Target/X86/X86TargetMachine.cpp
588

Sounds good to me.

llvm/test/CodeGen/X86/AMX/amx-lower-tile-copy.ll
38

I would do the optimization in another patch.

LuoYuanke updated this revision to Diff 325175. Feb 20 2021, 2:15 AM

Address Pengfei's comments.

LuoYuanke added inline comments. Feb 20 2021, 2:16 AM
llvm/lib/Target/X86/X86RegisterInfo.cpp
878

Also, COPY instructions are generated automatically by some passes.

pengfei accepted this revision. Feb 21 2021, 5:21 PM

LGTM.

This revision is now accepted and ready to land. Feb 21 2021, 5:21 PM
This revision was landed with ongoing or failed builds. Feb 22 2021, 3:50 PM
This revision was automatically updated to reflect the committed changes.

Do we need to force opt to build a legacypassmanager for this pass?