The previous code calculated the position of the first ldtilecfg as a point dominating all AMX registers' defs. This may result in the ldtilecfg being inserted into a loop.
This patch tries to calculate the nearest point that post-dominates all shapes of AMX registers.
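
The loop problem described in the summary can be reproduced on a toy CFG. The following is a minimal sketch and not the code from X86PreTileConfig.cpp: the block numbering, the `Graph` alias, and the helpers `toyLoopCfg`, `computeDominators`, `nearestCommonDominator`, and `isOnCycle` are all hypothetical names introduced for illustration. It assumes that "dominating all AMX registers' defs" means picking the nearest common dominator of the blocks that define AMX registers.

```cpp
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Successor lists indexed by block number (hypothetical toy representation).
using Graph = std::vector<std::vector<int>>;

// Toy CFG: 0 = entry, 1 = loop header, 2 and 3 = loop body (both define
// AMX registers), 4 = exit. The edge 3 -> 1 is the back edge.
Graph toyLoopCfg() { return {{1}, {2, 4}, {3}, {1}, {}}; }

static std::set<int> intersect(const std::set<int> &A, const std::set<int> &B) {
  std::set<int> Out;
  for (int X : A)
    if (B.count(X))
      Out.insert(X);
  return Out;
}

// Classic iterative data-flow: Dom[B] = {B} plus the intersection of Dom[P]
// over all predecessors P. Assumes every block is reachable from Entry.
std::vector<std::set<int>> computeDominators(const Graph &Succ, int Entry) {
  const int N = static_cast<int>(Succ.size());
  Graph Pred(N);
  std::set<int> All;
  for (int B = 0; B < N; ++B) {
    All.insert(B);
    for (int S : Succ[B])
      Pred[S].push_back(B);
  }
  std::vector<std::set<int>> Dom(N, All);
  Dom[Entry] = {Entry};
  for (bool Changed = true; Changed;) {
    Changed = false;
    for (int B = 0; B < N; ++B) {
      if (B == Entry)
        continue;
      std::set<int> New = All;
      for (int P : Pred[B])
        New = intersect(New, Dom[P]);
      New.insert(B);
      if (New != Dom[B]) {
        Dom[B] = std::move(New);
        Changed = true;
      }
    }
  }
  return Dom;
}

// The common dominators of a set of blocks form a chain in the dominator
// tree; the nearest one is the one with the most dominators itself.
int nearestCommonDominator(const std::vector<std::set<int>> &Dom,
                           const std::vector<int> &Blocks) {
  std::set<int> Common = Dom[Blocks.front()];
  for (int B : Blocks)
    Common = intersect(Common, Dom[B]);
  int Best = -1;
  for (int D : Common)
    if (Best < 0 || Dom[D].size() > Dom[Best].size())
      Best = D;
  return Best;
}

// A block lies in a loop if it can reach itself through its successors.
bool isOnCycle(const Graph &Succ, int B) {
  std::vector<bool> Seen(Succ.size(), false);
  std::vector<int> Work(Succ[B].begin(), Succ[B].end());
  while (!Work.empty()) {
    int Cur = Work.back();
    Work.pop_back();
    if (Cur == B)
      return true;
    if (Seen[Cur])
      continue;
    Seen[Cur] = true;
    for (int S : Succ[Cur])
      Work.push_back(S);
  }
  return false;
}
```

With AMX defs in blocks 2 and 3, `nearestCommonDominator(computeDominators(toyLoopCfg(), 0), {2, 3})` selects block 2, and `isOnCycle(toyLoopCfg(), 2)` is true, so a ldtilecfg placed there would execute on every iteration. This is the situation the patch aims to avoid.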
Differential D99010
[X86][AMX] Hoist ldtilecfg
Closed, Public. Authored by pengfei on Mar 19 2021, 11:24 PM.
Event Timeline

Since D98845 has landed, I'd like to do the ldtilecfg hoist together with this patch. WIP.

pengfei retitled this revision from "[X86] Fix a bug when calculating the ldtilecfg insertion points." to "[X86][AMX] Hoist ldtilecfg". Mar 31 2021, 7:52 PM

pengfei marked 8 inline comments as done.
Address Yuanke and Xiang's comments.

Perhaps we need more comments and more test cases (maybe in a separate file) to cover those scenarios.

Fixed the problem when the sink needs to be forked, i.e.:

          +------+
          |Entry | BB0
          +------+
          /      \
    +------+    +------+
    |Shape1|    |Shape2| BB2
    +------+    +------+
    BB1   \      /
          +------+
          | AMX  | BB3
          +------+

If BB1 and BB2 don't have a call, we will try to insert ldtilecfg from BB0.

The algorithm for updating the shape post-dominating BBs is buggy. For a given ShapeBB, clearing the flags of all its predecessors is not enough, since its unreachable BBs also need to be cleared.

I have thought of a new method, but it needs a major refactor. Stay tuned~

Worked it out.
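
The remark about clearing flags can be illustrated on the diamond CFG from the diagram. This is only one possible reading of the comment and not the patch's implementation: it assumes a per-block flag meaning "this block may still come after the given ShapeBB", and the names `diamondCfg`, `reachableFrom`, `reversed`, and `keptAfterClearingPreds` are hypothetical.

```cpp
#include <set>
#include <vector>

// Successor lists indexed by block number (hypothetical toy representation).
using Graph = std::vector<std::vector<int>>;

// Diamond from the diagram: BB0 -> {BB1, BB2}, BB1 -> BB3, BB2 -> BB3.
Graph diamondCfg() { return {{1, 2}, {3}, {3}, {}}; }

// All blocks reachable from Start by following edges, including Start.
std::set<int> reachableFrom(const Graph &Succ, int Start) {
  std::set<int> Seen{Start};
  std::vector<int> Work{Start};
  while (!Work.empty()) {
    int Cur = Work.back();
    Work.pop_back();
    for (int S : Succ[Cur])
      if (Seen.insert(S).second)
        Work.push_back(S);
  }
  return Seen;
}

// Flip every edge so that a forward walk visits predecessors.
Graph reversed(const Graph &Succ) {
  Graph Pred(Succ.size());
  for (int B = 0; B < static_cast<int>(Succ.size()); ++B)
    for (int S : Succ[B])
      Pred[S].push_back(B);
  return Pred;
}

// Blocks whose flag stays set if only the (transitive) predecessors of
// ShapeBB are cleared. ShapeBB itself keeps its flag.
std::set<int> keptAfterClearingPreds(const Graph &Succ, int ShapeBB) {
  std::set<int> Preds = reachableFrom(reversed(Succ), ShapeBB);
  Preds.erase(ShapeBB);
  std::set<int> Kept;
  for (int B = 0; B < static_cast<int>(Succ.size()); ++B)
    if (!Preds.count(B))
      Kept.insert(B);
  return Kept;
}
```

For ShapeBB = BB2, `keptAfterClearingPreds(diamondCfg(), 2)` leaves BB1, BB2, and BB3 flagged, while `reachableFrom(diamondCfg(), 2)` contains only BB2 and BB3. The sibling block BB1 is neither a predecessor of BB2 nor reachable from it, which matches the case the comment says must also be cleared.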

pengfei marked an inline comment as done.
Address Yuanke's comments.

pengfei marked 2 inline comments as done.
Address Xiang's comments.

pengfei marked an inline comment as done.
Address Xiang's comment.

pengfei marked 2 inline comments as done.
Address Yuanke's comments. Add test5 for checking shape peek's loop break.

Fix a silly bug that used != where == was intended, and a bug when recording the shape for a phi. These were found while working on the PostRA implementation. They should have been caught by the test amx-ldtilecfg-insert.ll, but it happened to produce the expected output even with these 2 bugs.

Refactor to exclude the DBG_VALUE case.

This revision is now accepted and ready to land. Apr 12 2021, 6:22 AM
This revision was landed with ongoing or failed builds. Apr 12 2021, 7:37 AM
Closed by commit rG4cbaaf4a2437: [X86][AMX] Hoist ldtilecfg (authored by Wang, Pengfei <pengfei.wang@intel.com>).
This revision was automatically updated to reflect the committed changes.

Revision Contents
Diff 335062
llvm/lib/Target/X86/X86PreTileConfig.cpp
llvm/test/CodeGen/X86/AMX/amx-across-func.ll
llvm/test/CodeGen/X86/AMX/amx-config.ll
llvm/test/CodeGen/X86/AMX/amx-ldtilecfg-insert.ll
llvm/test/CodeGen/X86/opt-pipeline.ll
Since we changed the algorithm, we need to update the pass description.