This is an archive of the discontinued LLVM Phabricator instance.

[X86] Fix tile config register spill issue for AMX
AbandonedPublic

Authored by xiangzhangllvm on Jan 26 2021, 11:39 PM.

Details

Summary

This is an optimized approach for D94155 and D95136.

The previous code modeled the tile config register as being used by
each AMX instruction. This causes a problem when the tile config register
is spilled: across a function call, an ldtilecfg instruction may be
inserted at each AMX instruction that uses the tile config register, which
clobbers all of the tile data registers.

To fix this issue, we remove the tile config register model. Instead, we
analyze the AMX instructions between one call and the next, and insert
ldtilecfg after the first call if we find any AMX instructions.
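
The idea in the summary can be illustrated with a minimal sketch. It is not the code from this patch: it assumes a single straight-line instruction list rather than a full CFG, and the names Kind and insertReloads are hypothetical stand-ins for the real MachineInstr handling. A reload is placed right after a call only when an AMX instruction appears before the next call.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical instruction kinds; these are not real X86 opcodes.
enum class Kind { Call, Amx, Other, LdTileCfg };

// Returns a copy of `insts` with one LdTileCfg placed right after every call
// that is followed by at least one AMX instruction before the next call.
// Assumption: a single straight-line sequence, no control flow.
std::vector<Kind> insertReloads(const std::vector<Kind> &insts) {
  std::vector<Kind> out;
  for (std::size_t i = 0; i < insts.size(); ++i) {
    out.push_back(insts[i]);
    if (insts[i] != Kind::Call)
      continue;
    // Scan forward until the next call (or the end) looking for AMX use.
    bool needsReload = false;
    for (std::size_t j = i + 1;
         j < insts.size() && insts[j] != Kind::Call; ++j) {
      if (insts[j] == Kind::Amx) {
        needsReload = true;
        break;
      }
    }
    if (needsReload)
      out.push_back(Kind::LdTileCfg);
  }
  return out;
}
```

Because each look-ahead stops at the next call, every instruction is scanned a bounded number of times, and a call followed by no AMX instruction gets no reload at all.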

Diff Detail

Event Timeline

xiangzhangllvm requested review of this revision. Jan 26 2021, 11:39 PM
Herald added a project: Restricted Project. Jan 26 2021, 11:39 PM
pengfei added inline comments. Jan 27 2021, 12:17 AM
llvm/lib/Target/X86/X86PreTileConfig.cpp
242

Since you iterate over all MIs twice, the total complexity is 2 * N(BB) * M(MI), which is worse than D56136 (N * M + N).

260

If it is also a successor of itself, it will fall into an infinite loop.

llvm/lib/Target/X86/X86PreTileConfig.cpp
242

In practice, we treat O(N) and O(2N) as the same.

260

Good catch! It should be moved ahead.
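
The self-successor concern at line 260 can be sketched as follows. This is not the code under review: Block and reachesAmx are hypothetical names, and the sketch assumes the fix is to mark a block as visited before walking its successors, which is the usual way to make a recursive CFG walk terminate on self-loops.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical CFG node; this is not LLVM's MachineBasicBlock.
struct Block {
  bool hasAmx = false;
  std::vector<std::size_t> succs; // indices of successor blocks
};

// Returns true if a block containing AMX is reachable from `bb`
// through blocks not yet visited on this walk.
bool reachesAmx(const std::vector<Block> &cfg, std::size_t bb,
                std::vector<bool> &visited) {
  assert(bb < cfg.size() && visited.size() == cfg.size());
  if (visited[bb])
    return false; // already explored on this walk
  // Mark BEFORE recursing so a block that is its own successor
  // is not entered again.
  visited[bb] = true;
  if (cfg[bb].hasAmx)
    return true;
  for (std::size_t s : cfg[bb].succs)
    if (reachesAmx(cfg, s, visited))
      return true;
  return false;
}
```

If the visited mark were set only after the successor loop, a block listing itself as a successor would recurse without end.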

xiangzhangllvm abandoned this revision. Jan 27 2021, 4:04 PM

Updated the ReloadTileConfig algorithm, just out of interest.

pengfei added inline comments. Jan 29 2021, 9:14 PM
llvm/lib/Target/X86/X86PreTileConfig.cpp
300

Doesn't it always just return BB2AMX[TileCfgBB]?

xiangzhangllvm added inline comments. Jan 30 2021, 1:10 AM
llvm/lib/Target/X86/X86PreTileConfig.cpp
300

If collectCfgCalls is called twice, the first call determines the status (whether it can directly reach AMX or not) of at least one loop-dependent BB, so the second call can determine the status of all BBs.

pengfei added inline comments. Jan 30 2021, 1:25 AM
llvm/lib/Target/X86/X86PreTileConfig.cpp
300

But the 2nd call doesn't do anything. Since the first call adds all BBs to BBVisited, the 2nd call will return directly from line 251.

llvm/lib/Target/X86/X86PreTileConfig.cpp
300

Oh, yes! I missed BBVisited.clear() before the 2nd call.

xiangzhangllvm abandoned this revision. Jan 30 2021, 11:17 PM
xiangzhangllvm updated this revision to Diff 320340.

This update improves ReloadTileConfig just out of interest; the patch is now abandoned.