This is an optimized approach for D94155 , D95136.
Previous code build the model that tile config register is the user of
each AMX instruction. There is a problem for the tile config register
spill. When across function, the ldtilecfg instruction may be inserted
on each AMX instruction which use tile config register. This cause all
tile data register clobber.
To fix this issue, we remove the model of tile config register. We
analyze the AMX instructions between one call to another. We will insert
ldtilecfg after the first call if we find any AMX instructions.
If it is also succsor of itself, it will drop into infinite loop.