Allocating AMX/tile registers in a separate pass has three benefits.
a) When fast register allocation spills a tile register, it creates a
virtual register for the stride. The instruction that defines this
virtual register is inserted after the instruction that is currently
being assigned physical registers, so fast register allocation never
visits the new instruction and the virtual register is never assigned a
physical register. Allocating tile registers in a separate pass avoids
this.
b) When filling in the shape information of the tile configuration, we
need to know the tile's physical register. If the shape (row, column) is
already held in a physical register, that register may be redefined
before it is written to the stack. The redefinition may happen when
registers are split or spilled. If the shape is held in a virtual
register, there is no such problem. Below is an example of a register
split on %al. The tile should have 8 rows, but it is configured with 4.
  %al = 8
  mov %al, %bl
  %al = 4
  store stack.cfg, %al ; config
c) A single configuration may fail. In that case we can fall back to
multi-config and allocate the tile registers separately, while leaving
the other registers to be allocated by greedy RA.
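
The hazard described in (b) can be illustrated with a small toy model. This
is only a sketch under stated assumptions, not LLVM code: the RegFile type,
the two functions, and the register names "vreg_rows" and "vreg_other" are
hypothetical and exist only for illustration. It replays the %al example
once with the row count held in a physical register and once with it held
in a virtual register.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical toy model (not LLVM code): a register file is a map from a
// register name to the byte it currently holds.
using RegFile = std::map<std::string, std::uint8_t>;

// Row count kept in a physical register: a split may move the value away
// and reuse %al before the config store reads it.
std::uint8_t configRowsFromPhysReg() {
  RegFile Phys;
  Phys["al"] = 8;          // %al = 8 (intended row count)
  Phys["bl"] = Phys["al"]; // mov %al, %bl (split copies the value out)
  Phys["al"] = 4;          // %al = 4 (%al is reused for another value)
  return Phys["al"];       // store stack.cfg, %al reads the redefined %al
}

// Row count kept in a virtual register: the unrelated value lives in a
// different virtual register, so the config store still sees the row count.
std::uint8_t configRowsFromVirtReg() {
  RegFile Virt;
  Virt["vreg_rows"] = 8;   // hypothetical virtual register for the rows
  Virt["vreg_other"] = 4;  // unrelated value gets its own virtual register
  return Virt["vreg_rows"];
}
```

In this model the first function yields the wrong row count (4) and the
second yields the intended one (8), which mirrors why the shape is kept in
a virtual register until the tile configuration has been written.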