This fixes two problems in subregister hierarchies with multiple levels
and tuples:
- For bigger tuples, computing secondary subregs would miss second-order
effects. In the test case, for a register like S10_S11_S12_S13_S14
with D5 = S10_S11 and D6 = S12_S13, we would correctly compute sub0 = D5
and sub1 = D6, but would miss the fact that we could now form
ssub0_ssub1_ssub2_ssub3 (aka sub0_sub1) = D5_D6. This is fixed by
changing computeSecondarySubRegs() to compute a fixpoint.
- Fixing the first problem exposed a second one: TableGen would create
multiple names for effectively the same subregister index. In the test
case the subregister index sub0 is composed from ssub0 and ssub1, and
sub1 is composed from ssub2 and ssub3. TableGen should not create both
sub0_sub1 and ssub0_ssub1_ssub2_ssub3 as inferred subregister indexes.
This changes the code to build a transitive closure of the subregister
components before forming new concatenated subregister indexes.
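
The two ideas above can be illustrated with a small standalone model. This
is only a sketch in Python, not the TableGen C++ code: the names
infer_subregs, canonical_key, LEAVES, CANDIDATES and COMPONENTS are
hypothetical, and the data structures are simplified assumptions built from
the register and index names in the test case.

```python
# Hypothetical model, not the real TableGen data structures.
# Leaf registers of the tuple from the test case.
LEAVES = ["S10", "S11", "S12", "S13", "S14"]

# Candidate registers and the parts they are built from. D5_D6 is listed
# first on purpose, so a single pass sees it before D5 and D6 are known.
CANDIDATES = {
    "D5_D6": ["D5", "D6"],
    "D5": ["S10", "S11"],
    "D6": ["S12", "S13"],
}

# Composite subregister indexes and the indexes they are concatenated from.
COMPONENTS = {
    "sub0": ["ssub0", "ssub1"],
    "sub1": ["ssub2", "ssub3"],
}


def infer_subregs(leaves, candidates, fixpoint=True):
    """Add every candidate whose parts are all already known subregisters.

    With fixpoint=True the scan repeats until nothing changes; with
    fixpoint=False it stops after one pass and can miss second-order hits.
    """
    known = set(leaves)
    changed = True
    while changed:
        changed = False
        for name, parts in candidates.items():
            if name not in known and all(p in known for p in parts):
                known.add(name)
                changed = True
        if not fixpoint:
            break
    return known


def expand(index, components):
    """Return the leaf-level components of an index (transitive closure)."""
    parts = components.get(index)
    if parts is None:
        return [index]
    leaves = []
    for part in parts:
        leaves.extend(expand(part, components))
    return leaves


def canonical_key(parts, components):
    """Expand all parts to leaves so equivalent concatenations compare equal."""
    key = []
    for part in parts:
        key.extend(expand(part, components))
    return tuple(key)


one_pass = infer_subregs(LEAVES, CANDIDATES, fixpoint=False)
full = infer_subregs(LEAVES, CANDIDATES)
key_a = canonical_key(["sub0", "sub1"], COMPONENTS)
key_b = canonical_key(["ssub0", "ssub1", "ssub2", "ssub3"], COMPONENTS)
```

In this model a single pass finds D5 and D6 but not D5_D6, while the
fixpoint version finds all three. Both sub0_sub1 and
ssub0_ssub1_ssub2_ssub3 expand to the same leaf sequence, so keying on the
expanded components yields one index rather than two names.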
This fix was developed for an out-of-tree target. For the in-tree
targets the only change is in the register information computed for ARM.
There is a slight chance this fixed or improved some register coalescing
around the QQQQ/QQ register classes there, but I couldn't see or provoke
any code generation differences.
Could you add a comment with a drawing of the register hierarchy you describe?
For lazy people like me, it is faster to get what is being represented :).