Change the type of AllChains from std::set<unique_ptr<Chain>> to std::vector<unique_ptr<Chain>>. A std::set of unique_ptr is ordered by pointer value, so its traversal order depends on where each Chain happens to be allocated and is not stable from one compile to the next. We have observed differences in register assignment in .s files produced by the exact same build. As far as I can see, duplicate entries cannot be added to AllChains, so it should be safe to replace std::set with std::vector. With a vector, AllChains is traversed in insertion order, which is well defined.
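The difference between the two containers can be illustrated with a minimal standalone sketch. It is not the pass's code: the Chain struct below is a hypothetical stand-in that only carries a label, and visitSet and visitVector are helper names made up for this example.

```cpp
#include <cassert>
#include <memory>
#include <set>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the pass's Chain class; it only carries a label.
struct Chain {
  explicit Chain(std::string Name) : Name(std::move(Name)) {}
  std::string Name;
};

// Collects the labels in the order a std::set of unique_ptr visits them.
// The set compares the owned pointers, so this order follows the heap
// addresses returned by the allocator, not the order of insertion.
std::vector<std::string> visitSet(const std::vector<std::string> &Names) {
  std::set<std::unique_ptr<Chain>> AllChains;
  for (const std::string &N : Names)
    AllChains.insert(std::make_unique<Chain>(N));

  std::vector<std::string> Order;
  for (const std::unique_ptr<Chain> &C : AllChains)
    Order.push_back(C->Name);
  return Order;
}

// Collects the labels in the order a std::vector of unique_ptr visits them.
// Iteration follows insertion order, whatever addresses were allocated.
std::vector<std::string> visitVector(const std::vector<std::string> &Names) {
  std::vector<std::unique_ptr<Chain>> AllChains;
  for (const std::string &N : Names)
    AllChains.push_back(std::make_unique<Chain>(N));

  std::vector<std::string> Order;
  for (const std::unique_ptr<Chain> &C : AllChains)
    Order.push_back(C->Name);
  return Order;
}
```

Calling visitVector with the labels S19, S18, S6, S16 always returns them in that order. visitSet returns the same four labels, but in an order that may change with the allocator's behaviour from run to run.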
Example debug output with -debug-only=aarch64-a57-fp-load-balancing from two compiles of the same input, showing the different traversal order (the %S6 and %S16 chains are visited in opposite order):
Compile #1:
colorChainSet(): #sets=4
- Parity=2, Color=Odd
- colorChain({%S19<def> = FMULSrr -> %S19<def> = FMULSrr}, Odd)
- Scavenged register: S3
- Destination register not changed.
- Parity=1, Color=Odd
- {%S18<def> = FMULSrr -> %S18<def> = FMULSrr} - not worthwhile changing; color remains Even
- colorChain({%S18<def> = FMULSrr -> %S18<def> = FMULSrr}, Even)
- Scavenged register: S2
- Destination register not changed.
- Parity=2, Color=Odd
- colorChain({%S6<def> = FMULSrr -> %S6<def> = FMULSrr (kill @ %S18<def> = FMULSrr)}, Odd)
- Scavenged register: S3
- Kill instruction seen.
- Parity=1, Color=Odd
- colorChain({%S16<def> = FMULSrr -> %S16<def> = FMULSrr (kill @ %S19<def> = FMULSrr)}, Odd)
- Scavenged register: S5
- Kill instruction seen.
Compile #2:
colorChainSet(): #sets=4
- Parity=2, Color=Odd
- colorChain({%S19<def> = FMULSrr -> %S19<def> = FMULSrr}, Odd)
- Scavenged register: S3
- Destination register not changed.
- Parity=1, Color=Odd
- {%S18<def> = FMULSrr -> %S18<def> = FMULSrr} - not worthwhile changing; color remains Even
- colorChain({%S18<def> = FMULSrr -> %S18<def> = FMULSrr}, Even)
- Scavenged register: S2
- Destination register not changed.
- Parity=2, Color=Odd
- colorChain({%S16<def> = FMULSrr -> %S16<def> = FMULSrr (kill @ %S19<def> = FMULSrr)}, Odd)
- Scavenged register: S3
- Kill instruction seen.
- Parity=1, Color=Odd
- colorChain({%S6<def> = FMULSrr -> %S6<def> = FMULSrr (kill @ %S18<def> = FMULSrr)}, Odd)
- Scavenged register: S5
- Kill instruction seen.
No regression test is included: the traversal order depends on pointer values, so the difference cannot be reproduced reliably in a test case.