This patch introduces a new heuristic for guiding operand reordering. The new "look-ahead" heuristic can look beyond the immediate predecessors. This helps break ties when the immediate predecessors have identical opcodes (see lit test for an example).
I agree, they look quite scary but the values themselves are not very important.
We could still use the same score of 1 for all of them and we would still get decent reordering. Another alternative would be to check TTI for each potential candidate vector, but it looks like an overkill.
Addressed comments and updated lit test.
Hmm good point. Well, currently 'getScoreAtLevelRec()' will simply walk past the alt instructions and will assign them ScoreSameOpcode. This is does not look very accurate because alt opcodes usually require shuffles and should have a lower score.
As for the alt opcodes that fold away, maybe that should be fixed in the getSameOpcode() and in struct InstructionState ? If the get folded then maybe isAltShuffle() should return false?
I investigated the two AArch64 failing tests. These tests feature the exact problem that we are trying to solve with this look-ahead heuristic. A commutative instruction had operands of the same opcode that the current heuristic has no way of reordering in an informed way. The current reordering was just lucky to pick the proper one, while the look-ahead heuristic was reordering the operands according to the score. However, the problem was that the score calculation was not considering external uses and was therefore favoring a sub-optimal operand ordering.
I updated the patch to factor in the cost of external uses, and both failures are now gone. I also updated the lit-test with a test that shows the problem with the external-uses.
hmm, I have another failure with this change on my setup, now it is PR39774.ll. Probably, it might be sort algorithm differents of similar, since it just swap of two inserts. I don't have this failure if PR39774.ll is intact.