This patch tries to fix PR50823.
The shuffle mask has to be twisted twice to get the correct one, because the inner HOP and the outer HOP order their elements differently.
Differential D104903
[X86] Twist shuffle mask when fold HOP(SHUFFLE(X,Y),SHUFFLE(X,Y)) -> SHUFFLE(HOP(X,Y))
Authored by pengfei on Jun 25 2021, 2:41 AM.
Comment Actions Thanks for looking at this - I've been busy with other things this week and haven't really been keeping up with bug traffic!
Comment Actions This has scarily many magic numbers.
Comment Actions Address review comments. Finally, I figured out the math here. The shuffle mask has to be twisted twice to get the correct one. But the output happens to be identical when the input mask is <0, 2, 1, 3>. It confused me for a long time, and I think this is why the bug stayed hidden.
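For readers following the math, here is a minimal Python sketch of one way to model the double twist. It is not the code from the patch. It assumes a 256-bit HADD on 8 x i32 as the HOP, 128-bit lane shuffles as the inner SHUFFLEs, and a 64-bit chunk shuffle as the outer SHUFFLE. The names `TWIST`, `hop`, `shuffle_lanes`, `shuffle_chunks`, `twist_twice` and `fold_is_correct` are made up for illustration.

```python
from itertools import product

# Assumed model: a 256-bit HOP interleaves its operands per 128-bit lane, so
# the lanes [X0, X1, Y0, Y1] land in 64-bit result chunks in this order.
TWIST = (0, 2, 1, 3)

def lanes(v):
    """Split an 8 x i32 vector into its two 128-bit lanes."""
    return [v[0:4], v[4:8]]

def hop(a, b):
    """Model of a 256-bit horizontal add on 8 x i32 (per-lane pairwise sums)."""
    out = []
    for la, lb in zip(lanes(a), lanes(b)):
        out += [la[0] + la[1], la[2] + la[3], lb[0] + lb[1], lb[2] + lb[3]]
    return out

def shuffle_lanes(x, y, mask2):
    """Inner shuffle: pick two 128-bit lanes from [X0, X1, Y0, Y1]."""
    pool = lanes(x) + lanes(y)
    return pool[mask2[0]] + pool[mask2[1]]

def shuffle_chunks(v, mask4):
    """Outer shuffle: permute the four 64-bit chunks of one vector."""
    chunks = [v[0:2], v[2:4], v[4:6], v[6:8]]
    return [e for i in mask4 for e in chunks[i]]

def twist_twice(mask4):
    """Apply TWIST once to the positions and once to the values of the mask."""
    return [TWIST[mask4[TWIST[j]]] for j in range(4)]

def fold_is_correct(mask4):
    """Check HOP(SHUFFLE(X,Y), SHUFFLE(X,Y)) == SHUFFLE(HOP(X,Y)) in this model."""
    # Distinct powers of two keep every pairwise sum unique.
    x = [1, 2, 4, 8, 16, 32, 64, 128]
    y = [256, 512, 1024, 2048, 4096, 8192, 16384, 32768]
    before = hop(shuffle_lanes(x, y, mask4[0:2]), shuffle_lanes(x, y, mask4[2:4]))
    after = shuffle_chunks(hop(x, y), twist_twice(mask4))
    return before == after
```

In this model, `fold_is_correct` can be run over every mask in `product(range(4), repeat=4)`, and `twist_twice((0, 2, 1, 3))` gives back the input mask unchanged, which would be consistent with the bug staying hidden for that input.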
Since the final shuffle has an element type of i64/f64,
should this enforce that the source element type is narrower than that?