When estimating the cost of a tree node that depends on another node, we may estimate the cost of the shuffle operation too optimistically by assuming that it depends on a single tree node with TargetTransformInfo::SK_Select.
For example, such a node dependence ended up producing this snippet:
  %9 = fsub fast <2 x float> %8, %3
  %10 = fadd fast <2 x float> %8, %3
  %11 = shufflevector <2 x float> %9, <2 x float> %10, <2 x i32> <i32 0, i32 3>
This led to a suboptimal final result.
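To make the SK_Select assumption concrete: the mask <0, 3> in the snippet above takes lane 0 from the first operand and lane 1 from the second, so every lane keeps its own position and only the source differs. A minimal standalone sketch of that check follows. The helper name isSelectStyleMask is hypothetical and is not LLVM's API; it also accepts single-source identity masks, which a stricter check would probably reject.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper, not part of LLVM: returns true if every lane I of the
// mask selects element I of the first source (index I) or element I of the
// second source (index I + NumSrcElts). A value of -1 marks an undefined lane.
bool isSelectStyleMask(const std::vector<int> &Mask, int NumSrcElts) {
  assert(NumSrcElts > 0 && "source vectors must be non-empty");
  // A select-style shuffle produces a vector as wide as its sources.
  if (Mask.size() != static_cast<std::size_t>(NumSrcElts))
    return false;
  for (int I = 0; I < NumSrcElts; ++I) {
    int M = Mask[I];
    if (M == -1)
      continue; // undefined lane matches anything
    if (M != I && M != I + NumSrcElts)
      return false; // lane would move an element across positions
  }
  return true;
}
```

With NumSrcElts = 2, the mask {0, 3} passes this check, while a lane-crossing mask such as {1, 2} does not.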
Can you try to reuse buildShuffleEntryMask instead?