For code like
struct S {
  void* p;
  void* q;
};
static S kS0;
S getS() {
  return kS0;
}
LLVM generates the following instructions for PPC:
lxvx 0, 0, 3
mfvsrld 3, 0
mfvsrd 4, 0
The ideal result would be just:
ld 3, 0(4)
ld 4, 8(4)
The problem is in SLPVectorizer: the vector build instructions (insertvalue for aggregate types) are passed to BoUpSLP.buildTree, where they are treated as the UserIgnoreList, so their cost is not counted later during cost estimation.
For aggregate values, later uses are more likely to happen in integer registers, either as individual scalars or as a whole for a function call or return value. So when vectorizing an aggregate value, the scalar extraction instructions must be included in the cost estimation.
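The effect on the profitability decision can be sketched with a toy model. This is not the SLPVectorizer code: ToyCosts, costDelta, and the unit costs are hypothetical and chosen only to show how leaving out the extraction cost can flip the decision.

```cpp
// Hypothetical per-operation costs; not taken from any real LLVM target cost table.
struct ToyCosts {
  int scalarLoad = 1;     // one scalar load per aggregate field
  int vectorLoad = 1;     // one vector load covering all fields
  int extractElement = 1; // moving one lane back into a scalar register
};

// Returns (vector cost - scalar cost) for loading numElts fields of an aggregate.
// A negative result means vectorization looks profitable in this toy model.
int costDelta(const ToyCosts &c, int numElts, bool countExtracts) {
  int scalarCost = numElts * c.scalarLoad;
  int vectorCost = c.vectorLoad;
  if (countExtracts)
    vectorCost += numElts * c.extractElement; // lanes are needed as scalars again
  return vectorCost - scalarCost;
}
```

Under these assumed costs, for the two-field struct above, costDelta(ToyCosts{}, 2, false) is -1, so vectorization looks profitable when the extracts are ignored. costDelta(ToyCosts{}, 2, true) is 1, so the scalar loads win once the extracts are counted.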
I'd say "scalar registers" instead of "integer registers" (they might, for example, be floating-point registers).