When a vector store writes an aggregate that was previously built by findBuildAggregate(), we adjust the total cost to account for the vector already being prepared.
This patch fixes llvm.org/PR40522
Differential D96791
[SLP] Double UserCost compensation for vector store of aggregate
Authored by anton-afanasyev on Feb 16 2021, 8:32 AM.
Event Timeline

I think the better approach would be to pass the list of InsertUses as the UserIgnoreLst argument of the buildTree function. You just need to correctly generate ExtractElement instructions for these InsertElements, because currently the compiler just crashes trying to remove instructions used as operands for the original InsertElements. Thoughts?

Please can you clean up pr40522.ll (remove the metadata etc.)? Also, PR40522 notes missing fp2int vectorization - please can you add those?

  typedef int int4 __attribute__((__vector_size__(16)));

  void fp2int_vec1(float a, float b, float c, float d, int *p) {
    int4 result = (int4) { (int) a, (int) b, (int) c, (int) d };
    *p++ = result[0];
    *p++ = result[1];
    *p++ = result[2];
    *p++ = result[3];
  }

  void fp2int_vec2(float a, float b, float c, float d, int4 *p) {
    int4 result = (int4) { (int) a, (int) b, (int) c, (int) d };
    *p = result;
  }

I can add the fp2int sample, but it is not touched by this patch. Actually, its tree is not vectorized because it is marked as "tiny" (by the R.isTreeTinyAndNotFullyVectorizable() function): its size is 2 (since fp2int is a unary operation, whereas add is a binary one). This could be fixed by checking that the stores use its result, so that the tree size could be increased. But this looks too hacky, doesn't it?

OK - adding the fp2int tests to the same test file makes sense to me as they all came from the same bug, even if we don't yet have a fix.
Can we use an llvm::any_of pattern instead here?
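For context on the suggestion above, here is a minimal sketch of what switching from a hand-written loop to an any_of pattern could look like. The code under review is not shown in this thread, so the FakeUser type and both function names are hypothetical stand-ins, not taken from the patch. The sketch uses std::any_of so that it compiles without LLVM headers; llvm::any_of from llvm/ADT/STLExtras.h is the range-based equivalent and takes the container directly instead of a begin/end pair.

```cpp
#include <algorithm>
#include <vector>

// Hypothetical stand-in for a user of an instruction; not an LLVM type.
struct FakeUser {
  bool IsStore;
};

// Hand-written loop: returns true as soon as one user is a store.
bool hasStoreUserLoop(const std::vector<FakeUser> &Users) {
  for (const FakeUser &U : Users)
    if (U.IsStore)
      return true;
  return false;
}

// The same check expressed with std::any_of. With LLVM headers available,
// the range-based form would be llvm::any_of(Users, <same lambda>).
bool hasStoreUserAnyOf(const std::vector<FakeUser> &Users) {
  return std::any_of(Users.begin(), Users.end(),
                     [](const FakeUser &U) { return U.IsStore; });
}
```

Both functions return false for an empty list and stop at the first match, so the rewrite should not change behavior; the any_of form mainly states the intent in one expression.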