This build vector lowering pattern came up in D79886. I've tried to limit the improvement to cases where it looks clearly better to load, but we could remove the 'TODO' predicates already if we are willing to overlook some corner cases.
llvm/test/CodeGen/X86/combine-udiv.ll | ||
---|---|---|
602 | This would improve without the -1 restriction. | |
681–682 | No change for AVX2 is probably caused by the 128-bit limit. | |
llvm/test/CodeGen/X86/sad.ll | ||
547–548 | This would improve without the -1 restriction. | |
1019–1020 | No change for AVX2/AVX512 is probably caused by the 128-bit limit. | |
llvm/test/CodeGen/X86/vec_shift2.ll | ||
13 | This is a regression, but I'm assuming it does not matter because we have been using standard IR for vector shifts for at least 5 years. If it does matter, then I think the next test shows an existing failure of constant analysis. Also, if the high part of the shift amount is undef, then can't we fold both of these tests to constant 0 (no shift needed)? |
This would improve without the -1 restriction.
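The question in the vec_shift2.ll comment (can an undef high part of the shift amount justify folding to 0?) rests on how the x86 word shift reads its count: `PSRLW` with a register count uses the low 64 bits of that register, and any count above 15 zeroes every lane. Below is a minimal Python model of that behavior. The function names and the sample values are hypothetical, and it assumes the count is built from two 32-bit elements where only the low one is defined. It models the hardware semantics only; whether LLVM may legally perform the fold for this intrinsic is the open question raised in the comment.

```python
def count_from_i32_elts(elt0, elt1):
    """Combine the two low i32 elements into the 64-bit shift count (little-endian)."""
    return (elt0 & 0xFFFFFFFF) | ((elt1 & 0xFFFFFFFF) << 32)


def psrlw_model(lanes, count64):
    """Model PSRLW with a register count.

    lanes: list of 16-bit unsigned lane values.
    count64: the low 64 bits of the count operand, as an unsigned integer.
    """
    if count64 > 15:
        # Any count above 15 clears all lanes.
        return [0 for _ in lanes]
    return [(lane & 0xFFFF) >> count64 for lane in lanes]


# Hypothetical inputs: the low element is 14; the next element is undef,
# so an optimizer is free to pick any value for it.
lanes = [0xFFFF, 0x8000]
with_zero_high = psrlw_model(lanes, count_from_i32_elts(14, 0))
with_nonzero_high = psrlw_model(lanes, count_from_i32_elts(14, 1))
```

If the undef element is chosen as any nonzero value, the 64-bit count exceeds 15 and the result is all zeros, which is the fold the comment asks about; choosing 0 instead yields an ordinary shift by 14.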