This is a WIP patch to remove the broadcasting of constants from the DAG and instead perform it in a later pass. I'd like to hear people's thoughts on the approach while it's still in the early stages.
The principal aim is to avoid creating broadcasts prematurely, since an early broadcast stops us from folding the constant load into another instruction. Deferring the decision should help reduce register pressure.
There's still a lot to be addressed in this early patch, including:
- Subvector Broadcast handling (VBROADCASTF128 etc.).
- Folding of AVX512 constant loads (including masked loads) to AVX512 broadcasts.
- Folding of AVX512 instructions with folded constant loads to folded broadcasts.
- Better use of AVX (fp broadcasts) and SSE3 (movddup) broadcast instructions - the comment printouts are a mess of float / integer types, which we might want to address first?
- Use of VPMOVZX/VPMOVSX extension loads for non-uniform constants that are representable with smaller integers.
- Remove the constant support entirely from lowerBuildVectorAsBroadcast in the DAG.
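
To make the VPMOVZX/VPMOVSX item above more concrete, here is a minimal standalone C++ sketch of the kind of check a later pass might run on a constant vector. It is not LLVM code and not taken from the patch: `isUniform`, `findNarrowing`, `NarrowPlan` and `ExtKind` are hypothetical names, and the constants are modelled as plain 64-bit integers rather than LLVM `Constant` values. It only answers two questions: is the vector a splat (a broadcast candidate), and if not, what is the narrowest element width that a zero or sign extension could widen back to the original values?

```cpp
#include <cstdint>
#include <initializer_list>
#include <vector>

// Hypothetical illustration types; these are NOT LLVM APIs.
enum class ExtKind { None, Zero, Sign };

struct NarrowPlan {
  unsigned Bits; // narrowest usable element width, 0 if none was found
  ExtKind Kind;  // extension that would recreate the original values
};

// A uniform (splat) constant is a broadcast candidate.
bool isUniform(const std::vector<int64_t> &Elts) {
  if (Elts.empty())
    return false;
  for (int64_t E : Elts)
    if (E != Elts.front())
      return false;
  return true;
}

// True if V survives truncation to Bits followed by zero extension.
static bool fitsUnsigned(int64_t V, unsigned Bits) {
  return V >= 0 && static_cast<uint64_t>(V) < (uint64_t{1} << Bits);
}

// True if V survives truncation to Bits followed by sign extension.
static bool fitsSigned(int64_t V, unsigned Bits) {
  const int64_t Limit = int64_t{1} << (Bits - 1);
  return V >= -Limit && V < Limit;
}

// Try 8, 16 and 32 bits in turn; prefer zero extension when both work.
NarrowPlan findNarrowing(const std::vector<int64_t> &Elts) {
  if (Elts.empty())
    return {0, ExtKind::None};
  for (unsigned Bits : {8u, 16u, 32u}) {
    bool AllZext = true, AllSext = true;
    for (int64_t E : Elts) {
      AllZext = AllZext && fitsUnsigned(E, Bits);
      AllSext = AllSext && fitsSigned(E, Bits);
    }
    if (AllZext)
      return {Bits, ExtKind::Zero};
    if (AllSext)
      return {Bits, ExtKind::Sign};
  }
  return {0, ExtKind::None};
}
```

A real implementation would also have to check that the chosen source/destination width pair maps onto an available extension instruction for the subtarget, and that the smaller constant pool entry is actually a win; the sketch leaves both out.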
What's the reason to check hasInt256 rather than hasAVX2?