Patch to allow detectAVGPattern to handle vectors larger than the legal size (128 bits with SSE2, 256 bits with AVX2, 512 bits with AVX512BW) by splitting the vectors accordingly.
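As a rough illustration of the splitting arithmetic described above, here is a minimal standalone sketch. It does not use any LLVM API; the `Feature` enum and the `legalVectorBits` and `numSplitParts` helpers are hypothetical names invented for this example, and the actual patch may structure the split differently.

```cpp
// Hypothetical feature levels, mirroring the three cases named in the summary.
enum class Feature { SSE2, AVX2, AVX512BW };

// Widest legal vector width in bits for each assumed feature level.
inline unsigned legalVectorBits(Feature F) {
  switch (F) {
  case Feature::SSE2:
    return 128;
  case Feature::AVX2:
    return 256;
  case Feature::AVX512BW:
    return 512;
  }
  return 0;
}

// Number of legal-width pieces a vector of NumElems x EltBits would be
// split into. Returns 1 when the vector already fits in a legal register.
// Assumes the total width is a multiple of the legal width when it is larger.
inline unsigned numSplitParts(unsigned NumElems, unsigned EltBits, Feature F) {
  unsigned TotalBits = NumElems * EltBits;
  unsigned LegalBits = legalVectorBits(F);
  if (TotalBits <= LegalBits)
    return 1;
  return TotalBits / LegalBits;
}
```

Under these assumptions a v64i8 vector (512 bits) would be handled as four pieces with SSE2, two with AVX2, and one with AVX512BW.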
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
What happens with types that are a multiple of 128 bits,
| lib/Target/X86/X86ISelLowering.cpp | ||
|---|---|---|
| 33899–33900 | Isn't the modulo guaranteed true by the isPowerOf2_32(NumElems) check above? And if it weren't, it would mean we would do something different for a 384-bit vector with only SSE2 (since it's divisible by 128) than we would for AVX2 (since it's not divisible by 256). | |
Don't bother checking for whole vector sizes - the isPowerOf2_32 test will handle it.
@craig.topper I added a v48i8 test at rL321261 - do you think it's worth me generalizing this further to support irregularly sized vectors like that?
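The divisibility argument in this thread can be checked with a small standalone sketch. It assumes a power-of-two element width (8 bits here) and uses `isPow2` as a stand-in for `llvm::isPowerOf2_32`; `splitsEvenly` is a hypothetical helper, not part of the patch.

```cpp
#include <cstdint>

// Stand-in for llvm::isPowerOf2_32: true for a nonzero power of two.
inline bool isPow2(uint32_t V) { return V != 0 && (V & (V - 1)) == 0; }

// True when a vector of TotalBits divides into whole LegalBits pieces.
inline bool splitsEvenly(uint32_t TotalBits, uint32_t LegalBits) {
  return TotalBits % LegalBits == 0;
}

// With a power-of-two element count and element width, the total width is a
// power of two, so any total wider than a power-of-two legal size is a
// multiple of it. The modulo check is then redundant.
inline bool moduloIsRedundant(uint32_t NumElems, uint32_t EltBits,
                              uint32_t LegalBits) {
  uint32_t TotalBits = NumElems * EltBits;
  if (!isPow2(NumElems) || !isPow2(EltBits) || TotalBits <= LegalBits)
    return false; // Preconditions of the argument do not hold.
  return splitsEvenly(TotalBits, LegalBits);
}
```

A v48i8 vector (384 bits) shows the inconsistency a modulo-only check would create: 384 is a multiple of 128 but not of 256, so the outcome would depend on the subtarget. Since 48 is not a power of two, the isPowerOf2_32 check rejects it on every subtarget.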