- User Since
- May 5 2014, 7:26 AM (159 w, 12 h)
Test _mm256_cmp_pd as well?
Fri, May 19
Thu, May 18
I suggest that we just delete mul-i1024.ll before accepting this patch - any objections?
Why don't you use MOVMSKPD/MOVMSKPS for the 32/64 bit cases and avoid the vector truncation?
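For illustration only, here is a scalar model of what a MOVMSKPD-style instruction computes: the sign bit of each 64-bit lane is packed into the low bits of a scalar, so a compare result (all-ones or all-zeros per lane) becomes a bitmask without narrowing the vector first. The helper name `movmsk64_model` and the 32-lane limit are assumptions made for this sketch; it is not the lowering from the patch under review.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: scalar model of a MOVMSKPD-style operation.
 * Bit i of the result is the sign bit (bit 63) of lanes[i]. */
static unsigned movmsk64_model(const uint64_t *lanes, unsigned n)
{
    assert(lanes != 0 && n <= 32); /* the mask must fit in 32 bits */
    unsigned mask = 0;
    for (unsigned i = 0; i < n; ++i) {
        /* Shift the sign bit down to bit 0, then place it at bit i. */
        mask |= (unsigned)(lanes[i] >> 63) << i;
    }
    return mask;
}
```

Calling it on the four lanes `{all-ones, 0, all-ones, all-ones}` sets bits 0, 2 and 3 of the result.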
Add extra tests?
Wed, May 17
The test files need some attention.
LGTM - I'm happy for the missed constant fold to be handled in a followup, add a TODO if you can.
I think this is sound, but I'm not an expert on any of this. Any other comments?
Tue, May 16
Please can you regenerate the diff with context?
This needs to be done in general, not just for Jaguar. Please can you add WriteFHAdd and WriteVecHAdd defs in X86Schedule.td, and then tag the relevant instructions in X86InstrSSE.td and X86InstrAVX512.td. Then in X86ScheduleBtVer2.td you need to add instances of the two defs and special-case the ymm versions. Either add TODOs for the other x86 models or add them if you want to dig through Agner's tables.
Mon, May 15
A possible addition would be to custom lower i8/i16 vectors with a trunc(popcnt(zext(x))) pattern.
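A minimal scalar sketch of the trunc(popcnt(zext(x))) idea, assuming the target only has a native population count for wider elements: each i8 lane is zero-extended, counted at the wider width, and the count is truncated back. Zero extension adds no set bits, so the count is unchanged and always fits in the narrow type. The names `popcount32` and `popcnt_u8_lanes` are hypothetical; this models the semantics and is not the actual DAG lowering.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Portable 32-bit population count: clears the lowest set bit each step. */
static unsigned popcount32(uint32_t x)
{
    unsigned count = 0;
    while (x != 0) {
        x &= x - 1;
        ++count;
    }
    return count;
}

/* Hypothetical helper: per-lane trunc(popcnt(zext(x))) for 8-bit lanes. */
static void popcnt_u8_lanes(uint8_t *dst, const uint8_t *src, size_t n)
{
    assert(dst != 0 && src != 0);
    for (size_t i = 0; i < n; ++i) {
        uint32_t wide = src[i];            /* zext i8 -> i32 */
        unsigned bits = popcount32(wide);  /* popcnt at the wide width */
        dst[i] = (uint8_t)bits;            /* trunc i32 -> i8 (at most 8) */
    }
}
```

The same reasoning carries over to i16 lanes, where the count is at most 16.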
Sun, May 14
Please can you rebase against the latest trunk?
Sat, May 13
Fri, May 12
LGTM - with a future patch to investigate using ExecutionDepsFix
@mcrosier Any comments?
Is this patch still relevant?
LGTM with a couple of minor queries
Thu, May 11
Wed, May 10
Tue, May 9
Should X86Local be inside the llvm namespace?
Ryzen returns 0xffffffff as well.
Mon, May 8
Added v4i64/AVX2 assertion