- User Since
- Apr 25 2016, 3:58 AM (103 w, 6 d)
Sun, Apr 8
Sat, Apr 7
It shouldn't differ.
The xmm version has a 1-cycle latency and the ymm version has a 2-cycle latency, for both AVX and SSE.
Thu, Mar 29
Tue, Mar 27
Sun, Mar 25
Looks good to me!
Mar 21 2018
Aug 30 2017
Updated for review comments from Craig Topper!
Aug 22 2017
Simon, if you are okay with the patch, could you please commit it on my behalf?
Updated as per Javed's comments!
Aug 20 2017
Updated the patch as per Simon's comments.
Added the FP instruction itineraries, which include the SSE4A and SHA instructions.
Aug 18 2017
Yes, Simon! I will include the SSE4A instructions and their itineraries in the next patch, along with tests verifying them.
If this patch is okay, could you please commit it on my behalf?
Simon, Craig Topper! My next increment is ready. If this patch can be accepted and committed, I will rebase and submit the next patch.
Or should I submit the next patch as an incremental patch with the changes put forth in this patch? Please help!
Aug 14 2017
Updated for the itineraries of memory variants of the instructions.
Aug 11 2017
Jul 19 2017
Jul 18 2017
Simon, if you are fine with it, could you please commit the patch on my behalf? I do not have commit access yet. I will probably try to get it after this patch.
Patch update: For newer testcases.
Jul 17 2017
Updated as per Javed's review comments!
Jul 16 2017
Updated as per the review comments.
Jul 12 2017
Feb 8 2017
Thank you @craig.topper.
@craig.topper If you are okay, can you please commit the changes on my behalf?
I think it is okay even if we don't set the mayStore attribute.
I wrote a simple test to check the following:
- Schedules based on the instruction attribute
- Side-effect handling
Feb 7 2017
Updated the test file "x86-32.s" with a clzero-only test!
Updated the builtins test for "__builtin_ia32_clzero"
Updated for review comments.
Updated for the review comments.
Feb 1 2017
Jan 9 2017
If this is okay, could you please commit these on my behalf? I don't have write access.
Yes, true. I mentioned that with regard to the grouping or the order of the enabled features. The initFeatureMap entries are based on the intrinsics and the CodeGen part.
Adding znver1 to the following tests.
b. Slow SHLD
c. Slow unaligned memory
Falling back to CK_BTVER1 is OK, but falling back to CK_BTVER2 is not possible because of the partial YMM writes. The behavior on znver1 differs between AVX instructions and their legacy SIMD counterparts. So, for now, I am leaving them in alphabetical order.
Jan 8 2017
The clzero intrinsic handling and feature addition will be handled as a separate patch.
Added movbe and sse4a into ISA list of znver1.
The clzero builtins and feature addition will be handled separately in another patch.
SSE4a and movbe are added to the ISA list.
Dec 21 2016
I am preparing a patch that does not include the clzero feature.
I will submit a separate patch for the clzero feature.
May 17 2016
May 13 2016
Added FeatureMWAITX to bdver4.
May 11 2016
Incorporated comments from Simon!
May 9 2016