
[X86] Emit KTEST when possible

Authored by davezarzycki on Oct 17 2019, 8:17 AM.



Event Timeline

davezarzycki created this revision. Oct 17 2019, 8:17 AM

What's the benefit of this? We already turn KORTEST+KAND into KTEST at the end of isel. I was trying to favor masked compares over keeping two registers alive into the KTEST. The patch for that was here
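To make the KORTEST+KAND versus KTEST point concrete, here is a minimal portable sketch that models only the zero flag (ZF) of the two instruction sequences on 8-bit masks. The helper names are hypothetical and are not LLVM or intrinsic APIs; the model assumes ZF is set when the OR (for KORTEST) or the AND (for KTEST) of the two mask operands is all zeros.

```cpp
#include <cstdint>

// Hypothetical model of an 8-bit k-register; not an LLVM or intrinsics type.
using Mask8 = std::uint8_t;

// KORTEST a, b: ZF is set when (a | b) has no bits set.
constexpr bool kortest_zf(Mask8 a, Mask8 b) { return (a | b) == 0; }

// KTEST a, b: ZF is set when (a & b) has no bits set.
constexpr bool ktest_zf(Mask8 a, Mask8 b) { return (a & b) == 0; }

// KAND into a temporary, then KORTEST of that temporary with itself.
constexpr bool kand_then_kortest_zf(Mask8 a, Mask8 b) {
    return kortest_zf(static_cast<Mask8>(a & b), static_cast<Mask8>(a & b));
}

// Exhaustively compare the two forms over every pair of 8-bit masks.
bool forms_agree_for_all_masks() {
    for (int a = 0; a < 256; ++a) {
        for (int b = 0; b < 256; ++b) {
            const Mask8 ma = static_cast<Mask8>(a);
            const Mask8 mb = static_cast<Mask8>(b);
            if (ktest_zf(ma, mb) != kand_then_kortest_zf(ma, mb)) return false;
        }
    }
    return true;
}
```

In this model the two-instruction form and the single KTEST always produce the same ZF, which is why the fold is safe whenever only ZF is consumed.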

davezarzycki added a comment. Edited Oct 17 2019, 11:03 AM

I agree that masked compare instructions are an important code density optimization. That being said, overreliance on this optimization can decrease overall instruction-level parallelism by over-serializing the instruction stream. If we can trust Agner Fog's data, the AVX512 comparison instructions have a 3-cycle latency and a throughput of 1 instruction per cycle on SKX (and later), and a 2-cycle latency and a throughput of 2 instructions per cycle on KNL (and later). Therefore, code before this change would needlessly serialize comparison instructions when they could run in parallel.
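The two code shapes being compared can be sketched with a portable model, assuming 8-lane vectors and 8-bit masks. All names here are hypothetical stand-ins rather than real intrinsics. In the first shape the second compare takes the first compare's mask as an input, so it cannot start until that mask is ready; in the second shape the two compares share no inputs and only the final test needs both masks.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Vec8 = std::array<int, 8>;  // stand-in for a vector register with 8 lanes
using Mask8 = std::uint8_t;       // stand-in for a k-register

// Unmasked compare: bit i is set when lane i of a equals lane i of b.
Mask8 cmp_eq(const Vec8& a, const Vec8& b) {
    Mask8 k = 0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        if (a[i] == b[i]) k = static_cast<Mask8>(k | (1u << i));
    }
    return k;
}

// Masked compare: lanes whose bit in k is clear always produce 0,
// so the result depends on k being available first.
Mask8 masked_cmp_eq(Mask8 k, const Vec8& a, const Vec8& b) {
    return static_cast<Mask8>(cmp_eq(a, b) & k);
}

// Shape A (serialized): the second compare consumes the first mask.
bool any_lane_serialized(const Vec8& a, const Vec8& b,
                         const Vec8& c, const Vec8& d) {
    const Mask8 k1 = cmp_eq(a, b);
    const Mask8 k2 = masked_cmp_eq(k1, c, d);  // waits on k1
    return k2 != 0;                            // models KORTEST k2, k2 with ZF clear
}

// Shape B (parallel): two independent compares, then one KTEST-style check.
bool any_lane_parallel(const Vec8& a, const Vec8& b,
                       const Vec8& c, const Vec8& d) {
    const Mask8 k1 = cmp_eq(a, b);  // independent of k2
    const Mask8 k2 = cmp_eq(c, d);  // independent of k1
    return (k1 & k2) != 0;          // models KTEST k1, k2 with ZF clear
}
```

Both shapes compute the same answer. The difference is the dependency chain: with the 3-cycle compare latency quoted above for SKX, shape A would put two compares back to back on the critical path, while shape B lets them overlap at the cost of keeping two mask registers live until the test.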

This revision is now accepted and ready to land. Oct 17 2019, 1:15 PM