User Details
- User Since: Apr 4 2022, 3:15 PM (50 w, 6 d)
Nov 29 2022
Our local testing with this patch shows that it solves our issue. Thanks very much! Looking forward to it landing.
Sep 20 2022
Whoops, I forgot to reference this revision when I committed the patch. This was bccc9aa81c1c1d212acd3314895731ec4de30e35.
Sep 7 2022
Jul 19 2022
Jul 18 2022
Jun 15 2022
Jun 14 2022
Rebased; fixed a copy-paste error in getFMACostSavings(); ensured that getFMACostSavings() returns nonnegative values; and made a slight simplification in adjustForFMAs(). Thanks!
Jun 3 2022
May 25 2022
May 23 2022
This revision now includes fixes for both cases: a horizontal reduction of fadds fed by a vectorized tree rooted at fmuls, and a vectorized tree of fmuls that feeds arbitrary fadds and/or fsubs. It was misguided to break this up into two patches without showing you both...
May 19 2022
You're welcome! Thanks for the good advice!
May 18 2022
Thanks for the helpful comments to date! In this version, I've managed to remove the undefs from the original test. I also added a second test that removes the loop structure. For both tests, today we will generate an unprofitable horizontal reduction. With the first test, adding cost modeling to constrain the horizontal reduction allows FMAs to be generated. With the second test, this is insufficient, as we then decide to vectorize the multiplies in an unprofitable way. The two tests demonstrate the need to account for lost FMAs in the cost modeling both when vectorizing for a reduction and when vectorizing a list of multiplies.
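To make the cost argument above concrete, here is a toy sketch of why ignoring FMAs can make a horizontal reduction look profitable when it is not. It is not the patch's code and does not use LLVM's cost model: the unit costs and the function names (scalarCostNoFMA, scalarCostWithFMA, vectorReductionCost, looksProfitable) are hypothetical, and it assumes a power-of-two number of products that fits in one vector register.

```cpp
#include <cassert>

// Toy unit costs, chosen only for illustration. These are assumptions,
// not values from LLVM's TargetTransformInfo.
constexpr int kScalarOp = 1; // one scalar fmul, fadd, or fma
constexpr int kVectorOp = 1; // one vector fmul or fadd
constexpr int kShuffle = 1;  // one shuffle in the reduction tree

// Scalar cost of a[0]*b[0] + ... + a[n-1]*b[n-1] when multiplies and
// adds are counted separately: n fmuls plus (n - 1) fadds.
constexpr int scalarCostNoFMA(int n) {
  return n * kScalarOp + (n - 1) * kScalarOp;
}

// Scalar cost when each fadd can fuse with its fmul:
// one fmul plus (n - 1) fused multiply-adds.
constexpr int scalarCostWithFMA(int n) { return n * kScalarOp; }

// Vector cost: one vector fmul, then log2(n) shuffle + fadd steps for
// the horizontal reduction (n is assumed to be a power of two).
constexpr int vectorReductionCost(int n) {
  int steps = 0;
  for (int width = n; width > 1; width /= 2)
    ++steps;
  return kVectorOp + steps * (kShuffle + kVectorOp);
}

// Vectorization looks profitable when the vector cost is strictly lower.
constexpr bool looksProfitable(int scalarCost, int vectorCost) {
  return vectorCost < scalarCost;
}

// With four products, ignoring FMAs makes the reduction look like a win...
static_assert(looksProfitable(scalarCostNoFMA(4), vectorReductionCost(4)),
              "reduction appears profitable when FMAs are ignored");
// ...but it is a loss once the scalar FMAs are accounted for.
static_assert(!looksProfitable(scalarCostWithFMA(4), vectorReductionCost(4)),
              "reduction is unprofitable when FMAs are counted");
```

Under these toy costs, four products cost 7 as separate scalar multiplies and adds, 4 as scalar FMAs, and 5 as a vector multiply followed by a reduction, so the decision flips once the lost FMAs are counted.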
May 17 2022
May 11 2022
May 10 2022
Hi! I'd like to ping this revision, please.
May 4 2022
I've made all requested changes, with the exception that I can't remove the loop structure or any of the undefs without breaking the test. In both cases, we no longer generate the horizontal reduction. I've made all the reductions Alexey requested, and changed the variable names as Vasileios requested.