- User Since: Jan 25 2017, 7:29 AM (133 w, 4 d)
Mon, Jul 29
Fri, Jul 26
Managed to get the fmac test to keep using fmac
Also updated the test to use non-anonymous values
Changed test to use fma intrinsic
+Stas to comment on the v_fmac_f16 test change.
Is it acceptable to change the result to look for v_pk_fma_f16 rather than 2 v_fmac_f16 instructions? If not, any suggestions on how to get the compiler to generate 2 x fmac instead?
Jul 17 2019
I might revisit this one - setting cumode seems a messy way to enable driver control of the WGP setting, but it seems the most pragmatic option at the moment.
Jul 15 2019
Jun 21 2019
May 9 2019
Apr 23 2019
Mar 20 2019
Mar 18 2019
Mar 12 2019
Mar 11 2019
Mar 7 2019
Modified test in line with review comments
Mar 6 2019
Mar 5 2019
Mar 4 2019
Feb 11 2019
Not really an area I'm 100% sure about - but looks ok to me. One of the other reviewers will have to sign off too.
Minor niggle on the comment (if my understanding is correct).
Feb 4 2019
Jan 14 2019
Dec 11 2018
LGTM - but this probably needs approval from one of the other reviewers as well.
Nov 29 2018
Nov 28 2018
Nov 27 2018
Nov 19 2018
Thanks for the review - made all the suggested changes
Nov 13 2018
Nov 9 2018
Scott - you made some changes here most recently so adding you as the reviewer.
Nov 7 2018
Added Neil as a reviewer as I've made some changes to some of his a16 tests. I'm pretty certain that the modifications are correct, but wanted to get feedback on that as well.
Modified based on review feedback from Nicolai
Nov 6 2018
This patch has been superseded by D49097 - so I'm abandoning this one.
Oct 30 2018
Minor changes made.
Made minor code changes suggested in review
Oct 24 2018
@arsenm Matt, any more comments? Would you be happy with a clarification comment as per the last suggestion from me?
@arsenm Matt - good to go?
Covered all the requested changes (I think). Also implemented a test to make sure that simplifyDemanded doesn't run when TFE/LWE is enabled.
Changed the implementation of the intrinsic return type to be an aggregate type
Sep 24 2018
Updated a mov to a copy as per review comment
Sep 14 2018
Sep 13 2018
Moved foldToImm into SIInstrInfo as suggested
Implemented check in verifyInstruction and checked that it worked when the fix was removed
Added missing break
De-duplicated 1/2PI constant
Sep 12 2018
Made suggested change
Made suggested changes
Sep 11 2018
Folded in most of the changes highlighted in the review
Looking again at the code - you're correct that it attempts to only do this transformation if the high bits are zero.
However, the code that checks this has the following telling comment:
Jul 23 2018
LGTM - but you might want further reviews from others not so involved in implementation.
Jul 2 2018
@nhaehnle - I've just added you as a reviewer for now.