- User Since
- Oct 13 2017, 7:01 AM (137 w, 2 d)
Wed, May 20
As Nicolai suggested, I added an operand that encodes whether A16 and G16 are used.
Apr 27 2020
Apr 9 2020
Apr 2 2020
What is the best way to handle A16 and G16 in instruction selection for GlobalISel?
Mar 31 2020
I think any operand changes should be done during legalization, and selection should be relatively simple. We already do more of this type of conversion during legalize, so I think it makes sense to consolidate it there.
What is the reason to do this during legalization?
Could we look for a constant zero argument during instruction selection?
Mar 30 2020
Add uniformity test
Return instead of calling report_fatal_error
Mar 26 2020
Thank you for the fast review!
Anyone willing to review this?
Forgot to run formatter
Mar 23 2020
Fix formatting issue
Move the code to lowering again; I’m back where Jay started.
Report a fatal error if the size is neither i32 nor i64.
Mar 20 2020
I removed the COPY_TO_REGCLASS because it looked flaky and does not work with GlobalISel.
Instead, the ballot intrinsic is morphed into an AMDGPUISD::SETCC. Compares are the only reasonable way to get a boolean value into the wavefront form as an i32/i64 and use it in LLVM.
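To illustrate why a compare is a natural fit here, the following is a minimal Python model of what a ballot computes: one result bit per lane, set when the lane is active and its condition holds. It is only a sketch of the semantics, not the LLVM lowering; the names ballot, cond, and exec_mask are made up for this example.

```python
def ballot(cond, exec_mask, wave_size=32):
    """Model a wave-wide ballot (illustrative only, not LLVM code).

    cond      -- list of per-lane booleans (hypothetical input)
    exec_mask -- integer whose bit i says whether lane i is active
    wave_size -- 32 or 64, matching an i32 or i64 result
    """
    if wave_size not in (32, 64):
        raise ValueError("wave size must be 32 or 64")
    if len(cond) != wave_size:
        raise ValueError("need one condition per lane")

    result = 0
    for lane in range(wave_size):
        lane_active = (exec_mask >> lane) & 1
        # A lane contributes a 1 bit only if it is active and its
        # condition is true, which is what a wave-wide compare yields.
        if lane_active and cond[lane]:
            result |= 1 << lane
    return result
```

For example, with only the low four lanes active and the condition true on even lanes, ballot([i % 2 == 0 for i in range(32)], 0xF) sets bits 0 and 2.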
Mar 19 2020
Mar 18 2020
Add missing wave32 instcombining
Use getCopyFromReg(exec) and rebase on fixed whole-wave-mode.
Unfortunately this does not work anymore with the updated ballot intrinsic. I’ll leave this for later, see also D65088.
Mar 17 2020
Use mayReadEXEC as suggested by Matt.
Mar 16 2020
Mar 13 2020
Thanks Matt, I’ll use CopyFromReg.
Mar 12 2020
Mar 11 2020
Address Nicolai’s comments and implement this as DAG combines and TableGen patterns.
Mar 10 2020
Mar 9 2020
I’ll add a test case. Yes, this is related to the atomic optimizer and ballot intrinsic. There we get e.g. %0:vgpr_32 = V_MBCNT_LO_U32_B32_e64 $exec_lo, 0, implicit $exec.
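For readers unfamiliar with the instruction, here is a rough Python model of V_MBCNT_LO_U32_B32 as I understand it: it counts the set bits of the low 32 mask bits that belong to lanes below the current lane, then adds the second operand. The function name mbcnt_lo and its parameters are made up for this sketch, and the exact hardware behavior should be checked against the ISA documentation.

```python
def mbcnt_lo(mask, addend, lane_id):
    """Illustrative model of V_MBCNT_LO_U32_B32 (assumed semantics).

    mask    -- first source operand, e.g. the value of exec_lo
    addend  -- second source operand (0 in the example above)
    lane_id -- index of the lane executing the instruction
    """
    # Bits for all lanes strictly below lane_id, clamped to the low 32 lanes.
    lanes_below = (1 << min(lane_id, 32)) - 1
    return bin(mask & 0xFFFFFFFF & lanes_below).count("1") + addend
```

With the mask set to exec_lo and an addend of 0, each lane gets the number of active lanes below it, which is why the result depends on exec through the operand and not only through the implicit use.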
No specific reason. I fixed the other tests manually, but this one had many operations with exec, so I wanted to make sure I didn’t break anything there.
Remove accidentally added empty line
Feb 17 2020
Fix test. I hope the existing test suffices.
Oct 14 2019
Is there anything missing for this pull request?
May 18 2019
The ssp is now set to zero in the SDAG variant and when a stack guard check function is used. I am not familiar with the selection DAG, so please correct me if it can be improved.
I also added a test as requested.
Sep 6 2018
It took a while, but the code, figures, and a more detailed explanation are now online: https://flakebi.de/uni-items/ba/#clear-ssp
Even with this microbenchmark, I was not able to measure a performance difference compared to LLVM without this patch.