Compiler Warlock @ Unity
User Details
- User Since: Dec 6 2013, 2:54 AM (384 w, 1 d)
Feb 19 2020
Feb 17 2020
Feb 3 2020
@hfinkel ping?
Jan 28 2020
Use combineMetadataForCSE to combine the metadata.
Jan 27 2020
Jan 24 2020
Jan 6 2020
Dec 13 2019
Dec 11 2019
@nhaehnle I just added you so that you are aware of this change. In theory it could slightly change codegen on AMDGPU, but I think you don't actually return any more precise noalias state, so it should (tm) be fine. I thought it was worth you OKing this either way!
Jun 20 2019
May 24 2019
May 22 2019
Guarded the change by an option to allow users of the structurizer to turn this functionality on in their own time.
May 21 2019
May 1 2019
Apr 30 2019
Added the extra test case that @spatel found that my change also fixes.
@spatel yup, it's the same bug, and my fix fixes it too. I'll add it to my test changes.
@reames ping?
Apr 23 2019
Fix some review comments by @spatel
Apr 12 2019
Apr 1 2019
Mar 29 2019
Found a fun little bug whereby the phys vgprs were being coalesced onto previous instructions, and then shouldClusterMemOps was assuming only virt regs. Added a workaround for that.
Mar 22 2019
Update for two reasons:
Mar 21 2019
Fix review comments.
Mar 19 2019
Added the max3 cases too.
Changed it to a whitelist of types rather than a blacklist (much better idea).
Fixed review comments by Matt and added the extra test cases that were a great idea!
Mar 18 2019
Reduce the number of DPP calls in the test for cleanliness, and reintroduce the convergent attribute on WWM. The CFG test contains the bug that was exposed by the lack of convergent on WWM: without it, LLVM will sink the WWM statement out of the branch, which totally messes up all calculations.
Remove the explicit pass name.
Addressed all review comments.
Mar 13 2019
Mar 7 2019
Mar 6 2019
Add a test case that triggers the target transform info code path.
Mar 5 2019
LGTM.
LGTM.
Feb 11 2019
Fix comment that should now say exclusive scan instead of inclusive scan.
Feb 8 2019
Make the final readlane be in the WWM section too as per the review comments.
Feb 7 2019
Feb 6 2019
Updated to bring in an additional fix to remove read_register exec and replace it with a ballot.
Feb 5 2019
Jan 29 2019
Jan 28 2019
Fixed review comments.
Jan 18 2019
Fix review comments.
Jan 17 2019
Review comments.
Also tag s.getpc as scalar, and add tests in the Analysis folder as suggested.
@arsenm agreed, but I'm not quite sure how to write tests for this. I started to look at how we test alias analysis: there is a cl::opt to force-print the debug info for that pass, and that is how our amdgpu-alias-analysis.ll tests the aliasing rules. How exactly would I test that divergence analysis did something here?
Jan 10 2019
@cwabbott I planned to do a followup once this DPP change had landed to add the missing dpp/codegen patterns to the atomic optimizer - so watch this space!
Jan 9 2019
Fix review comment.
Jan 8 2019
Fix review comments.