- User Since
- Jan 13 2015, 12:58 PM (183 w, 3 d)
Tue, Jul 17
Thu, Jul 5
Thanks @vivekvpandya for getting back to me.
I'll work on updating the patch now. I think we're on the same page with the potential problems.
Essentially, there seems to be some assumptions that when EnableIIPRA is set, it will work properly and add additional information that can be used. Now, with the early exit due to isDefinitionExact, that is not necessarily the case. So, I'm trying to put a check in to validate that IPRA was successful before using the information. This way, if we extend the checks in IPRA in the future, we don't need to continually replicate them in other places that rely on IPRA being successful.
Wed, Jun 27
@vivekvpandya You're right - the change will allow noCSROpt to run on functions with external linkage, which is not correct.
I think what we really want here is the isSafeForNoCSROpt to check for the existing conditions (i.e., prior to this patch, when it looks for local linkage) and also whether IPRA was successful for this function. I added the isDefinitionExact() call, to try and replicate the necessary conditions for IPRA, but now I'm wondering if it would be better to query the PRUI and make sure it contains a valid RegMask for the given function. Does this seem reasonable to you?
Mon, Jun 25
Jun 6 2018
Added the isDefinitionExact check to isSafeForNoCSROpt method.
Updated the test case to use -print-regusage to check for exact register clobbers computed by IPRA.
Apr 4 2018
Oct 24 2017
Oct 23 2017
Aug 22 2017
Did @nemanjai comment about the vector extracts get addressed?
I would change the summary of this review to reflect the fact we are moving this pass back to the PPC target.
Aside from that, this LGTM.
Aside from some minor changes to the comments, this LGTM.
Jul 10 2017
LGTM. Just some minor updates in the comments.
Jun 13 2017
Jun 12 2017
Jun 5 2017
I'm OK with this patch and then a follow-up patch to enable the new behaviour by default.
@arsenm Are you OK with this approach or do you prefer a different approach?
This looks OK to me, but I don't have enough knowledge of SROA to accept this.
If I understand this correctly, you're trying to use the tryBitfieldInsert before the tryBitPermutation function because you can get cleaner code out of some cases of tryBitfieldInsert. However, you only want to use the tryBitfieldInsert on some cases, not on all. Is this correct?
If so, I think it makes sense to refactor the tryBitfieldInsert to focus on the specific cases you want to try and identify and then call the existing tryBitPermutation and the remaining tryBitfieldInsert afterwards. There may be some code duplication as a result (hopefully that can be minimal) but the logic will be easier to understand.
May 25 2017
Aside from a spelling mistake, LGTM.
May 16 2017
Thanks for adding the additional test case!
May 12 2017
May 4 2017
Aside from two minor comments, this LGTM.
Please make sure to update the release notes when this gets committed.
Please add the PR that this problem is addressing to the summary.
D32763 only handles the trailing fences. Is there going to be another patch that deals with the leading fences as well?
Mar 1 2017
Feb 8 2017
Aside from a couple minor comments, this LGTM.
Feb 1 2017
Jan 13 2017
Aside from updating two comments, this LGTM.
Jan 11 2017
Jan 4 2017
Sorry, I don't have time to go through the entire patch in detail right now. But I did notice several places where the lines are too long, which need to get fixed.
There are still a few things that need to get cleaned up.
Dec 14 2016
Dec 1 2016
I looked through this, but my review is mostly superficial as I don't have the background to be able to comment on the logic here.
Aside from some minor comments, this LGTM.
Please make explicit the signed for the parameters to the functions you are changing and remove unnecessary casts. I marked the first few that I found, but stopped marking them after the first several.
Need to address Hal's comments and post a subsequent patch. Thanks.
Nov 22 2016
LGTM, but I'd like @nemanjai to have a quick look at the builtin_vsx_insertword and builtin_vsx_extractuword too.
Nov 21 2016
There are some things that need to get cleaned up.
Nov 18 2016
The patterns for the int_ppc_vsx_lxvw4x and int_ppc_vsx_stxvw4x are still here.
These patterns should be intercepted and dealt with earlier, but if they are not they will match here and end up generating correct code. Can you please remove these patterns for LE. That way if they survive to this point, the program should fail to compile instead of generating incorrect code.
Nov 15 2016
Did you upload a new patch? The diff doesn't appear to change for me.
There are several inline comments that need to be addressed.
I also think it's worthwhile putting a comment at the top of the review indicating the (assumed) semantics for the vec_sube and vec_subec instructions that are being implemented (i.e., the behaviour mimics the vec_subeuqm instructions and thus uses one's compliment plus the carry)
Nov 12 2016
Nov 11 2016
Nov 10 2016
Aside from some minor formatting of comments, this LGTM.
Nov 9 2016
One minor question, more for my own curiosity, that we can likely address offline. Aside from that, LGTM.
Nov 8 2016
With the suggestions above, this LGTM.
Nov 7 2016
Oct 31 2016
I think the loop on line 10619 needs to be reworked a bit (or I don't understand what it's trying to do).
Aside from some nits about comments and formatting this LGTM.
Can you please add some more comments in the summary about the intention of this group of patches?
Aside from some formatting issues, this LGTM.
Oct 27 2016
Oct 25 2016
I went through this fairly carefully and couldn't spot anything wrong. I think there are some opportunities to make some of the methods const, but that would be outside the scope of this patch.