- User Since
- Jan 26 2016, 2:17 AM (94 w, 5 d)
Thu, Nov 16
Looks good to me.
Wed, Nov 15
Hi Sam, that's a very nice optimisation, but I think we need more tests. The "variables" that we have here are the values types (load byte, load half, load word), and the different possible offsets. Guessing a bit here, but that must be something like 3 * 3(?) = 9 tests.
And another question is how this interacts with -munaligned-access and -mno-unaligned-access?
Tue, Nov 7
Thu, Nov 2
Many thanks for the reviews and suggestions! Comments addressed.
Tue, Oct 31
Addressed the comments about portability, and added a note that Float16 is available in both C and C++ mode.
Oct 16 2017
Oct 13 2017
Oct 12 2017
Thanks for reviewing/checking. The regexp class implements both BRE and ERE, but it defaults to using ERE. Thus FileCheck defaults to ERE. As the mode is not exposed to the user (with e.g. an option), I kept it simple and now say that FileCheck supports ERE.
Oct 11 2017
I would like to add links for BRE and ERE, to get a quick syntax overview. Not sure what page would be best for that, and if we e.g. can or like to link to wikipedia (https://en.wikipedia.org/wiki/Regular_expression#POSIX_basic_and_extended) or something else.
Oct 10 2017
Oct 4 2017
Just clarifying a typo in my previous comment:
- removed the f16 type from the calling conventions except for ARM_AAPCS_VFP. It's necessary to add them elsewhere, the default behaviour is what we want.
- added a test case using the hard float abi, but without fullfp16 support.
- added some comments about the HPR register class.
Oct 3 2017
Hi Sam, many thanks for checking! Cheers, Sjoerd.
Oct 2 2017
Added some more.
Realised that I am not testing hard float abi without fullfp16 support. And now see that code generation is not really optimal: it is generating eabi calls for the fp16 <-> fp conversions, whereas it should use the normal convert instructions. Fixing this.
Sep 29 2017
Looks good to me. Thanks.
Sep 27 2017
Sep 26 2017
Looks good to me (sorry I missed this in the other/previous review; I looked at it and thought it was alright...)
Sep 22 2017
looks reasonable to me
The context of this work was to recognise loop induction statements in Machine Instructions/Loops and to allow calculation of loop iteration counts. And isAdd provides more semantic information about a MI and helps recognising induction variables. Thus, this was the ground work for D27193, which was the next step to recognise loop induction. This patch was abandoned because the Quentin's comment was that the analysis was too ad-hoc and missing some formalism. The reason this is used only in the Hexagon backend, is because I refactored it and took it from the Hexagon backend. The Hexagon backend has a lot of MachineLoop/MI analysis and other goodness (sw pipeliner) that is mostly target independent and could be made available, but a few target specific hooks are required to do this.
Now, I really would like to pick up this work again because I still think it will be valuable, but need to rethink strategy after abandoning D27193. Probably that will be a some SVEC-like approach, but hopefully that can be a light-weight implementation to recognise simple loops (which is what we're doing anyway). This will probably be quite some work, so in the mean time, I really don't mind removing or moving isAdd back to the Hexagon backend. I could also dig up the exact requirements and document that if it's okay to keep it.
Sep 18 2017
Thanks for checking!
I was just replying, but yes, there are only 4 bits available to encode Vm, the other M bit is the index.
Hi, the full architecture specification is publicly available here:
Sep 15 2017
Looks reasonable to me.
Sep 13 2017
many thanks for reviewing and your help.
Fixed the typos, and added tests.
Sep 12 2017
Sep 11 2017
Sep 8 2017
Many thanks for reviewing and your help!
Sep 6 2017
I am going to commit this within a few days. That looks reasonable to me given that the comments in the last reviews were very minor (which I have of course addressed already). Also, in case of issues, I am guessing fixes and/or addition can be easily done post-commit, and if not a revert is cheap.
Aug 30 2017
Hi Sam, many thanks for refactoring this. Looks really good now! Just 2 nits inlined, but no need for another review.
Comments addressed. Thanks for reviewing.
Aug 29 2017
No changes were needed to make the conversions work, the existing logic is taking care of that, but I agree it doesn't hurt to add a few test cases. So I've added tests to both files, and cleaned up that comment.
Thanks for reviewing! I've addressed your comments: have set operation action to 'promote', added checks to test fcmp, and also added checks when fp16 is disabled.
Fixed typo in test case type declaration.
Forgot to mention that I will address the v8f16 types in a separate patch.
Aug 25 2017
Aug 24 2017
Thanks for fixing this.
Added test case.
Looks good to me, was wondering if you're also planning to add a Clang patch to test +rdm?
Aug 23 2017
Just to clarify my last comment, I will repeat the exercise I did in D36396 allowing f16 scalars, and will now start working on allowing f16 vector types. I will then also add tests for these.
I have removed the special case, and added the vector types.
For the copysign intrinsics working on these vectors, I have not yet added new test cases, because it looks like I first have to first do some work to allow f16 vectors.
thanks for reviewing!
Added missing spaces.
Aug 22 2017
Hi Alex, thanks for the feedback. The difficulty here is finding the right balance of adding (good) debug messages, and not cluttering the code too much. I think what you are suggesting is what I had in my first revision: that had debug messages in the different AArch64 lowering functions. Then Sam's suggestion was to move some of these messages to SelectionDAG (which we did in D36984), which has 2 advantages: all targets benefit, and it does not clutter up the code. Also, I think this is quite difficult to get completely right the first time. So I think we will have to see what works and what is good and I will be adding more stuff as I go along. There's a lot of room for improvement here, but I already find these extra debug messages quite useful...
Removed the debug messages that were moved to SelectionDAG in D36984
Hi Sam, thanks for reviewing. I have addressed your comments. Cheers.
Just added a few more debug messages.
Aug 21 2017
I will split this patch up in a target dependent, and target independent patch. I will continue here with the AArch64 specific messages, and I have created D36984 for some of the debug messages that I moved to SelectionDAG getNode functions.
The disadvantage of moving it to SelectionDAG is that we lose the context where the new nodes are created. In other words, the debug message will (potentially) become less specific. But since they currently do not have an awful lot of context, I think your suggestion makes sense. I will experiment to see if it works.
Looks reasonable to me.