- User Since
- Apr 10 2015, 3:10 PM (232 w, 2 d)
Aug 14 2019
Aug 9 2019
Thanks for the patch! My only concern is with all the checks in the test case. Checking for the exact code sequence can be very sensitive to other changes in the compiler that are unrelated to this patch.
Jun 25 2019
Looks good to me. Thanks for the fix!
May 24 2019
Looks good to me. Thanks for the contribution!
Apr 12 2019
Apr 11 2019
Apr 9 2019
Thanks for the patch! I've been trying to create a test case, but haven't had much luck.
Mar 29 2019
Mar 14 2019
Mar 6 2019
Mar 5 2019
Thanks. I didn't realise there could be nested hardware loops. Good to see there are not worse. Are you OK with me leaving them as they are here, and you fixing them as you think is best?
Feb 28 2019
Jan 22 2019
Jan 21 2019
Just a couple of minor comments on formatting. Otherwise, it looks good to me. Thanks!
Nov 9 2018
Nov 7 2018
Oct 26 2018
Oct 22 2018
Looks good to me.
Oct 18 2018
Just a minor question - looks good otherwise.
Oct 11 2018
Sep 12 2018
Aug 27 2018
Aug 23 2018
Changed code in the patch to use static_cast. I'll submit a separate patch to fix the other occurrences of a cast.
Fixed test case. Needed an extra llc option to work correctly.
Jul 30 2018
Jun 26 2018
Jun 25 2018
Jun 15 2018
Jun 14 2018
This patch looks good to me. I applied it to our code and ran some of our internal correctness and performance tests (for Hexagon), and the results came back clean.
May 30 2018
This change looks correct to me. I agree with you that those functions shouldn't return the same direction vectors and that the Src version is wrong.
May 18 2018
May 17 2018
Mar 9 2018
I did some debugging:
For the given example, after the node order is generated, the pipeliner is able to find a schedule with II=1.
However, when the "orderDependence" function is called from inside the "finalizeSchedule" function,
it gets caught in an infinite recursion. This warrants further investigation.
Mar 1 2018
Thanks again for the patch to the pipeliner. I think it looks good, so feel free to commit if you're able to after addressing the final comment.
Feb 28 2018
Thanks for the detailed explanation. That make sense, and I agree that the ZLD/ZLH is a more general/better solution.
Feb 27 2018
Thanks for adding a patch to the pipeliner! I think what the patch is attempting is a good thing to add to the heuristic. Though I would have expected that the check for hasDataDependence would generate the correct node order for the example provided in the comment? Maybe only if hasDataDependence is extended to handle other types of dependences? But, this patch is probably cheaper since it pre-computes the information. I like the addition of the isValidNode as well.
Nov 3 2017
Oct 27 2017
Sorry for the delay in responding. I've been trying to create a simple test case for this patch, but no luck yet. Otherwise, the patch looks good to me. Thanks!
Oct 12 2017
Jul 5 2017
Jun 28 2017
Hi Philip - thanks for the review. I made the changes you suggested.
Jun 27 2017
Apr 17 2017
Feb 3 2017
Feb 2 2017
Thanks for the comments - much appreciated. I'll update the test case and then push.
Jan 26 2017
I've created a test case using the AMDGPU target that shows the error. This required a couple of steps. I took an existing AMDGPU lit test, and added debug metadata manually. Then, created MIR for the test by stopping after the TwoAddressInstruction pass. Then, I needed to insert a DBG_VALUE instruction in the correct location. I'm not really sure if it's possible for a DBG_VALUE instruction to appear in the specific place that I put it. But, with these changes, the test case causes an assert because there is a call to getInstructionIndex with a DBG_VALUE insruction.
Jan 23 2017
Jan 9 2017
Sep 12 2016
The change to the Hexagon test looks good to me. As you mention correctly, the intent is to check that the compiler generates one of special add variants on Hexagon, so the reduced test case is a good change.
Aug 16 2016
Aug 15 2016
Aug 11 2016
The changes look good to me. The only nitpick is that you should run clang-format on your changes.
Aug 4 2016
I think it's a good idea to try to move some of the code in the Hexagon hardware loop pass so that it can be reused. The getExitingBlock() function in HexagonHardwareLoops.cpp is probably poorly named since it's doing something different than the existing function that is in MachineLoop. If the function is moved, it seems to fit better in the MachineLoop class vs the MachineLoopInfo class.
Jul 29 2016
Jul 25 2016
Rebased the patch. There were some API changes to resolve.
Jun 29 2016
I'm seeing the same problem, which is fixed with this patch. I'm seeing a hang when running llvm-lit with test/Bugpoint/crash-narrowfunctiontest.ll. The hang appears to be caused because data in the stack frame is corrupted, and the clear() method in a vector never returns, when executing CallBacksToRun->clear() in RunSignalHandlers().
Jun 16 2016
Jun 14 2016
Hi Matthias - thank you for the review and the comments. I really appreciate it. I think I've addressed all of your comments, unless I've inadvertently missed something. The new patch includes a lot of changes based upon your comments. The code has also been rebased.
Jun 8 2016
Hi James - I think your patch is reasonable. You're correct about our hardware loop pass. We would need to improve it to catch this case. I'm not sure how easy/hard it is to do that without a little more investigation. Until we do that, I'm fine with either xfailing or removing the test (or some other suggestion?).
May 2 2016
Hi Hal - thanks for the review and your comments. I've looked into this issue some more, and I think that the problem (or part of the problem) is that Hexagon code adds a kill flag to the register use of the tied operand. For example,
Apr 25 2016
I tried this patch on our Hexagon compiler to see what impact the pass had on some of our performance benchmarks (mostly embedded programs). The biggest improvement was 1.5% and the biggest degradation was -1.8% (in spec2K/twolf). Most differences were under 1%. Many benchmarks were unchanged. I didn't look at code size though.
Apr 19 2016
Apr 18 2016
Apr 4 2016
Thanks for the review, Sanjoy. I'll refactor the getConstantPart function in a subsequent change as suggested.
Apr 1 2016
Mar 30 2016
I'm just wondering how to generate a loop which has only one basic block. For an example,
Mar 29 2016
Updated to the correct license and rebased the patch.