This is an archive of the discontinued LLVM Phabricator instance.

Add anti- and output loop carried dependences in SwingScheduler
AcceptedPublic

Authored by ning4827 on Oct 9 2017, 9:02 AM.

Details

Reviewers
bcahoon
Summary

Consider anti- and output dependences in the addLoopCarriedDependences function

Diff Detail

Event Timeline

ning4827 created this revision.Oct 9 2017, 9:02 AM
bcahoon edited edge metadata.Oct 12 2017, 3:26 PM

Hi Ning,

Just a couple of minor comments, but I think the change looks good. I tried your example, provided in an earlier version, but it doesn't have any output or anti dependences by the time it gets to the pipeliner. The load of a[i-1] gets optimized away. I used the following to compile it, so perhaps some other command-line options are needed?

clang -fno-unroll-loops -target hexagon -O2 example.c -o example.s -S

But, I did noticee that a lit test, test/CodeGen/Hexagon/vect/vect-v4i16.ll, fails with the patch due to extra output dependence edges. It contains a loop that is no longer pipelined. The code generation is different, so the checks in the test fail. I think it's fine, for this lit test to remove the checks:
; CHECK: memuh(r{{[0-9]+}}+#6)
; CHECK: combine(r{{[0-9]+}},r{{[0-9]+}})

and leave the check for the vaddh, which shows that the test is vectorized.

Thanks,
Brendon

lib/CodeGen/MachinePipeliner.cpp
1026

Update the command to reflect the new functionality.

1052

Change the variable name, Load, since this can be either a load or store.

ning4827 updated this revision to Diff 119021.Oct 14 2017, 8:11 AM

Change the variable name and update the comments

ning4827 updated this revision to Diff 119024.Oct 14 2017, 8:25 AM

Fix the lit test

Thanks Brendon for the comments. I have updated the diff. Let me know if it's okay.

Best,
Ning

Hi Ning,

Just a couple of minor comments, but I think the change looks good. I tried your example, provided in an earlier version, but it doesn't have any output or anti dependences by the time it gets to the pipeliner. The load of a[i-1] gets optimized away. I used the following to compile it, so perhaps some other command-line options are needed?

I worked on another platform and didn't realize that a[i-1] will be optimized away on Hexagon..

clang -fno-unroll-loops -target hexagon -O2 example.c -o example.s -S

But, I did noticee that a lit test, test/CodeGen/Hexagon/vect/vect-v4i16.ll, fails with the patch due to extra output dependence edges. It contains a loop that is no longer pipelined. The code generation is different, so the checks in the test fail. I think it's fine, for this lit test to remove the checks:
; CHECK: memuh(r{{[0-9]+}}+#6)
; CHECK: combine(r{{[0-9]+}},r{{[0-9]+}})

and leave the check for the vaddh, which shows that the test is vectorized.

Thanks,
Brendon

ning4827 updated this revision to Diff 119025.Oct 14 2017, 8:31 AM

Add back the changes on pipliner

bcahoon accepted this revision.Oct 27 2017, 4:03 PM

Sorry for the delay in responding. I've been trying to create a simple test case for this patch, but no luck yet. Otherwise, the patch looks good to me. Thanks!

This revision is now accepted and ready to land.Oct 27 2017, 4:03 PM

Hi Ning,

Let me know if you need me to commit this patch. Or, if you're able to commit it, then that would be great too.

Thanks,
Brendon

Hi Brendon,

As a first-time llvm patch uploader, I don't think I can commit.. :-)
Thank you for reviewing this patch and to commit it.

Best,
Ning

Hi Ning,

Let me know if you need me to commit this patch. Or, if you're able to commit it, then that would be great too.

Thanks,
Brendon