This is an archive of the discontinued LLVM Phabricator instance.

[MachinePipeliner] Refactor schedule emission logic
AbandonedPublic

Authored by jmolloy on Jul 12 2019, 1:47 PM.

Details

Summary

Emitting schedules is hard. There are many prologs and epilogs to stitch together, and importantly there are different strategies for creating a fully-pipelined loop. The current emission code is tangled in with the schedule discovery code, which makes it harder than necessary to understand and makes it hard to plumb in new emission strategies.

I'm not particularly happy that I couldn't make this a set of incremental changes. The changes are pretty invasive - we now model the emitted stages and blocks explicitly as a CFG - and I couldn't find a reasonable way to do this incrementally.

This patch still needs a bit of tweaking - Hexagon tests fail because we're making the kernel loop a bit differently (easy fix). PPC tests need a little register allocation tweak. But at this point I'm pretty happy with the overall structure of the code, it emits comparable code to before and I think it only needs small tweaks. It also passes all of Jinsong's recent tests (thanks for the sms-phi.ll test, that was nasty!)

I'd appreciate some feedback on the direction! The overall diffstat is -650LoC, so this makes the Pipeliner smaller too.

Diff Detail

Repository
rL LLVM

Event Timeline

jmolloy created this revision.Jul 12 2019, 1:47 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 12 2019, 1:47 PM
jmolloy updated this revision to Diff 209583.Jul 12 2019, 1:51 PM

(excise some leftover debugging code).

arsenm added a subscriber: arsenm.Jul 12 2019, 1:57 PM
arsenm added inline comments.
lib/CodeGen/MachinePipeliner.cpp
4315–4316

Should update to use Register

arsenm added inline comments.Jul 12 2019, 2:04 PM
lib/CodeGen/MachinePipeliner.cpp
4315–4316

See r364191. Registers should no longer be plain unsigned

jsji added a comment.Jul 15 2019, 8:25 AM

Personally, I like the idea of refacting and more abstraction,
but I am a little worry about it might nont handle the complication of Phi well.
And unfortunately, I don't know enough about the cases here, especially for Hexagon.
So I would like to hear the valuable feedback from Brendon @bcahoon as well.

BTW: I am seeing Assertion failures in sms-grp-order.ll:

Assertion `isVirtualRegister(Reg) && "Not a virtual register"' failed

And also quite a few other Assertion when running test with test-suites.

include/llvm/CodeGen/MachinePipeliner.h
713

private field 'StageIdx' is not used

lib/CodeGen/MachinePipeliner.cpp
3308

debug output

4199

comparison of integers of different signs: 'int' and 'unsigned int'

4203

comparison of integers of different signs: 'int' and 'unsigned int'

4238

comparison of integers of different signs: 'unsigned int' and 'int'

4244

comparison of integers of different signs: 'int' and 'unsigned int'

jmolloy updated this revision to Diff 212147.Jul 29 2019, 4:28 AM

Hi Brendon,

Thanks for your great explanation! I've now spent a good 2 weeks refactoring this, and what I have now I'm pretty happy with.

I think it reduces the complexity down quite a lot, it does handle all of the weird cases in the Hexagon test-suite (although I've been focussing so much on Hexagon I still have failures in the PowerPC suite - trivial regclass fix).

I'd appreciate your input on this patchset. I think it needs a bit more cleanup, but I've been cleaning it up for quite some time now so eventually I do have to submit it for review ;)

It'd be great if you could run it through your internal test suite and see if anything strange comes out?

Cheers!

James

jmolloy updated this revision to Diff 212800.Aug 1 2019, 6:42 AM

Hi Brendon and Jinsong,

The patch now works for PowerPC - Jinsong, would it be possible for you to run this through your regression tests? I've tried cross-compiling for ppc -mcpu=pwr9 and enabling the pipeliner, and I don't get any assert failures (yay!) but haven't been able to fully verify correctness.

Cheers,

James

jsji added a comment.Aug 1 2019, 7:45 AM

How do you build and test? I can't config & build it with clang.

lib/CodeGen/MachineLoopUtils.cpp
2

New file, you need to update llvm/lib/CodeGen/CMakeLists.txt as well.

lib/CodeGen/MachinePipeliner.cpp
4287

comparison of integers of different signs: 'int' and 'unsigned int'

4296

comparison of integers of different signs: 'int' and 'unsigned int'

4299

comparison of integers of different signs: 'int' and 'unsigned int'

4304

comparison of integers of different signs: 'int' and 'unsigned int'

4308

comparison of integers of different signs: 'int' and 'unsigned int'

4417

`.../llvm-git/llvm-project/llvm/lib/CodeGen/MachinePipeliner.cpp:4417:22: error: no viable overloaded '='

InstrInfos[MI] = {Schedule.stageScheduled(SU), Cycle, Index++};
~~~~~~~~~~~~~~ ^ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.../llvm-git/llvm-project/llvm/lib/CodeGen/MachinePipeliner.cpp:4190:10: note: candidate function (the implicit copy assignment operator) not viable: cannot convert initializer list argument to 'const (anonymous namespace)::CGBlock::InstrInfo'

struct InstrInfo {
       ^

.../llvm-git/llvm-project/llvm/lib/CodeGen/MachinePipeliner.cpp:4190:10: note: candidate function (the implicit move assignment operator) not viable: cannot convert initializer list argument to '(anonymous namespace)::CGBlock::InstrInfo'

struct InstrInfo {

`

jmolloy updated this revision to Diff 212987.Aug 2 2019, 1:28 AM

Thanks Jinsong,

Indeed, I messed up moving the patch between repositories on my side meaning I lost half of the changes. This version should build fine.

Thanks again!

James

jsji added a comment.Aug 2 2019, 3:30 PM

Yes, good job! All the existing lit test passed.
Haven't done run time verification, but there are a few compile time assertion & abort when compiling test-suites,
I have reduced some of them and committed in https://reviews.llvm.org/rL367732 for your further investigation.

lib/CodeGen/MachinePipeliner.cpp
4268

This needs to rebase to latest ToT. Using Register:: instead.

jmolloy updated this revision to Diff 213562.Aug 6 2019, 3:06 AM

Thanks Jinsong!

Your patch got reverted, so I patched it in locally and fixed all the issues it raised. The new experimental codegen now produces almost-bit-accurate results for your testcases. There is a little permutation (we emit prologs in pipelined order rather than emitting stages separately) but I've validated that the code produced is the same.

I'd be really keen for any additional testing you could throw at this, including runtime tests!

Cheers,

James

jsji added a comment.Aug 9 2019, 12:21 PM

@jmolly Looks like you messed up the patch again? I can't see changes in CMakeList.txt in latest diff now, and after adding it, I am seeing the compile time error again.

error: no viable overloaded '='
      InstrInfos[MI] = {Schedule.stageScheduled(SU), Cycle, Index++, false};
jmolloy updated this revision to Diff 214745.Aug 12 2019, 5:34 PM

Really sorry about that Jinsong. If you want the embarrassing details I scp'd the patchfile from one machine to another and then uploaded the stale patch from a different directory :(

Cheers,

James

Brendon, I've got a bunch of changes that I'd like to push based on this change. it's the gating factor for allowing predicated emission and allowing targets to adapt how they emit a schedule. I also have a different version of SMS to upstream (will be upstream very shortly!) that I'd love to reuse this emission logic for.

Given this is under a flag (which would be off), do you have any objections to me pushing this in-tree?

Cheers,

James

Hi James,

Sorry that I haven't been able to provide any feedback on your patch yet. I think putting this under a flag is a good idea, so I don't have any objections to that. I like the general direction of this patch, and I think it would be great to be able to remove the old code generation code eventually.

Thanks,
Brendon

majnemer added inline comments.Aug 14 2019, 4:57 PM
lib/CodeGen/MachineLoopUtils.cpp
58

Can this be DenseMap<Register, Register>?

65–66

Maybe have a comment explaining this.

jmolloy updated this revision to Diff 215435.Aug 15 2019, 10:41 AM
jmolloy marked 2 inline comments as done.

Thanks David!

jsji added a comment.Aug 15 2019, 1:39 PM

Gave it a try on test-suite, with -O3 -mcpu=pwr9 --ppc-enable-pipeliner

The following testcases failed with experimental codegen, while passed with origianl codegen.

test-suite :: MultiSource/Benchmarks/MallocBench/espresso/espresso.test
test-suite :: MultiSource/Benchmarks/MiBench/security-sha/security-sha.test
include/llvm/CodeGen/MachinePipeliner.h
361

Extra new line

lib/CodeGen/MachineLoopUtils.cpp
2

Still missing patch for llvm/lib/CodeGen/CMakeLists.txt.

jmolloy updated this revision to Diff 215480.Aug 15 2019, 3:19 PM
jsji added a comment.Aug 16 2019, 8:00 AM

Gave it a try on SPEC2017 with -O3 -mcpu=native -mllvm --ppc-enable-pipeliner on P9,
500.perlbench_r, 525.x264_r , 531.deepsjeng_r, 557.xz_r failed with new cg, pass with old cg.

jmolloy abandoned this revision.Sep 2 2019, 8:09 AM

Abandoning this revision; a more incremental approach will be landing soon.