The 'SchedBoundary::releaseNode' is merely invoked for releasing the Top/Bottom root nodes.
However, 'SchedBoundary::releasePending' uses its same logic to check if the Pending queue
has any releasable SUnit.
It is possible to slightly modify the body of the two, allowing re-use of the former ('releaseNode')
in the latter.
Details
Diff Detail
Event Timeline
As asked, the if-bodies are clang-formatted; proper formatting is applied to 'HazardDetected'
condition and 'template' specification of 'releaseNode'.
Further, a typo is fixed (From 'IsPQueue' to 'InPQueue', in MachineScheduler.cpp)
llvm/lib/CodeGen/MachineScheduler.cpp | ||
---|---|---|
2366 | Actually, 'HazardDetected' condition in 'releaseNode' covers this case. |
llvm/lib/CodeGen/MachineScheduler.cpp | ||
---|---|---|
2366 | Right, my main concern was missing out on the early exit in the loop. I think it would still be worth keeping the early exits if Available.size() >= ReadyListLimit. I *think* it would make sense to have the early exit at the beginning of the loop, as currently we set MinReadyCycle even if we do not add the SU to Available. IIRC, MinReadyCycle should indicate the Cycle of the soonest available instruction in Available, not pending. |
Insert Available.size() >= ReadyListLimit at beginning of SchedBoundary::releasePending
loop, allowing early-exit in case of full Available queue.
Further, fix update of MinReadyCycle: it was set to SU->ReadyCycle, even if the
scheduling unit was not moved to the Available queue.
(thank @fhahn)
Removed useless round brackets and merged Available.size() >= ReadyListLimit with other
hazard conditions.
Fixes applied.
@fhahn, could I ask you to commit the patch? I do not have commit-access :(
This change causes a crash, because MinReadyCycle is not properly initialised when going through pickOnlyChoice. I think we should address that separately and I'll commit the version with setting MinReadyCycle at the original position.
Even with moving setting the MinReadyCycle back, it fails the following tests. Could you take a look?
Failing Tests (3):
LLVM :: CodeGen/PowerPC/build-vector-tests.ll LLVM :: CodeGen/PowerPC/vec_conv_fp32_to_i64_elts.ll LLVM :: CodeGen/PowerPC/vec_conv_i16_to_fp64_elts.ll
I beg your pardon, I ran only the unit-tests and forgot the regression ones.
Yes, I agree with you.
Checked. By my side, regression tests (I didn't forget them this time) and unit tests are all passed.
UPDATE
My llvm's configuration targeted only some architectures (ARM, Mips and X86). Again, sorry.
I re-configured the project to target all the supported archs, executed the tests and confirm the above
three failing.
I'll try to figure out why they're not passing.
Great, thanks!
BTW, it is better to respond with a new comment instead of updating existing ones, as it seems like Phabricator does not send emails for edited comments.
Hi @fhahn ,
sorry for the long silence period.
I tracked down the problem: it is how the Pending queue is explored with this patch (from the end).
Clearly, this affects which nodes move first from pending to available state; this is critical when
the Available queue reaches the maximum size.
I can re-write the queue exploration to match the pre-patch behaviour.
However, I'm wondering if it is a sign of "bad" design: shouldn't the pending queue explore the best
candidates to move in the available queue?
I think it would be best keep the old behavior in this patch and potentially fix the iteration order in a follow up. Changes to the scheduler can lead to subtle regressions, so it's best to keep the changes isolated.
- Revert Pending queue exploration to pre-patch behavior (from the beginning). It was source of failure for 3 tests:
LLVM :: CodeGen/PowerPC/build-vector-tests.ll LLVM :: CodeGen/PowerPC/vec_conv_fp32_to_i64_elts.ll LLVM :: CodeGen/PowerPC/vec_conv_i16_to_fp64_elts.ll
- Fix missing MinReadyCycle assignment in releaseNode. It was source of MinReadyCycle uninitialized assertion error.
LGTM, thanks! And sorry for the delay! Please let me know if I should commit this change for you.
Thank you for the patience and helping me!
Still lacking of commit-access; hence, yes: you can commit for me as
- Name, Surname: Lorenzo Casalino
- Email: lorenzo.casalino93@gmail.com
Thank you again, and happy new year :) !
Happy new year!
I've committed it with you as the author, please let me know in case you get emails about build failures.
Hi,
bad luck, got a build failure.
Link to the report: http://lab [.] llvm [.] org:8011/builders/clang-ppc64be-linux-lnt/builds/33875
Build Reason: scheduler Build Source Stamp: [branch master] f9f78cf6ac73d9148be9b626f418bf6770e512f6 Blamelist: Lorenzo Casalino <lorenzo.casalino93@gmail.com> BUILD FAILED: failed test-suite
I created a patch for a build failure on GCC: https://reviews.llvm.org/D72069. Would you like to help review it? Thanks.
I guess the failure is caused by PowerPC backend.
/opt/at12.0/lib/gcc/powerpc64le-linux-gnu/8.2.1/../../../../powerpc64le-linux-gnu/bin/ld: ../../lib/libLLVMPowerPCCodeGen.a(PPCMachineScheduler.cpp.o): in function `llvm::PostGenericScheduler::releaseTopNode(llvm::SUnit*)': PPCMachineScheduler.cpp:(.text._ZN4llvm20PostGenericScheduler14releaseTopNodeEPNS_5SUnitE[_ZN4llvm20PostGenericScheduler14releaseTopNodeEPNS_5SUnitE]+0x2c): undefined reference to `void llvm::SchedBoundary::releaseNode<false>(llvm::SUnit*, unsigned int, unsigned int)' /opt/at12.0/lib/gcc/powerpc64le-linux-gnu/8.2.1/../../../../powerpc64le-linux-gnu/bin/ld: ../../lib/libLLVMPowerPCCodeGen.a(PPCMachineScheduler.cpp.o): in function `llvm::GenericScheduler::releaseBottomNode(llvm::SUnit*)': PPCMachineScheduler.cpp:(.text._ZN4llvm16GenericScheduler17releaseBottomNodeEPNS_5SUnitE[_ZN4llvm16GenericScheduler17releaseBottomNodeEPNS_5SUnitE]+0x34): undefined reference to `void llvm::SchedBoundary::releaseNode<false>(llvm::SUnit*, unsigned int, unsigned int)' /opt/at12.0/lib/gcc/powerpc64le-linux-gnu/8.2.1/../../../../powerpc64le-linux-gnu/bin/ld: ../../lib/libLLVMPowerPCCodeGen.a(PPCMachineScheduler.cpp.o): in function `llvm::GenericScheduler::releaseTopNode(llvm::SUnit*)': PPCMachineScheduler.cpp:(.text._ZN4llvm16GenericScheduler14releaseTopNodeEPNS_5SUnitE[_ZN4llvm16GenericScheduler14releaseTopNodeEPNS_5SUnitE]+0x34): undefined reference to `void llvm::SchedBoundary::releaseNode<false>(llvm::SUnit*, unsigned int, unsigned int)'
It is because you are splitting the template declaration and definition into header and cpp. Usually, you need to add the explicit template instantiation in the cpp. However, I am not sure if it is a good idea to use template ... BTW, it compiles clean if with clang.
It is because you are splitting the template declaration and definition into header and cpp. Usually, you need to add the explicit template instantiation in the cpp. However, I am not sure if it is a good idea to use template ... BTW, it compiles clean if with clang.
You are totally right about explicit instantiation.
Regarding the usage of a template argument, instead of a function parameter, lies in the fact that the scheduler is aware whether
a node is in the Pending queue or not, when calling releaseNode.
Indeed, such method is invoked in two occasions
- a node is scheduled: successors/predecessors are released; thus, they're known to be not in the Pending queue.
- moving nodes from pending to available state; the nodes are known to be in the Pending queue.
As a consequence, part of the operations perfomed by 'releaseNode' may be actively skipped, avoiding the
overheads of control instructions. (Yes, modern architectures are endowed with branch predictors; but I do not
think that such an argument saves us from correctly employ known invariants in the algorithms.)
FWIW I think we should trust the compiler to optimize this properly and passing it as parameter should be fine. Both versions should probably compile to the same code (although I don't have time to verify that right now)
nit: outermost brackets unnecessary.