This is an archive of the discontinued LLVM Phabricator instance.

Differential D35053

Improve post-RA scheduling for SystemZ
ClosedPublic

Authored by jonpa on Jul 6 2017, 6:29 AM.

Details

Reviewers

uweigand
atrick
hfinkel
MatzeB

Summary

The idea of this patch is to continue the scheduler state over a MBB boundary in the case where the successor block has only one predecessor. This means that the scheduler will continue in the successor block (after emitting any branch instructions) with e.g. maintained processor resource counters. Benchmarks seem to benefit from this.

In order to do this, the single HazardRecognizer is replaced with a map so that each MBB gets its own HazardRecognizer. This represents the outgoing state of that MBB. Another block which is only reached from MBB can then later take that state before beginning its first region.

*All* instructions are emitted, so that the state of the scheduler is as good as possible at the entry of the region. This means branches and also singular instructions before regions / in small MBBs.

Special care is taken for blocks with a call (which is about the only scheduling boundary on SystemZ), so that only the instructions after the call are part of the final scheduler state.

The only common code change needed is to set the MachineLoopInfo* pointer (that analysis pass seems to have been required also for post-RA scheduling, but not used for some reason).
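
As an illustration of the hand-off described in this summary, here is a minimal sketch of a per-block state map. It is not the code from the patch: `HazardState`, `Block`, and `StateTracker` are hypothetical stand-ins (the patch itself keeps a SystemZHazardRecognizer per MachineBasicBlock), and the two counters are placeholders for whatever state the recognizer really tracks.

```cpp
#include <cassert>
#include <map>
#include <vector>

// Hypothetical stand-in for the per-block scheduler state.
struct HazardState {
  unsigned GroupSlotsUsed = 0;  // position in the current decoder group
  unsigned ResourceCounter = 0; // e.g. a processor resource usage counter
};

// Hypothetical, simplified basic block: only the predecessor list matters.
struct Block {
  int Id;
  std::vector<const Block *> Preds;
};

class StateTracker {
  std::map<const Block *, HazardState> OutState; // outgoing state per block

public:
  // State to start scheduling B with: inherited only if B has exactly one
  // predecessor whose outgoing state has already been recorded.
  HazardState enterBlock(const Block &B) const {
    if (B.Preds.size() == 1) {
      auto It = OutState.find(B.Preds.front());
      if (It != OutState.end())
        return It->second;
    }
    return HazardState{}; // otherwise start from a fresh state
  }

  // Record the state reached after all instructions of B were emitted.
  void leaveBlock(const Block &B, const HazardState &S) { OutState[&B] = S; }
};
```

A block with two or more predecessors always starts fresh in this sketch, since there is no single incoming state to continue from.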

Event Timeline

jonpa created this revision. Jul 6 2017, 6:29 AM

Herald added a subscriber: MatzeB. Jul 6 2017, 6:29 AM

Updated per (off-line) review.

Patch updated.

updated per review

This is the "alternate take" of the last version. I am not sure which is best...

This version introduces MachineSchedStrategy::leaveMBB() and an optional reversal of the traversal of the scheduling regions within a basic block, which:

eliminates PreviousMBB
simplifies transferStateFromPRed()
simplifies initialize() and setupHazardRecForScheduling() (the HazardRec pointer could be removed, but is kept to minimize the diff for now)

This makes the idea of the order of processing scheduling regions in an MBB explicit, which is good since this patch is dependent on that order.
Overall, this simplifies the SystemZ implementation, but has some common-code changes so that the regions can be handled top-down per MBB. (MachineScheduler.cpp)
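
To make the ordering concrete, here is a small sketch of splitting a block into scheduling regions bottom-up and then optionally reversing the list so the regions are visited top-down. It uses plain indices instead of MachineBasicBlock iterators; `Instr`, `Region`, and `getRegions` are hypothetical names for illustration, and a boundary is assumed to be something like a call.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified instruction: only the boundary flag matters here.
struct Instr {
  bool IsBoundary; // e.g. a call
};

// Half-open index range [Begin, End); End indexes the boundary instruction
// below the region, or equals the block size for the bottom region.
struct Region {
  std::size_t Begin, End;
};

// Find the regions bottom-up, then reverse them if top-down order is wanted.
std::vector<Region> getRegions(const std::vector<Instr> &Block, bool TopDown) {
  std::vector<Region> Regions;
  const std::size_t N = Block.size();
  std::size_t I = N;
  for (std::size_t End = N; End != 0; End = I) {
    // Step onto the boundary, except at the bottom of a block that does
    // not end in one.
    if (End != N || Block[End - 1].IsBoundary)
      --End;
    // Look backward until the nearest boundary above this region.
    for (I = End; I != 0; --I)
      if (Block[I - 1].IsBoundary)
        break;
    Regions.push_back({I, End});
  }
  if (TopDown)
    std::reverse(Regions.begin(), Regions.end());
  return Regions;
}
```

For a block of the form `a, b, CALL, c, d` this yields the regions [3, 5) and [0, 2) in bottom-up order, and [0, 2) followed by [3, 5) when `TopDown` is set.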

It does indeed seem to be necessary to get an explicit guarantee from common code that scheduling happens in a particular order, otherwise our back-end code may silently break if common code logic happens to change in the future.

I would agree that getting common code to specifically call us in the top-down order we need to simplify cross-region tracking is also helpful. We should discuss this with the relevant common code scheduler developers -- can you check who has been working on this code lately and add them as reviewers?

jonpa added reviewers: atrick, hfinkel, MatzeB. Jul 24 2017, 8:33 AM

(gentle) ping - this patch is waiting for review of its common-code parts (MachineScheduler), which should be NFC for other targets.

This looks good to me in principle, but there are some comments below:

include/llvm/Target/TargetSubtargetInfo.h
166–170 ↗	(On Diff #107878)	My impression is that this decision is not based on the subtarget, but rather on which scheduler is plugged in. It would therefore be better to make this part of ScheduleDAGInstrs.
lib/CodeGen/MachineScheduler.cpp
408	Oh this was already required and not used by the PostMachineScheduler. Guess you got lucky and can just use it ;-)
441	`///`
451–489	Would it be possible to implement this more iterator style instead of synthesizing an std::vector (also `std::vector` is usually the wrong choice in llvm)? (I'm not talking full on STL iterator, just some object with a "get me the next region" function).
695–697	This should also call `ScheduleDAGInstrs::finishBlock()`.

Thanks for review!

Patch updated.

jonpa added inline comments. Jul 28 2017, 2:29 AM

include/llvm/CodeGen/MachineScheduler.h
217	This comment is duplicated in several places, but not sure if this is expected.
include/llvm/Target/TargetSubtargetInfo.h
166–170 ↗	(On Diff #107878)	Aha. I changed it then so that the ScheduleDAGInstrs is called instead, which in turn calls the SchedStrategy, which makes sense as you pointed out. This is clearly better to me as well. Looks good?
lib/CodeGen/MachineScheduler.cpp
408	yes :-)
451–489	I made an attempt to wrap this in a class per your suggestion. I could not find a better vector class than std::vector, however. This is a bit more code compared to just calling std::reverse(). If we decide this approach is better, perhaps you know of an example in the LLVM code base that I could reuse?
695–697	yep

atrick added inline comments. Jul 28 2017, 10:52 AM

lib/CodeGen/MachineScheduler.cpp
441	Please explain what RegionBegin/End refer to here. Note that they need to be inclusive--they cannot refer to instructions outside of the identified scheduling region because those will be reordered before scheduling this region. Also note somewhere that the scheduling pass cannot add instructions in other regions now without updating these boundaries.

Comments updated per review.

Also, the comment

// MBB::size() uses instr_iterator to count. Here we need a bundle to
// count as a single instruction.

moved into getSchedRegions().

A couple more comments on the SystemZ parts, see inline comments.

lib/Target/SystemZ/SystemZHazardRecognizer.cpp
379	I guess this is not only "counters". Maybe rename to "copyState" ?
427	Hmm. Shouldn't emitInstruction itself try to handle branches correctly? Maybe it only needs one extra bit of information passed in, whether to assume a branch is taken or not?
lib/Target/SystemZ/SystemZMachineScheduler.cpp
147	I'm wondering why you don't do the initial allocation of the hazard recognizer for the MBB (and taking over the predecessor state) right here. Currently, it seems that if an MBB is scheduled, this is done in "initialize", and if the MBB is not scheduled, it is done in "leaveMBB". If you'd always do it here, it seems that duplication could be removed.
lib/Target/SystemZ/SystemZMachineScheduler.h
107	This is dead now, right?

SystemZ parts updated per review. See inline comments.

lib/Target/SystemZ/SystemZHazardRecognizer.cpp
379	ok
427	ok - updated so that emitInstruction() will recognize that any branch in slot 1 or 2 ends the current group (there is at least one instruction already in the group). emitInstruction() takes a new parameter TakenBranch and, if it is true, makes sure that the current decoder group is empty after emission.
lib/Target/SystemZ/SystemZMachineScheduler.cpp
147	There is one subtle change in behavior with this patch, and that is the fact that initPolicy() (via enterRegion()) is never called on an empty MBB. That is why the DoneMBB argument is needed for leaveMBB(). I thought about adding an empty region for an empty MBB so that initPolicy() is always called for each MBB. That didn't quite work still, since the MBB cannot then be retrieved from Begin->getParent(), if Begin == End.
lib/Target/SystemZ/SystemZMachineScheduler.h
107	ah, yes.
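
The branch handling discussed for emitInstruction() can be pictured with a toy model of decoder grouping. This is a sketch under loud assumptions, not the SystemZHazardRecognizer logic: `DecoderGroup` is a hypothetical class, every instruction is assumed to occupy exactly one of three slots, and a not-taken branch in slot 0 is assumed to leave the group open (the thread only says that branches in slot 1 or 2 end the group).

```cpp
#include <cassert>

// Hypothetical toy model of a decoder group with three slots.
class DecoderGroup {
  unsigned Slot = 0;   // next free slot in the current group (0..2)
  unsigned Groups = 0; // number of groups closed so far

  void closeGroup() {
    Slot = 0;
    ++Groups;
  }

public:
  static constexpr unsigned MaxSlots = 3;

  // Emit one instruction; TakenBranch is only meaningful for branches.
  void emit(bool IsBranch, bool TakenBranch) {
    unsigned UsedSlot = Slot++;
    // A full group ends, and so does one where a branch used slot 1 or 2.
    if (Slot == MaxSlots || (IsBranch && UsedSlot > 0))
      closeGroup();
    // A taken branch always leaves the current group empty.
    else if (TakenBranch)
      closeGroup();
  }

  unsigned slot() const { return Slot; }
  unsigned closedGroups() const { return Groups; }
};
```

The point of the extra parameter is visible in the last branch: without it, a taken branch in slot 0 would leave the model believing that the successor block starts in the middle of a group.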

uweigand added inline comments. Jul 31 2017, 8:43 AM

lib/Target/SystemZ/SystemZMachineScheduler.cpp
147	Maybe the common-code interface still isn't quite right then. Should we have an enterMBB() to properly pair with the leaveMBB(), maybe?

Updated per review.
startBlock() and enterMBB() implemented / added.
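
A rough sketch of why the paired hooks help: with enterMBB()/leaveMBB() called once per block, a strategy is notified even for a block that has no scheduling regions, where initPolicy() is never reached. Only the hook names come from the patch; `Strategy`, `LoggingStrategy`, and `scheduleAll` are hypothetical, and blocks are reduced to a count of regions.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, pared-down strategy interface; the hook names follow the
// patch, everything else is invented for illustration.
struct Strategy {
  virtual ~Strategy() = default;
  virtual void enterMBB(int /*BlockId*/) {}
  virtual void initPolicy() {}
  virtual void leaveMBB() {}
};

// Records the order in which the hooks are called.
struct LoggingStrategy : Strategy {
  std::vector<std::string> Log;
  void enterMBB(int BlockId) override {
    Log.push_back("enter " + std::to_string(BlockId));
  }
  void initPolicy() override { Log.push_back("initPolicy"); }
  void leaveMBB() override { Log.push_back("leave"); }
};

// Hypothetical driver: RegionsPerBlock[i] is the number of regions in block i.
void scheduleAll(Strategy &S, const std::vector<int> &RegionsPerBlock) {
  for (std::size_t Id = 0; Id < RegionsPerBlock.size(); ++Id) {
    S.enterMBB(static_cast<int>(Id)); // corresponds to startBlock()
    for (int R = 0; R < RegionsPerBlock[Id]; ++R)
      S.initPolicy();                 // reached once per region only
    S.leaveMBB();                     // corresponds to finishBlock()
  }
}
```

Running this over one block with a single region followed by an empty block logs an enter/leave pair for both, but `initPolicy` only for the first.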

jonpa marked an inline comment as done. Jul 31 2017, 11:36 PM

jonpa added inline comments.

lib/Target/SystemZ/SystemZMachineScheduler.cpp
147	Yes, this seems better.

Thanks! The SystemZ part now looks very reasonable. Just a couple of final, really just cosmetic comments inline.

I think we still need final approval for the common-code changes, though.

lib/Target/SystemZ/SystemZHazardRecognizer.cpp
399	There doesn't appear to be anything HazardRecognizer-specific left in this routine. Maybe move to the SchedPolicy (and possibly merge with getNextMIToEmit into an "advanceTo" routine with just an "end" argument)?
408	Maybe also move to the SchedPolicy and merge into its sole user? Then we wouldn't have to pass the MBB around.
lib/Target/SystemZ/SystemZMachineScheduler.cpp
62	Is it even ever possible now that LastEmittedMI->getParent() is not equal to MBB?
66	Maybe inline into its single user? Only if the result looks simpler ...
76	Maybe use HazardRec here? It is now guaranteed to be the same ...
109	HazardRec again?
147	Agreed. Final question: can't the "advance" now be done in initPolicy, so we're finally rid of this function (and the CurrBegin global)?

SystemZ parts updated per review.

lib/Target/SystemZ/SystemZHazardRecognizer.cpp
399	Yes :-)
408	Yes, looks better. Moved the assert that makes sure that we are aware of all terminators into emitInstruction(). The purpose is currently to check that we don't have any terminators that are branches without the isBranch/isReturn flag. It seems that CondTrap is the only example, right now. Perhaps CondTrap could get the isBranch flag instead? Do we want this assert, or could we simply assume that all terminators are branches?
lib/Target/SystemZ/SystemZMachineScheduler.cpp
62	Yes, a terminator in the single predecessor will be emitted and recorded as LastEmittedMI. I tried first to guard the update of LastEmittedMI with this check, but then realized that the HazardRec does not have the MBB member.
66	Yes, I guess that looks ok.
76	ok
109	done
147	Yes, indeed :-)

OK, the SystemZ parts look good to me now. Thanks!

ping - waiting for review of the common-code parts.

Matthias: are you happy with the changes I did according to your suggestion to avoid std::reverse?
Andy: Do the comments now look ok?

ping!

This patch is now only waiting for review of the common-code parts.

You may revert the region iterator changes if there is no way to avoid the vector (see below). Either way LGTM.

lib/CodeGen/MachineScheduler.cpp
451	Capitalize parameter names.
451–489	I was thinking about something where we do not store intermediate results in a vector but compute regions on the fly. If it turns out we cannot avoid a vector then your previous approach was fine. Using something like SmallVector<16> should save some allocations when compiling small functions.
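
For reference, the "get me the next region" idea could look roughly like the cursor below, which computes regions on the fly instead of storing them. This is only a sketch of the alternative that was not adopted: `RegionCursor` is a hypothetical name, it works on a plain vector of boundary flags, it walks top-down, and it ignores the iterator invalidation that real scheduling causes.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical cursor over the regions of one block, computed lazily.
// IsBoundary[i] is true if instruction i is a scheduling boundary.
class RegionCursor {
  const std::vector<bool> &IsBoundary; // must outlive the cursor
  std::size_t Pos = 0;                 // first instruction not yet visited

public:
  explicit RegionCursor(const std::vector<bool> &B) : IsBoundary(B) {}

  // Stores the next region [Begin, End) in top-down order and returns true,
  // or returns false once the whole block has been visited.
  bool next(std::size_t &Begin, std::size_t &End) {
    if (Pos >= IsBoundary.size())
      return false;
    Begin = Pos;
    End = Pos;
    while (End < IsBoundary.size() && !IsBoundary[End])
      ++End;
    Pos = End + 1; // continue after the boundary, if there was one
    return true;
  }
};
```

Nothing is allocated per block here, which is the saving the comment above is after; the SmallVector version gets most of that benefit for small functions while keeping the existing bottom-up loop.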

This revision is now accepted and ready to land. Aug 15 2017, 7:30 AM

Andy: Do the comments now look ok?

Yes, thanks.

Updated with change back to SmallVector.

Thanks for review. Committing soon.

llvm/trunk@311072

lib/CodeGen/MachineScheduler.cpp
451–489	Ah, I see. I am then going back to the previous version since it was simpler, and since I really don't want to get into rewriting the loop that extracts the scheduling regions unless really needed. I also changed to SmallVector.

Revision Contents

Path	Size
include/llvm/CodeGen/MachineScheduler.h	21 lines
include/llvm/CodeGen/ScheduleDAGInstrs.h	5 lines
lib/CodeGen/MachineScheduler.cpp	120 lines
lib/Target/SystemZ/SystemZHazardRecognizer.h	38 lines
lib/Target/SystemZ/SystemZHazardRecognizer.cpp	87 lines
lib/Target/SystemZ/SystemZMachineScheduler.h	51 lines
lib/Target/SystemZ/SystemZMachineScheduler.cpp	129 lines
test/CodeGen/SystemZ/int-cmp-48.ll	4 lines

Diff 111473

include/llvm/CodeGen/MachineScheduler.h

@@ ... @@ public:
   /// initializing this strategy. Called after initPolicy.
   virtual bool shouldTrackPressure() const { return true; }

   /// Returns true if lanemasks should be tracked. LaneMask tracking is
   /// necessary to reorder independent subregister defs for the same vreg.
   /// This has to be enabled in combination with shouldTrackPressure().
   virtual bool shouldTrackLaneMasks() const { return false; }

+  // If this method returns true, handling of the scheduling regions
+  // themselves (in case of a scheduling boundary in MBB) will be done
+  // beginning with the topmost region of MBB.
+  virtual bool doMBBSchedRegionsTopDown() const { return false; }
+
   /// Initialize the strategy after building the DAG for a new region.
   virtual void initialize(ScheduleDAGMI *DAG) = 0;

+  /// Tell the strategy that MBB is about to be processed.
+  virtual void enterMBB(MachineBasicBlock *MBB) {};
+
+  /// Tell the strategy that current MBB is done.
+  virtual void leaveMBB() {};
+
   /// Notify this strategy that all roots have been released (including those
   /// that depend on EntrySU or ExitSU).
   virtual void registerRoots() {}

   /// Pick the next node to schedule, or return NULL. Set IsTopNode to true to
   /// schedule the node at the top of the unscheduled region. Otherwise it will
   /// be scheduled at the bottom.
   virtual SUnit *pickNode(bool &IsTopNode) = 0;
@@ ... @@ public:
   ScheduleDAGMI(MachineSchedContext *C, std::unique_ptr<MachineSchedStrategy> S,
                 bool RemoveKillFlags)
       : ScheduleDAGInstrs(*C->MF, C->MLI, RemoveKillFlags), AA(C->AA),
         LIS(C->LIS), SchedImpl(std::move(S)), Topo(SUnits, &ExitSU) {}

   // Provide a vtable anchor
   ~ScheduleDAGMI() override;

+  /// If this method returns true, handling of the scheduling regions
+  /// themselves (in case of a scheduling boundary in MBB) will be done
+  /// beginning with the topmost region of MBB.
+  bool doMBBSchedRegionsTopDown() const override {
+    return SchedImpl->doMBBSchedRegionsTopDown();
+  }
+
   // Returns LiveIntervals instance for use in DAG mutators and such.
   LiveIntervals *getLIS() const { return LIS; }

   /// Return true if this DAG supports VReg liveness and RegPressure.
   virtual bool hasVRegLiveness() const { return false; }

   /// Add a postprocessing step to the DAG builder.
   /// Mutations are applied in the order that they are added after normal DAG
@@ ... @@ void enterRegion(MachineBasicBlock *bb,
                    MachineBasicBlock::iterator begin,
                    MachineBasicBlock::iterator end,
                    unsigned regioninstrs) override;

   /// Implement ScheduleDAGInstrs interface for scheduling a sequence of
   /// reorderable instructions.
   void schedule() override;

+  void startBlock(MachineBasicBlock *bb) override;
+  void finishBlock() override;
+
   /// Change the position of an instruction within the basic block and update
   /// live ranges and region boundary iterators.
   void moveInstruction(MachineInstr *MI, MachineBasicBlock::iterator InsertPos);

   const SUnit *getNextClusterPred() const { return NextClusterPred; }

   const SUnit *getNextClusterSucc() const { return NextClusterSucc; }


include/llvm/CodeGen/ScheduleDAGInstrs.h

@@ ... @@ public:
   MachineBasicBlock::iterator end() const { return RegionEnd; }

   /// Creates a new SUnit and return a ptr to it.
   SUnit *newSUnit(MachineInstr *MI);

   /// Returns an existing SUnit for this MI, or nullptr.
   SUnit *getSUnit(MachineInstr *MI) const;

+  /// If this method returns true, handling of the scheduling regions
+  /// themselves (in case of a scheduling boundary in MBB) will be done
+  /// beginning with the topmost region of MBB.
+  virtual bool doMBBSchedRegionsTopDown() const { return false; }
+
   /// Prepares to perform scheduling in the given block.
   virtual void startBlock(MachineBasicBlock *BB);

   /// Cleans up after scheduling in the given block.
   virtual void finishBlock();

   /// \brief Initialize the DAG and common scheduler state for a new
   /// scheduling region. This does not actually create the DAG, only clears

lib/CodeGen/MachineScheduler.cpp

@@ ... @@ bool PostMachineScheduler::runOnMachineFunction(MachineFunction &mf) {
   } else if (!mf.getSubtarget().enablePostRAScheduler()) {
     DEBUG(dbgs() << "Subtarget disables post-MI-sched.\n");
     return false;
   }
   DEBUG(dbgs() << "Before post-MI-sched:\n"; mf.print(dbgs()));

   // Initialize the context of the pass.
   MF = &mf;
+  MLI = &getAnalysis<MachineLoopInfo>();
   PassConfig = &getAnalysis<TargetPassConfig>();

   if (VerifyScheduling)
     MF->verify(this, "Before post machine scheduling.");

   // Instantiate the selected scheduler for this target, function, and
   // optimization level.
   std::unique_ptr<ScheduleDAGInstrs> Scheduler(createPostMachineScheduler());
@@ ... @@
 /// calls this late anyway.
 static bool isSchedBoundary(MachineBasicBlock::iterator MI,
                             MachineBasicBlock *MBB,
                             MachineFunction *MF,
                             const TargetInstrInfo *TII) {
   return MI->isCall() || TII->isSchedulingBoundary(*MI, MBB, *MF);
 }

+/// A region of an MBB for scheduling.
+struct SchedRegion {
+  /// RegionBegin is the first instruction in the scheduling region, and
+  /// RegionEnd is either MBB->end() or the scheduling boundary after the
+  /// last instruction in the scheduling region. These iterators cannot refer
+  /// to instructions outside of the identified scheduling region because
+  /// those may be reordered before scheduling this region.
+  MachineBasicBlock::iterator RegionBegin;
+  MachineBasicBlock::iterator RegionEnd;
+  unsigned NumRegionInstrs;
+  SchedRegion(MachineBasicBlock::iterator B, MachineBasicBlock::iterator E,
+              unsigned N) :
+    RegionBegin(B), RegionEnd(E), NumRegionInstrs(N) {}
+};
+
+typedef SmallVector<SchedRegion, 16> MBBRegionsVector;
+static void
+getSchedRegions(MachineBasicBlock *MBB,
+                MBBRegionsVector &Regions,
+                bool RegionsTopDown) {
+  MachineFunction *MF = MBB->getParent();
+  const TargetInstrInfo *TII = MF->getSubtarget().getInstrInfo();
+
+  MachineBasicBlock::iterator I = nullptr;
+  for(MachineBasicBlock::iterator RegionEnd = MBB->end();
+      RegionEnd != MBB->begin(); RegionEnd = I) {
+
+    // Avoid decrementing RegionEnd for blocks with no terminator.
+    if (RegionEnd != MBB->end() ||
+        isSchedBoundary(&*std::prev(RegionEnd), &*MBB, MF, TII)) {
+      --RegionEnd;
+    }
+
+    // The next region starts above the previous region. Look backward in the
+    // instruction stream until we find the nearest boundary.
+    unsigned NumRegionInstrs = 0;
+    I = RegionEnd;
+    for (;I != MBB->begin(); --I) {
+      MachineInstr &MI = *std::prev(I);
+      if (isSchedBoundary(&MI, &*MBB, MF, TII))
+        break;
+      if (!MI.isDebugValue())
+        // MBB::size() uses instr_iterator to count. Here we need a bundle to
+        // count as a single instruction.
+        ++NumRegionInstrs;
+    }
+
+    Regions.push_back(SchedRegion(I, RegionEnd, NumRegionInstrs));
+  }
+
+  if (RegionsTopDown)
+    std::reverse(Regions.begin(), Regions.end());
+}
+
 /// Main driver for both MachineScheduler and PostMachineScheduler.
 void MachineSchedulerBase::scheduleRegions(ScheduleDAGInstrs &Scheduler,
                                            bool FixKillFlags) {
-  const TargetInstrInfo *TII = MF->getSubtarget().getInstrInfo();
-
   // Visit all machine basic blocks.
   //
   // TODO: Visit blocks in global postorder or postorder within the bottom-up
   // loop tree. Then we can optionally compute global RegPressure.
   for (MachineFunction::iterator MBB = MF->begin(), MBBEnd = MF->end();
        MBB != MBBEnd; ++MBB) {

     Scheduler.startBlock(&*MBB);

 #ifndef NDEBUG
     if (SchedOnlyFunc.getNumOccurrences() && SchedOnlyFunc != MF->getName())
       continue;
     if (SchedOnlyBlock.getNumOccurrences()
         && (int)SchedOnlyBlock != MBB->getNumber())
       continue;
 #endif

-    // Break the block into scheduling regions [I, RegionEnd), and schedule each
-    // region as soon as it is discovered. RegionEnd points the scheduling
-    // boundary at the bottom of the region. The DAG does not include RegionEnd,
-    // but the region does (i.e. the next RegionEnd is above the previous
-    // RegionBegin). If the current block has no terminator then RegionEnd ==
-    // MBB->end() for the bottom region.
+    // Break the block into scheduling regions [I, RegionEnd). RegionEnd
+    // points to the scheduling boundary at the bottom of the region. The DAG
+    // does not include RegionEnd, but the region does (i.e. the next
+    // RegionEnd is above the previous RegionBegin). If the current block has
+    // no terminator then RegionEnd == MBB->end() for the bottom region.
+    //
+    // All the regions of MBB are first found and stored in MBBRegions, which
+    // will be processed (MBB) top-down if initialized with true.
     //
     // The Scheduler may insert instructions during either schedule() or
     // exitRegion(), even for empty regions. So the local iterators 'I' and
-    // 'RegionEnd' are invalid across these calls.
-    //
-    // MBB::size() uses instr_iterator to count. Here we need a bundle to count
-    // as a single instruction.
-    for(MachineBasicBlock::iterator RegionEnd = MBB->end();
-        RegionEnd != MBB->begin(); RegionEnd = Scheduler.begin()) {
+    // 'RegionEnd' are invalid across these calls. Instructions must not be
+    // added to other regions than the current one without updating MBBRegions.

-      // Avoid decrementing RegionEnd for blocks with no terminator.
-      if (RegionEnd != MBB->end() ||
-          isSchedBoundary(&*std::prev(RegionEnd), &*MBB, MF, TII)) {
-        --RegionEnd;
-      }
+    MBBRegionsVector MBBRegions;
+    getSchedRegions(&*MBB, MBBRegions, Scheduler.doMBBSchedRegionsTopDown());
+    for (MBBRegionsVector::iterator R = MBBRegions.begin();
+         R != MBBRegions.end(); ++R) {
+      MachineBasicBlock::iterator I = R->RegionBegin;
+      MachineBasicBlock::iterator RegionEnd = R->RegionEnd;
+      unsigned NumRegionInstrs = R->NumRegionInstrs;

-      // The next region starts above the previous region. Look backward in the
-      // instruction stream until we find the nearest boundary.
-      unsigned NumRegionInstrs = 0;
-      MachineBasicBlock::iterator I = RegionEnd;
-      for (; I != MBB->begin(); --I) {
-        MachineInstr &MI = *std::prev(I);
-        if (isSchedBoundary(&MI, &*MBB, MF, TII))
-          break;
-        if (!MI.isDebugValue())
-          ++NumRegionInstrs;
-      }
       // Notify the scheduler of the region, even if we may skip scheduling
       // it. Perhaps it still needs to be bundled.
       Scheduler.enterRegion(&*MBB, I, RegionEnd, NumRegionInstrs);

       // Skip empty scheduling regions (0 or 1 schedulable instructions).
       if (I == RegionEnd || I == std::prev(RegionEnd)) {
         // Close the current region. Bundle the terminator if needed.
         // This invalidates 'RegionEnd' and 'I'.
@@ ... @@ for (MBBRegionsVector::iterator R = MBBRegions.begin();
             dbgs() << " RegionInstrs: " << NumRegionInstrs << '\n');
       if (DumpCriticalPathLength) {
         errs() << MF->getName();
         errs() << ":BB# " << MBB->getNumber();
         errs() << " " << MBB->getName() << " \n";
       }

       // Schedule a region: possibly reorder instructions.
-      // This invalidates 'RegionEnd' and 'I'.
+      // This invalidates the original region iterators.
       Scheduler.schedule();

       // Close the current region.
       Scheduler.exitRegion();

-      // Scheduling has invalidated the current iterator 'I'. Ask the
-      // scheduler for the top of it's scheduled region.
-      RegionEnd = Scheduler.begin();
     }
     Scheduler.finishBlock();
     // FIXME: Ideally, no further passes should rely on kill flags. However,
     // thumb2 size reduction is currently an exception, so the PostMIScheduler
     // needs to do this.
     if (FixKillFlags)
       Scheduler.fixupKills(*MBB);
   }
@@ ... @@
 /// releasePredecessors - Call releasePred on each of SU's predecessors.
 void ScheduleDAGMI::releasePredecessors(SUnit *SU) {
   for (SUnit::pred_iterator I = SU->Preds.begin(), E = SU->Preds.end();
        I != E; ++I) {
     releasePred(SU, &*I);
   }
 }

+void ScheduleDAGMI::startBlock(MachineBasicBlock *bb) {
+  ScheduleDAGInstrs::startBlock(bb);
+  SchedImpl->enterMBB(bb);
+}
+
+void ScheduleDAGMI::finishBlock() {
+  SchedImpl->leaveMBB();
+  ScheduleDAGInstrs::finishBlock();
+}
+
 /// enterRegion - Called back from MachineScheduler::runOnMachineFunction after
 /// crossing a scheduling boundary. [begin, end) includes all instructions in
 /// the region, including the boundary itself and single-instruction regions
 /// that don't get scheduled.
 void ScheduleDAGMI::enterRegion(MachineBasicBlock *bb,
                                 MachineBasicBlock::iterator begin,
                                 MachineBasicBlock::iterator end,
                                 unsigned regioninstrs)

lib/Target/SystemZ/SystemZHazardRecognizer.h

Show All 13 Lines
// scheduling candidates. This includes:		// scheduling candidates. This includes:
//		//
// * Decoder grouping. A decoder group can maximally hold 3 uops, and		// * Decoder grouping. A decoder group can maximally hold 3 uops, and
// instructions that always begin a new group should be scheduled when		// instructions that always begin a new group should be scheduled when
// the current decoder group is empty.		// the current decoder group is empty.
// * Processor resources usage. It is beneficial to balance the use of		// * Processor resources usage. It is beneficial to balance the use of
// resources.		// resources.
//		//
		// A goal is to consider all instructions, also those outside of any
		// scheduling region. Such instructions are "advanced" past and include
		// single instructions before a scheduling region, branches etc.
		//
		// A block that has only one predecessor continues scheduling with the state
		// of it (which may be updated by emitting branches).
		//
// ===---------------------------------------------------------------------===//		// ===---------------------------------------------------------------------===//

#ifndef LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZHAZARDRECOGNIZER_H		#ifndef LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZHAZARDRECOGNIZER_H
#define LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZHAZARDRECOGNIZER_H		#define LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZHAZARDRECOGNIZER_H

#include "SystemZSubtarget.h"		#include "SystemZSubtarget.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
#include "llvm/CodeGen/MachineFunction.h"		#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineScheduler.h"		#include "llvm/CodeGen/MachineScheduler.h"
#include "llvm/CodeGen/ScheduleHazardRecognizer.h"		#include "llvm/CodeGen/ScheduleHazardRecognizer.h"
#include "llvm/MC/MCInstrDesc.h"		#include "llvm/MC/MCInstrDesc.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
#include <string>		#include <string>

namespace llvm {		namespace llvm {

/// SystemZHazardRecognizer maintains the state during scheduling.		/// SystemZHazardRecognizer maintains the state for one MBB during scheduling.
class SystemZHazardRecognizer : public ScheduleHazardRecognizer {		class SystemZHazardRecognizer : public ScheduleHazardRecognizer {

ScheduleDAGMI *DAG;		const SystemZInstrInfo *TII;
const TargetSchedModel *SchedModel;		const TargetSchedModel *SchedModel;

/// Keep track of the number of decoder slots used in the current		/// Keep track of the number of decoder slots used in the current
/// decoder group.		/// decoder group.
unsigned CurrGroupSize;		unsigned CurrGroupSize;

/// The tracking of resources here are quite similar to the common		/// The tracking of resources here are quite similar to the common
/// code use of a critical resource. However, z13 differs in the way		/// code use of a critical resource. However, z13 differs in the way
Show All 33 Lines	class SystemZHazardRecognizer : public ScheduleHazardRecognizer {

/// Clear all counters for processor resources.		/// Clear all counters for processor resources.
void clearProcResCounters();		void clearProcResCounters();

/// With the goal of alternating processor sides for stalling (FPd)		/// With the goal of alternating processor sides for stalling (FPd)
/// ops, return true if it seems good to schedule an FPd op next.		/// ops, return true if it seems good to schedule an FPd op next.
bool isFPdOpPreferred_distance(const SUnit *SU);		bool isFPdOpPreferred_distance(const SUnit *SU);

public:		/// Last emitted instruction or nullptr.
SystemZHazardRecognizer(const MachineSchedContext *C);		MachineInstr *LastEmittedMI;

void setDAG(ScheduleDAGMI *dag) {		public:
DAG = dag;		SystemZHazardRecognizer(const SystemZInstrInfo *tii,
SchedModel = dag->getSchedModel();		const TargetSchedModel *SM)
}		: TII(tii), SchedModel(SM) { Reset(); }

HazardType getHazardType(SUnit *m, int Stalls = 0) override;		HazardType getHazardType(SUnit *m, int Stalls = 0) override;
void Reset() override;		void Reset() override;
void EmitInstruction(SUnit *SU) override;		void EmitInstruction(SUnit *SU) override;

		/// Resolve and cache the scheduling class for an SUnit.
		const MCSchedClassDesc *getSchedClass(SUnit *SU) const {
		if (!SU->SchedClass && SchedModel->hasInstrSchedModel())
		SU->SchedClass = SchedModel->resolveSchedClass(SU->getInstr());
		return SU->SchedClass;
		}

		/// Wrap a non-scheduled instruction in an SU and emit it.
		void emitInstruction(MachineInstr *MI, bool TakenBranch = false);

// Cost functions used by SystemZPostRASchedStrategy while		// Cost functions used by SystemZPostRASchedStrategy while
// evaluating candidates.		// evaluating candidates.

/// Return the cost of decoder grouping for SU. If SU must start a		/// Return the cost of decoder grouping for SU. If SU must start a
/// new decoder group, this is negative if this fits the schedule or		/// new decoder group, this is negative if this fits the schedule or
/// positive if it would mean ending a group prematurely. For normal		/// positive if it would mean ending a group prematurely. For normal
/// instructions this returns 0.		/// instructions this returns 0.
int groupingCost(SUnit *SU) const;		int groupingCost(SUnit *SU) const;

/// Return the cost of SU in regards to processor resources usage.		/// Return the cost of SU in regards to processor resources usage.
/// A positive value means it would be better to wait with SU, while		/// A positive value means it would be better to wait with SU, while
/// a negative value means it would be good to schedule SU next.		/// a negative value means it would be good to schedule SU next.
int resourcesCost(SUnit *SU);		int resourcesCost(SUnit *SU);

#ifndef NDEBUG		#ifndef NDEBUG
// Debug dumping.		// Debug dumping.
std::string CurGroupDbg; // current group as text		std::string CurGroupDbg; // current group as text
void dumpSU(SUnit *SU, raw_ostream &OS) const;		void dumpSU(SUnit *SU, raw_ostream &OS) const;
void dumpCurrGroup(std::string Msg = "") const;		void dumpCurrGroup(std::string Msg = "") const;
void dumpProcResourceCounters() const;		void dumpProcResourceCounters() const;
#endif		#endif

		MachineBasicBlock::iterator getLastEmittedMI() { return LastEmittedMI; }

		/// Copy counters from end of single predecessor.
		void copyState(SystemZHazardRecognizer *Incoming);
};		};

} // namespace llvm		} // namespace llvm

#endif /* LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZHAZARDRECOGNIZER_H */		#endif /* LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZHAZARDRECOGNIZER_H */

lib/Target/SystemZ/SystemZHazardRecognizer.cpp

Show All 13 Lines
// scheduling candidates. This includes:		// scheduling candidates. This includes:
//		//
// * Decoder grouping. A decoder group can maximally hold 3 uops, and		// * Decoder grouping. A decoder group can maximally hold 3 uops, and
// instructions that always begin a new group should be scheduled when		// instructions that always begin a new group should be scheduled when
// the current decoder group is empty.		// the current decoder group is empty.
// * Processor resources usage. It is beneficial to balance the use of		// * Processor resources usage. It is beneficial to balance the use of
// resources.		// resources.
//		//
		// A goal is to consider all instructions, also those outside of any
		// scheduling region. Such instructions are "advanced" past and include
		// single instructions before a scheduling region, branches etc.
		//
		// A block that has only one predecessor continues scheduling with the state
		// of it (which may be updated by emitting branches).
		//
// ===---------------------------------------------------------------------===//		// ===---------------------------------------------------------------------===//

#include "SystemZHazardRecognizer.h"		#include "SystemZHazardRecognizer.h"
#include "llvm/ADT/Statistic.h"		#include "llvm/ADT/Statistic.h"

using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "misched"		#define DEBUG_TYPE "misched"

// This is the limit of processor resource usage at which the		// This is the limit of processor resource usage at which the
// scheduler should try to look for other instructions (not using the		// scheduler should try to look for other instructions (not using the
// critical resource).		// critical resource).
static cl::opt<int> ProcResCostLim("procres-cost-lim", cl::Hidden,		static cl::opt<int> ProcResCostLim("procres-cost-lim", cl::Hidden,
cl::desc("The OOO window for processor "		cl::desc("The OOO window for processor "
"resources during scheduling."),		"resources during scheduling."),
cl::init(8));		cl::init(8));

SystemZHazardRecognizer::
SystemZHazardRecognizer(const MachineSchedContext *C) : DAG(nullptr),
SchedModel(nullptr) {}

unsigned SystemZHazardRecognizer::		unsigned SystemZHazardRecognizer::
getNumDecoderSlots(SUnit *SU) const {		getNumDecoderSlots(SUnit *SU) const {
const MCSchedClassDesc *SC = DAG->getSchedClass(SU);		const MCSchedClassDesc *SC = getSchedClass(SU);
if (!SC->isValid())		if (!SC->isValid())
return 0; // IMPLICIT_DEF / KILL -- will not make impact in output.		return 0; // IMPLICIT_DEF / KILL -- will not make impact in output.

if (SC->BeginGroup) {		if (SC->BeginGroup) {
if (!SC->EndGroup)		if (!SC->EndGroup)
return 2; // Cracked instruction		return 2; // Cracked instruction
else		else
return 3; // Expanded/group-alone instruction		return 3; // Expanded/group-alone instruction
Show All 14 Lines	getHazardType(SUnit *m, int Stalls) {
return (fitsIntoCurrentGroup(m) ? NoHazard : Hazard);		return (fitsIntoCurrentGroup(m) ? NoHazard : Hazard);
}		}

void SystemZHazardRecognizer::Reset() {		void SystemZHazardRecognizer::Reset() {
CurrGroupSize = 0;		CurrGroupSize = 0;
clearProcResCounters();		clearProcResCounters();
GrpCount = 0;		GrpCount = 0;
LastFPdOpCycleIdx = UINT_MAX;		LastFPdOpCycleIdx = UINT_MAX;
		LastEmittedMI = nullptr;
DEBUG(CurGroupDbg = "";);		DEBUG(CurGroupDbg = "";);
}		}

bool		bool
SystemZHazardRecognizer::fitsIntoCurrentGroup(SUnit *SU) const {		SystemZHazardRecognizer::fitsIntoCurrentGroup(SUnit *SU) const {
const MCSchedClassDesc *SC = DAG->getSchedClass(SU);		const MCSchedClassDesc *SC = getSchedClass(SU);
if (!SC->isValid())		if (!SC->isValid())
return true;		return true;

// A cracked instruction only fits into schedule if the current		// A cracked instruction only fits into schedule if the current
// group is empty.		// group is empty.
if (SC->BeginGroup)		if (SC->BeginGroup)
return (CurrGroupSize == 0);		return (CurrGroupSize == 0);

Show All 30 Lines	void SystemZHazardRecognizer::nextGroup(bool DbgOutput) {

DEBUG(if (DbgOutput)		DEBUG(if (DbgOutput)
dumpProcResourceCounters(););		dumpProcResourceCounters(););
}		}

#ifndef NDEBUG // Debug output		#ifndef NDEBUG // Debug output
void SystemZHazardRecognizer::dumpSU(SUnit *SU, raw_ostream &OS) const {		void SystemZHazardRecognizer::dumpSU(SUnit *SU, raw_ostream &OS) const {
OS << "SU(" << SU->NodeNum << "):";		OS << "SU(" << SU->NodeNum << "):";
OS << SchedModel->getInstrInfo()->getName(SU->getInstr()->getOpcode());		OS << TII->getName(SU->getInstr()->getOpcode());

const MCSchedClassDesc *SC = DAG->getSchedClass(SU);		const MCSchedClassDesc *SC = getSchedClass(SU);
if (!SC->isValid())		if (!SC->isValid())
return;		return;

for (TargetSchedModel::ProcResIter		for (TargetSchedModel::ProcResIter
PI = SchedModel->getWriteProcResBegin(SC),		PI = SchedModel->getWriteProcResBegin(SC),
PE = SchedModel->getWriteProcResEnd(SC); PI != PE; ++PI) {		PE = SchedModel->getWriteProcResEnd(SC); PI != PE; ++PI) {
const MCProcResourceDesc &PRD =		const MCProcResourceDesc &PRD =
*SchedModel->getProcResource(PI->ProcResourceIdx);		*SchedModel->getProcResource(PI->ProcResourceIdx);
▲ Show 20 Lines • Show All 56 Lines • ▼ Show 20 Lines
}		}
#endif //NDEBUG		#endif //NDEBUG

void SystemZHazardRecognizer::clearProcResCounters() {		void SystemZHazardRecognizer::clearProcResCounters() {
ProcResourceCounters.assign(SchedModel->getNumProcResourceKinds(), 0);		ProcResourceCounters.assign(SchedModel->getNumProcResourceKinds(), 0);
CriticalResourceIdx = UINT_MAX;		CriticalResourceIdx = UINT_MAX;
}		}

		static inline bool isBranchRetTrap(MachineInstr *MI) {
		return (MI->isBranch() || MI->isReturn() ||
		MI->getOpcode() == SystemZ::CondTrap);
		}

// Update state with SU as the next scheduled unit.		// Update state with SU as the next scheduled unit.
void SystemZHazardRecognizer::		void SystemZHazardRecognizer::
EmitInstruction(SUnit *SU) {		EmitInstruction(SUnit *SU) {
const MCSchedClassDesc *SC = DAG->getSchedClass(SU);		const MCSchedClassDesc *SC = getSchedClass(SU);
DEBUG( dumpCurrGroup("Decode group before emission"););		DEBUG( dumpCurrGroup("Decode group before emission"););

// If scheduling an SU that must begin a new decoder group, move on		// If scheduling an SU that must begin a new decoder group, move on
// to next group.		// to next group.
if (!fitsIntoCurrentGroup(SU))		if (!fitsIntoCurrentGroup(SU))
nextGroup();		nextGroup();

DEBUG( dbgs() << "+++ HazardRecognizer emitting "; dumpSU(SU, dbgs());		DEBUG( dbgs() << "+++ HazardRecognizer emitting "; dumpSU(SU, dbgs());
dbgs() << "\n";		dbgs() << "\n";
raw_string_ostream cgd(CurGroupDbg);		raw_string_ostream cgd(CurGroupDbg);
if (CurGroupDbg.length())		if (CurGroupDbg.length())
cgd << ", ";		cgd << ", ";
dumpSU(SU, cgd););		dumpSU(SU, cgd););

		LastEmittedMI = SU->getInstr();

// After returning from a call, we don't know much about the state.		// After returning from a call, we don't know much about the state.
if (SU->getInstr()->isCall()) {		if (SU->isCall) {
DEBUG (dbgs() << "+++ Clearing state after call.\n";);		DEBUG (dbgs() << "+++ Clearing state after call.\n";);
clearProcResCounters();		clearProcResCounters();
LastFPdOpCycleIdx = UINT_MAX;		LastFPdOpCycleIdx = UINT_MAX;
CurrGroupSize += getNumDecoderSlots(SU);		CurrGroupSize += getNumDecoderSlots(SU);
assert (CurrGroupSize <= 3);		assert (CurrGroupSize <= 3);
nextGroup();		nextGroup();
return;		return;
}		}
Show All 23 Lines	EmitInstruction(SUnit *SU) {

// Make note of an instruction that uses a blocking resource (FPd).		// Make note of an instruction that uses a blocking resource (FPd).
if (SU->isUnbuffered) {		if (SU->isUnbuffered) {
LastFPdOpCycleIdx = getCurrCycleIdx();		LastFPdOpCycleIdx = getCurrCycleIdx();
DEBUG (dbgs() << "+++ Last FPd cycle index: "		DEBUG (dbgs() << "+++ Last FPd cycle index: "
<< LastFPdOpCycleIdx << "\n";);		<< LastFPdOpCycleIdx << "\n";);
}		}

		bool GroupEndingBranch =
		(CurrGroupSize >= 1 && isBranchRetTrap(SU->getInstr()));

// Insert SU into current group by increasing number of slots used		// Insert SU into current group by increasing number of slots used
// in current group.		// in current group.
CurrGroupSize += getNumDecoderSlots(SU);		CurrGroupSize += getNumDecoderSlots(SU);
assert (CurrGroupSize <= 3);		assert (CurrGroupSize <= 3);

// Check if current group is now full/ended. If so, move on to next		// Check if current group is now full/ended. If so, move on to next
// group to be ready to evaluate more candidates.		// group to be ready to evaluate more candidates.
if (CurrGroupSize == 3 || SC->EndGroup)		if (CurrGroupSize == 3 || SC->EndGroup || GroupEndingBranch)
nextGroup();		nextGroup();
}		}

int SystemZHazardRecognizer::groupingCost(SUnit *SU) const {		int SystemZHazardRecognizer::groupingCost(SUnit *SU) const {
const MCSchedClassDesc *SC = DAG->getSchedClass(SU);		const MCSchedClassDesc *SC = getSchedClass(SU);
if (!SC->isValid())		if (!SC->isValid())
return 0;		return 0;

// If SU begins new group, it can either break a current group early		// If SU begins new group, it can either break a current group early
// or fit naturally if current group is empty (negative cost).		// or fit naturally if current group is empty (negative cost).
if (SC->BeginGroup) {		if (SC->BeginGroup) {
if (CurrGroupSize)		if (CurrGroupSize)
return 3 - CurrGroupSize;		return 3 - CurrGroupSize;
Show All 27 Lines	if (LastFPdOpCycleIdx > getCurrCycleIdx())
return ((LastFPdOpCycleIdx - getCurrCycleIdx()) == 3);		return ((LastFPdOpCycleIdx - getCurrCycleIdx()) == 3);
return ((getCurrCycleIdx() - LastFPdOpCycleIdx) == 3);		return ((getCurrCycleIdx() - LastFPdOpCycleIdx) == 3);
}		}

int SystemZHazardRecognizer::		int SystemZHazardRecognizer::
resourcesCost(SUnit *SU) {		resourcesCost(SUnit *SU) {
int Cost = 0;		int Cost = 0;

const MCSchedClassDesc *SC = DAG->getSchedClass(SU);		const MCSchedClassDesc *SC = getSchedClass(SU);
if (!SC->isValid())		if (!SC->isValid())
return 0;		return 0;

// For a FPd op, either return min or max value as indicated by the		// For a FPd op, either return min or max value as indicated by the
// distance to any prior FPd op.		// distance to any prior FPd op.
if (SU->isUnbuffered)		if (SU->isUnbuffered)
Cost = (isFPdOpPreferred_distance(SU) ? INT_MIN : INT_MAX);		Cost = (isFPdOpPreferred_distance(SU) ? INT_MIN : INT_MAX);
// For other instructions, give a cost to the use of the critical resource.		// For other instructions, give a cost to the use of the critical resource.
else if (CriticalResourceIdx != UINT_MAX) {		else if (CriticalResourceIdx != UINT_MAX) {
for (TargetSchedModel::ProcResIter		for (TargetSchedModel::ProcResIter
PI = SchedModel->getWriteProcResBegin(SC),		PI = SchedModel->getWriteProcResBegin(SC),
PE = SchedModel->getWriteProcResEnd(SC); PI != PE; ++PI)		PE = SchedModel->getWriteProcResEnd(SC); PI != PE; ++PI)
if (PI->ProcResourceIdx == CriticalResourceIdx)		if (PI->ProcResourceIdx == CriticalResourceIdx)
Cost = PI->Cycles;		Cost = PI->Cycles;
}		}

return Cost;		return Cost;
}		}

		void SystemZHazardRecognizer::emitInstruction(MachineInstr *MI,
		bool TakenBranch) {
		// Make a temporary SUnit.
		SUnit SU(MI, 0);

		// Set interesting flags.
		SU.isCall = MI->isCall();

		const MCSchedClassDesc *SC = SchedModel->resolveSchedClass(MI);
		for (const MCWriteProcResEntry &PRE :
		make_range(SchedModel->getWriteProcResBegin(SC),
		SchedModel->getWriteProcResEnd(SC))) {
		switch (SchedModel->getProcResource(PRE.ProcResourceIdx)->BufferSize) {
		case 0:
		SU.hasReservedResource = true;
		break;
		case 1:
		SU.isUnbuffered = true;
		break;
		default:
		break;
		}
		}

		EmitInstruction(&SU);

		if (TakenBranch && CurrGroupSize > 0)
		nextGroup(false /*DbgOutput*/);
		uweigand (Done): I guess this is not only "counters". Maybe rename to "copyState" ?
		jonpa (Author, Not Done): ok

		assert ((!MI->isTerminator() || isBranchRetTrap(MI)) &&
		"Scheduler: unhandled terminator!");
		}

		void SystemZHazardRecognizer::
		copyState(SystemZHazardRecognizer *Incoming) {
		// Current decoder group
		CurrGroupSize = Incoming->CurrGroupSize;
		DEBUG (CurGroupDbg = Incoming->CurGroupDbg;);

		// Processor resources
		ProcResourceCounters = Incoming->ProcResourceCounters;
		CriticalResourceIdx = Incoming->CriticalResourceIdx;

		// FPd
		LastFPdOpCycleIdx = Incoming->LastFPdOpCycleIdx;
		GrpCount = Incoming->GrpCount;
		}
		uweigand (Done): Hmm. Shouldn't emitInstruction itself try to handle branches correctly? Maybe it only needs one extra bit of information passed in, whether to assume a branch is taken or not?
		jonpa (Author, Not Done): ok - updated so that EmitInstruction() will recognize that any branch in slot 1 or 2 will end the current group (there is at least one instruction already in the group). emitInstruction() takes a new parameter TakenBranch and makes sure that, if it is true, the current decoder group is empty after emission.
		uweigand (Done): There doesn't appear to be anything HazardRecognizer-specific left in this routine. Maybe move to the SchedPolicy (and possibly merge with getNextMIToEmit into an "advanceTo" routine with just an "end" argument)?
		jonpa (Author, Not Done): Yes :-)
		uweigand (Not Done): Maybe also move to the SchedPolicy and merge into its sole user? Then we wouldn't have to pass the MBB around.
		jonpa (Author, Not Done): Yes, looks better. Moved the assert that makes sure that we are aware of all terminators into emitInstruction(). The purpose is currently to check that we don't have any terminators that are branches without the isBranch/isReturn flag. It seems that CondTrap is the only example right now. Perhaps CondTrap could get the isBranch flag instead? Do we want this assert, or could we simply assume that all terminators are branches?
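The branch handling discussed in this thread can be illustrated in isolation. The following is a minimal sketch of the decoder-group bookkeeping only, not the patch's code: `ToyInstr` and `ToyGroupTracker` are hypothetical names, the scheduling-class flags are reduced to plain booleans, and processor resource counters are left out entirely.

```cpp
#include <cassert>

// Hypothetical stand-in for the per-instruction scheduling-class data.
struct ToyInstr {
  unsigned Slots;   // decoder slots used (1, 2 or 3)
  bool BeginGroup;  // must start a new decoder group
  bool EndGroup;    // must end its decoder group
  bool IsBranch;    // branch, return or conditional trap
};

// Hypothetical tracker modelling only the decoder-group state.
struct ToyGroupTracker {
  unsigned CurrGroupSize = 0; // slots used in the current group
  unsigned GrpCount = 0;      // number of completed groups

  // Close the current group, unless it is still empty.
  void nextGroup() {
    if (CurrGroupSize == 0)
      return;
    CurrGroupSize = 0;
    ++GrpCount;
  }

  void emit(const ToyInstr &I, bool TakenBranch = false) {
    // An instruction that must begin a group closes a non-empty one first.
    if (I.BeginGroup && CurrGroupSize != 0)
      nextGroup();

    // A branch that is not first in its group ends that group.
    bool GroupEndingBranch = CurrGroupSize >= 1 && I.IsBranch;

    CurrGroupSize += I.Slots;
    assert(CurrGroupSize <= 3);

    if (CurrGroupSize == 3 || I.EndGroup || GroupEndingBranch)
      nextGroup();

    // After a taken branch, decoding restarts with an empty group.
    if (TakenBranch && CurrGroupSize > 0)
      nextGroup();
  }
};
```

In this model a branch emitted into an empty group leaves the group open unless the caller states that the branch is assumed taken, which corresponds to the extra `TakenBranch` argument described above.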

lib/Target/SystemZ/SystemZMachineScheduler.h

//==- SystemZMachineScheduler.h - SystemZ Scheduler Interface ----- C++ --==//		//==- SystemZMachineScheduler.h - SystemZ Scheduler Interface ----- C++ --==//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// -------------------------- Post RA scheduling ---------------------------- //		// -------------------------- Post RA scheduling ---------------------------- //
// SystemZPostRASchedStrategy is a scheduling strategy which is plugged into		// SystemZPostRASchedStrategy is a scheduling strategy which is plugged into
// the MachineScheduler. It has a sorted Available set of SUs and a pickNode()		// the MachineScheduler. It has a sorted Available set of SUs and a pickNode()
// implementation that looks to optimize decoder grouping and balance the		// implementation that looks to optimize decoder grouping and balance the
// usage of processor resources.		// usage of processor resources. Scheduler states are saved for the end
		// region of each MBB, so that a successor block can learn from it.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "SystemZHazardRecognizer.h"		#include "SystemZHazardRecognizer.h"
#include "llvm/CodeGen/MachineScheduler.h"		#include "llvm/CodeGen/MachineScheduler.h"
#include "llvm/CodeGen/ScheduleDAG.h"		#include "llvm/CodeGen/ScheduleDAG.h"
#include <set>		#include <set>

#ifndef LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZMACHINESCHEDULER_H		#ifndef LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZMACHINESCHEDULER_H
#define LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZMACHINESCHEDULER_H		#define LLVM_LIB_TARGET_SYSTEMZ_SYSTEMZMACHINESCHEDULER_H

using namespace llvm;		using namespace llvm;

namespace llvm {		namespace llvm {

/// A MachineSchedStrategy implementation for SystemZ post RA scheduling.		/// A MachineSchedStrategy implementation for SystemZ post RA scheduling.
class SystemZPostRASchedStrategy : public MachineSchedStrategy {		class SystemZPostRASchedStrategy : public MachineSchedStrategy {
ScheduleDAGMI *DAG;
		const MachineLoopInfo *MLI;
		const SystemZInstrInfo *TII;

		// A SchedModel is needed before any DAG is built while advancing past
		// non-scheduled instructions, so it would not always be possible to call
		// DAG->getSchedClass(SU).
		TargetSchedModel SchedModel;

/// A candidate during instruction evaluation.		/// A candidate during instruction evaluation.
struct Candidate {		struct Candidate {
SUnit *SU = nullptr;		SUnit *SU = nullptr;

/// The decoding cost.		/// The decoding cost.
int GroupingCost = 0;		int GroupingCost = 0;

Show All 34 Lines	struct SUSet : std::set<SUnit*, SUSorter> {
#ifndef NDEBUG		#ifndef NDEBUG
void dump(SystemZHazardRecognizer &HazardRec);		void dump(SystemZHazardRecognizer &HazardRec);
#endif		#endif
};		};

/// The set of available SUs to schedule next.		/// The set of available SUs to schedule next.
SUSet Available;		SUSet Available;

// HazardRecognizer that tracks the scheduler state for the current		/// Current MBB
// region.		MachineBasicBlock *MBB;
SystemZHazardRecognizer HazardRec;
		/// Maintain hazard recognizers for all blocks, so that the scheduler state
		/// can be maintained past BB boundaries when appropriate.
		typedef std::map<MachineBasicBlock*, SystemZHazardRecognizer*> MBB2HazRec;
		MBB2HazRec SchedStates;

		/// Pointer to the HazardRecognizer that tracks the scheduler state for
		/// the current region.
		SystemZHazardRecognizer *HazardRec;

		/// Update the scheduler state by emitting (non-scheduled) instructions
		/// up to, but not including, NextBegin.
		void advanceTo(MachineBasicBlock::iterator NextBegin);

public:		public:
SystemZPostRASchedStrategy(const MachineSchedContext *C);		SystemZPostRASchedStrategy(const MachineSchedContext *C);
		uweigand (Done): This is dead now, right?
		jonpa (Author, Not Done): ah, yes.
		virtual ~SystemZPostRASchedStrategy();

		/// Called for a region before scheduling.
		void initPolicy(MachineBasicBlock::iterator Begin,
		MachineBasicBlock::iterator End,
		unsigned NumRegionInstrs) override;

/// PostRA scheduling does not track pressure.		/// PostRA scheduling does not track pressure.
bool shouldTrackPressure() const override { return false; }		bool shouldTrackPressure() const override { return false; }

/// Initialize the strategy after building the DAG for a new region.		// Process scheduling regions top-down so that scheduler states can be
void initialize(ScheduleDAGMI *dag) override;		// transferred over scheduling boundaries.
		bool doMBBSchedRegionsTopDown() const override { return true; }

		void initialize(ScheduleDAGMI *dag) override {}

		/// Tell the strategy that MBB is about to be processed.
		void enterMBB(MachineBasicBlock *NextMBB) override;

		/// Tell the strategy that current MBB is done.
		void leaveMBB() override;

/// Pick the next node to schedule, or return NULL.		/// Pick the next node to schedule, or return NULL.
SUnit *pickNode(bool &IsTopNode) override;		SUnit *pickNode(bool &IsTopNode) override;

/// ScheduleDAGMI has scheduled an instruction - tell HazardRec		/// ScheduleDAGMI has scheduled an instruction - tell HazardRec
/// about it.		/// about it.
void schedNode(SUnit *SU, bool IsTopNode) override;		void schedNode(SUnit *SU, bool IsTopNode) override;

Show All 11 Lines
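As an illustration of the per-block state map declared in this header, here is a minimal sketch under simplifying assumptions. `ToyState` and `ToyStrategy` are hypothetical names, blocks are identified by plain integers, and ownership is handled with `std::unique_ptr`, whereas the patch stores raw pointers in `SchedStates`.

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <utility>

// Hypothetical reduced scheduler state (the real recognizer tracks more).
struct ToyState {
  unsigned CurrGroupSize = 0;
  unsigned GrpCount = 0;
};

// Hypothetical strategy that keeps one outgoing state per block.
class ToyStrategy {
  std::map<int, std::unique_ptr<ToyState>> SchedStates;

public:
  // Create the state for BlockId. If PredId names a block that has already
  // been scheduled, start from a copy of its outgoing state; otherwise start
  // from a fresh state. A negative PredId means "no usable predecessor".
  ToyState &enterBlock(int BlockId, int PredId) {
    assert(SchedStates.find(BlockId) == SchedStates.end() &&
           "Entering block twice?");
    std::unique_ptr<ToyState> Fresh(new ToyState());
    auto It = SchedStates.find(PredId);
    if (It != SchedStates.end())
      *Fresh = *It->second; // take over the predecessor's state
    ToyState *Raw = Fresh.get();
    SchedStates[BlockId] = std::move(Fresh);
    return *Raw;
  }
};
```

Because each block gets its own copy, later updates in the successor do not disturb the state recorded for the predecessor, which another successor might still want to read.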

lib/Target/SystemZ/SystemZMachineScheduler.cpp

//-- SystemZMachineScheduler.cpp - SystemZ Scheduler Interface -- C++ ----==//		//-- SystemZMachineScheduler.cpp - SystemZ Scheduler Interface -- C++ ----==//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// -------------------------- Post RA scheduling ---------------------------- //		// -------------------------- Post RA scheduling ---------------------------- //
// SystemZPostRASchedStrategy is a scheduling strategy which is plugged into		// SystemZPostRASchedStrategy is a scheduling strategy which is plugged into
// the MachineScheduler. It has a sorted Available set of SUs and a pickNode()		// the MachineScheduler. It has a sorted Available set of SUs and a pickNode()
// implementation that looks to optimize decoder grouping and balance the		// implementation that looks to optimize decoder grouping and balance the
// usage of processor resources.		// usage of processor resources. Scheduler states are saved for the end
		// region of each MBB, so that a successor block can learn from it.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "SystemZMachineScheduler.h"		#include "SystemZMachineScheduler.h"

using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "misched"		#define DEBUG_TYPE "misched"

#ifndef NDEBUG		#ifndef NDEBUG
// Print the set of SUs		// Print the set of SUs
void SystemZPostRASchedStrategy::SUSet::		void SystemZPostRASchedStrategy::SUSet::
dump(SystemZHazardRecognizer &HazardRec) {		dump(SystemZHazardRecognizer &HazardRec) {
dbgs() << "{";		dbgs() << "{";
for (auto &SU : *this) {		for (auto &SU : *this) {
HazardRec.dumpSU(SU, dbgs());		HazardRec.dumpSU(SU, dbgs());
if (SU != *rbegin())		if (SU != *rbegin())
dbgs() << ", ";		dbgs() << ", ";
}		}
dbgs() << "}\n";		dbgs() << "}\n";
}		}
#endif		#endif

		// Try to find a single predecessor that would be interesting for the
		// scheduler in the top-most region of MBB.
		static MachineBasicBlock *getSingleSchedPred(MachineBasicBlock *MBB,
		const MachineLoop *Loop) {
		MachineBasicBlock *PredMBB = nullptr;
		if (MBB->pred_size() == 1)
		PredMBB = *MBB->pred_begin();

		// The loop header has two predecessors, return the latch, but not for a
		// single block loop.
		if (MBB->pred_size() == 2 && Loop != nullptr && Loop->getHeader() == MBB) {
		for (auto I = MBB->pred_begin(); I != MBB->pred_end(); ++I)
		if (Loop->contains(*I))
		PredMBB = (*I == MBB ? nullptr : *I);
		}

		assert ((PredMBB == nullptr || !Loop || Loop->contains(PredMBB))
		&& "Loop MBB should not consider predecessor outside of loop.");

		return PredMBB;
		}
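The predecessor selection above can be tried out on a toy control-flow graph. This is a minimal sketch assuming hypothetical `ToyBlock` and `ToyLoop` types in place of `MachineBasicBlock` and `MachineLoop`; it mirrors only the selection logic and omits the assertion.

```cpp
#include <set>
#include <vector>

// Hypothetical block: an id plus its predecessor list.
struct ToyBlock {
  int Id;
  std::vector<const ToyBlock *> Preds;
};

// Hypothetical loop: a header and the set of blocks it contains.
struct ToyLoop {
  const ToyBlock *Header;
  std::set<const ToyBlock *> Blocks;
  bool contains(const ToyBlock *B) const { return Blocks.count(B) != 0; }
};

// Return the predecessor whose outgoing state is worth continuing from,
// or nullptr if there is none.
const ToyBlock *singleSchedPred(const ToyBlock *B, const ToyLoop *Loop) {
  const ToyBlock *Pred = nullptr;
  if (B->Preds.size() == 1)
    Pred = B->Preds.front();

  // A loop header with two predecessors: prefer the latch (the predecessor
  // inside the loop), but not when the header is its own latch.
  if (B->Preds.size() == 2 && Loop && Loop->Header == B)
    for (const ToyBlock *P : B->Preds)
      if (Loop->contains(P))
        Pred = (P == B ? nullptr : P);

  return Pred;
}
```

For a header reached from a preheader and a latch, the latch is returned; for a single-block loop the result is `nullptr`, so that block starts from a fresh state.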

		void SystemZPostRASchedStrategy::
		advanceTo(MachineBasicBlock::iterator NextBegin) {
		MachineBasicBlock::iterator LastEmittedMI = HazardRec->getLastEmittedMI();
		uweigand (Not Done): Is it even ever possible now that LastEmittedMI->getParent() is not equal to MBB?
		jonpa (Author, Not Done): Yes, a terminator in the single predecessor will be emitted and recorded as LastEmittedMI. I tried first to guard the update of LastEmittedMI with this check, but then realized that the HazardRec does not have the MBB member.
		MachineBasicBlock::iterator I =
		((LastEmittedMI != nullptr && LastEmittedMI->getParent() == MBB) ?
		std::next(LastEmittedMI) : MBB->begin());

		uweigand: Maybe inline into its single user? Only if the result looks simpler ...
		jonpa: Yes, I guess that looks ok.
		for (; I != NextBegin; ++I) {
		if (I->isPosition() || I->isDebugValue())
		continue;
		HazardRec->emitInstruction(&*I);
		}
		}

		void SystemZPostRASchedStrategy::enterMBB(MachineBasicBlock *NextMBB) {
		assert ((SchedStates.find(NextMBB) == SchedStates.end()) &&
		"Entering MBB twice?");
		uweigand: Maybe use HazardRec here? It is now guaranteed to be the same ...
		jonpa: ok
		DEBUG (dbgs() << "+++ Entering MBB#" << NextMBB->getNumber());

		MBB = NextMBB;
		/// Create a HazardRec for MBB, save it in SchedStates and set HazardRec to
		/// point to it.
		HazardRec = SchedStates[MBB] = new SystemZHazardRecognizer(TII, &SchedModel);
		DEBUG (const MachineLoop *Loop = MLI->getLoopFor(MBB);
		if(Loop && Loop->getHeader() == MBB)
		dbgs() << " (Loop header)";
		dbgs() << ":\n";);

		// Try to take over the state from a single predecessor, if it has been
		// scheduled. If this is not possible, we are done.
		MachineBasicBlock *SinglePredMBB =
		getSingleSchedPred(MBB, MLI->getLoopFor(MBB));
		if (SinglePredMBB == nullptr ||
		SchedStates.find(SinglePredMBB) == SchedStates.end())
		return;

		DEBUG (dbgs() << "+++ Continued scheduling from MBB#"
		<< SinglePredMBB->getNumber() << "\n";);

		HazardRec->copyState(SchedStates[SinglePredMBB]);

		// Emit incoming terminator(s). Be optimistic and assume that branch
		// prediction will generally do "the right thing".
		for (MachineBasicBlock::iterator I = SinglePredMBB->getFirstTerminator();
		I != SinglePredMBB->end(); I++) {
		DEBUG (dbgs() << "+++ Emitting incoming branch: "; I->dump(););
		bool TakenBranch = (I->isBranch() &&
		(TII->getBranchInfo(*I).Target->isReg() || // Relative branch
		TII->getBranchInfo(*I).Target->getMBB() == MBB));
		HazardRec->emitInstruction(&*I, TakenBranch);
		uweigand: HazardRec again?
		jonpa: done
		if (TakenBranch)
		break;
		}
		}

		void SystemZPostRASchedStrategy::leaveMBB() {
		DEBUG (dbgs() << "+++ Leaving MBB#" << MBB->getNumber() << "\n";);

		// Advance to first terminator. The successor block will handle terminators
		// dependent on CFG layout (T/NT branch etc).
		advanceTo(MBB->getFirstTerminator());
		}

SystemZPostRASchedStrategy::		SystemZPostRASchedStrategy::
SystemZPostRASchedStrategy(const MachineSchedContext *C)		SystemZPostRASchedStrategy(const MachineSchedContext *C)
: DAG(nullptr), HazardRec(C) {}		: MLI(C->MLI),
		TII(static_cast<const SystemZInstrInfo *>
		(C->MF->getSubtarget().getInstrInfo())),
		MBB(nullptr), HazardRec(nullptr) {
		const TargetSubtargetInfo *ST = &C->MF->getSubtarget();
		SchedModel.init(ST->getSchedModel(), ST, TII);
		}

		SystemZPostRASchedStrategy::~SystemZPostRASchedStrategy() {
		// Delete hazard recognizers kept around for each MBB.
		for (auto I : SchedStates) {
		SystemZHazardRecognizer *hazrec = I.second;
		delete hazrec;
		}
		}

		void SystemZPostRASchedStrategy::initPolicy(MachineBasicBlock::iterator Begin,
		MachineBasicBlock::iterator End,
		unsigned NumRegionInstrs) {
		// Don't emit the terminators.
		if (Begin->isTerminator())
		return;

		uweigand: I'm wondering why you don't do the initial allocation of the hazard recognizer for the MBB (and taking over the predecessor state) right here. Currently, it seems that if an MBB is scheduled, this is done in "initialize", and if the MBB is not scheduled, it is done in "leaveMBB". If you'd always do it here, it seems that duplication could be removed.
		jonpa: There is one subtle change in behavior with this patch, and that is the fact that initPolicy() (via enterRegion()) is never called on an empty MBB. That is why the DoneMBB argument is needed for leaveMBB(). I thought about adding an empty region for an empty MBB so that initPolicy() is always called for each MBB. That still didn't quite work, since the MBB cannot then be retrieved from Begin->getParent() if Begin == End.
		uweigand: Maybe the common-code interface still isn't quite right then. Should we have an enterMBB() to properly pair with the leaveMBB(), maybe?
		jonpa: Yes, this seems better.
		uweigand: Agreed. Final question: can't the "advance" now be done in initPolicy, so we're finally rid of this function (and the CurrBegin global)?
		jonpa: Yes, indeed :-)
void SystemZPostRASchedStrategy::initialize(ScheduleDAGMI *dag) {		// Emit any instructions before start of region.
DAG = dag;		advanceTo(Begin);
HazardRec.setDAG(dag);
HazardRec.Reset();
}		}

// Pick the next node to schedule.		// Pick the next node to schedule.
SUnit *SystemZPostRASchedStrategy::pickNode(bool &IsTopNode) {		SUnit *SystemZPostRASchedStrategy::pickNode(bool &IsTopNode) {
// Only scheduling top-down.		// Only scheduling top-down.
IsTopNode = true;		IsTopNode = true;

if (Available.empty())		if (Available.empty())
return nullptr;		return nullptr;

// If only one choice, return it.		// If only one choice, return it.
if (Available.size() == 1) {		if (Available.size() == 1) {
DEBUG (dbgs() << "+++ Only one: ";		DEBUG (dbgs() << "+++ Only one: ";
HazardRec.dumpSU(*Available.begin(), dbgs()); dbgs() << "\n";);		HazardRec->dumpSU(*Available.begin(), dbgs()); dbgs() << "\n";);
return *Available.begin();		return *Available.begin();
}		}

// All nodes that are possible to schedule are stored in the		// All nodes that are possible to schedule are stored in the
// Available set.		// Available set.
DEBUG(dbgs() << "+++ Available: "; Available.dump(HazardRec););		DEBUG(dbgs() << "+++ Available: "; Available.dump(*HazardRec););

Candidate Best;		Candidate Best;
for (auto *SU : Available) {		for (auto *SU : Available) {

// SU is the next candidate to be compared against current Best.		// SU is the next candidate to be compared against current Best.
Candidate c(SU, HazardRec);		Candidate c(SU, *HazardRec);

// Remember which SU is the best candidate.		// Remember which SU is the best candidate.
if (Best.SU == nullptr || c < Best) {		if (Best.SU == nullptr || c < Best) {
Best = c;		Best = c;
DEBUG(dbgs() << "+++ Best sofar: ";		DEBUG(dbgs() << "+++ Best sofar: ";
HazardRec.dumpSU(Best.SU, dbgs());		HazardRec->dumpSU(Best.SU, dbgs());
if (Best.GroupingCost != 0)		if (Best.GroupingCost != 0)
dbgs() << "\tGrouping cost:" << Best.GroupingCost;		dbgs() << "\tGrouping cost:" << Best.GroupingCost;
if (Best.ResourcesCost != 0)		if (Best.ResourcesCost != 0)
dbgs() << " Resource cost:" << Best.ResourcesCost;		dbgs() << " Resource cost:" << Best.ResourcesCost;
dbgs() << " Height:" << Best.SU->getHeight();		dbgs() << " Height:" << Best.SU->getHeight();
dbgs() << "\n";);		dbgs() << "\n";);
}		}

[... 48 lines not shown ...]		operator<(const Candidate &other) {
return false;		return false;
}		}

void SystemZPostRASchedStrategy::schedNode(SUnit *SU, bool IsTopNode) {		void SystemZPostRASchedStrategy::schedNode(SUnit *SU, bool IsTopNode) {
DEBUG(dbgs() << "+++ Scheduling SU(" << SU->NodeNum << ")\n";);		DEBUG(dbgs() << "+++ Scheduling SU(" << SU->NodeNum << ")\n";);

// Remove SU from Available set and update HazardRec.		// Remove SU from Available set and update HazardRec.
Available.erase(SU);		Available.erase(SU);
HazardRec.EmitInstruction(SU);		HazardRec->EmitInstruction(SU);
}		}

void SystemZPostRASchedStrategy::releaseTopNode(SUnit *SU) {		void SystemZPostRASchedStrategy::releaseTopNode(SUnit *SU) {
// Set isScheduleHigh flag on all SUs that we want to consider first in		// Set isScheduleHigh flag on all SUs that we want to consider first in
// pickNode().		// pickNode().
const MCSchedClassDesc *SC = DAG->getSchedClass(SU);		const MCSchedClassDesc *SC = HazardRec->getSchedClass(SU);
bool AffectsGrouping = (SC->isValid() && (SC->BeginGroup || SC->EndGroup));		bool AffectsGrouping = (SC->isValid() && (SC->BeginGroup || SC->EndGroup));
SU->isScheduleHigh = (AffectsGrouping || SU->isUnbuffered);		SU->isScheduleHigh = (AffectsGrouping || SU->isUnbuffered);

// Put all released SUs in the Available set.		// Put all released SUs in the Available set.
Available.insert(SU);		Available.insert(SU);
}		}
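
Note on the state hand-over in enterMBB() and getSingleSchedPred() above: the following is a minimal standalone sketch of the idea, not code from the patch. It assumes simplified stand-in types (`Block`, `SchedState`, `Scheduler`, `singleSchedPred`, `inLoop` are all hypothetical names) and models the hazard recognizer state as a single counter. It only illustrates the rule "inherit the outgoing state of a single predecessor, or of the latch for a loop header, if that block has already been scheduled".

```cpp
#include <cassert>
#include <map>
#include <memory>
#include <vector>

// Hypothetical stand-in for a machine basic block (not an LLVM type).
struct Block {
  int Number = 0;
  std::vector<Block *> Preds;
  bool IsLoopHeader = false;       // assumed to come from a loop analysis
  std::vector<Block *> LoopBlocks; // blocks of the loop headed by this block
};

// Hypothetical stand-in for the per-block hazard recognizer state.
struct SchedState {
  unsigned ResourceCounter = 0;
};

static bool inLoop(const Block &Header, const Block *B) {
  for (const Block *L : Header.LoopBlocks)
    if (L == B)
      return true;
  return false;
}

// Return the one predecessor whose outgoing state is worth inheriting,
// or nullptr if there is none.
static Block *singleSchedPred(Block &B) {
  if (B.Preds.size() == 1)
    return B.Preds.front();
  // A loop header with two predecessors: pick the latch, but not for a
  // single-block loop, where the latch is the header itself.
  if (B.Preds.size() == 2 && B.IsLoopHeader)
    for (Block *P : B.Preds)
      if (inLoop(B, P))
        return P == &B ? nullptr : P;
  return nullptr;
}

class Scheduler {
  // One state per block, representing the outgoing state of that block.
  std::map<const Block *, std::unique_ptr<SchedState>> States;
  SchedState *Cur = nullptr;

public:
  void enterBlock(Block &B) {
    assert(States.find(&B) == States.end() && "Entering block twice?");
    auto Owned = std::make_unique<SchedState>();
    Cur = Owned.get();
    States[&B] = std::move(Owned);

    // Take over the predecessor state only if that block was scheduled.
    Block *Pred = singleSchedPred(B);
    if (!Pred)
      return;
    auto It = States.find(Pred);
    if (It == States.end())
      return;
    *Cur = *It->second;
  }

  // Stands in for emitting an instruction into the current state.
  void emit(unsigned Cost) { Cur->ResourceCounter += Cost; }
  unsigned counter() const { return Cur->ResourceCounter; }
};
```

In this sketch, entering a block whose only predecessor has already been processed starts from that predecessor's counter, while a merge block with two non-loop predecessors starts from a fresh state. Owning the states through `std::unique_ptr` is a choice made for the sketch; the patch itself stores raw pointers and deletes them in the destructor.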

test/CodeGen/SystemZ/int-cmp-48.ll

[... 23 lines not shown ...]		exit:
ret void		ret void
}		}


; Check that we do not fold across an aliasing store.		; Check that we do not fold across an aliasing store.
define void @f2(i8 *%src) {		define void @f2(i8 *%src) {
; CHECK-LABEL: f2:		; CHECK-LABEL: f2:
; CHECK: llc [[REG:%r[0-5]]], 0(%r2)		; CHECK: llc [[REG:%r[0-5]]], 0(%r2)
; CHECK: tmll [[REG]], 1		; CHECK-DAG: mvi 0(%r2), 0
; CHECK: mvi 0(%r2), 0		; CHECK-DAG: tmll [[REG]], 1
; CHECK: ber %r14		; CHECK: ber %r14
; CHECK: br %r14		; CHECK: br %r14
entry:		entry:
%byte = load i8 , i8 *%src		%byte = load i8 , i8 *%src
store i8 0, i8 *%src		store i8 0, i8 *%src
%and = and i8 %byte, 1		%and = and i8 %byte, 1
%cmp = icmp eq i8 %and, 0		%cmp = icmp eq i8 %and, 0
br i1 %cmp, label %exit, label %store		br i1 %cmp, label %exit, label %store
[... 204 lines not shown ...]

Revision Contents

Diff 111473

include/llvm/CodeGen/MachineScheduler.h

include/llvm/CodeGen/ScheduleDAGInstrs.h

lib/CodeGen/MachineScheduler.cpp

lib/Target/SystemZ/SystemZHazardRecognizer.h

lib/Target/SystemZ/SystemZHazardRecognizer.cpp

lib/Target/SystemZ/SystemZMachineScheduler.h

lib/Target/SystemZ/SystemZMachineScheduler.cpp

test/CodeGen/SystemZ/int-cmp-48.ll
