This is an archive of the discontinued LLVM Phabricator instance.

Differential D14928

[PostRA scheduling] Allow a subtarget to do scheduling when it wants post RA
ClosedPublic

Authored by jonpa on Nov 23 2015, 7:34 AM.

Download Raw Diff

Details

Reviewers

hfinkel

Summary

As it says in Passes.cpp about post RA scheduling pass insertion:

...
Ideally it wouldn't be part of the standard pass pipeline, and the target would just add
a PostRA scheduling pass wherever it wants.
...

With this comment in mind, I suspect that there may be some potential broader redesign of
command line options / subtarget methods awating, but this works for now.

Does anybody have an opinion / better suggestion, or is this acceptable?

Motivation:

To allow the use of the PostMachineScheduler at an arbitrary (later)
point while not running the PostRAScheduler, it was necessary to add a
new method TargetSubtargetInfo::customPostRAScheduling(), otherwise
the scheduler will immediately abort.

Any target that wants to insert the PostMachineScheduler pass at a
custom place should override this method to return true.

Diff Detail

Event Timeline

jonpa updated this revision to Diff 40931.Nov 23 2015, 7:34 AM

jonpa retitled this revision from to [PostRA scheduling] Allow a subtarget to do scheduling when it wants post RA.

jonpa updated this object.

jonpa updated this object.Nov 23 2015, 7:36 AM

jonpa added a subscriber: llvm-commits.

Please post this with full context.

Why don't you just return false from enablePostRAScheduler() and then schedule the pass yourself anyway (and just fix the scheduler so it does not abort)?

You are right - the check in PostrAScheduler is not needed as this is actually done already a few lines down. A bit confusing that this pass is loaded into the pass manager even though the subtarget has it disabled...
Removed.

What is left is the new customPostRAScheduling() hook, which just tells PostMachineScheduler to keep
running even if post-ra scheduling is disabled by subtarget. This is so that the target can use PostMachineScheduler anywhere it wants.

Backing up a bit (because I still don't really like this API), where in the pipeline do you want to run post-RA scheduling? We currently have (omitting INC and GC passes):

RA
addPostRegAlloc() -- targets can easily add passes here
shrink wrapping
PEI
branch folding, tail duplication and copy propagation
expand post-RA pseudos (which, by design, we assume happens before post-RA scheduling)
addPreSched2(); -- targets can easily add passes here
post-RA-scheduling
block placement
addPreEmitPass() -- targets can easily add passes here

I want to add a scheduling pass after all else - that is as the last pass added in addPreEmitPass(). The motivation is that the decoder needs special grouping of instructions which other optimizations in PreEmitPass() would otherwise disrupt. SystemZ currently runs 3 optimization passes in pre-emit with the motivation that they should be subordinate to other optimizations.

It seems that currently PostMachineScheduler is a pass that is only intended to replace the PostRAScheduler pass. An alternative then - if you would rather not see this new (ugly?) subtarget hook -
would be that Targets wanting a custom scheduler some place else would also insert its own pass to run it.

In D14928#296362, @jonpa wrote:

I want to add a scheduling pass after all else - that is as the last pass added in addPreEmitPass(). The motivation is that the decoder needs special grouping of instructions which other optimizations in PreEmitPass() would otherwise disrupt. SystemZ currently runs 3 optimization passes in pre-emit with the motivation that they should be subordinate to other optimizations.

I think that you should move SystemZShortenInstPass and SystemZElimComparePass into addPreSched2 instead of having them in addPreEmitPass. That should fix your problem. Doing that might require that you teach your implementations of AnalyzeBranch, ReverseBranchCondition, RemoveBranch, and InsertBranch about BRCT(G). That seems like the right solution.

It seems that currently PostMachineScheduler is a pass that is only intended to replace the PostRAScheduler pass. An alternative then - if you would rather not see this new (ugly?) subtarget hook -
would be that Targets wanting a custom scheduler some place else would also insert its own pass to run it.

No, we'd just make a better API for it.

True that those passes could be moved, but I am thinking that there is also the SystemZLongBranch pass. It can split compare&branch instructions, which would then give room for scheduler to put the compare into an empty slot somewhere if possible. (This pass can not be moved before block placement, of course.)

This may or may not be important (perhaps in some loop?), but it couldn't hurt. Considering the comment I quoted before from Passes.cpp, it seems that also others may like to try this.

In D14928#296525, @jonpa wrote:

True that those passes could be moved, but I am thinking that there is also the SystemZLongBranch pass. It can split compare&branch instructions, which would then give room for scheduler to put the compare into an empty slot somewhere if possible. (This pass can not be moved before block placement, of course.)

This may or may not be important (perhaps in some loop?), but it couldn't hurt. Considering the comment I quoted before from Passes.cpp, it seems that also others may like to try this.

Okay, I'm convinced. The argument is simple (and should be documented as such), branch relaxation (on some targets) can expose additional post-RA scheduling opportunities. Branch relaxation must happen after block placement. Thus, we want the option of running post-RA scheduling after branch relaxation instead of before it.

I recommend that we name the callback:

virtual bool targetSchedulesPostRAScheduling() const { return false; }

and update the comment to explain the situation (as I did here).

updates as requested, however some new concerns:

I guess it would make sense to also allow the PostRAScheduler pass to be run at a later point. I tried to do this, but it became very weird to try to override the options.

It is very confusing that these passes are always inserted into the pass list, and then controlled by an option to 'not run'. To then in turn override that option - like this patch is doing - only gives the expected behaviour (the scheduler running once at the point specified by the target), as long as the *other* scheding pass is inserted at the standard point. Otherwise, it would run twice! This doesn't seem acceptable.

It would also be nice to use the same command line options regardless of where the pass is run, which is not possible as long as the options are needed to 'not run' the pass.

Perhaps one way would be to let TargetMachine specify to TargetPassConfig to disable the standard insertion of post-ra scheduler pass? That would work for SystemZ, at least. This would work as long as all subtargets follow the same behaviour. What do you think?

In D14928#297146, @jonpa wrote:

updates as requested, however some new concerns:

I guess it would make sense to also allow the PostRAScheduler pass to be run at a later point. I tried to do this, but it became very weird to try to override the options.

It is very confusing that these passes are always inserted into the pass list, and then controlled by an option to 'not run'. To then in turn override that option

IIRC, this is done because whether or not to run post-RA scheduling is subtarget-dependent on many architectures, and so the pass needs to be always scheduled and then turn itself on or off depending on the TM object associated with the currently-compiled function.

like this patch is doing - only gives the expected behaviour (the scheduler running once at the point specified by the target), as long as the *other* scheding pass is inserted at the standard point. Otherwise, it would run twice! This doesn't seem acceptable.

It would also be nice to use the same command line options regardless of where the pass is run, which is not possible as long as the options are needed to 'not run' the pass.

To which options are you referring?

Perhaps one way would be to let TargetMachine specify to TargetPassConfig to disable the standard insertion of post-ra scheduler pass? That would work for SystemZ, at least. This would work as long as all subtargets follow the same behaviour. What do you think?

Yes, this makes sense.

Changed the new method to belong to TargetMachine.
Made MISchedPostRA option global (removed static keyword), and using it in SystemZTargetMachine.cpp.
comment update

(I was referring to options MISchedPostRA, and perhaps EnablePostRAScheduler. They would have been "locked" into always being used for not running the passes at the standard point in pass sequence. With this updated patch, any Target can use them instead of adding new ones).

Hal, what do you think about this patch now? /Jonas

LGTM.

include/llvm/Target/TargetMachine.h
260 ↗	(On Diff #41334)	Add space after first ;
lib/Target/SystemZ/SystemZTargetMachine.cpp
19 ↗	(On Diff #41334)	We generally frown on this, but as it is intended to be a temporary option, I suppose it is okay.
lib/Target/SystemZ/SystemZTargetMachine.h
47 ↗	(On Diff #41334)	Add a space after first ;

This revision is now accepted and ready to land.Dec 9 2015, 6:29 AM

Committed r255234

Revision Contents

Path

Size

include/

llvm/

Target/

TargetSubtargetInfo.h

5 lines

lib/

CodeGen/

MachineScheduler.cpp

3 lines

PostRASchedulerList.cpp

5 lines

Diff 40931

include/llvm/Target/TargetSubtargetInfo.h

Context not available.
	/// which is the preferred way to influence this.	/// which is the preferred way to influence this.
	virtual bool enablePostRAScheduler() const;	virtual bool enablePostRAScheduler() const;

		/// True if subtarget is using the PostMachineScheduler at some
		/// arbitrary point in the pass sequence while disabling the
		/// PostRAScheduler.
		virtual bool customPostRAScheduling() const { return false;};

	/// \brief True if the subtarget should run the atomic expansion pass.	/// \brief True if the subtarget should run the atomic expansion pass.
	virtual bool enableAtomicExpand() const;	virtual bool enableAtomicExpand() const;

Context not available.

lib/CodeGen/MachineScheduler.cpp

Context not available.
	if (skipOptnoneFunction(*mf.getFunction()))	if (skipOptnoneFunction(*mf.getFunction()))
	return false;	return false;

	if (!mf.getSubtarget().enablePostRAScheduler()) {	if (!mf.getSubtarget().enablePostRAScheduler() &&
		!mf.getSubtarget().customPostRAScheduling()) {
	DEBUG(dbgs() << "Subtarget disables post-MI-sched.\n");	DEBUG(dbgs() << "Subtarget disables post-MI-sched.\n");
	return false;	return false;
	}	}
Context not available.

lib/CodeGen/PostRASchedulerList.cpp

Context not available.
	if (skipOptnoneFunction(*Fn.getFunction()))	if (skipOptnoneFunction(*Fn.getFunction()))
	return false;	return false;

		// Subtarget is doing scheduling at some other point in the pass
		// sequence instead of now.
		if (Fn.getSubtarget().customPostRAScheduling())
		return false;

	TII = Fn.getSubtarget().getInstrInfo();	TII = Fn.getSubtarget().getInstrInfo();
	MachineLoopInfo &MLI = getAnalysis<MachineLoopInfo>();	MachineLoopInfo &MLI = getAnalysis<MachineLoopInfo>();
	AliasAnalysis *AA = &getAnalysis<AAResultsWrapperPass>().getAAResults();	AliasAnalysis *AA = &getAnalysis<AAResultsWrapperPass>().getAAResults();
Context not available.