This is an archive of the discontinued LLVM Phabricator instance.

Move Post RA Scheduling flag bit into SchedMachineModel
AbandonedPublic

Authored by spatel on Jun 17 2014, 3:50 PM.

Details

Reviewers
nadav
Summary

Currently, there's an "X86ProcFamily" enum that is used to enable post-register-allocator code scheduling for Atom CPU variants. It would be better to explicitly enable/disable this scheduler feature using a flag in the subtarget's SchedMachineModel rather than a bit in the X86Subtarget class.

This minimal patch just moves the PostRAScheduler flag bit and fixes up the accesses to that bit. Subsequent patches will clean up the X86ProcFamilyEnum. Other targets (ARM also has a PostRAScheduler bit in ARMSubtarget.h) may want to make a similar change to keep uniformity, or maybe it's better to move this bit into the parent TargetSubtargetInfo?

No change to codegen for any target is intended with this patch.

Diff Detail

Event Timeline

spatel updated this revision to Diff 10517.Jun 17 2014, 3:50 PM
spatel retitled this revision from to Add a SubTargetFeature to enable Post RA Scheduling.
spatel updated this object.
spatel edited the test plan for this revision. (Show Details)
spatel added a reviewer: nadav.
spatel added a subscriber: Unknown Object (MLST).
  • Original Message -----

From: "Sanjay Patel" <spatel@rotateright.com>
To: spatel@rotateright.com, nrotem@apple.com
Cc: llvm-commits@cs.uiuc.edu
Sent: Tuesday, June 17, 2014 5:50:34 PM
Subject: [PATCH] Add a SubTargetFeature to enable Post RA Scheduling

Hi nadav,

Currently, there's an "X86ProcFamily" enum that is used to enable
post-register-allocator code scheduling for Atom CPU variants.

This patch is the first step in removing that enum by creating a new
SubTargetFeature for PostRA Scheduling that can be used by any CPU.
This will clean up the code and make it easier to debug potential
postRA scheduler problems (see PR20020).

No change to codegen for any target is intended with this patch.

Can you please further explain the motivation? Two things:

  1. The "post-RA scheduler" is not a target feature; and I don't see why you'd want to model it as one.
  2. You can already override whether or not the post-RA scheduler is used for debugging purposes by passing the -post-RA-scheduler=0 flag.

    -Hal

Patch author: Simon Pilgrim

http://reviews.llvm.org/D4183

Files:

lib/Target/X86/X86.td
lib/Target/X86/X86Subtarget.cpp
lib/Target/X86/X86Subtarget.h

Index: lib/Target/X86/X86.td

  • lib/Target/X86/X86.td

+++ lib/Target/X86/X86.td
@@ -170,6 +170,9 @@

"LEA instruction with certain
arguments is slow">;

def FeatureSlowIncDec : SubtargetFeature<"slow-incdec",
"SlowIncDec", "true",

"INC and DEC instructions are
slower than ADD and SUB">;

+def FeaturePostRAScheduler : SubtargetFeature<"sched-post-ra",
+ "PostRAScheduler", "true",
+ "Enable post register allocation
scheduling">;

===----------------------------------------------------------------------===
// X86 processors supported.
@@ -221,7 +224,7 @@

FeatureSlowDivide,
FeatureCallRegIndirect,
FeatureLEAUsesAG,
  • FeaturePadShortFunctions]>;

+ FeaturePadShortFunctions,
FeaturePostRAScheduler]>;

// Atom Silvermont.
def : ProcessorModel<"slm", SLMModel, [ProcIntelSLM,
@@ -231,7 +234,8 @@

FeatureCallRegIndirect,
FeaturePRFCHW,
FeatureSlowLEA, FeatureSlowIncDec,
  • FeatureSlowBTMem, FeatureFastUAMem]>;

+ FeatureSlowBTMem, FeatureFastUAMem,
+ FeaturePostRAScheduler]>;
// "Arrandale" along with corei3 and corei5
def : ProcessorModel<"corei7", SandyBridgeModel,

[FeatureSSE42, FeatureCMPXCHG16B,
FeatureSlowBTMem,

Index: lib/Target/X86/X86Subtarget.cpp

  • lib/Target/X86/X86Subtarget.cpp

+++ lib/Target/X86/X86Subtarget.cpp
@@ -219,9 +219,6 @@

// Make sure the right MCSchedModel is used.
InitCPUSchedModel(CPUName);
  • if (X86ProcFamily == IntelAtom || X86ProcFamily == IntelSLM)
  • PostRAScheduler = true; - InstrItins = getInstrItineraryForCPU(CPUName);

    // It's important to keep the MCSubtargetInfo feature bits in sync with

Index: lib/Target/X86/X86Subtarget.h

  • lib/Target/X86/X86Subtarget.h

+++ lib/Target/X86/X86Subtarget.h
@@ -453,7 +453,7 @@

/// Enable the MachineScheduler pass for all X86 subtargets.
bool enableMachineScheduler() const override { return true; }
  • /// enablePostRAScheduler - run for Atom optimization.

+ /// enablePostRAScheduler - set by FeaturePostRAScheduler
subtargetfeature.

bool enablePostRAScheduler(CodeGenOpt::Level OptLevel,
                           TargetSubtargetInfo::AntiDepBreakMode&
                           Mode,
                           RegClassVector& CriticalPathRCs) const
                           override;

llvm-commits mailing list
llvm-commits@cs.uiuc.edu
http://lists.cs.uiuc.edu/mailman/listinfo/llvm-commits

Hi Hal,

Thanks for looking at this. I agree that PostRA isn't a real subtarget feature, but using a processor model list (and a weirdly defined one at that - Atom, SLM, Other?) as a proxy for a pass seemed even worse. The existing code that asks "isAtom()" or "isSLM()" to decide about unrolling or changing LEAs feels wrong to me; we should be specifying the feature that we're interested in enabling directly rather than by asking what the CPU is. Is there a better way?

Our motivation here is that we'd like to eventually add Post RA scheduling by default when selecting the AMD btver2 target and possibly others. The existing enum and the CPU-based if-check will just grow over time if we leave it as-is. We'd also like to be able to optionally turn postRA scheduling off in llc and up in clang too.

In D4183#6, @spatel wrote:

Hi Hal,

Thanks for looking at this. I agree that PostRA isn't a real subtarget feature, but using a processor model list (and a weirdly defined one at that - Atom, SLM, Other?) as a proxy for a pass seemed even worse. The existing code that asks "isAtom()" or "isSLM()" to decide about unrolling or changing LEAs feels wrong to me; we should be specifying the feature that we're interested in enabling directly rather than by asking what the CPU is. Is there a better way?

I think that (late) unrolling is actually a good model for how this should be handled. There are two pieces:

  1. The PassManagerBuilder has a DisableUnrollLoops flag for use by clang/opt (the corresponding technique here is to add a flag to TargetOptions for use by clang/llc)
  2. The target unrolling size is a property of the target, and is specified in the scheduling model. Likewise, whether the scheduling model is setup for post-RA scheduling (and whether this is likely to be useful) should be kept with the scheduling mode. For late unrolling, a LoopMicroOpBufferSize flag was added to the scheduling model. A corresponding flag could be added to control the default here. Also, Andy recently added a enablePostMachineScheduler() hook in TargetSubtargetInfo (and it could use this new model flag to provide the default instead of having a default of 'false').

Thanks again,
Hal

Our motivation here is that we'd like to eventually add Post RA scheduling by default when selecting the AMD btver2 target and possibly others. The existing enum and the CPU-based if-check will just grow over time if we leave it as-is. We'd also like to be able to optionally turn postRA scheduling off in llc and up in clang too.

spatel retitled this revision from Add a SubTargetFeature to enable Post RA Scheduling to Move Post RA Scheduling flag bit into SchedMachineModel.Jun 19 2014, 9:30 AM
spatel updated this object.
spatel abandoned this revision.Jun 19 2014, 9:42 AM

Oops - (phabricator noob) I updated the patch based on my understanding of Hal's feedback, but I made it a new patch:
http://reviews.llvm.org/D4217

Abandon this diff?