This is an archive of the discontinued LLVM Phabricator instance.

Differential D58736

[System Model] Introduce a target system model
Needs Review · Public

Authored by greened on Feb 27 2019, 1:37 PM.

Details

Reviewers

Meinersbur
hfinkel
simoll
rengolin
andreadb

Summary

Add the concept of a per-subtarget system model, incorporating information
about caches, execution resources, write-combining buffers and software
prefetching configuration.

This is TableGen-driven so that targets may conveniently define new system
models and associate system models with subtargets.

By default, processor classes use a system model that captures the legacy
values exposed by TargetTransformInfo and/or MCSubtarget and friends.
Targets may opt-in to custom system models by defining them and associating
them in instantiations of the Processor template, similarly to how schedulers
are associated.

This patch is for overall higher-level design discussion, to provide a view of where this is going. Smaller patches to review for actual merging will follow.
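To make the summary concrete, here is a minimal C++ sketch of the idea: a per-CPU lookup that falls back to a default model carrying the legacy values. Every type and function name below (`SystemModelSketch`, `makeDefaultModel`, `modelForCPU`, and so on) is hypothetical and is not one of the classes added by the patch; the default prefetch distance of 0 and minimum stride of 1 are taken from the `TargetTransformInfoImpl.h` defaults shown in the diff.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the concepts named in the summary.
struct CacheLevelInfo {
  std::string Name;       // e.g. "L1D"
  unsigned SizeBytes;
  unsigned LineSizeBytes;
  unsigned Associativity;
};

struct SoftwarePrefetchInfo {
  bool Enabled;
  unsigned DistanceInInstrs;
  unsigned MinStrideBytes;
};

struct SystemModelSketch {
  std::string Name;
  std::vector<CacheLevelInfo> Caches; // ordered from the core outward
  SoftwarePrefetchInfo Prefetch;
  unsigned NumWriteCombiningBuffers;
};

// The "default" model: no cache data, prefetching off, legacy TTI defaults.
inline SystemModelSketch makeDefaultModel() {
  return {"default", {}, {false, 0, 1}, 0};
}

// Association of a CPU name with a model, similar in spirit to how
// scheduling models are attached to processors.
struct ProcessorEntry {
  std::string CPU;
  SystemModelSketch Model;
};

// Returns the model registered for CPU, or Default if the CPU did not opt in.
inline const SystemModelSketch &
modelForCPU(const std::vector<ProcessorEntry> &Table, const std::string &CPU,
            const SystemModelSketch &Default) {
  assert(!CPU.empty() && "CPU name must not be empty");
  for (const ProcessorEntry &E : Table)
    if (E.CPU == CPU)
      return E.Model;
  return Default;
}

// Size of the cache at the given index, if the model describes that level.
inline std::optional<unsigned> cacheSize(const SystemModelSketch &M,
                                         unsigned Level) {
  if (Level >= M.Caches.size())
    return std::nullopt;
  return M.Caches[Level].SizeBytes;
}
```

In this sketch a subtarget that never registers a model keeps seeing the default values, which is the opt-in behaviour the summary describes.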

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

greened created this revision. Feb 27 2019, 1:37 PM

Herald added a project: Restricted Project. Feb 27 2019, 1:37 PM

Herald added subscribers: jdoerfert, jsji, mgrang and 8 others.

A larger design question I have about this is the proper place to put software prefetching configuration. Right now it lives at the memory model level, in that a memory model specifies a cache hierarchy along with a software prefetch configuration. I wonder if we should allow a software prefetching configuration for each cache level, as targets might want different policies depending on which cache level they are prefetching into. I don't think we have any examples of that in the codebase today but I can imagine cases where targets might want it.
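One way to picture the per-level alternative is an optional override on each cache level that falls back to the memory-model-wide configuration. The sketch below is only an illustration of that design question; `PrefetchConfig`, `CacheLevel`, `MemoryModel` and `prefetchConfigFor` are hypothetical names, not part of the patch.

```cpp
#include <cassert>
#include <optional>
#include <vector>

// Hypothetical software prefetch parameters.
struct PrefetchConfig {
  unsigned DistanceInInstrs;
  unsigned MinStrideBytes;
};

struct CacheLevel {
  unsigned SizeBytes;
  std::optional<PrefetchConfig> Prefetch; // per-level override, if any
};

struct MemoryModel {
  std::vector<CacheLevel> Levels;
  PrefetchConfig DefaultPrefetch; // what the patch models today: one config
};

// Returns the per-level configuration when a level defines one, otherwise
// the memory-model-wide configuration.
inline const PrefetchConfig &prefetchConfigFor(const MemoryModel &M,
                                               unsigned Level) {
  assert(Level < M.Levels.size() && "no such cache level");
  const std::optional<PrefetchConfig> &Override = M.Levels[Level].Prefetch;
  return Override ? *Override : M.DefaultPrefetch;
}
```

With this shape, targets that do not care about per-level policies would leave every override empty and behave as they do with a single memory-model-level configuration.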

steleman added a subscriber: steleman. Feb 28 2019, 11:46 AM

ajasty-cavium added a subscriber: ajasty-cavium. Feb 28 2019, 11:47 AM

joelkevinjones added a subscriber: joelkevinjones. Feb 28 2019, 11:50 AM

Ping?

Thank you for pushing this forward and sorry for the delay.

Could you add some central high-level documentation about what the memory system model is? E.g. describe that an MCSystemModel has a list of execution resources, memory hierarchies, prefetch configs and write-combining buffers. A cache hierarchy has a total size, line size, associativity, etc. To get the interpretation right, please add more details about every parameter, particularly the prefetch configs. Some examples other than ARM big.LITTLE would be nice as well.

What exactly is a "prefetch config"? Is there a prefetch config for each cache level? Different hardware mechanisms for prefetch (e.g. stride detection or software-established)? Different strategies for inserting prefetch instructions, selectable at compile-time?

AFAICS, this patch uses the default model for all targets?

llvm/include/llvm/Analysis/TargetTransformInfo.h
735–742	[bikeshedding] The name should indicate that these methods just return some information. How about `isPrefetchingReads`/etc.?
1136	It is interesting that getCacheSize/getCacheAssociativity return `llvm::Optional`s and are cache-level specific, but not getCacheLineSize
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
403–407	Could you add some documentation about what these special values mean?
llvm/include/llvm/MC/MCSubtargetInfo.h
58–65	I don't understand the idea of the `resolve...` API. What is a target overriding it supposed to do? There are no overrides in this patch. Why not provide a default implementation of `getCacheSize` that targets can override?
llvm/include/llvm/MC/MCSystemModel.h
406–407	[style] These trailing empty comment lines don't seem to be useful. The current LLVM code base does not have them.
443	Could we avoid the static initializer?
464	The type has 'Set' in its name, but is an alias for a vector?
468	Why does it need to be `mutable`?
llvm/include/llvm/Target/TargetSoftwarePrefetchConfig.td
44–47	ISA instructions or µOps? Why not cycles?
48–52	Isn't this algorithm-dependent, i.e. the size of the loop?
llvm/lib/MC/MCSubtargetInfo.cpp
144	I find this handling of `--help` strange, but the current `MCSubtargetInfo::getSchedModelForCPU` does the same thing.
llvm/lib/MC/MCSystemModel.cpp
27–48	where are these used?
66	[typo] `lttle`
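The asymmetry noted for line 1136 can be illustrated with a level-aware line-size query sitting next to the existing level-less one. This is a sketch under assumptions, not the interface in the patch: `CacheInfoSketch` and the level-taking `getCacheLineSize` overload are hypothetical, and the fallback value of 0 mirrors the default in `TargetTransformInfoImpl.h`.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

enum class CacheLevel { L1D = 0, L2D = 1 };

struct CacheDesc {
  unsigned SizeBytes;
  unsigned LineSizeBytes;
  unsigned Associativity;
};

// Hypothetical container for per-level cache data.
class CacheInfoSketch {
public:
  explicit CacheInfoSketch(std::vector<CacheDesc> L) : Levels(std::move(L)) {
    assert(Levels.size() <= 2 && "only L1D and L2D are modeled here");
  }

  // Level-specific queries return an empty optional for unknown levels.
  std::optional<unsigned> getCacheSize(CacheLevel Level) const {
    if (const CacheDesc *D = find(Level))
      return D->SizeBytes;
    return std::nullopt;
  }

  std::optional<unsigned> getCacheLineSize(CacheLevel Level) const {
    if (const CacheDesc *D = find(Level))
      return D->LineSizeBytes;
    return std::nullopt;
  }

  // Level-less query kept for existing callers: report the L1D line size,
  // or 0 when nothing is known.
  unsigned getCacheLineSize() const {
    return getCacheLineSize(CacheLevel::L1D).value_or(0);
  }

private:
  const CacheDesc *find(CacheLevel Level) const {
    auto Idx = static_cast<std::size_t>(Level);
    return Idx < Levels.size() ? &Levels[Idx] : nullptr;
  }

  std::vector<CacheDesc> Levels;
};
```

Keeping the level-less overload means existing callers would not have to change, while new callers could ask about a specific level and handle a missing answer.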

This takes a while to digest. Some quick remarks for now (also inline):

Is there a way to query the number of (automatic) HW prefetchers?
Does the interface provide the latency of each cache level (hit)/memory (miss)?

llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
407	What is this method supposed to return? The right value seems to be a property of a specific pair of a loop and a target architecture rather than just the target alone.

Meinersbur added inline comments. Mar 19 2019, 9:58 AM

llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
407	It is used by LoopDataPrefetch. `TargetTransformInfo.h` contains a description:
	/// \return The maximum number of iterations to prefetch ahead. If the
	/// required number of iterations is more than this number, no prefetching is
	/// performed.
	unsigned getMaxPrefetchIterationsAhead() const;
	I assume it was added just because the current code base already defines it.
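To show how a per-target maximum interacts with a per-loop quantity, here is a small sketch of the kind of check the quoted description implies. It assumes the pass derives the iteration count by dividing the prefetch distance (in instructions) by the loop size; `iterationsAhead` is a hypothetical helper written for this illustration, not code taken from `LoopDataPrefetch`.

```cpp
#include <cassert>
#include <climits>
#include <optional>

// Returns how many iterations ahead to prefetch, or an empty optional when
// the required count exceeds the target's maximum and no prefetching should
// be performed. UINT_MAX matches the default shown in the diff.
inline std::optional<unsigned>
iterationsAhead(unsigned PrefetchDistanceInInstrs, unsigned LoopSizeInInstrs,
                unsigned MaxItersAhead = UINT_MAX) {
  assert(LoopSizeInInstrs > 0 && "loop must contain at least one instruction");
  unsigned Iters = PrefetchDistanceInInstrs / LoopSizeInInstrs;
  if (Iters == 0)
    Iters = 1; // always look at least one iteration ahead
  if (Iters > MaxItersAhead)
    return std::nullopt; // too far ahead for this target: give up
  return Iters;
}
```

Under these assumptions the loop-specific part is the division by the loop size, while the target only supplies the distance and the upper bound.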

In D58736#1434008, @Meinersbur wrote:

Thank you for pushing this forward and sorry for the delay.

Could you add some central high-level documentation about what the memory system model is? E.g. describe that an MCSystemModel has a list of execution resources, memory hierarchies, prefetch configs and write-combining buffers. A cache hierarchy has a total size, line size, associativity, etc. To get the interpretation right, please add more details about every parameter, particularly the prefetch configs. Some examples other than ARM big.LITTLE would be nice as well.

Will do.

What exactly is a "prefetch config"? Is there a prefetch config for each cache level? Different hardware mechanisms for prefetch (e.g. stride detection or software-established)? Different strategies for inserting prefetch instructions, selectable at compile-time?

This is certainly an area for exploration. Currently the model doesn't have a prefetch config for each cache level but we could make it so. I'm not sure if the flexibility will be needed or not. We haven't needed it in the past but processors are getting a lot more complicated in this area.

The intent is to describe parameters for software prefetching. It doesn't attempt to describe hardware prefetchers but that may be useful depending on the problem at hand. I think that's something we could consider for later.

AFAICS, this patch uses the default model for all targets?

Yes, currently. My intent was not to disrupt how anything currently works as far as what TTI returns for its interfaces. Individual subtargets can then opt-in by defining a non-default model.

In D58736#1434370, @simoll wrote:

This takes a while to digest. Some quick remarks for now (also inline):

Yes, I know it's a big patch. I wanted to provide the whole context but can certainly break it up into smaller pieces for review if that's easier. Would it be useful to post smaller pieces for review/commit but maintain this patch for reference? I don't intend to actually commit all this as one big change.

Is there a way to query the number of (automatic) HW prefetchers?

Not currently. It's something we could add later.

Does the interface provide the latency of each cache level (hit)/memory (miss)?

The latency of each level is the latency for a hit. I hadn't considered a separate miss latency as I thought the last "level" would be DRAM and the latency for that would approximate the latency of a full miss. Of course the various cache levels will have a longer miss latency than a direct DRAM access but I hadn't considered that overhead to be large enough to model. If we think it is then we can do that.
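The two views of miss latency in this reply can be compared with a small sketch. The first function is the approximation described above (a full miss costs what the last level, DRAM, costs). The second is one hypothetical and probably pessimistic way to account for lookup overhead, by charging every level's hit latency on the way down. The type and function names, and any latency numbers used with them, are illustrative assumptions.

```cpp
#include <cassert>
#include <numeric>
#include <vector>

// One entry per level, ordered from the core outward; the last entry is DRAM.
struct MemoryLevel {
  const char *Name;
  unsigned HitLatencyCycles;
};

// Approximation from the discussion: a full miss costs the DRAM latency.
inline unsigned fullMissLatency(const std::vector<MemoryLevel> &Levels) {
  assert(!Levels.empty() && "model must describe at least one level");
  return Levels.back().HitLatencyCycles;
}

// Hypothetical alternative: sum the hit latency of every level traversed,
// as if each lookup were fully serialized.
inline unsigned cumulativeMissLatency(const std::vector<MemoryLevel> &Levels) {
  return std::accumulate(Levels.begin(), Levels.end(), 0u,
                         [](unsigned Sum, const MemoryLevel &L) {
                           return Sum + L.HitLatencyCycles;
                         });
}
```

If the gap between the two results is small relative to the DRAM latency for realistic models, that would support leaving miss latency out of the model as proposed.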

greened marked 10 inline comments as done. Apr 3 2019, 2:49 PM

greened added inline comments.

llvm/include/llvm/Analysis/TargetTransformInfo.h
735–742	Good point. I think `shouldPrefetchReads` would better convey what this is supposed to be telling us.
1136	Yeah. I assume that's because on almost every processor, the cache line size is the same throughout the cache hierarchy so the interface was designed assuming that no cache level parameter was necessary. If there's no cache level parameter, there's no possibility of the user providing a non-existent cache level and thus no need for an `llvm::Optional`. However, there have been processors in the past where cache line size varied by level and certainly when you have a heterogeneous system the cache line size will not be uniform across the system.
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
407	Yes, that's exactly right. When I originally posted the RFC I suggested a single prefetch distance in bytes. Others chimed in and said they preferred to think in terms of loop iterations and that is indeed what `LoopDataPrefetch` does. `LoopDataPrefetch` also drove the inclusion of `getMinPrefetchStride`. Since I didn't want to affect how current passes work, I added the necessary infrastructure to have the system model specify it. I expect that as we gain experience we may eliminate some of these interfaces.
llvm/include/llvm/MC/MCSubtargetInfo.h
58–65	This is something I don't have in our local implementation, but after posting the RFC several people said they'd like to model GPUs and big.LITTLE-style systems. In such hybrid systems, there is no single "L2 cache," for example. If a client asks the system model, "Give me the size of the L2 cache," what should it do? Return the L2 of the CPU, the L2 of the GPU or something else? That's the idea of the `resolve...` stuff. Somebody has to decide what to do which is what these virtual APIs are suggesting. `SubtargetInfo` might not be the right place for this, but lacking a better place for it, I just put it here. I could remove all this for the initial changeset and just return some default with an override as you suggest. I'm definitely open to ideas/opinions here.
llvm/include/llvm/MC/MCSystemModel.h
443	Maybe? I took cues for how the scheduler stuff is implemented and it uses similar static initializers. I'll look into this and see if there is a better way.
468	It doesn't. This is leftover from a time when I thought `initCacheInfoCache` and friends could be called lazily. I'll rework this.
llvm/include/llvm/Target/TargetSoftwarePrefetchConfig.td
44–47	`LoopDataPrefetch` thinks in terms of IR `Instructions`. I'll clarify the comment and name. Maybe we should reconsider how `LoopDataPrefetch` thinks about things but I'd prefer to leave that for later work. I want to be confident we can model the way things work today before we go changing a bunch of things.
48–52	Yep. Again, this is driven by `LoopDataPrefetch`.
llvm/lib/MC/MCSubtargetInfo.cpp
144	Right. That's what I used as guidance.
llvm/lib/MC/MCSystemModel.cpp
27–48	They aren't anymore. Will remove.
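The `resolve...` idea discussed for `MCSubtargetInfo.h` lines 58–65 can be sketched as a virtual hook that picks one cache when several exist at the requested level. Everything here is hypothetical (`SubtargetSketch`, `resolveCache`, `SmallestCacheSubtarget`, the "cpu"/"gpu" owners); it only illustrates the shape of such an API, not the one in the patch.

```cpp
#include <algorithm>
#include <cassert>
#include <optional>
#include <string>
#include <utility>
#include <vector>

struct CacheDesc {
  std::string Owner; // e.g. "cpu" or "gpu" in a hybrid system
  unsigned Level;    // 1 for L1, 2 for L2, ...
  unsigned SizeBytes;
};

class SubtargetSketch {
public:
  explicit SubtargetSketch(std::vector<CacheDesc> C) : Caches(std::move(C)) {}
  virtual ~SubtargetSketch() = default;

  // Client-facing query: "give me the size of the cache at this level".
  std::optional<unsigned> getCacheSize(unsigned Level) const {
    std::vector<const CacheDesc *> Candidates;
    for (const CacheDesc &C : Caches)
      if (C.Level == Level)
        Candidates.push_back(&C);
    if (Candidates.empty())
      return std::nullopt;
    return resolveCache(Candidates)->SizeBytes;
  }

protected:
  // Default policy: take the first cache listed for the level.
  virtual const CacheDesc *
  resolveCache(const std::vector<const CacheDesc *> &Candidates) const {
    assert(!Candidates.empty() && "caller filters out empty candidate sets");
    return Candidates.front();
  }

private:
  std::vector<CacheDesc> Caches;
};

// A target that prefers the conservative answer: the smallest candidate.
class SmallestCacheSubtarget : public SubtargetSketch {
public:
  using SubtargetSketch::SubtargetSketch;

protected:
  const CacheDesc *
  resolveCache(const std::vector<const CacheDesc *> &Candidates) const override {
    return *std::min_element(Candidates.begin(), Candidates.end(),
                             [](const CacheDesc *A, const CacheDesc *B) {
                               return A->SizeBytes < B->SizeBytes;
                             });
  }
};
```

The simpler alternative raised in review, a plain overridable `getCacheSize` with a default implementation, would fold the candidate selection into the override itself and drop the separate hook.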

Updated to address comments.

Next week I plan to start submitting smaller changes to build up to what's here. So comments on the overall design of this patch would be great, but it's huge and not the best vehicle for discussing details.

I want to start getting the foundational pieces in, likely starting with abstracting the prefetching stuff via TTI which will hopefully be non-controversial.

greened marked 14 inline comments as done. Jun 14 2019, 1:35 PM

greened edited the summary of this revision.

arsenm added inline comments. Jun 14 2019, 1:40 PM

llvm/include/llvm/CodeGen/BasicTTIImpl.h
509	All of these should probably have address space arguments

greened marked an inline comment as done. Jun 20 2019, 10:04 AM

greened added inline comments.

llvm/include/llvm/CodeGen/BasicTTIImpl.h
509	What would the address space argument specify? What are the use-cases for modeling caches with address spaces? Perhaps this could replace the `resolve...` APIs. That would be nice. These are existing interfaces so what would be the best way to transition them?

I just posted D63614, the subset of this patch covering only the changes to TTI and related classes.

llvm/include/llvm/CodeGen/BasicTTIImpl.h
509	It's probably more productive to continue this discussion on D63614, which is the TTI part of this patch.

greened mentioned this in D63614: [System Model] [TTI] Update cache and prefetch TTI interfaces. Jun 20 2019, 10:20 AM

Meinersbur mentioned this in D70228: [LoopDataPrefetch + SystemZ] Let target decide on prefetching on a per loop basis. Mar 30 2020, 12:15 AM

Matt added a subscriber: Matt. Apr 20 2021, 8:27 AM

Herald added a subscriber: kerbowa. Apr 20 2021, 8:27 AM

tschuett mentioned this in D123050: [BOLT] Cache-Aware Tail Duplication. Apr 15 2022, 12:57 PM

Revision Contents

Path	Size
llvm/include/llvm/Analysis/TargetTransformInfo.h	41 lines
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h	18 lines
llvm/include/llvm/CodeGen/BasicTTIImpl.h	40 lines
llvm/include/llvm/CodeGen/TargetSubtargetInfo.h	4 lines
llvm/include/llvm/MC/MCSubtargetInfo.h	169 lines
llvm/include/llvm/MC/MCSystemModel.h	566 lines
llvm/include/llvm/Target/Target.td	9 lines
llvm/include/llvm/Target/TargetCacheModel.td	58 lines
llvm/include/llvm/Target/TargetMemoryModel.td	42 lines
llvm/include/llvm/Target/TargetSoftwarePrefetchConfig.td	99 lines
llvm/include/llvm/Target/TargetSystemModel.td	73 lines
llvm/include/llvm/Target/TargetWCBufferModel.td	22 lines
llvm/lib/Analysis/TargetTransformInfo.cpp	12 lines
llvm/lib/CodeGen/TargetSubtargetInfo.cpp	6 lines
llvm/lib/MC/CMakeLists.txt	1 line
llvm/lib/MC/MCSubtargetInfo.cpp	219 lines
llvm/lib/MC/MCSystemModel.cpp	153 lines
llvm/lib/Target/AArch64/AArch64Subtarget.h	8 lines
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h	8 lines
llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp	16 lines
llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCTargetDesc.cpp	22 lines
llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.h	4 lines
llvm/lib/Target/PowerPC/PPCTargetTransformInfo.h	4 lines
llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp	4 lines
llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.h	6 lines
llvm/lib/Transforms/Scalar/LoopDataPrefetch.cpp	22 lines
llvm/test/TableGen/SystemModelEmitter.td	261 lines
llvm/unittests/CodeGen/MachineInstrTest.cpp	3 lines
llvm/unittests/MC/CMakeLists.txt	1 line
llvm/unittests/MC/SystemModel.cpp	1407 lines
llvm/utils/TableGen/SubtargetEmitter.cpp	400 lines

Diff 188611

llvm/include/llvm/Analysis/TargetTransformInfo.h

[726 lines not shown]
public:
};		};

/// \return The size of the cache level in bytes, if available.		/// \return The size of the cache level in bytes, if available.
llvm::Optional<unsigned> getCacheSize(CacheLevel Level) const;		llvm::Optional<unsigned> getCacheSize(CacheLevel Level) const;

/// \return The associativity of the cache level, if available.		/// \return The associativity of the cache level, if available.
llvm::Optional<unsigned> getCacheAssociativity(CacheLevel Level) const;		llvm::Optional<unsigned> getCacheAssociativity(CacheLevel Level) const;

		/// \return Whether to prefetch loads.
		bool prefetchReads() const;

		/// \return Whether to prefetch stores.
		bool prefetchWrites() const;

		/// \return Whether to use read prefetches for stores.
		bool useReadPrefetchForWrites() const;
		Meinersbur: [bikeshedding] The name should indicate that these methods just return some information. How about `isPrefetchingReads`/etc.?
		greened: Good point. I think `shouldPrefetchReads` would better convey what this is supposed to be telling us.

/// \return How much before a load we should place the prefetch instruction.		/// \return How much before a load we should place the prefetch instruction.
/// This is currently measured in number of instructions.		/// This is currently measured in number of instructions.
unsigned getPrefetchDistance() const;		unsigned getPrefetchDistance() const;

/// \return Some HW prefetchers can handle accesses up to a certain constant		/// \return Some HW prefetchers can handle accesses up to a certain constant
/// stride. This is the minimum stride in bytes where it makes sense to start		/// stride. This is the minimum stride in bytes where it makes sense to start
/// adding SW prefetches. The default is 1, i.e. prefetch with any stride.		/// adding SW prefetches. The default is 1, i.e. prefetch with any stride.
unsigned getMinPrefetchStride() const;		unsigned getMinPrefetchStride() const;
[376 lines not shown]
virtual int getIntImmCost(Intrinsic::ID IID, unsigned Idx, const APInt &Imm,
Type *Ty) = 0;		Type *Ty) = 0;
virtual unsigned getNumberOfRegisters(bool Vector) = 0;		virtual unsigned getNumberOfRegisters(bool Vector) = 0;
virtual unsigned getRegisterBitWidth(bool Vector) const = 0;		virtual unsigned getRegisterBitWidth(bool Vector) const = 0;
virtual unsigned getMinVectorRegisterBitWidth() = 0;		virtual unsigned getMinVectorRegisterBitWidth() = 0;
virtual bool shouldMaximizeVectorBandwidth(bool OptSize) const = 0;		virtual bool shouldMaximizeVectorBandwidth(bool OptSize) const = 0;
virtual unsigned getMinimumVF(unsigned ElemWidth) const = 0;		virtual unsigned getMinimumVF(unsigned ElemWidth) const = 0;
virtual bool shouldConsiderAddressTypePromotion(		virtual bool shouldConsiderAddressTypePromotion(
const Instruction &I, bool &AllowPromotionWithoutCommonHeader) = 0;		const Instruction &I, bool &AllowPromotionWithoutCommonHeader) = 0;
virtual unsigned getCacheLineSize() = 0;		virtual unsigned getCacheLineSize() const = 0;
		Meinersbur: It is interesting that getCacheSize/getCacheAssociativity return `llvm::Optional`s and are cache-level specific, but not getCacheLineSize
		greened: Yeah. I assume that's because on almost every processor, the cache line size is the same throughout the cache hierarchy so the interface was designed assuming that no cache level parameter was necessary. If there's no cache level parameter, there's no possibility of the user providing a non-existent cache level and thus no need for an `llvm::Optional`. However, there have been processors in the past where cache line size varied by level and certainly when you have a heterogeneous system the cache line size will not be uniform across the system.
virtual llvm::Optional<unsigned> getCacheSize(CacheLevel Level) = 0;		virtual llvm::Optional<unsigned> getCacheSize(CacheLevel Level) const = 0;
virtual llvm::Optional<unsigned> getCacheAssociativity(CacheLevel Level) = 0;		virtual llvm::Optional<unsigned> getCacheAssociativity(CacheLevel Level) const = 0;
virtual unsigned getPrefetchDistance() = 0;		virtual bool prefetchReads() const = 0;
virtual unsigned getMinPrefetchStride() = 0;		virtual bool prefetchWrites() const = 0;
virtual unsigned getMaxPrefetchIterationsAhead() = 0;		virtual bool useReadPrefetchForWrites() const = 0;
		virtual unsigned getPrefetchDistance() const = 0;
		virtual unsigned getMinPrefetchStride() const = 0;
		virtual unsigned getMaxPrefetchIterationsAhead() const = 0;
virtual unsigned getMaxInterleaveFactor(unsigned VF) = 0;		virtual unsigned getMaxInterleaveFactor(unsigned VF) = 0;
virtual unsigned		virtual unsigned
getArithmeticInstrCost(unsigned Opcode, Type *Ty, OperandValueKind Opd1Info,		getArithmeticInstrCost(unsigned Opcode, Type *Ty, OperandValueKind Opd1Info,
OperandValueKind Opd2Info,		OperandValueKind Opd2Info,
OperandValueProperties Opd1PropInfo,		OperandValueProperties Opd1PropInfo,
OperandValueProperties Opd2PropInfo,		OperandValueProperties Opd2PropInfo,
ArrayRef<const Value *> Args) = 0;		ArrayRef<const Value *> Args) = 0;
virtual int getShuffleCost(ShuffleKind Kind, Type *Tp, int Index,		virtual int getShuffleCost(ShuffleKind Kind, Type *Tp, int Index,
[294 lines not shown]
public:
unsigned getMinimumVF(unsigned ElemWidth) const override {		unsigned getMinimumVF(unsigned ElemWidth) const override {
return Impl.getMinimumVF(ElemWidth);		return Impl.getMinimumVF(ElemWidth);
}		}
bool shouldConsiderAddressTypePromotion(		bool shouldConsiderAddressTypePromotion(
const Instruction &I, bool &AllowPromotionWithoutCommonHeader) override {		const Instruction &I, bool &AllowPromotionWithoutCommonHeader) override {
return Impl.shouldConsiderAddressTypePromotion(		return Impl.shouldConsiderAddressTypePromotion(
I, AllowPromotionWithoutCommonHeader);		I, AllowPromotionWithoutCommonHeader);
}		}
unsigned getCacheLineSize() override {		unsigned getCacheLineSize() const override {
return Impl.getCacheLineSize();		return Impl.getCacheLineSize();
}		}
llvm::Optional<unsigned> getCacheSize(CacheLevel Level) override {		llvm::Optional<unsigned> getCacheSize(CacheLevel Level) const override {
return Impl.getCacheSize(Level);		return Impl.getCacheSize(Level);
}		}
llvm::Optional<unsigned> getCacheAssociativity(CacheLevel Level) override {		llvm::Optional<unsigned> getCacheAssociativity(CacheLevel Level) const override {
return Impl.getCacheAssociativity(Level);		return Impl.getCacheAssociativity(Level);
}		}
unsigned getPrefetchDistance() override { return Impl.getPrefetchDistance(); }		bool prefetchReads() const override { return Impl.prefetchReads(); }
unsigned getMinPrefetchStride() override {		bool prefetchWrites() const override { return Impl.prefetchWrites(); }
		bool useReadPrefetchForWrites() const override {
		return Impl.useReadPrefetchForWrites();
		}
		unsigned getPrefetchDistance() const override { return Impl.getPrefetchDistance(); }
		unsigned getMinPrefetchStride() const override {
return Impl.getMinPrefetchStride();		return Impl.getMinPrefetchStride();
}		}
unsigned getMaxPrefetchIterationsAhead() override {		unsigned getMaxPrefetchIterationsAhead() const override {
return Impl.getMaxPrefetchIterationsAhead();		return Impl.getMaxPrefetchIterationsAhead();
}		}
unsigned getMaxInterleaveFactor(unsigned VF) override {		unsigned getMaxInterleaveFactor(unsigned VF) override {
return Impl.getMaxInterleaveFactor(VF);		return Impl.getMaxInterleaveFactor(VF);
}		}
unsigned getEstimatedNumberOfCaseClusters(const SwitchInst &SI,		unsigned getEstimatedNumberOfCaseClusters(const SwitchInst &SI,
unsigned &JTSize) override {		unsigned &JTSize) override {
return Impl.getEstimatedNumberOfCaseClusters(SI, JTSize);		return Impl.getEstimatedNumberOfCaseClusters(SI, JTSize);
[267 lines not shown]

llvm/include/llvm/Analysis/TargetTransformInfoImpl.h

[363 lines not shown]
public:

bool		bool
shouldConsiderAddressTypePromotion(const Instruction &I,		shouldConsiderAddressTypePromotion(const Instruction &I,
bool &AllowPromotionWithoutCommonHeader) {		bool &AllowPromotionWithoutCommonHeader) {
AllowPromotionWithoutCommonHeader = false;		AllowPromotionWithoutCommonHeader = false;
return false;		return false;
}		}

unsigned getCacheLineSize() { return 0; }		unsigned getCacheLineSize() const { return 0; }

llvm::Optional<unsigned> getCacheSize(TargetTransformInfo::CacheLevel Level) {		llvm::Optional<unsigned> getCacheSize(TargetTransformInfo::CacheLevel Level) const {
switch (Level) {		switch (Level) {
case TargetTransformInfo::CacheLevel::L1D:		case TargetTransformInfo::CacheLevel::L1D:
LLVM_FALLTHROUGH;		LLVM_FALLTHROUGH;
case TargetTransformInfo::CacheLevel::L2D:		case TargetTransformInfo::CacheLevel::L2D:
return llvm::Optional<unsigned>();		return llvm::Optional<unsigned>();
}		}

llvm_unreachable("Unknown TargetTransformInfo::CacheLevel");		llvm_unreachable("Unknown TargetTransformInfo::CacheLevel");
}		}

llvm::Optional<unsigned> getCacheAssociativity(		llvm::Optional<unsigned> getCacheAssociativity(
TargetTransformInfo::CacheLevel Level) {		TargetTransformInfo::CacheLevel Level) const {
switch (Level) {		switch (Level) {
case TargetTransformInfo::CacheLevel::L1D:		case TargetTransformInfo::CacheLevel::L1D:
LLVM_FALLTHROUGH;		LLVM_FALLTHROUGH;
case TargetTransformInfo::CacheLevel::L2D:		case TargetTransformInfo::CacheLevel::L2D:
return llvm::Optional<unsigned>();		return llvm::Optional<unsigned>();
}		}

llvm_unreachable("Unknown TargetTransformInfo::CacheLevel");		llvm_unreachable("Unknown TargetTransformInfo::CacheLevel");
}		}

unsigned getPrefetchDistance() { return 0; }		bool prefetchReads() const { return false; };

unsigned getMinPrefetchStride() { return 1; }		bool prefetchWrites() const { return false; }

unsigned getMaxPrefetchIterationsAhead() { return UINT_MAX; }		bool useReadPrefetchForWrites() const { return false; }

		unsigned getPrefetchDistance() const { return 0; }

		unsigned getMinPrefetchStride() const { return 1; }

		unsigned getMaxPrefetchIterationsAhead() const { return UINT_MAX; }
		Meinersbur: Could you add some documentation about what these special values mean?
		simoll: What is this method supposed to return? The right value seems to be a property of a specific pair of a loop and a target architecture rather than just the target alone.
		Meinersbur: It is used by LoopDataPrefetch. `TargetTransformInfo.h` contains a description:
		/// \return The maximum number of iterations to prefetch ahead. If the
		/// required number of iterations is more than this number, no prefetching is
		/// performed.
		unsigned getMaxPrefetchIterationsAhead() const;
		I assume it was added just because the current code base already defines it.
		greened: Yes, that's exactly right. When I originally posted the RFC I suggested a single prefetch distance in bytes. Others chimed in and said they preferred to think in terms of loop iterations and that is indeed what `LoopDataPrefetch` does. `LoopDataPrefetch` also drove the inclusion of `getMinPrefetchStride`. Since I didn't want to affect how current passes work, I added the necessary infrastructure to have the system model specify it. I expect that as we gain experience we may eliminate some of these interfaces.

unsigned getMaxInterleaveFactor(unsigned VF) { return 1; }		unsigned getMaxInterleaveFactor(unsigned VF) { return 1; }

unsigned getArithmeticInstrCost(unsigned Opcode, Type *Ty,		unsigned getArithmeticInstrCost(unsigned Opcode, Type *Ty,
TTI::OperandValueKind Opd1Info,		TTI::OperandValueKind Opd1Info,
TTI::OperandValueKind Opd2Info,		TTI::OperandValueKind Opd2Info,
TTI::OperandValueProperties Opd1PropInfo,		TTI::OperandValueProperties Opd1PropInfo,
TTI::OperandValueProperties Opd2PropInfo,		TTI::OperandValueProperties Opd2PropInfo,
[459 lines not shown]

llvm/include/llvm/CodeGen/BasicTTIImpl.h

[486 lines not shown]
public:

int getInstructionLatency(const Instruction *I) {		int getInstructionLatency(const Instruction *I) {
if (isa<LoadInst>(I))		if (isa<LoadInst>(I))
return getST()->getSchedModel().DefaultLoadLatency;		return getST()->getSchedModel().DefaultLoadLatency;

return BaseT::getInstructionLatency(I);		return BaseT::getInstructionLatency(I);
}		}

		virtual Optional<unsigned>
		getCacheSize(TargetTransformInfo::CacheLevel Level) const {
		return Optional<unsigned>(
		getST()->getCacheSize(static_cast<unsigned>(Level)));
		}

		virtual Optional<unsigned>
		getCacheAssociativity(TargetTransformInfo::CacheLevel Level) const {
		return Optional<unsigned>(
		getST()->getCacheAssociativity(static_cast<unsigned>(Level)));
		}

		virtual unsigned getCacheLineSize() const {
		return getST()->getCacheLineSize();
		}
		arsenm: All of these should probably have address space arguments
		greened: What would the address space argument specify? What are the use-cases for modeling caches with address spaces? Perhaps this could replace the `resolve...` APIs. That would be nice. These are existing interfaces so what would be the best way to transition them?
		greened: It's probably more productive to continue this discussion on D63614, which is the TTI part of this patch.

		virtual bool prefetchReads() const {
		return getST()->prefetchReads();
		}

		virtual bool prefetchWrites() const {
		return getST()->prefetchWrites();
		}

		virtual bool useReadPrefetchForWrites() const {
		return getST()->useReadPrefetchForWrites();
		}

		virtual unsigned getPrefetchDistance() const {
		return getST()->getPrefetchDistance();
		}

		virtual unsigned getMinPrefetchStride() const {
		return getST()->getMinPrefetchStride();
		}

		virtual unsigned getMaxPrefetchIterationsAhead() const {
		return getST()->getMaxPrefetchIterationsAhead();
		}

/// @}		/// @}

/// \name Vector TTI Implementations		/// \name Vector TTI Implementations
/// @{		/// @{

unsigned getNumberOfRegisters(bool Vector) { return Vector ? 0 : 1; }		unsigned getNumberOfRegisters(bool Vector) { return Vector ? 0 : 1; }

unsigned getRegisterBitWidth(bool Vector) const { return 32; }		unsigned getRegisterBitWidth(bool Vector) const { return 32; }
[1,132 lines not shown]

llvm/include/llvm/CodeGen/TargetSubtargetInfo.h

[38 lines not shown]
	struct MCWriteLatencyEntry;			struct MCWriteLatencyEntry;
	struct MCWriteProcResEntry;			struct MCWriteProcResEntry;
	class RegisterBankInfo;			class RegisterBankInfo;
	class SDep;			class SDep;
	class SelectionDAGTargetInfo;			class SelectionDAGTargetInfo;
	struct SubtargetFeatureKV;			struct SubtargetFeatureKV;
	struct SubtargetInfoKV;			struct SubtargetInfoKV;
	class SUnit;			class SUnit;
				class TargetSystemModel;
	class TargetFrameLowering;			class TargetFrameLowering;
	class TargetInstrInfo;			class TargetInstrInfo;
	class TargetLowering;			class TargetLowering;
	class TargetRegisterClass;			class TargetRegisterClass;
	class TargetRegisterInfo;			class TargetRegisterInfo;
	class TargetSchedModel;			class TargetSchedModel;
	class Triple;			class Triple;

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
	///			///
	/// TargetSubtargetInfo - Generic base class for all target subtargets. All			/// TargetSubtargetInfo - Generic base class for all target subtargets. All
	/// Target-specific options that control code generation and printing should			/// Target-specific options that control code generation and printing should
	/// be exposed through a TargetSubtargetInfo-derived class.			/// be exposed through a TargetSubtargetInfo-derived class.
	///			///
	class TargetSubtargetInfo : public MCSubtargetInfo {			class TargetSubtargetInfo : public MCSubtargetInfo {
	protected: // Can only create subclasses...			protected: // Can only create subclasses...
	TargetSubtargetInfo(const Triple &TT, StringRef CPU, StringRef FS,			TargetSubtargetInfo(const Triple &TT, StringRef CPU, StringRef FS,
	ArrayRef<SubtargetFeatureKV> PF,			ArrayRef<SubtargetFeatureKV> PF,
	ArrayRef<SubtargetFeatureKV> PD,			ArrayRef<SubtargetFeatureKV> PD,
	const SubtargetInfoKV *ProcSched,			const SubtargetInfoKV *ProcSched,
	const MCWriteProcResEntry *WPR,			const MCWriteProcResEntry *WPR,
	const MCWriteLatencyEntry *WL,			const MCWriteLatencyEntry *WL,
	const MCReadAdvanceEntry *RA, const InstrStage *IS,			const MCReadAdvanceEntry *RA, const InstrStage *IS,
	const unsigned *OC, const unsigned *FP);			const unsigned *OC, const unsigned *FP,
				const SubtargetInfoKV *SystemModels);

	public:			public:
	// AntiDepBreakMode - Type of anti-dependence breaking that should			// AntiDepBreakMode - Type of anti-dependence breaking that should
	// be performed before post-RA scheduling.			// be performed before post-RA scheduling.
	using AntiDepBreakMode = enum { ANTIDEP_NONE, ANTIDEP_CRITICAL, ANTIDEP_ALL };			using AntiDepBreakMode = enum { ANTIDEP_NONE, ANTIDEP_CRITICAL, ANTIDEP_ALL };
	using RegClassVector = SmallVectorImpl<const TargetRegisterClass *>;			using RegClassVector = SmallVectorImpl<const TargetRegisterClass *>;

	TargetSubtargetInfo() = delete;			TargetSubtargetInfo() = delete;

llvm/include/llvm/MC/MCSubtargetInfo.h

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#ifndef LLVM_MC_MCSUBTARGETINFO_H		#ifndef LLVM_MC_MCSUBTARGETINFO_H
#define LLVM_MC_MCSUBTARGETINFO_H		#define LLVM_MC_MCSUBTARGETINFO_H

#include "llvm/ADT/ArrayRef.h"		#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
#include "llvm/ADT/Triple.h"		#include "llvm/ADT/Triple.h"
		#include "llvm/MC/MCSystemModel.h"
#include "llvm/MC/MCInstrItineraries.h"		#include "llvm/MC/MCInstrItineraries.h"
#include "llvm/MC/MCSchedule.h"		#include "llvm/MC/MCSchedule.h"
#include "llvm/MC/SubtargetFeature.h"		#include "llvm/MC/SubtargetFeature.h"
#include <algorithm>		#include <algorithm>
#include <cassert>		#include <cassert>
#include <cstdint>		#include <cstdint>
#include <string>		#include <string>

class MCSubtargetInfo {
const MCReadAdvanceEntry *ReadAdvanceTable;		const MCReadAdvanceEntry *ReadAdvanceTable;
const MCSchedModel *CPUSchedModel;		const MCSchedModel *CPUSchedModel;

const InstrStage *Stages; // Instruction itinerary stages		const InstrStage *Stages; // Instruction itinerary stages
const unsigned *OperandCycles; // Itinerary operand cycles		const unsigned *OperandCycles; // Itinerary operand cycles
const unsigned *ForwardingPaths;		const unsigned *ForwardingPaths;
FeatureBitset FeatureBits; // Feature bits for current CPU + FS		FeatureBitset FeatureBits; // Feature bits for current CPU + FS

		// System models
		const SubtargetInfoKV *SystemModels;
		const MCSystemModel *CPUModel;

		/// If caches at a particular level are of different sizes, ask the
		/// target what to do. By default, return zero.
		///
		virtual Optional<unsigned>
		resolveCacheSize(unsigned Level,
		const MCSystemModel::CacheLevelSet &Levels) const {
		return Optional<unsigned>(0);
		}
		Meinersbur: I don't understand the idea of the `resolve...` API. What is a target overriding it supposed to do? There are no overrides in this patch. Why not provide a default implementation of `getCacheSize` that targets can override?
		greened (author): This is something I don't have in our local implementation, but after posting the RFC several people said they'd like to model GPUs and big.LITTLE-style systems. In such hybrid systems, there is no single "L2 cache," for example. If a client asks the system model, "Give me the size of the L2 cache," what should it do? Return the L2 of the CPU, the L2 of the GPU or something else? That's the idea of the `resolve...` stuff. Somebody has to decide what to do, which is what these virtual APIs are suggesting. `SubtargetInfo` might not be the right place for this, but lacking a better place for it, I just put it here. I could remove all this for the initial changeset and just return some default with an override as you suggest. I'm definitely open to ideas/opinions here.
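The `resolve...` idea from this thread can be sketched outside of LLVM as follows. This is a minimal standalone illustration, not the code in this patch: `SubtargetSketch`, `BigLittleSubtarget`, the flat per-level size lists, and the "pick the smallest cache" policy are all hypothetical, and `std::optional` stands in for `llvm::Optional`. Returning an empty optional for a missing level is likewise an assumption.

```cpp
#include <algorithm>
#include <cstddef>
#include <optional>
#include <utility>
#include <vector>

// Hypothetical stand-in for the subtarget class; not the LLVM API.
class SubtargetSketch {
public:
  // Sizes (in bytes) of every cache found at one level of the hierarchy,
  // gathered across all execution resources.
  using LevelSizes = std::vector<unsigned>;

  explicit SubtargetSketch(std::vector<LevelSizes> TheLevels)
      : Levels(std::move(TheLevels)) {}
  virtual ~SubtargetSketch() = default;

  // Answer directly when the level is unambiguous; otherwise ask the target.
  std::optional<unsigned> getCacheSize(std::size_t Level) const {
    if (Level >= Levels.size() || Levels[Level].empty())
      return std::nullopt; // Assumption: no such level means "unknown".
    const LevelSizes &Sizes = Levels[Level];
    bool AllEqual =
        std::all_of(Sizes.begin(), Sizes.end(),
                    [&](unsigned S) { return S == Sizes.front(); });
    if (AllEqual)
      return Sizes.front();
    return resolveCacheSize(Level, Sizes);
  }

protected:
  // Default policy, mirroring the patch's default: report zero when ambiguous.
  virtual std::optional<unsigned> resolveCacheSize(std::size_t,
                                                   const LevelSizes &) const {
    return 0u;
  }

private:
  std::vector<LevelSizes> Levels;
};

// Hypothetical hybrid target: be conservative and report the smallest cache.
class BigLittleSubtarget : public SubtargetSketch {
public:
  using SubtargetSketch::SubtargetSketch;

protected:
  std::optional<unsigned>
  resolveCacheSize(std::size_t, const LevelSizes &Sizes) const override {
    return *std::min_element(Sizes.begin(), Sizes.end());
  }
};
```

With two different L1 sizes at level 0, the base class would report zero while the hypothetical override reports the smaller of the two; a level with a single size never reaches the hook.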

		/// If caches at a particular level are of different
		/// associativities, ask the target what to do. By default, return
		/// one.
		///
		virtual Optional<unsigned>
		resolveCacheAssociativity(unsigned Level,
		const MCSystemModel::CacheLevelSet &Levels) const {
		return Optional<unsigned>(1);
		}

		/// If cache lines at a particular level are of different sizes, ask
		/// the target what to do. By default, return zero.
		///
		virtual Optional<unsigned>
		resolveCacheLineSize(unsigned Level,
		const MCSystemModel::CacheLevelSet &Levels) const {
		return Optional<unsigned>(0);
		}

		/// If prefetcher configs report differences on whether they are
		/// enabled for reads, ask the target what to do. By default,
		/// return the first enable/disable setting we find.
		///
		virtual bool resolvePrefetchReads(
		const MCSystemModel::PrefetchConfigSet &Prefetchers) const {
		return (*Prefetchers.begin())->isEnabledForReads();
		}

		/// If prefetcher configs report differences on whether they are
		/// enabled for writes, ask the target what to do. By default,
		/// return the first enable/disable setting we find.
		///
		virtual bool resolvePrefetchWrites(
		const MCSystemModel::PrefetchConfigSet &Prefetchers) const {
		return (*Prefetchers.begin())->isEnabledForWrites();
		}

		/// If prefetcher configs report differences on whether to use read
		/// prefetches for stores, ask the target what to do. By default,
		/// return the first setting we find.
		///
		virtual bool resolveUseReadPrefetchForWrites(
		const MCSystemModel::PrefetchConfigSet &Prefetchers) const {
		return (*Prefetchers.begin())->useReadPrefetchForWrites();
		}

		/// If prefetcher configs report different distances, ask the target
		/// what to do. By default, return the first distance we find.
		///
		virtual unsigned resolvePrefetchDistanceInInstructions(
		const MCSystemModel::PrefetchConfigSet &Prefetchers) const {
		return (*Prefetchers.begin())->getDistanceInInstructions();
		}

		/// If prefetcher configs report different max distances, ask the
		/// target what to do. By default, return the first distance we
		/// find.
		///
		virtual unsigned resolveMaxPrefetchIterationsAhead(
		const MCSystemModel::PrefetchConfigSet &Prefetchers) const {
		return (*Prefetchers.begin())->getMaxDistanceInIterations();
		}

		/// If prefetcher configs report different min strides, ask the
		/// target what to do. By default, return the first stride we find.
		///
		virtual unsigned resolveMinPrefetchStride(
		const MCSystemModel::PrefetchConfigSet &Prefetchers) const {
		return (*Prefetchers.begin())->getMinByteStride();
		}

public:		public:
MCSubtargetInfo(const MCSubtargetInfo &) = default;		MCSubtargetInfo(const MCSubtargetInfo &) = default;
MCSubtargetInfo(const Triple &TT, StringRef CPU, StringRef FS,		MCSubtargetInfo(const Triple &TT, StringRef CPU, StringRef FS,
ArrayRef<SubtargetFeatureKV> PF,		ArrayRef<SubtargetFeatureKV> PF,
ArrayRef<SubtargetFeatureKV> PD,		ArrayRef<SubtargetFeatureKV> PD,
const SubtargetInfoKV *ProcSched,		const SubtargetInfoKV *ProcSched,
const MCWriteProcResEntry *WPR, const MCWriteLatencyEntry *WL,		const MCWriteProcResEntry *WPR, const MCWriteLatencyEntry *WL,
const MCReadAdvanceEntry *RA, const InstrStage *IS,		const MCReadAdvanceEntry *RA, const InstrStage *IS,
const unsigned *OC, const unsigned *FP);		const unsigned *OC, const unsigned *FP,
		const SubtargetInfoKV *SystemModelTable);
MCSubtargetInfo() = delete;		MCSubtargetInfo() = delete;
MCSubtargetInfo &operator=(const MCSubtargetInfo &) = delete;		MCSubtargetInfo &operator=(const MCSubtargetInfo &) = delete;
MCSubtargetInfo &operator=(MCSubtargetInfo &&) = delete;		MCSubtargetInfo &operator=(MCSubtargetInfo &&) = delete;
virtual ~MCSubtargetInfo() = default;		virtual ~MCSubtargetInfo() = default;

const Triple &getTargetTriple() const { return TargetTriple; }		const Triple &getTargetTriple() const { return TargetTriple; }
StringRef getCPU() const { return CPU; }		StringRef getCPU() const { return CPU; }

resolveVariantSchedClass(unsigned SchedClass, const MCInst *MI,
return 0;		return 0;
}		}

/// Check whether the CPU string is valid.		/// Check whether the CPU string is valid.
bool isCPUStringValid(StringRef CPU) const {		bool isCPUStringValid(StringRef CPU) const {
auto Found = std::lower_bound(ProcDesc.begin(), ProcDesc.end(), CPU);		auto Found = std::lower_bound(ProcDesc.begin(), ProcDesc.end(), CPU);
return Found != ProcDesc.end() && StringRef(Found->Key) == CPU;		return Found != ProcDesc.end() && StringRef(Found->Key) == CPU;
}		}

		/// Get the system model of a CPU.
		const MCSystemModel &getSystemModelForCPU(StringRef CPU) const;

		/// Get the system model for this subtarget's CPU.
		const MCSystemModel &getSystemModel() const { return *CPUModel; }

		/// Return the cache size in bytes for the given level of cache.
		/// Level is zero-based, so a value of zero means the first level of
		/// cache. If the size at the level is ambiguous (for example,
		/// there are two different types of cores with different L1 sizes),
		/// ask the target what to do via resolveCacheSize.
		///
		virtual Optional<unsigned> getCacheSize(unsigned Level) const;

		/// Return the cache associativity for the given level of cache.
		/// Level is zero-based, so a value of zero means the first level of
		/// cache. If the associativity at the level is ambiguous (for
		/// example, there are two different types of cores with different
		/// L1 associativities), ask the target what to do via
		/// resolveCacheAssociativity.
		///
		virtual Optional<unsigned> getCacheAssociativity(unsigned Level) const;

		/// Return the target cache line size in bytes at a given level. If
		/// there are multiple such caches with different sizes, ask the
		/// target what to do via resolveCacheLineSize.
		///
		virtual Optional<unsigned> getCacheLineSize(unsigned Level) const;

		/// Return the target cache line size in bytes. By default, return
		/// the line size for the bottom-most level of cache. This provides
		/// a more convenient interface for the common case where all cache
		/// levels have the same line size. Return zero if there is no
		/// cache model.
		///
		virtual unsigned getCacheLineSize() const {
		Optional<unsigned> Size = getCacheLineSize(0);
		if (Size)
		return *Size;

		return 0;
		}

		/// Return whether we should do software prefetching for loads on
		/// this target.
		///
		virtual bool prefetchReads() const;

		/// Return whether we should do software prefetching for stores on
		/// this target.
		///
		virtual bool prefetchWrites() const;

		/// Return whether to use read prefetches for stores on this target.
		///
		virtual bool useReadPrefetchForWrites() const;

		/// Return the preferred prefetch distance in terms of instructions.
		/// Return the prefetch config for the topmost memory model that has
		/// a prefetcher. If there are multiple such models with different
		/// prefetching configs, return 0. The target will have to override
		/// this to do the right thing.
		///
		virtual unsigned getPrefetchDistance() const;

		/// Return the maximum prefetch distance in terms of loop
		/// iterations. Return the prefetch config for the topmost memory
		/// model that has a prefetcher. If there are multiple such models
		/// with different prefetching configs, return 0. The target will
		/// have to override this to do the right thing.
		///
		virtual unsigned getMaxPrefetchIterationsAhead() const;

		/// Return the minimum stride necessary to trigger software
		/// prefetching. Return the prefetch config for the topmost memory
		/// model that has a prefetcher. If there are multiple such models
		/// with different prefetching configs, return 0. The target will
		/// have to override this to do the right thing.
		///
		virtual unsigned getMinPrefetchStride() const;
};		};

} // end namespace llvm		} // end namespace llvm

#endif // LLVM_MC_MCSUBTARGETINFO_H		#endif // LLVM_MC_MCSUBTARGETINFO_H

llvm/include/llvm/MC/MCSystemModel.h

This file was added.

				//=== MC/MCSystemModel.h - Target System Model -------------*- C++ -*-=======//
				//
				// The LLVM Compiler Infrastructure
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file describes an abstract interface used to get information
				// about a target machine's execution engine, including core
				// specifications, memory models and other things related to execution
				// resources.
				//
				//===----------------------------------------------------------------------===//

				#ifndef LLVM_MC_MCSYSTEMMODEL_H
				#define LLVM_MC_MCSYSTEMMODEL_H

				#include "llvm/ADT/iterator.h"
				#include "llvm/ADT/SmallVector.h"

				#include <cassert>

				namespace llvm {

				/// Provide information about write-combining buffers. These are
				/// typically used by hardware to buffer non-temporal stores for
				/// efficient data streaming. Each buffer expects a more-or-less
				/// linear stream of writes. A write outside the current cache line
				/// being filled causes the buffer to flush, so software should not
				/// oversubscribe the available hardware resources. If it does, in
				/// the worst case buffers will thrash and flush after each write, as
				/// each address sent will map to a cache line outside those currently
				/// being filled. For example, assuming hardware has two buffers,
				/// streaming arrays A and B in the following loop is fine, as writes
				/// to A will map to, say, buffer 0 and writes to B will map to buffer
				/// 1. If C were also streamed, in the worst case its writes would
				/// ping-pong between buffers 0 and 1, flushing one or the other after
				/// each write to C, degrading the streaming of A and B.
				///
				/// do {
				/// A[i] = ...
				/// B[i] = ...
				/// C[i] = ...
				/// } while(i < Something);
				///
				class MCWriteCombiningBufferInfo {
				private:
				unsigned ID;
				const char *Name;
				const int NumBuffers; // The number of write-combining buffers

				public:
				MCWriteCombiningBufferInfo(unsigned I, const char *TheName, int NumBufs)
				: ID(I), Name(TheName), NumBuffers(NumBufs) {}

				virtual ~MCWriteCombiningBufferInfo();

				/// Return the buffer ID number.
				///
				unsigned getID() const { return ID; }

				/// Return the buffer name for debugging.
				///
				const char *getName() const { return Name; }

				/// Return the number of available write-combining buffers.
				///
				int getNumBuffers() const { return NumBuffers; }
				};

				/// MCSoftwarePrefetcherConfig - Provide information about how to
				/// configure the software prefetcher.
				///
				class MCSoftwarePrefetcherConfig {
				private:
				unsigned ID;
				const char *Name;
				const bool EnabledForReads;
				const bool EnabledForWrites;
				const bool UseReadPrefetchForWrites;
				const unsigned BytesAhead;
				const unsigned MinBytesAhead;
				const unsigned MaxBytesAhead;
				const unsigned InstructionsAhead;
				const unsigned MaxIterationsAhead;
				const unsigned MinByteStride;

				public:
				MCSoftwarePrefetcherConfig(unsigned I,
				const char *TheName,
				bool EnableForReads,
				bool EnableForWrites,
				bool ReadPrefetchForWrites,
				unsigned NumBytesAhead,
				unsigned MinNumBytesAhead,
				unsigned MaxNumBytesAhead,
				unsigned NumInstructionsAhead,
				unsigned MaxNumIterationsAhead,
				unsigned MinStride)
				: ID(I),
				Name(TheName),
				EnabledForReads(EnableForReads),
				EnabledForWrites(EnableForWrites),
				UseReadPrefetchForWrites(ReadPrefetchForWrites),
				BytesAhead(NumBytesAhead),
				MinBytesAhead(MinNumBytesAhead),
				MaxBytesAhead(MaxNumBytesAhead),
				InstructionsAhead(NumInstructionsAhead),
				MaxIterationsAhead(MaxNumIterationsAhead),
				MinByteStride(MinStride) {}

				virtual ~MCSoftwarePrefetcherConfig();

				/// Return the prefetch config ID number.
				///
				unsigned getID() const { return ID; }

				/// Return the prefetch config name for debugging.
				///
				const char *getName() const { return Name; }

				/// Return whether we should do software prefetching for loads.
				///
				bool isEnabledForReads() const { return EnabledForReads; }

				/// Return whether we should do software prefetching for stores.
				///
				bool isEnabledForWrites() const { return EnabledForWrites; }

				/// Return whether we should use read prefetches for stores.
				///
				bool useReadPrefetchForWrites() const { return UseReadPrefetchForWrites; }

				/// Return the preferred prefetch distance in bytes. A value of 0
				/// tells the software prefetcher to determine distance using
				/// heuristics.
				///
				unsigned getDistanceInBytes() const { return BytesAhead; }

				/// Never prefetch less than this number of bytes ahead.
				///
				unsigned getMinDistanceInBytes() const { return MinBytesAhead; }

				/// Never prefetch more than this number of bytes ahead.
				///
				unsigned getMaxDistanceInBytes() const { return MaxBytesAhead; }

				/// Return the preferred prefetch distance in terms of number of
				/// instructions.
				///
				unsigned getDistanceInInstructions() const { return InstructionsAhead; }

				/// Never prefetch more than this number of loop iterations ahead.
				///
				unsigned getMaxDistanceInIterations() const { return MaxIterationsAhead; }

				/// Prefetch only if the byte stride is at least this large.
				///
				unsigned getMinByteStride() const { return MinByteStride; }
				};

				/// Provide information about a specific level in the cache (size,
				/// associativity, etc.).
				///
				class MCCacheLevelInfo {
				private:
				unsigned ID;
				const char *Name;
				const unsigned Size; // Size of cache in bytes
				const unsigned LineSize; // Size of cache line in bytes
				const unsigned Ways; // Number of ways
				const unsigned Latency; // Number of cycles to load

				public:
				MCCacheLevelInfo(unsigned I,
				const char *TheName,
				uint64_t TotalSize,
				unsigned TheLineSize,
				unsigned NumWays,
				unsigned TheLatency)
				: ID(I),
				Name(TheName),
				Size(TotalSize),
				LineSize(TheLineSize),
				Ways(NumWays),
				Latency(TheLatency) {}

				virtual ~MCCacheLevelInfo();

				/// Return the cache level ID number.
				///
				unsigned getID() const { return ID; }

				/// Return the cache level name for debugging.
				///
				const char *getName() const { return Name; }

				/// Return the total size of the cache level in bytes.
				///
				uint64_t getSizeInBytes() const { return Size; }

				/// Return the size of the cache line in bytes.
				///
				unsigned getLineSizeInBytes() const { return LineSize; }

				/// Return the number of ways.
				///
				unsigned getAssociativity() const { return Ways; }

				/// Return the latency of a load in clocks.
				///
				unsigned getLatency() const { return Latency; }
				};


				/// Aggregate some number of cache levels together along with a
				/// software prefetching configuration and write-combining buffer
				/// information into a model of the memory system as viewed from a
				/// particular execution resource. For example, a core may have L1 and
				/// L2 caches private to it, while a socket may have an L3 shared by
				/// all cores contained by the socket. The core memory model will
				/// list L1 and L2 and the socket memory model will list L3.
				///
				class MCMemoryModel {
				public:
				typedef const MCCacheLevelInfo *cachelevel_iterator;

				private:
				unsigned IDNum;
				const char *Name;

				const MCCacheLevelInfo *Levels; // Array of cache levels
				unsigned NumLevels; // Number of cache levels
				// Write-combining buffer information
				const MCWriteCombiningBufferInfo &WCBuffers;
				// Software prefetching config
				const MCSoftwarePrefetcherConfig &SoftwarePrefetcher;

				public:
				MCMemoryModel(unsigned I,
				const char *TheName,
				const MCCacheLevelInfo *CacheLevels,
				unsigned NumCacheLevels,
				const MCWriteCombiningBufferInfo &WCBufs,
				const MCSoftwarePrefetcherConfig &PrefetcherConfig)
				: IDNum(I),
				Name(TheName),
				Levels(CacheLevels),
				NumLevels(NumCacheLevels),
				WCBuffers(WCBufs),
				SoftwarePrefetcher(PrefetcherConfig) {}

				virtual ~MCMemoryModel();

				/// Return the memory model ID number.
				///
				unsigned getID() const { return IDNum; }

				/// Return the memory model name for debugging.
				///
				const char *getName() const { return Name; }

				//===--------------------------------------------------------------------===//
				// Cache Level Information
				//

				/// Index the hierarchy for a cache level. Note that this is a
				/// piece of the global cache hierarchy private to the execution
				/// resource using the memory model, and shared by any contained
				/// execution resources. As such "level 0" (or level 1, etc.) has
				/// no correspondence to a global-view cache level. Thus names like
				/// "L1" aren't very useful.
				///
				const MCCacheLevelInfo &getCacheLevel(unsigned Level) const {
				assert(Level < NumLevels &&
				"Attempting to access record for invalid cache level!");
				return Levels[Level];
				}

				/// Return the number of cache levels.
				///
				unsigned getNumCacheLevels() const {
				return NumLevels;
				}

				/// Cache level iterators
				///
				cachelevel_iterator begin() const { return Levels; }
				cachelevel_iterator end() const {
				return Levels + getNumCacheLevels();
				}

				//===--------------------------------------------------------------------===//
				// Write Combining Buffer Information
				//

				/// Return the write combining buffer info.
				///
				const MCWriteCombiningBufferInfo &getWCBufferInfo() const {
				return WCBuffers;
				}

				//===--------------------------------------------------------------------===//
				// Software Prefetcher Configuration
				//

				/// Return the software prefetcher configuration.
				///
				const MCSoftwarePrefetcherConfig &getSoftwarePrefetcherConfig() const {
				return SoftwarePrefetcher;
				}
				};

				class MCExecutionResource;

				/// Provide information about how many execution resources of a
				/// given type are contained within an execution resource. For example
				/// at the socket level there may be a core resource descriptor specifying
				/// that the socket has 48 cores.
				///
				class MCExecutionResourceDesc {
				unsigned ID;
				const char *Name;
				const MCExecutionResource *Resource; // The described resource
				unsigned NumResources; // The resource count

				public:
				MCExecutionResourceDesc(unsigned I,
				const char *TheName,
				const MCExecutionResource *R,
				unsigned N)
				: ID(I), Name(TheName), Resource(R), NumResources(N) {}

				/// Return the resource descriptor ID number.
				///
				unsigned getID() const { return ID; }

				/// Return the resource descriptor name for debugging.
				///
				const char *getName() const { return Name; }

				/// Get the resource.
				///
				const MCExecutionResource &getResource() const {
				return *Resource;
				}

				/// Get the number of resources represented by this descriptor.
				///
				unsigned getNumResources() const {
				return NumResources;
				}
				};

				/// Provide information about a specific kind of execution resource
				/// (core, thread, etc.).
				class MCExecutionResource {
				unsigned ID;
				const char *Name;

				/// An array of execution resource descriptors, allowing an execution
				/// resource to contain a variety of resources; for example a socket
				/// containing some number of big cores and some number of little
				/// cores
				const MCExecutionResourceDesc *const *Contained;

				/// The number of unique contained execution resource types
				unsigned NumContained;

				/// The memory model for this execution resource
				const MCMemoryModel &MemoryModel;

				public:

				using resource_iterator =
				pointee_iterator<const MCExecutionResourceDesc *const *>;

				MCExecutionResource(unsigned I,
				const char *TheName,
				const MCExecutionResourceDesc *const *C,
				unsigned NC,
				const MCMemoryModel &M)
				: ID(I), Name(TheName), Contained(C), NumContained(NC), MemoryModel(M) {}

				virtual ~MCExecutionResource(); // Allow subclasses

				/// Return the resource ID number.
				///
				unsigned getID() const { return ID; }

				/// Return the resource name for debugging.
				///
				const char *getName() const { return Name; }

				/// Return the memory model for this resource.
				///
				const MCMemoryModel &getMemoryModel() const {
				return MemoryModel;
				}

				/// Return the number of unique execution resource types contained
				/// within this one.
				///
				///
				Meinersbur: [style] These trailing empty comment lines don't seem to be useful. The current LLVM code base does not have them.
				unsigned getNumContainedExecutionResourceTypes() const {
				return NumContained;
				}

				/// Iterate over unique contained resources.
				///
				resource_iterator begin() const {
				return resource_iterator(Contained);
				}
				resource_iterator end() const {
				return resource_iterator(Contained + NumContained);
				}

				/// Get the resource descriptor indexed by the given value.
				///
				const MCExecutionResourceDesc &getResourceDescriptor(unsigned Index) const {
				assert(Index < getNumContainedExecutionResourceTypes() &&
				"Overindexing resource descriptors!");
				return *Contained[Index];
				}
				};

				/// Model a collection of execution resources coupled with other
				/// information. This also aggregates information about the resource
				/// memory models, presenting a global system view of memory
				/// characteristics.
				///
				class MCSystemModel {
				unsigned ID;
				const char *Name;

				/// An array of execution resource descriptor pointers
				const MCExecutionResourceDesc *const *Resources;
				unsigned NumResources; /// Number of entries in the array

				static const MCSystemModel Default;
				Meinersbur: Could we avoid the static initializer?
				greened (author): Maybe? I took cues for how the scheduler stuff is implemented and it uses similar static initializers. I'll look into this and see if there is a better way.
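One common way to avoid a static initializer is a function-local static, sketched below. This is only an illustration of that general C++ technique and not something proposed in the thread or present in the patch; `SystemModelSketch` and its members are hypothetical stand-ins for `MCSystemModel`.

```cpp
#include <string>
#include <utility>

// Hypothetical stand-in for MCSystemModel; members are illustrative only.
class SystemModelSketch {
  unsigned ID;
  std::string Name; // Non-trivial member, so construction runs code.

public:
  SystemModelSketch(unsigned I, std::string N) : ID(I), Name(std::move(N)) {}

  unsigned getID() const { return ID; }
  const std::string &getName() const { return Name; }

  // The default model is constructed the first time it is requested rather
  // than at program load time. Initialization of a function-local static is
  // thread-safe since C++11.
  static const SystemModelSketch &getDefaultSystemModel() {
    static const SystemModelSketch Default(0, "default");
    return Default;
  }
};
```

The trade-off is a guarded check on each call in exchange for no code running before `main`; whether that fits the way the scheduler tables are emitted by TableGen is a separate question this sketch does not address.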

				public:
				// Caches of information about execution resources and their memory
				// models.

				// Make the cache topology indexable by level. The bottom-most
				// cache level of each resource makes up level zero. For example:
				//
				// Thread
				// \|
				// Big Core Little Core
				// \/
				// Socket
				//
				// If the big core has an L1 and L2 cache, the little core has an L1
				// cache and the socket has an L3 cache, the big and little L1s go
				// into the L1 set, the big core L2 goes into the L2 set and the
				// socket L3 goes into the L3 set.
				//

				using CacheLevelSet = SmallVector<const MCCacheLevelInfo *, 4>;
				Meinersbur: The type has 'Set' in its name, but is an alias for a vector?

				private:
				using CacheLevelInfo = SmallVector<CacheLevelSet, 4>;
				mutable CacheLevelInfo CacheLevels;
				Meinersbur: Why does it need to be `mutable`?
				greened (author): It doesn't. This is leftover from a time when I thought `initCacheInfoCache` and friends could be called lazily. I'll rework this.
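The level indexing described in the comment above (big and little L1s in the L1 set, the big core's L2 in the L2 set, the socket's L3 in the L3 set) can be sketched with an index that is filled eagerly in the constructor, so no member needs to be `mutable`. This is a standalone illustration under stated assumptions, not the patch's `initCacheInfoCache`: `CacheSketch`, `ResourceSketch`, and `LevelIndexSketch` are hypothetical, and the rule "a resource's caches sit above the deepest hierarchy it contains" is inferred from the example in the comment.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the patch's classes.
struct CacheSketch {
  std::string Name;
  unsigned SizeInBytes;
};

struct ResourceSketch {
  std::string Name;
  std::vector<CacheSketch> Caches;               // Private caches, nearest first.
  std::vector<const ResourceSketch *> Contained; // E.g. cores within a socket.
};

class LevelIndexSketch {
  // Levels[N] holds every cache that sits at global level N.
  std::vector<std::vector<const CacheSketch *>> Levels;

  // Record R's caches above everything R contains and return the number of
  // levels occupied so far. Assumes each resource appears once in the tree.
  std::size_t add(const ResourceSketch &R) {
    std::size_t Base = 0;
    for (const ResourceSketch *C : R.Contained)
      Base = std::max(Base, add(*C));
    if (Levels.size() < Base + R.Caches.size())
      Levels.resize(Base + R.Caches.size());
    for (std::size_t I = 0; I < R.Caches.size(); ++I)
      Levels[Base + I].push_back(&R.Caches[I]);
    return Base + R.Caches.size();
  }

public:
  // Built once, eagerly; Top must outlive the index because the index stores
  // pointers into it.
  explicit LevelIndexSketch(const ResourceSketch &Top) { add(Top); }

  // Return the caches at a level, or nullptr if the level does not exist.
  const std::vector<const CacheSketch *> *
  getCacheLevelInfo(std::size_t Level) const {
    return Level < Levels.size() ? &Levels[Level] : nullptr;
  }
};
```

For a socket with an L3 that contains a big core (L1, L2) and a little core (L1), level 0 ends up with two entries and levels 1 and 2 with one each, matching the sets described in the comment.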

				/// Cache information about caches on an as-needed basis.
				///
				void initCacheInfoCache() const;

				// Gather information about all software prefetch configs.
				//
				public:
				using PrefetchConfigSet = SmallVector<const MCSoftwarePrefetcherConfig *, 4>;

				private:
				mutable PrefetchConfigSet Prefetchers;

				/// Cache information about prefetchers on an as-needed basis.
				///
				void initPrefetchConfigCache() const;

				public:
				/// Convenience values for indexing the global-view cache hierarchy.
				///
				enum CacheLevel {
				L1 = 0,
				L2,
				L3,
				L4
				};

				using resource_iterator =
				pointee_iterator<const MCExecutionResourceDesc *const *>;

				MCSystemModel(unsigned I,
				const char *TheName,
				const MCExecutionResourceDesc *const *R,
				unsigned NR)
				: ID(I), Name(TheName), Resources(R), NumResources(NR) {
				initCacheInfoCache();
				initPrefetchConfigCache();
				}

				virtual ~MCSystemModel();

				/// Return the default initialized model.
				///
				static const MCSystemModel &getDefaultSystemModel() {
				return Default;
				}

				/// Return the execution engine ID number.
				///
				unsigned getID() const { return ID; }

				/// Return the execution engine name for debugging.
				///
				const char *getName() const { return Name; }

				/// Return the number of unique execution resource types.
				///
				unsigned getNumExecutionResourceTypes() const {
				return NumResources;
				}

				/// Iterate over top-level execution resources.
				///
				resource_iterator begin() const {
				return resource_iterator(Resources);
				}
				resource_iterator end() const {
				return resource_iterator(Resources + NumResources);
				}

				/// Get the resource descriptor indexed by the given value.
				///
				const MCExecutionResourceDesc &getResourceDescriptor(unsigned Index) const {
				assert(Index < getNumExecutionResourceTypes() &&
				"Overindexing resource descriptors!");
				return *Resources[Index];
				}

				/// Retrieve cached information about cache levels.
				///
				const CacheLevelSet *getCacheLevelInfo(unsigned Level) const {
				if (Level >= CacheLevels.size()) {
				return nullptr;
				}

				return &CacheLevels[Level];
				}

				/// Retrieve cached information about prefetchers.
				///
				const PrefetchConfigSet &getSoftwarePrefetcherInfo() const {
				return Prefetchers;
				}
				};

				} // End llvm namespace

				#endif
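
The accessors above can be tried out in isolation. The following is a minimal sketch, not the patch's implementation: `ResourceDesc`, `MiniSystemModel`, and `countTopLevelInstances` are hypothetical stand-ins that use only the standard library, and the cache-level table is simplified to a vector of byte sizes per level.

```cpp
#include <cassert>
#include <utility>
#include <vector>

// Hypothetical stand-in for MCExecutionResourceDesc (only the fields used here).
struct ResourceDesc {
  const char *Name;
  unsigned Count; // number of instances of this resource type
};

// Simplified stand-in for MCSystemModel: an array of pointers to descriptors
// plus a per-level list of cache sizes in bytes.
class MiniSystemModel {
  const ResourceDesc *const *Resources; // array of NumResources pointers
  unsigned NumResources;
  std::vector<std::vector<unsigned>> CacheLevels;

public:
  MiniSystemModel(const ResourceDesc *const *R, unsigned NR,
                  std::vector<std::vector<unsigned>> Levels)
      : Resources(R), NumResources(NR), CacheLevels(std::move(Levels)) {}

  unsigned getNumExecutionResourceTypes() const { return NumResources; }

  // Bounds-checked access, mirroring getResourceDescriptor above.
  const ResourceDesc &getResourceDescriptor(unsigned Index) const {
    assert(Index < NumResources && "Overindexing resource descriptors!");
    return *Resources[Index];
  }

  // Returns nullptr for a level the model does not describe.
  const std::vector<unsigned> *getCacheLevelInfo(unsigned Level) const {
    if (Level >= CacheLevels.size())
      return nullptr;
    return &CacheLevels[Level];
  }
};

// Sum the instance counts of all top-level resource descriptors.
inline unsigned countTopLevelInstances(const MiniSystemModel &M) {
  unsigned Total = 0;
  for (unsigned I = 0, E = M.getNumExecutionResourceTypes(); I != E; ++I)
    Total += M.getResourceDescriptor(I).Count;
  return Total;
}
```

Callers are expected to check the pointer returned by `getCacheLevelInfo` before use, since a null result is how a missing level is reported.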

llvm/include/llvm/Target/Target.td

	Show First 20 Lines • Show All 1,452 Lines • ▼ Show 20 Lines

	/// A custom predicate used to determine if an instruction is			/// A custom predicate used to determine if an instruction is
	/// deprecated or not.			/// deprecated or not.
	class ComplexDeprecationPredicate<string dep> {			class ComplexDeprecationPredicate<string dep> {
	string ComplexDeprecationPredicate = dep;			string ComplexDeprecationPredicate = dep;
	}			}

	//===----------------------------------------------------------------------===//			//===----------------------------------------------------------------------===//
				// Pull in the common support for execution engine generation.
				//
				include "llvm/Target/TargetSystemModel.td"

				//===----------------------------------------------------------------------===//
	// Processor chip sets - These values represent each of the chip sets supported			// Processor chip sets - These values represent each of the chip sets supported
	// by the scheduler. Each Processor definition requires corresponding			// by the scheduler. Each Processor definition requires corresponding
	// instruction itineraries.			// instruction itineraries.
	//			//
	class Processor<string n, ProcessorItineraries pi, list<SubtargetFeature> f> {			class Processor<string n, ProcessorItineraries pi, list<SubtargetFeature> f> {
	// Name - Chip set name. Used by command line (-mcpu=) to determine the			// Name - Chip set name. Used by command line (-mcpu=) to determine the
	// appropriate target chip.			// appropriate target chip.
	//			//
	string Name = n;			string Name = n;

	// SchedModel - The machine model for scheduling and instruction cost.			// SchedModel - The machine model for scheduling and instruction cost.
	//			//
	SchedMachineModel SchedModel = NoSchedModel;			SchedMachineModel SchedModel = NoSchedModel;

				// System - A system model describing execution resources and
				// the machine memory model.
				SystemModel System = MinimalSystemModel;

	// ProcItin - The scheduling information for the target processor.			// ProcItin - The scheduling information for the target processor.
	//			//
	ProcessorItineraries ProcItin = pi;			ProcessorItineraries ProcItin = pi;

	// Features - list of			// Features - list of
	list<SubtargetFeature> Features = f;			list<SubtargetFeature> Features = f;
	}			}

	▲ Show 20 Lines • Show All 89 Lines • Show Last 20 Lines

llvm/include/llvm/Target/TargetCacheModel.td

This file was added.

				//===- TargetCacheModel.td - Target cache information----------*- tablegen -*-//
				//
				// The LLVM Compiler Infrastructure
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				//===----------------------------------------------------------------------===//
				// This represents a specific level within the cache hierarchy.
				//===----------------------------------------------------------------------===//
				class Size<int s> : Int<s>;
				class Ways<int w> : Int<w>;
				class LineSize<int l> : Int<l>;
				class Latency<int t> : Int<t>;

				class CacheLevel<Size size, LineSize linesize, Ways ways, Latency latency> {
				int Size = size.Value;
				int LineSize = linesize.Value;
				int Ways = ways.Value;
				int Latency = latency.Value;
				}

				def NoCacheLevel : CacheLevel<Size<0>, LineSize<0>, Ways<0>, Latency<0>>;

				//===----------------------------------------------------------------------===//
				// This models a specific cache hierarchy. Levels should be given in order
				// from lowest to highest (e.g. L1, then L2...).
				//===----------------------------------------------------------------------===//
				class CacheHierarchy<list<CacheLevel> levels> {
				list<CacheLevel> Levels = levels;
				}

				def NoCaches : CacheHierarchy<[]>;

				//===----------------------------------------------------------------------===//
				// Provide some common cache sizes.
				//===----------------------------------------------------------------------===//
				def _1KiB : Int<1024>;
				def _16KiB : Int<!shl(_1KiB.Value, 4)>;
				def _32KiB : Int<!shl(_16KiB.Value, 1)>;
				def _64KiB : Int<!shl(_32KiB.Value, 1)>;
				def _128KiB : Int<!shl(_64KiB.Value, 1)>;
				def _256KiB : Int<!shl(_128KiB.Value, 1)>;
				def _512KiB : Int<!shl(_256KiB.Value, 1)>;
				def _1MiB : Int<!shl(_512KiB.Value, 1)>;
				def _2MiB : Int<!shl(_1MiB.Value, 1)>;
				def _4MiB : Int<!shl(_2MiB.Value, 1)>;
				def _6MiB : Int<!add(_2MiB.Value, _4MiB.Value)>;
				def _8MiB : Int<!shl(_4MiB.Value, 1)>;
				def _12MiB : Int<!add(_8MiB.Value, _4MiB.Value)>;
				def _16MiB : Int<!shl(_8MiB.Value, 1)>;
				def _20MiB : Int<!add(_16MiB.Value, _4MiB.Value)>;
				def _25MiB : Int<!add(_20MiB.Value, !add(_4MiB.Value, _1MiB.Value))>;
				def _32MiB : Int<!shl(_16MiB.Value, 1)>;
				def _40MiB : Int<!add(_32MiB.Value, _8MiB.Value)>;
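
The size constants above are built by chaining shifts and adds from `_1KiB`. As a sanity check of that arithmetic, the same chain can be written in C++ with `static_assert`. This is only an illustrative sketch; the constant names (`KiB1`, `MiB25`, and so on) are hypothetical and do not appear in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Mirror of the TableGen chain: each value is derived from the previous one.
constexpr std::uint64_t KiB1 = 1024;
constexpr std::uint64_t KiB16 = KiB1 << 4;
constexpr std::uint64_t KiB32 = KiB16 << 1;
constexpr std::uint64_t KiB64 = KiB32 << 1;
constexpr std::uint64_t KiB128 = KiB64 << 1;
constexpr std::uint64_t KiB256 = KiB128 << 1;
constexpr std::uint64_t KiB512 = KiB256 << 1;
constexpr std::uint64_t MiB1 = KiB512 << 1;
constexpr std::uint64_t MiB2 = MiB1 << 1;
constexpr std::uint64_t MiB4 = MiB2 << 1;
constexpr std::uint64_t MiB6 = MiB2 + MiB4;
constexpr std::uint64_t MiB8 = MiB4 << 1;
constexpr std::uint64_t MiB12 = MiB8 + MiB4;
constexpr std::uint64_t MiB16 = MiB8 << 1;
constexpr std::uint64_t MiB20 = MiB16 + MiB4;
constexpr std::uint64_t MiB25 = MiB20 + (MiB4 + MiB1);
constexpr std::uint64_t MiB32 = MiB16 << 1;
constexpr std::uint64_t MiB40 = MiB32 + MiB8;

// Each derived constant equals the size its name claims.
static_assert(KiB16 == 16 * 1024, "16 KiB");
static_assert(MiB1 == 1024 * 1024, "1 MiB");
static_assert(MiB6 == 6 * MiB1, "6 MiB");
static_assert(MiB12 == 12 * MiB1, "12 MiB");
static_assert(MiB25 == 25 * MiB1, "25 MiB");
static_assert(MiB40 == 40 * MiB1, "40 MiB");
```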

llvm/include/llvm/Target/TargetMemoryModel.td

This file was added.

				//===- TargetMemoryModel.td - Target memory system information-*- tablegen -*-//
				//
				// The LLVM Compiler Infrastructure
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file defines the target-independent memory hierarchy interfaces which
				// should be implemented by each target that makes use of such information.
				//
				//===----------------------------------------------------------------------===//

				include "llvm/Target/TargetCacheModel.td"
				include "llvm/Target/TargetWCBufferModel.td"
				include "llvm/Target/TargetSoftwarePrefetchConfig.td"

				//===----------------------------------------------------------------------===//
				// MemoryModel - This models a memory subsystem.
				//===----------------------------------------------------------------------===//
				class MemoryModel<CacheHierarchy c,
				WriteCombiningBuffer w,
				SoftwarePrefetcher p> {
				CacheHierarchy Caches = c;
				WriteCombiningBuffer WCBuffers = w;
				SoftwarePrefetcher Prefetcher = p;
				}

				// For execution resources that really don't have a memory model.
				//
				def NoMemoryModel : MemoryModel<NoCaches,
				NoWCBuffers,
				NoSoftwarePrefetcher>;

				//===----------------------------------------------------------------------===//
				// Define the minimal memory model needed to implement legacy TTI interfaces.
				//===----------------------------------------------------------------------===//
				def MinimalMemoryModel : MemoryModel<NoCaches,
				NoWCBuffers,
				TransitionSoftwarePrefetcher>;

llvm/include/llvm/Target/TargetSoftwarePrefetchConfig.td

This file was added.

				//===- TargetSoftwarePrefetchConfig.td - Target prefetch info--*- tablegen -*-//
				//
				// The LLVM Compiler Infrastructure
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				//===----------------------------------------------------------------------===//
				// SoftwarePrefetcher - This provides parameters to software prefetching.
				//===----------------------------------------------------------------------===//
				class IsReadEnabled<int v> : Int<v>;
				class IsWriteEnabled<int v> : Int<v>;
				class ReadPrefetchForWrites<int v> : Int<v>;
				class ByteDistance<int d> : Int<d>;
				class MinByteDistance<int d> : Int<d>;
				class MaxByteDistance<int d> : Int<d>;
				class InstructionDistance<int d> : Int<d>;
				class MaxIterationDistance<int d> : Int<d>;
				class MinByteStride<int s> : Int<s>;

				class SoftwarePrefetcher<IsReadEnabled re,
				IsWriteEnabled we,
				ReadPrefetchForWrites rfw,
				ByteDistance bd,
				MinByteDistance bnd,
				MaxByteDistance bxd,
				InstructionDistance id,
				MaxIterationDistance ixd,
				MinByteStride ns> {
				int EnabledForReads = re.Value; // Do prefetching for loads.
				int EnabledForWrites = we.Value; // Do prefetching for stores.
				int UseReadPFForWrites = rfw.Value; // Use a read prefetch for stores.
				int BytesAhead = bd.Value; // Average "good" prefetch distance. Set
				// to zero to tell the prefetcher to use
				// heuristics to determine an appropriate
				// distance.
				int MinBytesAhead = bnd.Value; // Don't prefetch anything less than this
				// far ahead.
				int MaxBytesAhead = bxd.Value; // Don't prefetch anything more than this
				// far ahead.
				int InstructionsAhead = id.Value; // Prefetch this many instructions ahead,
				// used by prefetchers that operate in
				// terms of instruction distance rather
				// than bytes.
				Meinersbur: ISA instructions or µOps? Why not cycles?
				greened (author): `LoopDataPrefetch` thinks in terms of IR `Instructions`. I'll clarify the comment and name. Maybe we should reconsider how `LoopDataPrefetch` thinks about things but I'd prefer to leave that for later work. I want to be confident we can model the way things work today before we go changing a bunch of things.
				int MaxIterationsAhead = ixd.Value; // Don't prefetch more than this number
				// of iterations ahead. Used by
				// prefetchers that operate in terms of
				// instruction distance rather than
				// bytes.
				Meinersbur: Isn't this algorithm-dependent, i.e. the size of the loop?
				greened (author): Yep. Again, this is driven by `LoopDataPrefetch`.
				int MinStride = ns.Value; // Don't prefetch unless stride is at
				// least this large.
				}


				def ReadEnabled : IsReadEnabled<1>;
				def ReadDisabled : IsReadEnabled<0>;

				def WriteEnabled : IsWriteEnabled<1>;
				def WriteDisabled : IsWriteEnabled<0>;

				def UseReadPrefetchForWrites : ReadPrefetchForWrites<1>;
				def UseWritePrefetchForWrites : ReadPrefetchForWrites<0>;

				def HeuristicByteDistance : ByteDistance<0>;
				def HeuristicInstructionDistance : InstructionDistance<0>;

				def NoSoftwarePrefetcher : SoftwarePrefetcher<ReadDisabled,
				WriteDisabled,
				UseWritePrefetchForWrites,
				HeuristicByteDistance,
				MinByteDistance<0>,
				MaxByteDistance<0>,
				HeuristicInstructionDistance,
				MaxIterationDistance<0>,
				MinByteStride<0>>;

				// This is for targets that define some aspects of prefetching in
				// target-specific TTI. Until such targets are ported to the system
				// model, they should continue to work as they do now. Once the porting
				// is complete, TransitionSoftwarePrefetcher use should migrate to
				// NoSoftwarePrefetcher or to the appropriate target-defined software
				// prefetch configuration.
				//
				def TransitionSoftwarePrefetcher :
				SoftwarePrefetcher<ReadEnabled,
				WriteEnabled,
				UseWritePrefetchForWrites,
				HeuristicByteDistance,
				MinByteDistance<0>,
				MaxByteDistance<0>,
				HeuristicInstructionDistance,
				// Legacy TTI used UINT_MAX, which isn't available in
				// TableGen. Approximate it with something that will
				// work on 32-bit hosts.
				MaxIterationDistance<4294967295>,
				MinByteStride<1>>;
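
To make the two predefined configurations easier to compare, here is a small C++ sketch of the same nine fields. The struct and constant names are hypothetical and are not the classes TableGen would emit. The sketch also checks the comment's claim about the literal `4294967295`: it equals `UINT_MAX` on hosts where `unsigned` is 32 bits wide, which is an assumption about the host rather than a guarantee of the C++ standard.

```cpp
#include <cassert>
#include <climits>
#include <cstdint>

// Hypothetical mirror of the SoftwarePrefetcher fields, in declaration order.
struct PrefetchConfigSketch {
  bool EnabledForReads;
  bool EnabledForWrites;
  bool UseReadPFForWrites;
  std::uint64_t BytesAhead;        // 0 means "use heuristics"
  std::uint64_t MinBytesAhead;
  std::uint64_t MaxBytesAhead;
  std::uint64_t InstructionsAhead; // 0 means "use heuristics"
  std::uint64_t MaxIterationsAhead;
  std::uint64_t MinStride;
};

// Counterpart of NoSoftwarePrefetcher: everything disabled or zero.
const PrefetchConfigSketch NoPrefetcherSketch = {false, false, false, 0, 0,
                                                 0,     0,     0,     0};

// Counterpart of TransitionSoftwarePrefetcher: the legacy TTI defaults.
const PrefetchConfigSketch TransitionPrefetcherSketch = {
    true, true, false, 0, 0, 0, 0, 4294967295ULL, 1};

// True when a value taken from the model fits in the `unsigned` that the
// legacy TTI queries return.
inline bool fitsInUnsigned(std::uint64_t V) {
  return V <= static_cast<std::uint64_t>(UINT_MAX);
}
```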

llvm/include/llvm/Target/TargetSystemModel.td

This file was added.

				//===- TargetSystemModel.td - Target Hardware Info --------*- tablegen -*-====//
				//
				// The LLVM Compiler Infrastructure
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file defines the target-independent execution hardware available. It
				// should be implemented by each target that makes use of such information.
				//
				//===----------------------------------------------------------------------===//


				//===----------------------------------------------------------------------===//
				// A convenience wrapper to hold an integer value for later consumption.
				// This is used to give names to common integer-y concepts like sizes
				// and flags.
				//===----------------------------------------------------------------------===//
				class Int<int value> {
				int Value = value;
				}

				include "llvm/Target/TargetMemoryModel.td"

				class ExecutionResource;

				class ExecutionResourceDesc<ExecutionResource resource, int numresources> {
				ExecutionResource Resource = resource;
				int NumResources = numresources;
				}

				//===----------------------------------------------------------------------===//
				// This models a particular group of execution resources (threads, cores,
				// etc.), describing the relationship among them and their cache sharing
				// characteristics.
				//===----------------------------------------------------------------------===//
				class ExecutionResource<list<ExecutionResourceDesc> contained,
				MemoryModel memmodel> {
				list<ExecutionResourceDesc> Contained = contained;
				MemoryModel MemModel = memmodel;
				}

				//===----------------------------------------------------------------------===//
				// Define some common execution resources, mainly for readability.
				//===----------------------------------------------------------------------===//
				class Thread : ExecutionResource<[], NoMemoryModel>;
				class Core<list<ExecutionResourceDesc> contained,
				MemoryModel memmodel> :
				ExecutionResource<contained, memmodel>;
				class Socket<list<ExecutionResourceDesc> contained,
				MemoryModel memmodel> :
				ExecutionResource<contained, memmodel>;

				//===----------------------------------------------------------------------===//
				// This models a collection of execution resources as well as the machine
				// memory model.
				//===----------------------------------------------------------------------===//
				class SystemModel<list<ExecutionResourceDesc> resources> {
				list<ExecutionResourceDesc> Resources = resources;
				}

				//===----------------------------------------------------------------------===//
				// Define the minimal execution engine needed to implement legacy TTI
				// interfaces.
				//===----------------------------------------------------------------------===//
				def MinimalCore : Core<[], MinimalMemoryModel>;

				def MinimalCoreResourceDesc : ExecutionResourceDesc<MinimalCore, 1>;

				def MinimalSystemModel : SystemModel<[MinimalCoreResourceDesc]>;
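
The `ExecutionResource` and `ExecutionResourceDesc` classes describe a tree: each resource lists the resources it contains together with a count. A rough C++ analogue is sketched below, assuming a socket that holds cores that hold threads. The type names and the `countLeaves` helper are hypothetical and exist only to illustrate the nesting. A resource with an empty `Contained` list, such as `MinimalCore`, counts as a single leaf in this sketch.

```cpp
#include <cassert>
#include <vector>

struct ExecResource;

// Analogue of ExecutionResourceDesc: a contained resource and how many
// instances of it exist.
struct ExecResourceDesc {
  const ExecResource *Resource;
  int NumResources;
};

// Analogue of ExecutionResource: contained resources plus a flag standing in
// for the attached memory model.
struct ExecResource {
  std::vector<ExecResourceDesc> Contained;
  bool HasMemoryModel;
};

// Count the leaf resources reachable from R, multiplying through the
// per-level instance counts.
inline int countLeaves(const ExecResource &R) {
  if (R.Contained.empty())
    return 1;
  int Total = 0;
  for (const ExecResourceDesc &D : R.Contained)
    Total += D.NumResources * countLeaves(*D.Resource);
  return Total;
}
```

With two threads per core and four cores per socket, `countLeaves` on the socket yields eight hardware threads.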

llvm/include/llvm/Target/TargetWCBufferModel.td

This file was added.

				//===- TargetWCBufferModel.td - Target buffer information------*- tablegen -*-//
				//
				// The LLVM Compiler Infrastructure
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				//===----------------------------------------------------------------------===//
				// WriteCombiningBuffer - This models hardware buffers that combine
				// writes into a single transaction. Typically
				// these are used for non-temporal store operations.
				//
				// NumBuffers - The number of effective buffers available.
				//===----------------------------------------------------------------------===//
				class WriteCombiningBuffer<int n> {
				int NumBuffers = n;
				}

				def NoWCBuffers : WriteCombiningBuffer<0>;

llvm/lib/Analysis/TargetTransformInfo.cpp

Show First 20 Lines • Show All 369 Lines • ▼ Show 20 Lines	llvm::Optional<unsigned> TargetTransformInfo::getCacheSize(CacheLevel Level)
return TTIImpl->getCacheSize(Level);		return TTIImpl->getCacheSize(Level);
}		}

llvm::Optional<unsigned> TargetTransformInfo::getCacheAssociativity(		llvm::Optional<unsigned> TargetTransformInfo::getCacheAssociativity(
CacheLevel Level) const {		CacheLevel Level) const {
return TTIImpl->getCacheAssociativity(Level);		return TTIImpl->getCacheAssociativity(Level);
}		}

		bool TargetTransformInfo::prefetchReads() const {
		return TTIImpl->prefetchReads();
		}

		bool TargetTransformInfo::prefetchWrites() const {
		return TTIImpl->prefetchWrites();
		}

		bool TargetTransformInfo::useReadPrefetchForWrites() const {
		return TTIImpl->useReadPrefetchForWrites();
		}

unsigned TargetTransformInfo::getPrefetchDistance() const {		unsigned TargetTransformInfo::getPrefetchDistance() const {
return TTIImpl->getPrefetchDistance();		return TTIImpl->getPrefetchDistance();
}		}

unsigned TargetTransformInfo::getMinPrefetchStride() const {		unsigned TargetTransformInfo::getMinPrefetchStride() const {
return TTIImpl->getMinPrefetchStride();		return TTIImpl->getMinPrefetchStride();
}		}

▲ Show 20 Lines • Show All 838 Lines • Show Last 20 Lines

llvm/lib/CodeGen/TargetSubtargetInfo.cpp

	Show All 13 Lines

	using namespace llvm;			using namespace llvm;

	TargetSubtargetInfo::TargetSubtargetInfo(			TargetSubtargetInfo::TargetSubtargetInfo(
	const Triple &TT, StringRef CPU, StringRef FS,			const Triple &TT, StringRef CPU, StringRef FS,
	ArrayRef<SubtargetFeatureKV> PF, ArrayRef<SubtargetFeatureKV> PD,			ArrayRef<SubtargetFeatureKV> PF, ArrayRef<SubtargetFeatureKV> PD,
	const SubtargetInfoKV *ProcSched, const MCWriteProcResEntry *WPR,			const SubtargetInfoKV *ProcSched, const MCWriteProcResEntry *WPR,
	const MCWriteLatencyEntry *WL, const MCReadAdvanceEntry *RA,			const MCWriteLatencyEntry *WL, const MCReadAdvanceEntry *RA,
	const InstrStage *IS, const unsigned *OC, const unsigned *FP)			const InstrStage *IS, const unsigned *OC, const unsigned *FP,
	: MCSubtargetInfo(TT, CPU, FS, PF, PD, ProcSched, WPR, WL, RA, IS, OC, FP) {			const SubtargetInfoKV *SystemModels)
				: MCSubtargetInfo(TT, CPU, FS, PF, PD, ProcSched, WPR, WL, RA, IS, OC, FP,
				SystemModels) {
	}			}

	TargetSubtargetInfo::~TargetSubtargetInfo() = default;			TargetSubtargetInfo::~TargetSubtargetInfo() = default;

	bool TargetSubtargetInfo::enableAtomicExpand() const {			bool TargetSubtargetInfo::enableAtomicExpand() const {
	return true;			return true;
	}			}

	Show All 30 Lines

llvm/lib/MC/CMakeLists.txt

Show All 36 Lines	add_llvm_library(LLVMMC
MCSectionCOFF.cpp		MCSectionCOFF.cpp
MCSectionELF.cpp		MCSectionELF.cpp
MCSectionMachO.cpp		MCSectionMachO.cpp
MCSectionWasm.cpp		MCSectionWasm.cpp
MCStreamer.cpp		MCStreamer.cpp
MCSubtargetInfo.cpp		MCSubtargetInfo.cpp
MCSymbol.cpp		MCSymbol.cpp
MCSymbolELF.cpp		MCSymbolELF.cpp
		MCSystemModel.cpp
MCTargetOptions.cpp		MCTargetOptions.cpp
MCValue.cpp		MCValue.cpp
MCWasmObjectTargetWriter.cpp		MCWasmObjectTargetWriter.cpp
MCWasmStreamer.cpp		MCWasmStreamer.cpp
MCWin64EH.cpp		MCWin64EH.cpp
MCWinCOFFStreamer.cpp		MCWinCOFFStreamer.cpp
MCWinEH.cpp		MCWinEH.cpp
MachObjectWriter.cpp		MachObjectWriter.cpp
Show All 11 Lines

llvm/lib/MC/MCSubtargetInfo.cpp

//===- MCSubtargetInfo.cpp - Subtarget Information ------------------------===//		//===- MCSubtargetInfo.cpp - Subtarget Information ------------------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/MC/MCSubtargetInfo.h"		#include "llvm/MC/MCSubtargetInfo.h"
#include "llvm/ADT/ArrayRef.h"		#include "llvm/ADT/ArrayRef.h"
#include "llvm/ADT/StringRef.h"		#include "llvm/ADT/StringRef.h"
		#include "llvm/MC/MCSystemModel.h"
#include "llvm/MC/MCInstrItineraries.h"		#include "llvm/MC/MCInstrItineraries.h"
#include "llvm/MC/MCSchedule.h"		#include "llvm/MC/MCSchedule.h"
#include "llvm/MC/SubtargetFeature.h"		#include "llvm/MC/SubtargetFeature.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
#include <algorithm>		#include <algorithm>
#include <cassert>		#include <cassert>
#include <cstring>		#include <cstring>

using namespace llvm;		using namespace llvm;

static FeatureBitset getFeatures(StringRef CPU, StringRef FS,		static FeatureBitset getFeatures(StringRef CPU, StringRef FS,
ArrayRef<SubtargetFeatureKV> ProcDesc,		ArrayRef<SubtargetFeatureKV> ProcDesc,
ArrayRef<SubtargetFeatureKV> ProcFeatures) {		ArrayRef<SubtargetFeatureKV> ProcFeatures) {
SubtargetFeatures Features(FS);		SubtargetFeatures Features(FS);
return Features.getFeatureBits(CPU, ProcDesc, ProcFeatures);		return Features.getFeatureBits(CPU, ProcDesc, ProcFeatures);
}		}

void MCSubtargetInfo::InitMCProcessorInfo(StringRef CPU, StringRef FS) {		void MCSubtargetInfo::InitMCProcessorInfo(StringRef CPU, StringRef FS) {
FeatureBits = getFeatures(CPU, FS, ProcDesc, ProcFeatures);		FeatureBits = getFeatures(CPU, FS, ProcDesc, ProcFeatures);
if (!CPU.empty())		if (!CPU.empty()) {
CPUSchedModel = &getSchedModelForCPU(CPU);		CPUSchedModel = &getSchedModelForCPU(CPU);
else		CPUModel = &getSystemModelForCPU(CPU);
		}
		else {
CPUSchedModel = &MCSchedModel::GetDefaultSchedModel();		CPUSchedModel = &MCSchedModel::GetDefaultSchedModel();
		CPUModel = &MCSystemModel::getDefaultSystemModel();
		}
}		}

void MCSubtargetInfo::setDefaultFeatures(StringRef CPU, StringRef FS) {		void MCSubtargetInfo::setDefaultFeatures(StringRef CPU, StringRef FS) {
FeatureBits = getFeatures(CPU, FS, ProcDesc, ProcFeatures);		FeatureBits = getFeatures(CPU, FS, ProcDesc, ProcFeatures);
}		}

MCSubtargetInfo::MCSubtargetInfo(		MCSubtargetInfo::MCSubtargetInfo(
const Triple &TT, StringRef C, StringRef FS,		const Triple &TT, StringRef C, StringRef FS,
ArrayRef<SubtargetFeatureKV> PF, ArrayRef<SubtargetFeatureKV> PD,		ArrayRef<SubtargetFeatureKV> PF, ArrayRef<SubtargetFeatureKV> PD,
const SubtargetInfoKV *ProcSched, const MCWriteProcResEntry *WPR,		const SubtargetInfoKV *ProcSched, const MCWriteProcResEntry *WPR,
const MCWriteLatencyEntry *WL, const MCReadAdvanceEntry *RA,		const MCWriteLatencyEntry *WL, const MCReadAdvanceEntry *RA,
const InstrStage *IS, const unsigned *OC, const unsigned *FP)		const InstrStage *IS, const unsigned *OC, const unsigned *FP,
		const SubtargetInfoKV *SystemModelTable)
: TargetTriple(TT), CPU(C), ProcFeatures(PF), ProcDesc(PD),		: TargetTriple(TT), CPU(C), ProcFeatures(PF), ProcDesc(PD),
ProcSchedModels(ProcSched), WriteProcResTable(WPR), WriteLatencyTable(WL),		ProcSchedModels(ProcSched), WriteProcResTable(WPR), WriteLatencyTable(WL),
ReadAdvanceTable(RA), Stages(IS), OperandCycles(OC), ForwardingPaths(FP) {		ReadAdvanceTable(RA), Stages(IS), OperandCycles(OC), ForwardingPaths(FP),
		SystemModels(SystemModelTable) {
InitMCProcessorInfo(CPU, FS);		InitMCProcessorInfo(CPU, FS);
}		}

FeatureBitset MCSubtargetInfo::ToggleFeature(uint64_t FB) {		FeatureBitset MCSubtargetInfo::ToggleFeature(uint64_t FB) {
FeatureBits.flip(FB);		FeatureBits.flip(FB);
return FeatureBits;		return FeatureBits;
}		}

▲ Show 20 Lines • Show All 54 Lines • ▼ Show 20 Lines	MCSubtargetInfo::getInstrItineraryForCPU(StringRef CPU) const {
const MCSchedModel &SchedModel = getSchedModelForCPU(CPU);		const MCSchedModel &SchedModel = getSchedModelForCPU(CPU);
return InstrItineraryData(SchedModel, Stages, OperandCycles, ForwardingPaths);		return InstrItineraryData(SchedModel, Stages, OperandCycles, ForwardingPaths);
}		}

void MCSubtargetInfo::initInstrItins(InstrItineraryData &InstrItins) const {		void MCSubtargetInfo::initInstrItins(InstrItineraryData &InstrItins) const {
InstrItins = InstrItineraryData(getSchedModel(), Stages, OperandCycles,		InstrItins = InstrItineraryData(getSchedModel(), Stages, OperandCycles,
ForwardingPaths);		ForwardingPaths);
}		}

		const MCSystemModel &
		MCSubtargetInfo::getSystemModelForCPU(StringRef CPU) const {
		assert(SystemModels && "Processor execution engine not available!");

		ArrayRef<SubtargetInfoKV> Models(SystemModels, ProcDesc.size());

		assert(std::is_sorted(Models.begin(), Models.end(),
		[](const SubtargetInfoKV &LHS, const SubtargetInfoKV &RHS) {
		return strcmp(LHS.Key, RHS.Key) < 0;
		}) &&
		"Processor system model table is not sorted");

		// Find entry
		auto Found =
		std::lower_bound(Models.begin(), Models.end(), CPU);
		if (Found == Models.end() || StringRef(Found->Key) != CPU) {
		if (CPU != "help") // Don't error if the user asked for help.
		Meinersbur: I find this handling of `--help` strange, but the current `MCSubtargetInfo::getSchedModelForCPU` does the same thing.
		greened (author): Right. That's what I used as guidance.
		errs() << "'" << CPU
		<< "' is not a recognized processor for this target"
		<< " (ignoring processor)\n";
		return MCSystemModel::getDefaultSystemModel();
		}
		assert(Found->Value && "Missing processor SystemModel value");
		return *(const MCSystemModel *)Found->Value;
		}
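
The lookup in `getSystemModelForCPU` is a binary search over a table sorted by CPU name, with a fallback to the default model when the name is absent. A standalone sketch of that pattern follows. `ModelKV`, `keyLess`, and `lookupModel` are hypothetical names, and an `int` stands in for the model pointer. Unlike the patch, which relies on a comparison operator between `SubtargetInfoKV` and `StringRef`, the sketch passes an explicit comparator to `std::lower_bound`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <iterator>
#include <string>

// Hypothetical key/value entry; Value stands in for the model pointer.
struct ModelKV {
  const char *Key;
  int Value;
};

// Ordering used both to validate the table and to search it.
inline bool keyLess(const ModelKV &LHS, const ModelKV &RHS) {
  return std::strcmp(LHS.Key, RHS.Key) < 0;
}

// Return the value registered for CPU, or Default when CPU is not in the
// (sorted) table.
inline int lookupModel(const ModelKV *Begin, const ModelKV *End,
                       const std::string &CPU, int Default) {
  assert(std::is_sorted(Begin, End, keyLess) && "table is not sorted");

  // lower_bound finds the first entry whose key is not less than CPU.
  const ModelKV *Found = std::lower_bound(
      Begin, End, CPU,
      [](const ModelKV &Entry, const std::string &Key) {
        return Entry.Key < Key;
      });

  // An exact-match check is still needed: lower_bound only gives a position.
  if (Found == End || CPU != Found->Key)
    return Default;
  return Found->Value;
}
```

The exact-match check after `std::lower_bound` is what turns an unknown name, including a prefix of a known name, into the default result.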

		Optional<unsigned> MCSubtargetInfo::getCacheSize(unsigned Level) const {
		const MCSystemModel::CacheLevelSet *Levels =
		getSystemModel().getCacheLevelInfo(Level);

		if (Levels == nullptr) {
		return Optional<unsigned>();
		}

		Optional<unsigned> Size;
		for (const auto *LevelInfo : *Levels) {
		if (!Size) {
		Size = Optional<unsigned>(LevelInfo->getSizeInBytes());
		continue;
		}
		if (LevelInfo->getSizeInBytes() != *Size)
		// Caches at this level are of different sizes.
		return resolveCacheSize(Level, *Levels);
		}

		return Size;
		}

		Optional<unsigned>
		MCSubtargetInfo::getCacheAssociativity(unsigned Level) const {
		const MCSystemModel::CacheLevelSet *Levels =
		getSystemModel().getCacheLevelInfo(Level);

		if (Levels == nullptr) {
		return Optional<unsigned>();
		}

		Optional<unsigned> Associativity;
		for (const auto *LevelInfo : *Levels) {
		if (!Associativity) {
		Associativity = Optional<unsigned>(LevelInfo->getAssociativity());
		continue;
		}
		if (LevelInfo->getAssociativity() != *Associativity)
		// Caches at this level have different associativities.
		return resolveCacheAssociativity(Level, *Levels);
		}

		return Associativity;
		}

		Optional<unsigned> MCSubtargetInfo::getCacheLineSize(unsigned Level) const {
		const MCSystemModel::CacheLevelSet *Levels =
		getSystemModel().getCacheLevelInfo(Level);

		if (Levels == nullptr) {
		return Optional<unsigned>();
		}

		Optional<unsigned> Size;
		for (const auto *LevelInfo : *Levels) {
		if (!Size) {
		Size = Optional<unsigned>(LevelInfo->getLineSizeInBytes());
		continue;
		}
		if (LevelInfo->getLineSizeInBytes() != *Size)
		return resolveCacheLineSize(Level, *Levels);
		}

		return Size;
		}

		bool MCSubtargetInfo::prefetchReads() const {
		Optional<bool> Enabled;

		const MCSystemModel::PrefetchConfigSet &PrefetcherConfigs =
		getSystemModel().getSoftwarePrefetcherInfo();

		for (const auto *PrefetcherConfig : PrefetcherConfigs) {
		if (!Enabled) {
		Enabled = Optional<bool>(PrefetcherConfig->isEnabledForReads());
		continue;
		}
		if (PrefetcherConfig->isEnabledForReads() != *Enabled)
		return resolvePrefetchReads(PrefetcherConfigs);
		}

		return Enabled ? *Enabled : false;
		}

		bool MCSubtargetInfo::prefetchWrites() const {
		Optional<bool> Enabled;

		const MCSystemModel::PrefetchConfigSet &PrefetcherConfigs =
		getSystemModel().getSoftwarePrefetcherInfo();

		for (const auto *PrefetcherConfig : PrefetcherConfigs) {
		if (!Enabled) {
		Enabled = Optional<bool>(PrefetcherConfig->isEnabledForWrites());
		continue;
		}
		if (PrefetcherConfig->isEnabledForWrites() != *Enabled)
		return resolvePrefetchWrites(PrefetcherConfigs);
		}

		return Enabled ? *Enabled : false;
		}

		bool MCSubtargetInfo::useReadPrefetchForWrites() const {
		Optional<bool> UseReadPFForWrites;

		const MCSystemModel::PrefetchConfigSet &PrefetcherConfigs =
		getSystemModel().getSoftwarePrefetcherInfo();

		for (const auto *PrefetcherConfig : PrefetcherConfigs) {
		if (!UseReadPFForWrites) {
		UseReadPFForWrites =
		Optional<bool>(PrefetcherConfig->useReadPrefetchForWrites());
		continue;
		}
		if (PrefetcherConfig->useReadPrefetchForWrites() != *UseReadPFForWrites)
		return
		resolveUseReadPrefetchForWrites(PrefetcherConfigs);
		}

		return UseReadPFForWrites ? *UseReadPFForWrites : false;
		}

		unsigned MCSubtargetInfo::getPrefetchDistance() const {
		Optional<unsigned> Distance;

		const MCSystemModel::PrefetchConfigSet &PrefetcherConfigs =
		getSystemModel().getSoftwarePrefetcherInfo();

		for (const auto *PrefetcherConfig : PrefetcherConfigs) {
		if (!Distance) {
		Distance =
		Optional<unsigned>(PrefetcherConfig->getDistanceInInstructions());
		continue;
		}
		if (PrefetcherConfig->getDistanceInInstructions() != *Distance)
		return resolvePrefetchDistanceInInstructions(PrefetcherConfigs);
		}

		return Distance ? *Distance : 0;
		}

		unsigned MCSubtargetInfo::getMaxPrefetchIterationsAhead() const {
		Optional<unsigned> Distance;

		const MCSystemModel::PrefetchConfigSet &PrefetcherConfigs =
		getSystemModel().getSoftwarePrefetcherInfo();

		for (const auto *PrefetcherConfig : PrefetcherConfigs) {
		if (!Distance) {
		Distance =
		Optional<unsigned>(PrefetcherConfig->getMaxDistanceInIterations());
		continue;
		}
		if (PrefetcherConfig->getMaxDistanceInIterations() != *Distance)
		return resolveMaxPrefetchIterationsAhead(PrefetcherConfigs);
		}

		return Distance ? *Distance : 0;
		}

		unsigned MCSubtargetInfo::getMinPrefetchStride() const {
		Optional<unsigned> Distance;

		const MCSystemModel::PrefetchConfigSet &PrefetcherConfigs =
		getSystemModel().getSoftwarePrefetcherInfo();

		for (const auto *PrefetcherConfig : PrefetcherConfigs) {
		if (!Distance) {
		Distance = Optional<unsigned>(PrefetcherConfig->getMinByteStride());
		continue;
		}
		if (PrefetcherConfig->getMinByteStride() != *Distance)
		return resolveMinPrefetchStride(PrefetcherConfigs);
		}

		return Distance ? *Distance : 0;
		}
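
All of the query functions above share one shape: walk the entries, return the common value when every entry agrees, and hand the whole set to a `resolve*` hook as soon as two entries differ. The sketch below isolates that pattern. `MaybeUnsigned`, `commonValueOrResolve`, and `pickSmallest` are hypothetical names. In particular, choosing the smallest value is only one possible resolution policy and is not claimed to be what the patch's `resolve*` methods do.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Minimal optional-like result, used instead of llvm::Optional.
struct MaybeUnsigned {
  bool HasValue;
  unsigned Value;
};

using ResolveFn = unsigned (*)(const std::vector<unsigned> &);

// Return the value shared by all entries, defer to Resolve when entries
// disagree, and report "no value" for an empty set.
inline MaybeUnsigned commonValueOrResolve(const std::vector<unsigned> &Values,
                                          ResolveFn Resolve) {
  MaybeUnsigned Common = {false, 0};
  for (unsigned V : Values) {
    if (!Common.HasValue) {
      Common.HasValue = true;
      Common.Value = V;
      continue;
    }
    if (V != Common.Value)
      return MaybeUnsigned{true, Resolve(Values)};
  }
  return Common;
}

// Example policy (an assumption for illustration): be conservative and
// report the smallest value seen. Only called on non-empty sets.
inline unsigned pickSmallest(const std::vector<unsigned> &Values) {
  return *std::min_element(Values.begin(), Values.end());
}
```

Keeping the disagreement case behind a hook lets a target with heterogeneous cores choose its own answer without changing the common path.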

llvm/lib/MC/MCSystemModel.cpp

This file was added.

				//=== MC/MCSystemModel.cpp - Target System Model ------------*- C++ -*-=======//
				//
				// The LLVM Compiler Infrastructure
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This file defines MCSystemModel methods.
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/MC/MCSystemModel.h"
				#include <functional>

				namespace llvm {

				MCWriteCombiningBufferInfo::~MCWriteCombiningBufferInfo() {}
				MCSoftwarePrefetcherConfig::~MCSoftwarePrefetcherConfig() {}
				MCCacheLevelInfo::~MCCacheLevelInfo() {}
				MCMemoryModel::~MCMemoryModel() {}

				namespace {

				MCWriteCombiningBufferInfo DefaultWCBuffers(0,
				"Default Write-Combining Buffer",
				0);

				MCSoftwarePrefetcherConfig DefaultPrefetcherConfig(0,
				"Default Prefetcher",
				false,
				false,
				false,
				0,
				0,
				0,
				0,
				0,
				0);

				const MCMemoryModel DefaultMemoryModel(0,
				"Default Memory Model",
				nullptr,
				0,
				DefaultWCBuffers,
				DefaultPrefetcherConfig);
				Meinersbur: where are these used?
				greened: They aren't anymore. Will remove.

				} // end anonymous namespace


				MCExecutionResource::~MCExecutionResource() {}
				MCSystemModel::~MCSystemModel() {}

				const MCSystemModel MCSystemModel::Default(0,
				"Default System Model",
				nullptr,
				0);


				void MCSystemModel::initCacheInfoCache() const {
				// Create a global cache topology. The tricky part is collapsing
				// the execution resource levels properly. For example, let's say
				// we have a system with a CPU socket and a GPU socket. The CPU
				// socket contains two core types: big and lttle. The CPUsocket
				Meinersbur: [typo] `lttle`
				// contains an L3 cache, the big core contains an L2 and L1 cache
				// and the little core contains an L1 cache. The GPU socket
				// contains shared L2 and L3 caches and GPU cores have a private L1
				// cache:
				//
				//                    System
				//                   /      \
				//     (L2, L3) GPU          CPU (L3)
				//               |          /   \
				//          (L1) C    (L1) L     B (L1, L2)
				//
				// We want the final topology to look like this:
				//
				//   L3 (GPU)           L3 (CPU)
				//   L2 (GPU)           L2 (B)
				//   L1 (C)     L1 (L)  L1 (B)
				//
				// The algorithm below recursively determines the topology for the
				// resources below the current one, then merges the lists from all
				// child resources so that it is no longer than the maximum size of
				// any child list. Child lists are ordered by cache level, so the
				// lowest level of cache appears first.
				//
				auto mergeCacheInfo = [](const CacheLevelInfo &I1,
				const CacheLevelInfo &I2) -> CacheLevelInfo {
				CacheLevelInfo Result;
				unsigned Length = std::max(I1.size(), I2.size());
				for (unsigned i = 0; i < Length; ++i) {
				Result.push_back(CacheLevelSet());
				if (i < I1.size())
				Result.back().append(I1[i].begin(), I1[i].end());
				if (i < I2.size())
				Result.back().append(I2[i].begin(), I2[i].end());
				}

				return Result;
				};

				std::function<CacheLevelInfo(const MCExecutionResourceDesc &)> getCacheInfo =
				[&](const MCExecutionResourceDesc &Desc) -> CacheLevelInfo {
				const MCExecutionResource &Resource = Desc.getResource();

				CacheLevelInfo Result;
				for (const auto &ContainedDesc : Resource) {
				Result = mergeCacheInfo(Result, getCacheInfo(ContainedDesc));
				}

				// Add cache information for this resource.
				for (const auto &Level : Resource.getMemoryModel()) {
				Result.push_back(CacheLevelSet());
				Result.back().push_back(&Level);
				}

				return Result;
				};

				CacheLevels.clear();

				for (const auto &Desc : *this)
				CacheLevels = mergeCacheInfo(CacheLevels, getCacheInfo(Desc));
				}
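
The merge described in the comment above can be tried outside LLVM. The following is a simplified sketch, assuming cache levels are plain string labels and that `Resource` is a hypothetical stand-in for `MCExecutionResource`; it mirrors the two steps of the patch (merge child lists index by index, then append the resource's own levels) on the CPU/GPU example from the comment.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Simplified, hypothetical types: a cache is just a label.
using CacheLevelSet = std::vector<std::string>;    // all caches at one level
using CacheLevelInfo = std::vector<CacheLevelSet>; // index 0 = lowest level

struct Resource {
  std::vector<std::string> OwnCaches; // ordered lowest level first
  std::vector<Resource> Children;
};

// Concatenate the two lists level by level; the result is as long as the
// longer input.
CacheLevelInfo mergeCacheInfo(const CacheLevelInfo &A, const CacheLevelInfo &B) {
  CacheLevelInfo Result(std::max(A.size(), B.size()));
  for (std::size_t I = 0; I < Result.size(); ++I) {
    if (I < A.size())
      Result[I].insert(Result[I].end(), A[I].begin(), A[I].end());
    if (I < B.size())
      Result[I].insert(Result[I].end(), B[I].begin(), B[I].end());
  }
  return Result;
}

// Merge everything below R, then stack R's own caches on top.
CacheLevelInfo getCacheInfo(const Resource &R) {
  CacheLevelInfo Result;
  for (const Resource &Child : R.Children)
    Result = mergeCacheInfo(Result, getCacheInfo(Child));
  for (const std::string &Cache : R.OwnCaches)
    Result.push_back(CacheLevelSet{Cache});
  return Result;
}

// The CPU/GPU system from the comment in initCacheInfoCache.
CacheLevelInfo exampleTopology() {
  Resource C{{"L1 (C)"}, {}};
  Resource L{{"L1 (L)"}, {}};
  Resource B{{"L1 (B)", "L2 (B)"}, {}};
  Resource GPU{{"L2 (GPU)", "L3 (GPU)"}, {C}};
  Resource CPU{{"L3 (CPU)"}, {L, B}};
  Resource System{{}, {GPU, CPU}};
  return getCacheInfo(System);
}
```

Note that in this sketch a list index is a position, not a declared cache level: a resource's own caches are appended after the deepest child list, which lines up with the intended topology for the example but is an assumption about how levels align in general.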

				void MCSystemModel::initPrefetchConfigCache() const {
				Prefetchers.clear();

				using WorkListType = SmallVector<const MCExecutionResourceDesc *, 4>;
				WorkListType WorkList;
				for (const auto &ResourceDesc : *this) {
				WorkList.push_back(&ResourceDesc);
				}

				while (!WorkList.empty()) {
				const MCExecutionResourceDesc *Item = WorkList.back();
				WorkList.pop_back();
				const MCExecutionResource &Resource = Item->getResource();
				const MCSoftwarePrefetcherConfig &Prefetcher =
				Resource.getMemoryModel().getSoftwarePrefetcherConfig();
				if (Prefetcher.isEnabledForReads() || Prefetcher.isEnabledForWrites()) {
				Prefetchers.push_back(&Prefetcher);
				}
				for (const auto &ResourceDesc : Resource) {
				WorkList.push_back(&ResourceDesc);
				}
				}
				}
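
`initPrefetchConfigCache` walks the resource tree with an explicit worklist and keeps every prefetcher that is enabled for reads or writes. A self-contained sketch of the same traversal is below; `Resource`, `PrefetcherConfig`, and the helper names are hypothetical simplifications, not the classes from the patch.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, reduced versions of the patch's classes.
struct PrefetcherConfig {
  std::string Name;
  bool EnabledForReads;
  bool EnabledForWrites;
};

struct Resource {
  PrefetcherConfig Prefetcher;
  std::vector<Resource> Children;
};

// Depth-first walk using an explicit stack instead of recursion.
std::vector<const PrefetcherConfig *>
collectEnabledPrefetchers(const std::vector<Resource> &TopLevel) {
  std::vector<const PrefetcherConfig *> Enabled;
  std::vector<const Resource *> WorkList;
  for (const Resource &R : TopLevel)
    WorkList.push_back(&R);

  while (!WorkList.empty()) {
    const Resource *Item = WorkList.back();
    WorkList.pop_back();
    if (Item->Prefetcher.EnabledForReads || Item->Prefetcher.EnabledForWrites)
      Enabled.push_back(&Item->Prefetcher);
    for (const Resource &Child : Item->Children)
      WorkList.push_back(&Child);
  }
  return Enabled;
}

// Convenience wrapper that copies out the names in visit order.
std::vector<std::string> enabledNames(const std::vector<Resource> &TopLevel) {
  std::vector<std::string> Names;
  for (const PrefetcherConfig *P : collectEnabledPrefetchers(TopLevel))
    Names.push_back(P->Name);
  return Names;
}

// A socket with no prefetcher of its own and two cores that have one.
std::vector<Resource> exampleSystem() {
  Resource Big{{"big", true, false}, {}};
  Resource Little{{"little", true, true}, {}};
  Resource Socket{{"socket", false, false}, {Big, Little}};
  return {Socket};
}
```

Because the worklist is a stack, children are visited in reverse order of insertion, so the order of the collected prefetchers depends on the traversal rather than on declaration order.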

				} // end llvm namespace

llvm/lib/Target/AArch64/AArch64Subtarget.h

[323 lines hidden]	public:
}		}

bool useRSqrt() const { return UseRSqrt; }		bool useRSqrt() const { return UseRSqrt; }
bool force32BitJumpTables() const { return Force32BitJumpTables; }		bool force32BitJumpTables() const { return Force32BitJumpTables; }
unsigned getMaxInterleaveFactor() const { return MaxInterleaveFactor; }		unsigned getMaxInterleaveFactor() const { return MaxInterleaveFactor; }
unsigned getVectorInsertExtractBaseCost() const {		unsigned getVectorInsertExtractBaseCost() const {
return VectorInsertExtractBaseCost;		return VectorInsertExtractBaseCost;
}		}
unsigned getCacheLineSize() const { return CacheLineSize; }		unsigned getCacheLineSize() const override { return CacheLineSize; }
unsigned getPrefetchDistance() const { return PrefetchDistance; }		unsigned getPrefetchDistance() const override { return PrefetchDistance; }
unsigned getMinPrefetchStride() const { return MinPrefetchStride; }		unsigned getMinPrefetchStride() const override { return MinPrefetchStride; }
unsigned getMaxPrefetchIterationsAhead() const {		unsigned getMaxPrefetchIterationsAhead() const override {
return MaxPrefetchIterationsAhead;		return MaxPrefetchIterationsAhead;
}		}
unsigned getPrefFunctionAlignment() const { return PrefFunctionAlignment; }		unsigned getPrefFunctionAlignment() const { return PrefFunctionAlignment; }
unsigned getPrefLoopAlignment() const { return PrefLoopAlignment; }		unsigned getPrefLoopAlignment() const { return PrefLoopAlignment; }

unsigned getMaximumJumpTableSize() const { return MaxJumpTableSize; }		unsigned getMaximumJumpTableSize() const { return MaxJumpTableSize; }

unsigned getWideningBaseCost() const { return WideningBaseCost; }		unsigned getWideningBaseCost() const { return WideningBaseCost; }
[112 lines hidden]

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h

[147 lines hidden]	int getInterleavedMemoryOpCost(unsigned Opcode, Type *VecTy, unsigned Factor,
unsigned AddressSpace,		unsigned AddressSpace,
bool UseMaskForCond = false,		bool UseMaskForCond = false,
bool UseMaskForGaps = false);		bool UseMaskForGaps = false);

bool		bool
shouldConsiderAddressTypePromotion(const Instruction &I,		shouldConsiderAddressTypePromotion(const Instruction &I,
bool &AllowPromotionWithoutCommonHeader);		bool &AllowPromotionWithoutCommonHeader);

unsigned getCacheLineSize();

unsigned getPrefetchDistance();

unsigned getMinPrefetchStride();

unsigned getMaxPrefetchIterationsAhead();

bool shouldExpandReduction(const IntrinsicInst *II) const {		bool shouldExpandReduction(const IntrinsicInst *II) const {
return false;		return false;
}		}

bool useReductionIntrinsic(unsigned Opcode, Type *Ty,		bool useReductionIntrinsic(unsigned Opcode, Type *Ty,
TTI::ReductionFlags Flags) const;		TTI::ReductionFlags Flags) const;

int getArithmeticReductionCost(unsigned Opcode, Type *Ty,		int getArithmeticReductionCost(unsigned Opcode, Type *Ty,
[9 lines hidden]

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp

[871 lines hidden]	if (const GetElementPtrInst *GEPInst = dyn_cast<GetElementPtrInst>(U)) {
AllowPromotionWithoutCommonHeader = true;		AllowPromotionWithoutCommonHeader = true;
break;		break;
}		}
}		}
}		}
return Considerable;		return Considerable;
}		}

unsigned AArch64TTIImpl::getCacheLineSize() {
return ST->getCacheLineSize();
}

unsigned AArch64TTIImpl::getPrefetchDistance() {
return ST->getPrefetchDistance();
}

unsigned AArch64TTIImpl::getMinPrefetchStride() {
return ST->getMinPrefetchStride();
}

unsigned AArch64TTIImpl::getMaxPrefetchIterationsAhead() {
return ST->getMaxPrefetchIterationsAhead();
}

bool AArch64TTIImpl::useReductionIntrinsic(unsigned Opcode, Type *Ty,		bool AArch64TTIImpl::useReductionIntrinsic(unsigned Opcode, Type *Ty,
TTI::ReductionFlags Flags) const {		TTI::ReductionFlags Flags) const {
assert(isa<VectorType>(Ty) && "Expected Ty to be a vector type");		assert(isa<VectorType>(Ty) && "Expected Ty to be a vector type");
unsigned ScalarBits = Ty->getScalarSizeInBits();		unsigned ScalarBits = Ty->getScalarSizeInBits();
switch (Opcode) {		switch (Opcode) {
case Instruction::FAdd:		case Instruction::FAdd:
case Instruction::FMul:		case Instruction::FMul:
case Instruction::And:		case Instruction::And:
[97 lines hidden]

llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCTargetDesc.cpp

	[32 lines hidden]

	#define GET_INSTRINFO_MC_DESC			#define GET_INSTRINFO_MC_DESC
	#include "AMDGPUGenInstrInfo.inc"			#include "AMDGPUGenInstrInfo.inc"

	#define GET_SUBTARGETINFO_MC_DESC			#define GET_SUBTARGETINFO_MC_DESC
	#include "AMDGPUGenSubtargetInfo.inc"			#include "AMDGPUGenSubtargetInfo.inc"

	#define NoSchedModel NoSchedModelR600			#define NoSchedModel NoSchedModelR600
				#define NoCaches NoCachesR600
				#define NoWCBuffers NoWCBuffersR600
				#define NoSoftwarePrefetcher NoSoftwarePrefetcherR600
				#define TransitionSoftwarePrefetcher TransitionSoftwarePrefetcherR600
				#define NoMemoryModel NoMemoryModelR600
				#define MinimalMemoryModel MinimalMemoryModelR600
				#define MinimalCore MinimalCoreR600
				#define MinimalCoreContained MinimalCoreContainedR600
				#define MinimalCoreResourceDesc MinimalCoreResourceDescR600
				#define MinimalSystemModelResources MinimalSystemModelResourcesR600
				#define MinimalSystemModel MinimalSystemModelR600
	#define GET_SUBTARGETINFO_MC_DESC			#define GET_SUBTARGETINFO_MC_DESC
	#include "R600GenSubtargetInfo.inc"			#include "R600GenSubtargetInfo.inc"
	#undef NoSchedModelR600			#undef NoSchedModelR600
				#undef NoCachesR600
				#undef NoWCBuffersR600
				#undef NoSoftwarePrefetcherR600
				#undef TransitionSoftwarePrefetcherR600
				#undef NoMemoryModelR600
				#undef MinimalMemoryModelR600
				#undef MinimalCoreR600
				#undef MinimalCoreContainedR600
				#undef MinimalCoreResourceDescR600
				#undef MinimalSystemModelResourcesR600
				#undef MinimalSystemModelR600

	#define GET_REGINFO_MC_DESC			#define GET_REGINFO_MC_DESC
	#include "AMDGPUGenRegisterInfo.inc"			#include "AMDGPUGenRegisterInfo.inc"

	#define GET_REGINFO_MC_DESC			#define GET_REGINFO_MC_DESC
	#include "R600GenRegisterInfo.inc"			#include "R600GenRegisterInfo.inc"

	static MCInstrInfo *createAMDGPUMCInstrInfo() {			static MCInstrInfo *createAMDGPUMCInstrInfo() {
	[83 lines hidden]
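
The block of `#define` lines above exists because two generated subtarget files are included into one translation unit and both define symbols with the same names. A rough standalone illustration of the renaming trick follows. The `GENERATED_SUBTARGET_BODY` macro is a hypothetical stand-in for a generated `.inc` file, and this sketch undefines the macro name itself (`NoSchedModel`), which differs from the `#undef` lines shown in the diff.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for a generated .inc file that always defines a
// symbol named NoSchedModel.
#define GENERATED_SUBTARGET_BODY(TargetName)                                   \
  static const char *const NoSchedModel = TargetName;

// First "include": defines NoSchedModel.
GENERATED_SUBTARGET_BODY("AMDGPU")

// Second "include": while the macro is active, every use of the name in the
// expansion is rewritten, so the definition becomes NoSchedModelR600.
#define NoSchedModel NoSchedModelR600
GENERATED_SUBTARGET_BODY("R600")
#undef NoSchedModel

// After the #undef, the plain name refers to the first definition again.
const char *firstModel() { return NoSchedModel; }
const char *secondModel() { return NoSchedModelR600; }
```

Each new default object the system model emits into every generated file needs its own rename, which is why the patch adds one `#define` per default name.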

llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.h

[62 lines hidden]	public:
// The Hexagon target can unroll loops with run-time trip counts.		// The Hexagon target can unroll loops with run-time trip counts.
void getUnrollingPreferences(Loop *L, ScalarEvolution &SE,		void getUnrollingPreferences(Loop *L, ScalarEvolution &SE,
TTI::UnrollingPreferences &UP);		TTI::UnrollingPreferences &UP);

/// Bias LSR towards creating post-increment opportunities.		/// Bias LSR towards creating post-increment opportunities.
bool shouldFavorPostInc() const;		bool shouldFavorPostInc() const;

// L1 cache prefetch.		// L1 cache prefetch.
unsigned getPrefetchDistance() const;		unsigned getPrefetchDistance() const override;
unsigned getCacheLineSize() const;		unsigned getCacheLineSize() const override;

/// @}		/// @}

/// \name Vector TTI Implementations		/// \name Vector TTI Implementations
/// @{		/// @{

unsigned getNumberOfRegisters(bool vector) const;		unsigned getNumberOfRegisters(bool vector) const;
unsigned getMaxInterleaveFactor(unsigned VF);		unsigned getMaxInterleaveFactor(unsigned VF);
[72 lines hidden]

llvm/lib/Target/PowerPC/PPCTargetTransformInfo.h

[61 lines hidden]	public:
/// @{		/// @{
bool useColdCCForColdCall(Function &F);		bool useColdCCForColdCall(Function &F);
bool enableAggressiveInterleaving(bool LoopHasReductions);		bool enableAggressiveInterleaving(bool LoopHasReductions);
const TTI::MemCmpExpansionOptions *enableMemCmpExpansion(		const TTI::MemCmpExpansionOptions *enableMemCmpExpansion(
bool IsZeroCmp) const;		bool IsZeroCmp) const;
bool enableInterleavedAccessVectorization();		bool enableInterleavedAccessVectorization();
unsigned getNumberOfRegisters(bool Vector);		unsigned getNumberOfRegisters(bool Vector);
unsigned getRegisterBitWidth(bool Vector) const;		unsigned getRegisterBitWidth(bool Vector) const;
unsigned getCacheLineSize();		unsigned getCacheLineSize() const override;
unsigned getPrefetchDistance();		unsigned getPrefetchDistance() const override;
unsigned getMaxInterleaveFactor(unsigned VF);		unsigned getMaxInterleaveFactor(unsigned VF);
int vectorCostAdjustment(int Cost, unsigned Opcode, Type *Ty1, Type *Ty2);		int vectorCostAdjustment(int Cost, unsigned Opcode, Type *Ty1, Type *Ty2);
int getArithmeticInstrCost(		int getArithmeticInstrCost(
unsigned Opcode, Type *Ty,		unsigned Opcode, Type *Ty,
TTI::OperandValueKind Opd1Info = TTI::OK_AnyValue,		TTI::OperandValueKind Opd1Info = TTI::OK_AnyValue,
TTI::OperandValueKind Opd2Info = TTI::OK_AnyValue,		TTI::OperandValueKind Opd2Info = TTI::OK_AnyValue,
TTI::OperandValueProperties Opd1PropInfo = TTI::OP_None,		TTI::OperandValueProperties Opd1PropInfo = TTI::OP_None,
TTI::OperandValueProperties Opd2PropInfo = TTI::OP_None,		TTI::OperandValueProperties Opd2PropInfo = TTI::OP_None,
[23 lines hidden]

llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp

[268 lines hidden]	unsigned PPCTTIImpl::getRegisterBitWidth(bool Vector) const {
}		}

if (ST->isPPC64())		if (ST->isPPC64())
return 64;		return 64;
return 32;		return 32;

}		}

unsigned PPCTTIImpl::getCacheLineSize() {		unsigned PPCTTIImpl::getCacheLineSize() const {
// Check first if the user specified a custom line size.		// Check first if the user specified a custom line size.
if (CacheLineSize.getNumOccurrences() > 0)		if (CacheLineSize.getNumOccurrences() > 0)
return CacheLineSize;		return CacheLineSize;

// On P7, P8 or P9 we have a cache line size of 128.		// On P7, P8 or P9 we have a cache line size of 128.
unsigned Directive = ST->getDarwinDirective();		unsigned Directive = ST->getDarwinDirective();
if (Directive == PPC::DIR_PWR7 || Directive == PPC::DIR_PWR8 ||		if (Directive == PPC::DIR_PWR7 || Directive == PPC::DIR_PWR8 ||
Directive == PPC::DIR_PWR9)		Directive == PPC::DIR_PWR9)
return 128;		return 128;

// On other processors return a default of 64 bytes.		// On other processors return a default of 64 bytes.
return 64;		return 64;
}		}

unsigned PPCTTIImpl::getPrefetchDistance() {		unsigned PPCTTIImpl::getPrefetchDistance() const {
// This seems like a reasonable default for the BG/Q (this pass is enabled, by		// This seems like a reasonable default for the BG/Q (this pass is enabled, by
// default, only on the BG/Q).		// default, only on the BG/Q).
return 300;		return 300;
}		}

unsigned PPCTTIImpl::getMaxInterleaveFactor(unsigned VF) {		unsigned PPCTTIImpl::getMaxInterleaveFactor(unsigned VF) {
unsigned Directive = ST->getDarwinDirective();		unsigned Directive = ST->getDarwinDirective();
// The 440 has no SIMD support, but floating-point instructions		// The 440 has no SIMD support, but floating-point instructions
[237 lines hidden]

llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.h

[53 lines hidden]	public:
/// @}		/// @}

/// \name Vector TTI Implementations		/// \name Vector TTI Implementations
/// @{		/// @{

unsigned getNumberOfRegisters(bool Vector);		unsigned getNumberOfRegisters(bool Vector);
unsigned getRegisterBitWidth(bool Vector) const;		unsigned getRegisterBitWidth(bool Vector) const;

unsigned getCacheLineSize() { return 256; }		unsigned getCacheLineSize() const override { return 256; }
unsigned getPrefetchDistance() { return 2000; }		unsigned getPrefetchDistance() const override { return 2000; }
unsigned getMinPrefetchStride() { return 2048; }		unsigned getMinPrefetchStride() const override { return 2048; }

bool hasDivRemOp(Type *DataType, bool IsSigned);		bool hasDivRemOp(Type *DataType, bool IsSigned);
bool prefersVectorizedAddressing() { return false; }		bool prefersVectorizedAddressing() { return false; }
bool LSRWithInstrQueries() { return true; }		bool LSRWithInstrQueries() { return true; }
bool supportsEfficientVectorElementLoadStore() { return true; }		bool supportsEfficientVectorElementLoadStore() { return true; }
bool enableInterleavedAccessVectorization() { return true; }		bool enableInterleavedAccessVectorization() { return true; }

int getArithmeticInstrCost(		int getArithmeticInstrCost(
[40 lines hidden]

llvm/lib/Transforms/Scalar/LoopDataPrefetch.cpp

[86 lines hidden]	private:
}		}

unsigned getMaxPrefetchIterationsAhead() {		unsigned getMaxPrefetchIterationsAhead() {
if (MaxPrefetchIterationsAhead.getNumOccurrences() > 0)		if (MaxPrefetchIterationsAhead.getNumOccurrences() > 0)
return MaxPrefetchIterationsAhead;		return MaxPrefetchIterationsAhead;
return TTI->getMaxPrefetchIterationsAhead();		return TTI->getMaxPrefetchIterationsAhead();
}		}

		bool prefetchReads() {
		return TTI->prefetchReads();
		}

		bool prefetchWrites() {
		if (PrefetchWrites.getNumOccurrences() > 0)
		return PrefetchWrites;
		return TTI->prefetchWrites();
		}

		bool useReadPrefetchForWrites() {
		return TTI->useReadPrefetchForWrites();
		}

AssumptionCache *AC;		AssumptionCache *AC;
LoopInfo *LI;		LoopInfo *LI;
ScalarEvolution *SE;		ScalarEvolution *SE;
const TargetTransformInfo *TTI;		const TargetTransformInfo *TTI;
OptimizationRemarkEmitter *ORE;		OptimizationRemarkEmitter *ORE;
};		};

/// Legacy class for inserting loop data prefetches.		/// Legacy class for inserting loop data prefetches.
[146 lines hidden]	bool LoopDataPrefetch::runOnLoop(Loop *L) {

SmallVector<std::pair<Instruction *, const SCEVAddRecExpr *>, 16> PrefLoads;		SmallVector<std::pair<Instruction *, const SCEVAddRecExpr *>, 16> PrefLoads;
for (const auto BB : L->blocks()) {		for (const auto BB : L->blocks()) {
for (auto &I : *BB) {		for (auto &I : *BB) {
Value *PtrValue;		Value *PtrValue;
Instruction *MemI;		Instruction *MemI;

if (LoadInst *LMemI = dyn_cast<LoadInst>(&I)) {		if (LoadInst *LMemI = dyn_cast<LoadInst>(&I)) {
		if (!prefetchReads()) continue;
MemI = LMemI;		MemI = LMemI;
PtrValue = LMemI->getPointerOperand();		PtrValue = LMemI->getPointerOperand();
} else if (StoreInst *SMemI = dyn_cast<StoreInst>(&I)) {		} else if (StoreInst *SMemI = dyn_cast<StoreInst>(&I)) {
if (!PrefetchWrites) continue;		if (!prefetchWrites()) continue;
MemI = SMemI;		MemI = SMemI;
PtrValue = SMemI->getPointerOperand();		PtrValue = SMemI->getPointerOperand();
} else continue;		} else continue;

unsigned PtrAddrSpace = PtrValue->getType()->getPointerAddressSpace();		unsigned PtrAddrSpace = PtrValue->getType()->getPointerAddressSpace();
if (PtrAddrSpace)		if (PtrAddrSpace)
continue;		continue;

[40 lines hidden]	for (auto &I : *BB) {
SCEVExpander SCEVE(*SE, I.getModule()->getDataLayout(), "prefaddr");		SCEVExpander SCEVE(*SE, I.getModule()->getDataLayout(), "prefaddr");
Value *PrefPtrValue = SCEVE.expandCodeFor(NextLSCEV, I8Ptr, MemI);		Value *PrefPtrValue = SCEVE.expandCodeFor(NextLSCEV, I8Ptr, MemI);

IRBuilder<> Builder(MemI);		IRBuilder<> Builder(MemI);
Module *M = BB->getParent()->getParent();		Module *M = BB->getParent()->getParent();
Type *I32 = Type::getInt32Ty(BB->getContext());		Type *I32 = Type::getInt32Ty(BB->getContext());
Function *PrefetchFunc =		Function *PrefetchFunc =
Intrinsic::getDeclaration(M, Intrinsic::prefetch);		Intrinsic::getDeclaration(M, Intrinsic::prefetch);
		int PrefetchType =
		MemI->mayReadFromMemory() ? 0 :
		useReadPrefetchForWrites() ? 0 : 1;
Builder.CreateCall(		Builder.CreateCall(
PrefetchFunc,		PrefetchFunc,
{PrefPtrValue,		{PrefPtrValue,
ConstantInt::get(I32, MemI->mayReadFromMemory() ? 0 : 1),		ConstantInt::get(I32, PrefetchType),
ConstantInt::get(I32, 3), ConstantInt::get(I32, 1)});		ConstantInt::get(I32, 3), ConstantInt::get(I32, 1)});
++NumPrefetches;		++NumPrefetches;
LLVM_DEBUG(dbgs() << " Access: " << *PtrValue << ", SCEV: " << *LSCEV		LLVM_DEBUG(dbgs() << " Access: " << *PtrValue << ", SCEV: " << *LSCEV
<< "\n");		<< "\n");
ORE->emit([&]() {		ORE->emit([&]() {
return OptimizationRemark(DEBUG_TYPE, "Prefetched", MemI)		return OptimizationRemark(DEBUG_TYPE, "Prefetched", MemI)
<< "prefetched memory access";		<< "prefetched memory access";
});		});

MadeChange = true;		MadeChange = true;
}		}
}		}

return MadeChange;		return MadeChange;
}		}
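
The LoopDataPrefetch changes combine two decisions: a command-line option overrides the target's answer when it was given explicitly, and a store may be prefetched with a read prefetch when the target asks for that. A small sketch of both decisions, detached from LLVM, is below. `PrefetchHooks` and the function names are hypothetical, and `std::optional<bool>` is an assumed stand-in for "the option occurred on the command line"; the 0/1 values follow the `rw` argument of `llvm.prefetch` (0 for read, 1 for write).

```cpp
#include <cassert>
#include <optional>

// Hypothetical bundle of the three TTI answers the pass queries.
struct PrefetchHooks {
  bool PrefetchReads;
  bool PrefetchWrites;
  bool UseReadPrefetchForWrites;
};

// An explicit command-line value wins; otherwise ask the target.
bool shouldPrefetchWrites(std::optional<bool> CommandLine,
                          const PrefetchHooks &TTI) {
  return CommandLine ? *CommandLine : TTI.PrefetchWrites;
}

// rw operand for llvm.prefetch: loads always use 0 (read); stores use
// 1 (write) unless the target prefers a read prefetch for them.
int prefetchType(bool AccessReadsMemory, const PrefetchHooks &TTI) {
  if (AccessReadsMemory)
    return 0;
  return TTI.UseReadPrefetchForWrites ? 0 : 1;
}
```

Splitting the nested conditional from the diff into an early return makes the three cases (load, store with read prefetch, store with write prefetch) easier to see.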

llvm/test/TableGen/SystemModelEmitter.td

This file was added.

				// RUN: llvm-tblgen -gen-subtarget -I %p/../../include %s | FileCheck %s

				include "llvm/Target/Target.td"

				// Define a thread.
				def CoreThread : Thread;

				// Define a big core.
				def BigCoreL1 : CacheLevel<Size<_64KiB.Value>,
				LineSize<64>,
				Ways<2>,
				Latency<3>>;

				def BigCoreL2 : CacheLevel<Size<_512KiB.Value>,
				LineSize<64>,
				Ways<16>,
				Latency<12>>;

				def BigCoreCacheHierarchy : CacheHierarchy<[BigCoreL1, BigCoreL2]>;

				def BigCoreWCBuffers : WriteCombiningBuffer<4>;

				def BigCoreSoftwarePrefetcher : SoftwarePrefetcher<Enabled,
				HeuristicByteDistance,
				MinByteDistance<64>,
				MaxByteDistance<640>,
				InstructionDistance<200>,
				MaxIterationDistance<4>,
				MinByteStride<2048>>;

				def BigCoreMemoryModel : MemoryModel<BigCoreCacheHierarchy,
				BigCoreWCBuffers,
				BigCoreSoftwarePrefetcher>;

				def BigCoreThreadDesc : ExecutionResourceDesc<CoreThread, 4>;

				def BigCore : Core<[BigCoreThreadDesc], BigCoreMemoryModel>;

				// Define a little core.
				def LittleCoreL1 : CacheLevel<Size<_32KiB.Value>,
				LineSize<32>,
				Ways<2>,
				Latency<3>>;

				def LittleCoreCacheHierarchy : CacheHierarchy<[LittleCoreL1]>;

				def LittleCoreWCBuffers : WriteCombiningBuffer<2>;

				def LittleCoreSoftwarePrefetcher : SoftwarePrefetcher<Enabled,
				HeuristicByteDistance,
				MinByteDistance<12>,
				MaxByteDistance<120>,
				InstructionDistance<100>,
				MaxIterationDistance<2>,
				MinByteStride<512>>;


				def LittleCoreMemoryModel : MemoryModel<LittleCoreCacheHierarchy,
				LittleCoreWCBuffers,
				LittleCoreSoftwarePrefetcher>;

				def LittleCoreThreadDesc : ExecutionResourceDesc<CoreThread, 2>;

				def LittleCore : Core<[LittleCoreThreadDesc], LittleCoreMemoryModel>;

				// Define the socket-level memory model.
				def SocketL3 : CacheLevel<Size<_2MiB.Value>,
				LineSize<64>,
				Ways<16>,
				Latency<33>>;

				def SocketCacheHierarchy : CacheHierarchy<[SocketL3]>;

				// Prefetching and write-combining are handled at the core level.
				def SocketMemoryModel : MemoryModel<SocketCacheHierarchy,
				NoWCBuffers,
				NoSoftwarePrefetcher>;

				// Define an execution engine containing big cores.
				def HomogeneousCoreDesc : ExecutionResourceDesc<BigCore, 32>;

				def HomogeneousSocket : Socket<[HomogeneousCoreDesc], SocketMemoryModel>;

				def HomogeneousSocketDesc : ExecutionResourceDesc<HomogeneousSocket, 1>;

				def HomogeneousModel : SystemModel<[HomogeneousSocketDesc]>;

				// Define a socket containing big cores and little cores.
				def HeterogeneousBigCoreDesc : ExecutionResourceDesc<BigCore, 2>;
				def HeterogeneousLittleCoreDesc : ExecutionResourceDesc<LittleCore, 16>;

				def HeterogeneousSocket : Socket<[HeterogeneousBigCoreDesc,
				HeterogeneousLittleCoreDesc],
				SocketMemoryModel>;

				def HeterogeneousSocketDesc : ExecutionResourceDesc<HeterogeneousSocket, 1>;

				def HeterogeneousModel : SystemModel<[HeterogeneousSocketDesc]>;


				def HomogeneousProc : Processor<"BigProc", NoItineraries, []> {
				let System = HomogeneousModel;
				}

				def HeterogeneousProc : Processor<"BigLittleProc", NoItineraries, []> {
				let System = HeterogeneousModel;
				}

				def MyProcDefault : Processor<"DefaultProc", NoItineraries, []>;

				def MyTarget : Target;

				// CHECK: // System models
				// CHECK-NEXT: // ===============================================================
				// CHECK-NEXT: //

				// CHECK: // Cache models
				// CHECK-NEXT: //
				// CHECK-NEXT: static const llvm::MCCacheLevelInfo BigCoreCacheHierarchy[] = {
				// CHECK-NEXT: llvm::MCCacheLevelInfo(1, "BigCoreL1", 65536, 64, 2, 3),
				// CHECK-NEXT: llvm::MCCacheLevelInfo(2, "BigCoreL2", 524288, 64, 16, 12)
				// CHECK-NEXT: }; // BigCoreCacheHierarchy

				// CHECK: static const llvm::MCCacheLevelInfo LittleCoreCacheHierarchy[] = {
				// CHECK-NEXT: llvm::MCCacheLevelInfo(3, "LittleCoreL1", 32768, 32, 2, 3)
				// CHECK-NEXT: }; // LittleCoreCacheHierarchy

				// CHECK: static const llvm::MCCacheLevelInfo NoCaches[] = {
				// CHECK-NEXT: llvm::MCCacheLevelInfo(0, "Empty", 0, 0, 0, 0)
				// CHECK-NEXT: }; // NoCaches

				// CHECK: static const llvm::MCCacheLevelInfo SocketCacheHierarchy[] = {
				// CHECK-NEXT: llvm::MCCacheLevelInfo(4, "SocketL3", 2097152, 64, 16, 33)
				// CHECK-NEXT: }; // SocketCacheHierarchy

				// CHECK: // Write-combining buffers
				// CHECK-NEXT: //
				// CHECK-NEXT: static const llvm::MCWriteCombiningBufferInfo BigCoreWCBuffers(5, "BigCoreWCBuffers", 4);

				// CHECK: static const llvm::MCWriteCombiningBufferInfo LittleCoreWCBuffers(6, "LittleCoreWCBuffers", 2);

				// CHECK: static const llvm::MCWriteCombiningBufferInfo NoWCBuffers(7, "NoWCBuffers", 0);

				// CHECK: // Software prefetch configs
				// CHECK-NEXT: //
				// CHECK-NEXT: static const llvm::MCSoftwarePrefetcherConfig BigCoreSoftwarePrefetcher(8, "BigCoreSoftwarePrefetcher", 1, 0, 64, 640, 200, 4, 2048);

				// CHECK: static const llvm::MCSoftwarePrefetcherConfig LittleCoreSoftwarePrefetcher(9, "LittleCoreSoftwarePrefetcher", 1, 0, 12, 120, 100, 2, 512);

				// CHECK: static const llvm::MCSoftwarePrefetcherConfig NoSoftwarePrefetcher(10, "NoSoftwarePrefetcher", 0, 0, 0, 0, 0, 0, 0);

				// CHECK: static const llvm::MCSoftwarePrefetcherConfig TransitionSoftwarePrefetcher(11, "TransitionSoftwarePrefetcher", 1, 0, 0, 0, 0, 4294967295, 1);

				// CHECK: // Memory models
				// CHECK-NEXT: //
				// CHECK-NEXT: static const llvm::MCMemoryModel BigCoreMemoryModel(12, "BigCoreMemoryModel", BigCoreCacheHierarchy, 2, BigCoreWCBuffers, BigCoreSoftwarePrefetcher);

				// CHECK: static const llvm::MCMemoryModel LittleCoreMemoryModel(13, "LittleCoreMemoryModel", LittleCoreCacheHierarchy, 1, LittleCoreWCBuffers, LittleCoreSoftwarePrefetcher);

				// CHECK: static const llvm::MCMemoryModel MinimalMemoryModel(14, "MinimalMemoryModel", NoCaches, 0, NoWCBuffers, TransitionSoftwarePrefetcher);

				// CHECK: static const llvm::MCMemoryModel NoMemoryModel(15, "NoMemoryModel", NoCaches, 0, NoWCBuffers, NoSoftwarePrefetcher);

				// CHECK: static const llvm::MCMemoryModel SocketMemoryModel(16, "SocketMemoryModel", SocketCacheHierarchy, 1, NoWCBuffers, NoSoftwarePrefetcher);

				// CHECK: // System models
				// CHECK-NEXT: //
				// CHECK-NEXT: static const llvm::MCExecutionResourceDesc *CoreThreadContained[] = {
				// CHECK-NEXT: nullptr
				// CHECK-NEXT: };

				// CHECK: static const llvm::MCExecutionResource CoreThread(17, "CoreThread", CoreThreadContained, 0, NoMemoryModel);

				// CHECK: static const llvm::MCExecutionResourceDesc BigCoreThreadDesc(18, "BigCoreThreadDesc", &CoreThread, 4);

				// CHECK: static const llvm::MCExecutionResourceDesc *BigCoreContained[] = {
				// CHECK-NEXT: &BigCoreThreadDesc
				// CHECK-NEXT: };

				// CHECK: static const llvm::MCExecutionResource BigCore(19, "BigCore", BigCoreContained, 1, BigCoreMemoryModel);

				// CHECK: static const llvm::MCExecutionResourceDesc HeterogeneousBigCoreDesc(20, "HeterogeneousBigCoreDesc", &BigCore, 2);

				// CHECK: static const llvm::MCExecutionResourceDesc LittleCoreThreadDesc(21, "LittleCoreThreadDesc", &CoreThread, 2);

				// CHECK: static const llvm::MCExecutionResourceDesc *LittleCoreContained[] = {
				// CHECK-NEXT: &LittleCoreThreadDesc
				// CHECK-NEXT: };

				// CHECK: static const llvm::MCExecutionResource LittleCore(22, "LittleCore", LittleCoreContained, 1, LittleCoreMemoryModel);

				// CHECK: static const llvm::MCExecutionResourceDesc HeterogeneousLittleCoreDesc(23, "HeterogeneousLittleCoreDesc", &LittleCore, 16);

				// CHECK: static const llvm::MCExecutionResourceDesc *HeterogeneousSocketContained[] = {
				// CHECK-NEXT: &HeterogeneousBigCoreDesc,
				// CHECK-NEXT: &HeterogeneousLittleCoreDesc
				// CHECK-NEXT: };

				// CHECK: static const llvm::MCExecutionResource HeterogeneousSocket(24, "HeterogeneousSocket", HeterogeneousSocketContained, 2, SocketMemoryModel);

				// CHECK: static const llvm::MCExecutionResourceDesc HeterogeneousSocketDesc(25, "HeterogeneousSocketDesc", &HeterogeneousSocket, 1);

				// CHECK: static const llvm::MCExecutionResourceDesc *HeterogeneousModelResources[] = {
				// CHECK-NEXT: &HeterogeneousSocketDesc
				// CHECK-NEXT: };

				// CHECK: static const llvm::MCSystemModel HeterogeneousModel(26, "HeterogeneousModel", HeterogeneousModelResources, 1);

				// CHECK: static const llvm::MCExecutionResourceDesc HomogeneousCoreDesc(26, "HomogeneousCoreDesc", &BigCore, 32);

				// CHECK: static const llvm::MCExecutionResourceDesc *HomogeneousSocketContained[] = {
				// CHECK-NEXT: &HomogeneousCoreDesc
				// CHECK-NEXT: };

				// CHECK: static const llvm::MCExecutionResource HomogeneousSocket(27, "HomogeneousSocket", HomogeneousSocketContained, 1, SocketMemoryModel);

				// CHECK: static const llvm::MCExecutionResourceDesc HomogeneousSocketDesc(28, "HomogeneousSocketDesc", &HomogeneousSocket, 1);

				// CHECK: static const llvm::MCExecutionResourceDesc *HomogeneousModelResources[] = {
				// CHECK-NEXT: &HomogeneousSocketDesc
				// CHECK-NEXT: };

				// CHECK: static const llvm::MCSystemModel HomogeneousModel(29, "HomogeneousModel", HomogeneousModelResources, 1);

				// CHECK: static const llvm::MCExecutionResourceDesc *MinimalCoreContained[] = {
				// CHECK-NEXT: nullptr
				// CHECK-NEXT: };

				// CHECK: static const llvm::MCExecutionResource MinimalCore(29, "MinimalCore", MinimalCoreContained, 0, MinimalMemoryModel);

				// CHECK: static const llvm::MCExecutionResourceDesc MinimalCoreResourceDesc(30, "MinimalCoreResourceDesc", &MinimalCore, 1);

				// CHECK: static const llvm::MCExecutionResourceDesc *MinimalSystemModelResources[] = {
				// CHECK-NEXT: &MinimalCoreResourceDesc
				// CHECK-NEXT: };

				// CHECK: static const llvm::MCSystemModel MinimalSystemModel(31, "MinimalSystemModel", MinimalSystemModelResources, 1);

				// CHECK: // Sorted (by key) array of execution engine model for CPU subtype.
				// CHECK-NEXT: extern const llvm::SubtargetInfoKV MyTargetProcSystemModelKV[] = {
				// CHECK-NEXT: { "BigLittleProc", (const void *)&HeterogeneousModel },
				// CHECK-NEXT: { "BigProc", (const void *)&HomogeneousModel },
				// CHECK-NEXT: { "DefaultProc", (const void *)&MinimalSystemModel },
				// CHECK-NEXT: };

				// CHECK: const SubtargetInfoKV *ProcSystem) :
				// CHECK-NEXT: MCSubtargetInfo(TT, CPU, FS, PF, PD, ProcSched,
				// CHECK-NEXT: WPR, WL, RA, IS, OC, FP, ProcSystem) { }

				// CHECK: static inline MCSubtargetInfo *createMyTargetMCSubtargetInfoImpl(const Triple &TT, StringRef CPU, StringRef FS) {
				// CHECK-NEXT: return new MyTargetGenMCSubtargetInfo(TT, CPU, FS, None, MyTargetSubTypeKV,
				// CHECK-NEXT: MyTargetProcSchedKV, MyTargetWriteProcResTable, MyTargetWriteLatencyTable, MyTargetReadAdvanceTable,
				// CHECK-NEXT: nullptr, nullptr, nullptr, MyTargetProcSystemModelKV);
				// CHECK-NEXT: }

				// CHECK: extern const llvm::SubtargetInfoKV MyTargetProcSystemModelKV[];
				// CHECK-NEXT: MyTargetGenSubtargetInfo::MyTargetGenSubtargetInfo(const Triple &TT, StringRef CPU, StringRef FS)
				// CHECK-NEXT: : TargetSubtargetInfo(TT, CPU, FS, None, makeArrayRef(MyTargetSubTypeKV, 3),
				// CHECK-NEXT: MyTargetProcSchedKV, MyTargetWriteProcResTable, MyTargetWriteLatencyTable, MyTargetReadAdvanceTable,
				// CHECK-NEXT: nullptr, nullptr, nullptr,
				// CHECK-NEXT: MyTargetProcSystemModelKV) {}
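
The `MyTargetProcSystemModelKV` table checked above is emitted sorted by key, which is the shape needed for a binary search by CPU name. The following is a minimal, self-contained sketch of such a lookup. `ModelKV`, `findModel`, and the integer placeholders standing in for model objects are hypothetical names for illustration only; they are not the patch's `SubtargetInfoKV` or `MCSubtargetInfo` code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstring>
#include <iterator>

// Hypothetical stand-in for a key/value entry; not LLVM's SubtargetInfoKV.
struct ModelKV {
  const char *Key;
  const void *Value;
};

// Placeholder "models"; in the patch these would be MCSystemModel objects.
static const int HeterogeneousModel = 0;
static const int HomogeneousModel = 1;
static const int MinimalSystemModel = 2;

// Must stay sorted by Key for the binary search below to be valid.
static const ModelKV ProcSystemModelKV[] = {
    {"BigLittleProc", &HeterogeneousModel},
    {"BigProc", &HomogeneousModel},
    {"DefaultProc", &MinimalSystemModel},
};

// Returns the value registered for CPU, or nullptr if CPU is unknown.
inline const void *findModel(const char *CPU) {
  const ModelKV *Begin = std::begin(ProcSystemModelKV);
  const ModelKV *End = std::end(ProcSystemModelKV);
  const ModelKV *It = std::lower_bound(
      Begin, End, CPU, [](const ModelKV &Entry, const char *Name) {
        return std::strcmp(Entry.Key, Name) < 0;
      });
  if (It == End || std::strcmp(It->Key, CPU) != 0)
    return nullptr;
  return It->Value;
}
```

In this sketch an unknown CPU name yields `nullptr`; how the patch itself handles a missing entry (for example, by falling back to a default model) is not shown in this test.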

llvm/unittests/CodeGen/MachineInstrTest.cpp

   void emitEpilogue(MachineFunction &MF,
                     MachineBasicBlock &MBB) const override {}
   bool hasFP(const MachineFunction &MF) const override { return false; }
 };

 class BogusSubtarget : public TargetSubtargetInfo {
 public:
   BogusSubtarget(TargetMachine &TM)
       : TargetSubtargetInfo(Triple(""), "", "", {}, {}, nullptr, nullptr,
-                            nullptr, nullptr, nullptr, nullptr, nullptr),
+                            nullptr, nullptr, nullptr, nullptr, nullptr,
+                            nullptr),
         FL(), TL(TM) {}
   ~BogusSubtarget() override {}

   const TargetFrameLowering *getFrameLowering() const override { return &FL; }

   const TargetLowering *getTargetLowering() const override { return &TL; }

   const TargetInstrInfo *getInstrInfo() const override { return &TII; }

llvm/unittests/MC/CMakeLists.txt

 set(LLVM_LINK_COMPONENTS
   ${LLVM_TARGETS_TO_BUILD}
   MC
   MCDisassembler
   Support
   )

 add_llvm_unittest(MCTests
   Disassembler.cpp
   DwarfLineTables.cpp
   StringTableBuilderTest.cpp
+  SystemModel.cpp
   TargetRegistry.cpp
   )

llvm/unittests/MC/SystemModel.cpp

This file was added.

				//===- unittests/MC/SystemModel.cpp ---------------------------------------===//
				//
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/MC/MCSystemModel.h"
				#include "gtest/gtest.h"

				using namespace llvm;

				namespace {

				TEST(SystemModel, Topology1Tests) {
				// Test this topology:
				//
				//                 System
				//                /      \
				//    (L2, L3) GPU        CPU (L3)
				//              |       /     \
				//         (L1) C      L (L1)  B (L1, L2)

				const unsigned BigL1 = 0;
				const unsigned BigL2 = 1;

				const unsigned LittleL1 = 0;

				const unsigned GPUCoreL1 = 0;

				const unsigned CPUL3 = 0;

				const unsigned GPUL2 = 0;
				const unsigned GPUL3 = 1;

				const unsigned Big = 0;
				const unsigned Little = 1;

				const unsigned GPU = 0;

				const unsigned CPUSocket = 0;
				const unsigned GPUSocket = 1;

				// Define cache parameters.
				const char *BigCacheLevelNames[] = { "BigL1", "BigL2" };
				unsigned BigCacheLevelSizes[] = { 1024*16, 1024*1024*4 };
				unsigned BigCacheLevelLineSizes[] = { 32, 32 };
				unsigned BigCacheLevelAssociativities[] = { 8, 24 };
				unsigned BigCacheLevelLatencies[] = { 2, 12 };

				const char *LittleCacheLevelNames[] = { "LittleL1" };
				unsigned LittleCacheLevelSizes[] = { 1024*8 };
				unsigned LittleCacheLevelLineSizes[] = { 32 };
				unsigned LittleCacheLevelAssociativities[] = { 8 };
				unsigned LittleCacheLevelLatencies[] = { 2 };

				const char *CPUCacheLevelNames[] = { "CPUL3" };
				unsigned CPUCacheLevelSizes[] = { 1024*1024*8 };
				unsigned CPUCacheLevelLineSizes[] = { 32 };
				unsigned CPUCacheLevelAssociativities[] = { 32 };
				unsigned CPUCacheLevelLatencies[] = { 50 };

				const char *GPUCoreCacheLevelNames[] = { "GPUCoreL1" };
				unsigned GPUCoreCacheLevelSizes[] = { 1024*32 };
				unsigned GPUCoreCacheLevelLineSizes[] = { 64 };
				unsigned GPUCoreCacheLevelAssociativities[] = { 8 };
				unsigned GPUCoreCacheLevelLatencies[] = { 2 };

				const char *GPUCacheLevelNames[] = { "GPUL2", "GPUL3" };
				unsigned GPUCacheLevelSizes[] = { 1024*64, 1024*1024*2 };
				unsigned GPUCacheLevelLineSizes[] = { 64, 64 };
				unsigned GPUCacheLevelAssociativities[] = { 24, 32 };
				unsigned GPUCacheLevelLatencies[] = { 12, 50 };

				// Define thread parameters.
				const char *ThreadName = "Thread";

				// Define core parameters.
				// The GPU has four cores with two thread team schedulers of vector
				// length 64, for a total of 512 "threads."
				const char *CPUCoreNames[] = { "BigCore", "LittleCore" };
				unsigned CPUCoreCounts[] = { 2, 8 };
				unsigned CPUThreadCounts[] = { 4, 2 };

				const char *GPUCoreNames[] = { "GPUCore" };
				unsigned GPUCoreCounts[] = { 4 };
				// Threads in a core. The GPU has two thread team schedulers, each
				// team may be a vector length of, say, 64 which we don't model.
				unsigned GPUThreadCounts[] = { 2 };

				// Define socket parameters.
				const char *SocketNames[] = { "CPU", "GPU" };
				unsigned CoreTypeCounts[] = { 2, 1 };

				unsigned ID = 0;

				// Define write-combining buffers.
				MCWriteCombiningBufferInfo WCBufs(ID++, "WCBufs", 4);
				MCWriteCombiningBufferInfo NoWCBufs(ID++, "NoWCBufs", 0);

				// Define software prefetchers.
				MCSoftwarePrefetcherConfig Prefetcher(ID++, "Prefetcher", true, 1024, 512,
				4096, 100, 4, 32);

				MCSoftwarePrefetcherConfig NoPrefetcher(ID++, "NoPrefetcher", false, 0, 0, 0,
				0, 0, 0);

				// Define caches.
				MCCacheLevelInfo BigCoreCacheLevels[] = {
				MCCacheLevelInfo(ID++,
				BigCacheLevelNames[BigL1],
				BigCacheLevelSizes[BigL1],
				BigCacheLevelLineSizes[BigL1],
				BigCacheLevelAssociativities[BigL1],
				BigCacheLevelLatencies[BigL1]),
				MCCacheLevelInfo(ID++,
				BigCacheLevelNames[BigL2],
				BigCacheLevelSizes[BigL2],
				BigCacheLevelLineSizes[BigL2],
				BigCacheLevelAssociativities[BigL2],
				BigCacheLevelLatencies[BigL2]),
				};

				MCCacheLevelInfo LittleCoreCacheLevels[] = {
				MCCacheLevelInfo(ID++,
				LittleCacheLevelNames[LittleL1],
				LittleCacheLevelSizes[LittleL1],
				LittleCacheLevelLineSizes[LittleL1],
				LittleCacheLevelAssociativities[LittleL1],
				LittleCacheLevelLatencies[LittleL1]),
				};

				MCCacheLevelInfo CPUCacheLevels[] = {
				MCCacheLevelInfo(ID++,
				CPUCacheLevelNames[CPUL3],
				CPUCacheLevelSizes[CPUL3],
				CPUCacheLevelLineSizes[CPUL3],
				CPUCacheLevelAssociativities[CPUL3],
				CPUCacheLevelLatencies[CPUL3]),
				};

				// Each GPU core has a small cache.
				MCCacheLevelInfo GPUCoreCacheLevels[] = {
				MCCacheLevelInfo(ID++,
				GPUCoreCacheLevelNames[GPUCoreL1],
				GPUCoreCacheLevelSizes[GPUCoreL1],
				GPUCoreCacheLevelLineSizes[GPUCoreL1],
				GPUCoreCacheLevelAssociativities[GPUCoreL1],
				GPUCoreCacheLevelLatencies[GPUCoreL1]),
				};

				// All GPU cores share two higher levels of cache.
				MCCacheLevelInfo GPUCacheLevels[] = {
				MCCacheLevelInfo(ID++,
				GPUCacheLevelNames[GPUL2],
				GPUCacheLevelSizes[GPUL2],
				GPUCacheLevelLineSizes[GPUL2],
				GPUCacheLevelAssociativities[GPUL2],
				GPUCacheLevelLatencies[GPUL2]),
				MCCacheLevelInfo(ID++,
				GPUCacheLevelNames[GPUL3],
				GPUCacheLevelSizes[GPUL3],
				GPUCacheLevelLineSizes[GPUL3],
				GPUCacheLevelAssociativities[GPUL3],
				GPUCacheLevelLatencies[GPUL3]),
				};

				// Define memory models.
				MCMemoryModel BigMemModel(ID++,
				"BigMemModel",
				BigCoreCacheLevels,
				2,
				WCBufs,
				Prefetcher);

				MCMemoryModel LittleMemModel(ID++,
				"LittleMemModel",
				LittleCoreCacheLevels,
				1,
				WCBufs,
				Prefetcher);

				MCMemoryModel CPUMemModel(ID++,
				"CPUMemModel",
				CPUCacheLevels,
				1,
				NoWCBufs,
				NoPrefetcher);

				MCMemoryModel GPUCoreMemModel(ID++,
				"GPUCoreMemModel",
				GPUCoreCacheLevels,
				1,
				NoWCBufs,
				NoPrefetcher);

				MCMemoryModel GPUMemModel(ID++,
				"GPUMemModel",
				GPUCacheLevels,
				2,
				NoWCBufs,
				NoPrefetcher);

				MCMemoryModel NoMemModel(ID++, "NoModel", nullptr, 0, NoWCBufs, NoPrefetcher);

				// Define threads.
				MCExecutionResource CommonThread(ID++, ThreadName, nullptr, 0, NoMemModel);

				// Define cores.
				MCExecutionResourceDesc BigThreadDesc(ID++,
				"BigThreadDesc",
				&CommonThread,
				CPUThreadCounts[Big]);
				MCExecutionResourceDesc LittleThreadDesc(ID++,
				"LittleThreadDesc",
				&CommonThread,
				CPUThreadCounts[Little]);
				MCExecutionResourceDesc GPUThreadDesc(ID++,
				"GPUThreadDesc",
				&CommonThread,
				GPUThreadCounts[GPU]);

				MCExecutionResourceDesc *BigThreadsList[] = { &BigThreadDesc };
				MCExecutionResourceDesc *LittleThreadsList[] = { &LittleThreadDesc };
				MCExecutionResourceDesc *GPUThreadsList[] = { &GPUThreadDesc };

				MCExecutionResource BigCore(ID++,
				CPUCoreNames[Big],
				BigThreadsList,
				1,
				BigMemModel);

				MCExecutionResource LittleCore(ID++,
				CPUCoreNames[Little],
				LittleThreadsList,
				1,
				LittleMemModel);

				MCExecutionResource GPUCore(ID++,
				GPUCoreNames[GPU],
				GPUThreadsList,
				1,
				GPUCoreMemModel);

				// Define sockets.
				MCExecutionResourceDesc BigCoreDesc(ID++,
				"BigCoreDesc",
				&BigCore,
				CPUCoreCounts[Big]);

				MCExecutionResourceDesc LittleCoreDesc(ID++,
				"LittleCoreDesc",
				&LittleCore,
				CPUCoreCounts[Little]);

				MCExecutionResourceDesc GPUCoreDesc(ID++,
				"GPUCoreDesc",
				&GPUCore,
				GPUCoreCounts[GPU]);

				MCExecutionResourceDesc *CPUCoreList[] = { &BigCoreDesc, &LittleCoreDesc };
				MCExecutionResourceDesc *GPUCoreList[] = { &GPUCoreDesc };

				MCExecutionResource CPUEngine(ID++,
				SocketNames[CPUSocket],
				CPUCoreList,
				2,
				CPUMemModel);

				MCExecutionResource GPUEngine(ID++,
				SocketNames[GPUSocket],
				GPUCoreList,
				1,
				GPUMemModel);

				// Define a node consisting of a CPU socket and a GPU socket.
				MCExecutionResourceDesc CPUSocketDesc(ID++, "CPUSocketDesc", &CPUEngine, 1);
				MCExecutionResourceDesc GPUSocketDesc(ID++, "GPUSocketDesc", &GPUEngine, 1);

				MCExecutionResourceDesc *SocketList[] = { &CPUSocketDesc, &GPUSocketDesc };

				MCSystemModel Node(ID++, "Node", SocketList, 2);

				// Test the topology.
				EXPECT_EQ(Node.getNumExecutionResourceTypes(), 2u);

				unsigned s = 0;
				for (const auto &SocketDesc : Node) {
				EXPECT_EQ(SocketDesc.getNumResources(), 1u);

				const auto &Socket = SocketDesc.getResource();
				EXPECT_STREQ(Socket.getName(), SocketNames[s]);
				EXPECT_EQ(Socket.getNumContainedExecutionResourceTypes(),
				CoreTypeCounts[s]);

				unsigned *CoreCounts = (s == CPUSocket ?
				CPUCoreCounts : GPUCoreCounts);
				const char * const *CoreNames = (s == CPUSocket ?
				CPUCoreNames : GPUCoreNames);
				unsigned *ThreadCounts = (s == CPUSocket ?
				CPUThreadCounts : GPUThreadCounts);

				unsigned c = 0;
				for (const auto &CoreDesc : Socket) {
				EXPECT_EQ(CoreDesc.getNumResources(), CoreCounts[c]);

				const auto &Core = CoreDesc.getResource();
				EXPECT_STREQ(Core.getName(), CoreNames[c]);

				EXPECT_EQ(Core.getNumContainedExecutionResourceTypes(), 1u);

				const auto &ThreadDesc = Core.getResourceDescriptor(0);
				EXPECT_EQ(ThreadDesc.getNumResources(), ThreadCounts[c]);

				const auto &Thread = ThreadDesc.getResource();
				EXPECT_STREQ(Thread.getName(), ThreadName);

				EXPECT_EQ(Thread.getNumContainedExecutionResourceTypes(), 0u);

				// Check thread-level caches.
				unsigned NumThreadCacheLevels =
				Thread.getMemoryModel().getNumCacheLevels();
				EXPECT_EQ(NumThreadCacheLevels, 0u);

				// Check core-level caches.
				const char * const *CacheNames = (s == CPUSocket &&
				c == Big ?
				BigCacheLevelNames :
				(s == CPUSocket &&
				c == Little ?
				LittleCacheLevelNames :
				GPUCoreCacheLevelNames));
				const unsigned *CacheSizes = (s == CPUSocket &&
				c == Big ?
				BigCacheLevelSizes :
				(s == CPUSocket &&
				c == Little ?
				LittleCacheLevelSizes :
				GPUCoreCacheLevelSizes));
				const unsigned *CacheLineSizes = (s == CPUSocket &&
				c == Big ?
				BigCacheLevelLineSizes :
				(s == CPUSocket &&
				c == Little ?
				LittleCacheLevelLineSizes :
				GPUCoreCacheLevelLineSizes));
				const unsigned *CacheAssociativities = (s == CPUSocket &&
				c == Big ?
				BigCacheLevelAssociativities :
				(s == CPUSocket &&
				c == Little ?
				LittleCacheLevelAssociativities :
				GPUCoreCacheLevelAssociativities)
				);
				const unsigned *CacheLatencies = (s == CPUSocket &&
				c == Big ?
				BigCacheLevelLatencies :
				(s == CPUSocket &&
				c == Little ?
				LittleCacheLevelLatencies :
				GPUCoreCacheLevelLatencies));

				unsigned lvl = 0;
				for (const auto &CacheLevel : Core.getMemoryModel()) {
				EXPECT_STREQ(CacheLevel.getName(), CacheNames[lvl]);
				EXPECT_EQ(CacheLevel.getSizeInBytes(), CacheSizes[lvl]);
				EXPECT_EQ(CacheLevel.getLineSizeInBytes(), CacheLineSizes[lvl]);
				EXPECT_EQ(CacheLevel.getAssociativity(), CacheAssociativities[lvl]);
				EXPECT_EQ(CacheLevel.getLatency(), CacheLatencies[lvl]);

				++lvl;
				}

				++c;
				}

				// Check socket-level caches.
				const char * const *CacheNames = (s == CPUSocket ?
				CPUCacheLevelNames :
				GPUCacheLevelNames);
				const unsigned *CacheSizes = (s == CPUSocket ?
				CPUCacheLevelSizes :
				GPUCacheLevelSizes);
				const unsigned *CacheLineSizes = (s == CPUSocket ?
				CPUCacheLevelLineSizes :
				GPUCacheLevelLineSizes);
				const unsigned *CacheAssociativities = (s == CPUSocket ?
				CPUCacheLevelAssociativities :
				GPUCacheLevelAssociativities);
				const unsigned *CacheLatencies = (s == CPUSocket ?
				CPUCacheLevelLatencies :
				GPUCacheLevelLatencies);

				unsigned lvl = 0;
				for (const auto &CacheLevel : Socket.getMemoryModel()) {
				EXPECT_STREQ(CacheLevel.getName(), CacheNames[lvl]);
				EXPECT_EQ(CacheLevel.getSizeInBytes(), CacheSizes[lvl]);
				EXPECT_EQ(CacheLevel.getLineSizeInBytes(), CacheLineSizes[lvl]);
				EXPECT_EQ(CacheLevel.getAssociativity(), CacheAssociativities[lvl]);
				EXPECT_EQ(CacheLevel.getLatency(), CacheLatencies[lvl]);

				++lvl;
				}

				++s;
				}

				// Test the global system representation of the memory model.
				const MCSystemModel::CacheLevelSet *L1Levels =
				Node.getCacheLevelInfo(MCSystemModel::CacheLevel::L1);
				const MCSystemModel::CacheLevelSet *L2Levels =
				Node.getCacheLevelInfo(MCSystemModel::CacheLevel::L2);
				const MCSystemModel::CacheLevelSet *L3Levels =
				Node.getCacheLevelInfo(MCSystemModel::CacheLevel::L3);
				const MCSystemModel::CacheLevelSet *L4Levels =
				Node.getCacheLevelInfo(MCSystemModel::CacheLevel::L4);

				const MCSystemModel::PrefetchConfigSet &PrefetchConfigs =
				Node.getSoftwarePrefetcherInfo();

				EXPECT_NE(L1Levels, nullptr);
				EXPECT_NE(L2Levels, nullptr);
				EXPECT_NE(L3Levels, nullptr);
				EXPECT_EQ(L4Levels, nullptr);

				EXPECT_EQ(L1Levels->size(), 3u);
				EXPECT_EQ(L2Levels->size(), 2u);
				EXPECT_EQ(L3Levels->size(), 2u);
				EXPECT_EQ(PrefetchConfigs.size(), 2u);

				unsigned i = 0;
				for (const auto L1Level : *L1Levels) {
				const char * const *CacheNames = (i == 0 ? BigCacheLevelNames :
				i == 1 ? LittleCacheLevelNames :
				GPUCoreCacheLevelNames);
				const unsigned *CacheSizes = (i == 0 ? BigCacheLevelSizes :
				i == 1 ? LittleCacheLevelSizes :
				GPUCoreCacheLevelSizes);
				const unsigned *CacheLineSizes = (i == 0 ? BigCacheLevelLineSizes :
				i == 1 ? LittleCacheLevelLineSizes :
				GPUCoreCacheLevelLineSizes
				);
				const unsigned *CacheAssociativities = (i == 0 ?
				BigCacheLevelAssociativities :
				i == 1 ?
				LittleCacheLevelAssociativities :
				GPUCoreCacheLevelAssociativities);
				const unsigned *CacheLatencies = (i == 0 ? BigCacheLevelLatencies :
				i == 1 ? LittleCacheLevelLatencies :
				GPUCoreCacheLevelLatencies
				);

				unsigned Index = (i == 0 ? BigL1 :
				i == 1 ? LittleL1 : GPUCoreL1);

				EXPECT_STREQ(L1Level->getName(), CacheNames[Index]);
				EXPECT_EQ(L1Level->getSizeInBytes(), CacheSizes[Index]);
				EXPECT_EQ(L1Level->getLineSizeInBytes(), CacheLineSizes[Index]);
				EXPECT_EQ(L1Level->getAssociativity(), CacheAssociativities[Index]);
				EXPECT_EQ(L1Level->getLatency(), CacheLatencies[Index]);

				++i;
				}

				i = 0;
				for (const auto L2Level : *L2Levels) {
				const char * const *CacheNames = (i == 0 ? BigCacheLevelNames :
				GPUCacheLevelNames);
				const unsigned *CacheSizes = (i == 0 ? BigCacheLevelSizes :
				GPUCacheLevelSizes);
				const unsigned *CacheLineSizes = (i == 0 ? BigCacheLevelLineSizes :
				GPUCacheLevelLineSizes);
				const unsigned *CacheAssociativities = (i == 0 ?
				BigCacheLevelAssociativities :
				GPUCacheLevelAssociativities);
				const unsigned *CacheLatencies = (i == 0 ? BigCacheLevelLatencies :
				GPUCacheLevelLatencies);

				unsigned Index = (i == 0 ? BigL2 : GPUL2);

				EXPECT_STREQ(L2Level->getName(), CacheNames[Index]);
				EXPECT_EQ(L2Level->getSizeInBytes(), CacheSizes[Index]);
				EXPECT_EQ(L2Level->getLineSizeInBytes(), CacheLineSizes[Index]);
				EXPECT_EQ(L2Level->getAssociativity(), CacheAssociativities[Index]);
				EXPECT_EQ(L2Level->getLatency(), CacheLatencies[Index]);

				++i;
				}

				i = 0;
				for (const auto L3Level : *L3Levels) {
				const char * const *CacheNames = (i == 0 ? CPUCacheLevelNames :
				GPUCacheLevelNames);
				const unsigned *CacheSizes = (i == 0 ? CPUCacheLevelSizes :
				GPUCacheLevelSizes);
				const unsigned *CacheLineSizes = (i == 0 ? CPUCacheLevelLineSizes :
				GPUCacheLevelLineSizes);
				const unsigned *CacheAssociativities = (i == 0 ?
				CPUCacheLevelAssociativities :
				GPUCacheLevelAssociativities);
				const unsigned *CacheLatencies = (i == 0 ? CPUCacheLevelLatencies :
				GPUCacheLevelLatencies);

				unsigned Index = (i == 0 ? CPUL3 : GPUL3);

				EXPECT_STREQ(L3Level->getName(), CacheNames[Index]);
				EXPECT_EQ(L3Level->getSizeInBytes(), CacheSizes[Index]);
				EXPECT_EQ(L3Level->getLineSizeInBytes(), CacheLineSizes[Index]);
				EXPECT_EQ(L3Level->getAssociativity(), CacheAssociativities[Index]);
				EXPECT_EQ(L3Level->getLatency(), CacheLatencies[Index]);

				++i;
				}
				}
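
The global expectations in the test above (three L1 entries, two L2, two L3, and no L4) are consistent with a numbering in which a resource's private caches are counted after the deepest cache hierarchy found among its children. The sketch below illustrates that reading on the first topology. It is an inference from the expected values only, not a description of how `MCSystemModel` is implemented; `Resource`, `LevelSets`, `collect`, and `topology1Levels` are hypothetical names, and threads are omitted because they carry no caches.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, simplified stand-in for an execution resource; not the
// patch's MCExecutionResource.
struct Resource {
  std::vector<std::string> Caches;  // Private cache levels, nearest first.
  std::vector<Resource> Children;   // Contained resource types.
};

// Levels[N] holds the names of all caches that act as level N+1.
using LevelSets = std::vector<std::vector<std::string>>;

// Visits children first, then numbers this resource's caches starting
// after the deepest hierarchy found below it. Returns the number of
// cache levels at or below R.
inline std::size_t collect(const Resource &R, LevelSets &Levels) {
  std::size_t Below = 0;
  for (const Resource &Child : R.Children)
    Below = std::max(Below, collect(Child, Levels));
  for (std::size_t I = 0; I < R.Caches.size(); ++I) {
    if (Levels.size() <= Below + I)
      Levels.resize(Below + I + 1);
    Levels[Below + I].push_back(R.Caches[I]);
  }
  return Below + R.Caches.size();
}

// Builds the first test topology (threads omitted) and collects its levels.
inline LevelSets topology1Levels() {
  Resource Big{{"BigL1", "BigL2"}, {}};
  Resource Little{{"LittleL1"}, {}};
  Resource GPUCore{{"GPUCoreL1"}, {}};
  Resource CPU{{"CPUL3"}, {Big, Little}};
  Resource GPU{{"GPUL2", "GPUL3"}, {GPUCore}};
  Resource System{{}, {CPU, GPU}};
  LevelSets Levels;
  collect(System, Levels);
  return Levels;
}
```

Under this reading, the CPU socket's single cache lands at level 3 because the big core already occupies levels 1 and 2, even though the little core has only an L1.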

				TEST(SystemModel, Topology2Tests) {
				// Test this topology:
				//
				//                 System
				//                /      \
				//             GPU        CPU (L3)
				//              |       /     \
				//         (L1) C      L (L1)  B (L1, L2)

				const unsigned BigL1 = 0;
				const unsigned BigL2 = 1;

				const unsigned LittleL1 = 0;

				const unsigned GPUCoreL1 = 0;

				const unsigned CPUL3 = 0;

				const unsigned Big = 0;
				const unsigned Little = 1;

				const unsigned GPU = 0;

				const unsigned CPUSocket = 0;
				const unsigned GPUSocket = 1;

				// Define cache parameters.
				const char *BigCacheLevelNames[] = { "BigL1", "BigL2" };
				unsigned BigCacheLevelSizes[] = { 1024*16, 1024*1024*4 };
				unsigned BigCacheLevelLineSizes[] = { 32, 32 };
				unsigned BigCacheLevelAssociativities[] = { 8, 24 };
				unsigned BigCacheLevelLatencies[] = { 2, 12 };

				const char *LittleCacheLevelNames[] = { "LittleL1" };
				unsigned LittleCacheLevelSizes[] = { 1024*8 };
				unsigned LittleCacheLevelLineSizes[] = { 32 };
				unsigned LittleCacheLevelAssociativities[] = { 8 };
				unsigned LittleCacheLevelLatencies[] = { 2 };

				const char *CPUCacheLevelNames[] = { "CPUL3" };
				unsigned CPUCacheLevelSizes[] = { 1024*1024*8 };
				unsigned CPUCacheLevelLineSizes[] = { 32 };
				unsigned CPUCacheLevelAssociativities[] = { 32 };
				unsigned CPUCacheLevelLatencies[] = { 50 };

				const char *GPUCoreCacheLevelNames[] = { "GPUCoreL1" };
				unsigned GPUCoreCacheLevelSizes[] = { 1024*32 };
				unsigned GPUCoreCacheLevelLineSizes[] = { 64 };
				unsigned GPUCoreCacheLevelAssociativities[] = { 8 };
				unsigned GPUCoreCacheLevelLatencies[] = { 2 };

				// Define thread parameters.
				const char *ThreadName = "Thread";

				// Define core parameters.
				// The GPU has four cores with two thread team schedulers of vector
				// length 64, for a total of 512 "threads."
				const char *CPUCoreNames[] = { "BigCore", "LittleCore" };
				unsigned CPUCoreCounts[] = { 2, 8 };
				unsigned CPUThreadCounts[] = { 4, 2 };

				const char *GPUCoreNames[] = { "GPUCore" };
				unsigned GPUCoreCounts[] = { 4 };
				// Threads in a core. The GPU has two thread team schedulers, each
				// team may be a vector length of, say, 64 which we don't model.
				unsigned GPUThreadCounts[] = { 2 };

				// Define socket parameters.
				const char *SocketNames[] = { "CPU", "GPU" };
				unsigned CoreTypeCounts[] = { 2, 1 };

				unsigned ID = 0;

				// Define write-combining buffers.
				MCWriteCombiningBufferInfo WCBufs(ID++, "WCBufs", 4);
				MCWriteCombiningBufferInfo NoWCBufs(ID++, "NoWCBufs", 0);

				// Define software prefetchers.
				MCSoftwarePrefetcherConfig Prefetcher(ID++, "Prefetcher", true, 1024, 512,
				4096, 100, 4, 32);

				MCSoftwarePrefetcherConfig NoPrefetcher(ID++, "NoPrefetcher", false, 0, 0, 0,
				0, 0, 0);

				// Define caches.
				MCCacheLevelInfo BigCoreCacheLevels[] = {
				MCCacheLevelInfo(ID++,
				BigCacheLevelNames[BigL1],
				BigCacheLevelSizes[BigL1],
				BigCacheLevelLineSizes[BigL1],
				BigCacheLevelAssociativities[BigL1],
				BigCacheLevelLatencies[BigL1]),
				MCCacheLevelInfo(ID++,
				BigCacheLevelNames[BigL2],
				BigCacheLevelSizes[BigL2],
				BigCacheLevelLineSizes[BigL2],
				BigCacheLevelAssociativities[BigL2],
				BigCacheLevelLatencies[BigL2]),
				};

				MCCacheLevelInfo LittleCoreCacheLevels[] = {
				MCCacheLevelInfo(ID++,
				LittleCacheLevelNames[LittleL1],
				LittleCacheLevelSizes[LittleL1],
				LittleCacheLevelLineSizes[LittleL1],
				LittleCacheLevelAssociativities[LittleL1],
				LittleCacheLevelLatencies[LittleL1]),
				};

				MCCacheLevelInfo CPUCacheLevels[] = {
				MCCacheLevelInfo(ID++,
				CPUCacheLevelNames[CPUL3],
				CPUCacheLevelSizes[CPUL3],
				CPUCacheLevelLineSizes[CPUL3],
				CPUCacheLevelAssociativities[CPUL3],
				CPUCacheLevelLatencies[CPUL3]),
				};

				// Each GPU core has a small cache.
				MCCacheLevelInfo GPUCoreCacheLevels[] = {
				MCCacheLevelInfo(ID++,
				GPUCoreCacheLevelNames[GPUCoreL1],
				GPUCoreCacheLevelSizes[GPUCoreL1],
				GPUCoreCacheLevelLineSizes[GPUCoreL1],
				GPUCoreCacheLevelAssociativities[GPUCoreL1],
				GPUCoreCacheLevelLatencies[GPUCoreL1]),
				};

				// Define memory models.
				MCMemoryModel BigMemModel(ID++,
				"BigMemModel",
				BigCoreCacheLevels,
				2,
				WCBufs,
				Prefetcher);

				MCMemoryModel LittleMemModel(ID++,
				"LittleMemModel",
				LittleCoreCacheLevels,
				1,
				WCBufs,
				Prefetcher);

				MCMemoryModel CPUMemModel(ID++,
				"CPUMemModel",
				CPUCacheLevels,
				1,
				NoWCBufs,
				NoPrefetcher);

				MCMemoryModel GPUCoreMemModel(ID++,
				"GPUCoreMemModel",
				GPUCoreCacheLevels,
				1,
				NoWCBufs,
				NoPrefetcher);

				MCMemoryModel NoMemModel(ID++, "NoModel", nullptr, 0, NoWCBufs, NoPrefetcher);

				// Define threads.
				MCExecutionResource CommonThread(ID++, ThreadName, nullptr, 0, NoMemModel);

				// Define cores.
				MCExecutionResourceDesc BigThreadDesc(ID++,
				"BigThreadDesc",
				&CommonThread,
				CPUThreadCounts[Big]);
				MCExecutionResourceDesc LittleThreadDesc(ID++,
				"LittleThreadDesc",
				&CommonThread,
				CPUThreadCounts[Little]);
				MCExecutionResourceDesc GPUThreadDesc(ID++,
				"GPUThreadDesc",
				&CommonThread,
				GPUThreadCounts[GPU]);

				MCExecutionResourceDesc *BigThreadsList[] = { &BigThreadDesc };
				MCExecutionResourceDesc *LittleThreadsList[] = { &LittleThreadDesc };
				MCExecutionResourceDesc *GPUThreadsList[] = { &GPUThreadDesc };

				MCExecutionResource BigCore(ID++,
				CPUCoreNames[Big],
				BigThreadsList,
				1,
				BigMemModel);

				MCExecutionResource LittleCore(ID++,
				CPUCoreNames[Little],
				LittleThreadsList,
				1,
				LittleMemModel);

				MCExecutionResource GPUCore(ID++,
				GPUCoreNames[GPU],
				GPUThreadsList,
				1,
				GPUCoreMemModel);

				// Define sockets.
				MCExecutionResourceDesc BigCoreDesc(ID++,
				"BigCoreDesc",
				&BigCore,
				CPUCoreCounts[Big]);

				MCExecutionResourceDesc LittleCoreDesc(ID++,
				"LittleCoreDesc",
				&LittleCore,
				CPUCoreCounts[Little]);

				MCExecutionResourceDesc GPUCoreDesc(ID++,
				"GPUCoreDesc",
				&GPUCore,
				GPUCoreCounts[GPU]);

				MCExecutionResourceDesc *CPUCoreList[] = { &BigCoreDesc, &LittleCoreDesc };
				MCExecutionResourceDesc *GPUCoreList[] = { &GPUCoreDesc };

				MCExecutionResource CPUEngine(ID++,
				SocketNames[CPUSocket],
				CPUCoreList,
				2,
				CPUMemModel);

				MCExecutionResource GPUEngine(ID++,
				SocketNames[GPUSocket],
				GPUCoreList,
				1,
				NoMemModel);

				// Define a node consisting of a CPU socket and a GPU socket.
				MCExecutionResourceDesc CPUSocketDesc(ID++, "CPUSocketDesc", &CPUEngine, 1);
				MCExecutionResourceDesc GPUSocketDesc(ID++, "GPUSocketDesc", &GPUEngine, 1);

				MCExecutionResourceDesc *SocketList[] = { &CPUSocketDesc, &GPUSocketDesc };

				MCSystemModel Node(ID++, "Node", SocketList, 2);

				// Test the topology.
				EXPECT_EQ(Node.getNumExecutionResourceTypes(), 2u);

				unsigned s = 0;
				for (const auto &SocketDesc : Node) {
				EXPECT_EQ(SocketDesc.getNumResources(), 1u);

				const auto &Socket = SocketDesc.getResource();
				EXPECT_STREQ(Socket.getName(), SocketNames[s]);
				EXPECT_EQ(Socket.getNumContainedExecutionResourceTypes(),
				CoreTypeCounts[s]);

				unsigned *CoreCounts = (s == 0 ? CPUCoreCounts : GPUCoreCounts);
				const char * const *CoreNames = (s == 0 ? CPUCoreNames : GPUCoreNames);
				unsigned *ThreadCounts = (s == 0 ? CPUThreadCounts : GPUThreadCounts);

				unsigned c = 0;
				for (const auto &CoreDesc : Socket) {
				EXPECT_EQ(CoreDesc.getNumResources(), CoreCounts[c]);

				const auto &Core = CoreDesc.getResource();
				EXPECT_STREQ(Core.getName(), CoreNames[c]);

				EXPECT_EQ(Core.getNumContainedExecutionResourceTypes(), 1u);

				const auto &ThreadDesc = Core.getResourceDescriptor(0);
				EXPECT_EQ(ThreadDesc.getNumResources(), ThreadCounts[c]);

				const auto &Thread = ThreadDesc.getResource();
				EXPECT_STREQ(Thread.getName(), ThreadName);

				EXPECT_EQ(Thread.getNumContainedExecutionResourceTypes(), 0u);

				// Check thread-level caches.
				unsigned NumThreadCacheLevels =
				Thread.getMemoryModel().getNumCacheLevels();
				EXPECT_EQ(NumThreadCacheLevels, 0u);

				// Check core-level caches.
				const char * const *CacheNames = (s == CPUSocket &&
				c == Big ?
				BigCacheLevelNames :
				(s == CPUSocket &&
				c == Little ?
				LittleCacheLevelNames :
				GPUCoreCacheLevelNames));
				const unsigned *CacheSizes = (s == CPUSocket &&
				c == Big ?
				BigCacheLevelSizes :
				(s == CPUSocket &&
				c == Little ?
				LittleCacheLevelSizes :
				GPUCoreCacheLevelSizes));
				const unsigned *CacheLineSizes = (s == CPUSocket &&
				c == Big ?
				BigCacheLevelLineSizes :
				(s == CPUSocket &&
				c == Little ?
				LittleCacheLevelLineSizes :
				GPUCoreCacheLevelLineSizes));
				const unsigned *CacheAssociativities = (s == CPUSocket &&
				c == Big ?
				BigCacheLevelAssociativities :
				(s == CPUSocket &&
				c == Little ?
				LittleCacheLevelAssociativities :
				GPUCoreCacheLevelAssociativities)
				);
				const unsigned *CacheLatencies = (s == CPUSocket &&
				c == Big ?
				BigCacheLevelLatencies :
				(s == CPUSocket &&
				c == Little ?
				LittleCacheLevelLatencies :
				GPUCoreCacheLevelLatencies));

				unsigned lvl = 0;
				for (const auto &CacheLevel : Core.getMemoryModel()) {
				EXPECT_STREQ(CacheLevel.getName(), CacheNames[lvl]);
				EXPECT_EQ(CacheLevel.getSizeInBytes(), CacheSizes[lvl]);
				EXPECT_EQ(CacheLevel.getLineSizeInBytes(), CacheLineSizes[lvl]);
				EXPECT_EQ(CacheLevel.getAssociativity(), CacheAssociativities[lvl]);
				EXPECT_EQ(CacheLevel.getLatency(), CacheLatencies[lvl]);

				++lvl;
				}

				++c;
				}

				// Check socket-level caches.
				const char * const *CacheNames = CPUCacheLevelNames;
				const unsigned *CacheSizes = CPUCacheLevelSizes;
				const unsigned *CacheLineSizes = CPUCacheLevelLineSizes;
				const unsigned *CacheAssociativities = CPUCacheLevelAssociativities;
				const unsigned *CacheLatencies = CPUCacheLevelLatencies;

				if (s == GPUSocket) {
				unsigned NumSocketCacheLevels =
				Socket.getMemoryModel().getNumCacheLevels();
				EXPECT_EQ(NumSocketCacheLevels, 0u);
				}

				unsigned lvl = 0;
				for (const auto &CacheLevel : Socket.getMemoryModel()) {
				EXPECT_STREQ(CacheLevel.getName(), CacheNames[lvl]);
				EXPECT_EQ(CacheLevel.getSizeInBytes(), CacheSizes[lvl]);
				EXPECT_EQ(CacheLevel.getLineSizeInBytes(), CacheLineSizes[lvl]);
				EXPECT_EQ(CacheLevel.getAssociativity(), CacheAssociativities[lvl]);
				EXPECT_EQ(CacheLevel.getLatency(), CacheLatencies[lvl]);

				++lvl;
				}

				++s;
				}

				// Test the global system representation of the memory model.
				const MCSystemModel::CacheLevelSet *L1Levels =
				Node.getCacheLevelInfo(MCSystemModel::CacheLevel::L1);
				const MCSystemModel::CacheLevelSet *L2Levels =
				Node.getCacheLevelInfo(MCSystemModel::CacheLevel::L2);
				const MCSystemModel::CacheLevelSet *L3Levels =
				Node.getCacheLevelInfo(MCSystemModel::CacheLevel::L3);
				const MCSystemModel::CacheLevelSet *L4Levels =
				Node.getCacheLevelInfo(MCSystemModel::CacheLevel::L4);

				const MCSystemModel::PrefetchConfigSet &PrefetchConfigs =
				Node.getSoftwarePrefetcherInfo();

				EXPECT_NE(L1Levels, nullptr);
				EXPECT_NE(L2Levels, nullptr);
				EXPECT_NE(L3Levels, nullptr);
				EXPECT_EQ(L4Levels, nullptr);

				EXPECT_EQ(L1Levels->size(), 3u);
				EXPECT_EQ(L2Levels->size(), 1u);
				EXPECT_EQ(L3Levels->size(), 1u);
				EXPECT_EQ(PrefetchConfigs.size(), 2u);

				unsigned i = 0;
				for (const auto L1Level : *L1Levels) {
				const char * const *CacheNames = (i == 0 ? BigCacheLevelNames :
				i == 1 ? LittleCacheLevelNames :
				GPUCoreCacheLevelNames);
				const unsigned *CacheSizes = (i == 0 ? BigCacheLevelSizes :
				i == 1 ? LittleCacheLevelSizes :
				GPUCoreCacheLevelSizes);
				const unsigned *CacheLineSizes = (i == 0 ? BigCacheLevelLineSizes :
				i == 1 ? LittleCacheLevelLineSizes :
				GPUCoreCacheLevelLineSizes
				);
				const unsigned *CacheAssociativities = (i == 0 ?
				BigCacheLevelAssociativities :
				i == 1 ?
				LittleCacheLevelAssociativities :
				GPUCoreCacheLevelAssociativities);
				const unsigned *CacheLatencies = (i == 0 ? BigCacheLevelLatencies :
				i == 1 ? LittleCacheLevelLatencies :
				GPUCoreCacheLevelLatencies
				);

				unsigned Index = (i == 0 ? BigL1 :
				i == 1 ? LittleL1 : GPUCoreL1);

				EXPECT_STREQ(L1Level->getName(), CacheNames[Index]);
				EXPECT_EQ(L1Level->getSizeInBytes(), CacheSizes[Index]);
				EXPECT_EQ(L1Level->getLineSizeInBytes(), CacheLineSizes[Index]);
				EXPECT_EQ(L1Level->getAssociativity(), CacheAssociativities[Index]);
				EXPECT_EQ(L1Level->getLatency(), CacheLatencies[Index]);

				++i;
				}

				i = 0;
				for (const auto L2Level : *L2Levels) {
				const char * const *CacheNames = BigCacheLevelNames;
				const unsigned *CacheSizes = BigCacheLevelSizes;
				const unsigned *CacheLineSizes = BigCacheLevelLineSizes;
				const unsigned *CacheAssociativities = BigCacheLevelAssociativities;
				const unsigned *CacheLatencies = BigCacheLevelLatencies;

				unsigned Index = BigL2;

				EXPECT_STREQ(L2Level->getName(), CacheNames[Index]);
				EXPECT_EQ(L2Level->getSizeInBytes(), CacheSizes[Index]);
				EXPECT_EQ(L2Level->getLineSizeInBytes(), CacheLineSizes[Index]);
				EXPECT_EQ(L2Level->getAssociativity(), CacheAssociativities[Index]);
				EXPECT_EQ(L2Level->getLatency(), CacheLatencies[Index]);

				++i;
				}

				i = 0;
				for (const auto L3Level : *L3Levels) {
				const char * const *CacheNames = CPUCacheLevelNames;
				const unsigned *CacheSizes = CPUCacheLevelSizes;
				const unsigned *CacheLineSizes = CPUCacheLevelLineSizes;
				const unsigned *CacheAssociativities = CPUCacheLevelAssociativities;
				const unsigned *CacheLatencies = CPUCacheLevelLatencies;

				unsigned Index = CPUL3;

				EXPECT_STREQ(L3Level->getName(), CacheNames[Index]);
				EXPECT_EQ(L3Level->getSizeInBytes(), CacheSizes[Index]);
				EXPECT_EQ(L3Level->getLineSizeInBytes(), CacheLineSizes[Index]);
				EXPECT_EQ(L3Level->getAssociativity(), CacheAssociativities[Index]);
				EXPECT_EQ(L3Level->getLatency(), CacheLatencies[Index]);

				++i;
				}
				}

				TEST(SystemModel, Topology3Tests) {
				// Test this topology:
				//
				//            System
				//           /      \
				//   (L1) GPU        CPU (L3)
				//         |        /   \
				//         C    L (L1)    B (L1, L2)

				const unsigned BigL1 = 0;
				const unsigned BigL2 = 1;

				const unsigned LittleL1 = 0;

				const unsigned CPUL3 = 0;

				const unsigned GPUL1 = 0;

				const unsigned Big = 0;
				const unsigned Little = 1;

				const unsigned GPU = 0;

				const unsigned CPUSocket = 0;
				const unsigned GPUSocket = 1;

				// Define cache parameters.
				const char *BigCacheLevelNames[] = { "BigL1", "BigL2" };
				unsigned BigCacheLevelSizes[] = { 1024*16, 1024*1024*4 };
				unsigned BigCacheLevelLineSizes[] = { 32, 32 };
				unsigned BigCacheLevelAssociativities[] = { 8, 24 };
				unsigned BigCacheLevelLatencies[] = { 2, 12 };

				const char *LittleCacheLevelNames[] = { "LittleL1" };
				unsigned LittleCacheLevelSizes[] = { 1024*8 };
				unsigned LittleCacheLevelLineSizes[] = { 32 };
				unsigned LittleCacheLevelAssociativities[] = { 8 };
				unsigned LittleCacheLevelLatencies[] = { 2 };

				const char *CPUCacheLevelNames[] = { "CPUL3" };
				unsigned CPUCacheLevelSizes[] = { 1024*1024*8 };
				unsigned CPUCacheLevelLineSizes[] = { 32 };
				unsigned CPUCacheLevelAssociativities[] = { 32 };
				unsigned CPUCacheLevelLatencies[] = { 50 };

				const char *GPUCacheLevelNames[] = { "GPUL1" };
				unsigned GPUCacheLevelSizes[] = { 1024*64 };
				unsigned GPUCacheLevelLineSizes[] = { 64 };
				unsigned GPUCacheLevelAssociativities[] = { 24 };
				unsigned GPUCacheLevelLatencies[] = { 12 };

				// Define thread parameters.
				const char *ThreadName = "Thread";

				// Define core parameters.
				// The GPU has four cores with two thread team schedulers of vector
				// length 64, for a total of 512 "threads."
				const char *CPUCoreNames[] = { "BigCore", "LittleCore" };
				unsigned CPUCoreCounts[] = { 2, 8 };
				unsigned CPUThreadCounts[] = { 4, 2 };

				const char *GPUCoreNames[] = { "GPUCore" };
				unsigned GPUCoreCounts[] = { 4 };
				// Threads in a core. The GPU has two thread team schedulers, each
				// team may be a vector length of, say, 64 which we don't model.
				unsigned GPUThreadCounts[] = { 2 };

				// Define socket parameters.
				const char *SocketNames[] = { "CPU", "GPU" };
				unsigned CoreTypeCounts[] = { 2, 1 };

				unsigned ID = 0;

				// Define write-combining buffers.
				MCWriteCombiningBufferInfo WCBufs(ID++, "WCBufs", 4);
				MCWriteCombiningBufferInfo NoWCBufs(ID++, "NoWCBufs", 0);

				// Define software prefetchers.
				MCSoftwarePrefetcherConfig Prefetcher(ID++, "Prefetcher", true, 1024, 512,
				4096, 100, 4, 32);

				MCSoftwarePrefetcherConfig NoPrefetcher(ID++, "NoPrefetcher", false, 0, 0, 0,
				0, 0, 0);

				// Define caches.
				MCCacheLevelInfo BigCoreCacheLevels[] = {
				MCCacheLevelInfo(ID++,
				BigCacheLevelNames[BigL1],
				BigCacheLevelSizes[BigL1],
				BigCacheLevelLineSizes[BigL1],
				BigCacheLevelAssociativities[BigL1],
				BigCacheLevelLatencies[BigL1]),
				MCCacheLevelInfo(ID++,
				BigCacheLevelNames[BigL2],
				BigCacheLevelSizes[BigL2],
				BigCacheLevelLineSizes[BigL2],
				BigCacheLevelAssociativities[BigL2],
				BigCacheLevelLatencies[BigL2]),
				};

				MCCacheLevelInfo LittleCoreCacheLevels[] = {
				MCCacheLevelInfo(ID++,
				LittleCacheLevelNames[LittleL1],
				LittleCacheLevelSizes[LittleL1],
				LittleCacheLevelLineSizes[LittleL1],
				LittleCacheLevelAssociativities[LittleL1],
				LittleCacheLevelLatencies[LittleL1]),
				};

				MCCacheLevelInfo CPUCacheLevels[] = {
				MCCacheLevelInfo(ID++,
				CPUCacheLevelNames[CPUL3],
				CPUCacheLevelSizes[CPUL3],
				CPUCacheLevelLineSizes[CPUL3],
				CPUCacheLevelAssociativities[CPUL3],
				CPUCacheLevelLatencies[CPUL3]),
				};

				// All GPU cores share one level of cache.
				MCCacheLevelInfo GPUCacheLevels[] = {
				MCCacheLevelInfo(ID++,
				GPUCacheLevelNames[GPUL1],
				GPUCacheLevelSizes[GPUL1],
				GPUCacheLevelLineSizes[GPUL1],
				GPUCacheLevelAssociativities[GPUL1],
				GPUCacheLevelLatencies[GPUL1]),
				};

				// Define memory models.
				MCMemoryModel BigMemModel(ID++,
				"BigMemModel",
				BigCoreCacheLevels,
				2,
				WCBufs,
				Prefetcher);

				MCMemoryModel LittleMemModel(ID++,
				"LittleMemModel",
				LittleCoreCacheLevels,
				1,
				WCBufs,
				Prefetcher);

				MCMemoryModel CPUMemModel(ID++,
				"CPUMemModel",
				CPUCacheLevels,
				1,
				NoWCBufs,
				NoPrefetcher);

				MCMemoryModel GPUMemModel(ID++,
				"GPUMemModel",
				GPUCacheLevels,
				1,
				NoWCBufs,
				NoPrefetcher);

				MCMemoryModel NoMemModel(ID++, "NoModel", nullptr, 0, NoWCBufs, NoPrefetcher);

				// Define threads.
				MCExecutionResource CommonThread(ID++, ThreadName, nullptr, 0, NoMemModel);

				// Define cores.
				MCExecutionResourceDesc BigThreadDesc(ID++,
				"BigThreadDesc",
				&CommonThread,
				CPUThreadCounts[Big]);
				MCExecutionResourceDesc LittleThreadDesc(ID++,
				"LittleThreadDesc",
				&CommonThread,
				CPUThreadCounts[Little]);
				MCExecutionResourceDesc GPUThreadDesc(ID++,
				"GPUThreadDesc",
				&CommonThread,
				GPUThreadCounts[GPU]);

				MCExecutionResourceDesc *BigThreadsList[] = { &BigThreadDesc };
				MCExecutionResourceDesc *LittleThreadsList[] = { &LittleThreadDesc };
				MCExecutionResourceDesc *GPUThreadsList[] = { &GPUThreadDesc };

				MCExecutionResource BigCore(ID++,
				CPUCoreNames[Big],
				BigThreadsList,
				1,
				BigMemModel);

				MCExecutionResource LittleCore(ID++,
				CPUCoreNames[Little],
				LittleThreadsList,
				1,
				LittleMemModel);

				MCExecutionResource GPUCore(ID++,
				GPUCoreNames[GPU],
				GPUThreadsList,
				1,
				NoMemModel);

				// Define sockets.
				MCExecutionResourceDesc BigCoreDesc(ID++,
				"BigCoreDesc",
				&BigCore,
				CPUCoreCounts[Big]);

				MCExecutionResourceDesc LittleCoreDesc(ID++,
				"LittleCoreDesc",
				&LittleCore,
				CPUCoreCounts[Little]);

				MCExecutionResourceDesc GPUCoreDesc(ID++,
				"GPUCoreDesc",
				&GPUCore,
				GPUCoreCounts[GPU]);

				MCExecutionResourceDesc *CPUCoreList[] = { &BigCoreDesc, &LittleCoreDesc };
				MCExecutionResourceDesc *GPUCoreList[] = { &GPUCoreDesc };

				MCExecutionResource CPUEngine(ID++,
				SocketNames[CPUSocket],
				CPUCoreList,
				2,
				CPUMemModel);

				MCExecutionResource GPUEngine(ID++,
				SocketNames[GPUSocket],
				GPUCoreList,
				1,
				GPUMemModel);

				// Define a node consisting of a CPU socket and a GPU socket.
				MCExecutionResourceDesc CPUSocketDesc(ID++, "CPUSocketDesc", &CPUEngine, 1);
				MCExecutionResourceDesc GPUSocketDesc(ID++, "GPUSocketDesc", &GPUEngine, 1);

				MCExecutionResourceDesc *SocketList[] = { &CPUSocketDesc, &GPUSocketDesc };

				MCSystemModel Node(ID++, "Node", SocketList, 2);

				// Test the topology.
				EXPECT_EQ(Node.getNumExecutionResourceTypes(), 2u);

				unsigned s = 0;
				for (const auto &SocketDesc : Node) {
				EXPECT_EQ(SocketDesc.getNumResources(), 1u);

				const auto &Socket = SocketDesc.getResource();
				EXPECT_STREQ(Socket.getName(), SocketNames[s]);
				EXPECT_EQ(Socket.getNumContainedExecutionResourceTypes(),
				CoreTypeCounts[s]);

				unsigned *CoreCounts = (s == CPUSocket ?
				CPUCoreCounts : GPUCoreCounts);
				const char * const *CoreNames = (s == CPUSocket ?
				CPUCoreNames : GPUCoreNames);
				unsigned *ThreadCounts = (s == CPUSocket ?
				CPUThreadCounts : GPUThreadCounts);

				unsigned c = 0;
				for (const auto &CoreDesc : Socket) {
				EXPECT_EQ(CoreDesc.getNumResources(), CoreCounts[c]);

				const auto &Core = CoreDesc.getResource();
				EXPECT_STREQ(Core.getName(), CoreNames[c]);

				EXPECT_EQ(Core.getNumContainedExecutionResourceTypes(), 1u);

				const auto &ThreadDesc = Core.getResourceDescriptor(0);
				EXPECT_EQ(ThreadDesc.getNumResources(), ThreadCounts[c]);

				const auto &Thread = ThreadDesc.getResource();
				EXPECT_STREQ(Thread.getName(), ThreadName);

				EXPECT_EQ(Thread.getNumContainedExecutionResourceTypes(), 0u);

				// Check thread-level caches.
				unsigned NumThreadCacheLevels =
				Thread.getMemoryModel().getNumCacheLevels();
				EXPECT_EQ(NumThreadCacheLevels, 0u);

				// Check core-level caches.
				if (s == GPUSocket) {
				unsigned NumCoreCacheLevels =
				Core.getMemoryModel().getNumCacheLevels();
				EXPECT_EQ(NumCoreCacheLevels, 0u);
				}

				const char * const *CacheNames = (c == Big ?
				BigCacheLevelNames :
				LittleCacheLevelNames);
				const unsigned *CacheSizes = (c == Big ?
				BigCacheLevelSizes :
				LittleCacheLevelSizes);
				const unsigned *CacheLineSizes = (c == Big ?
				BigCacheLevelLineSizes :
				LittleCacheLevelLineSizes);
				const unsigned *CacheAssociativities = (c == Big ?
				BigCacheLevelAssociativities :
				LittleCacheLevelAssociativities);
				const unsigned *CacheLatencies = (c == Big ?
				BigCacheLevelLatencies :
				LittleCacheLevelLatencies);

				unsigned lvl = 0;
				for (const auto &CacheLevel : Core.getMemoryModel()) {
				EXPECT_STREQ(CacheLevel.getName(), CacheNames[lvl]);
				EXPECT_EQ(CacheLevel.getSizeInBytes(), CacheSizes[lvl]);
				EXPECT_EQ(CacheLevel.getLineSizeInBytes(), CacheLineSizes[lvl]);
				EXPECT_EQ(CacheLevel.getAssociativity(), CacheAssociativities[lvl]);
				EXPECT_EQ(CacheLevel.getLatency(), CacheLatencies[lvl]);

				++lvl;
				}

				++c;
				}

				// Check socket-level caches.
				const char * const *CacheNames = (s == CPUSocket ?
				CPUCacheLevelNames :
				GPUCacheLevelNames);
				const unsigned *CacheSizes = (s == CPUSocket ?
				CPUCacheLevelSizes :
				GPUCacheLevelSizes);
				const unsigned *CacheLineSizes = (s == CPUSocket ?
				CPUCacheLevelLineSizes :
				GPUCacheLevelLineSizes);
				const unsigned *CacheAssociativities = (s == CPUSocket ?
				CPUCacheLevelAssociativities :
				GPUCacheLevelAssociativities);
				const unsigned *CacheLatencies = (s == CPUSocket ?
				CPUCacheLevelLatencies :
				GPUCacheLevelLatencies);

				unsigned lvl = 0;
				for (const auto &CacheLevel : Socket.getMemoryModel()) {
				EXPECT_STREQ(CacheLevel.getName(), CacheNames[lvl]);
				EXPECT_EQ(CacheLevel.getSizeInBytes(), CacheSizes[lvl]);
				EXPECT_EQ(CacheLevel.getLineSizeInBytes(), CacheLineSizes[lvl]);
				EXPECT_EQ(CacheLevel.getAssociativity(), CacheAssociativities[lvl]);
				EXPECT_EQ(CacheLevel.getLatency(), CacheLatencies[lvl]);

				++lvl;
				}

				++s;
				}

				// Test the global system representation of the memory model.
				const MCSystemModel::CacheLevelSet *L1Levels =
				Node.getCacheLevelInfo(MCSystemModel::CacheLevel::L1);
				const MCSystemModel::CacheLevelSet *L2Levels =
				Node.getCacheLevelInfo(MCSystemModel::CacheLevel::L2);
				const MCSystemModel::CacheLevelSet *L3Levels =
				Node.getCacheLevelInfo(MCSystemModel::CacheLevel::L3);
				const MCSystemModel::CacheLevelSet *L4Levels =
				Node.getCacheLevelInfo(MCSystemModel::CacheLevel::L4);

				const MCSystemModel::PrefetchConfigSet &PrefetchConfigs =
				Node.getSoftwarePrefetcherInfo();

				EXPECT_NE(L1Levels, nullptr);
				EXPECT_NE(L2Levels, nullptr);
				EXPECT_NE(L3Levels, nullptr);
				EXPECT_EQ(L4Levels, nullptr);

				EXPECT_EQ(L1Levels->size(), 3u);
				EXPECT_EQ(L2Levels->size(), 1u);
				EXPECT_EQ(L3Levels->size(), 1u);
				EXPECT_EQ(PrefetchConfigs.size(), 2u);

				unsigned i = 0;
				for (const auto L1Level : *L1Levels) {
				const char * const *CacheNames = (i == 0 ? BigCacheLevelNames :
				i == 1 ? LittleCacheLevelNames :
				GPUCacheLevelNames);
				const unsigned *CacheSizes = (i == 0 ? BigCacheLevelSizes :
				i == 1 ? LittleCacheLevelSizes :
				GPUCacheLevelSizes);
				const unsigned *CacheLineSizes = (i == 0 ? BigCacheLevelLineSizes :
				i == 1 ? LittleCacheLevelLineSizes :
				GPUCacheLevelLineSizes);
				const unsigned *CacheAssociativities = (i == 0 ?
				BigCacheLevelAssociativities :
				i == 1 ?
				LittleCacheLevelAssociativities :
				GPUCacheLevelAssociativities);
				const unsigned *CacheLatencies = (i == 0 ? BigCacheLevelLatencies :
				i == 1 ? LittleCacheLevelLatencies :
				GPUCacheLevelLatencies);

				unsigned Index = (i == 0 ? BigL1 :
				i == 1 ? LittleL1 : GPUL1);

				EXPECT_STREQ(L1Level->getName(), CacheNames[Index]);
				EXPECT_EQ(L1Level->getSizeInBytes(), CacheSizes[Index]);
				EXPECT_EQ(L1Level->getLineSizeInBytes(), CacheLineSizes[Index]);
				EXPECT_EQ(L1Level->getAssociativity(), CacheAssociativities[Index]);
				EXPECT_EQ(L1Level->getLatency(), CacheLatencies[Index]);

				++i;
				}

				i = 0;
				for (const auto L2Level : *L2Levels) {
				const char * const *CacheNames = BigCacheLevelNames;
				const unsigned *CacheSizes = BigCacheLevelSizes;
				const unsigned *CacheLineSizes = BigCacheLevelLineSizes;
				const unsigned *CacheAssociativities = BigCacheLevelAssociativities;
				const unsigned *CacheLatencies = BigCacheLevelLatencies;

				unsigned Index = BigL2;

				EXPECT_STREQ(L2Level->getName(), CacheNames[Index]);
				EXPECT_EQ(L2Level->getSizeInBytes(), CacheSizes[Index]);
				EXPECT_EQ(L2Level->getLineSizeInBytes(), CacheLineSizes[Index]);
				EXPECT_EQ(L2Level->getAssociativity(), CacheAssociativities[Index]);
				EXPECT_EQ(L2Level->getLatency(), CacheLatencies[Index]);

				++i;
				}

				i = 0;
				for (const auto L3Level : *L3Levels) {
				const char * const *CacheNames = CPUCacheLevelNames;
				const unsigned *CacheSizes = CPUCacheLevelSizes;
				const unsigned *CacheLineSizes = CPUCacheLevelLineSizes;
				const unsigned *CacheAssociativities = CPUCacheLevelAssociativities;
				const unsigned *CacheLatencies = CPUCacheLevelLatencies;

				unsigned Index = CPUL3;

				EXPECT_STREQ(L3Level->getName(), CacheNames[Index]);
				EXPECT_EQ(L3Level->getSizeInBytes(), CacheSizes[Index]);
				EXPECT_EQ(L3Level->getLineSizeInBytes(), CacheLineSizes[Index]);
				EXPECT_EQ(L3Level->getAssociativity(), CacheAssociativities[Index]);
				EXPECT_EQ(L3Level->getLatency(), CacheLatencies[Index]);

				++i;
				}
				}

				} // end namespace
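
A rough sketch of what the Topology3Tests hierarchy encodes, using small stand-in structs rather than the MC classes from this patch. `Resource`, `ResourceDesc`, `countLeaves`, and `topology3ThreadCount` are hypothetical names introduced only for illustration. The counts are the ones the test defines (2 big cores with 4 threads each, 8 little cores with 2 threads each, 4 GPU cores with 2 thread team schedulers each). Vector lanes are not modeled, so the GPU contributes 8 leaf threads rather than the 512 mentioned in the test comment.

```cpp
#include <vector>

// Hypothetical stand-ins; these are NOT MCExecutionResource/MCExecutionResourceDesc.
struct Resource;

// A descriptor says "Count copies of this kind of resource".
struct ResourceDesc {
  const Resource *Res;
  unsigned Count;
};

// A resource contains zero or more kinds of sub-resources.
struct Resource {
  const char *Name;
  std::vector<ResourceDesc> Contained;
};

// Counts leaf resources (threads) below R, multiplying by the
// descriptor counts on the way down.
unsigned countLeaves(const Resource &R) {
  if (R.Contained.empty())
    return 1;
  unsigned Total = 0;
  for (const ResourceDesc &D : R.Contained)
    Total += D.Count * countLeaves(*D.Res);
  return Total;
}

// Builds the same shape as the test: Node -> {CPU, GPU} -> cores -> threads.
unsigned topology3ThreadCount() {
  Resource Thread{"Thread", {}};
  Resource BigCore{"BigCore", {{&Thread, 4}}};
  Resource LittleCore{"LittleCore", {{&Thread, 2}}};
  Resource GPUCore{"GPUCore", {{&Thread, 2}}};
  Resource CPU{"CPU", {{&BigCore, 2}, {&LittleCore, 8}}};
  Resource GPU{"GPU", {{&GPUCore, 4}}};
  Resource Node{"Node", {{&CPU, 1}, {&GPU, 1}}};
  return countLeaves(Node);
}
```

With these counts the CPU socket contributes 2*4 + 8*2 = 24 threads and the GPU socket 4*2 = 8, so `topology3ThreadCount()` yields 32.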

llvm/utils/TableGen/SubtargetEmitter.cpp

[23 lines not shown]
#include "llvm/Support/Format.h"		#include "llvm/Support/Format.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
#include "llvm/TableGen/Error.h"		#include "llvm/TableGen/Error.h"
#include "llvm/TableGen/Record.h"		#include "llvm/TableGen/Record.h"
#include "llvm/TableGen/TableGenBackend.h"		#include "llvm/TableGen/TableGenBackend.h"
#include <algorithm>		#include <algorithm>
#include <cassert>		#include <cassert>
#include <cstdint>		#include <cstdint>
		#include <functional>
#include <iterator>		#include <iterator>
#include <map>		#include <map>
#include <string>		#include <string>
#include <vector>		#include <vector>

using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "subtarget-emitter"		#define DEBUG_TYPE "subtarget-emitter"
[76 lines not shown]	class SubtargetEmitter {
void EmitSchedModelHelpers(const std::string &ClassName, raw_ostream &OS);		void EmitSchedModelHelpers(const std::string &ClassName, raw_ostream &OS);
void emitSchedModelHelpersImpl(raw_ostream &OS,		void emitSchedModelHelpersImpl(raw_ostream &OS,
bool OnlyExpandMCInstPredicates = false);		bool OnlyExpandMCInstPredicates = false);
void emitGenMCSubtargetInfo(raw_ostream &OS);		void emitGenMCSubtargetInfo(raw_ostream &OS);
void EmitMCInstrAnalysisPredicateFunctions(raw_ostream &OS);		void EmitMCInstrAnalysisPredicateFunctions(raw_ostream &OS);

void EmitSchedModel(raw_ostream &OS);		void EmitSchedModel(raw_ostream &OS);
void EmitHwModeCheck(const std::string &ClassName, raw_ostream &OS);		void EmitHwModeCheck(const std::string &ClassName, raw_ostream &OS);

		void EmitCacheHierarchies(raw_ostream &OS, unsigned &ID);
		void EmitWriteCombiningBuffers(raw_ostream &OS, unsigned &ID);
		void EmitPrefetchConfigs(raw_ostream &OS, unsigned &ID);
		void EmitMemoryModel(raw_ostream &OS, Record &M, unsigned &ID);
		void EmitMemoryModels(raw_ostream &OS, unsigned &ID);
		void EmitExecutionResource(raw_ostream &OS, Record &Resource, unsigned &ID,
		SmallPtrSetImpl<Record *> &Emitted);
		void EmitExecutionResourceDesc(raw_ostream &OS, Record &ResourceDesc,
		unsigned &ID,
		SmallPtrSetImpl<Record *> &Emitted);
		void EmitExecutionResourceDescResource(raw_ostream &OS, Record &ResourceDesc,
		unsigned &ID,
		SmallPtrSetImpl<Record *> &Emitted);
		void EmitExecutionResourceList(raw_ostream &OS, StringRef ListName,
		const ListInit &ResourceList,
		unsigned &ID,
		SmallPtrSetImpl<Record *> &Emitted);
		void EmitSystemModel(raw_ostream &OS, Record &E, unsigned &ID,
		SmallPtrSetImpl<Record *> &Emitted);
		void EmitSystemModels(raw_ostream &OS);

void ParseFeaturesFunction(raw_ostream &OS, unsigned NumFeatures,		void ParseFeaturesFunction(raw_ostream &OS, unsigned NumFeatures,
unsigned NumProcs);		unsigned NumProcs);

public:		public:
SubtargetEmitter(RecordKeeper &R, CodeGenTarget &TGT)		SubtargetEmitter(RecordKeeper &R, CodeGenTarget &TGT)
: TGT(TGT), Records(R), SchedModels(TGT.getSchedModels()),		: TGT(TGT), Records(R), SchedModels(TGT.getSchedModels()),
Target(TGT.getName()) {}		Target(TGT.getName()) {}

[1,543 lines not shown]	void SubtargetEmitter::EmitHwModeCheck(const std::string &ClassName,
for (unsigned M = 1, NumModes = CGH.getNumModeIds(); M != NumModes; ++M) {		for (unsigned M = 1, NumModes = CGH.getNumModeIds(); M != NumModes; ++M) {
const HwMode &HM = CGH.getMode(M);		const HwMode &HM = CGH.getMode(M);
OS << " if (checkFeatures(\"" << HM.Features		OS << " if (checkFeatures(\"" << HM.Features
<< "\")) return " << M << ";\n";		<< "\")) return " << M << ";\n";
}		}
OS << " return 0;\n}\n";		OS << " return 0;\n}\n";
}		}

		// EmitCacheHierarchies - Emits all cache hierarchy information.
		//
		void SubtargetEmitter::EmitCacheHierarchies(raw_ostream &OS, unsigned &ID) {
		std::vector<Record*> CacheHierarchies =
		Records.getAllDerivedDefinitions("CacheHierarchy");

		for (const auto &CH : CacheHierarchies) {
		RecordVal *Levels = CH->getValue("Levels");
		const ListInit &LevelList = *cast<const ListInit>(Levels->getValue());

		std::function<const char * ()> Delimiter = [&]() -> const char * {
		Delimiter = []() -> const char * {
		return ",";
		};
		return "";
		};

		OS << "static const llvm::MCCacheLevelInfo " << CH->getName() << "[] = {";
		for (const auto &L : LevelList) {
		OS << Delimiter() << '\n';

		const DefInit &LevelDef = *cast<const DefInit>(L);
		Record *Level = LevelDef.getDef();

		RecordVal *LevelSize = Level->getValue("Size");
		RecordVal *LevelLineSize = Level->getValue("LineSize");
		RecordVal *LevelWays = Level->getValue("Ways");
		RecordVal *LevelLatency = Level->getValue("Latency");

		OS << " llvm::MCCacheLevelInfo("
		<< ID++ << ", \""
		<< Level->getName() << "\", "
		<< LevelSize->getValue()->getAsString() << ", "
		<< LevelLineSize->getValue()->getAsString() << ", "
		<< LevelWays->getValue()->getAsString() << ", "
		<< LevelLatency->getValue()->getAsString() << ')';
		}
		if (LevelList.empty()) {
		OS << "\n llvm::MCCacheLevelInfo(0, \"Empty\", 0, 0, 0, 0)";
		}
		OS << "\n}; // " << CH->getName() << "\n\n";
		}
		}

		// Emits all write-combining buffer information.
		//
		void SubtargetEmitter::EmitWriteCombiningBuffers(raw_ostream &OS,
		unsigned &ID) {
		std::vector<Record*> WCBuffers =
		Records.getAllDerivedDefinitions("WriteCombiningBuffer");

		for (const auto &WCBuffer : WCBuffers) {
		RecordVal *WCBuffers = WCBuffer->getValue("NumBuffers");

		OS << "static const llvm::MCWriteCombiningBufferInfo " << WCBuffer->getName()
		<< "(";
		OS << ID++ << ", \"" << WCBuffer->getName() << "\", ";
		OS << WCBuffers->getValue()->getAsString() << ");\n\n";
		}
		}

		// Emits all prefetcher information.
		//
		void SubtargetEmitter::EmitPrefetchConfigs(raw_ostream &OS, unsigned &ID) {
		std::vector<Record*> Prefetchers =
		Records.getAllDerivedDefinitions("SoftwarePrefetcher");

		for (const auto &P : Prefetchers) {
		RecordVal *EnabledForReads = P->getValue("EnabledForReads");
		RecordVal *EnabledForWrites = P->getValue("EnabledForWrites");
		RecordVal *UseReadPFForWrites = P->getValue("UseReadPFForWrites");
		RecordVal *ByteDistance = P->getValue("BytesAhead");
		RecordVal *MaxByteDistance = P->getValue("MaxBytesAhead");
		RecordVal *MinByteDistance = P->getValue("MinBytesAhead");
		RecordVal *InstructionDistance = P->getValue("InstructionsAhead");
		RecordVal *MaxIterationDistance = P->getValue("MaxIterationsAhead");
		RecordVal *MinByteStride = P->getValue("MinStride");

		OS << "static const llvm::MCSoftwarePrefetcherConfig " << P->getName()
		<< "("
		<< ID++ << ", \"" << P->getName() << "\", "
		<< EnabledForReads->getValue()->getAsString() << ", "
		<< EnabledForWrites->getValue()->getAsString() << ", "
		<< UseReadPFForWrites->getValue()->getAsString() << ", "
		<< ByteDistance->getValue()->getAsString() << ", "
		<< MinByteDistance->getValue()->getAsString() << ", "
		<< MaxByteDistance->getValue()->getAsString() << ", "
		<< InstructionDistance->getValue()->getAsString() << ", "
		<< MaxIterationDistance->getValue()->getAsString() << ", "
		<< MinByteStride->getValue()->getAsString() << ");\n\n";
		}
		}

		// Emits one memory system definition.
		//
		void SubtargetEmitter::EmitMemoryModel(raw_ostream &OS, Record &M,
		unsigned &ID) {
		RecordVal *CacheHierarchyValue = M.getValue("Caches");
		const DefInit &CacheHierarchyDef =
		*cast<const DefInit >(CacheHierarchyValue->getValue());
		Record *CacheHierarchy = CacheHierarchyDef.getDef();

		// Get cache level information.
		RecordVal *CacheLevelValue = CacheHierarchy->getValue("Levels");
		const ListInit &CacheLevelsList =
		*cast<const ListInit>(CacheLevelValue->getValue());
		unsigned NumLevels = CacheLevelsList.size();

		RecordVal *WCBufferValue = M.getValue("WCBuffers");
		const DefInit &WCBufferDef =
		*cast<const DefInit>(WCBufferValue->getValue());
		Record *WCBuffer = WCBufferDef.getDef();

		RecordVal *PrefetcherValue = M.getValue("Prefetcher");
		const DefInit &PrefetcherDef =
		*cast<const DefInit>(PrefetcherValue->getValue());
		Record *Prefetcher = PrefetcherDef.getDef();

		OS << "static const llvm::MCMemoryModel " << M.getName() << "("
		<< ID++ << ", \"" << M.getName() << "\", "
		<< CacheHierarchy->getName() << ", "
		<< NumLevels << ", "
		<< WCBuffer->getName() << ", "
		<< Prefetcher->getName() << ");\n\n";
		}

		// Emits all memory system information.
		//
		void SubtargetEmitter::EmitMemoryModels(raw_ostream &OS, unsigned &ID) {
		std::vector<Record*> MemoryModels =
		Records.getAllDerivedDefinitions("MemoryModel");

		for (const auto &M : MemoryModels) {
		EmitMemoryModel(OS, *M, ID);
		}
		}

		// Emits all contained execution slices if not already emitted, then
		// emits this one.
		//
		void
		SubtargetEmitter::EmitExecutionResource(raw_ostream &OS,
		Record &Resource,
		unsigned &ID,
		SmallPtrSetImpl<Record *> &Emitted) {
		RecordVal *ContainedValue = Resource.getValue("Contained");
		const ListInit &ContainedList =
		*cast<const ListInit>(ContainedValue->getValue());
		unsigned NumContained = ContainedList.size();

		// Emit all contained resources.
		std::string ListName(Resource.getName());
		ListName += "Contained";
		EmitExecutionResourceList(OS, ListName, ContainedList, ID, Emitted);

		// Look up the memory model.
		RecordVal *MemoryModelValue = Resource.getValue("MemModel");
		const DefInit &MemoryModelDef =
		*cast<const DefInit>(MemoryModelValue->getValue());
		Record *MemoryModel = MemoryModelDef.getDef();

		// Now emit this resource.
		OS << "static const llvm::MCExecutionResource " << Resource.getName()
		<< "("
		<< ID++ << ", \""
		<< Resource.getName() << "\", "
		<< ListName << ", "
		<< NumContained << ", "
		<< MemoryModel->getName() << ");\n\n";
		}

		// Emits an execution resource descriptor, emitting all contained
		// resources.
		//
		void
		SubtargetEmitter::EmitExecutionResourceDesc(raw_ostream &OS,
		Record &ResourceDesc,
		unsigned &ID,
		SmallPtrSetImpl<Record *> &Emitted) {
		EmitExecutionResourceDescResource(OS, ResourceDesc, ID, Emitted);

		RecordVal *ResourceValue = ResourceDesc.getValue("Resource");
		const DefInit &ResourceDef =
		*cast<const DefInit>(ResourceValue->getValue());
		Record *ResourceRecord = ResourceDef.getDef();

		RecordVal *NumValue = ResourceDesc.getValue("NumResources");
		const IntInit &ResourceInt =
		*cast<const IntInit>(NumValue->getValue());

		OS << "static const llvm::MCExecutionResourceDesc " << ResourceDesc.getName()
		<< "("
		<< ID++ << ", \""
		<< ResourceDesc.getName() << "\", "
		<< '&' << ResourceRecord->getName() << ", "
		<< ResourceInt.getValue() << ");\n\n";
		}

		// Emit the resource referenced by this execution resource
		// descriptor.
		//
		void SubtargetEmitter::
		EmitExecutionResourceDescResource(raw_ostream &OS,
		Record &ResourceDesc,
		unsigned &ID,
		SmallPtrSetImpl<Record *> &Emitted) {
		RecordVal *ResourceValue = ResourceDesc.getValue("Resource");
		const DefInit &ResourceDef =
		*cast<const DefInit>(ResourceValue->getValue());
		Record *ResourceRecord = ResourceDef.getDef();

		if (Emitted.insert(ResourceRecord).second)
		EmitExecutionResource(OS, *ResourceRecord, ID, Emitted);
		}

		// Emit an execution resource list, creating the
		// MCExecutionResourceDesc necessary to describe contained resources.
		//
		void SubtargetEmitter::
		EmitExecutionResourceList(raw_ostream &OS,
		StringRef ListName,
		const ListInit &ResourceList,
		unsigned &ID,
		SmallPtrSetImpl<Record *> &Emitted) {
		// First, emit all contained resources. We have to do this here
		// because we don't want resource definitions to appear in the
		// middle of the array below.

		for (auto ResourceDesc : ResourceList) {
		const DefInit &ResourceDescDef = *cast<const DefInit>(ResourceDesc);
		Record *ResourceDescRecord = ResourceDescDef.getDef();

		if (Emitted.insert(ResourceDescRecord).second) {
		EmitExecutionResourceDesc(OS, *ResourceDescRecord, ID, Emitted);
		}
		}

		// Now emit the actual lists of resource descriptors.
		std::function<const char * ()> Delimiter = [&]() -> const char * {
		Delimiter = []() -> const char * {
		return ",";
		};
		return "";
		};

		OS << "static const llvm::MCExecutionResourceDesc *" << ListName << "[] = {";
		for (auto ResourceDesc : ResourceList) {
		const DefInit &ResourceDescDef = *cast<const DefInit>(ResourceDesc);
		Record *ResourceDescRecord = ResourceDescDef.getDef();

		OS << Delimiter() << "\n &" << ResourceDescRecord->getName();
		}
		if (ResourceList.empty()) {
		OS << "\n nullptr";
		}
		OS << "\n};\n\n";
		}

		// Emit an execution engine.
		//
		void SubtargetEmitter::EmitSystemModel(raw_ostream &OS, Record &E,
		unsigned &ID,
		SmallPtrSetImpl<Record *> &Emitted) {
		RecordVal *ResourcesValue = E.getValue("Resources");
		const ListInit &ResourceList =
		*cast<const ListInit>(ResourcesValue->getValue());
		unsigned NumResources = ResourceList.size();

		std::string ResourceListName;
		ResourceListName += E.getName();
		ResourceListName += "Resources";

		EmitExecutionResourceList(OS, ResourceListName, ResourceList, ID, Emitted);

		OS << "static const llvm::MCSystemModel " << E.getName() << "("
		<< ID++ << ", \"" << E.getName() << "\", "
		<< ResourceListName << ", "
		<< NumResources << ");\n\n";
		}

		// Emits all memory model and execution resource information.
		//
		void SubtargetEmitter::EmitSystemModels(raw_ostream &OS) {
		unsigned ID = 1;

		OS << "// ===============================================================\n"
		<< "// System models\n"
		<< "// ===============================================================\n"
		<< "//\n\n";

		// Emit memory models.
		OS << "// ===============================================================\n"
		<< "// Cache models\n"
		<< "//\n";

		EmitCacheHierarchies(OS, ID);

		OS << "// ===============================================================\n"
		<< "// Write-combining buffers\n"
		<< "//\n";

		EmitWriteCombiningBuffers(OS, ID);

		OS << "// ===============================================================\n"
		<< "// Software prefetch configs\n"
		<< "//\n";

		EmitPrefetchConfigs(OS, ID);

		OS << "// ===============================================================\n"
		<< "// Memory models\n"
		<< "//\n";

		EmitMemoryModels(OS, ID);

		// Emit execution engines.
		OS << "// ===============================================================\n"
		<< "// System models\n"
		<< "//\n";

		// Emit a resource list for this engine.
		SmallPtrSet<Record*, 8> EmittedResources;

		std::vector<Record*> Systems =
		Records.getAllDerivedDefinitions("SystemModel");

		for (const auto &S : Systems) {
		EmitSystemModel(OS, *S, ID, EmittedResources);
		}

		// Emit lookup tables.

		// Gather and sort processor information
		std::vector<Record*> ProcessorList =
		Records.getAllDerivedDefinitions("Processor");
		llvm::sort(ProcessorList, LessRecordFieldName());

		// Begin processor->system model table
		OS << "// Sorted (by key) array of execution engine model for CPU subtype.\n"
		<< "extern const llvm::SubtargetInfoKV " << Target
		<< "ProcSystemModelKV[] = {\n";
		// For each processor
		for (Record *Processor : ProcessorList) {
		StringRef Name = Processor->getValueAsString("Name");
		StringRef SystemName = Processor->getValueAsDef("System")->getName();

		// Emit as { "cpu", execution engine },
		OS << " { \"" << Name << "\", (const void *)&" << SystemName << " },\n";
		}
		// End processor->execution engine model table
		OS << "};\n\n";
		}
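
The emitters above separate array elements with a `std::function` that reassigns itself on its first call. That only works because the outer lambda touches none of its own state after the assignment. As a possible alternative, here is a minimal sketch of the same "empty first, comma afterwards" behavior with a small stateful helper. `CommaSeparator` and `emitArray` are hypothetical names for illustration, not part of this patch or of TableGen, and the element type `int` is a placeholder.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: yields "" on the first call and "," on every later call.
class CommaSeparator {
  bool First = true;

public:
  const char *next() {
    if (First) {
      First = false;
      return "";
    }
    return ",";
  }
};

// Emits a C array definition in the same layout the emitter uses:
// the separator is written before each element, so no trailing comma appears.
std::string emitArray(const std::string &Name,
                      const std::vector<std::string> &Elems) {
  std::ostringstream OS;
  CommaSeparator Sep;
  OS << "static const int " << Name << "[] = {";
  for (const std::string &E : Elems)
    OS << Sep.next() << "\n  " << E;
  OS << "\n};\n";
  return OS.str();
}
```

Calling `emitArray("Levels", {"1", "2"})` produces a two-element initializer with a comma after the first element only. An empty list produces an empty initializer, which is why the emitters above add a placeholder element in that case.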

//		//
// ParseFeaturesFunction - Produces a subtarget specific function for parsing		// ParseFeaturesFunction - Produces a subtarget specific function for parsing
// the subtarget features string.		// the subtarget features string.
//		//
void SubtargetEmitter::ParseFeaturesFunction(raw_ostream &OS,		void SubtargetEmitter::ParseFeaturesFunction(raw_ostream &OS,
unsigned NumFeatures,		unsigned NumFeatures,
unsigned NumProcs) {		unsigned NumProcs) {
std::vector<Record*> Features =		std::vector<Record*> Features =
[48 lines not shown]	OS << "struct " << Target
<< "GenMCSubtargetInfo : public MCSubtargetInfo {\n";		<< "GenMCSubtargetInfo : public MCSubtargetInfo {\n";
OS << " " << Target << "GenMCSubtargetInfo(const Triple &TT, \n"		OS << " " << Target << "GenMCSubtargetInfo(const Triple &TT, \n"
<< " StringRef CPU, StringRef FS, ArrayRef<SubtargetFeatureKV> PF,\n"		<< " StringRef CPU, StringRef FS, ArrayRef<SubtargetFeatureKV> PF,\n"
<< " ArrayRef<SubtargetFeatureKV> PD,\n"		<< " ArrayRef<SubtargetFeatureKV> PD,\n"
<< " const SubtargetInfoKV *ProcSched,\n"		<< " const SubtargetInfoKV *ProcSched,\n"
<< " const MCWriteProcResEntry *WPR,\n"		<< " const MCWriteProcResEntry *WPR,\n"
<< " const MCWriteLatencyEntry *WL,\n"		<< " const MCWriteLatencyEntry *WL,\n"
<< " const MCReadAdvanceEntry *RA, const InstrStage *IS,\n"		<< " const MCReadAdvanceEntry *RA, const InstrStage *IS,\n"
<< " const unsigned *OC, const unsigned *FP) :\n"		<< " const unsigned *OC, const unsigned *FP,\n"
		<< " const SubtargetInfoKV *ProcSystem) :\n"
<< " MCSubtargetInfo(TT, CPU, FS, PF, PD, ProcSched,\n"		<< " MCSubtargetInfo(TT, CPU, FS, PF, PD, ProcSched,\n"
<< " WPR, WL, RA, IS, OC, FP) { }\n\n"		<< " WPR, WL, RA, IS, OC, FP, ProcSystem) { }\n\n"
<< " unsigned resolveVariantSchedClass(unsigned SchedClass,\n"		<< " unsigned resolveVariantSchedClass(unsigned SchedClass,\n"
<< " const MCInst *MI, unsigned CPUID) const override {\n"		<< " const MCInst *MI, unsigned CPUID) const override {\n"
<< " return " << Target << "_MC"		<< " return " << Target << "_MC"
<< "::resolveVariantSchedClassImpl(SchedClass, MI, CPUID); \n";		<< "::resolveVariantSchedClassImpl(SchedClass, MI, CPUID); \n";
OS << " }\n";		OS << " }\n";
OS << "};\n";		OS << "};\n";
}		}

[... 44 lines not shown ...]
#if 0		#if 0
OS << "namespace {\n";		OS << "namespace {\n";
#endif		#endif
unsigned NumFeatures = FeatureKeyValues(OS);		unsigned NumFeatures = FeatureKeyValues(OS);
OS << "\n";		OS << "\n";
unsigned NumProcs = CPUKeyValues(OS);		unsigned NumProcs = CPUKeyValues(OS);
OS << "\n";		OS << "\n";
EmitSchedModel(OS);		EmitSchedModel(OS);
OS << "\n";		OS << "\n";
		EmitSystemModels(OS);
		OS << "\n";
#if 0		#if 0
OS << "} // end anonymous namespace\n\n";		OS << "} // end anonymous namespace\n\n";
#endif		#endif

// MCInstrInfo initialization routine.		// MCInstrInfo initialization routine.
emitGenMCSubtargetInfo(OS);		emitGenMCSubtargetInfo(OS);

OS << "\nstatic inline MCSubtargetInfo *create" << Target		OS << "\nstatic inline MCSubtargetInfo *create" << Target
[... 12 lines not shown ...]
#endif		#endif
OS << Target << "ProcSchedKV, "		OS << Target << "ProcSchedKV, "
<< Target << "WriteProcResTable, "		<< Target << "WriteProcResTable, "
<< Target << "WriteLatencyTable, "		<< Target << "WriteLatencyTable, "
<< Target << "ReadAdvanceTable, ";		<< Target << "ReadAdvanceTable, ";
OS << '\n'; OS.indent(22);		OS << '\n'; OS.indent(22);
if (SchedModels.hasItineraries()) {		if (SchedModels.hasItineraries()) {
OS << Target << "Stages, "		OS << Target << "Stages, "
<< Target << "OperandCycles, "		<< Target << "OperandCycles, "
<< Target << "ForwardingPaths";		<< Target << "ForwardingPaths, ";
} else		} else
OS << "nullptr, nullptr, nullptr";		OS << "nullptr, nullptr, nullptr, ";
		OS << Target << "ProcSystemModelKV";
OS << ");\n}\n\n";		OS << ");\n}\n\n";

OS << "} // end namespace llvm\n\n";		OS << "} // end namespace llvm\n\n";

OS << "#endif // GET_SUBTARGETINFO_MC_DESC\n\n";		OS << "#endif // GET_SUBTARGETINFO_MC_DESC\n\n";

OS << "\n#ifdef GET_SUBTARGETINFO_TARGET_DESC\n";		OS << "\n#ifdef GET_SUBTARGETINFO_TARGET_DESC\n";
OS << "#undef GET_SUBTARGETINFO_TARGET_DESC\n\n";		OS << "#undef GET_SUBTARGETINFO_TARGET_DESC\n\n";
[... 55 lines not shown ...]
OS << "extern const llvm::MCReadAdvanceEntry "		OS << "extern const llvm::MCReadAdvanceEntry "
<< Target << "ReadAdvanceTable[];\n";		<< Target << "ReadAdvanceTable[];\n";

if (SchedModels.hasItineraries()) {		if (SchedModels.hasItineraries()) {
OS << "extern const llvm::InstrStage " << Target << "Stages[];\n";		OS << "extern const llvm::InstrStage " << Target << "Stages[];\n";
OS << "extern const unsigned " << Target << "OperandCycles[];\n";		OS << "extern const unsigned " << Target << "OperandCycles[];\n";
OS << "extern const unsigned " << Target << "ForwardingPaths[];\n";		OS << "extern const unsigned " << Target << "ForwardingPaths[];\n";
}		}

		OS << "extern const llvm::SubtargetInfoKV " << Target
		<< "ProcSystemModelKV[];\n";

OS << ClassName << "::" << ClassName << "(const Triple &TT, StringRef CPU, "		OS << ClassName << "::" << ClassName << "(const Triple &TT, StringRef CPU, "
<< "StringRef FS)\n"		<< "StringRef FS)\n"
<< " : TargetSubtargetInfo(TT, CPU, FS, ";		<< " : TargetSubtargetInfo(TT, CPU, FS, ";
if (NumFeatures)		if (NumFeatures)
OS << "makeArrayRef(" << Target << "FeatureKV, " << NumFeatures << "), ";		OS << "makeArrayRef(" << Target << "FeatureKV, " << NumFeatures << "), ";
else		else
OS << "None, ";		OS << "None, ";
if (NumProcs)		if (NumProcs)
OS << "makeArrayRef(" << Target << "SubTypeKV, " << NumProcs << "), ";		OS << "makeArrayRef(" << Target << "SubTypeKV, " << NumProcs << "), ";
else		else
OS << "None, ";		OS << "None, ";
OS << '\n'; OS.indent(24);		OS << '\n'; OS.indent(24);
OS << Target << "ProcSchedKV, "		OS << Target << "ProcSchedKV, "
<< Target << "WriteProcResTable, "		<< Target << "WriteProcResTable, "
<< Target << "WriteLatencyTable, "		<< Target << "WriteLatencyTable, "
<< Target << "ReadAdvanceTable, ";		<< Target << "ReadAdvanceTable, ";
OS << '\n'; OS.indent(24);		OS << '\n'; OS.indent(24);
if (SchedModels.hasItineraries()) {		if (SchedModels.hasItineraries()) {
OS << Target << "Stages, "		OS << Target << "Stages, "
<< Target << "OperandCycles, "		<< Target << "OperandCycles, "
<< Target << "ForwardingPaths";		<< Target << "ForwardingPaths, ";
} else		} else
OS << "nullptr, nullptr, nullptr";		OS << "nullptr, nullptr, nullptr,";
OS << ") {}\n\n";		OS << '\n';
		OS.indent(24);
		OS << Target << "ProcSystemModelKV"
		<< ") {}\n\n";

EmitSchedModelHelpers(ClassName, OS);		EmitSchedModelHelpers(ClassName, OS);
EmitHwModeCheck(ClassName, OS);		EmitHwModeCheck(ClassName, OS);

OS << "} // end namespace llvm\n\n";		OS << "} // end namespace llvm\n\n";

OS << "#endif // GET_SUBTARGETINFO_CTOR\n\n";		OS << "#endif // GET_SUBTARGETINFO_CTOR\n\n";

[... 11 lines not shown ...]

Diff 188611

llvm/include/llvm/Analysis/TargetTransformInfo.h

llvm/include/llvm/Analysis/TargetTransformInfoImpl.h

llvm/include/llvm/CodeGen/BasicTTIImpl.h

llvm/include/llvm/CodeGen/TargetSubtargetInfo.h

llvm/include/llvm/MC/MCSubtargetInfo.h

llvm/include/llvm/MC/MCSystemModel.h

llvm/include/llvm/Target/Target.td

llvm/include/llvm/Target/TargetCacheModel.td

llvm/include/llvm/Target/TargetMemoryModel.td

llvm/include/llvm/Target/TargetSoftwarePrefetchConfig.td

llvm/include/llvm/Target/TargetSystemModel.td

llvm/include/llvm/Target/TargetWCBufferModel.td

llvm/lib/Analysis/TargetTransformInfo.cpp

llvm/lib/CodeGen/TargetSubtargetInfo.cpp

llvm/lib/MC/CMakeLists.txt

llvm/lib/MC/MCSubtargetInfo.cpp

llvm/lib/MC/MCSystemModel.cpp

llvm/lib/Target/AArch64/AArch64Subtarget.h

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.h

llvm/lib/Target/AArch64/AArch64TargetTransformInfo.cpp

llvm/lib/Target/AMDGPU/MCTargetDesc/AMDGPUMCTargetDesc.cpp

llvm/lib/Target/Hexagon/HexagonTargetTransformInfo.h

llvm/lib/Target/PowerPC/PPCTargetTransformInfo.h

llvm/lib/Target/PowerPC/PPCTargetTransformInfo.cpp

llvm/lib/Target/SystemZ/SystemZTargetTransformInfo.h

llvm/lib/Transforms/Scalar/LoopDataPrefetch.cpp

llvm/test/TableGen/SystemModelEmitter.td

llvm/unittests/CodeGen/MachineInstrTest.cpp

llvm/unittests/MC/CMakeLists.txt

llvm/unittests/MC/SystemModel.cpp

llvm/utils/TableGen/SubtargetEmitter.cpp
