This is an archive of the discontinued LLVM Phabricator instance.

Differential D60295

[CodeGen] Replace -max-jump-table-size with -max-jump-table-targets
ClosedPublic

Authored by evandro on Apr 4 2019, 5:08 PM.

Details

Reviewers

hans
ayonam
junbuml
craig.topper
RKSimon
efriedma

Commits

rG3bd8ba156b52: [CodeGen] Replace -max-jump-table-size with -max-jump-table-targets
rL372893: [CodeGen] Replace -max-jump-table-size with -max-jump-table-targets

Summary

Modern processors predict the targets of an indirect branch regardless of the size of any jump table used to glean its target address. Moreover, branch predictors typically use resources limited by the number of actual targets that occur at run time.

This patch changes the semantics of the option -max-jump-table-size to limit the number of different targets instead of the number of entries in a jump table. Thus, it is now renamed to -max-jump-table-targets.

Before, with -max-jump-table-size, a jump table could list the same target several times, yet every entry counted towards the limit, so the resulting tables typically had the same number of entries. With this patch, -max-jump-table-targets counts only unique targets towards the limit. Tables may therefore differ in length, but each holds the same number of unique targets, except for the last one, which holds the remainder.
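As a rough illustration of the new semantics, the revised condition in isSuitableForJumpTable() from the TargetLowering.h diff can be written as a free function. This is only a sketch: the function name, the explicit parameters standing in for the TargetLowering getters, and the assumption that the density is a percentage are introduced here for illustration.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical free-function form of the check shown in the diff. In the
// patch, OptForSize, MinDensity and MaxJumpTableTargets come from the
// switch's function and from TargetLowering getters; here they are
// passed in directly.
bool isSuitableForJumpTableSketch(uint64_t NumCases, uint64_t NumTargets,
                                  uint64_t Range, bool OptForSize,
                                  unsigned MinDensity,
                                  unsigned MaxJumpTableTargets) {
  // Assumption: the minimum density is expressed as a percentage.
  assert(MinDensity <= 100 && "density is assumed to be a percentage");
  // The limit now applies to the number of unique targets, not to Range.
  return (OptForSize || NumTargets <= MaxJumpTableTargets) &&
         NumCases * 100 >= Range * MinDensity;
}
```

With 6 cases, 3 unique targets and a range of 12, the density is 50%, so the sketch accepts the table when the target limit is 3 and the minimum density is 40, and rejects it when the target limit drops to 2 (unless optimizing for size).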

Diff Detail

Event Timeline

evandro created this revision. Apr 4 2019, 5:08 PM

Herald added a project: Restricted Project. Apr 4 2019, 5:08 PM

Herald added subscribers: llvm-commits, jsji, jocewei and 24 others.

I'm still collecting data on other architectures, but I've observed improvements between 3 and 15% in some SPEC benchmarks, such as 253.perlbmk and 400.perlbench, and proprietary ones on AArch64.

This patch changes the semantics of the options min-jump-table-entries and max-jump-table-size to use the number of different targets instead of the number of entries in a jump table. Thus, they are now renamed to min-jump-table-cases and max-jump-table-cases, respectively.

Hmm, I'll have to think about this one.

I'm not sure it makes sense to change -min-jump-table-entries. I think that one is really supposed to decide when a jump table is better than separate branches, and I'm not sure it should change because of this.

-max-jump-table-size was introduced exactly for this purpose (limiting the load on the branch predictor), so making that based on the number of targets sounds reasonable (I'd suggest calling it max-jump-table-targets instead of -cases though).

I also don't see exactly how the semantics of max-jump-table-size are changed with this patch? What am I missing? Can you upload it again with more context maybe?

In D60295#1456226, @hans wrote:

I also don't see exactly how the semantics of max-jump-table-size are changed with this patch? What am I missing? Can you upload it again with more context maybe?

Look at llvm/include/llvm/CodeGen/TargetLowering.h below.

What does this do for codesize?

aheejin added inline comments. Apr 8 2019, 5:30 AM

llvm/test/CodeGen/WebAssembly/cfg-stackify.ll
1 ↗	(On Diff #193810)	Why does this test need this option?

In D60295#1456414, @evandro wrote:

In D60295#1456226, @hans wrote:

I also don't see exactly how the semantics of max-jump-table-size are changed with this patch? What am I missing? Can you upload it again with more context maybe?

Look at llvm/include/llvm/CodeGen/TargetLowering.h below.

Thanks! I had forgotten how this works :-)

I still think looking at the number of cases isn't that much better than looking at the size of the range though. As you said, the point is to limit the load on the branch target predictor, and IIUC that's limited on the number of *different branch targets*, which is really orthogonal to the number of cases. I realize that we don't have that information as readily available, but do you agree that limiting the jump table to a certain number of different targets would be a better approach?

llvm/include/llvm/CodeGen/TargetLowering.h
962	What's the `(NumCases < Range)` part for? Based on the description, I'd expect this to just check "NumCases <= MaxJumpTableCases"

In D60295#1458162, @dmgreen wrote:

What does this do for codesize?

Since there aren't that many jump tables, the increase in code size is negligible.

evandro marked 2 inline comments as done. Apr 19 2019, 10:33 AM

evandro added inline comments.

llvm/test/CodeGen/WebAssembly/cfg-stackify.ll
1 ↗	(On Diff #193810)	Because before there was no jump table and now there is, since the number of targets is small enough. Otherwise, this test would have to be modified for a reason unrelated to the test.

In D60295#1462585, @hans wrote:

I still think looking at the number of cases isn't that much better than looking at the size of the range though. As you said, the point is to limit the load on the branch target predictor, and IIUC that's limited on the number of *different branch targets*, which is really orthogonal to the number of cases. I realize that we don't have that information as readily available, but do you agree that limiting the jump table to a certain number of different targets would be a better approach?

For each case there is a target in the table, that may potentially be reached or not at run time, but would prevent stressing the predictor. So, cases and targets are not orthogonal, but the same.

llvm/include/llvm/CodeGen/TargetLowering.h
962	It's how I infer that there may be a default case. Or am I missing a better way to do so?

Since there aren't that many jump tables, the increase in code size is negligible.

For a point of reference on the codesize tests I ran, the increases from this patch are larger than the decreases from D59936 (which was the last decent codesize change I saw). Codesize changes might seem small at times, but they often come one small change at a time. Plus it depends how you measure them.

Also, many CPUs don't really work the way you claim. Some don't even have branch predictors, or the time would not be dominated by the branch mispredict. With those it's more about the difference between jump table setup code and the equivalent series of branches.

In D60295#1472786, @evandro wrote:

In D60295#1462585, @hans wrote:

I still think looking at the number of cases isn't that much better than looking at the size of the range though. As you said, the point is to limit the load on the branch target predictor, and IIUC that's limited on the number of *different branch targets*, which is really orthogonal to the number of cases. I realize that we don't have that information as readily available, but do you agree that limiting the jump table to a certain number of different targets would be a better approach?

For each case there is a target in the table, that may potentially be reached or not at run time, but would prevent stressing the predictor. So, cases and targets are not orthogonal, but the same.

But if that's so, shouldn't the "default" cases count too?

I don't know how CPUs do this internally, but in your change description you wrote: "Rather, branch predictors typically use resources limited by the number of actual targets that occur at run time."

I interpreted this as the number of actual target addresses, which is less than or equal to the number of cases. For example in

switch (x) {
case 1: case 8: case 12: foo(); break;
case 3: case 7: case 9: bar(); break;
default: baz();
}

there are only 3 branch targets, but 6 cases, and an indirect jump through a jump table could take 12 different values.
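The three quantities in this example can be checked with a small sketch. The CaseLabel and SwitchStats types and the function names below are hypothetical stand-ins made up for illustration, not LLVM data structures; targets are identified by name only.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Hypothetical stand-in for one case label: its value and the name of the
// block it jumps to. Not an LLVM type.
struct CaseLabel {
  int64_t Value;
  std::string Target;
};

struct SwitchStats {
  uint64_t Cases;   // explicit case labels
  uint64_t Targets; // distinct branch targets, default included if present
  uint64_t Range;   // in-range values: max - min + 1
};

SwitchStats computeSwitchStats(const std::vector<CaseLabel> &Labels,
                               bool HasDefault) {
  assert(!Labels.empty() && "sketch expects at least one case label");
  std::set<std::string> Targets;
  int64_t Lo = Labels.front().Value;
  int64_t Hi = Lo;
  for (const CaseLabel &L : Labels) {
    Targets.insert(L.Target);
    Lo = std::min(Lo, L.Value);
    Hi = std::max(Hi, L.Value);
  }
  uint64_t NumTargets = Targets.size() + (HasDefault ? 1 : 0);
  return {Labels.size(), NumTargets, static_cast<uint64_t>(Hi - Lo) + 1};
}

// The switch from the comment above: foo() for 1, 8, 12; bar() for 3, 7, 9;
// baz() as the default.
SwitchStats exampleSwitchStats() {
  const std::vector<CaseLabel> Labels = {
      {1, "foo"}, {8, "foo"}, {12, "foo"}, {3, "bar"}, {7, "bar"}, {9, "bar"}};
  return computeSwitchStats(Labels, /*HasDefault=*/true);
}
```

For this switch the sketch yields 6 cases, 3 targets and a range of 12, matching the numbers quoted above.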

The question is which of those dimensions the branch target predictor is limited by.

Does the CPU keep track of the branch target for each of the 12 possible in-range values? Or does it have some kind of reverse map that tracks which values map to which of the 3 targets? I don't have the expertise to answer this, but so far I'm not convinced that limiting the number of cases is better than limiting the size of the range.

evandro marked an inline comment as done. Apr 25 2019, 6:21 AM

@hans, yes, default, whether explicit or implicit, counts too, for the resources used inside the target only care about different addresses. In your example, a typical branch predictor will see up to 3 target addresses. Of course, branch predictors are trained by branches that are actually taken, so only target addresses that are executed count. Which opens the window for future work involving FDO or just the static probabilities given by the default heuristics.

I'll study your example further.

evandro updated this revision to Diff 206941. Jun 27 2019, 2:41 PM

evandro retitled this revision from [SelectionDAG] Change the jump table size unit from entry to target to [CodeGen] Change the jump table size unit from entry to target.

evandro edited the summary of this revision.

Ping! 🔔

Herald added a subscriber: wuzish. Jul 11 2019, 8:31 AM

Please upload with the full context.

xbolva00 added a reviewer: efriedma. Jul 11 2019, 8:40 AM

diff with context?

evandro updated this revision to Diff 209282. Jul 11 2019, 12:01 PM

🔔 ¡Ping! 🔔

Hi Evandro,

Very sorry for the slow reply here, but it needed some thinking.

From the change description:

[CodeGen] Change the jump table size unit from entry to target

It's not so clear what a "jump table size unit" is. I think maybe "Replace -max-jump-table-size with -max-jump-table-targets" or similar would be more clear.

llvm/include/llvm/CodeGen/SwitchLoweringUtils.h
232	Maybe add "unique" in here to make it clearer.
llvm/lib/CodeGen/SwitchLoweringUtils.cpp
51	I would probably just have gone with "Targets.insert(Clusters[i].MBB)" since C isn't getting used anywhere else.
97	MaxTargets isn't a great variable name for something that's "Number of targets, including the default if it's reachable". Maybe getJumpTableNumTargets() could take the default case into account so it doesn't need to be handled separately?
158	Hmm, this is problematic. getJumpTableNumTargets has linear time complexity in the number of clusters, so I think this is essentially adding another factor O(n) to the overall time complexity, which is unfortunate. One way to solve this within the original O(n^2) complexity of the current code, would be to build an auxiliary data structure in the outer for-loop, e.g. TotalNumTargets[], where TotalNumTargets[x] is the total number of targets in clusters i..x (i being defined in the outer loop). That would be built in O(n) time, and the inner loop would then query TotalNumTargets[j] (in constant time).
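The precomputation suggested for line 158 could look roughly like the sketch below. It is a simplification under stated assumptions: each cluster is reduced to an integer target id (standing in for Clusters[x].MBB), the function name is hypothetical, and the default target is ignored. For a fixed First (the i of the outer loop), one pass fills an array whose entry x is the number of unique targets in clusters First..x, so the inner loop can look up any j in constant time.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>
#include <vector>

// Hypothetical sketch: ClusterTargets[x] is an integer id for the target of
// cluster x. Returns TotalNumTargets, where TotalNumTargets[x] is the number
// of unique targets in clusters First..x. Entries below First are left at 0
// and are not meaningful.
std::vector<unsigned>
buildTotalNumTargets(const std::vector<int> &ClusterTargets,
                     std::size_t First) {
  assert(First <= ClusterTargets.size() && "First is out of range");
  std::vector<unsigned> TotalNumTargets(ClusterTargets.size(), 0);
  std::unordered_set<int> Seen;
  // One pass per outer-loop iteration: expected O(n) with a hash set.
  for (std::size_t X = First; X < ClusterTargets.size(); ++X) {
    Seen.insert(ClusterTargets[X]);
    TotalNumTargets[X] = static_cast<unsigned>(Seen.size());
  }
  return TotalNumTargets;
}
```

Building this array once per outer iteration costs expected linear time, and each inner-loop query is then a single array read, so the overall work stays quadratic rather than cubic.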

evandro marked 4 inline comments as done. Aug 6 2019, 5:21 PM

evandro added inline comments.

llvm/lib/CodeGen/SwitchLoweringUtils.cpp
158	If I understood you correctly, pre calculating the number of targets in clusters i..x, where x will be j, is the same as performing this calculation inside the inner loop.

evandro updated this revision to Diff 213770. Aug 6 2019, 6:07 PM

evandro retitled this revision from [CodeGen] Change the jump table size unit from entry to target to [CodeGen] Replace -max-jump-table-size with -max-jump-table-targets.

evandro edited the summary of this revision. Aug 6 2019, 6:37 PM

Ping❗ 🔔🔔🔔

Very sorry again for the slow reply. I think this patch still suffers from the O(n^3) time complexity.

llvm/lib/CodeGen/SwitchLoweringUtils.cpp
158	No, the current code does O(n) work in the inner loop, making the overall time complexity O(n^3). Pre-computing the number of targets in clusters i..x in the outer loop would preserve the overall time complexity of O(n^2).

evandro marked 2 inline comments as done. Aug 15 2019, 6:06 PM

evandro added inline comments.

llvm/lib/CodeGen/SwitchLoweringUtils.cpp
158	Let me see if I understood what you mean in a patch...

evandro updated this revision to Diff 215516. Aug 15 2019, 6:07 PM

evandro marked an inline comment as done.

evandro marked an inline comment as done. Aug 16 2019, 11:24 AM

🔔🔔 ‼️Ping‼️ 🔔🔔

In D60295#1641609, @evandro wrote:

🔔🔔 ‼️Ping‼️ 🔔🔔

Sorry, I've been busy with LLVM 9 release and other work. Hopefully I can get to this sometime next week; I haven't forgotten about it.

I think this keeps the algorithm still within O(n^2) so that's good.

But I am worried that this adds more complexity to something that is already very complex. Can you share numbers to show that this is worth it?

I left some comments for how to maybe make this cleaner, and I think that's the main improvement that's needed: it needs to be easier to read the code, understand what's happening and see that it's correct.

I also worry that this makes the code slower, even though the new functionality is not used by most targets. Even though it's still O(n^2) time complexity, we're doing more work now. Can you measure and see how this affects compile time of a large switch? (Maybe try the program from https://bugs.llvm.org/show_bug.cgi?id=23490)

llvm/lib/CodeGen/SwitchLoweringUtils.cpp
145	Maybe PartitionStats is a better name, since it's really about numbers, not any other kinds of traits.
150	I guess you mean PartitionTrait, not PartitionTarget. Also, the indexing of this seems very confusing, and that seems to be because elements are inserted with push_back(). Couldn't PartitionTrait[x] be the traits for Clusters[i..x], i.e. a straight-forward indexing? Also, since it's a vector, I don't think the name should be in singular.
154	Using the same variable, defined on line 158, both for initializing the array, and then for lookups later is confusing. I think it would be easier to read if the code in this block were more like: PartitionTraits[j].Range = ... PartitionTraits[j].Cases = ... PartitionTraits[j].Targets = ...
161	Here the indexing gets confusing (also we probably don't want to copy the struct, but use a const-ref).

RKSimon mentioned this in rL371415: Fix typo in comment noticed in D60295. NFCI. Sep 9 2019, 9:06 AM

RKSimon mentioned this in rG9ede7c039563: Fix typo in comment noticed in D60295. NFCI.

evandro marked 2 inline comments as done. Sep 12 2019, 10:58 AM

evandro marked 2 inline comments as done. Sep 12 2019, 12:53 PM

Please, stand by for numbers.

Running llc -O1 3 times on the byte code from a.i in the "reduced some more" archive:

Before: average of 38.383s ± 0.064s
After: average of 38.440s ± 0.049s

Or an increase of 0.15% in the run time.

lgtm with comments

llvm/include/llvm/CodeGen/SwitchLoweringUtils.h
236	Since it has access to Clusters and First and Last, passing in Cases and Range seems redundant. It seems the function should be able to figure those things out itself.
llvm/lib/CodeGen/SwitchLoweringUtils.cpp
142	The comment needs updating for the PartitionTrait rename.
150	s/traits/stats/ And thanks for updating the indexing, this is easier to follow.

This revision is now accepted and ready to land. Sep 17 2019, 6:38 AM

Thank you.

llvm/include/llvm/CodeGen/SwitchLoweringUtils.h
236	That is true, but, whenever `getJumpTableNumTargets()` is called, those values are calculated and then used again. So, at least for this use case, it seems to be more efficient to let `getJumpTableNumTargets()` calculate and return them through references.

evandro marked an inline comment as done. Sep 17 2019, 8:31 AM

evandro added inline comments.

llvm/include/llvm/CodeGen/SwitchLoweringUtils.h
236	I take it back. `Cases` needs the `TotalCases` array to be calculated. So, no, with the current arguments, `getJumpTableNumCases()` can only figure `Range` out.

Updated the patch, including the suggested refactoring.

hans added inline comments. Sep 18 2019, 12:18 AM

llvm/lib/CodeGen/SwitchLoweringUtils.cpp
46	I think it would be simpler if this just returned a PartitionStats object, instead of "returning" via a reference parameter.
51	The point of the AllCases vector was to avoid having to iterate from First to Last each time to compute the number of cases. Now that we're iterating from First to Last anyway to count the number of targets, there's no point to this optimization really, and we might as well count the number of cases while counting the number of targets. That would make the code simpler.

evandro marked 4 inline comments as done. Sep 18 2019, 8:14 AM

evandro added inline comments.

llvm/lib/CodeGen/SwitchLoweringUtils.cpp
51	Of course!

evandro updated this revision to Diff 220679. Sep 18 2019, 8:23 AM

evandro marked an inline comment as done.

hans added inline comments. Sep 20 2019, 12:06 AM

llvm/lib/CodeGen/SwitchLoweringUtils.cpp
52	There's still no need for the vector. The number of cases from First to Last is the sum of (Clusters[i].High - Clusters[i].Low) for each i between First and Last. The sum can be computed directly since we're running the for-loop anyway. There's no need to use the vector.
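A single pass that gathers cases, unique targets and range together, returning them as one object as suggested earlier for line 46, might look like the following sketch. The Cluster and PartitionStats types here are simplified, hypothetical stand-ins (integer bounds and an integer target id instead of LLVM's own types), clusters are assumed to be sorted by value, and Low..High is treated as an inclusive range, hence the + 1 per cluster.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// Hypothetical simplified cluster: an inclusive value range and a target id.
struct Cluster {
  int64_t Low;
  int64_t High;
  int Target;
};

struct PartitionStats {
  uint64_t Cases;   // case values covered by clusters First..Last
  uint64_t Targets; // unique targets among those clusters (default excluded)
  uint64_t Range;   // values spanned from Clusters[First].Low to Clusters[Last].High
};

// Assumes Clusters is sorted by value and non-overlapping.
PartitionStats computePartitionStats(const std::vector<Cluster> &Clusters,
                                     std::size_t First, std::size_t Last) {
  assert(First <= Last && Last < Clusters.size() && "invalid cluster range");
  uint64_t Cases = 0;
  std::unordered_set<int> Targets;
  for (std::size_t I = First; I <= Last; ++I) {
    // Summed directly in the loop, so no auxiliary vector is needed.
    Cases += static_cast<uint64_t>(Clusters[I].High - Clusters[I].Low) + 1;
    Targets.insert(Clusters[I].Target);
  }
  uint64_t Range =
      static_cast<uint64_t>(Clusters[Last].High - Clusters[First].Low) + 1;
  return {Cases, Targets.size(), Range};
}
```

Because the loop already visits every cluster to collect targets, adding the case count to the same loop costs nothing extra in asymptotic terms.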

evandro marked 2 inline comments as done. Sep 20 2019, 11:16 AM

evandro added inline comments.

llvm/lib/CodeGen/SwitchLoweringUtils.cpp
52	I see, inlining the functions makes this clear.

evandro updated this revision to Diff 221092. Sep 20 2019, 12:27 PM

evandro marked an inline comment as done.

evandro updated this revision to Diff 221095. Sep 20 2019, 12:46 PM

hans added inline comments. Sep 23 2019, 1:51 AM

llvm/lib/CodeGen/SwitchLoweringUtils.cpp
44	I'd suggest not declaring these until they're used.
47	I don't know what the "accumulated" refers to here. I don't think there's any need for a comment here actually.
50	There's nothing accumulated here. I'd suggest just dropping the comment.
52	It would be better to just declare the variables here: APInt Hi = ... APInt Lo = ...
60	And declare these variables here (no need to reuse the same variables as in the loop): APInt Hi = ... APInt Lo = ...

evandro updated this revision to Diff 221373. Sep 23 2019, 11:30 AM

evandro marked 5 inline comments as done.

I think you uploaded a new patch but the code that I commented on still looks the same?

🤦🏻‍♂️

evandro updated this revision to Diff 221564. Sep 24 2019, 11:07 AM

Thanks! Looks good to me.

Thank you.

Closed by commit rL372893: [CodeGen] Replace -max-jump-table-size with -max-jump-table-targets (authored by evandro). Sep 25 2019, 9:09 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path	Size
llvm/include/llvm/CodeGen/BasicTTIImpl.h	4 lines
llvm/include/llvm/CodeGen/SwitchLoweringUtils.h	8 lines
llvm/include/llvm/CodeGen/TargetLowering.h	28 lines
llvm/lib/CodeGen/SwitchLoweringUtils.cpp	28 lines
llvm/lib/CodeGen/TargetLoweringBase.cpp	18 lines
llvm/lib/Target/AArch64/AArch64ISelLowering.cpp	7 lines
llvm/lib/Target/AArch64/AArch64Subtarget.h	4 lines
llvm/lib/Target/AArch64/AArch64Subtarget.cpp	4 lines
llvm/test/CodeGen/AArch64/max-jump-table.ll	46 lines

Diff 213770

llvm/include/llvm/CodeGen/BasicTTIImpl.h

[...]	unsigned getIntrinsicCost(Intrinsic::ID IID, Type *RetTy,
}		}

return BaseT::getIntrinsicCost(IID, RetTy, ParamTys, U);		return BaseT::getIntrinsicCost(IID, RetTy, ParamTys, U);
}		}

unsigned getEstimatedNumberOfCaseClusters(const SwitchInst &SI,		unsigned getEstimatedNumberOfCaseClusters(const SwitchInst &SI,
unsigned &JumpTableSize) {		unsigned &JumpTableSize) {
/// Try to find the estimated number of clusters. Note that the number of		/// Try to find the estimated number of clusters. Note that the number of
/// clusters identified in this function could be different from the actural		/// clusters identified in this function could be different from the actual
/// numbers found in lowering. This function ignore switches that are		/// numbers found in lowering. This function ignore switches that are
/// lowered with a mix of jump table / bit test / BTree. This function was		/// lowered with a mix of jump table / bit test / BTree. This function was
/// initially intended to be used when estimating the cost of switch in		/// initially intended to be used when estimating the cost of switch in
/// inline cost heuristic, but it's a generic cost model to be used in other		/// inline cost heuristic, but it's a generic cost model to be used in other
/// places (e.g., in loop unrolling).		/// places (e.g., in loop unrolling).
unsigned N = SI.getNumCases();		unsigned N = SI.getNumCases();
const TargetLoweringBase *TLI = getTLI();		const TargetLoweringBase *TLI = getTLI();
const DataLayout &DL = this->getDataLayout();		const DataLayout &DL = this->getDataLayout();
[...]	unsigned getEstimatedNumberOfCaseClusters(const SwitchInst &SI,
// Check if suitable for a jump table.		// Check if suitable for a jump table.
if (IsJTAllowed) {		if (IsJTAllowed) {
if (N < 2 || N < TLI->getMinimumJumpTableEntries())		if (N < 2 || N < TLI->getMinimumJumpTableEntries())
return N;		return N;
uint64_t Range =		uint64_t Range =
(MaxCaseVal - MinCaseVal)		(MaxCaseVal - MinCaseVal)
.getLimitedValue(std::numeric_limits<uint64_t>::max() - 1) + 1;		.getLimitedValue(std::numeric_limits<uint64_t>::max() - 1) + 1;
// Check whether a range of clusters is dense enough for a jump table		// Check whether a range of clusters is dense enough for a jump table
if (TLI->isSuitableForJumpTable(&SI, N, Range)) {		if (TLI->isSuitableForJumpTable(&SI, N, 0, Range)) {
JumpTableSize = Range;		JumpTableSize = Range;
return 1;		return 1;
}		}
}		}
return N;		return N;
}		}

unsigned getJumpBufAlignment() { return getTLI()->getJumpBufAlignment(); }		unsigned getJumpBufAlignment() { return getTLI()->getJumpBufAlignment(); }
[...]

llvm/include/llvm/CodeGen/SwitchLoweringUtils.h

[...]	struct BitTestBlock {
BitTestBlock(APInt F, APInt R, const Value *SV, unsigned Rg, MVT RgVT, bool E,		BitTestBlock(APInt F, APInt R, const Value *SV, unsigned Rg, MVT RgVT, bool E,
bool CR, MachineBasicBlock P, MachineBasicBlock D,		bool CR, MachineBasicBlock P, MachineBasicBlock D,
BitTestInfo C, BranchProbability Pr)		BitTestInfo C, BranchProbability Pr)
: First(std::move(F)), Range(std::move(R)), SValue(SV), Reg(Rg),		: First(std::move(F)), Range(std::move(R)), SValue(SV), Reg(Rg),
RegVT(RgVT), Emitted(E), ContiguousRange(CR), Parent(P), Default(D),		RegVT(RgVT), Emitted(E), ContiguousRange(CR), Parent(P), Default(D),
Cases(std::move(C)), Prob(Pr) {}		Cases(std::move(C)), Prob(Pr) {}
};		};

/// Return the range of value within a range.		/// Return the range of values within a range.
uint64_t getJumpTableRange(const CaseClusterVector &Clusters, unsigned First,		uint64_t getJumpTableRange(const CaseClusterVector &Clusters, unsigned First,
unsigned Last);		unsigned Last);

/// Return the number of cases within a range.		/// Return the number of cases within a range.
uint64_t getJumpTableNumCases(const SmallVectorImpl<unsigned> &TotalCases,		uint64_t getJumpTableNumCases(const SmallVectorImpl<unsigned> &TotalCases,
unsigned First, unsigned Last);		unsigned First, unsigned Last);

		/// Return the number of unique case targets within a range.
		uint64_t getJumpTableNumTargets(const CaseClusterVector &Clusters,
		unsigned First, unsigned Last,
		bool HasReachableDefault,
		uint64_t Cases, uint64_t Range);

struct SwitchWorkListItem {		struct SwitchWorkListItem {
MachineBasicBlock *MBB;		MachineBasicBlock *MBB;
CaseClusterIt FirstCluster;		CaseClusterIt FirstCluster;
CaseClusterIt LastCluster;		CaseClusterIt LastCluster;
const ConstantInt *GE;		const ConstantInt *GE;
const ConstantInt *LT;		const ConstantInt *LT;
BranchProbability DefaultProb;		BranchProbability DefaultProb;
};		};
[...]

llvm/include/llvm/CodeGen/TargetLowering.h

[...]	switch (Op) {
case ISD::STRICT_FNEARBYINT: EqOpc = ISD::FNEARBYINT; break;		case ISD::STRICT_FNEARBYINT: EqOpc = ISD::FNEARBYINT; break;
case ISD::STRICT_FMAXNUM: EqOpc = ISD::FMAXNUM; break;		case ISD::STRICT_FMAXNUM: EqOpc = ISD::FMAXNUM; break;
case ISD::STRICT_FMINNUM: EqOpc = ISD::FMINNUM; break;		case ISD::STRICT_FMINNUM: EqOpc = ISD::FMINNUM; break;
case ISD::STRICT_FCEIL: EqOpc = ISD::FCEIL; break;		case ISD::STRICT_FCEIL: EqOpc = ISD::FCEIL; break;
case ISD::STRICT_FFLOOR: EqOpc = ISD::FFLOOR; break;		case ISD::STRICT_FFLOOR: EqOpc = ISD::FFLOOR; break;
case ISD::STRICT_FROUND: EqOpc = ISD::FROUND; break;		case ISD::STRICT_FROUND: EqOpc = ISD::FROUND; break;
case ISD::STRICT_FTRUNC: EqOpc = ISD::FTRUNC; break;		case ISD::STRICT_FTRUNC: EqOpc = ISD::FTRUNC; break;
case ISD::STRICT_FP_ROUND: EqOpc = ISD::FP_ROUND; break;		case ISD::STRICT_FP_ROUND: EqOpc = ISD::FP_ROUND; break;
case ISD::STRICT_FP_EXTEND: EqOpc = ISD::FP_EXTEND; break;		case ISD::STRICT_FP_EXTEND: EqOpc = ISD::FP_EXTEND; break;
}		}

auto Action = getOperationAction(EqOpc, VT);		auto Action = getOperationAction(EqOpc, VT);

// We don't currently handle Custom or Promote for strict FP pseudo-ops.		// We don't currently handle Custom or Promote for strict FP pseudo-ops.
// For now, we just expand for those cases.		// For now, we just expand for those cases.
if (Action != Legal)		if (Action != Legal)
Action = Expand;		Action = Expand;
[...]	bool rangeFitsInWord(const APInt &Low, const APInt &High,
const DataLayout &DL) const {		const DataLayout &DL) const {
// FIXME: Using the pointer type doesn't seem ideal.		// FIXME: Using the pointer type doesn't seem ideal.
uint64_t BW = DL.getIndexSizeInBits(0u);		uint64_t BW = DL.getIndexSizeInBits(0u);
uint64_t Range = (High - Low).getLimitedValue(UINT64_MAX - 1) + 1;		uint64_t Range = (High - Low).getLimitedValue(UINT64_MAX - 1) + 1;
return Range <= BW;		return Range <= BW;
}		}

/// Return true if lowering to a jump table is suitable for a set of case		/// Return true if lowering to a jump table is suitable for a set of case
/// clusters which may contain \p NumCases cases, \p Range range of values.		/// clusters which may contain \p NumCases cases, \p Range range of values,
virtual bool isSuitableForJumpTable(const SwitchInst *SI, uint64_t NumCases,		/// \p NumTargets targets.
		virtual bool isSuitableForJumpTable(const SwitchInst *SI,
		uint64_t NumCases, uint64_t NumTargets,
uint64_t Range) const {		uint64_t Range) const {
// FIXME: This function check the maximum table size and density, but the		// FIXME: This function check the maximum table size and density, but the
// minimum size is not checked. It would be nice if the minimum size is		// minimum size is not checked. It would be nice if the minimum size is
// also combined within this function. Currently, the minimum size check is		// also combined within this function. Currently, the minimum size check is
// performed in findJumpTable() in SelectionDAGBuiler and		// performed in findJumpTable() in SelectionDAGBuiler and
// getEstimatedNumberOfCaseClusters() in BasicTTIImpl.		// getEstimatedNumberOfCaseClusters() in BasicTTIImpl.
const bool OptForSize = SI->getParent()->getParent()->hasOptSize();		const bool OptForSize = SI->getParent()->getParent()->hasOptSize();
const unsigned MinDensity = getMinimumJumpTableDensity(OptForSize);		const unsigned MinDensity = getMinimumJumpTableDensity(OptForSize);
const unsigned MaxJumpTableSize = getMaximumJumpTableSize();		const unsigned MaxJumpTableTargets = getMaximumJumpTableTargets();

// Check whether the number of cases is small enough and		// Check whether the number of targets is small enough and
// the range is dense enough for a jump table.		// the range is dense enough for a jump table.
if ((OptForSize || Range <= MaxJumpTableSize) &&		if ((OptForSize || NumTargets <= MaxJumpTableTargets) &&
(NumCases * 100 >= Range * MinDensity)) {		NumCases * 100 >= Range * MinDensity)
return true;		return true;
}
return false;		return false;
}		}

/// Return true if lowering to a bit test is suitable for a set of case		/// Return true if lowering to a bit test is suitable for a set of case
/// clusters which contains \p NumDests unique destinations, \p Low and		/// clusters which contains \p NumDests unique destinations, \p Low and
/// \p High as its lowest and highest case values, and expects \p NumCmps		/// \p High as its lowest and highest case values, and expects \p NumCmps
/// case value comparisons. Check if the number of destinations, comparison		/// case value comparisons. Check if the number of destinations, comparison
/// metric, and range are all suitable.		/// metric, and range are all suitable.
[...]
}		}

/// Return lower limit for number of blocks in a jump table.		/// Return lower limit for number of blocks in a jump table.
virtual unsigned getMinimumJumpTableEntries() const;		virtual unsigned getMinimumJumpTableEntries() const;

/// Return lower limit of the density in a jump table.		/// Return lower limit of the density in a jump table.
unsigned getMinimumJumpTableDensity(bool OptForSize) const;		unsigned getMinimumJumpTableDensity(bool OptForSize) const;

/// Return upper limit for number of entries in a jump table.		/// Return upper limit for number of targets in a jump table.
/// Zero if no limit.		unsigned getMaximumJumpTableTargets() const;
unsigned getMaximumJumpTableSize() const;

virtual bool isJumpTableRelative() const {		virtual bool isJumpTableRelative() const {
return TM.isPositionIndependent();		return TM.isPositionIndependent();
}		}

/// If a physical register, this specifies the register that		/// If a physical register, this specifies the register that
/// llvm.savestack/llvm.restorestack should save and restore.		/// llvm.savestack/llvm.restorestack should save and restore.
unsigned getStackPointerRegisterToSaveRestore() const {		unsigned getStackPointerRegisterToSaveRestore() const {
[...]	protected:
/// llvm.longjmp or the version without _. Defaults to false.		/// llvm.longjmp or the version without _. Defaults to false.
void setUseUnderscoreLongJmp(bool Val) {		void setUseUnderscoreLongJmp(bool Val) {
UseUnderscoreLongJmp = Val;		UseUnderscoreLongJmp = Val;
}		}

/// Indicate the minimum number of blocks to generate jump tables.		/// Indicate the minimum number of blocks to generate jump tables.
void setMinimumJumpTableEntries(unsigned Val);		void setMinimumJumpTableEntries(unsigned Val);

/// Indicate the maximum number of entries in jump tables.		/// Indicate the maximum number of targets in jump tables.
/// Set to zero to generate unlimited jump tables.		void setMaximumJumpTableTargets(unsigned);
void setMaximumJumpTableSize(unsigned);

/// If set to a physical register, this specifies the register that		/// If set to a physical register, this specifies the register that
/// llvm.savestack/llvm.restorestack should save and restore.		/// llvm.savestack/llvm.restorestack should save and restore.
void setStackPointerRegisterToSaveRestore(unsigned R) {		void setStackPointerRegisterToSaveRestore(unsigned R) {
StackPointerRegisterToSaveRestore = R;		StackPointerRegisterToSaveRestore = R;
}		}

/// Tells the code generator that the target has multiple (allocatable)		/// Tells the code generator that the target has multiple (allocatable)
[... 2,197 lines not shown ...]

llvm/lib/CodeGen/SwitchLoweringUtils.cpp

//===- SwitchLoweringUtils.cpp - Switch Lowering --------------------------===//		//===- SwitchLoweringUtils.cpp - Switch Lowering --------------------------===//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This file contains switch inst lowering optimizations and utilities for		// This file contains switch inst lowering optimizations and utilities for
// codegen, so that it can be used for both SelectionDAG and GlobalISel.		// codegen, so that it can be used for both SelectionDAG and GlobalISel.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

		#include "llvm/ADT/SmallSet.h"
#include "llvm/CodeGen/MachineJumpTableInfo.h"		#include "llvm/CodeGen/MachineJumpTableInfo.h"
#include "llvm/CodeGen/SwitchLoweringUtils.h"		#include "llvm/CodeGen/SwitchLoweringUtils.h"

using namespace llvm;		using namespace llvm;
using namespace SwitchCG;		using namespace SwitchCG;

uint64_t SwitchCG::getJumpTableRange(const CaseClusterVector &Clusters,		uint64_t SwitchCG::getJumpTableRange(const CaseClusterVector &Clusters,
unsigned First, unsigned Last) {		unsigned First, unsigned Last) {
[... 13 lines not shown ...]	SwitchCG::getJumpTableNumCases(const SmallVectorImpl<unsigned> &TotalCases,
unsigned First, unsigned Last) {		unsigned First, unsigned Last) {
assert(Last >= First);		assert(Last >= First);
assert(TotalCases[Last] >= TotalCases[First]);		assert(TotalCases[Last] >= TotalCases[First]);
uint64_t NumCases =		uint64_t NumCases =
TotalCases[Last] - (First == 0 ? 0 : TotalCases[First - 1]);		TotalCases[Last] - (First == 0 ? 0 : TotalCases[First - 1]);
return NumCases;		return NumCases;
}		}

		uint64_t
		hans: I'd suggest not declaring these until they're used.
		SwitchCG::getJumpTableNumTargets(const CaseClusterVector &Clusters,
		unsigned First, unsigned Last,
		hans: I think it would be simpler if this just returned a PartitionStats object, instead of "returning" via a reference parameter.
		bool HasReachableDefault,
		hans: I don't know what the "accumulated" refers to here. I don't think there's any need for a comment here actually.
		uint64_t Cases, uint64_t Range) {
		assert(Last >= First);
		SmallSet<const MachineBasicBlock*, 8> Targets;
		hans: There's nothing accumulated here. I'd suggest just dropping the comment.

		hans: I would probably just have gone with "Targets.insert(Clusters[i].MBB)" since C isn't getting used anywhere else.
		hans: The point of the AllCases vector was to avoid having to iterate from First to Last each time to compute the number of cases. Now that we're iterating from First to Last anyway to count the number of targets, there's no point to this optimization really, and we might as well count the number of cases while counting the number of targets. That would make the code simpler.
		evandro: Of course!
		for (unsigned i = First; i <= Last; i++)
		hans: There's still no need for the vector. The number of cases from First to Last is the sum of (Clusters[i].High - Clusters[i].Low) for each i between First and Last. The sum can be computed directly since we're running the for-loop anyway. There's no need to use the vector.
		evandro: I see, inlining the functions makes this clear.
		hans: It would be better to just declare the variables here: APInt Hi = ... APInt Lo = ...
		Targets.insert(Clusters[i].MBB);

		return Targets.size() + (HasReachableDefault && Range > Cases);
		}
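
The inline suggestions on getJumpTableNumTargets (return a stats object, and count the cases in the same loop that counts the targets) can be sketched in isolation. This is a minimal sketch and not the committed code: Cluster, PartitionStats, and computeStats are hypothetical stand-ins that use standard containers and plain integers in place of LLVM's CaseCluster, APInt, and SmallSet. Each cluster contributes High - Low + 1 case values, matching the TotalCases computation in findJumpTables.

```cpp
#include <cassert>
#include <cstdint>
#include <set>
#include <vector>

// Hypothetical stand-in for SwitchCG::CaseCluster: an inclusive range of case
// values [Low, High] that all branch to the block identified by Target.
struct Cluster {
  int64_t Low;
  int64_t High;
  int Target;
};

// Hypothetical result type, named after the PartitionStats suggestion.
struct PartitionStats {
  uint64_t Range;   // Values spanned from the first Low to the last High.
  uint64_t Cases;   // Case values actually covered by the clusters.
  uint64_t Targets; // Unique destinations, plus the default if it is used.
};

// Computes all three numbers for Clusters[First..Last] in a single pass.
// Assumes the clusters are sorted and the values are small enough that
// High - Low does not overflow (the real code uses APInt for this).
PartitionStats computeStats(const std::vector<Cluster> &Clusters,
                            unsigned First, unsigned Last,
                            bool HasReachableDefault) {
  assert(First <= Last && Last < Clusters.size());
  std::set<int> Targets;
  uint64_t Cases = 0;
  for (unsigned I = First; I <= Last; ++I) {
    Cases += static_cast<uint64_t>(Clusters[I].High - Clusters[I].Low) + 1;
    Targets.insert(Clusters[I].Target);
  }
  uint64_t Range =
      static_cast<uint64_t>(Clusters[Last].High - Clusters[First].Low) + 1;
  // Holes in the range are filled with the default block, which then counts
  // as one additional target when it is reachable.
  uint64_t NumTargets =
      Targets.size() + ((HasReachableDefault && Range > Cases) ? 1 : 0);
  return {Range, Cases, NumTargets};
}
```

With six single-value clusters at 1, 2, 3, 4, 14, and 15, each going to a different block, this yields a range of 15, 6 cases, and 7 targets when the default is reachable.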

void SwitchCG::SwitchLowering::findJumpTables(CaseClusterVector &Clusters,		void SwitchCG::SwitchLowering::findJumpTables(CaseClusterVector &Clusters,
const SwitchInst *SI,		const SwitchInst *SI,
MachineBasicBlock *DefaultMBB) {		MachineBasicBlock *DefaultMBB) {
		hans: And declare these variables here (no need to reuse the same variables as in the loop): APInt Hi = ... APInt Lo = ...
#ifndef NDEBUG		#ifndef NDEBUG
// Clusters must be non-empty, sorted, and only contain Range clusters.		// Clusters must be non-empty, sorted, and only contain Range clusters.
assert(!Clusters.empty());		assert(!Clusters.empty());
for (CaseCluster &C : Clusters)		for (CaseCluster &C : Clusters)
assert(C.Kind == CC_Range);		assert(C.Kind == CC_Range);
for (unsigned i = 1, e = Clusters.size(); i < e; ++i)		for (unsigned i = 1, e = Clusters.size(); i < e; ++i)
assert(Clusters[i - 1].High->getValue().slt(Clusters[i].Low->getValue()));		assert(Clusters[i - 1].High->getValue().slt(Clusters[i].Low->getValue()));
#endif		#endif
[... 15 lines not shown ...]	#endif
for (unsigned i = 0; i < N; ++i) {		for (unsigned i = 0; i < N; ++i) {
const APInt &Hi = Clusters[i].High->getValue();		const APInt &Hi = Clusters[i].High->getValue();
const APInt &Lo = Clusters[i].Low->getValue();		const APInt &Lo = Clusters[i].Low->getValue();
TotalCases[i] = (Hi - Lo).getLimitedValue() + 1;		TotalCases[i] = (Hi - Lo).getLimitedValue() + 1;
if (i != 0)		if (i != 0)
TotalCases[i] += TotalCases[i - 1];		TotalCases[i] += TotalCases[i - 1];
}		}

		const bool HasReachableDefault =
		!isa<UnreachableInst>(DefaultMBB->getBasicBlock()->getFirstNonPHIOrDbg());
uint64_t Range = getJumpTableRange(Clusters,0, N - 1);		uint64_t Range = getJumpTableRange(Clusters, 0, N - 1);
uint64_t NumCases = getJumpTableNumCases(TotalCases, 0, N - 1);		uint64_t NumCases = getJumpTableNumCases(TotalCases, 0, N - 1);
		uint64_t NumTargets = getJumpTableNumTargets(Clusters, 0, N - 1,
		HasReachableDefault,
		hans: MaxTargets isn't a great variable name for something that's "Number of targets, including the default if it's reachable". Maybe getJumpTableNumTargets() could take the default case into account so it doesn't need to be handled separately?
		NumCases, Range);
assert(NumCases < UINT64_MAX / 100);		assert(NumCases < UINT64_MAX / 100);
assert(Range >= NumCases);		assert(Range >= NumCases);

// Cheap case: the whole range may be suitable for jump table.		// Cheap case: the whole range may be suitable for jump table.
if (TLI->isSuitableForJumpTable(SI, NumCases, Range)) {		if (TLI->isSuitableForJumpTable(SI, NumCases, NumTargets, Range)) {
CaseCluster JTCluster;		CaseCluster JTCluster;
if (buildJumpTable(Clusters, 0, N - 1, SI, DefaultMBB, JTCluster)) {		if (buildJumpTable(Clusters, 0, N - 1, SI, DefaultMBB, JTCluster)) {
Clusters[0] = JTCluster;		Clusters[0] = JTCluster;
Clusters.resize(1);		Clusters.resize(1);
return;		return;
}		}
}		}

[... 22 lines not shown ...]	enum PartitionScores : unsigned {
NoTable = 0,		NoTable = 0,
Table = 1,		Table = 1,
FewCases = 1,		FewCases = 1,
SingleCase = 2		SingleCase = 2
};		};

// Base case: There is only one way to partition Clusters[N-1].		// Base case: There is only one way to partition Clusters[N-1].
MinPartitions[N - 1] = 1;		MinPartitions[N - 1] = 1;
LastElement[N - 1] = N - 1;		LastElement[N - 1] = N - 1;
		hans: The comment needs updating for the PartitionTrait rename.
PartitionsScore[N - 1] = PartitionScores::SingleCase;		PartitionsScore[N - 1] = PartitionScores::SingleCase;

// Note: loop indexes are signed to avoid underflow.		// Note: loop indexes are signed to avoid underflow.
		hans: Maybe PartitionStats is a better name, since it's really about numbers, not any other kinds of traits.
for (int64_t i = N - 2; i >= 0; i--) {		for (int64_t i = N - 2; i >= 0; i--) {
// Find optimal partitioning of Clusters[i..N-1].		// Find optimal partitioning of Clusters[i..N-1].
// Baseline: Put Clusters[i] into a partition on its own.		// Baseline: Put Clusters[i] into a partition on its own.
MinPartitions[i] = MinPartitions[i + 1] + 1;		MinPartitions[i] = MinPartitions[i + 1] + 1;
LastElement[i] = i;		LastElement[i] = i;
		hans: I guess you mean PartitionTrait, not PartitionTarget. Also, the indexing of this seems very confusing, and that seems to be because elements are inserted with push_back(). Couldn't PartitionTrait[x] be the traits for Clusters[i..x], i.e. a straight-forward indexing? Also, since it's a vector, I don't think the name should be in singular.
		hans: s/traits/stats/ And thanks for updating the indexing, this is easier to follow.
PartitionsScore[i] = PartitionsScore[i + 1] + PartitionScores::SingleCase;		PartitionsScore[i] = PartitionsScore[i + 1] + PartitionScores::SingleCase;

// Search for a solution that results in fewer partitions.		// Search for a solution that results in fewer partitions.
for (int64_t j = N - 1; j > i; j--) {		for (int64_t j = N - 1; j > i; j--) {
		hans: Using the same variable, defined on line 158, both for initializing the array, and then for lookups later is confusing. I think it would be easier to read if the code in this block was more like: PartitionTraits[j].Range = ... PartitionTraits[j].Cases = ... PartitionTraits[j].Targets = ...
// Try building a partition from Clusters[i..j].		// Try building a partition from Clusters[i..j].
Range = getJumpTableRange(Clusters, i, j);		Range = getJumpTableRange(Clusters, i, j);
NumCases = getJumpTableNumCases(TotalCases, i, j);		NumCases = getJumpTableNumCases(TotalCases, i, j);
		NumTargets = getJumpTableNumTargets(Clusters, i, j, HasReachableDefault,
		hans: Hmm, this is problematic. getJumpTableNumTargets has linear time complexity in the number of clusters, so I think this is essentially adding another factor O(n) to the overall time complexity, which is unfortunate. One way to solve this within the original O(n^2) complexity of the current code would be to build an auxiliary data structure in the outer for-loop, e.g. TotalNumTargets[], where TotalNumTargets[x] is the total number of targets in clusters i..x (i being defined in the outer loop). That would be built in O(n) time, and the inner loop would then query TotalNumTargets[j] (in constant time).
		evandro: If I understood you correctly, precalculating the number of targets in clusters i..x, where x will be j, is the same as performing this calculation inside the inner loop.
		hans: No, the current code does O(n) work in the inner loop, making the overall time complexity O(n^3). Pre-computing the number of targets in clusters i..x in the outer loop would preserve the overall time complexity of O(n^2).
		evandro: Let me see if I understood what you mean in a patch...
		NumCases, Range);
assert(NumCases < UINT64_MAX / 100);		assert(NumCases < UINT64_MAX / 100);
assert(Range >= NumCases);		assert(Range >= NumCases);
		hans: Here the indexing gets confusing (also we probably don't want to copy the struct, but use a const-ref).

if (TLI->isSuitableForJumpTable(SI, NumCases, Range)) {		if (TLI->isSuitableForJumpTable(SI, NumCases, NumTargets, Range)) {
unsigned NumPartitions = 1 + (j == N - 1 ? 0 : MinPartitions[j + 1]);		unsigned NumPartitions = 1 + (j == N - 1 ? 0 : MinPartitions[j + 1]);
unsigned Score = j == N - 1 ? 0 : PartitionsScore[j + 1];		unsigned Score = j == N - 1 ? 0 : PartitionsScore[j + 1];
int64_t NumEntries = j - i + 1;		int64_t NumEntries = j - i + 1;

if (NumEntries == 1)		if (NumEntries == 1)
Score += PartitionScores::SingleCase;		Score += PartitionScores::SingleCase;
else if (NumEntries <= SmallNumberOfEntries)		else if (NumEntries <= SmallNumberOfEntries)
Score += PartitionScores::FewCases;		Score += PartitionScores::FewCases;
[... 340 lines not shown ...]
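
The complexity point raised in the inline thread on findJumpTables can be illustrated outside of LLVM. The sketch below is an assumption-laden illustration, not the code that was committed: uniqueTargetsFrom and furthestEndWithinLimit are hypothetical helpers, and each cluster is reduced to an integer naming its target block. For a fixed start index, the unique-target counts for every end index are built in one linear pass in the outer loop, so the inner loop only performs constant-time lookups and the overall cost stays quadratic.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_set>
#include <vector>

// For a fixed start index I, returns a vector Counts where Counts[X], for
// X >= I, is the number of unique targets among clusters I..X. Entries below
// I are left at zero. Built in one linear pass (expected time).
std::vector<uint64_t> uniqueTargetsFrom(const std::vector<int> &ClusterTargets,
                                        size_t I) {
  std::vector<uint64_t> Counts(ClusterTargets.size(), 0);
  std::unordered_set<int> Seen;
  for (size_t X = I; X < ClusterTargets.size(); ++X) {
    Seen.insert(ClusterTargets[X]);
    Counts[X] = Seen.size();
  }
  return Counts;
}

// Hypothetical driver with the same loop shape as findJumpTables: for each
// start I, find the furthest end J such that clusters I..J have at most
// MaxTargets unique targets. The inner loop only reads precomputed counts.
std::vector<size_t> furthestEndWithinLimit(const std::vector<int> &T,
                                           uint64_t MaxTargets) {
  std::vector<size_t> Furthest(T.size());
  for (size_t I = T.size(); I-- > 0;) {
    // O(n) work, done once per outer iteration.
    std::vector<uint64_t> Counts = uniqueTargetsFrom(T, I);
    Furthest[I] = I; // Baseline: the cluster on its own.
    for (size_t J = T.size() - 1; J > I; --J) {
      if (Counts[J] <= MaxTargets) { // O(1) lookup.
        Furthest[I] = J;
        break;
      }
    }
  }
  return Furthest;
}
```

The real partitioning also weighs density and partition scores; this sketch isolates only the target-count lookup.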

llvm/lib/CodeGen/TargetLoweringBase.cpp

[... 66 lines not shown ...]	static cl::opt<bool> JumpIsExpensiveOverride(
"jump-is-expensive", cl::init(false),		"jump-is-expensive", cl::init(false),
cl::desc("Do not create extra branches to split comparison logic."),		cl::desc("Do not create extra branches to split comparison logic."),
cl::Hidden);		cl::Hidden);

static cl::opt<unsigned> MinimumJumpTableEntries		static cl::opt<unsigned> MinimumJumpTableEntries
("min-jump-table-entries", cl::init(4), cl::Hidden,		("min-jump-table-entries", cl::init(4), cl::Hidden,
cl::desc("Set minimum number of entries to use a jump table."));		cl::desc("Set minimum number of entries to use a jump table."));

static cl::opt<unsigned> MaximumJumpTableSize		static cl::opt<unsigned> MaximumJumpTableTargets
("max-jump-table-size", cl::init(UINT_MAX), cl::Hidden,		("max-jump-table-targets", cl::init(UINT_MAX), cl::Hidden,
cl::desc("Set maximum size of jump tables."));		cl::desc("Set maximum number of targets to use in a jump table."));

/// Minimum jump table density for normal functions.		/// Minimum jump table density for normal functions.
static cl::opt<unsigned>		static cl::opt<unsigned>
JumpTableDensity("jump-table-density", cl::init(10), cl::Hidden,		JumpTableDensity("jump-table-density", cl::init(10), cl::Hidden,
cl::desc("Minimum density for building a jump table in "		cl::desc("Minimum density for building a jump table in "
"a normal function"));		"a normal function"));

/// Minimum jump table density for -Os or -Oz functions.		/// Minimum jump table density for -Os or -Oz functions.
[... 1,670 lines not shown ...]
unsigned TargetLoweringBase::getMinimumJumpTableEntries() const {		unsigned TargetLoweringBase::getMinimumJumpTableEntries() const {
return MinimumJumpTableEntries;		return MinimumJumpTableEntries;
}		}

void TargetLoweringBase::setMinimumJumpTableEntries(unsigned Val) {		void TargetLoweringBase::setMinimumJumpTableEntries(unsigned Val) {
MinimumJumpTableEntries = Val;		MinimumJumpTableEntries = Val;
}		}

unsigned TargetLoweringBase::getMinimumJumpTableDensity(bool OptForSize) const {		unsigned TargetLoweringBase::getMaximumJumpTableTargets() const {
return OptForSize ? OptsizeJumpTableDensity : JumpTableDensity;		return MaximumJumpTableTargets;
}		}

unsigned TargetLoweringBase::getMaximumJumpTableSize() const {		void TargetLoweringBase::setMaximumJumpTableTargets(unsigned Val) {
return MaximumJumpTableSize;		MaximumJumpTableTargets = Val;
}		}

void TargetLoweringBase::setMaximumJumpTableSize(unsigned Val) {		unsigned TargetLoweringBase::getMinimumJumpTableDensity(bool OptForSize) const {
MaximumJumpTableSize = Val;		return OptForSize ? OptsizeJumpTableDensity : JumpTableDensity;
}		}

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// Reciprocal Estimates		// Reciprocal Estimates
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// Get the reciprocal estimate attribute string for a function that will		/// Get the reciprocal estimate attribute string for a function that will
/// override the target defaults.		/// override the target defaults.
[... 181 lines not shown ...]

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

[... 635 lines not shown ...]	AArch64TargetLowering::AArch64TargetLowering(const TargetMachine &TM,
EnableExtLdPromotion = true;		EnableExtLdPromotion = true;

// Set required alignment.		// Set required alignment.
setMinFunctionAlignment(2);		setMinFunctionAlignment(2);
// Set preferred alignments.		// Set preferred alignments.
setPrefFunctionAlignment(STI.getPrefFunctionAlignment());		setPrefFunctionAlignment(STI.getPrefFunctionAlignment());
setPrefLoopAlignment(STI.getPrefLoopAlignment());		setPrefLoopAlignment(STI.getPrefLoopAlignment());

// Only change the limit for entries in a jump table if specified by		// Only change the limit for targets in a jump table if specified by
// the sub target, but not at the command line.		// the sub target, but not at the command line.
unsigned MaxJT = STI.getMaximumJumpTableSize();		if (getMaximumJumpTableTargets() == UINT_MAX)
if (MaxJT && getMaximumJumpTableSize() == UINT_MAX)		setMaximumJumpTableTargets(STI.getMaximumJumpTableTargets());
setMaximumJumpTableSize(MaxJT);

setHasExtractBitsInsn(true);		setHasExtractBitsInsn(true);

setOperationAction(ISD::INTRINSIC_WO_CHAIN, MVT::Other, Custom);		setOperationAction(ISD::INTRINSIC_WO_CHAIN, MVT::Other, Custom);

if (Subtarget->hasNEON()) {		if (Subtarget->hasNEON()) {
// FIXME: v1f64 shouldn't be legal if we can avoid it, because it leads to		// FIXME: v1f64 shouldn't be legal if we can avoid it, because it leads to
// silliness like this:		// silliness like this:
[... 11,550 lines not shown ...]
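
The precedence between the command-line option and the subtarget default in the AArch64TargetLowering constructor above can be restated as a small pure function. This is a sketch under the assumption that UINT_MAX is the "not set" sentinel on both sides, as the new side of the diff suggests; resolveMaxJumpTableTargets is a hypothetical name and does not exist in LLVM.

```cpp
#include <climits>

// Hypothetical helper. CmdLineLimit is the value of -max-jump-table-targets
// (UINT_MAX when the option was not given); SubtargetLimit is the subtarget's
// preference (UINT_MAX when the CPU expresses none).
unsigned resolveMaxJumpTableTargets(unsigned CmdLineLimit,
                                    unsigned SubtargetLimit) {
  // Only fall back to the subtarget when the command line left the default.
  if (CmdLineLimit == UINT_MAX)
    return SubtargetLimit;
  return CmdLineLimit;
}
```

One consequence of using a sentinel this way is that explicitly passing UINT_MAX on the command line is indistinguishable from not passing the option at all.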

llvm/lib/Target/AArch64/AArch64Subtarget.h

[... 194 lines not shown ...]	protected:
uint8_t MaxInterleaveFactor = 2;		uint8_t MaxInterleaveFactor = 2;
uint8_t VectorInsertExtractBaseCost = 3;		uint8_t VectorInsertExtractBaseCost = 3;
uint16_t CacheLineSize = 0;		uint16_t CacheLineSize = 0;
uint16_t PrefetchDistance = 0;		uint16_t PrefetchDistance = 0;
uint16_t MinPrefetchStride = 1;		uint16_t MinPrefetchStride = 1;
unsigned MaxPrefetchIterationsAhead = UINT_MAX;		unsigned MaxPrefetchIterationsAhead = UINT_MAX;
unsigned PrefFunctionAlignment = 0;		unsigned PrefFunctionAlignment = 0;
unsigned PrefLoopAlignment = 0;		unsigned PrefLoopAlignment = 0;
unsigned MaxJumpTableSize = 0;		unsigned MaxJumpTableTargets = UINT_MAX;
unsigned WideningBaseCost = 0;		unsigned WideningBaseCost = 0;

// ReserveXRegister[i] - X#i is not available as a general purpose register.		// ReserveXRegister[i] - X#i is not available as a general purpose register.
BitVector ReserveXRegister;		BitVector ReserveXRegister;

// CustomCallUsedXRegister[i] - X#i call saved.		// CustomCallUsedXRegister[i] - X#i call saved.
BitVector CustomCallSavedXRegs;		BitVector CustomCallSavedXRegs;

[... 145 lines not shown ...]	public:
unsigned getPrefetchDistance() const { return PrefetchDistance; }		unsigned getPrefetchDistance() const { return PrefetchDistance; }
unsigned getMinPrefetchStride() const { return MinPrefetchStride; }		unsigned getMinPrefetchStride() const { return MinPrefetchStride; }
unsigned getMaxPrefetchIterationsAhead() const {		unsigned getMaxPrefetchIterationsAhead() const {
return MaxPrefetchIterationsAhead;		return MaxPrefetchIterationsAhead;
}		}
unsigned getPrefFunctionAlignment() const { return PrefFunctionAlignment; }		unsigned getPrefFunctionAlignment() const { return PrefFunctionAlignment; }
unsigned getPrefLoopAlignment() const { return PrefLoopAlignment; }		unsigned getPrefLoopAlignment() const { return PrefLoopAlignment; }

unsigned getMaximumJumpTableSize() const { return MaxJumpTableSize; }		unsigned getMaximumJumpTableTargets() const { return MaxJumpTableTargets; }

unsigned getWideningBaseCost() const { return WideningBaseCost; }		unsigned getWideningBaseCost() const { return WideningBaseCost; }

/// CPU has TBI (top byte of addresses is ignored during HW address		/// CPU has TBI (top byte of addresses is ignored during HW address
/// translation) and OS enables it.		/// translation) and OS enables it.
bool supportsAddressTopByteIgnored() const;		bool supportsAddressTopByteIgnored() const;

bool hasPerfMon() const { return HasPerfMon; }		bool hasPerfMon() const { return HasPerfMon; }
[... 113 lines not shown ...]

llvm/lib/Target/AArch64/AArch64Subtarget.cpp

[... 89 lines not shown ...]	void AArch64Subtarget::initializeProperties() {
case Cyclone:		case Cyclone:
CacheLineSize = 64;		CacheLineSize = 64;
PrefetchDistance = 280;		PrefetchDistance = 280;
MinPrefetchStride = 2048;		MinPrefetchStride = 2048;
MaxPrefetchIterationsAhead = 3;		MaxPrefetchIterationsAhead = 3;
break;		break;
case ExynosM1:		case ExynosM1:
MaxInterleaveFactor = 4;		MaxInterleaveFactor = 4;
MaxJumpTableSize = 8;		MaxJumpTableTargets = 8;
PrefFunctionAlignment = 4;		PrefFunctionAlignment = 4;
PrefLoopAlignment = 3;		PrefLoopAlignment = 3;
break;		break;
case ExynosM3:		case ExynosM3:
MaxInterleaveFactor = 4;		MaxInterleaveFactor = 4;
MaxJumpTableSize = 20;		MaxJumpTableTargets = 20;
PrefFunctionAlignment = 5;		PrefFunctionAlignment = 5;
PrefLoopAlignment = 4;		PrefLoopAlignment = 4;
break;		break;
case Falkor:		case Falkor:
MaxInterleaveFactor = 4;		MaxInterleaveFactor = 4;
// FIXME: remove this to enable 64-bit SLP if performance looks good.		// FIXME: remove this to enable 64-bit SLP if performance looks good.
MinVectorRegisterBitWidth = 128;		MinVectorRegisterBitWidth = 128;
CacheLineSize = 128;		CacheLineSize = 128;
[... 191 lines not shown ...]

llvm/test/CodeGen/AArch64/max-jump-table.ll

; RUN: llc %s -O2 -print-machineinstrs -mtriple=aarch64-linux-gnu -jump-table-density=40 -o /dev/null 2> %t; FileCheck %s --check-prefixes=CHECK,CHECK0 < %t		; RUN: llc %s -O2 -print-machineinstrs -mtriple=aarch64-linux-gnu -jump-table-density=40 -o /dev/null 2> %t; FileCheck %s --check-prefixes=CHECK,CHECK0 < %t
; RUN: llc %s -O2 -print-machineinstrs -mtriple=aarch64-linux-gnu -jump-table-density=40 -max-jump-table-size=4 -o /dev/null 2> %t; FileCheck %s --check-prefixes=CHECK,CHECK4 < %t		; RUN: llc %s -O2 -print-machineinstrs -mtriple=aarch64-linux-gnu -jump-table-density=40 -max-jump-table-targets=4 -o /dev/null 2> %t; FileCheck %s --check-prefixes=CHECK,CHECK4 < %t
; RUN: llc %s -O2 -print-machineinstrs -mtriple=aarch64-linux-gnu -jump-table-density=40 -max-jump-table-size=8 -o /dev/null 2> %t; FileCheck %s --check-prefixes=CHECK,CHECK8 < %t		; RUN: llc %s -O2 -print-machineinstrs -mtriple=aarch64-linux-gnu -jump-table-density=40 -max-jump-table-targets=8 -o /dev/null 2> %t; FileCheck %s --check-prefixes=CHECK,CHECK8 < %t
; RUN: llc %s -O2 -print-machineinstrs -mtriple=aarch64-linux-gnu -jump-table-density=40 -max-jump-table-size=16 -o /dev/null 2> %t; FileCheck %s --check-prefixes=CHECK,CHECK16 < %t		; RUN: llc %s -O2 -print-machineinstrs -mtriple=aarch64-linux-gnu -jump-table-density=40 -max-jump-table-targets=16 -o /dev/null 2> %t; FileCheck %s --check-prefixes=CHECK,CHECK16 < %t
; RUN: llc %s -O2 -print-machineinstrs -mtriple=aarch64-linux-gnu -jump-table-density=40 -mcpu=exynos-m1 -o /dev/null 2> %t; FileCheck %s --check-prefixes=CHECK,CHECKM1 < %t		; RUN: llc %s -O2 -print-machineinstrs -mtriple=aarch64-linux-gnu -jump-table-density=40 -mcpu=exynos-m1 -o /dev/null 2> %t; FileCheck %s --check-prefixes=CHECK,CHECKM1 < %t
; RUN: llc %s -O2 -print-machineinstrs -mtriple=aarch64-linux-gnu -jump-table-density=40 -mcpu=exynos-m3 -o /dev/null 2> %t; FileCheck %s --check-prefixes=CHECK,CHECKM3 < %t		; RUN: llc %s -O2 -print-machineinstrs -mtriple=aarch64-linux-gnu -jump-table-density=40 -mcpu=exynos-m3 -o /dev/null 2> %t; FileCheck %s --check-prefixes=CHECK,CHECKM3 < %t

declare void @ext(i32, i32)		declare void @ext(i32, i32)

define i32 @jt1(i32 %a, i32 %b) {		define i32 @jt1(i32 %a, i32 %b) {
entry:		entry:
switch i32 %a, label %return [		switch i32 %a, label %return [
i32 1, label %bb1		i32 1, label %bb1
i32 2, label %bb2		i32 2, label %bb2
[... 66 lines not shown ...]	switch i32 %x, label %return [
i32 15, label %bb6		i32 15, label %bb6
]		]
; CHECK-LABEL: function jt2:		; CHECK-LABEL: function jt2:
; CHECK-NEXT: Jump Tables:		; CHECK-NEXT: Jump Tables:
; CHECK0-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.5 %bb.6{{$}}		; CHECK0-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.5 %bb.6{{$}}
; CHECK0-NOT: %jump-table.1:		; CHECK0-NOT: %jump-table.1:
; CHECK4-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4{{$}}		; CHECK4-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4{{$}}
; CHECK4-NOT: %jump-table.1:		; CHECK4-NOT: %jump-table.1:
; CHECK8-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4{{$}}		; CHECK8-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.5 %bb.6{{$}}
; CHECK8-NOT: %jump-table.1:		; CHECK8-NOT: %jump-table.1:
; CHECK16-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.5 %bb.6{{$}}		; CHECK16-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.5 %bb.6{{$}}
; CHECK16-NOT: %jump-table.1:		; CHECK16-NOT: %jump-table.1:
; CHECKM1-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4{{$}}		; CHECKM1-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.5 %bb.6{{$}}
; CHECKM1-NOT: %jump-table.1:		; CHECKM1-NOT: %jump-table.1:
; CHECKM3-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.5 %bb.6{{$}}		; CHECKM3-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.7 %bb.5 %bb.6{{$}}
; CHECKM3-NOT: %jump-table.1:		; CHECKM3-NOT: %jump-table.1:
; CHECK-DAG: End machine code for function jt2.		; CHECK-DAG: End machine code for function jt2.

bb1: tail call void @ext(i32 6, i32 1) br label %return		bb1: tail call void @ext(i32 6, i32 1) br label %return
bb2: tail call void @ext(i32 5, i32 2) br label %return		bb2: tail call void @ext(i32 5, i32 2) br label %return
bb3: tail call void @ext(i32 4, i32 3) br label %return		bb3: tail call void @ext(i32 4, i32 3) br label %return
bb4: tail call void @ext(i32 3, i32 4) br label %return		bb4: tail call void @ext(i32 3, i32 4) br label %return
bb5: tail call void @ext(i32 2, i32 5) br label %return		bb5: tail call void @ext(i32 2, i32 5) br label %return
bb6: tail call void @ext(i32 1, i32 6) br label %return		bb6: tail call void @ext(i32 1, i32 6) br label %return

return: ret void		return: ret void
}		}
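
The change in the CHECK8 expectation for jt2 follows from counting distinct targets rather than table entries. As a rough illustration, the sketch below takes the full jt2 table from the CHECK0 line, written as block numbers, and computes both quantities. The helper names numEntries, numTargets, and jt2Table are hypothetical, and the comparison against the limit is simplified: the real decision also involves the density and minimum-entries checks.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Number of slots in the table: what -max-jump-table-size used to limit.
size_t numEntries(const std::vector<int> &Table) { return Table.size(); }

// Number of distinct blocks in the table: what -max-jump-table-targets limits.
size_t numTargets(const std::vector<int> &Table) {
  return std::set<int>(Table.begin(), Table.end()).size();
}

// The CHECK0 table for jt2, with %bb.N written as N. Block 7 is the default
// destination that fills the holes in the case range.
std::vector<int> jt2Table() {
  return {1, 2, 3, 4, 7, 7, 7, 7, 7, 7, 7, 7, 7, 5, 6};
}
```

The table has 15 entries but only 7 distinct targets. Under the old option, a limit of 8 was exceeded by the 15 entries, so only the first four cases ended up in a table; under the new option, 7 targets fit within a limit of 8, so the whole table is kept, which is what the updated CHECK8 line expects.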

define void @jt3(i32 %x) {		define void @jt3(i32 %x) {
entry:		entry:
switch i32 %x, label %return [		switch i32 %x, label %return [
i32 1, label %bb1		i32 1, label %bb1
i32 2, label %bb2		i32 2, label %bb2
[... 13 lines not shown ...]	entry:
]		]
; CHECK-LABEL: function jt3:		; CHECK-LABEL: function jt3:
; CHECK-NEXT: Jump Tables:		; CHECK-NEXT: Jump Tables:
; CHECK0-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.5 %bb.6 %bb.7 %bb.8 %bb.13 %bb.9 %bb.10 %bb.13 %bb.11 %bb.12		; CHECK0-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.5 %bb.6 %bb.7 %bb.8 %bb.13 %bb.9 %bb.10 %bb.13 %bb.11 %bb.12
; CHECK0-NOT: %jump-table.1:		; CHECK0-NOT: %jump-table.1:
; CHECK4-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4		; CHECK4-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4
; CHECK4-NEXT: %jump-table.1: %bb.5 %bb.6 %bb.7 %bb.8		; CHECK4-NEXT: %jump-table.1: %bb.5 %bb.6 %bb.7 %bb.8
; CHECK4-NOT: %jump-table.2:		; CHECK4-NOT: %jump-table.2:
; CHECK8-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4		; CHECK8-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.5 %bb.6 %bb.7
; CHECK8-NEXT: %jump-table.1: %bb.5 %bb.6 %bb.7 %bb.8 %bb.13 %bb.9 %bb.10		; CHECK8-NEXT: %jump-table.1: %bb.8 %bb.13 %bb.9 %bb.10 %bb.13 %bb.11 %bb.12
; CHECK8-NOT: %jump-table.2:		; CHECK8-NOT: %jump-table.2:
; CHECK16-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.5 %bb.6 %bb.7		; CHECK16-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.5 %bb.6 %bb.7 %bb.8 %bb.13 %bb.9 %bb.10 %bb.13 %bb.11 %bb.12
; CHECK16-NEXT: %jump-table.1: %bb.8 %bb.13 %bb.9 %bb.10 %bb.13 %bb.11 %bb.12		; CHECK16-NOT: %jump-table.1:
; CHECK16-NOT: %jump-table.2:		; CHECKM1-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.5 %bb.6 %bb.7
; CHECKM1-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4		; CHECKM1-NEXT: %jump-table.1: %bb.8 %bb.13 %bb.9 %bb.10 %bb.13 %bb.11 %bb.12
; CHECKM1-NEXT: %jump-table.1: %bb.5 %bb.6 %bb.7 %bb.8 %bb.13 %bb.9 %bb.10
; CHECKM1-NOT: %jump-table.2:		; CHECKM1-NOT: %jump-table.2:
; CHECKM3-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.5 %bb.6 %bb.7 %bb.8 %bb.13 %bb.9 %bb.10		; CHECKM3-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.5 %bb.6 %bb.7 %bb.8 %bb.13 %bb.9 %bb.10
; CHECKM3-NOT: %jump-table.1:		; CHECKM3-NOT: %jump-table.1:
; CHECK-DAG: End machine code for function jt3.		; CHECK-DAG: End machine code for function jt3.

bb1: tail call void @ext(i32 1, i32 12) br label %return		bb1: tail call void @ext(i32 1, i32 12) br label %return
bb2: tail call void @ext(i32 2, i32 11) br label %return		bb2: tail call void @ext(i32 2, i32 11) br label %return
bb3: tail call void @ext(i32 3, i32 10) br label %return		bb3: tail call void @ext(i32 3, i32 10) br label %return
Show All 30 Lines	switch i32 %x, label %default [
i32 23, label %bb12		i32 23, label %bb12
]		]
; CHECK-LABEL: function jt4:		; CHECK-LABEL: function jt4:
; CHECK-NEXT: Jump Tables:		; CHECK-NEXT: Jump Tables:
; CHECK0-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.5 %bb.6 %bb.7 %bb.8 %bb.13 %bb.9 %bb.10 %bb.13 %bb.11 %bb.12		; CHECK0-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.5 %bb.6 %bb.7 %bb.8 %bb.13 %bb.9 %bb.10 %bb.13 %bb.11 %bb.12
; CHECK0-NOT: %jump-table.1:		; CHECK0-NOT: %jump-table.1:
; CHECK4-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4		; CHECK4-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4
; CHECK4-NEXT: %jump-table.1: %bb.5 %bb.6 %bb.7 %bb.8		; CHECK4-NEXT: %jump-table.1: %bb.5 %bb.6 %bb.7 %bb.8
; CHECK4-NOT: %jump-table.2:		; CHECK4-NEXT: %jump-table.2: %bb.9 %bb.10 %bb.13 %bb.11 %bb.12
; CHECK8-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4		; CHECK4-NOT: %jump-table.3:
; CHECK8-NEXT: %jump-table.1: %bb.5 %bb.6 %bb.7 %bb.8 %bb.13 %bb.9 %bb.10		; CHECK8-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.5 %bb.6 %bb.7 %bb.8
		; CHECK8-NEXT: %jump-table.1: %bb.9 %bb.10 %bb.13 %bb.11 %bb.12
; CHECK8-NOT: %jump-table.2:		; CHECK8-NOT: %jump-table.2:
; CHECK16-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.5 %bb.6 %bb.7		; CHECK16-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.5 %bb.6 %bb.7 %bb.8 %bb.13 %bb.9 %bb.10 %bb.13 %bb.11 %bb.12
; CHECK16-NEXT: %jump-table.1: %bb.8 %bb.13 %bb.9 %bb.10 %bb.13 %bb.11 %bb.12		; CHECK16-NOT: %jump-table.1:
; CHECK16-NOT: %jump-table.2:		; CHECKM1-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.5 %bb.6 %bb.7 %bb.8
; CHECKM1-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4		; CHECKM1-NEXT: %jump-table.1: %bb.9 %bb.10 %bb.13 %bb.11 %bb.12
; CHECKM1-NEXT: %jump-table.1: %bb.5 %bb.6 %bb.7 %bb.8 %bb.13 %bb.9 %bb.10
; CHECKM1-NOT: %jump-table.2:		; CHECKM1-NOT: %jump-table.2:
; CHECKM3-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.5 %bb.6 %bb.7 %bb.8 %bb.13 %bb.9 %bb.10		; CHECKM3-NEXT: %jump-table.0: %bb.1 %bb.2 %bb.3 %bb.4 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.13 %bb.5 %bb.6 %bb.7 %bb.8 %bb.13 %bb.9 %bb.10
; CHECKM3-NOT: %jump-table.1:		; CHECKM3-NOT: %jump-table.1:
; CHECK-DAG: End machine code for function jt4.		; CHECK-DAG: End machine code for function jt4.

bb1: tail call void @ext(i32 1, i32 12) br label %return		bb1: tail call void @ext(i32 1, i32 12) br label %return
bb2: tail call void @ext(i32 2, i32 11) br label %return		bb2: tail call void @ext(i32 2, i32 11) br label %return
bb3: tail call void @ext(i32 3, i32 10) br label %return		bb3: tail call void @ext(i32 3, i32 10) br label %return
Show All 13 Lines
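
The updated CHECK lines above reflect a limit on the number of distinct targets per jump table rather than on the number of entries, which is why the resulting tables can differ in length. A minimal sketch of that counting rule follows. It assumes a simple greedy left-to-right split, which is not necessarily the partitioning used in SwitchLoweringUtils.cpp; `splitByUniqueTargets` is a hypothetical helper, not an LLVM API, and treating a limit of 0 as "no limit" is an assumption made only for this illustration.

```cpp
#include <cstddef>
#include <set>
#include <vector>

// Hypothetical helper (not part of LLVM): split a sequence of jump-table
// entries into consecutive tables so that no table refers to more than
// MaxTargets distinct target blocks. Each entry is the id of its target
// block. MaxTargets == 0 is treated as "no limit" (an assumption made here).
std::vector<std::vector<int>>
splitByUniqueTargets(const std::vector<int> &Entries, std::size_t MaxTargets) {
  std::vector<std::vector<int>> Tables;
  std::vector<int> Current; // entries of the table being built
  std::set<int> Seen;       // distinct targets in the table being built

  for (int Target : Entries) {
    bool IsNew = Seen.count(Target) == 0;
    // Start a new table only when a *new* target would exceed the limit;
    // repeated targets never count against it.
    if (MaxTargets != 0 && IsNew && Seen.size() == MaxTargets) {
      Tables.push_back(Current);
      Current.clear();
      Seen.clear();
    }
    Current.push_back(Target);
    Seen.insert(Target);
  }
  if (!Current.empty())
    Tables.push_back(Current);
  return Tables;
}
```

With a limit of 4, the entry sequence 1 2 3 4 13 13 5 6 7 8 13 9 10 is split by this sketch into tables of 4, 5, and 4 entries: the repeated target 13 lengthens the second table without adding to its count of distinct targets.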

Revision Contents

Diff 213770

llvm/include/llvm/CodeGen/BasicTTIImpl.h

llvm/include/llvm/CodeGen/SwitchLoweringUtils.h

llvm/include/llvm/CodeGen/TargetLowering.h

llvm/lib/CodeGen/SwitchLoweringUtils.cpp

llvm/lib/CodeGen/TargetLoweringBase.cpp

llvm/lib/Target/AArch64/AArch64ISelLowering.cpp

llvm/lib/Target/AArch64/AArch64Subtarget.h

llvm/lib/Target/AArch64/AArch64Subtarget.cpp

llvm/test/CodeGen/AArch64/max-jump-table.ll
