This is an archive of the discontinued LLVM Phabricator instance.

Differential D34409

Use 64bit jump table with large code model on 64bit
Needs Review · Public

Authored by yuyichao on Jun 20 2017, 9:45 AM.

Details

Reviewers

loladiro
joerg
lattner
t.p.northover

Summary

The data and code segments can be more than 32bit apart so the offset needs to be 64bit in size.
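To illustrate the summary, here is a minimal sketch of why a 32-bit label difference breaks once the jump table and the target block are more than 2 GiB apart. The helper names and the addresses are hypothetical and not taken from the patch; the model only assumes that a table entry stores `block - table` and that dispatch adds the entry back to the table address.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: true if (block - table) fits in a signed 32-bit entry.
bool fitsInSigned32(std::uint64_t blockAddr, std::uint64_t tableAddr) {
  const std::int64_t diff = static_cast<std::int64_t>(blockAddr - tableAddr);
  return diff >= std::numeric_limits<std::int32_t>::min() &&
         diff <= std::numeric_limits<std::int32_t>::max();
}

// Model of dispatch through a 32-bit entry: sign-extend, then add the table address.
std::uint64_t resolve32(std::uint64_t tableAddr, std::int32_t entry) {
  return tableAddr +
         static_cast<std::uint64_t>(static_cast<std::int64_t>(entry));
}

// Model of dispatch through a 64-bit entry: add the full-width difference.
std::uint64_t resolve64(std::uint64_t tableAddr, std::int64_t entry) {
  return tableAddr + static_cast<std::uint64_t>(entry);
}
```

With a made-up block at `0x400000` and a made-up table at `0x200000000`, the difference no longer fits in 32 bits, so the truncated entry resolves to the wrong address while the 64-bit entry round-trips.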

Event Timeline

yuyichao created this revision. Jun 20 2017, 9:45 AM

Herald added subscribers: javed.absar, nemanjai. Jun 20 2017, 9:45 AM

This looks reasonable to me.

This revision is now accepted and ready to land. Jun 20 2017, 12:24 PM

At least the PPC change is definitely wrong. AArch64 should be wrong as well from what I discussed with Tim.

Blindly moving to 64bit differences is *not* the right approach. Moving to function relative offsets or not using a separate section is.

The entirely sensible assumption of the PPC backend is that a single function is no longer than 2GB/4GB. The position of the sections is irrelevant. The jump offsets inside the function can all be expressed as 32bit offset. The only problematic part is hooking up the address computation of this base in the backend.
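The function-relative idea in the comment above can be sketched as follows. This is a simplified model, not the PPC backend's actual lowering: the function names are hypothetical, and `base` stands in for whatever function-local anchor (for example a picbase) the backend can materialize. Under the stated assumption that a function is smaller than 2 GiB, every entry fits in 32 bits no matter where the table itself is placed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <limits>
#include <vector>

// Build a table of 32-bit offsets relative to a function-local base address.
// Assumption: all blocks lie within 2 GiB of `base`.
std::vector<std::int32_t> buildFunctionRelativeTable(
    std::uint64_t base, const std::vector<std::uint64_t> &blocks) {
  std::vector<std::int32_t> table;
  table.reserve(blocks.size());
  for (std::uint64_t block : blocks) {
    const std::int64_t offset = static_cast<std::int64_t>(block - base);
    assert(offset >= std::numeric_limits<std::int32_t>::min() &&
           offset <= std::numeric_limits<std::int32_t>::max());
    table.push_back(static_cast<std::int32_t>(offset));
  }
  return table;
}

// Dispatch: sign-extend the entry and add the function-local base.
// The address of the table itself never enters the computation.
std::uint64_t resolveEntry(std::uint64_t base,
                           const std::vector<std::int32_t> &table,
                           std::size_t index) {
  return base +
         static_cast<std::uint64_t>(static_cast<std::int64_t>(table.at(index)));
}
```

The remaining difficulty named in the comment, computing `base` at the dispatch site, is outside this sketch.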

This revision now requires changes to proceed. Jun 20 2017, 1:42 PM

> The entirely sensible assumption of the PPC backend is that a single function is no longer than 2GB/4GB.

What's being produced is an offset from the basic blocks (.text) to the Jump table (.rodata). That's not necessarily 32-bits, and I think there are entirely legitimate reasons for putting jump tables in .rodata.

I expect something better could be done for PPC, but this is entirely in line with the existing 32-bit code and correctness comes before performance. I pretty strongly object to characterising the patch as "wrong".

In D34409#785891, @t.p.northover wrote:

>> The entirely sensible assumption of the PPC backend is that a single function is no longer than 2GB/4GB.

> What's being produced is an offset from the basic blocks (.text) to the Jump table (.rodata). That's not necessarily 32-bits, and I think there are entirely legitimate reasons for putting jump tables in .rodata.

That's not what PPC64 creates. It puts offsets between the BB and a picbase into the jumptable.

> I expect something better could be done for PPC, but this is entirely in line with the existing 32-bit code and correctness comes before performance. I pretty strongly object to characterising the patch as "wrong".

Please read what I said. The PPC change is wrong: the existing code already works correctly for the large code model. For X86_64, two options exist: non-PIC should work fine when using absolute pointers. That's the primary use case of the large code model right now, i.e. for JIT. The better approach is to follow what PPC does, and this is not done by this patch.

> I expect something better could be done for PPC, but this is entirely in line with the existing 32-bit code and correctness comes before performance. I pretty strongly object to characterising the patch as "wrong".

Exactly, the code generated for aarch64, ppc64 and x86-64 in this case was obviously wrong in the correctness sense (it feels weird to spell that out...). The new code generated for x86-64 and aarch64 looks correct to me. I'm not familiar enough with ppc64 assembly to tell, but given that the code generating the assembly is pretty generic, I expect it to be correct at least. I have no idea what exactly would be better code to generate for either of these cases. I do agree a function can be assumed to be smaller than 2G, so if a different but existing code path can be used, I'll be happy to make that change for a specific arch. Otherwise, I'd like to keep the "generic" way currently used, since that matches what's done in 32bit mode, and leave the performance optimization to people more familiar with the performance model and trade-offs on different archs.

FWIW, the PPC backend does seem to be doing some transformation so that a function-local offset ends up being used. The original function never returns EK_GPRel32BlockAddress and I can't really verify the correctness of the resulting assembly, so I went with the version that shares the code path with the two other archs that I can verify. If EK_GPRel32BlockAddress should work on ppc64 and generates a pc offset table, I'll be happy to make that change.

> It puts offsets between the BB and a picbase into the jumptable.

Yes, I assumed it's some other valid transformation done by the backend since I checked in the debugger that an offset between the BB and a data section was asked for. Would EK_GPRel32BlockAddress be the right solution on ppc64 then?

For AArch64 there was a previous discussion here: https://reviews.llvm.org/D32564 (some replies don't seem to be here, the thread is at http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20170424/448545.html). That patch would supersede this one I believe, but until I get around to resurrecting it this patch at least makes things work.

> FWIW, the PPC backend does seem to be doing some transformation so that a function-local offset ends up being used.

Ah yes, I've run the code myself and see what Joerg meant now (apologies, and thanks for overriding me). It seems like disabling this for PPC is probably a good idea, the existing 32-bit offsets are probably fine.

Which leaves x86. It looks like x86 is much like PPC and correct right now: jump tables immediately follow each function.

So I think I should probably actually do something about that older patch.

The x86_64 code is mixed: it is correct for the non-PIC large code model, but it doesn't work correctly for PIC. That covers the cases most people have been interested in, especially since large model and PIC run into various other issues too, from what I remember.

It would be really nice to use the proper 32bit difference for PIC in general. It would be a stepping stone to using 8bit or 16bit labels for smaller functions. But I ran out of time banging my head against the lowering when I last looked at it.

> the existing 32-bit offsets are probably fine.

Ah, I just realized that the large code model is indeed doing some magic here. I'll leave it as is then and update the test. (Though it feels a little strange since the wrong format is asked for....)

> Which leaves x86. It looks like x86 is much like PPC and correct right now:

I believe it's actually wrong. The jump table is emitted into .rodata and can be placed far away. It's where I actually saw a segfault/assertion.

> That covers the cases most people have been interested in, especially since large model and PIC run into various other issues too, from what I remember.

For some reason we are using the large code model with PIC in the JIT and this is the only issue I see on x86-64 so far.

PPC changes reverted and tests updated. I believe this addresses all the comments so far.

So to summarize,

On PPC64, there's some magic that fixes the large offset so the final code is correct already.
On AArch64, there might be something fancy we can do (use the pointer that we use to find the data section?) but this should make it work correctly before that.
On X86-64, this is needed to make it behave correctly. Not sure if there's a better trick.

> For some reason we are using the large code model with PIC in the JIT and this is the only issue I see on x86-64 so far.

Who is we? I'm moderately sure that is not the case in general, since it would create less efficient code.

If you want to create a stop-gap solution that badly, I would make it emit the jump table as writable for non-PIC && largeish code model. That's a much more contained change. It would be really better to work on the real fix and not add more hacks...

> Who is we? I'm moderately sure that is not the case in general, since it would create less efficient code.

No, that's not the default. "We" are Julia on x86-64 (ref: https://github.com/JuliaLang/julia/pull/22451). In short, I think we turned it on to support threading. Comments there are welcome =)

Why would a writable jump table help?

If the jump table is writable, you can just use the absolute pointers, PIC or no PIC. That would be the "block address" encoding.

> If the jump table is writable

FWIW, that also sounds like a hack and a (minor?) security issue.

> It would be really better to work on the real fix and not add more hacks...

And what exactly do you mean by "the real fix"? I believe this is a reasonable generic fallback until an optimization is implemented for a particular arch, and given that this is exactly what GCC does on x86-64, I think this is (one of) the correct solutions on x86-64 too.
On AArch64, GCC throws an error, which I think is much better than silently generating the wrong code....

In D34409#786482, @yuyichao wrote:

>> If the jump table is writable

> FWIW, that also sounds like a hack and a (minor?) security issue.

It's no better or worse than the GOT.

>> It would be really better to work on the real fix and not add more hacks...

> And what exactly do you mean by "the real fix"? I believe this is a reasonable generic fallback until an optimization is implemented for a particular arch, and given that this is exactly what GCC does on x86-64, I think this is (one of) the correct solutions on x86-64 too.
> On AArch64, GCC throws an error, which I think is much better than silently generating the wrong code....

There are three options given here so far:
(1) Use the plain block address. Requires replacing the default getSectionForJumpTable and getJumpTableEncoding. Change is localized to the affected architectures.
(2) Introduce a whole new generic 64bit label difference. Non-localized infrastructure change.
(3) Properly switch to function-relative 32bit labels. Change is localized to the affected architecture, or at least to support glue.

In terms of code overhead for the access, (1) is strictly the shortest: a plain indirect branch to an indexed memory location. (2) needs a pointer load + offset computation, (3) needs a pointer load + PC-relative offset computation. As such, (2) and (3) are often somewhat equal. (1) and (2) require the same amount of memory for the jump table, making (2) not very attractive when relocations themselves are ephemeral. (3) saves significant amounts of space for any non-trivial jump table.

Note that GCC is quite different as it often will not create a separate jump table section. That's also an option [(4)] supported by LLVM with some overrides, and it will work for the large code model at the expense of making more static data executable.

Based on all that, I do not consider the complexity of (2) justified at all for a short term workaround of target-specific limitations. (1) and (4) are easier and create faster code. (3) is the preferred implementation for 64bit platforms as it minimizes size of executable code and total binary size at the expense of a slightly more complex access vector. The only reason why it isn't implemented for AArch64 and X86_64 yet is the necessary function-specific base address.
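The memory side of the comparison between options (1) to (3) can be made concrete with a small sketch. The enum and function names below are hypothetical labels for the three proposals, and the entry sizes assume a 64-bit target with 8-byte pointers; the sketch models only table size, not the access sequence.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical labels for the three proposals discussed above.
enum class JumpTableOption {
  BlockAddress,       // (1) absolute block address per entry
  LabelDifference64,  // (2) 64-bit label difference per entry
  FunctionRelative32  // (3) 32-bit function-relative offset per entry
};

// Bytes per table entry, assuming 8-byte pointers.
std::size_t entryBytes(JumpTableOption option) {
  switch (option) {
  case JumpTableOption::BlockAddress:
    return 8;
  case JumpTableOption::LabelDifference64:
    return 8;
  case JumpTableOption::FunctionRelative32:
    return 4;
  }
  return 0; // not reached for valid enumerators
}

// Total bytes for a jump table with `entries` targets.
std::size_t tableBytes(JumpTableOption option, std::size_t entries) {
  return entryBytes(option) * entries;
}
```

For a 1000-entry table this gives 8000 bytes for (1) and (2) and 4000 bytes for (3), which matches the claim that (1) and (2) cost the same memory while (3) halves it.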

I see the non-arch-specific property as the good part, since currently everything other than PPC claims to support the large model with PIC and just generates wrong code. If a fallback that's always implementable and correct is defined, the only arch-specific changes needed will be for efficiency and not for correctness.

> Note that GCC is quite different as it often will not create a separate jump table section.

AFAICT it is using rodata as the section. Same as LLVM here.

> The only reason why it isn't implemented for AArch64 and X86_64 yet is the necessary function-specific base address.

Looking at the assembly, I think AArch64 should always have a function base address to use (accessing the jump table in a PIC way already requires adrp), but that does not seem to be the case for x86_64, where the function-local address isn't stored in any register, and it seems that doing so will in general require one more instruction.

Revision Contents

Path                                          Size
include/llvm/CodeGen/MIRYamlMapping.h         2 lines
include/llvm/CodeGen/MachineJumpTableInfo.h   9 lines
lib/CodeGen/AsmPrinter/AsmPrinter.cpp         12 lines
lib/CodeGen/MachineFunction.cpp               2 lines
lib/CodeGen/SelectionDAG/TargetLowering.cpp   20 lines
test/CodeGen/AArch64/jumptable-large.ll       51 lines
test/CodeGen/PowerPC/jumptable-large.ll       51 lines
test/CodeGen/X86/jumptable-large.ll           51 lines

Diff 103283

include/llvm/CodeGen/MIRYamlMapping.h

[123 lines not shown; context: static void enumeration(yaml::IO &IO,]
     IO.enumCase(EntryKind, "block-address",
                 MachineJumpTableInfo::EK_BlockAddress);
     IO.enumCase(EntryKind, "gp-rel64-block-address",
                 MachineJumpTableInfo::EK_GPRel64BlockAddress);
     IO.enumCase(EntryKind, "gp-rel32-block-address",
                 MachineJumpTableInfo::EK_GPRel32BlockAddress);
     IO.enumCase(EntryKind, "label-difference32",
                 MachineJumpTableInfo::EK_LabelDifference32);
+    IO.enumCase(EntryKind, "label-difference64",
+                MachineJumpTableInfo::EK_LabelDifference64);
     IO.enumCase(EntryKind, "inline", MachineJumpTableInfo::EK_Inline);
     IO.enumCase(EntryKind, "custom32", MachineJumpTableInfo::EK_Custom32);
   }
 };

 } // end namespace yaml
 } // end namespace llvm

[349 lines not shown]

include/llvm/CodeGen/MachineJumpTableInfo.h

[61 lines not shown; context: enum JTEntryKind {]
     /// the address of the jump table. This is used for PIC jump tables where
     /// gprel32 is not supported. e.g.:
     /// .word LBB123 - LJTI1_2
     /// If the .set directive is supported, this is emitted as:
     /// .set L4_5_set_123, LBB123 - LJTI1_2
     /// .word L4_5_set_123
     EK_LabelDifference32,

+    /// EK_LabelDifference64 - Each entry is the address of the block minus
+    /// the address of the jump table. This is used for PIC jump tables where
+    /// gprel64 is not supported. e.g.:
+    /// .word LBB123 - LJTI1_2
+    /// If the .set directive is supported, this is emitted as:
+    /// .set L4_5_set_123, LBB123 - LJTI1_2
+    /// .word L4_5_set_123
+    EK_LabelDifference64,
+
     /// EK_Inline - Jump table entries are emitted inline at their point of
     /// use. It is the responsibility of the target to emit the entries.
     EK_Inline,

     /// EK_Custom32 - Each entry is a 32-bit value that is custom lowered by the
     /// TargetLowering::LowerCustomJumpTableEntry hook.
     EK_Custom32
   };
[53 lines not shown]

lib/CodeGen/AsmPrinter/AsmPrinter.cpp

[1,532 lines not shown; context: void AsmPrinter::EmitJumpTableInfo() {]
   const std::vector<MachineJumpTableEntry> &JT = MJTI->getJumpTables();
   if (JT.empty()) return;

   // Pick the directive to use to print the jump table entries, and switch to
   // the appropriate section.
   const Function *F = MF->getFunction();
   const TargetLoweringObjectFile &TLOF = getObjFileLowering();
   bool JTInDiffSection = !TLOF.shouldPutJumpTableInFunctionSection(
-      MJTI->getEntryKind() == MachineJumpTableInfo::EK_LabelDifference32,
+      MJTI->getEntryKind() == MachineJumpTableInfo::EK_LabelDifference32 ||
+      MJTI->getEntryKind() == MachineJumpTableInfo::EK_LabelDifference64,
       *F);
   if (JTInDiffSection) {
     // Drop it in the readonly section.
     MCSection *ReadOnlySection = TLOF.getSectionForJumpTable(*F, TM);
     OutStreamer->SwitchSection(ReadOnlySection);
   }

   EmitAlignment(Log2_32(MJTI->getEntryAlignment(DL)));

   // Jump tables in code sections are marked with a data_region directive
   // where that's supported.
   if (!JTInDiffSection)
     OutStreamer->EmitDataRegion(MCDR_DataRegionJT32);

   for (unsigned JTI = 0, e = JT.size(); JTI != e; ++JTI) {
     const std::vector<MachineBasicBlock*> &JTBBs = JT[JTI].MBBs;

     // If this jump table was deleted, ignore it.
     if (JTBBs.empty()) continue;

-    // For the EK_LabelDifference32 entry, if using .set avoids a relocation,
+    // For the EK_LabelDifference(32|64) entry, if using .set avoids a relocation,
     /// emit a .set directive for each unique entry.
-    if (MJTI->getEntryKind() == MachineJumpTableInfo::EK_LabelDifference32 &&
+    if ((MJTI->getEntryKind() == MachineJumpTableInfo::EK_LabelDifference32 ||
+         MJTI->getEntryKind() == MachineJumpTableInfo::EK_LabelDifference64) &&
         MAI->doesSetDirectiveSuppressReloc()) {
       SmallPtrSet<const MachineBasicBlock*, 16> EmittedSets;
       const TargetLowering *TLI = MF->getSubtarget().getTargetLowering();
       const MCExpr *Base = TLI->getPICJumpTableRelocBaseExpr(MF,JTI,OutContext);
       for (unsigned ii = 0, ee = JTBBs.size(); ii != ee; ++ii) {
         const MachineBasicBlock *MBB = JTBBs[ii];
         if (!EmittedSets.insert(MBB).second)
           continue;
[57 lines not shown; context: case MachineJumpTableInfo::EK_GPRel64BlockAddress: {]
     // EK_GPRel64BlockAddress - Each entry is an address of block, encoded
     // with a relocation as gp-relative, e.g.:
     // .gpdword LBB123
     MCSymbol *MBBSym = MBB->getSymbol();
     OutStreamer->EmitGPRel64Value(MCSymbolRefExpr::create(MBBSym, OutContext));
     return;
   }


+  case MachineJumpTableInfo::EK_LabelDifference64:
   case MachineJumpTableInfo::EK_LabelDifference32: {
     // Each entry is the address of the block minus the address of the jump
-    // table. This is used for PIC jump tables where gprel32 is not supported.
+    // table. This is used for PIC jump tables where gprel32 or gprel64 is not supported.
     // e.g.:
     // .word LBB123 - LJTI1_2
     // If the .set directive avoids relocations, this is emitted as:
     // .set L4_5_set_123, LBB123 - LJTI1_2
     // .word L4_5_set_123
     if (MAI->doesSetDirectiveSuppressReloc()) {
       Value = MCSymbolRefExpr::create(GetJTSetSymbol(UID, MBB->getNumber()),
                                       OutContext);
[1,210 lines not shown]

lib/CodeGen/MachineFunction.cpp

[759 lines not shown]
 /// Return the size of each entry in the jump table.
 unsigned MachineJumpTableInfo::getEntrySize(const DataLayout &TD) const {
   // The size of a jump table entry is 4 bytes unless the entry is just the
   // address of a block, in which case it is the pointer size.
   switch (getEntryKind()) {
   case MachineJumpTableInfo::EK_BlockAddress:
     return TD.getPointerSize();
   case MachineJumpTableInfo::EK_GPRel64BlockAddress:
+  case MachineJumpTableInfo::EK_LabelDifference64:
     return 8;
   case MachineJumpTableInfo::EK_GPRel32BlockAddress:
   case MachineJumpTableInfo::EK_LabelDifference32:
   case MachineJumpTableInfo::EK_Custom32:
     return 4;
   case MachineJumpTableInfo::EK_Inline:
     return 0;
   }
   llvm_unreachable("Unknown jump table encoding!");
 }

 /// Return the alignment of each entry in the jump table.
 unsigned MachineJumpTableInfo::getEntryAlignment(const DataLayout &TD) const {
   // The alignment of a jump table entry is the alignment of int32 unless the
   // entry is just the address of a block, in which case it is the pointer
   // alignment.
   switch (getEntryKind()) {
   case MachineJumpTableInfo::EK_BlockAddress:
     return TD.getPointerABIAlignment();
   case MachineJumpTableInfo::EK_GPRel64BlockAddress:
+  case MachineJumpTableInfo::EK_LabelDifference64:
     return TD.getABIIntegerTypeAlignment(64);
   case MachineJumpTableInfo::EK_GPRel32BlockAddress:
   case MachineJumpTableInfo::EK_LabelDifference32:
   case MachineJumpTableInfo::EK_Custom32:
     return TD.getABIIntegerTypeAlignment(32);
   case MachineJumpTableInfo::EK_Inline:
     return 1;
   }
[211 lines not shown]

lib/CodeGen/SelectionDAG/TargetLowering.cpp

[282 lines not shown]

 /// Return the entry encoding for a jump table in the current function. The
 /// returned value is a member of the MachineJumpTableInfo::JTEntryKind enum.
 unsigned TargetLowering::getJumpTableEncoding() const {
   // In non-pic modes, just use the address of a block.
   if (!isPositionIndependent())
     return MachineJumpTableInfo::EK_BlockAddress;

+  const auto &TM = getTargetMachine();
+  if (TM.getPointerSize() == 8 && TM.getCodeModel() == CodeModel::Large) {
+    // In PIC mode, if the target supports a GPRel64 directive, use it.
+    if (TM.getMCAsmInfo()->getGPRel64Directive() != nullptr)
+      return MachineJumpTableInfo::EK_GPRel64BlockAddress;
+
+    // Otherwise, use a label difference.
+    return MachineJumpTableInfo::EK_LabelDifference64;
+  } else {
     // In PIC mode, if the target supports a GPRel32 directive, use it.
-    if (getTargetMachine().getMCAsmInfo()->getGPRel32Directive() != nullptr)
+    if (TM.getMCAsmInfo()->getGPRel32Directive() != nullptr)
       return MachineJumpTableInfo::EK_GPRel32BlockAddress;

     // Otherwise, use a label difference.
     return MachineJumpTableInfo::EK_LabelDifference32;
   }
+}

 SDValue TargetLowering::getPICJumpTableRelocBase(SDValue Table,
                                                  SelectionDAG &DAG) const {
   // If our PIC model is GP relative, use the global offset table as the base.
   unsigned JTEncoding = getJumpTableEncoding();

   if ((JTEncoding == MachineJumpTableInfo::EK_GPRel64BlockAddress) ||
       (JTEncoding == MachineJumpTableInfo::EK_GPRel32BlockAddress))
[3,577 lines not shown]
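
As a reading aid, the selection logic that this diff adds to getJumpTableEncoding, together with the entry sizes from the MachineFunction.cpp change, can be modeled outside LLVM roughly as below. This is a standalone sketch: `TargetModel`, `pickEncoding` and `entrySize` are hypothetical names, the struct fields stand in for the TargetMachine and MCAsmInfo queries, and EK_Inline and EK_Custom32 are left out.

```cpp
#include <cassert>

// Simplified stand-in for MachineJumpTableInfo::JTEntryKind.
enum class EntryKind {
  BlockAddress,
  GPRel64BlockAddress,
  GPRel32BlockAddress,
  LabelDifference32,
  LabelDifference64
};

// Hypothetical bundle of the properties the patched function queries.
struct TargetModel {
  bool positionIndependent;
  unsigned pointerSize;  // in bytes
  bool largeCodeModel;
  bool hasGPRel32Directive;
  bool hasGPRel64Directive;
};

// Mirrors the control flow of the patched getJumpTableEncoding().
EntryKind pickEncoding(const TargetModel &t) {
  if (!t.positionIndependent)
    return EntryKind::BlockAddress;
  if (t.pointerSize == 8 && t.largeCodeModel) {
    if (t.hasGPRel64Directive)
      return EntryKind::GPRel64BlockAddress;
    return EntryKind::LabelDifference64;
  }
  if (t.hasGPRel32Directive)
    return EntryKind::GPRel32BlockAddress;
  return EntryKind::LabelDifference32;
}

// Mirrors the entry sizes in the patched getEntrySize().
unsigned entrySize(EntryKind kind, unsigned pointerSize) {
  switch (kind) {
  case EntryKind::BlockAddress:
    return pointerSize;
  case EntryKind::GPRel64BlockAddress:
  case EntryKind::LabelDifference64:
    return 8;
  case EntryKind::GPRel32BlockAddress:
  case EntryKind::LabelDifference32:
    return 4;
  }
  return 0; // not reached for valid enumerators
}
```

Under this model, a PIC target with 8-byte pointers, the large code model and no GPRel directives gets 8-byte label differences, which is what the `.xword` and `.quad` checks in the AArch64 and X86 tests expect; the same target with the small code model keeps 4-byte entries.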

test/CodeGen/AArch64/jumptable-large.ll

This file was added.

				; RUN: llc -O0 -relocation-model=pic -code-model=small -march=aarch64 %s -o - | FileCheck --check-prefix=CHECK --check-prefix=CHECK-SMALL %s
				; RUN: llc -O0 -relocation-model=pic -code-model=large -march=aarch64 %s -o - | FileCheck --check-prefix=CHECK --check-prefix=CHECK-LARGE %s

				define double @f(i64) {
				top:
				switch i64 %0, label %L19 [
				i64 0, label %if
				i64 1, label %if1
				i64 2, label %if2
				i64 3, label %if3
				]

				if: ; preds = %top
				%1 = call double @g1(double -1.000000e+00)
				ret double %1

				if1: ; preds = %top
				%2 = call double @g2(double 1.000000e+00)
				ret double %2

				if2: ; preds = %top
				%3 = call double @g3(double 1.000000e+00)
				ret double %3

				if3: ; preds = %top
				%4 = call double @g4(double 1.000000e+00)
				ret double %4

				L19: ; preds = %top
				%5 = call double @g5(double 1.000000e+00)
				ret double %5
				; CHECK-LABEL: .LJTI0_0:
				; CHECK-LARGE-NEXT: .xword .LBB{{.*}}-.LJTI0_0
				; CHECK-LARGE-NEXT: .xword .LBB{{.*}}-.LJTI0_0
				; CHECK-LARGE-NEXT: .xword .LBB{{.*}}-.LJTI0_0
				; CHECK-LARGE-NEXT: .xword .LBB{{.*}}-.LJTI0_0
				; CHECK-SMALL-NEXT: .word .LBB{{.*}}-.LJTI0_0
				; CHECK-SMALL-NEXT: .word .LBB{{.*}}-.LJTI0_0
				; CHECK-SMALL-NEXT: .word .LBB{{.*}}-.LJTI0_0
				; CHECK-SMALL-NEXT: .word .LBB{{.*}}-.LJTI0_0
				}

				declare double @g1(double)

				declare double @g2(double)

				declare double @g3(double)

				declare double @g4(double)

				declare double @g5(double)

test/CodeGen/PowerPC/jumptable-large.ll

This file was added.

				; RUN: llc -O0 -relocation-model=pic -code-model=small -march=ppc64 %s -o - | FileCheck --check-prefix=CHECK --check-prefix=CHECK-SMALL %s
				; RUN: llc -O0 -relocation-model=pic -code-model=large -march=ppc64 %s -o - | FileCheck --check-prefix=CHECK --check-prefix=CHECK-LARGE %s

				define double @f(i64) {
				top:
				switch i64 %0, label %L19 [
				i64 0, label %if
				i64 1, label %if1
				i64 2, label %if2
				i64 3, label %if3
				]

				if: ; preds = %top
				%1 = call double @g1(double -1.000000e+00)
				ret double %1

				if1: ; preds = %top
				%2 = call double @g2(double 1.000000e+00)
				ret double %2

				if2: ; preds = %top
				%3 = call double @g3(double 1.000000e+00)
				ret double %3

				if3: ; preds = %top
				%4 = call double @g4(double 1.000000e+00)
				ret double %4

				L19: ; preds = %top
				%5 = call double @g5(double 1.000000e+00)
				ret double %5
				; CHECK-LABEL: .LJTI0_0:
				; CHECK-LARGE-NEXT: .long .LBB{{.*}}-.L0$pb
				; CHECK-LARGE-NEXT: .long .LBB{{.*}}-.L0$pb
				; CHECK-LARGE-NEXT: .long .LBB{{.*}}-.L0$pb
				; CHECK-LARGE-NEXT: .long .LBB{{.*}}-.L0$pb
				; CHECK-SMALL-NEXT: .long .LBB{{.*}}-.LJTI0_0
				; CHECK-SMALL-NEXT: .long .LBB{{.*}}-.LJTI0_0
				; CHECK-SMALL-NEXT: .long .LBB{{.*}}-.LJTI0_0
				; CHECK-SMALL-NEXT: .long .LBB{{.*}}-.LJTI0_0
				}

				declare double @g1(double)

				declare double @g2(double)

				declare double @g3(double)

				declare double @g4(double)

				declare double @g5(double)

test/CodeGen/X86/jumptable-large.ll

This file was added.

				; RUN: llc -O0 -relocation-model=pic -code-model=small -march=x86-64 %s -o - | FileCheck --check-prefix=CHECK --check-prefix=CHECK-SMALL %s
				; RUN: llc -O0 -relocation-model=pic -code-model=large -march=x86-64 %s -o - | FileCheck --check-prefix=CHECK --check-prefix=CHECK-LARGE %s

				define double @f(i64) {
				top:
				switch i64 %0, label %L19 [
				i64 0, label %if
				i64 1, label %if1
				i64 2, label %if2
				i64 3, label %if3
				]

				if: ; preds = %top
				%1 = call double @g1(double -1.000000e+00)
				ret double %1

				if1: ; preds = %top
				%2 = call double @g2(double 1.000000e+00)
				ret double %2

				if2: ; preds = %top
				%3 = call double @g3(double 1.000000e+00)
				ret double %3

				if3: ; preds = %top
				%4 = call double @g4(double 1.000000e+00)
				ret double %4

				L19: ; preds = %top
				%5 = call double @g5(double 1.000000e+00)
				ret double %5
				; CHECK-LABEL: .LJTI0_0:
				; CHECK-LARGE-NEXT: .quad .LBB{{.*}}-.LJTI0_0
				; CHECK-LARGE-NEXT: .quad .LBB{{.*}}-.LJTI0_0
				; CHECK-LARGE-NEXT: .quad .LBB{{.*}}-.LJTI0_0
				; CHECK-LARGE-NEXT: .quad .LBB{{.*}}-.LJTI0_0
				; CHECK-SMALL-NEXT: .long .LBB{{.*}}-.LJTI0_0
				; CHECK-SMALL-NEXT: .long .LBB{{.*}}-.LJTI0_0
				; CHECK-SMALL-NEXT: .long .LBB{{.*}}-.LJTI0_0
				; CHECK-SMALL-NEXT: .long .LBB{{.*}}-.LJTI0_0
				}

				declare double @g1(double)

				declare double @g2(double)

				declare double @g3(double)

				declare double @g4(double)

				declare double @g5(double)
