This is an archive of the discontinued LLVM Phabricator instance.

This is just a demonstration of what I meant in the comments for llvm.org/PR38544: getting rid of explicit overlapping subregister indices solves the problem encountered in tc_subregliveness_noliveseg.ll.

I tried this and found that everything builds, SPEC is NFC and all tests pass, so this looks fine to me.

I suspect that the "reinterpret" subreg indices are just better names when used with the floating point registers, but we should wait for Uli's comment on this before we commit it.

Well, the original rationale for using different subreg indices for float/vector registers is given here:
http://lists.llvm.org/pipermail/llvm-commits/Week-of-Mon-20150504/274358.html

The main underlying issue is that the architecture states that writing to a floating-point register (which overlays the upper half of a vector register) may clobber the lower half of that vector register. I believe the current processor implementations do not actually do that, but it still would be preferable to model this correctly.

On the other hand, just using the different subreg index names doesn't actually do that, it's just a cosmetic marker. So I don't necessarily object to reverting that difference (as this patch would do). Just wondering if there's a better way to actually express the underlying issue.

This situation is the same as the case of RAX vs EAX on X86. EAX is the lower half of RAX, but modifying EAX does not preserve the upper bits of RAX. On the other hand, modifying AX (lower half of EAX) does preserve the upper half of EAX. Originally, the former behavior was modeled for both cases, i.e. overwriting a lone subregister would be considered as overwriting the entire register. I added phony registers to X86 (e.g. HAX covering the upper half of EAX) to model the latter behavior for EAX, EBX, etc.
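To make the two behaviors concrete, here is a minimal C++ sketch of the architectural write semantics described above (x86 in 64-bit mode). The helper names writeLow32 and writeLow16 are hypothetical and exist only for this illustration; they model what the hardware does to the containing register, not how LLVM represents it.

```cpp
#include <cassert>
#include <cstdint>

// Writing a 32-bit sub-register (e.g. EAX) in 64-bit mode zero-extends the
// value into the containing 64-bit register (RAX): the old upper half is lost.
constexpr uint64_t writeLow32(uint64_t /*OldFull*/, uint32_t Value) {
  return static_cast<uint64_t>(Value);
}

// Writing a 16-bit sub-register (e.g. AX) leaves the upper 16 bits of the
// containing 32-bit register (EAX) untouched.
constexpr uint32_t writeLow16(uint32_t OldFull, uint16_t Value) {
  return (OldFull & 0xFFFF0000u) | Value;
}

// The EAX write discards the old upper half of RAX.
static_assert(writeLow32(0xDEADBEEF00000000ull, 0x1234u) == 0x1234ull,
              "32-bit write clobbers the upper 32 bits");

// The AX write preserves the old upper half of EAX.
static_assert(writeLow16(0xDEADBEEFu, 0x1234) == 0xDEAD1234u,
              "16-bit write preserves the upper 16 bits");
```

The first case is the one where treating a sub-register write as a write of the whole register is accurate; the second is the one that needs the extra (phony) register covering the preserved half.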

In the case of VF128, if a register from that class has only one subregister, then both the register and the subregister will share the same register unit(s), which means that from the point of view of register aliasing, they are assumed to clobber each other without preserving any parts.

Running tblgen with -debug shows:

RC FP64Bit Units:
 F0S F1S F2S F3S F4S F5S F6S F7S F8S F9S F10S F11S F12S F13S F14S F15S
[...]
RC VF128Bit Units: 
 F0S F1S F2S F3S F4S F5S F6S F7S F8S F9S F10S F11S F12S F13S F14S F15S
[...]
RC FP128Bit Units:
 F0S F1S F2S F3S F4S F5S F6S F7S F8S F9S F10S F11S F12S F13S F14S F15S

These are unit sets for the entire class, but the fact that they are identical is a sign that the individual registers share units with their subregisters.
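As a rough illustration of why shared units imply mutual clobbering, the following C++ sketch treats two registers as aliasing whenever their unit sets intersect. The unit table is a toy model written for this example (F0S, F0D and V0 all mapped to the single unit F0S, as the discussion above suggests), and the names regsOverlap, exampleUnits and aliases are hypothetical rather than the generated SystemZ data.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <set>
#include <string>

using UnitSet = std::set<std::string>;

// Two registers are considered to alias when they share at least one unit.
inline bool regsOverlap(const UnitSet &A, const UnitSet &B) {
  return std::any_of(A.begin(), A.end(), [&B](const std::string &Unit) {
    return B.count(Unit) != 0;
  });
}

// Toy unit assignment (an assumption for this sketch): a register with a
// single subregister gets exactly the units of that subregister.
inline const std::map<std::string, UnitSet> &exampleUnits() {
  static const std::map<std::string, UnitSet> Units = {
      {"F0S", {"F0S"}}, {"F0D", {"F0S"}}, {"V0", {"F0S"}},
      {"F1S", {"F1S"}}, {"F1D", {"F1S"}},
  };
  return Units;
}

// Look up both registers in the toy table and test for a shared unit.
inline bool aliases(const std::string &RegA, const std::string &RegB) {
  const auto &Units = exampleUnits();
  return regsOverlap(Units.at(RegA), Units.at(RegB));
}
```

With this table, aliases("F0S", "V0") is true while aliases("F0D", "F1D") is false: there is no unit that belongs to V0 but not to F0S, so nothing records that part of V0 survives a write to F0S.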

Ah, that's good to know! So if I understand this correctly, accessing even the 32-bit part (F0S) would be considered to clobber F0D. This is not really necessary, but probably doesn't hurt at this point. We could do the same thing as the HAX you mention to get this modeled exactly.

In any case, then I agree that this patch is fine.

This revision is now accepted and ready to land. (Aug 15 2018, 7:58 AM)

Closed by commit rL339778: [SystemZ] Replace subreg_r with subreg_h (authored by kparzysz). (Aug 15 2018, 8:22 AM)

This revision was automatically updated to reflect the committed changes.

@kparzysz @uweigand

We're seeing tblgen warnings after this patch:

warning: SubRegIndex SystemZ::subreg_h64 and SystemZ::subreg_h32 compose ambiguously as SystemZ::subreg_hh32 or SystemZ::subreg_h32

Will investigate.

The problem is in how tblgen calculates sub-registers. For a given register R, the subreg indices of its sub-registers are added to R; in other words, a super-register "inherits" subreg indices from its subregisters. For example, V0 (from class VR128) has an explicit subreg index subreg_h64. It also has a sub-register F0D, which has a subreg index subreg_h32. As a result, V0 will have two subreg indices: subreg_h64 and subreg_h32. From the register structure (V0 vs F0D vs F0S) it is inferred that the composition V0.subreg_h64.subreg_h32 is the same as V0.subreg_h32, and in general that the composition subreg_h64.subreg_h32 is equivalent to subreg_h32. At the same time, repeating this logic for a register F0Q from class FPR128 leads to adding subreg_h32 twice, which tblgen tries to resolve by referring to user-defined compositions. This is how subreg_hh32 shows up. However, now there is another result for composing subreg_h64.subreg_h32, hence the warning.

This warning didn't exist back when subreg_r was introduced, but that's because VR128 didn't exist back then. If subreg_r had not been added, adding VR128 would have caused the warning to appear.

That warning can be eliminated by resolving such conflicts in favor of user-defined compositions.
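The conflict and the proposed resolution can be sketched in C++ as a toy composition table. This is only an illustration of the idea in the comment above, not TableGen's actual algorithm or data structures; ComposeTable, Composition and systemZExample are hypothetical names, and the two entries simply restate the two results named in the warning.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <string>
#include <utility>
#include <vector>

struct Composition {
  std::string Result;
  bool UserDefined; // true if it comes from an explicit ComposedSubRegIndex
};

class ComposeTable {
public:
  void add(const std::string &A, const std::string &B,
           const std::string &Result, bool UserDefined) {
    Entries[std::make_pair(A, B)].push_back({Result, UserDefined});
  }

  // A pair composes ambiguously if more than one distinct result is recorded.
  bool isAmbiguous(const std::string &A, const std::string &B) const {
    auto It = Entries.find(std::make_pair(A, B));
    if (It == Entries.end())
      return false;
    std::set<std::string> Distinct;
    for (const Composition &C : It->second)
      Distinct.insert(C.Result);
    return Distinct.size() > 1;
  }

  // Prefer a unique user-defined result; otherwise accept a unique result of
  // any kind. Returns an empty string when nothing can be chosen.
  std::string resolve(const std::string &A, const std::string &B) const {
    auto It = Entries.find(std::make_pair(A, B));
    if (It == Entries.end())
      return "";
    std::set<std::string> User, All;
    for (const Composition &C : It->second) {
      All.insert(C.Result);
      if (C.UserDefined)
        User.insert(C.Result);
    }
    if (User.size() == 1)
      return *User.begin();
    if (All.size() == 1)
      return *All.begin();
    return "";
  }

private:
  std::map<std::pair<std::string, std::string>, std::vector<Composition>>
      Entries;
};

// The two results from the warning: one inferred from the register
// structure, one from the explicit subreg_hh32 definition.
inline ComposeTable systemZExample() {
  ComposeTable T;
  T.add("subreg_h64", "subreg_h32", "subreg_h32", /*UserDefined=*/false);
  T.add("subreg_h64", "subreg_h32", "subreg_hh32", /*UserDefined=*/true);
  return T;
}
```

In this model systemZExample().isAmbiguous("subreg_h64", "subreg_h32") is true, and resolve picks subreg_hh32 because it is the only user-defined candidate.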

In D50725#1202790, @kparzysz wrote:

That warning can be eliminated by resolving such conflicts in favor of user-defined compositions.

Are you or someone else working on such a patch?

I just don't want to have warnings sit and persist here.

Yes, I do have one. I'll post it on Monday.

kparzysz mentioned this in D50977: [TableGen] Examine entire subreg compositions to detect ambiguity. (Aug 20 2018, 10:14 AM)

In D50725#1204855, @kparzysz wrote:

Yes, I do have one. I'll post it on Monday.

I'm still seeing this warning as of 11/3/2018:

warning: SubRegIndex SystemZ::subreg_h64 and SystemZ::subreg_h32 compose ambiguously as SystemZ::subreg_hh32 or SystemZ::subreg_h32

When can we expect a resolution?

We're all still waiting on a fix here....

If Krzysztof can't address this, maybe the code owner for SystemZ (CC-ed I
believe) can take a look?

In D50725#1303747, @chandlerc wrote:

We're all still waiting on a fix here....

If Krzysztof can't address this, maybe the code owner for SystemZ (CC-ed I
believe) can take a look?

Here's the scoop:

1. Without this patch (the one that causes the warning to appear), TableGen generated incorrect lane masks. With this patch it doesn't, but we get a warning.
2. The situation indicated by the warning is harmless. D50977 was an attempt to silence this case, but the patch caused some concerns.
3. The best solution would be to fix TableGen, but the root of the problem seems to be in some mismatch between the way that TableGen generates subregister indices and the assumptions it makes when it calculates lane masks. IIRC, in this case there are two ways to address a subregister (that maps to the same physical register in the end) with two corresponding (distinct) lane masks.

I went for fixing (2) because I don't have a clear understanding on how to fix (3). I think the problem has to do with how TableGen generates subregister indexes, and any change there could affect a lot of target and codegen code.

At the moment I'm stuck between a simple silencing of the warning and investigating the root of the problem for which I don't really have enough time at the moment.

kparzysz mentioned this in rL347894: [TableGen] Examine entire subreg compositions to detect ambiguity. (Nov 29 2018, 10:23 AM)

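Revision Contents

Path (Size)
llvm/trunk/lib/Target/SystemZ/SystemZISelLowering.cpp (4 lines)
llvm/trunk/lib/Target/SystemZ/SystemZInstrFP.td (10 lines)
llvm/trunk/lib/Target/SystemZ/SystemZInstrInfo.cpp (8 lines)
llvm/trunk/lib/Target/SystemZ/SystemZInstrVector.td (16 lines)
llvm/trunk/lib/Target/SystemZ/SystemZRegisterInfo.td (7 lines)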

Diff 160814

llvm/trunk/lib/Target/SystemZ/SystemZISelLowering.cpp

@@ ... @@ if (Subtarget.hasHighWord()) {
       In64 = DAG.getTargetInsertSubreg(SystemZ::subreg_h32, DL,
                                        MVT::i64, SDValue(U64, 0), In);
     } else {
       In64 = DAG.getNode(ISD::ANY_EXTEND, DL, MVT::i64, In);
       In64 = DAG.getNode(ISD::SHL, DL, MVT::i64, In64,
                          DAG.getConstant(32, DL, MVT::i64));
     }
     SDValue Out64 = DAG.getNode(ISD::BITCAST, DL, MVT::f64, In64);
-    return DAG.getTargetExtractSubreg(SystemZ::subreg_r32,
+    return DAG.getTargetExtractSubreg(SystemZ::subreg_h32,
                                       DL, MVT::f32, Out64);
   }
   if (InVT == MVT::f32 && ResVT == MVT::i32) {
     SDNode *U64 = DAG.getMachineNode(TargetOpcode::IMPLICIT_DEF, DL, MVT::f64);
-    SDValue In64 = DAG.getTargetInsertSubreg(SystemZ::subreg_r32, DL,
+    SDValue In64 = DAG.getTargetInsertSubreg(SystemZ::subreg_h32, DL,
                                              MVT::f64, SDValue(U64, 0), In);
     SDValue Out64 = DAG.getNode(ISD::BITCAST, DL, MVT::i64, In64);
     if (Subtarget.hasHighWord())
       return DAG.getTargetExtractSubreg(SystemZ::subreg_h32, DL,
                                         MVT::i32, Out64);
     SDValue Shift = DAG.getNode(ISD::SRL, DL, MVT::i64, Out64,
                                 DAG.getConstant(32, DL, MVT::i64));
     return DAG.getNode(ISD::TRUNCATE, DL, MVT::i32, Shift);

llvm/trunk/lib/Target/SystemZ/SystemZInstrFP.td

@@ ... @@
 }

 // The sign of an FP128 is in the high register.
 let Predicates = [FeatureNoVectorEnhancements1] in
 def : Pat<(fcopysign FP32:$src1, (f32 (fpround (f128 FP128:$src2)))),
           (CPSDRsd FP32:$src1, (EXTRACT_SUBREG FP128:$src2, subreg_h64))>;
 let Predicates = [FeatureVectorEnhancements1] in
 def : Pat<(fcopysign FP32:$src1, (f32 (fpround (f128 VR128:$src2)))),
-          (CPSDRsd FP32:$src1, (EXTRACT_SUBREG VR128:$src2, subreg_r64))>;
+          (CPSDRsd FP32:$src1, (EXTRACT_SUBREG VR128:$src2, subreg_h64))>;

 // fcopysign with an FP64 result.
 let isCodeGenOnly = 1 in
 def CPSDRds : BinaryRRFb<"cpsdr", 0xB372, fcopysign, FP64, FP64, FP32>;
 def CPSDRdd : BinaryRRFb<"cpsdr", 0xB372, fcopysign, FP64, FP64, FP64>;

 // The sign of an FP128 is in the high register.
 let Predicates = [FeatureNoVectorEnhancements1] in
 def : Pat<(fcopysign FP64:$src1, (f64 (fpround (f128 FP128:$src2)))),
           (CPSDRdd FP64:$src1, (EXTRACT_SUBREG FP128:$src2, subreg_h64))>;
 let Predicates = [FeatureVectorEnhancements1] in
 def : Pat<(fcopysign FP64:$src1, (f64 (fpround (f128 VR128:$src2)))),
-          (CPSDRdd FP64:$src1, (EXTRACT_SUBREG VR128:$src2, subreg_r64))>;
+          (CPSDRdd FP64:$src1, (EXTRACT_SUBREG VR128:$src2, subreg_h64))>;

 // fcopysign with an FP128 result. Use "upper" as the high half and leave
 // the low half as-is.
 class CopySign128<RegisterOperand cls, dag upper>
   : Pat<(fcopysign FP128:$src1, cls:$src2),
         (INSERT_SUBREG FP128:$src1, upper, subreg_h64)>;

 let Predicates = [FeatureNoVectorEnhancements1] in {
@@ ... @@ def LEDBRA : TernaryRRFe<"ledbra", 0xB344, FP32, FP64>,
              Requires<[FeatureFPExtension]>;
 def LEXBRA : TernaryRRFe<"lexbra", 0xB346, FP128, FP128>,
              Requires<[FeatureFPExtension]>;
 def LDXBRA : TernaryRRFe<"ldxbra", 0xB345, FP128, FP128>,
              Requires<[FeatureFPExtension]>;

 let Predicates = [FeatureNoVectorEnhancements1] in {
   def : Pat<(f32 (fpround FP128:$src)),
-            (EXTRACT_SUBREG (LEXBR FP128:$src), subreg_hr32)>;
+            (EXTRACT_SUBREG (LEXBR FP128:$src), subreg_hh32)>;
   def : Pat<(f64 (fpround FP128:$src)),
             (EXTRACT_SUBREG (LDXBR FP128:$src), subreg_h64)>;
 }

 // Extend register floating-point values to wider representations.
 def LDEBR : UnaryRRE<"ldebr", 0xB304, fpextend, FP64, FP32>;
 def LXEBR : UnaryRRE<"lxebr", 0xB306, null_frag, FP128, FP32>;
 def LXDBR : UnaryRRE<"lxdbr", 0xB305, null_frag, FP128, FP64>;
@@ ... @@
 }
 def MEEB : BinaryRXE<"meeb", 0xED17, fmul, FP32, load, 4>;
 def MDB : BinaryRXE<"mdb", 0xED1C, fmul, FP64, load, 8>;

 // f64 multiplication of two FP32 registers.
 def MDEBR : BinaryRRE<"mdebr", 0xB30C, null_frag, FP64, FP32>;
 def : Pat<(fmul (f64 (fpextend FP32:$src1)), (f64 (fpextend FP32:$src2))),
           (MDEBR (INSERT_SUBREG (f64 (IMPLICIT_DEF)),
-                                FP32:$src1, subreg_r32), FP32:$src2)>;
+                                FP32:$src1, subreg_h32), FP32:$src2)>;

 // f64 multiplication of an FP32 register and an f32 memory.
 def MDEB : BinaryRXE<"mdeb", 0xED0C, null_frag, FP64, load, 4>;
 def : Pat<(fmul (f64 (fpextend FP32:$src1)),
                 (f64 (extloadf32 bdxaddr12only:$addr))),
-          (MDEB (INSERT_SUBREG (f64 (IMPLICIT_DEF)), FP32:$src1, subreg_r32),
+          (MDEB (INSERT_SUBREG (f64 (IMPLICIT_DEF)), FP32:$src1, subreg_h32),
                 bdxaddr12only:$addr)>;

 // f128 multiplication of two FP64 registers.
 def MXDBR : BinaryRRE<"mxdbr", 0xB307, null_frag, FP128, FP64>;
 let Predicates = [FeatureNoVectorEnhancements1] in
 def : Pat<(fmul (f128 (fpextend FP64:$src1)), (f128 (fpextend FP64:$src2))),
           (MXDBR (INSERT_SUBREG (f128 (IMPLICIT_DEF)),
                                 FP64:$src1, subreg_h64), FP64:$src2)>;

llvm/trunk/lib/Target/SystemZ/SystemZInstrInfo.cpp

@@ ... @@ if (SystemZ::GRX32BitRegClass.contains(DestReg, SrcReg)) {
     return;
   }

   // Move 128-bit floating-point values between VR128 and FP128.
   if (SystemZ::VR128BitRegClass.contains(DestReg) &&
       SystemZ::FP128BitRegClass.contains(SrcReg)) {
     unsigned SrcRegHi =
       RI.getMatchingSuperReg(RI.getSubReg(SrcReg, SystemZ::subreg_h64),
-                             SystemZ::subreg_r64, &SystemZ::VR128BitRegClass);
+                             SystemZ::subreg_h64, &SystemZ::VR128BitRegClass);
     unsigned SrcRegLo =
       RI.getMatchingSuperReg(RI.getSubReg(SrcReg, SystemZ::subreg_l64),
-                             SystemZ::subreg_r64, &SystemZ::VR128BitRegClass);
+                             SystemZ::subreg_h64, &SystemZ::VR128BitRegClass);

     BuildMI(MBB, MBBI, DL, get(SystemZ::VMRHG), DestReg)
       .addReg(SrcRegHi, getKillRegState(KillSrc))
       .addReg(SrcRegLo, getKillRegState(KillSrc));
     return;
   }
   if (SystemZ::FP128BitRegClass.contains(DestReg) &&
       SystemZ::VR128BitRegClass.contains(SrcReg)) {
     unsigned DestRegHi =
       RI.getMatchingSuperReg(RI.getSubReg(DestReg, SystemZ::subreg_h64),
-                             SystemZ::subreg_r64, &SystemZ::VR128BitRegClass);
+                             SystemZ::subreg_h64, &SystemZ::VR128BitRegClass);
     unsigned DestRegLo =
       RI.getMatchingSuperReg(RI.getSubReg(DestReg, SystemZ::subreg_l64),
-                             SystemZ::subreg_r64, &SystemZ::VR128BitRegClass);
+                             SystemZ::subreg_h64, &SystemZ::VR128BitRegClass);

     if (DestRegHi != SrcReg)
       copyPhysReg(MBB, MBBI, DL, DestRegHi, SrcReg, false);
     BuildMI(MBB, MBBI, DL, get(SystemZ::VREPG), DestRegLo)
       .addReg(SrcReg, getKillRegState(KillSrc)).addImm(1);
     return;
   }


llvm/trunk/lib/Target/SystemZ/SystemZInstrVector.td

@@ ... @@
 multiclass ScalarToVectorFP<Instruction vrep, ValueType vt, RegisterOperand cls,
                             SubRegIndex subreg> {
   def : Pat<(vt (scalar_to_vector cls:$scalar)),
             (INSERT_SUBREG (vt (IMPLICIT_DEF)), cls:$scalar, subreg)>;
   def : Pat<(vt (z_replicate cls:$scalar)),
             (vrep (INSERT_SUBREG (vt (IMPLICIT_DEF)), cls:$scalar,
                                  subreg), 0)>;
 }
-defm : ScalarToVectorFP<VREPF, v4f32, FP32, subreg_r32>;
-defm : ScalarToVectorFP<VREPG, v2f64, FP64, subreg_r64>;
+defm : ScalarToVectorFP<VREPF, v4f32, FP32, subreg_h32>;
+defm : ScalarToVectorFP<VREPG, v2f64, FP64, subreg_h64>;

 // Match v2f64 insertions. The AddedComplexity counters the 3 added by
 // TableGen for the base register operand in VLVG-based integer insertions
 // and ensures that this version is strictly better.
 let AddedComplexity = 4 in {
   def : Pat<(z_vector_insert (v2f64 VR128:$vec), FP64:$elt, 0),
             (VPDI (INSERT_SUBREG (v2f64 (IMPLICIT_DEF)), FP64:$elt,
-                                 subreg_r64), VR128:$vec, 1)>;
+                                 subreg_h64), VR128:$vec, 1)>;
   def : Pat<(z_vector_insert (v2f64 VR128:$vec), FP64:$elt, 1),
             (VPDI VR128:$vec, (INSERT_SUBREG (v2f64 (IMPLICIT_DEF)), FP64:$elt,
-                                             subreg_r64), 0)>;
+                                             subreg_h64), 0)>;
 }

 // We extract floating-point element X by replicating (for elements other
 // than 0) and then taking a high subreg. The AddedComplexity counters the
 // 3 added by TableGen for the base register operand in VLGV-based integer
 // extractions and ensures that this version is strictly better.
 let AddedComplexity = 4 in {
   def : Pat<(f32 (z_vector_extract (v4f32 VR128:$vec), 0)),
-            (EXTRACT_SUBREG VR128:$vec, subreg_r32)>;
+            (EXTRACT_SUBREG VR128:$vec, subreg_h32)>;
   def : Pat<(f32 (z_vector_extract (v4f32 VR128:$vec), imm32zx2:$index)),
-            (EXTRACT_SUBREG (VREPF VR128:$vec, imm32zx2:$index), subreg_r32)>;
+            (EXTRACT_SUBREG (VREPF VR128:$vec, imm32zx2:$index), subreg_h32)>;

   def : Pat<(f64 (z_vector_extract (v2f64 VR128:$vec), 0)),
-            (EXTRACT_SUBREG VR128:$vec, subreg_r64)>;
+            (EXTRACT_SUBREG VR128:$vec, subreg_h64)>;
   def : Pat<(f64 (z_vector_extract (v2f64 VR128:$vec), imm32zx1:$index)),
-            (EXTRACT_SUBREG (VREPG VR128:$vec, imm32zx1:$index), subreg_r64)>;
+            (EXTRACT_SUBREG (VREPG VR128:$vec, imm32zx1:$index), subreg_h64)>;
 }

 //===----------------------------------------------------------------------===//
 // Support for 128-bit floating-point values in vector registers
 //===----------------------------------------------------------------------===//

 let Predicates = [FeatureVectorEnhancements1] in {
   def : Pat<(f128 (load bdxaddr12only:$addr)),

llvm/trunk/lib/Target/SystemZ/SystemZRegisterInfo.td

@@ ... @@ class SystemZRegWithSubregs<string n, list<Register> subregs>
   let Namespace = "SystemZ";
 }

 let Namespace = "SystemZ" in {
 def subreg_l32 : SubRegIndex<32, 0>; // Also acts as subreg_ll32.
 def subreg_h32 : SubRegIndex<32, 32>; // Also acts as subreg_lh32.
 def subreg_l64 : SubRegIndex<64, 0>;
 def subreg_h64 : SubRegIndex<64, 64>;
-def subreg_r32 : SubRegIndex<32, 32>; // Reinterpret a wider reg as 32 bits.
-def subreg_r64 : SubRegIndex<64, 64>; // Reinterpret a wider reg as 64 bits.
 def subreg_hh32 : ComposedSubRegIndex<subreg_h64, subreg_h32>;
 def subreg_hl32 : ComposedSubRegIndex<subreg_h64, subreg_l32>;
-def subreg_hr32 : ComposedSubRegIndex<subreg_h64, subreg_r32>;
 }

 // Define a register class that contains values of types TYPES and an
 // associated operand called NAME. SIZE is the size and alignment
 // of the registers and REGLIST is the list of individual registers.
 multiclass SystemZRegClass<string name, list<ValueType> types, int size,
                            dag regList, bit allocatable = 1> {
   def AsmOperand : AsmOperandClass {
@@ ... @@
 class FPR32<bits<16> num, string n> : SystemZReg<n> {
   let HWEncoding = num;
 }

 // One of the floating-point registers.
 class FPR64<bits<16> num, string n, FPR32 high>
   : SystemZRegWithSubregs<n, [high]> {
   let HWEncoding = num;
-  let SubRegIndices = [subreg_r32];
+  let SubRegIndices = [subreg_h32];
 }

 // 8 pairs of FPR64s, with a one-register gap inbetween.
 class FPR128<bits<16> num, string n, FPR64 low, FPR64 high>
   : SystemZRegWithSubregs<n, [low, high]> {
   let HWEncoding = num;
   let SubRegIndices = [subreg_l64, subreg_h64];
   let CoveredBySubRegs = 1;
@@ ... @@
 //===----------------------------------------------------------------------===//
 // Vector registers
 //===----------------------------------------------------------------------===//

 // A full 128-bit vector register, with an FPR64 as its high part.
 class VR128<bits<16> num, string n, FPR64 high>
   : SystemZRegWithSubregs<n, [high]> {
   let HWEncoding = num;
-  let SubRegIndices = [subreg_r64];
+  let SubRegIndices = [subreg_h64];
 }

 // Full vector registers.
 foreach I = 0-31 in {
   def V#I : VR128<I, "v"#I, !cast<FPR64>("F"#I#"D")>,
             DwarfRegNum<[!cast<DwarfMapping>("F"#I#"Dwarf").Id]>;
 }



[SystemZ] Replace subreg_r with subreg_h (Closed, Public)
