
Details

Reviewers

tstellarAMD
arsenm

Commits

rG07e03712d3cd: AMDGPU : Add intrinsics for compare with the full wavefront result
rL276998: AMDGPU : Add intrinsics for compare with the full wavefront result

Summary

Add an LLVM intrinsic / Clang Builtin to expose the v_cmp_ne_i32 instruction.
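
As a rough illustration of what a "compare with the full wavefront result" means, the sketch below models it on the host in plain C++: one result bit per lane instead of a single boolean. It assumes a wavefront of at most 64 lanes, all active; `kWaveSize` and `waveCompareNe` are hypothetical names for this sketch, not LLVM or hardware APIs.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Assumed wavefront width for this sketch (matches the 64-bit result).
constexpr std::size_t kWaveSize = 64;

// Hypothetical host-side model of a per-lane "not equal" compare.
// Bit i of the returned mask is set when lane i's two inputs differ.
inline std::uint64_t waveCompareNe(const std::vector<std::int32_t> &a,
                                   const std::vector<std::int32_t> &b) {
  assert(a.size() == b.size() && a.size() <= kWaveSize);
  std::uint64_t mask = 0;
  for (std::size_t lane = 0; lane < a.size(); ++lane)
    if (a[lane] != b[lane])
      mask |= std::uint64_t{1} << lane;
  return mask;
}
```

A caller would fill both vectors with one value per lane and read the returned mask bit by bit.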

Diff Detail

Repository: rL LLVM

Event Timeline

wdng updated this revision to Diff 64393. Jul 18 2016, 2:57 PM

wdng retitled this revision to AMDGPU : Add an LLVM intrinsic / Clang Builtin to expose the v_cmp_ne_i32 instruction.

wdng updated this object.

wdng added reviewers: tstellarAMD, arsenm.

wdng set the repository for this revision to rL LLVM.

Herald added subscribers: kzhuravl, arsenm. Jul 18 2016, 2:57 PM

We should only do a general icmp, not an intrinsic for specific compare opcodes

This revision now requires changes to proceed. Jul 18 2016, 2:58 PM

In D22482#487548, @arsenm wrote:

We should only do a general icmp, not an intrinsic for specific compare opcodes

Do you mean an LLVM IR icmp instruction or a generic intrinsic that takes a condition code as its third input?

In D22482#488452, @tstellarAMD wrote:

In D22482#487548, @arsenm wrote:

We should only do a general icmp, not an intrinsic for specific compare opcodes

Do you mean an LLVM IR icmp instruction or a generic intrinsic that takes a condition code as its third input?

Yes

Use a general way to implement v_cmp_ne_i32.

Fixed a data type issue for fcmp intrinsic definition.

arsenm added inline comments. Jul 21 2016, 4:12 PM

include/llvm/IR/IntrinsicsAMDGPU.td
394 ↗	(On Diff #64938)	Remove the GCCBuiltins, they don't work with overloaded intrinsics
400 ↗	(On Diff #64938)	the 3rd parameter should be i32
lib/Target/AMDGPU/AMDGPUISelLowering.h
231 ↗	(On Diff #64938)	Needs a comment that this is setcc with the full mask result
lib/Target/AMDGPU/SIISelLowering.cpp
1655–1656 ↗	(On Diff #64938)	These should be put towards the end of the cases
1657 ↗	(On Diff #64938)	Variables should be capitalized and camel case. What happens if the cond code is out of range? There should probably be a clamp
1658 ↗	(On Diff #64938)	Extra spaces between type and name
1659 ↗	(On Diff #64938)	This looks like it goes over 80 characters
lib/Target/AMDGPU/SIInstructions.td
2365–2366 ↗	(On Diff #64938)	This can just be a class. You can also try adding the pattern dag to the v_cmp instruction definition patterns list (although I'm not 100% sure if the multiple patterns actually work). A multiclass might help if you don't want to repeat for i32/i64
2372–2373 ↗	(On Diff #64938)	All compare types should be defined. Additionally i64 and the FP ones are missing
test/CodeGen/AMDGPU/llvm.amdgcn.icmp.ne.ll
6–12 ↗	(On Diff #64938)	There should be a test for every condition code and i32/i64
8 ↗	(On Diff #64938)	nounwind should also be an attribute group
9 ↗	(On Diff #64938)	Call site doesn't need the attributes

tstellarAMD added inline comments. Jul 22 2016, 6:20 AM

include/llvm/IR/IntrinsicsAMDGPU.td
400 ↗	(On Diff #64938)	Also, should the return type be i64 instead of double?

Code changes based on Matt's comments.

include/llvm/IR/IntrinsicsAMDGPU.td
400 ↗	(On Diff #65358)	Yes, code has been changed accordingly. Thanks!

arsenm added inline comments. Jul 25 2016, 12:09 PM

lib/Target/AMDGPU/AMDGPUISelLowering.h
231 ↗	(On Diff #65358)	Should have a space after the //, and it should be capitalized and punctuated. It might be clearer to say it is a compare with a result bit per item in the wavefront; "mask result" sounds more ambiguous.
lib/Target/AMDGPU/SIISelLowering.cpp
1909–1910 ↗	(On Diff #65358)	You should do the range check before the static_cast since I think it is undefined behavior to have an out of bounds enum value inserted. This also won't work for fcmp, each should be handled in its own case with its own range check for the specific compare types' range
lib/Target/AMDGPU/SIInstructions.td
2374–2375 ↗	(On Diff #65358)	The unsigned should use the _U32 compare
2383–2384 ↗	(On Diff #65358)	Ditto
2385–2386 ↗	(On Diff #65358)	Ditto
2399–2411 ↗	(On Diff #65358)	This also needs to be done for the unordered compares
test/CodeGen/AMDGPU/llvm.amdgcn.fcmp.ll
102–104 ↗	(On Diff #65358)	Missing unordered compares
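
The "range check before the cast, one range per compare kind" advice can be sketched in isolation as follows. This is a minimal standalone model, not the LLVM code: `IntPred`, `FloatPred`, `decodeIntPred` and `decodeFloatPred` are hypothetical stand-ins, and the numeric values simply mirror the condition codes passed in this revision's tests (icmp eq = 32 through sle = 41, fcmp oeq = 1 through une = 14).

```cpp
#include <cassert>

// Stand-in enums; values mirror the condition codes used by the tests.
enum class IntPred : int { EQ = 32, NE, UGT, UGE, ULT, ULE, SGT, SGE, SLT, SLE };
enum class FloatPred : int {
  OEQ = 1, OGT, OGE, OLT, OLE, ONE, ORD, UNO, UEQ, UGT, UGE, ULT, ULE, UNE
};

// Validate the raw integer first; only convert once it is known to be valid.
inline bool decodeIntPred(int raw, IntPred &out) {
  if (raw < static_cast<int>(IntPred::EQ) ||
      raw > static_cast<int>(IntPred::SLE))
    return false;
  out = static_cast<IntPred>(raw);
  return true;
}

// Separate check for the floating-point range, as each kind has its own.
inline bool decodeFloatPred(int raw, FloatPred &out) {
  if (raw < static_cast<int>(FloatPred::OEQ) ||
      raw > static_cast<int>(FloatPred::UNE))
    return false;
  out = static_cast<FloatPred>(raw);
  return true;
}

// Usage: 30 and -1 are the invalid codes exercised by the tests.
inline void checkDecodeExamples() {
  IntPred ip = IntPred::EQ;
  FloatPred fp = FloatPred::OEQ;
  assert(decodeIntPred(32, ip) && ip == IntPred::EQ);
  assert(!decodeIntPred(30, ip));
  assert(!decodeFloatPred(-1, fp));
}
```

Returning `false` here plays the role that returning undef plays in the lowering code; the caller decides what to do with a rejected code.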

Changes based on Matt's comments.

arsenm added inline comments. Jul 25 2016, 3:14 PM

include/llvm/IR/IntrinsicsAMDGPU.td
392 ↗	(On Diff #65427)	You should remove this comment
397 ↗	(On Diff #65427)	And this one
lib/Target/AMDGPU/SIISelLowering.cpp
1907–1909 ↗	(On Diff #65427)	Instead of an assert, how about returning undef? This should also have a test. The same applies if the operand isn't really constant; you'll need to do the dyn_cast yourself.
1918–1922 ↗	(On Diff #65427)	Should refer to FCmpInst
lib/Target/AMDGPU/SIInstructions.td
2413–2425 ↗	(On Diff #65427)	These are not the correct unordered comparison instructions, refer to the existing set of fcmp patterns for which to use
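
The two failure cases raised above (operand is not a constant, or the constant is out of range) can be handled the same way: check, then bail out instead of asserting. The sketch below is a standalone model under stated assumptions: `Node`, `ConstantNode`, `RegisterNode` and `readCondCode` are hypothetical, and standard `dynamic_cast` stands in for LLVM's `dyn_cast`, whose result must likewise be checked for null before use.

```cpp
#include <cassert>

// Hypothetical operand hierarchy for this sketch only.
struct Node {
  virtual ~Node() = default;
};
struct ConstantNode : Node {
  long long value;
  explicit ConstantNode(long long v) : value(v) {}
};
struct RegisterNode : Node {};

// Returns true and writes the code only if the operand is a constant whose
// value lies in [lo, hi]. A false return stands in for "produce undef".
inline bool readCondCode(const Node &operand, int lo, int hi, int &out) {
  const auto *c = dynamic_cast<const ConstantNode *>(&operand);
  if (!c)
    return false; // Not a constant: never dereference a failed cast.
  if (c->value < lo || c->value > hi)
    return false; // Constant, but not a valid condition code.
  out = static_cast<int>(c->value);
  return true;
}

// Usage with the icmp range used by the tests (32 through 41).
inline void checkReadCondCode() {
  int code = 0;
  assert(readCondCode(ConstantNode(33), 32, 41, code) && code == 33);
  assert(!readCondCode(RegisterNode(), 32, 41, code));
}
```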

tstellarAMD added inline comments. Jul 25 2016, 3:15 PM

lib/Target/AMDGPU/SIInstructions.td
2413–2425 ↗	(On Diff #65427)	Unordered compares should select the V_CMP_N* instructions. Take a look at the instruction definitions to see which condition matches to which instruction.
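
The reason unordered compares map onto the negated (V_CMP_N*) forms can be checked on the host: an ordered compare is false whenever an input is NaN, so negating the complementary ordered compare yields "unordered or ...". The sketch below shows this for "unordered or greater than" versus "not less-or-equal"; `fcmpUgt` and `notLe` are illustrative helper names, and the code assumes IEEE NaN semantics (no fast-math).

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// "Unordered or greater than": true if either input is NaN, or a > b.
inline bool fcmpUgt(float a, float b) {
  return std::isnan(a) || std::isnan(b) || a > b;
}

// The same predicate written as "not (a <= b)". The ordered <= is false
// when an input is NaN, so its negation is true in that case.
inline bool notLe(float a, float b) { return !(a <= b); }

// Compare the two formulations over a few values, including NaN.
inline void checkUnorderedEquivalence() {
  const float nan = std::numeric_limits<float>::quiet_NaN();
  const float values[] = {nan, -1.0f, 0.0f, 2.5f};
  for (float a : values)
    for (float b : values)
      assert(fcmpUgt(a, b) == notLe(a, b));
}
```

The same argument pairs uge with "not less than", ult with "not greater-or-equal", and ule with "not greater than", which is the pairing used by the committed patterns.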

arsenm added inline comments. Jul 25 2016, 3:18 PM

test/CodeGen/AMDGPU/llvm.amdgcn.fcmp.ll
199 ↗	(On Diff #65427)	You should also add a test that uses fabs on the inputs to make sure that source modifiers are folded

Added dyn_cast for type conversion.
Fixed use of the wrong FCmpInst type.
Fixed incorrect use of unordered instructions.
Added fabs as input for fcmp test.

Upload correct diff with cached LIT tests update.

Upload correct diff file.

The title of the commit is also inaccurate; it should be "intrinsics for compare with the full wavefront result".

test/CodeGen/AMDGPU/llvm.amdgcn.fcmp.ll
200 ↗	(On Diff #65538)	Still missing these tests

wdng added inline comments. Jul 26 2016, 4:13 PM

test/CodeGen/AMDGPU/llvm.amdgcn.fcmp.ll
200 ↗	(On Diff #65538)	I have just created one, "define void @v_fcmp_f32_oeq_with_fabs(i64 addrspace(1)* %out, float %src, float %a) #1", and put it at the top of the tests. Should I write fabs tests for all fcmp comparisons?

wdng retitled this revision from AMDGPU : Add an LLVM intrinsic / Clang Builtin to expose the v_cmp_ne_i32 instruction. to AMDGPU : Add intrinsics for compare with the full wavefront result, such as v_cmp_ne_i32, etc... Jul 26 2016, 4:14 PM

arsenm added inline comments. Jul 26 2016, 4:21 PM

test/CodeGen/AMDGPU/llvm.amdgcn.fcmp.ll
6 ↗	(On Diff #65538)	Use the attribute group
10–15 ↗	(On Diff #65538)	The call site does not need the attribute specified. Can you also test the other operand? The check line should check the actual operands, this currently does not actually check much

Modified one LIT test to check operands.

Fixed corrupted diff patch.

arsenm added inline comments. Jul 26 2016, 6:09 PM

lib/Target/AMDGPU/SIISelLowering.cpp
1907 ↗	(On Diff #65632)	Space before =
1921 ↗	(On Diff #65632)	Ditto
test/CodeGen/AMDGPU/llvm.amdgcn.fcmp.ll
7 ↗	(On Diff #65632)	Missing test for invalid condition code value
9 ↗	(On Diff #65632)	You can move the |s outside of the regex and then you don't have to escape them
11 ↗	(On Diff #65632)	Don't need call site attributes
11–16 ↗	(On Diff #65632)	Still should test that both operands can have the source modifiers folded
29 ↗	(On Diff #65632)	It doesn't really matter, but there's no reason this test needs to under-align the stores. Fix these to be align 8 or remove the aligns.
test/CodeGen/AMDGPU/llvm.amdgcn.icmp.ll
6 ↗	(On Diff #65632)	Missing test for invalid condition code value

Add illegal cond code LIT tests.

arsenm added inline comments. Jul 27 2016, 12:01 PM

lib/Target/AMDGPU/SIInstructions.td
2391–2392 ↗	(On Diff #65761)	Spaces before the types and the next (
test/CodeGen/AMDGPU/llvm.amdgcn.fcmp.ll
28 ↗	(On Diff #65761)	You should drop the suffix here to strengthen the test. It would be best to reduce to just v_cmp because something could commute the instruction
test/CodeGen/AMDGPU/llvm.amdgcn.icmp.ll
16 ↗	(On Diff #65761)	Ditto

Rename LIT test function names.
Add space between type and parenthesis.

LGTM, you can drop the "such as" part from your commit message

This revision is now accepted and ready to land. Jul 27 2016, 4:00 PM

Closed by commit rL276998: AMDGPU : Add intrinsics for compare with the full wavefront result (authored by wdng). Jul 28 2016, 9:50 AM

This revision was automatically updated to reflect the committed changes.

Diff 65953

llvm/trunk/include/llvm/IR/IntrinsicsAMDGPU.td

[...] def int_amdgcn_ds_swizzle :
GCCBuiltin<"__builtin_amdgcn_ds_swizzle">,
Intrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty], [IntrNoMem, IntrConvergent]>;

// llvm.amdgcn.lerp
def int_amdgcn_lerp :
GCCBuiltin<"__builtin_amdgcn_lerp">,
Intrinsic<[llvm_i32_ty], [llvm_i32_ty, llvm_i32_ty, llvm_i32_ty], [IntrNoMem]>;

		def int_amdgcn_icmp :
		Intrinsic<[llvm_i64_ty], [llvm_anyint_ty, LLVMMatchType<0>, llvm_i32_ty],
		[IntrNoMem, IntrConvergent]>;

		def int_amdgcn_fcmp :
		Intrinsic<[llvm_i64_ty], [llvm_anyfloat_ty, LLVMMatchType<0>, llvm_i32_ty],
		[IntrNoMem, IntrConvergent]>;

//===----------------------------------------------------------------------===//
// CI+ Intrinsics
//===----------------------------------------------------------------------===//

def int_amdgcn_s_dcache_inv_vol :
GCCBuiltin<"__builtin_amdgcn_s_dcache_inv_vol">,
Intrinsic<[], [], []>;

[...]

llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.h

[...] enum NodeType : unsigned {
UMUL, // 32bit unsigned multiplication
BRANCH_COND,
// End AMDIL ISD Opcodes
ENDPGM,
RETURN,
DWORDADDR,
FRACT,
CLAMP,
		// This is SETCC with the full mask result which is used for a compare with a
		// result bit per item in the wavefront.
		SETCC,

// SIN_HW, COS_HW - f32 for SI, 1 ULP max error, valid from -100 pi to 100 pi.
// Denormals handled on some parts.
COS_HW,
SIN_HW,
FMAX_LEGACY,
FMIN_LEGACY,
FMAX3,
[...]

llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.cpp

[...] const char* AMDGPUTargetLowering::getTargetNodeName(unsigned Opcode) const {
NODE_NAME_CASE(UMUL);
NODE_NAME_CASE(BRANCH_COND);

// AMDGPU DAG nodes
NODE_NAME_CASE(ENDPGM)
NODE_NAME_CASE(RETURN)
NODE_NAME_CASE(DWORDADDR)
NODE_NAME_CASE(FRACT)
		NODE_NAME_CASE(SETCC)
NODE_NAME_CASE(CLAMP)
NODE_NAME_CASE(COS_HW)
NODE_NAME_CASE(SIN_HW)
NODE_NAME_CASE(FMAX_LEGACY)
NODE_NAME_CASE(FMIN_LEGACY)
NODE_NAME_CASE(FMAX3)
NODE_NAME_CASE(SMAX3)
NODE_NAME_CASE(UMAX3)
[...]

llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.td

[...]
>;

// out = (src0 + src1 > 0xFFFFFFFF) ? 1 : 0
def AMDGPUcarry : SDNode<"AMDGPUISD::CARRY", SDTIntBinOp, []>;

// out = (src1 > src0) ? 1 : 0
def AMDGPUborrow : SDNode<"AMDGPUISD::BORROW", SDTIntBinOp, []>;

				def AMDGPUSetCCOp : SDTypeProfile<1, 3, [ // setcc
				SDTCisVT<0, i64>, SDTCisSameAs<1, 2>, SDTCisVT<3, OtherVT>
				]>;

				def AMDGPUsetcc : SDNode<"AMDGPUISD::SETCC", AMDGPUSetCCOp>;

def AMDGPUcvt_f32_ubyte0 : SDNode<"AMDGPUISD::CVT_F32_UBYTE0",
SDTIntToFPOp, []>;
def AMDGPUcvt_f32_ubyte1 : SDNode<"AMDGPUISD::CVT_F32_UBYTE1",
SDTIntToFPOp, []>;
def AMDGPUcvt_f32_ubyte2 : SDNode<"AMDGPUISD::CVT_F32_UBYTE2",
SDTIntToFPOp, []>;
def AMDGPUcvt_f32_ubyte3 : SDNode<"AMDGPUISD::CVT_F32_UBYTE3",
[...]

llvm/trunk/lib/Target/AMDGPU/SIISelLowering.cpp

[...]
#include "SIMachineFunctionInfo.h"
#include "SIRegisterInfo.h"
#include "llvm/ADT/BitVector.h"
#include "llvm/ADT/StringSwitch.h"
#include "llvm/CodeGen/CallingConvLower.h"
#include "llvm/CodeGen/MachineInstrBuilder.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"
#include "llvm/CodeGen/SelectionDAG.h"
		#include "llvm/CodeGen/Analysis.h"
#include "llvm/IR/DiagnosticInfo.h"
#include "llvm/IR/Function.h"

using namespace llvm;

// -amdgpu-fast-fdiv - Command line option to enable faster 2.5 ulp fdiv.
static cl::opt<bool> EnableAMDGPUFastFDIV(
"amdgpu-fast-fdiv",
[...] case Intrinsic::amdgcn_div_scale: {
// intrinsic has the numerator as the first operand to match a normal
// division operation.

SDValue Src0 = Param->isAllOnesValue() ? Numerator : Denominator;

return DAG.getNode(AMDGPUISD::DIV_SCALE, DL, Op->getVTList(), Src0,
Denominator, Numerator);
}
		case Intrinsic::amdgcn_icmp: {
		const auto *CD = dyn_cast<ConstantSDNode>(Op.getOperand(3));
		int CondCode = CD->getSExtValue();

		if (CondCode < ICmpInst::Predicate::FIRST_ICMP_PREDICATE ||
		CondCode >= ICmpInst::Predicate::BAD_ICMP_PREDICATE)
		return DAG.getUNDEF(VT);

		ICmpInst::Predicate IcInput =
		static_cast<ICmpInst::Predicate>(CondCode);
		ISD::CondCode CCOpcode = getICmpCondCode(IcInput);
		return DAG.getNode(AMDGPUISD::SETCC, DL, VT, Op.getOperand(1),
		Op.getOperand(2), DAG.getCondCode(CCOpcode));
		}
		case Intrinsic::amdgcn_fcmp: {
		const auto *CD = dyn_cast<ConstantSDNode>(Op.getOperand(3));
		int CondCode = CD->getSExtValue();

		if (CondCode <= FCmpInst::Predicate::FCMP_FALSE ||
		CondCode >= FCmpInst::Predicate::FCMP_TRUE)
		return DAG.getUNDEF(VT);

		FCmpInst::Predicate IcInput =
		static_cast<FCmpInst::Predicate>(CondCode);
		ISD::CondCode CCOpcode = getFCmpCondCode(IcInput);
		return DAG.getNode(AMDGPUISD::SETCC, DL, VT, Op.getOperand(1),
		Op.getOperand(2), DAG.getCondCode(CCOpcode));
		}
case Intrinsic::amdgcn_fmul_legacy:
return DAG.getNode(AMDGPUISD::FMUL_LEGACY, DL, VT,
Op.getOperand(1), Op.getOperand(2));
case Intrinsic::amdgcn_sffbh:
case AMDGPUIntrinsic::AMDGPU_flbit_i32: // Legacy name.
return DAG.getNode(AMDGPUISD::FFBH_I32, DL, VT, Op.getOperand(1));
default:
return AMDGPUTargetLowering::LowerOperation(Op, DAG);
[...]

llvm/trunk/lib/Target/AMDGPU/SIInstructions.td

[...]
// DS_SWIZZLE Intrinsic Pattern.
//===----------------------------------------------------------------------===//
def : Pat <
(int_amdgcn_ds_swizzle i32:$src, imm:$offset16),
(DS_SWIZZLE_B32 $src, (as_i16imm $offset16), (i1 0))
>;

//===----------------------------------------------------------------------===//
				// V_ICMPIntrinsic Pattern.
				//===----------------------------------------------------------------------===//
				class ICMP_Pattern <PatLeaf cond, Instruction inst, ValueType vt> : Pat <
				(AMDGPUsetcc vt:$src0, vt:$src1, cond),
				(inst $src0, $src1)
				>;

				def : ICMP_Pattern <COND_EQ, V_CMP_EQ_I32_e64, i32>;
				def : ICMP_Pattern <COND_NE, V_CMP_NE_I32_e64, i32>;
				def : ICMP_Pattern <COND_UGT, V_CMP_GT_U32_e64, i32>;
				def : ICMP_Pattern <COND_UGE, V_CMP_GE_U32_e64, i32>;
				def : ICMP_Pattern <COND_ULT, V_CMP_LT_U32_e64, i32>;
				def : ICMP_Pattern <COND_ULE, V_CMP_LE_U32_e64, i32>;
				def : ICMP_Pattern <COND_SGT, V_CMP_GT_I32_e64, i32>;
				def : ICMP_Pattern <COND_SGE, V_CMP_GE_I32_e64, i32>;
				def : ICMP_Pattern <COND_SLT, V_CMP_LT_I32_e64, i32>;
				def : ICMP_Pattern <COND_SLE, V_CMP_LE_I32_e64, i32>;

				def : ICMP_Pattern <COND_EQ, V_CMP_EQ_I64_e64, i64>;
				def : ICMP_Pattern <COND_NE, V_CMP_NE_I64_e64, i64>;
				def : ICMP_Pattern <COND_UGT, V_CMP_GT_U64_e64, i64>;
				def : ICMP_Pattern <COND_UGE, V_CMP_GE_U64_e64, i64>;
				def : ICMP_Pattern <COND_ULT, V_CMP_LT_U64_e64, i64>;
				def : ICMP_Pattern <COND_ULE, V_CMP_LE_U64_e64, i64>;
				def : ICMP_Pattern <COND_SGT, V_CMP_GT_I64_e64, i64>;
				def : ICMP_Pattern <COND_SGE, V_CMP_GE_I64_e64, i64>;
				def : ICMP_Pattern <COND_SLT, V_CMP_LT_I64_e64, i64>;
				def : ICMP_Pattern <COND_SLE, V_CMP_LE_I64_e64, i64>;

				class FCMP_Pattern <PatLeaf cond, Instruction inst, ValueType vt> : Pat <
				(i64 (AMDGPUsetcc (vt (VOP3Mods vt:$src0, i32:$src0_modifiers)),
				(vt (VOP3Mods vt:$src1, i32:$src1_modifiers)), cond)),
				(inst $src0_modifiers, $src0, $src1_modifiers, $src1,
				DSTCLAMP.NONE, DSTOMOD.NONE)
				>;

				def : FCMP_Pattern <COND_OEQ, V_CMP_EQ_F32_e64, f32>;
				def : FCMP_Pattern <COND_ONE, V_CMP_NEQ_F32_e64, f32>;
				def : FCMP_Pattern <COND_OGT, V_CMP_GT_F32_e64, f32>;
				def : FCMP_Pattern <COND_OGE, V_CMP_GE_F32_e64, f32>;
				def : FCMP_Pattern <COND_OLT, V_CMP_LT_F32_e64, f32>;
				def : FCMP_Pattern <COND_OLE, V_CMP_LE_F32_e64, f32>;

				def : FCMP_Pattern <COND_OEQ, V_CMP_EQ_F64_e64, f64>;
				def : FCMP_Pattern <COND_ONE, V_CMP_NEQ_F64_e64, f64>;
				def : FCMP_Pattern <COND_OGT, V_CMP_GT_F64_e64, f64>;
				def : FCMP_Pattern <COND_OGE, V_CMP_GE_F64_e64, f64>;
				def : FCMP_Pattern <COND_OLT, V_CMP_LT_F64_e64, f64>;
				def : FCMP_Pattern <COND_OLE, V_CMP_LE_F64_e64, f64>;

				def : FCMP_Pattern <COND_UEQ, V_CMP_NLG_F32_e64, f32>;
				def : FCMP_Pattern <COND_UNE, V_CMP_NEQ_F32_e64, f32>;
				def : FCMP_Pattern <COND_UGT, V_CMP_NLE_F32_e64, f32>;
				def : FCMP_Pattern <COND_UGE, V_CMP_NLT_F32_e64, f32>;
				def : FCMP_Pattern <COND_ULT, V_CMP_NGE_F32_e64, f32>;
				def : FCMP_Pattern <COND_ULE, V_CMP_NGT_F32_e64, f32>;

				def : FCMP_Pattern <COND_UEQ, V_CMP_NLG_F64_e64, f64>;
				def : FCMP_Pattern <COND_UNE, V_CMP_NEQ_F64_e64, f64>;
				def : FCMP_Pattern <COND_UGT, V_CMP_NLE_F64_e64, f64>;
				def : FCMP_Pattern <COND_UGE, V_CMP_NLT_F64_e64, f64>;
				def : FCMP_Pattern <COND_ULT, V_CMP_NGE_F64_e64, f64>;
				def : FCMP_Pattern <COND_ULE, V_CMP_NGT_F64_e64, f64>;

				//===----------------------------------------------------------------------===//
// SMRD Patterns
//===----------------------------------------------------------------------===//

multiclass SMRD_Pattern <string Instr, ValueType vt> {

// 1. IMM offset
def : Pat <
(smrd_load (SMRDImm i64:$sbase, i32:$offset)),
[...]

llvm/trunk/test/CodeGen/AMDGPU/llvm.amdgcn.fcmp.ll

				; RUN: llc -march=amdgcn -verify-machineinstrs < %s | FileCheck -check-prefix=GCN %s
				; RUN: llc -march=amdgcn -mcpu=fiji -verify-machineinstrs < %s | FileCheck -check-prefix=GCN %s

				declare i64 @llvm.amdgcn.fcmp.f32(float, float, i32) #0
				declare i64 @llvm.amdgcn.fcmp.f64(double, double, i32) #0
				declare float @llvm.fabs.f32(float) #0

				; GCN-LABEL: {{^}}v_fcmp_f32_oeq_with_fabs:
				; GCN: v_cmp_eq_f32_e64 {{s\[[0-9]+:[0-9]+\]}}, {{s[0-9]+}}, |{{v[0-9]+}}|
				define void @v_fcmp_f32_oeq_with_fabs(i64 addrspace(1)* %out, float %src, float %a) {
				%temp = call float @llvm.fabs.f32(float %a)
				%result = call i64 @llvm.amdgcn.fcmp.f32(float %src, float %temp, i32 1)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_fcmp_f32_oeq_both_operands_with_fabs:
				; GCN: v_cmp_eq_f32_e64 {{s\[[0-9]+:[0-9]+\]}}, |{{s[0-9]+}}|, |{{v[0-9]+}}|
				define void @v_fcmp_f32_oeq_both_operands_with_fabs(i64 addrspace(1)* %out, float %src, float %a) {
				%temp = call float @llvm.fabs.f32(float %a)
				%src_input = call float @llvm.fabs.f32(float %src)
				%result = call i64 @llvm.amdgcn.fcmp.f32(float %src_input, float %temp, i32 1)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_fcmp:
				; GCN-NOT: v_cmp_eq_f32_e64
				define void @v_fcmp(i64 addrspace(1)* %out, float %src) {
				%result = call i64 @llvm.amdgcn.fcmp.f32(float %src, float 100.00, i32 -1)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_fcmp_f32_oeq:
				; GCN: v_cmp_eq_f32_e64
				define void @v_fcmp_f32_oeq(i64 addrspace(1)* %out, float %src) {
				%result = call i64 @llvm.amdgcn.fcmp.f32(float %src, float 100.00, i32 1)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_fcmp_f32_one:
				; GCN: v_cmp_neq_f32_e64
				define void @v_fcmp_f32_one(i64 addrspace(1)* %out, float %src) {
				%result = call i64 @llvm.amdgcn.fcmp.f32(float %src, float 100.00, i32 6)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_fcmp_f32_ogt:
				; GCN: v_cmp_gt_f32_e64
				define void @v_fcmp_f32_ogt(i64 addrspace(1)* %out, float %src) {
				%result = call i64 @llvm.amdgcn.fcmp.f32(float %src, float 100.00, i32 2)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_fcmp_f32_oge:
				; GCN: v_cmp_ge_f32_e64
				define void @v_fcmp_f32_oge(i64 addrspace(1)* %out, float %src) {
				%result = call i64 @llvm.amdgcn.fcmp.f32(float %src, float 100.00, i32 3)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_fcmp_f32_olt:
				; GCN: v_cmp_lt_f32_e64
				define void @v_fcmp_f32_olt(i64 addrspace(1)* %out, float %src) {
				%result = call i64 @llvm.amdgcn.fcmp.f32(float %src, float 100.00, i32 4)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_fcmp_f32_ole:
				; GCN: v_cmp_le_f32_e64
				define void @v_fcmp_f32_ole(i64 addrspace(1)* %out, float %src) {
				%result = call i64 @llvm.amdgcn.fcmp.f32(float %src, float 100.00, i32 5)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}


				; GCN-LABEL: {{^}}v_fcmp_f32_ueq:
				; GCN: v_cmp_nlg_f32_e64
				define void @v_fcmp_f32_ueq(i64 addrspace(1)* %out, float %src) {
				%result = call i64 @llvm.amdgcn.fcmp.f32(float %src, float 100.00, i32 9)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_fcmp_f32_une:
				; GCN: v_cmp_neq_f32_e64
				define void @v_fcmp_f32_une(i64 addrspace(1)* %out, float %src) {
				%result = call i64 @llvm.amdgcn.fcmp.f32(float %src, float 100.00, i32 14)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_fcmp_f32_ugt:
				; GCN: v_cmp_nle_f32_e64
				define void @v_fcmp_f32_ugt(i64 addrspace(1)* %out, float %src) {
				%result = call i64 @llvm.amdgcn.fcmp.f32(float %src, float 100.00, i32 10)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_fcmp_f32_uge:
				; GCN: v_cmp_nlt_f32_e64
				define void @v_fcmp_f32_uge(i64 addrspace(1)* %out, float %src) {
				%result = call i64 @llvm.amdgcn.fcmp.f32(float %src, float 100.00, i32 11)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_fcmp_f32_ult:
				; GCN: v_cmp_nge_f32_e64
				define void @v_fcmp_f32_ult(i64 addrspace(1)* %out, float %src) {
				%result = call i64 @llvm.amdgcn.fcmp.f32(float %src, float 100.00, i32 12)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_fcmp_f32_ule:
				; GCN: v_cmp_ngt_f32_e64
				define void @v_fcmp_f32_ule(i64 addrspace(1)* %out, float %src) {
				%result = call i64 @llvm.amdgcn.fcmp.f32(float %src, float 100.00, i32 13)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_fcmp_f64_oeq:
				; GCN: v_cmp_eq_f64_e64
				define void @v_fcmp_f64_oeq(i64 addrspace(1)* %out, double %src) {
				%result = call i64 @llvm.amdgcn.fcmp.f64(double %src, double 100.00, i32 1)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_fcmp_f64_one:
				; GCN: v_cmp_neq_f64_e64
				define void @v_fcmp_f64_one(i64 addrspace(1)* %out, double %src) {
				%result = call i64 @llvm.amdgcn.fcmp.f64(double %src, double 100.00, i32 6)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_fcmp_f64_ogt:
				; GCN: v_cmp_gt_f64_e64
				define void @v_fcmp_f64_ogt(i64 addrspace(1)* %out, double %src) {
				%result = call i64 @llvm.amdgcn.fcmp.f64(double %src, double 100.00, i32 2)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_fcmp_f64_oge:
				; GCN: v_cmp_ge_f64_e64
				define void @v_fcmp_f64_oge(i64 addrspace(1)* %out, double %src) {
				%result = call i64 @llvm.amdgcn.fcmp.f64(double %src, double 100.00, i32 3)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_fcmp_f64_olt:
				; GCN: v_cmp_lt_f64_e64
				define void @v_fcmp_f64_olt(i64 addrspace(1)* %out, double %src) {
				%result = call i64 @llvm.amdgcn.fcmp.f64(double %src, double 100.00, i32 4)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_fcmp_f64_ole:
				; GCN: v_cmp_le_f64_e64
				define void @v_fcmp_f64_ole(i64 addrspace(1)* %out, double %src) {
				%result = call i64 @llvm.amdgcn.fcmp.f64(double %src, double 100.00, i32 5)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_fcmp_f64_ueq:
				; GCN: v_cmp_nlg_f64_e64
				define void @v_fcmp_f64_ueq(i64 addrspace(1)* %out, double %src) {
				%result = call i64 @llvm.amdgcn.fcmp.f64(double %src, double 100.00, i32 9)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_fcmp_f64_une:
				; GCN: v_cmp_neq_f64_e64
				define void @v_fcmp_f64_une(i64 addrspace(1)* %out, double %src) {
				%result = call i64 @llvm.amdgcn.fcmp.f64(double %src, double 100.00, i32 14)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_fcmp_f64_ugt:
				; GCN: v_cmp_nle_f64_e64
				define void @v_fcmp_f64_ugt(i64 addrspace(1)* %out, double %src) {
				%result = call i64 @llvm.amdgcn.fcmp.f64(double %src, double 100.00, i32 10)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_fcmp_f64_uge:
				; GCN: v_cmp_nlt_f64_e64
				define void @v_fcmp_f64_uge(i64 addrspace(1)* %out, double %src) {
				%result = call i64 @llvm.amdgcn.fcmp.f64(double %src, double 100.00, i32 11)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_fcmp_f64_ult:
				; GCN: v_cmp_nge_f64_e64
				define void @v_fcmp_f64_ult(i64 addrspace(1)* %out, double %src) {
				%result = call i64 @llvm.amdgcn.fcmp.f64(double %src, double 100.00, i32 12)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_fcmp_f64_ule:
				; GCN: v_cmp_ngt_f64_e64
				define void @v_fcmp_f64_ule(i64 addrspace(1)* %out, double %src) {
				%result = call i64 @llvm.amdgcn.fcmp.f64(double %src, double 100.00, i32 13)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				attributes #0 = { nounwind readnone convergent }

llvm/trunk/test/CodeGen/AMDGPU/llvm.amdgcn.icmp.ll

				; RUN: llc -march=amdgcn -verify-machineinstrs < %s | FileCheck -check-prefix=GCN %s
				; RUN: llc -march=amdgcn -mcpu=fiji -verify-machineinstrs < %s | FileCheck -check-prefix=GCN %s

				declare i64 @llvm.amdgcn.icmp.i32(i32, i32, i32) #0
				declare i64 @llvm.amdgcn.icmp.i64(i64, i64, i32) #0

				; GCN-LABEL: {{^}}v_icmp_i32_eq:
				; GCN: v_cmp_eq_i32_e64
				define void @v_icmp_i32_eq(i64 addrspace(1)* %out, i32 %src) {
				%result = call i64 @llvm.amdgcn.icmp.i32(i32 %src, i32 100, i32 32)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_icmp:
				; GCN-NOT: v_cmp_eq_i32_e64
				define void @v_icmp(i64 addrspace(1)* %out, i32 %src) {
				%result = call i64 @llvm.amdgcn.icmp.i32(i32 %src, i32 100, i32 30)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}
				; GCN-LABEL: {{^}}v_icmp_i32_ne:
				; GCN: v_cmp_ne_i32_e64
				define void @v_icmp_i32_ne(i64 addrspace(1)* %out, i32 %src) {
				%result = call i64 @llvm.amdgcn.icmp.i32(i32 %src, i32 100, i32 33)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_icmp_u32_ugt:
				; GCN: v_cmp_gt_u32_e64
				define void @v_icmp_u32_ugt(i64 addrspace(1)* %out, i32 %src) {
				%result = call i64 @llvm.amdgcn.icmp.i32(i32 %src, i32 100, i32 34)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_icmp_u32_uge:
				; GCN: v_cmp_ge_u32_e64
				define void @v_icmp_u32_uge(i64 addrspace(1)* %out, i32 %src) {
				%result = call i64 @llvm.amdgcn.icmp.i32(i32 %src, i32 100, i32 35)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_icmp_u32_ult:
				; GCN: v_cmp_lt_u32_e64
				define void @v_icmp_u32_ult(i64 addrspace(1)* %out, i32 %src) {
				%result = call i64 @llvm.amdgcn.icmp.i32(i32 %src, i32 100, i32 36)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_icmp_u32_ule:
				; GCN: v_cmp_le_u32_e64
				define void @v_icmp_u32_ule(i64 addrspace(1)* %out, i32 %src) {
				%result = call i64 @llvm.amdgcn.icmp.i32(i32 %src, i32 100, i32 37)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_icmp_i32_sgt:
				; GCN: v_cmp_gt_i32_e64
				define void @v_icmp_i32_sgt(i64 addrspace(1)* %out, i32 %src) #1 {
				%result = call i64 @llvm.amdgcn.icmp.i32(i32 %src, i32 100, i32 38)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_icmp_i32_sge:
				; GCN: v_cmp_ge_i32_e64
				define void @v_icmp_i32_sge(i64 addrspace(1)* %out, i32 %src) {
				%result = call i64 @llvm.amdgcn.icmp.i32(i32 %src, i32 100, i32 39)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_icmp_i32_slt:
				; GCN: v_cmp_lt_i32_e64
				define void @v_icmp_i32_slt(i64 addrspace(1)* %out, i32 %src) {
				%result = call i64 @llvm.amdgcn.icmp.i32(i32 %src, i32 100, i32 40)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_icmp_i32_sle:
				; GCN: v_cmp_le_i32_e64
				define void @v_icmp_i32_sle(i64 addrspace(1)* %out, i32 %src) {
				%result = call i64 @llvm.amdgcn.icmp.i32(i32 %src, i32 100, i32 41)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_icmp_i64_eq:
				; GCN: v_cmp_eq_i64_e64
				define void @v_icmp_i64_eq(i64 addrspace(1)* %out, i64 %src) {
				%result = call i64 @llvm.amdgcn.icmp.i64(i64 %src, i64 100, i32 32)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_icmp_i64_ne:
				; GCN: v_cmp_ne_i64_e64
				define void @v_icmp_i64_ne(i64 addrspace(1)* %out, i64 %src) {
				%result = call i64 @llvm.amdgcn.icmp.i64(i64 %src, i64 100, i32 33)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_icmp_u64_ugt:
				; GCN: v_cmp_gt_u64_e64
				define void @v_icmp_u64_ugt(i64 addrspace(1)* %out, i64 %src) {
				%result = call i64 @llvm.amdgcn.icmp.i64(i64 %src, i64 100, i32 34)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_icmp_u64_uge:
				; GCN: v_cmp_ge_u64_e64
				define void @v_icmp_u64_uge(i64 addrspace(1)* %out, i64 %src) {
				%result = call i64 @llvm.amdgcn.icmp.i64(i64 %src, i64 100, i32 35)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_icmp_u64_ult:
				; GCN: v_cmp_lt_u64_e64
				define void @v_icmp_u64_ult(i64 addrspace(1)* %out, i64 %src) {
				%result = call i64 @llvm.amdgcn.icmp.i64(i64 %src, i64 100, i32 36)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_icmp_u64_ule:
				; GCN: v_cmp_le_u64_e64
				define void @v_icmp_u64_ule(i64 addrspace(1)* %out, i64 %src) {
				%result = call i64 @llvm.amdgcn.icmp.i64(i64 %src, i64 100, i32 37)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_icmp_i64_sgt:
				; GCN: v_cmp_gt_i64_e64
				define void @v_icmp_i64_sgt(i64 addrspace(1)* %out, i64 %src) {
				%result = call i64 @llvm.amdgcn.icmp.i64(i64 %src, i64 100, i32 38)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_icmp_i64_sge:
				; GCN: v_cmp_ge_i64_e64
				define void @v_icmp_i64_sge(i64 addrspace(1)* %out, i64 %src) {
				%result = call i64 @llvm.amdgcn.icmp.i64(i64 %src, i64 100, i32 39)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_icmp_i64_slt:
				; GCN: v_cmp_lt_i64_e64
				define void @v_icmp_i64_slt(i64 addrspace(1)* %out, i64 %src) {
				%result = call i64 @llvm.amdgcn.icmp.i64(i64 %src, i64 100, i32 40)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				; GCN-LABEL: {{^}}v_icmp_i64_sle:
				; GCN: v_cmp_le_i64_e64
				define void @v_icmp_i64_sle(i64 addrspace(1)* %out, i64 %src) {
				%result = call i64 @llvm.amdgcn.icmp.i64(i64 %src, i64 100, i32 41)
				store i64 %result, i64 addrspace(1)* %out
				ret void
				}

				attributes #0 = { nounwind readnone convergent }
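
The third operand of each `llvm.amdgcn.icmp` call in the test above is a condition code. The values 32 through 41 match the numbering of the integer predicates in LLVM's `CmpInst::Predicate` enum (`ICMP_EQ` = 32 up to `ICMP_SLE` = 41), and the `v_icmp` case passes 30 to check that an out-of-range code does not select a compare. The following is a minimal, self-contained C++ sketch of the code-to-mnemonic mapping that the FileCheck lines expect. It is an illustration only and not the lowering code from this revision; the enum `ICmpCode` and the helper `expectedVCmp` are hypothetical names introduced here.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>

// Integer predicate codes, numbered as in LLVM's CmpInst::Predicate enum.
// The enum name ICmpCode is hypothetical and used only for this sketch.
enum ICmpCode : uint32_t {
  ICMP_EQ = 32,
  ICMP_NE,
  ICMP_UGT,
  ICMP_UGE,
  ICMP_ULT,
  ICMP_ULE,
  ICMP_SGT,
  ICMP_SGE,
  ICMP_SLT,
  ICMP_SLE // 41
};

// Hypothetical helper: returns the mnemonic the test's GCN check lines expect
// for a condition code and operand width, or std::nullopt when the code lies
// outside the integer predicate range (as with code 30 in @v_icmp).
std::optional<std::string> expectedVCmp(uint32_t Code, unsigned Bits) {
  assert((Bits == 32 || Bits == 64) && "the test only covers i32 and i64");

  if (Code < ICMP_EQ || Code > ICMP_SLE)
    return std::nullopt;

  // Comparison names indexed by (Code - ICMP_EQ).
  static const char *const Ops[] = {"eq", "ne", "gt", "ge", "lt",
                                    "le", "gt", "ge", "lt", "le"};

  // In this test, eq/ne and the signed orderings are checked with an "i"
  // type suffix; only the unsigned orderings use "u".
  const bool IsUnsigned = Code >= ICMP_UGT && Code <= ICMP_ULE;

  return std::string("v_cmp_") + Ops[Code - ICMP_EQ] +
         (IsUnsigned ? "_u" : "_i") + std::to_string(Bits) + "_e64";
}
```

For example, `expectedVCmp(38, 32)` yields `v_cmp_gt_i32_e64`, the instruction checked in `@v_icmp_i32_sgt`, while `expectedVCmp(30, 32)` yields no mnemonic.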

This is an archive of the discontinued LLVM Phabricator instance.

AMDGPU : Add intrinsics for compare with the full wavefront result, such as v_cmp_ne_i32, etc.
Closed, Public

Revision Contents

Diff 65953

llvm/trunk/include/llvm/IR/IntrinsicsAMDGPU.td

llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.h

llvm/trunk/lib/Target/AMDGPU/AMDGPUISelLowering.cpp

llvm/trunk/lib/Target/AMDGPU/AMDGPUInstrInfo.td

llvm/trunk/lib/Target/AMDGPU/SIISelLowering.cpp

llvm/trunk/lib/Target/AMDGPU/SIInstructions.td

llvm/trunk/test/CodeGen/AMDGPU/llvm.amdgcn.fcmp.ll

llvm/trunk/test/CodeGen/AMDGPU/llvm.amdgcn.icmp.ll
