This is an archive of the discontinued LLVM Phabricator instance.

[GlobalISel] Introduce a generic floating point floor opcode, G_FFLOOR
Closed · Public

Authored by paquette on Jan 30 2019, 4:19 PM.

Details

Commits

rG616a1fb4920f: [GlobalISel] Introduce a generic floating point floor opcode, G_FFLOOR
rL353057: [GlobalISel] Introduce a generic floating point floor opcode, G_FFLOOR

Summary

This introduces a generic opcode for floating point floor, working towards selecting @llvm.floor.

Event Timeline

paquette created this revision. Jan 30 2019, 4:19 PM

Herald added subscribers: Petar.Avramovic, javed.absar, kristof.beyls, rovka. Jan 30 2019, 4:19 PM

paquette added a child revision: D57485: [GlobalISel] Add IRTranslator support for G_FFLOOR. Jan 30 2019, 4:22 PM

paquette added a child revision: D57486: [GlobalISel][AArch64] Select G_FFLOOR. Jan 30 2019, 4:24 PM

LGTM

This revision is now accepted and ready to land. Feb 1 2019, 4:04 PM

Closed by commit rL353057: [GlobalISel] Introduce a generic floating point floor opcode, G_FFLOOR (authored by paquette). Feb 4 2019, 9:10 AM

This revision was automatically updated to reflect the committed changes.

Herald added a project: Restricted Project. Feb 4 2019, 9:10 AM

Herald added a subscriber: kristina.

@arsenm, adding this opcode breaks AMDGPU somehow. Do you have any idea why that might be?

In D57484#1383395, @paquette wrote:

> @arsenm, adding this opcode breaks AMDGPU somehow. Do you have any idea why that might be?

Breaks what?

In D57484#1383408, @arsenm wrote:

> Breaks what?

Here's an example:
http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/24100/steps/build%20stage%201/logs/stdio

Some tablegen seems unhappy?

In D57484#1383409, @paquette wrote:

> Here's an example:
> http://lab.llvm.org:8011/builders/clang-ppc64be-linux-lnt/builds/24100/steps/build%20stage%201/logs/stdio
>
> Some tablegen seems unhappy?

I have no idea. I haven't looked at the DAG compatibility stuff

In D57484#1383417, @arsenm wrote:

> I have no idea. I haven't looked at the DAG compatibility stuff

It looks like the definition uncovered an unusual case the importer can't handle yet. These two rules are the cause:

// Convert (x - floor(x)) to fract(x)
def : GCNPat <
  (f32 (fsub (f32 (VOP3Mods f32:$x, i32:$mods)),
             (f32 (ffloor (f32 (VOP3Mods f32:$x, i32:$mods)))))),
  (V_FRACT_F32_e64 $mods, $x, DSTCLAMP.NONE, DSTOMOD.NONE)
>;

// Convert (x + (-floor(x))) to fract(x)
def : GCNPat <
  (f64 (fadd (f64 (VOP3Mods f64:$x, i32:$mods)),
             (f64 (fneg (f64 (ffloor (f64 (VOP3Mods f64:$x, i32:$mods)))))))),
  (V_FRACT_F64_e64 $mods, $x, DSTCLAMP.NONE, DSTOMOD.NONE)
>;

The importer currently has no code to handle same-operand constraints in combination with the (foo $x, $y) style of matching a complex pattern foo. I had a quick look for workarounds, but no variant (e.g. naming the overall complex operand and matching that) seems to work yet. This:

(f32 (fsub (f32 VOP3Mods:$a),
           (f32 (ffloor (f32 VOP3Mods:$a)))))

would probably work but then you wouldn't be able to reverse the sub-operands.
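The same-operand constraint discussed here means that both occurrences of $x must bind to the same value before the pattern can match. The following is a minimal sketch of that check in a toy tree matcher. The names Expr, Pat, match, and matchesFract are hypothetical and are not part of LLVM or the GlobalISel importer, and the sketch deliberately ignores complex patterns such as VOP3Mods, which is the part the importer could not combine with the constraint.

```cpp
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Toy expression tree (hypothetical, not an LLVM type).
struct Expr {
  std::string Op;          // "leaf" or an operator name such as "fsub"
  std::string Value;       // identity of the value, used for leaves only
  std::vector<Expr> Kids;  // operands of an operator node
};

// Toy pattern tree (hypothetical). A "capture" node binds a name like "x".
struct Pat {
  std::string Op;          // "capture" or an operator name
  std::string Name;        // capture name, used when Op == "capture"
  std::vector<Pat> Kids;
};

using Bindings = std::map<std::string, std::string>;

// Match P against E. The first use of a capture name records the value;
// every later use must see the same value (the same-operand constraint).
bool match(const Pat &P, const Expr &E, Bindings &B) {
  if (P.Op == "capture") {
    if (E.Op != "leaf")
      return false;  // simplification: captures bind leaves only
    auto It = B.find(P.Name);
    if (It == B.end()) {
      B[P.Name] = E.Value;
      return true;
    }
    return It->second == E.Value;
  }
  if (P.Op != E.Op || P.Kids.size() != E.Kids.size())
    return false;
  for (std::size_t I = 0; I < P.Kids.size(); ++I)
    if (!match(P.Kids[I], E.Kids[I], B))
      return false;
  return true;
}

// Small builders to keep the examples readable.
Expr leaf(std::string V) { return Expr{"leaf", std::move(V), {}}; }
Expr node(std::string Op, std::vector<Expr> Kids) {
  return Expr{std::move(Op), "", std::move(Kids)};
}
Pat capture(std::string N) { return Pat{"capture", std::move(N), {}}; }
Pat pnode(std::string Op, std::vector<Pat> Kids) {
  return Pat{std::move(Op), "", std::move(Kids)};
}

// Toy version of the (fsub $x, (ffloor $x)) shape from the first rule.
bool matchesFract(const Expr &E) {
  Pat P = pnode("fsub", {capture("x"), pnode("ffloor", {capture("x")})});
  Bindings B;
  return match(P, E, B);
}
```

With these definitions, an expression shaped like fsub(a, ffloor(a)) matches, while fsub(a, ffloor(b)) is rejected because the second use of x sees a different value.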

I think that this is safe to recommit after the changes in https://reviews.llvm.org/D57980.

Relanded in r353589. This works fine after r353586.

Revision Contents

Path                                                             Size
include/llvm/Support/TargetOpcodes.def                           3 lines
include/llvm/Target/GenericOpcodes.td                            7 lines
include/llvm/Target/GlobalISel/SelectionDAGCompat.td             1 line
test/CodeGen/AArch64/GlobalISel/legalizer-info-validation.mir    5 lines

Diff 184398

include/llvm/Support/TargetOpcodes.def

 HANDLE_TARGET_OPCODE(G_FCOS)

 /// Floating point sine.
 HANDLE_TARGET_OPCODE(G_FSIN)

 /// Floating point square root.
 HANDLE_TARGET_OPCODE(G_FSQRT)

+/// Floating point floor.
+HANDLE_TARGET_OPCODE(G_FFLOOR)
+
 /// Generic AddressSpaceCast.
 HANDLE_TARGET_OPCODE(G_ADDRSPACE_CAST)

 /// Generic block address
 HANDLE_TARGET_OPCODE(G_BLOCK_ADDR)

 // TODO: Add more generic opcodes as we move along.

 /// Marker for the end of the generic opcode.
 /// This is used to check if an opcode is in the range of the
 /// generic opcodes.
 HANDLE_TARGET_OPCODE_MARKER(PRE_ISEL_GENERIC_OPCODE_END, G_BLOCK_ADDR)

 /// BUILTIN_OP_END - This must be the last enum value in this list.
 /// The target-specific post-isel opcode values start here.
 HANDLE_TARGET_OPCODE_MARKER(GENERIC_OP_END, PRE_ISEL_GENERIC_OPCODE_END)

include/llvm/Target/GenericOpcodes.td

 // NOTE: Unlike libm sqrt(), this never sets errno. In all other respects it's
 // libm-conformant.
 def G_FSQRT : GenericInstruction {
   let OutOperandList = (outs type0:$dst);
   let InOperandList = (ins type0:$src1);
   let hasSideEffects = 0;
 }

+// Floating point floor of a value.
+def G_FFLOOR : GenericInstruction {
+  let OutOperandList = (outs type0:$dst);
+  let InOperandList = (ins type0:$src1);
+  let hasSideEffects = 0;
+}
+
 //------------------------------------------------------------------------------
 // Opcodes for LLVM Intrinsics
 //------------------------------------------------------------------------------
 def G_INTRINSIC_TRUNC : GenericInstruction {
   let OutOperandList = (outs type0:$dst);
   let InOperandList = (ins type0:$src1);
   let hasSideEffects = 0;
 }

include/llvm/Target/GlobalISel/SelectionDAGCompat.td

 def : GINodeEquiv<G_CTTZ_ZERO_UNDEF, cttz_zero_undef>;
 def : GINodeEquiv<G_CTPOP, ctpop>;
 def : GINodeEquiv<G_EXTRACT_VECTOR_ELT, vector_extract>;
 def : GINodeEquiv<G_FCEIL, fceil>;
 def : GINodeEquiv<G_FCOS, fcos>;
 def : GINodeEquiv<G_FSIN, fsin>;
 def : GINodeEquiv<G_FABS, fabs>;
 def : GINodeEquiv<G_FSQRT, fsqrt>;
+def : GINodeEquiv<G_FFLOOR, ffloor>;

 // Broadly speaking G_LOAD is equivalent to ISD::LOAD but there are some
 // complications that tablegen must take care of. For example, Predicates such
 // as isSignExtLoad require that this is not a perfect 1:1 mapping since a
 // sign-extending load is (G_SEXTLOAD x) in GlobalISel. Additionally,
 // G_LOAD handles both atomic and non-atomic loads where as SelectionDAG had
 // separate nodes for them. This GINodeEquiv maps the non-atomic loads to
 // G_LOAD with a non-atomic MachineMemOperand.
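One plausible reading of the AMDGPU breakage discussed in the thread is that a SelectionDAG pattern is only considered for import once every operator in it has a GINodeEquiv entry, so adding the ffloor entry made the two fract patterns eligible for the first time. The sketch below models that gating with a plain lookup table. The function names and the contents of the "before" table are assumptions made for illustration and do not reproduce the real importer.

```cpp
#include <map>
#include <string>
#include <vector>

// Maps a SelectionDAG operator name to a generic opcode name (toy model).
using EquivTable = std::map<std::string, std::string>;

// A pattern is attempted only if every operator it uses has an equivalent.
bool importerAttempts(const EquivTable &Equivs,
                      const std::vector<std::string> &PatternOps) {
  for (const std::string &Op : PatternOps)
    if (Equivs.find(Op) == Equivs.end())
      return false;
  return true;
}

// Assumed subset of the table before this revision (illustrative only).
EquivTable equivsBeforePatch() {
  return {{"fadd", "G_FADD"},
          {"fsub", "G_FSUB"},
          {"fneg", "G_FNEG"},
          {"fsqrt", "G_FSQRT"}};
}

// The revision adds a single new mapping.
EquivTable equivsAfterPatch() {
  EquivTable T = equivsBeforePatch();
  T["ffloor"] = "G_FFLOOR";
  return T;
}
```

In this model a pattern using fsub and ffloor is skipped with the earlier table and attempted with the later one, which is the point at which an unsupported construct inside the pattern would surface.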

test/CodeGen/AArch64/GlobalISel/legalizer-info-validation.mir

 #
 # DEBUG-NEXT: G_FCOS (opcode {{[0-9]+}}): 1 type index
 # DEBUG: .. the first uncovered type index: 1, OK
 #
 # DEBUG-NEXT: G_FSIN (opcode {{[0-9]+}}): 1 type index
 # DEBUG: .. the first uncovered type index: 1, OK
 #
 # DEBUG-NEXT: G_FSQRT (opcode {{[0-9]+}}): 1 type index
-# DEBUG: .. the first uncovered type index: 1, OK
+# DEBUG: .. type index coverage check SKIPPED: user-defined predicate detected
+#
+# DEBUG-NEXT: G_FFLOOR (opcode {{[0-9]+}}): 1 type index
+# DEBUG: .. type index coverage check SKIPPED: no rules defined

 # CHECK-NOT: ill-defined

 ---
 name: dummy
 body: |
   bb.0:
 ...