This is an archive of the discontinued LLVM Phabricator instance.

include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h
671	I'm not sure what's the point of having variadic templates here. Does `buildICmp` or `buildSelect` make sense with 0, 1, or let's say 4 operands (excluding destination)?
lib/CodeGen/GlobalISel/LegalizerHelper.cpp
40	👍
1063	Do you think we could `MI.setDesc(TII.get(TargetOpcode::G_CTLZ_ZERO_UNDEF));` here in-place instead of recreating the instruction from scratch?
1083	I don't think this is going to work if `Len` is not a power of 2. For the sake of a smaller example, let's say `Len` is 6. The shift amount is going to take values 1 and 2. Let's say the `SrcReg`'s value is `10 00 00`. The `Op` takes values:
	`10 00 00` (initial)
	`11 00 00` = `10 00 00` | `01 00 00`, `i == 0`
	`11 11 00` = `11 00 00` | `00 11 00`, `i == 1`
	The final value is 2, while it had to be 0. This could probably be fixed by rounding `Len` up to the closest power of 2.
1084	Maybe slightly more meaningful name?
1093	Same here, could we change `MI` in-place?
1101	I think there is a small opportunity here, though I'm not sure how useful it is in practice: if we check `if (!isLegalOrCustom({TargetOpcode::G_CTPOP, {Ty}}) && isLegalOrCustom({TargetOpcode::G_CTLZ_ZERO_UNDEF, {Ty}}))` here and fall through if `true`, we could lower this to `32 - ctlz_zero_undef(~x & (x-1))` with no extra `select`s. Just a thought; it almost certainly isn't worth the effort.
1129	We already have -1 constant built just above, maybe do `G_ADD` with it instead?
unittests/CodeGen/GlobalISel/LegalizerHelperTest.cpp
32 ↗	(On Diff #159812)	missing `%0` at the end?
139 ↗	(On Diff #159812)	ditto
185 ↗	(On Diff #159812)	Maybe it makes sense to check here that we actually thread things correctly.
unittests/CodeGen/GlobalISel/LegalizerHelperTest.h
74 ↗	(On Diff #159812)	Shouldn't we call `MachineModuleInfo::doInitialization(Module &M)` here? Also, if we do, the module will be accessible as `MMI.getModule()`, which should make `createDummyModule`'s interface simpler.
95 ↗	(On Diff #159812)	Why not `bb.0`?
98 ↗	(On Diff #159812)	Do you want to make these live-in?
99 ↗	(On Diff #159812)	The last two `Twine(...)`s are probably not needed (I would expect `Twine::operator+` to be overloaded for every string type)
101 ↗	(On Diff #159812)	Any reason this is created here and passed to `parseMIR` by reference instead of being created and destroyed all within `parseMIR`?
109 ↗	(On Diff #159812)	It looks like if `MMI` is initialized properly the `Module *M` will be accessible from `MMI`.
116 ↗	(On Diff #159812)	Maybe swap these so an out parameter is the last?
144 ↗	(On Diff #159812)	doesn't look like this is used by any of the tests at the moment.
146 ↗	(On Diff #159812)	Not used by any of the tests?
147 ↗	(On Diff #159812)	So we run multiple tests over the same instance of machine function, adding new instructions w/o clearing out the machine function and the rest of the context? Also, in some not really well defined order probably (like alphabetical by test name or something). Shouldn't we put all of this into `setUp` instead and clear it out properly in `tearDown` and run every test in a clean state?
162 ↗	(On Diff #159812)	Maybe call `x` a `Block` or `SettingUpActionsBlock` or something along these lines, then `do Block while (0);` here, and at the "call site" something like `LInfoBuilder(A, { getActionDefinitionsBuilder(G_CTTZ_ZERO_UNDEF).legalFor({s64}); });` instead of `LInfoBuilder(A, getActionDefinitionsBuilder(G_CTTZ_ZERO_UNDEF).legalFor({s64}););`? As for the name, maybe `DefineLegalizerInfo` is a tad better than `LInfoBuilder`, not sure.
164 ↗	(On Diff #159812)	Maybe also call `verify(*ST.getInstrInfo());`?
188 ↗	(On Diff #159812)	Do we run MachineVerifier somewhere here?
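
To make the comment on line 1083 concrete, here is a minimal sketch that models the smear-and-popcount lowering on plain integers. It is not the LLVM code: `ctlzViaSmear`, `popCountLow`, and `roundUpToPow2` are hypothetical names introduced for illustration, and the `RoundUp` flag only switches between the loop bound in the reviewed diff (`Len / 2`) and the bound suggested in the comment (the next power of 2, halved).

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: population count over the low Len bits of X.
static unsigned popCountLow(uint64_t X, unsigned Len) {
  unsigned N = 0;
  for (unsigned I = 0; I < Len; ++I)
    N += (X >> I) & 1;
  return N;
}

// Hypothetical helper: smallest power of two that is >= Len.
static unsigned roundUpToPow2(unsigned Len) {
  unsigned P = 1;
  while (P < Len)
    P <<= 1;
  return P;
}

// Models ctlz on a Len-bit value: smear the highest set bit to the right,
// then count the bits that are still zero.
// RoundUp == false: loop bound is Len / 2, as in the reviewed diff.
// RoundUp == true:  loop bound is roundUpToPow2(Len) / 2, as suggested.
static unsigned ctlzViaSmear(uint64_t X, unsigned Len, bool RoundUp) {
  assert(Len >= 1 && Len <= 64 && "sketch only handles 1..64 bits");
  const uint64_t Mask = (Len == 64) ? ~0ULL : ((1ULL << Len) - 1);
  const unsigned Bound = (RoundUp ? roundUpToPow2(Len) : Len) / 2;
  X &= Mask;
  for (unsigned I = 0; (1U << I) <= Bound; ++I)
    X |= X >> (1U << I);
  return popCountLow(~X & Mask, Len);
}
```

With `Len == 6` and the input `10 00 00` (`0x20`), the unrounded bound stops after shifts of 1 and 2 and leaves two low bits unset, while the rounded bound adds a shift of 4 and fills them.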

Thanks, Roman, for the feedback. I'll update the patch shortly.

include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h
671	This required typing one less argument ;). Additionally, this lets you call `buildICmp` directly like this: `auto MIB = something...` `auto Reg = someReg;` `auto MIBCmp = Builder.buildICmp(Pred, SomeTy, MIB, Reg);` This lets Op0 and Op1 each be either a MachineInstrBuilder or a register (all four combinations are possible). If you need to explicitly specify the use args, it needs to be something like this: `template<typename DstTy, typename UseArg0Ty, typename UseArg1Ty> MachineInstrBuilder buildICmp(CmpInst::Predicate Pred, DstTy &&Dst, UseArg0Ty &&Arg0, UseArg1Ty &&Arg1) { return buildICmp(Pred, getDestFromArg(Dst), getRegFromArg(Arg0), getRegFromArg(Arg1)); }` which is a little more typing and longer.
lib/CodeGen/GlobalISel/LegalizerHelper.cpp
1063	Definitely we should be doing that as often as we can, but in this case, if we go that route, we'd end up in a situation where the final select would need to create a new vreg, and then it's probably not safe to just blindly replace the original dest with the new reg. It's definitely possible to do it in an uglier way, where we replace the first inst in place and change its destination to a newly created vreg, but this sacrifices readability of the code. I think CTLZ is not frequent enough to do this.
1083	Good catch. I'll replace it with the following.
	NewLen = RoundUpToPow2(Len);
	x = x | (x >> 1);
	... until NewLen/2
	return Len - PopCount(x);
1101	That's something we could do. I can possibly address this in a subsequent patch.
1129	I've changed it. Thanks
unittests/CodeGen/GlobalISel/LegalizerHelperTest.h
74 ↗	(On Diff #159812)	This was copied over from another unit test. I'll refactor both of these in a subsequent patch.
116 ↗	(On Diff #159812)	Ditto for all of the above.
144 ↗	(On Diff #159812)	This is currently only used to set the insertion point. I left it available if any of the future tests needed that. I can remove it if required.
146 ↗	(On Diff #159812)	MRI was just left here in case any tests want to use it - but it's easily accessible with B.getMF().getRegInfo(). Again, I don't mind removing it.
147 ↗	(On Diff #159812)	I believe googletest does not reuse the same test fixture object across multiple tests: it creates a new fixture object, runs the constructor and `SetUp()`, runs the test, and then calls `TearDown()` and the destructor. So each test on its own should be in a clean state when it runs. I hope I've not misunderstood your question.
188 ↗	(On Diff #159812)	No we don't currently run the verifier here.
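
The trade-off discussed for line 671 can be sketched with toy stand-ins. Everything below is hypothetical and only mirrors the shape of the discussion: `ToyInstr` plays the role of a MachineInstrBuilder, `ToyBuilder` the role of MachineIRBuilder, and the two wrappers are given different names (`buildICmpVariadic`, `buildICmpExplicit`) so they can sit side by side.

```cpp
#include <string>

// Toy stand-in for an instruction handle that defines one register.
struct ToyInstr {
  unsigned DefReg;
};

struct ToyBuilder {
  // Normalise a use operand to a register number, whether the caller
  // passed a raw register or an instruction handle.
  static unsigned getRegFromArg(unsigned Reg) { return Reg; }
  static unsigned getRegFromArg(const ToyInstr &MI) { return MI.DefReg; }

  // Core overload: exactly two use operands, all plain registers.
  std::string buildICmp(unsigned Dst, unsigned Op0, unsigned Op1) {
    return "%" + std::to_string(Dst) + " = icmp %" + std::to_string(Op0) +
           ", %" + std::to_string(Op1);
  }

  // Variadic wrapper: shorter to write, arity is checked only when the
  // pack is forwarded to the core overload.
  template <typename... UseArgsTy>
  std::string buildICmpVariadic(unsigned Dst, UseArgsTy &&... UseArgs) {
    return buildICmp(Dst, getRegFromArg(UseArgs)...);
  }

  // Explicit wrapper: the signature itself states that there are two uses.
  template <typename UseArg0Ty, typename UseArg1Ty>
  std::string buildICmpExplicit(unsigned Dst, UseArg0Ty &&Arg0,
                                UseArg1Ty &&Arg1) {
    return buildICmp(Dst, getRegFromArg(Arg0), getRegFromArg(Arg1));
  }
};
```

Both wrappers accept any mix of registers and instruction handles, for example `B.buildICmpVariadic(1u, MI, 2u)`. In this sketch a call with the wrong number of use operands is still rejected by the compiler in the variadic form, but the error surfaces inside the template body at the forwarding call rather than at the wrapper's signature.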

Updated patch based on feedback.

Thank you!

LGTM

unittests/CodeGen/GlobalISel/LegalizerHelperTest.h
147 ↗	(On Diff #159812)	True, (https://github.com/google/googletest/blob/master/googletest/docs/primer.md, https://github.com/google/googletest/blob/master/googletest/docs/faq.md#should-i-use-the-constructordestructor-of-the-test-fixture-or-the-set-uptear-down-function), maybe I got it mixed up with the python version of the library. Thanks!
188 ↗	(On Diff #159812)	Feels like a good idea to do that. Maybe, by default with the possibility to opt out on a test-case per test-case basis.

This revision is now accepted and ready to land. Aug 17 2018, 4:24 PM

Thanks Roman.
Submitted in r340111

Revision Contents

Path	Size
include/llvm/CodeGen/GlobalISel/LegalizerHelper.h	3 lines
include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h	9 lines
include/llvm/Target/GlobalISel/SelectionDAGCompat.td	5 lines
lib/CodeGen/GlobalISel/LegalizerHelper.cpp	119 lines

Diff 153768

include/llvm/CodeGen/GlobalISel/LegalizerHelper.h

[43 lines not shown]	enum LegalizeResult {
Legalized,		Legalized,

/// Some kind of error has occurred and we could not legalize this		/// Some kind of error has occurred and we could not legalize this
/// instruction.		/// instruction.
UnableToLegalize,		UnableToLegalize,
};		};

LegalizerHelper(MachineFunction &MF);		LegalizerHelper(MachineFunction &MF);
		LegalizerHelper(MachineFunction &MF, const LegalizerInfo &LI);

/// Replace \p MI by a sequence of legal instructions that can implement the		/// Replace \p MI by a sequence of legal instructions that can implement the
/// same operation. Note that this means \p MI may be deleted, so any iterator		/// same operation. Note that this means \p MI may be deleted, so any iterator
/// steps should be performed before calling this function. \p Helper should		/// steps should be performed before calling this function. \p Helper should
/// be initialized to the MachineFunction containing \p MI.		/// be initialized to the MachineFunction containing \p MI.
///		///
/// Considered as an opaque blob, the legal code will use and define the same		/// Considered as an opaque blob, the legal code will use and define the same
/// registers as \p MI.		/// registers as \p MI.
[47 lines not shown]	void widenScalarDst(MachineInstr &MI, LLT WideTy, unsigned OpIdx = 0,
unsigned TruncOpcode = TargetOpcode::G_TRUNC);		unsigned TruncOpcode = TargetOpcode::G_TRUNC);

/// Helper function to split a wide generic register into bitwise blocks with		/// Helper function to split a wide generic register into bitwise blocks with
/// the given Type (which implies the number of blocks needed). The generic		/// the given Type (which implies the number of blocks needed). The generic
/// registers created are appended to Ops, starting at bit 0 of Reg.		/// registers created are appended to Ops, starting at bit 0 of Reg.
void extractParts(unsigned Reg, LLT Ty, int NumParts,		void extractParts(unsigned Reg, LLT Ty, int NumParts,
SmallVectorImpl<unsigned> &Ops);		SmallVectorImpl<unsigned> &Ops);

		LegalizeResult lowerBitCount(MachineInstr &MI, unsigned TypeIdx, LLT Ty);

MachineRegisterInfo &MRI;		MachineRegisterInfo &MRI;
const LegalizerInfo &LI;		const LegalizerInfo &LI;
};		};

/// Helper function that creates the given libcall.		/// Helper function that creates the given libcall.
LegalizerHelper::LegalizeResult		LegalizerHelper::LegalizeResult
createLibcall(MachineIRBuilder &MIRBuilder, RTLIB::Libcall Libcall,		createLibcall(MachineIRBuilder &MIRBuilder, RTLIB::Libcall Libcall,
const CallLowering::ArgInfo &Result,		const CallLowering::ArgInfo &Result,
ArrayRef<CallLowering::ArgInfo> Args);		ArrayRef<CallLowering::ArgInfo> Args);

} // End namespace llvm.		} // End namespace llvm.

#endif		#endif

include/llvm/CodeGen/GlobalISel/MachineIRBuilder.h

[658 lines not shown]	public:
/// \pre \p Op0 and Op1 must be generic virtual registers with the		/// \pre \p Op0 and Op1 must be generic virtual registers with the
/// same number of elements as \p Res. If \p Res is a scalar,		/// same number of elements as \p Res. If \p Res is a scalar,
/// \p Op0 must be either a scalar or pointer.		/// \p Op0 must be either a scalar or pointer.
/// \pre \p Pred must be an integer predicate.		/// \pre \p Pred must be an integer predicate.
///		///
/// \return a MachineInstrBuilder for the newly created instruction.		/// \return a MachineInstrBuilder for the newly created instruction.
MachineInstrBuilder buildICmp(CmpInst::Predicate Pred,		MachineInstrBuilder buildICmp(CmpInst::Predicate Pred,
unsigned Res, unsigned Op0, unsigned Op1);		unsigned Res, unsigned Op0, unsigned Op1);
		template <typename DstTy, typename... UseArgsTy>
		MachineInstrBuilder buildICmp(CmpInst::Predicate Pred, DstTy &&Dst,
		UseArgsTy &&... UseArgs) {
		return buildICmp(Pred, getDestFromArg(Dst), getRegFromArg(UseArgs)...);
		}
		rtereshin: I'm not sure what's the point of having variadic templates here. Does `buildICmp` or `buildSelect` make sense with 0, 1, or let's say 4 operands (excluding destination)?
		aditya_nandakumar (Author): This required typing one less argument ;). Additionally, this lets you call `buildICmp` directly like this: `auto MIB = something...` `auto Reg = someReg;` `auto MIBCmp = Builder.buildICmp(Pred, SomeTy, MIB, Reg);` This lets Op0 and Op1 each be either a MachineInstrBuilder or a register (all four combinations are possible). If you need to explicitly specify the use args, it needs to be something like this: `template<typename DstTy, typename UseArg0Ty, typename UseArg1Ty> MachineInstrBuilder buildICmp(CmpInst::Predicate Pred, DstTy &&Dst, UseArg0Ty &&Arg0, UseArg1Ty &&Arg1) { return buildICmp(Pred, getDestFromArg(Dst), getRegFromArg(Arg0), getRegFromArg(Arg1)); }` which is a little more typing and longer.

/// Build and insert a \p Res = G_FCMP \p Pred\p Op0, \p Op1		/// Build and insert a \p Res = G_FCMP \p Pred\p Op0, \p Op1
///		///
/// \pre setBasicBlock or setMI must have been called.		/// \pre setBasicBlock or setMI must have been called.

/// \pre \p Res must be a generic virtual register with scalar or		/// \pre \p Res must be a generic virtual register with scalar or
/// vector type. Typically this starts as s1 or <N x s1>.		/// vector type. Typically this starts as s1 or <N x s1>.
/// \pre \p Op0 and Op1 must be generic virtual registers with the		/// \pre \p Op0 and Op1 must be generic virtual registers with the
[12 lines not shown]	public:
/// with the same type.		/// with the same type.
/// \pre \p Tst must be a generic virtual register with scalar, pointer or		/// \pre \p Tst must be a generic virtual register with scalar, pointer or
/// vector type. If vector then it must have the same number of		/// vector type. If vector then it must have the same number of
/// elements as the other parameters.		/// elements as the other parameters.
///		///
/// \return a MachineInstrBuilder for the newly created instruction.		/// \return a MachineInstrBuilder for the newly created instruction.
MachineInstrBuilder buildSelect(unsigned Res, unsigned Tst,		MachineInstrBuilder buildSelect(unsigned Res, unsigned Tst,
unsigned Op0, unsigned Op1);		unsigned Op0, unsigned Op1);
		template <typename DstTy, typename... UseArgsTy>
		MachineInstrBuilder buildSelect(DstTy &&Dst, UseArgsTy &&... UseArgs) {
		return buildSelect(getDestFromArg(Dst), getRegFromArg(UseArgs)...);
		}

/// Build and insert \p Res = G_INSERT_VECTOR_ELT \p Val,		/// Build and insert \p Res = G_INSERT_VECTOR_ELT \p Val,
/// \p Elt, \p Idx		/// \p Elt, \p Idx
///		///
/// \pre setBasicBlock or setMI must have been called.		/// \pre setBasicBlock or setMI must have been called.
/// \pre \p Res and \p Val must be a generic virtual register		/// \pre \p Res and \p Val must be a generic virtual register
// with the same vector type.		// with the same vector type.
/// \pre \p Elt and \p Idx must be a generic virtual register		/// \pre \p Elt and \p Idx must be a generic virtual register
[173 lines not shown]

include/llvm/Target/GlobalISel/SelectionDAGCompat.td

	[77 lines not shown]
	def : GINodeEquiv<G_FEXP2, fexp2>;			def : GINodeEquiv<G_FEXP2, fexp2>;
	def : GINodeEquiv<G_FLOG2, flog2>;			def : GINodeEquiv<G_FLOG2, flog2>;
	def : GINodeEquiv<G_INTRINSIC, intrinsic_wo_chain>;			def : GINodeEquiv<G_INTRINSIC, intrinsic_wo_chain>;
	// ISD::INTRINSIC_VOID can also be handled with G_INTRINSIC_W_SIDE_EFFECTS.			// ISD::INTRINSIC_VOID can also be handled with G_INTRINSIC_W_SIDE_EFFECTS.
	def : GINodeEquiv<G_INTRINSIC_W_SIDE_EFFECTS, intrinsic_void>;			def : GINodeEquiv<G_INTRINSIC_W_SIDE_EFFECTS, intrinsic_void>;
	def : GINodeEquiv<G_INTRINSIC_W_SIDE_EFFECTS, intrinsic_w_chain>;			def : GINodeEquiv<G_INTRINSIC_W_SIDE_EFFECTS, intrinsic_w_chain>;
	def : GINodeEquiv<G_BR, br>;			def : GINodeEquiv<G_BR, br>;
	def : GINodeEquiv<G_BSWAP, bswap>;			def : GINodeEquiv<G_BSWAP, bswap>;
				def : GINodeEquiv<G_CTLZ, ctlz>;
				def : GINodeEquiv<G_CTTZ, cttz>;
				def : GINodeEquiv<G_CTLZ_ZERO_UNDEF, ctlz_zero_undef>;
				def : GINodeEquiv<G_CTTZ_ZERO_UNDEF, cttz_zero_undef>;
				def : GINodeEquiv<G_CTPOP, ctpop>;

	// Broadly speaking G_LOAD is equivalent to ISD::LOAD but there are some			// Broadly speaking G_LOAD is equivalent to ISD::LOAD but there are some
	// complications that tablegen must take care of. For example, Predicates such			// complications that tablegen must take care of. For example, Predicates such
	// as isSignExtLoad require that this is not a perfect 1:1 mapping since a			// as isSignExtLoad require that this is not a perfect 1:1 mapping since a
	// sign-extending load is (G_SEXTLOAD x) in GlobalISel. Additionally,			// sign-extending load is (G_SEXTLOAD x) in GlobalISel. Additionally,
	// G_LOAD handles both atomic and non-atomic loads where as SelectionDAG had			// G_LOAD handles both atomic and non-atomic loads where as SelectionDAG had
	// separate nodes for them. This GINodeEquiv maps the non-atomic loads to			// separate nodes for them. This GINodeEquiv maps the non-atomic loads to
	// G_LOAD with a non-atomic MachineMemOperand.			// G_LOAD with a non-atomic MachineMemOperand.
	[38 lines not shown]

lib/CodeGen/GlobalISel/LegalizerHelper.cpp

[11 lines not shown]
/// primary legalization.		/// primary legalization.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/CodeGen/GlobalISel/LegalizerHelper.h"		#include "llvm/CodeGen/GlobalISel/LegalizerHelper.h"
#include "llvm/CodeGen/GlobalISel/CallLowering.h"		#include "llvm/CodeGen/GlobalISel/CallLowering.h"
#include "llvm/CodeGen/GlobalISel/LegalizerInfo.h"		#include "llvm/CodeGen/GlobalISel/LegalizerInfo.h"
#include "llvm/CodeGen/MachineRegisterInfo.h"		#include "llvm/CodeGen/MachineRegisterInfo.h"
		#include "llvm/CodeGen/TargetInstrInfo.h"
#include "llvm/CodeGen/TargetLowering.h"		#include "llvm/CodeGen/TargetLowering.h"
#include "llvm/CodeGen/TargetSubtargetInfo.h"		#include "llvm/CodeGen/TargetSubtargetInfo.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"


#define DEBUG_TYPE "legalizer"		#define DEBUG_TYPE "legalizer"

using namespace llvm;		using namespace llvm;
using namespace LegalizeActions;		using namespace LegalizeActions;

LegalizerHelper::LegalizerHelper(MachineFunction &MF)		LegalizerHelper::LegalizerHelper(MachineFunction &MF)
: MRI(MF.getRegInfo()), LI(*MF.getSubtarget().getLegalizerInfo()) {		: MRI(MF.getRegInfo()), LI(*MF.getSubtarget().getLegalizerInfo()) {
MIRBuilder.setMF(MF);		MIRBuilder.setMF(MF);
}		}

		LegalizerHelper::LegalizerHelper(MachineFunction &MF, const LegalizerInfo &LI)
		: MRI(MF.getRegInfo()), LI(LI) {
		MIRBuilder.setMF(MF);
		}
		rtereshin: 👍
LegalizerHelper::LegalizeResult		LegalizerHelper::LegalizeResult
LegalizerHelper::legalizeInstrStep(MachineInstr &MI) {		LegalizerHelper::legalizeInstrStep(MachineInstr &MI) {
LLVM_DEBUG(dbgs() << "Legalizing: "; MI.print(dbgs()));		LLVM_DEBUG(dbgs() << "Legalizing: "; MI.print(dbgs()));

auto Step = LI.getAction(MI, MRI);		auto Step = LI.getAction(MI, MRI);
switch (Step.Action) {		switch (Step.Action) {
case Legal:		case Legal:
LLVM_DEBUG(dbgs() << ".. Already legal\n");		LLVM_DEBUG(dbgs() << ".. Already legal\n");
[935 lines not shown]	if (DstTy.isScalar()) {
break;		break;
}		}
MI.eraseFromParent();		MI.eraseFromParent();
return Legalized;		return Legalized;
}		}

return UnableToLegalize;		return UnableToLegalize;
}		}
		case TargetOpcode::G_CTLZ_ZERO_UNDEF:
		case TargetOpcode::G_CTTZ_ZERO_UNDEF:
		case TargetOpcode::G_CTLZ:
		case TargetOpcode::G_CTTZ:
		case TargetOpcode::G_CTPOP:
		return lowerBitCount(MI, TypeIdx, Ty);
}		}
}		}

LegalizerHelper::LegalizeResult		LegalizerHelper::LegalizeResult
LegalizerHelper::fewerElementsVector(MachineInstr &MI, unsigned TypeIdx,		LegalizerHelper::fewerElementsVector(MachineInstr &MI, unsigned TypeIdx,
LLT NarrowTy) {		LLT NarrowTy) {
// FIXME: Don't know how to handle secondary types yet.		// FIXME: Don't know how to handle secondary types yet.
if (TypeIdx != 0)		if (TypeIdx != 0)
[24 lines not shown]	case TargetOpcode::G_ADD: {
}		}

MIRBuilder.buildMerge(DstReg, DstRegs);		MIRBuilder.buildMerge(DstReg, DstRegs);
MI.eraseFromParent();		MI.eraseFromParent();
return Legalized;		return Legalized;
}		}
}		}
}		}

		LegalizerHelper::LegalizeResult
		LegalizerHelper::lowerBitCount(MachineInstr &MI, unsigned TypeIdx, LLT Ty) {
		unsigned Opc = MI.getOpcode();
		auto &TII = *MI.getMF()->getSubtarget().getInstrInfo();
		auto isLegalOrCustom = [this](const LegalityQuery &Q) {
		auto QAction = LI.getAction(Q).Action;
		return QAction == Legal || QAction == Custom;
		};
		switch (Opc) {
		default:
		return UnableToLegalize;
		case TargetOpcode::G_CTLZ_ZERO_UNDEF: {
		// This trivially expands to CTLZ.
		MI.setDesc(TII.get(TargetOpcode::G_CTLZ));
		MIRBuilder.recordInsertion(&MI);
		return Legalized;
		}
		case TargetOpcode::G_CTLZ: {
		unsigned SrcReg = MI.getOperand(1).getReg();
		unsigned Len = Ty.getSizeInBits();
		if (isLegalOrCustom({TargetOpcode::G_CTLZ_ZERO_UNDEF, {Ty}})) {
		// If CTLZ_ZERO_UNDEF is legal or custom, emit that and a select with
		// zero.
		auto MIBCtlzZU =
		MIRBuilder.buildInstr(TargetOpcode::G_CTLZ_ZERO_UNDEF, Ty, SrcReg);
		rtereshin (done): Do you think we could `MI.setDesc(TII.get(TargetOpcode::G_CTLZ_ZERO_UNDEF));` here in-place instead of recreating the instruction from scratch?
		aditya_nandakumar (Author): Definitely we should be doing that as often as we can, but in this case, if we go that route, we'd end up in a situation where the final select would need to create a new vreg, and then it's probably not safe to just blindly replace the original dest with the new reg. It's definitely possible to do it in an uglier way, where we replace the first inst in place and change its destination to a newly created vreg, but this sacrifices readability of the code. I think CTLZ is not frequent enough to do this.
		auto MIBZero = MIRBuilder.buildConstant(Ty, 0);
		auto MIBLen = MIRBuilder.buildConstant(Ty, Len);
		auto MIBICmp = MIRBuilder.buildICmp(CmpInst::ICMP_EQ, LLT::scalar(1),
		SrcReg, MIBZero);
		MIRBuilder.buildSelect(MI.getOperand(0).getReg(), MIBICmp, MIBLen,
		MIBCtlzZU);
		MI.eraseFromParent();
		return Legalized;
		}
		// for now, we do this:
		// x = x | (x >> 1);
		// x = x | (x >> 2);
		// ...
		// x = x | (x >> 16);
		// x = x | (x >> 32); // for 64-bit input
		// return popcount(~x);
		//
		// Ref: "Hacker's Delight" by Henry Warren
		unsigned Op = SrcReg;
		for (unsigned i = 0; (1U << i) <= (Len / 2); ++i) {
		rtereshin (done): I don't think this is going to work if `Len` is not a power of 2. For the sake of a smaller example, let's say `Len` is 6. The shift amount is going to take values 1 and 2. Let's say the `SrcReg`'s value is `10 00 00`. The `Op` takes values:
		`10 00 00` (initial)
		`11 00 00` = `10 00 00` | `01 00 00`, `i == 0`
		`11 11 00` = `11 00 00` | `00 11 00`, `i == 1`
		The final value is 2, while it had to be 0. This could probably be fixed by rounding `Len` up to the closest power of 2.
		aditya_nandakumar (Author): Good catch. I'll replace it with the following.
		NewLen = RoundUpToPow2(Len);
		x = x | (x >> 1);
		... until NewLen/2
		return Len - PopCount(x);
		auto MIBTmp3 = MIRBuilder.buildConstant(Ty, 1ULL << i);
		rtereshin (done): Maybe slightly more meaningful name?
		auto MIBOp = MIRBuilder.buildInstr(
		TargetOpcode::G_OR, Ty, Op,
		MIRBuilder.buildInstr(TargetOpcode::G_LSHR, Ty, Op, MIBTmp3));
		Op = MIBOp->getOperand(0).getReg();
		}
		auto MIBNot = MIRBuilder.buildInstr(TargetOpcode::G_XOR, Ty, Op,
		MIRBuilder.buildConstant(Ty, -1));
		MIRBuilder.buildInstr(TargetOpcode::G_CTPOP, MI.getOperand(0).getReg(),
		MIBNot);
		rtereshin (done): Same here, could we change `MI` in-place?
		MI.eraseFromParent();
		return Legalized;
		}
		case TargetOpcode::G_CTTZ_ZERO_UNDEF: {
		// This trivially expands to CTTZ.
		MI.setDesc(TII.get(TargetOpcode::G_CTTZ));
		MIRBuilder.recordInsertion(&MI);
		return Legalized;
		rtereshin: I think there is a small opportunity here, though I'm not sure how useful it is in practice: if we check `if (!isLegalOrCustom({TargetOpcode::G_CTPOP, {Ty}}) && isLegalOrCustom({TargetOpcode::G_CTLZ_ZERO_UNDEF, {Ty}}))` here and fall through if `true`, we could lower this to `32 - ctlz_zero_undef(~x & (x-1))` with no extra `select`s. Just a thought; it almost certainly isn't worth the effort.
		aditya_nandakumar (Author): That's something we could do. I can possibly address this in a subsequent patch.
		}
		case TargetOpcode::G_CTTZ: {
		unsigned SrcReg = MI.getOperand(1).getReg();
		unsigned Len = Ty.getSizeInBits();
		if (isLegalOrCustom({TargetOpcode::G_CTTZ_ZERO_UNDEF, {Ty}})) {
		// If CTTZ_ZERO_UNDEF is legal or custom, emit that and a select with
		// zero.
		auto MIBCttzZU =
		MIRBuilder.buildInstr(TargetOpcode::G_CTTZ_ZERO_UNDEF, Ty, SrcReg);
		auto MIBZero = MIRBuilder.buildConstant(Ty, 0);
		auto MIBLen = MIRBuilder.buildConstant(Ty, Len);
		auto MIBICmp = MIRBuilder.buildICmp(CmpInst::ICMP_EQ, LLT::scalar(1),
		SrcReg, MIBZero);
		MIRBuilder.buildSelect(MI.getOperand(0).getReg(), MIBICmp, MIBLen,
		MIBCttzZU);
		MI.eraseFromParent();
		return Legalized;
		}
		// for now, we use: { return popcount(~x & (x - 1)); }
		// unless the target has ctlz but not ctpop, in which case we use:
		// { return 32 - nlz(~x & (x-1)); }
		// Ref: "Hacker's Delight" by Henry Warren
		auto MIBNot = MIRBuilder.buildInstr(TargetOpcode::G_XOR, Ty, SrcReg,
		MIRBuilder.buildConstant(Ty, -1));
		auto MIBTmp = MIRBuilder.buildInstr(
		TargetOpcode::G_AND, Ty, MIBNot,
		MIRBuilder.buildInstr(TargetOpcode::G_SUB, Ty, SrcReg,
		MIRBuilder.buildConstant(Ty, 1)));
		rtereshin (done): We already have -1 constant built just above, maybe do `G_ADD` with it instead?
		aditya_nandakumar (Author): I've changed it. Thanks
		if (!isLegalOrCustom({TargetOpcode::G_CTPOP, {Ty}}) &&
		isLegalOrCustom({TargetOpcode::G_CTLZ, {Ty}})) {
		MIRBuilder.buildInstr(
		TargetOpcode::G_SUB, MI.getOperand(0).getReg(),
		MIRBuilder.buildConstant(Ty, Len),
		MIRBuilder.buildInstr(TargetOpcode::G_CTLZ, Ty, MIBTmp));
		MI.eraseFromParent();
		return Legalized;
		}
		MIRBuilder.buildInstr(TargetOpcode::G_CTPOP, MI.getOperand(0).getReg(),
		MIBTmp);
		MI.eraseFromParent();
		return Legalized;
		}
		}
		}
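
The G_CTTZ lowering in the diff above relies on two identities from the code comments: `popcount(~x & (x - 1))` and, when only ctlz is available, `Len - ctlz(~x & (x - 1))`. As a rough check on plain integers (not LLVM code; `popCount`, `ctlzRef`, `cttzViaPopCount`, and `cttzViaCtlz` are hypothetical names used only here), the two forms can be compared like this:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: mask covering the low Len bits.
static uint64_t lowMask(unsigned Len) {
  assert(Len >= 1 && Len <= 64 && "sketch only handles 1..64 bits");
  return (Len == 64) ? ~0ULL : ((1ULL << Len) - 1);
}

// Hypothetical helper: number of set bits in X.
static unsigned popCount(uint64_t X) {
  unsigned N = 0;
  for (; X != 0; X >>= 1)
    N += X & 1;
  return N;
}

// Reference ctlz on a Len-bit value; defined as Len when X is zero.
static unsigned ctlzRef(uint64_t X, unsigned Len) {
  unsigned N = 0;
  for (unsigned Bit = Len; Bit-- > 0;) {
    if ((X >> Bit) & 1)
      break;
    ++N;
  }
  return N;
}

// cttz(x) = popcount(~x & (x - 1)): the mask selects exactly the
// zero bits below the lowest set bit.
static unsigned cttzViaPopCount(uint64_t X, unsigned Len) {
  const uint64_t Mask = lowMask(Len);
  X &= Mask;
  return popCount(~X & (X - 1) & Mask);
}

// cttz(x) = Len - ctlz(~x & (x - 1)): the same mask is a run of low ones,
// so its leading-zero count is Len minus its length.
static unsigned cttzViaCtlz(uint64_t X, unsigned Len) {
  const uint64_t Mask = lowMask(Len);
  X &= Mask;
  return Len - ctlzRef(~X & (X - 1) & Mask, Len);
}
```

Note that `~x & (x - 1)` is zero whenever `x` is odd, so in this sketch the ctlz-based form depends on a ctlz that is defined (as `Len`) for a zero input.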

[GISel]: Add Legalization/lowering code for bit counting operations (Closed, Public)