This is an archive of the discontinued LLVM Phabricator instance.

"float2int": Add a new pass to demote from float to int where possible.
Closed Public

Authored by jmolloy on Feb 20 2015, 7:58 AM.

Details

Reviewers

Summary

It is possible to have code that converts from integer to float, performs
operations then converts back, and the result is provably the same as if
integers were used.

This can come from different sources, but the most obvious is a helper function
that uses floats but whose arguments at an inlined call site are integers.

This pass considers all integers requiring a bitwidth less than or equal to
the bitwidth of the mantissa of a floating point type (23 for floats, 52 for
doubles) as exactly representable in floating point.
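
As a rough illustration of the exactness claim above, the sketch below round-trips integers through float and compares a small multiply-add computed in both domains. It assumes IEEE-754 single precision (checked by a static_assert); the helper names survivesFloatRoundTrip and mulAddAgrees are hypothetical and are not part of the patch.

```cpp
#include <cstdint>
#include <limits>

// Assumption: float is IEEE-754 binary32, i.e. 23 stored mantissa bits
// (24 bits of precision once the implicit leading one is counted).
static_assert(std::numeric_limits<float>::is_iec559 &&
                  std::numeric_limits<float>::digits == 24,
              "sketch assumes IEEE-754 single precision");

// Hypothetical helper: true if V survives an int -> float -> int round trip.
bool survivesFloatRoundTrip(int32_t V) {
  float F = static_cast<float>(V);
  return static_cast<int64_t>(F) == static_cast<int64_t>(V);
}

// Hypothetical helper: computes A * B + C once in float and once in integer
// arithmetic, and reports whether the two results agree.
bool mulAddAgrees(int32_t A, int32_t B, int32_t C) {
  float FR = static_cast<float>(A) * static_cast<float>(B) +
             static_cast<float>(C);
  int64_t IR = static_cast<int64_t>(A) * B + C;
  return static_cast<int64_t>(FR) == IR;
}
```

Values that fit in the mantissa, such as 1 << 23, round-trip unchanged, while (1 << 24) + 1 does not. This is why the pass has to bound the range of every intermediate value and not only the inputs.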

Diff Detail

Repository: rL LLVM

Event Timeline

jmolloy updated this revision to Diff 20404. Feb 20 2015, 7:58 AM

jmolloy retitled this revision to "float2int": Add a new pass to demote from float to int where possible.

jmolloy updated this object.

jmolloy edited the test plan for this revision. (Show Details)

jmolloy added a reviewer: hfinkel.

jmolloy set the repository for this revision to rL LLVM.

jmolloy added a parent revision: D7789: [ConstantRange] Teach multiply to be cleverer about signed ranges..

jmolloy added a subscriber: Unknown Object (MLST).

This is neat, thanks for posting it!

lib/Transforms/Scalar/Float2Int.cpp
47	Make this a command-line option.
157	You can also handle Select here (and PHIs, although that might require more work?)
224	This is directly recursive; that's likely a bad idea (especially considering that it has no depth cutoff). Can you make this be worklist-based instead. We don't need a small cutoff if we have a worklist.
256	Why is the FIXME here? Do you want to mark these as unsigned under some circumstances?
281	What's this FIXME for? You'd want to mark the full range? (that would make the unioning unnecessary ;) )
318	Why are you checking for isFullSet() here? What about if you only have i8 (or something definitively smaller than the mantissa)?
333	Hrmm, we have other floating-point types in the IR. You need to either exclude them somewhere, or handle them. Either way, the real way to do this is to call: ... = APFloat::semanticsPrecision(ConvertedToTy->getFltSemantics()) - 1;
341	Exactly where does 23 come from here? I understand that is the number of bits in a single-precision mantissa, but what does it have to do with picking the integer type? Can you make use of Type::getIntNTy?
385	You should not need to explicitly handle the equal-types case here. IRB::CreateZExtOrTrunc, will also check this and do nothing if the incoming and outgoing bitwidths are equal. (this obviously applies to the checks below as well).
442	Hrmm. To have converted an instruction, you must have converted all uses, right? If we know these need to be dead, and we just don't know the order, we can use the same technique employed by ADCE: first, loop over all instructions calling I->dropAllReferences(). Then, loop over them again, erasing them as you do here.
test/Transforms/Float2Int/basic.ll
4	Please move these descriptions to be with each associated test.
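
The two-pass cleanup suggested for line 442 can be sketched with a toy model. Everything here (Node, NumUses, eraseDeadNodes) is a hypothetical stand-in and not LLVM's Instruction API; the only point is that once every dead node has dropped its operand references, the erase order stops mattering.

```cpp
#include <memory>
#include <vector>

// Toy stand-in for an IR instruction: it records its operands and how many
// other nodes still use it. This is not LLVM's Instruction class.
struct Node {
  std::vector<Node *> Operands;
  unsigned NumUses = 0;

  void addOperand(Node *Op) {
    Operands.push_back(Op);
    ++Op->NumUses;
  }

  // Analogue of dropAllReferences(): stop using every operand.
  void dropAllReferences() {
    for (Node *Op : Operands)
      --Op->NumUses;
    Operands.clear();
  }
};

// Erase every node in Dead, in whatever order the container happens to hold
// them. Returns false if a node still had users when it was about to go.
bool eraseDeadNodes(std::vector<std::unique_ptr<Node>> &Dead) {
  // Pass 1: sever all operand links so that no dead node uses another.
  for (auto &N : Dead)
    N->dropAllReferences();
  // Pass 2: every node is now use-free, so any order is safe.
  for (auto &N : Dead) {
    if (N->NumUses != 0)
      return false;
    N.reset();
  }
  Dead.clear();
  return true;
}
```

In this model a definition can be erased before its user, which would otherwise leave the user pointing at freed memory.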

jmolloy updated this revision to Diff 20682. Feb 25 2015, 9:01 AM

jmolloy edited edge metadata.

Hi Hal,

Thanks for the quick review! Sorry for not getting back to it for a couple of days - I was at a conference.

I've uploaded a new patch addressing most of your comments.

You can also handle Select here (and PHIs, although that might require more work?)

Yes, I think I can. I've added a FIXME for this and will do (at least the select work) in a followup, if that's OK? PHIs I think will certainly need more thought.

Why is the FIXME here? Do you want to mark these as unsigned under some circumstances?

The FIXME was (a) not meant to make it to upstream code review :) and (b) because I hadn't clocked the sense of APSInt's constructor - I thought it was "isSigned" rather than "isUnsigned", and my current code felt like it was doing the wrong thing. It's actually right.

Why are you checking for isFullSet() here? What about if you only have i8 (or something definitively smaller than the mantissa)?

R's underlying type is i65 (MaxBitwidth + 1), so isFullSet() will only trigger on either an i64 multiplication/addition or a poison value (which I define as "i65 full-set"). 0..255 should never trigger isFullSet here, unless I'm misunderstanding how ConstantRange works.
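
A scaled-down toy may make the "one bit wider" argument easier to follow. It pretends the widest supported integer is 16 bits, so ranges live in a 17-bit domain (the patch uses 64 and 65). ToyRange is a hypothetical closed interval and deliberately much simpler than LLVM's ConstantRange, which is half-open and can wrap.

```cpp
#include <cstdint>

// Scaled-down assumption: the widest integer of interest is 16 bits, so
// ranges are tracked in a 17-bit domain.
constexpr unsigned ToyMaxBW = 16;
constexpr uint64_t ToyDomainSize = uint64_t(1) << (ToyMaxBW + 1);

// Closed unsigned interval [Lo, Hi]. The full set doubles as "poison".
struct ToyRange {
  uint64_t Lo, Hi;
  bool isFullSet() const { return Lo == 0 && Hi == ToyDomainSize - 1; }
};

ToyRange fullSet() { return {0, ToyDomainSize - 1}; }

// Range of an unsigned BW-bit value, for BW <= ToyMaxBW.
ToyRange unsignedRangeOf(unsigned BW) {
  return {0, (uint64_t(1) << BW) - 1};
}

// Multiply two ranges, degrading to the full set once the upper bound no
// longer fits in the 17-bit domain.
ToyRange multiply(ToyRange A, ToyRange B) {
  uint64_t Hi = A.Hi * B.Hi;
  if (Hi >= ToyDomainSize)
    return fullSet();
  return {A.Lo * B.Lo, Hi};
}
```

Because the domain is one bit wider than any legal input, even the widest input range is a strict subset of it, so the full set is only reached through overflow or an explicit poison marker.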

Hrmm, we have other floating-point types in the IR. You need to either exclude them somewhere, or handle them.

Thanks! I wasn't aware of the fltSemantics precision. Updated.
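
For readers without an LLVM checkout, a standard-library analogue of the "precision minus one" query is sketched below. It assumes IEEE-754 types; storedMantissaBits is a hypothetical helper, and std::numeric_limits only stands in for the APFloat call named in the review.

```cpp
#include <limits>

// Hypothetical helper: the number of explicitly stored mantissa bits of T,
// that is, the precision minus the implicit leading bit.
template <typename T>
constexpr int storedMantissaBits() {
  static_assert(std::numeric_limits<T>::is_iec559, "assumes IEEE-754");
  return std::numeric_limits<T>::digits - 1;
}
```

Under that assumption the helper yields 23 for float and 52 for double, matching the figures in the summary.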

Exactly where does 23 come from here?

Thanks for noticing that. It's wrong, it should of course be 32. I don't really want to use getIntNTy() because I don't want to get an illegal type - I've restricted myself to i32 or i64 for the moment. I could grab TLI and query that I suppose?
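
The "i32 or i64 only" restriction could look roughly like the sketch below. pickIntegerWidth is a hypothetical helper and not the code in the patch; returning 0 for "no suitable type" is likewise an assumption made for this example.

```cpp
// Hypothetical helper: choose the narrower of the two supported integer
// widths that can hold MinBW bits, or 0 if neither is wide enough.
unsigned pickIntegerWidth(unsigned MinBW) {
  if (MinBW <= 32)
    return 32;
  if (MinBW <= 64)
    return 64;
  return 0; // Too wide: the caller would have to give up on this chain.
}
```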

Hrmm. To have converted an instruction, you must have converted all uses, right?

In fact, because I'm using a MapVector and reversing through it I hit all nodes after their uses. So I just don't need the use_empty() check.

jmolloy updated this revision to Diff 20683. Feb 25 2015, 9:14 AM

This is directly recursive; that's likely a bad idea (especially considering that it has no depth cutoff). Can you make this be worklist-based instead. We don't need a small cutoff if we have a worklist.

Hrmm, I agree. It's slightly awkward because I do need the return value of traverse() right there and then, so converting it to be breadth first is going to be awkward.

I think I'll separate these out into two parts - first, scan up the use-def graph using a worklist to find all the nodes that I want to visit. Then, do a linear walk of the result back-to-front calculating node values.

For the moment I've added a cutoff of 32 instructions; I'll go work on reworking the algorithm now.
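
The two-phase idea can be sketched on a toy expression graph. ToyNode and evaluate are hypothetical names, the graph is assumed to be acyclic, and "value" here is just the sum of operand values. One deliberate difference from the plan above: reversing a plain discovery order is not guaranteed to place every definition before its users when operands are shared, so this sketch requeues a node whose operands are not ready yet.

```cpp
#include <cstddef>
#include <deque>
#include <vector>

// Toy expression node. Operands are indices into the node table; leaves have
// no operands and carry a preset Value. This is not LLVM code.
struct ToyNode {
  std::vector<std::size_t> Operands;
  long Value = 0;
};

// Computes Root's value (the sum of its operands' values, applied bottom-up)
// without recursion. Assumes the graph reachable from Root is acyclic.
long evaluate(std::vector<ToyNode> &Nodes, std::size_t Root) {
  // Phase 1: worklist-driven discovery of every node reachable from Root.
  std::vector<bool> Seen(Nodes.size(), false);
  std::vector<std::size_t> Order;
  std::deque<std::size_t> Worklist{Root};
  while (!Worklist.empty()) {
    std::size_t N = Worklist.back();
    Worklist.pop_back();
    if (Seen[N])
      continue;
    Seen[N] = true;
    Order.push_back(N);
    for (std::size_t Op : Nodes[N].Operands)
      Worklist.push_back(Op);
  }

  // Phase 2: walk the discovered nodes back to front. A node whose operands
  // are not all computed yet is moved to the front and retried later.
  std::vector<bool> Done(Nodes.size(), false);
  std::deque<std::size_t> Pending(Order.begin(), Order.end());
  while (!Pending.empty()) {
    std::size_t N = Pending.back();
    Pending.pop_back();
    bool Ready = true;
    for (std::size_t Op : Nodes[N].Operands)
      Ready = Ready && Done[Op];
    if (!Ready) {
      Pending.push_front(N);
      continue;
    }
    if (!Nodes[N].Operands.empty()) {
      Nodes[N].Value = 0;
      for (std::size_t Op : Nodes[N].Operands)
        Nodes[N].Value += Nodes[Op].Value;
    }
    Done[N] = true;
  }
  return Nodes[Root].Value;
}
```

Neither phase recurses, so the depth of the expression graph no longer bounds what can be handled.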

jmolloy updated this revision to Diff 20686. Feb 25 2015, 9:42 AM

I'm going to defer to Hal on the code for the moment. It sounds like things are still in progress. Once you've got the approach nailed down, I'll try to take another look for nits.

At a high level, why are you restricting yourself to entire sequences? It seems like this would be profitable even if all you did was push the float to int conversion back a bit or the int to float forward a bit. This might even simplify the code.

Is the concern here that the backends might not make appropriate use of the floating point unit for integer ops?

lib/Transforms/Scalar/Float2Int.cpp
441	Can this be a range loop? Can you use a utility to delete things which are recursively trivially dead?

Hi Hal, Philip,

This latest revision removes the recursion, replacing it with a slightly more complicated but still understandable two-phase approach.

Cheers,

James

Hi Philip,

In D7790#129842, @reames wrote:

I'm going to defer to Hal on the code for the moment. It sounds like things are still in progress. Once you've got the approach nailed down, I'll try to take another look for nits.

Thanks!

At a high level, why are you restricting yourself to entire sequences? It seems like this would be profitable even if all you did was push the float to int conversion back a bit or the int to float forward a bit. This might even simplify the code.

The short answer is "first do no harm". I flirted with the idea of doing what you suggest in previous patch incarnations, and had trouble working out when it was profitable.

In the current incarnation, the heuristic is "yes", which simplifies things quite a bit. By converting an entire chain, we:

Never add any instructions.
Never move any instructions.

The former might be required if we had to push a cast up past a PHI, and the latter might cause issues if we pushed a cold cast into a hot region. We'd have to be very careful to only move casts outside of loops, never inside, for example (unless we could remove them completely). This means that an iterative, greedy method is really out of the question and a whole-function analysis is required.

I still hope to be able to do what you want, maybe at a (much?) later stage. I hope that the current infrastructure is a good staging point for such a transform - really it's just adding a weighting to each node and a global sum determining if the transform is worth it. The mechanism shouldn't have to change, I think.

Can this be a range loop? Can you use a utility to delete things which are recursively trivially dead?

In this latest incarnation it's now a trivial loop, so I think using a utility isn't really required. Also, I know which nodes are dead; getting a utility to discover them isn't really required.

It can't be a range loop because I need to reverse-iterate over the container to ensure I hit uses before defs.

Cheers,

James

Hi Hal,

Gentle ping.

Cheers,

James

Hi James,

Just so we have a record of what we talked about on IRC (and can give Hal a chance to disagree :-) ).
On x86, vector i64 muls can be much worse than vector double muls. Since this is pre-LoopV, and we don't know if we'll end up with vector or scalar code, I think the safe thing to do on x86 would be to disable this for cases where we'll do a double -> i64 transformation.

This means we should probably have a target hook for that that x86 can override.

hfinkel added inline comments. Mar 5 2015, 7:29 AM

lib/Transforms/Scalar/Float2Int.cpp
49	Don't put the "-" at the beginning of "-float2int-max-integer-bw".
66	Please make all of the functions here start with a lowercase letter.
75	Remove extra space before >
165	I don't like the terminology of calling this a 'forward' walk. It is walking instructions from the roots (which are at the end of the execution sequence), through their operands. I call this a 'backward' walk.
170	If you visit defs before uses, that seems 'forward' to me.
231	Remove blank line.
296	Remove blank line.
300	Remove blank line.
302	Remove blank line.
339	Remove blank line.
348	Remove blank line.
419	Where do you filter out MinBW > 64?
422	{ } not needed here.
473	Unnecessary line break?
test/Transforms/Float2Int/basic.ll
120	Please add negative test using a large integer type (i128, etc.).

In D7790#134758, @mkuper wrote:

Hi James,

Just so we have a record of what we talked about on IRC (and can give Hal a chance to disagree :-) ).

Good; I disagree :-)

The first question to answer is: What is the most useful and reasonable canonical form? The reason I support running this pass early in the pipeline is because I believe that demoting these int -> fp -> int sequences to int sequences, when semantically equivalent, is the most useful canonical form.

If it is useful, because of microarchitectural features, to use FP vector ops instead of integer vector ops, then that should be 'actively' handled later (instead of just taking advantage of it when it happens to happen).

So I think that this should run early by default, x86 included. We should also reverse the transformation later, perhaps within the vectorizer, using an actual cost model, if that proves useful.

On x86, vector i64 muls can be much worse than vector double muls. Since this is pre-LoopV, and we don't know if we'll end up with vector or scalar code, I think the safe thing to do on x86 would be to disable this for cases where we'll do a double -> i64 transformation.

This means we should probably have a target hook for that that x86 can override.

jmolloy updated this revision to Diff 22181. Mar 18 2015, 7:40 AM

Hi Hal,

Thanks for the review, sorry for the long round trip time. All comments should be fixed.

Cheers,

James

LGTM.

This revision is now accepted and ready to land. Mar 23 2015, 11:39 AM

jmolloy closed this revision. Jul 17 2015, 2:21 AM

Revision Contents

Path                                         Size
include/llvm/InitializePasses.h              1 line
include/llvm/LinkAllPasses.h                 1 line
include/llvm/Transforms/Scalar.h             7 lines
lib/Transforms/IPO/PassManagerBuilder.cpp    7 lines
lib/Transforms/Scalar/CMakeLists.txt         1 line
lib/Transforms/Scalar/Float2Int.cpp          535 lines
lib/Transforms/Scalar/Scalar.cpp             1 line
test/Transforms/Float2Int/basic.ll           227 lines
test/Transforms/Float2Int/toolarge.ll        16 lines

Diff 22181

include/llvm/InitializePasses.h

[289 lines not shown]
 void initializeStackMapLivenessPass(PassRegistry&);
 void initializeMachineCombinerPass(PassRegistry &);
 void initializeLoadCombinePass(PassRegistry&);
 void initializeRewriteSymbolsPass(PassRegistry&);
 void initializeWinEHPreparePass(PassRegistry&);
 void initializePlaceBackedgeSafepointsImplPass(PassRegistry&);
 void initializePlaceSafepointsPass(PassRegistry&);
 void initializeDwarfEHPreparePass(PassRegistry&);
+void initializeFloat2IntPass(PassRegistry&);
 }

 #endif

include/llvm/LinkAllPasses.h

[163 lines not shown] ForcePassLinking() {
       (void) llvm::createSLPVectorizerPass();
       (void) llvm::createBBVectorizePass();
       (void) llvm::createPartiallyInlineLibCallsPass();
       (void) llvm::createScalarizerPass();
       (void) llvm::createSeparateConstOffsetFromGEPPass();
       (void) llvm::createRewriteSymbolsPass();
       (void) llvm::createStraightLineStrengthReducePass();
       (void) llvm::createMemDerefPrinter();
+      (void) llvm::createFloat2IntPass();

       (void)new llvm::IntervalPartition();
       (void)new llvm::ScalarEvolution();
       ((llvm::Function*)nullptr)->viewCFGOnly();
       llvm::RGPassManager RGM;
       ((llvm::RegionPass*)nullptr)->runOnRegion((llvm::Region*)nullptr, RGM);
       llvm::AliasSetTracker X(*(llvm::AliasAnalysis*)nullptr);
       X.add(nullptr, 0, llvm::AAMDNodes()); // for -print-alias-sets
     }
   } ForcePassLinking; // Force link by creating a global definition.
 }

 #endif

include/llvm/Transforms/Scalar.h

[423 lines not shown]
 //===----------------------------------------------------------------------===//
 //
 // LoadCombine - Combine loads into bigger loads.
 //
 BasicBlockPass *createLoadCombinePass();

 FunctionPass *createStraightLineStrengthReducePass();


 //===----------------------------------------------------------------------===//
 //
 // PlaceSafepoints - Rewrite any IR calls to gc.statepoints and insert any
 // safepoint polls (method entry, backedge) that might be required. This pass
 // does not generate explicit relocation sequences - that's handled by
 // RewriteStatepointsForGC which can be run at an arbitrary point in the pass
 // order following this pass.
 //
 ModulePass *createPlaceSafepointsPass();

 //===----------------------------------------------------------------------===//
 //
 // RewriteStatepointsForGC - Rewrite any gc.statepoints which do not yet have
 // explicit relocations to include explicit relocations.
 //
 FunctionPass *createRewriteStatepointsForGCPass();

+//===----------------------------------------------------------------------===//
+//
+// Float2Int - Demote floats to ints where possible.
+//
+FunctionPass *createFloat2IntPass();
+
 } // End llvm namespace

 #endif

lib/Transforms/IPO/PassManagerBuilder.cpp

[53 lines not shown]
 static cl::opt<bool> UseNewSROA("use-new-sroa",
   cl::init(true), cl::Hidden,
   cl::desc("Enable the new, experimental SROA pass"));

 static cl::opt<bool>
 RunLoopRerolling("reroll-loops", cl::Hidden,
                  cl::desc("Run the loop rerolling pass"));

+static cl::opt<bool>
+RunFloat2Int("float-to-int", cl::Hidden, cl::init(true),
+             cl::desc("Run the float2int (float demotion) pass"));
+
 static cl::opt<bool> RunLoadCombine("combine-loads", cl::init(false),
                                     cl::Hidden,
                                     cl::desc("Run the load combining pass"));

 static cl::opt<bool>
 RunSLPAfterLoopVectorization("run-slp-after-loop-vectorization",
   cl::init(true), cl::Hidden,
   cl::desc("Run the SLP vectorizer (and BB vectorizer) after the Loop "
[233 lines not shown] void PassManagerBuilder::populateModulePassManager(
   MPM.add(createInstructionCombiningPass()); // Clean up after everything.
   addExtensionsToPM(EP_Peephole, MPM);

   // FIXME: This is a HACK! The inliner pass above implicitly creates a CGSCC
   // pass manager that we are specifically trying to avoid. To prevent this
   // we must insert a no-op module pass to reset the pass manager.
   MPM.add(createBarrierNoopPass());

+  if (RunFloat2Int)
+    MPM.add(createFloat2IntPass());
+
   // Re-rotate loops in all our loop nests. These may have fallout out of
   // rotated form due to GVN or other transformations, and the vectorizer relies
   // on the rotated form.
   if (ExtraVectorizerPasses)
     MPM.add(createLoopRotatePass());

   MPM.add(createLoopVectorizePass(DisableUnrollLoops, LoopVectorize));
   // FIXME: Because of #pragma vectorize enable, the passes below are always
[293 lines not shown]

lib/Transforms/Scalar/CMakeLists.txt

 add_llvm_library(LLVMScalarOpts
   ADCE.cpp
   AlignmentFromAssumptions.cpp
   BDCE.cpp
   ConstantHoisting.cpp
   ConstantProp.cpp
   CorrelatedValuePropagation.cpp
   DCE.cpp
   DeadStoreElimination.cpp
   EarlyCSE.cpp
   FlattenCFGPass.cpp
+  Float2Int.cpp
   GVN.cpp
   InductiveRangeCheckElimination.cpp
   IndVarSimplify.cpp
   JumpThreading.cpp
   LICM.cpp
   LoadCombine.cpp
   LoopDeletion.cpp
   LoopIdiomRecognize.cpp
[35 lines not shown]

lib/Transforms/Scalar/Float2Int.cpp

This file was added.

//===- Float2Int.cpp - Demote floating point ops to work on integers ------===//
//
// The LLVM Compiler Infrastructure
//
// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.
//
//===----------------------------------------------------------------------===//
//
// This file implements the Float2Int pass, which aims to demote floating
// point operations to work on integers, where that is losslessly possible.
//
//===----------------------------------------------------------------------===//

#define DEBUG_TYPE "float2int"
#include "llvm/ADT/DenseMap.h"
#include "llvm/ADT/EquivalenceClasses.h"
#include "llvm/ADT/SmallVector.h"
#include "llvm/ADT/APInt.h"
#include "llvm/ADT/APSInt.h"
#include "llvm/ADT/MapVector.h"
#include "llvm/IR/Constants.h"
#include "llvm/IR/ConstantRange.h"
#include "llvm/IR/InstIterator.h"
#include "llvm/IR/Instructions.h"
#include "llvm/IR/IRBuilder.h"
#include "llvm/IR/Module.h"
#include "llvm/Pass.h"
#include "llvm/Support/Debug.h"
#include "llvm/Transforms/Scalar.h"
#include <functional> // For std::function
#include <deque>
using namespace llvm;

// The algorithm is simple. Start at instructions that convert from the
// float to the int domain: fptoui, fptosi and fcmp. Walk up the def-use
// graph, using an equivalence datastructure to unify graphs that interfere.
//
// Mappable instructions are those with an integer corollary that, given
// integer domain inputs, produce an integer output; fadd, for example.
//
// If a non-mappable instruction is seen, this entire def-use graph is marked
// as non-transformable. If we see an instruction that converts from the
// integer domain to FP domain (uitofp,sitofp), we terminate our walk.

/// The largest integer type worth dealing with.
static cl::opt<unsigned>
MaxIntegerBW("float2int-max-integer-bw", cl::init(64), cl::Hidden,
             cl::desc("Max integer bitwidth to consider in float2int"
                      "(default=64)"));

namespace {
  struct Float2Int : public FunctionPass {
    static char ID; // Pass identification, replacement for typeid
    Float2Int() : FunctionPass(ID) {
      initializeFloat2IntPass(*PassRegistry::getPassRegistry());
    }

    bool runOnFunction(Function &F) override;
    void getAnalysisUsage(AnalysisUsage &AU) const override {
      AU.setPreservesCFG();
    }

    void findRoots(Function &F, SmallPtrSet<Instruction*,8> &Roots);
    ConstantRange seen(Instruction *I, ConstantRange R);
    ConstantRange badRange();
    ConstantRange unknownRange();
    ConstantRange validateRange(ConstantRange R);
    void walkBackwards(const SmallPtrSetImpl<Instruction*> &Roots);
    void walkForwards();
    bool validateAndTransform();
    Value *convert(Instruction *I, Type *ToTy);
    void cleanup();

    MapVector<Instruction*, ConstantRange > SeenInsts;
    SmallPtrSet<Instruction*,8> Roots;
    EquivalenceClasses<Instruction*> ECs;
    MapVector<Instruction*, Value*> ConvertedInsts;
    LLVMContext *Ctx;
  };
}

char Float2Int::ID = 0;
INITIALIZE_PASS(Float2Int, "float2int", "Float to int", false, false)

// Given a FCmp predicate, return a matching ICmp predicate if one
// exists, otherwise return BAD_ICMP_PREDICATE.
static CmpInst::Predicate mapFCmpPred(CmpInst::Predicate P) {
  switch (P) {
  case CmpInst::FCMP_OEQ:
  case CmpInst::FCMP_UEQ:
    return CmpInst::ICMP_EQ;
  case CmpInst::FCMP_OGT:
  case CmpInst::FCMP_UGT:
    return CmpInst::ICMP_SGT;
  case CmpInst::FCMP_OGE:
  case CmpInst::FCMP_UGE:
    return CmpInst::ICMP_SGE;
  case CmpInst::FCMP_OLT:
  case CmpInst::FCMP_ULT:
    return CmpInst::ICMP_SLT;
  case CmpInst::FCMP_OLE:
  case CmpInst::FCMP_ULE:
    return CmpInst::ICMP_SLE;
  case CmpInst::FCMP_ONE:
  case CmpInst::FCMP_UNE:
    return CmpInst::ICMP_NE;
  default:
    return CmpInst::BAD_ICMP_PREDICATE;
  }
}

// Given a floating point binary operator, return the matching
// integer version.
static Instruction::BinaryOps mapBinOpcode(unsigned Opcode) {
  switch (Opcode) {
  default: llvm_unreachable("Unhandled opcode!");
  case Instruction::FAdd: return Instruction::Add;
  case Instruction::FSub: return Instruction::Sub;
  case Instruction::FMul: return Instruction::Mul;
  }
}

// Find the roots - instructions that convert from the FP domain to
// integer domain.
void Float2Int::findRoots(Function &F, SmallPtrSet<Instruction*,8> &Roots) {
  for (auto &I : inst_range(F)) {
    switch (I.getOpcode()) {
    default: break;
    case Instruction::FPToUI:
    case Instruction::FPToSI:
      Roots.insert(&I);
      break;
    case Instruction::FCmp:
      if (mapFCmpPred(cast<CmpInst>(&I)->getPredicate()) !=
          CmpInst::BAD_ICMP_PREDICATE)
        Roots.insert(&I);
      break;
    }
  }
}

// Helper - mark I as having been traversed, having range R.
ConstantRange Float2Int::seen(Instruction *I, ConstantRange R) {
  DEBUG(dbgs() << "F2I: " << *I << ":" << R << "\n");
  if (SeenInsts.find(I) != SeenInsts.end())
    SeenInsts.find(I)->second = R;
  else
    SeenInsts.insert(std::make_pair(I, R));
  return R;
}

// Helper - get a range representing a poison value.
ConstantRange Float2Int::badRange() {
  return ConstantRange(MaxIntegerBW + 1, true);
}
ConstantRange Float2Int::unknownRange() {
  return ConstantRange(MaxIntegerBW + 1, false);
}
ConstantRange Float2Int::validateRange(ConstantRange R) {
  if (R.getBitWidth() > MaxIntegerBW + 1)
    return badRange();
  return R;
}

// The most obvious way to structure the search is a depth-first, eager
// search from each root. However, that requires direct recursion and so
// can only handle small instruction sequences. Instead, we split the search
// up into two phases:
//   - walkBackwards:  A breadth-first walk of the use-def graph starting from
//                     the roots. Populate "SeenInsts" with interesting
//                     instructions and poison values if they're obvious and
//                     cheap to compute. Calculate the equivalence set structure
//                     while we're here too.
//   - walkForwards:   Iterate over SeenInsts in reverse order, so we visit
//                     defs before their uses. Calculate the real range info.

// Breadth-first walk of the use-def graph; determine the set of nodes
// we care about and eagerly determine if some of them are poisonous.
void Float2Int::walkBackwards(const SmallPtrSetImpl<Instruction*> &Roots) {
  std::deque<Instruction*> Worklist(Roots.begin(), Roots.end());
  while (!Worklist.empty()) {
    Instruction *I = Worklist.back();
    Worklist.pop_back();

    if (SeenInsts.find(I) != SeenInsts.end())
      // Seen already.
      continue;

    switch (I->getOpcode()) {
      // FIXME: Handle select and phi nodes.
    default:
      // Path terminated uncleanly.
      seen(I, badRange());
      continue;

    case Instruction::UIToFP: {
      // Path terminated cleanly.
      unsigned BW = I->getOperand(0)->getType()->getPrimitiveSizeInBits();
      APInt Min = APInt::getMinValue(BW).zextOrSelf(MaxIntegerBW+1);
      APInt Max = APInt::getMaxValue(BW).zextOrSelf(MaxIntegerBW+1);
      seen(I, validateRange(ConstantRange(Min, Max)));
      continue;
    }

    case Instruction::SIToFP: {
      // Path terminated cleanly.
      unsigned BW = I->getOperand(0)->getType()->getPrimitiveSizeInBits();
      APInt SMin = APInt::getSignedMinValue(BW).sextOrSelf(MaxIntegerBW+1);
      APInt SMax = APInt::getSignedMaxValue(BW).sextOrSelf(MaxIntegerBW+1);
      seen(I, validateRange(ConstantRange(SMin, SMax)));
      continue;
    }

    case Instruction::FAdd:
    case Instruction::FSub:
    case Instruction::FMul:
    case Instruction::FPToUI:
    case Instruction::FPToSI:
    case Instruction::FCmp:
      break;
    }

    seen(I, unknownRange());
    for (Value *O : I->operands()) {
      if (Instruction *OI = dyn_cast<Instruction>(O)) {
        // Unify def-use chains if they interfere.
        ECs.unionSets(I, OI);
        Worklist.push_back(OI);
      } else if (!isa<ConstantFP>(O)) {
        // Not an instruction or ConstantFP? we can't do anything.
        seen(I, badRange());
        break;
      }
    }
  }
}

// Walk forwards down the list of seen instructions, so we visit defs before
// uses.
void Float2Int::walkForwards() {
  for (auto It = SeenInsts.rbegin(), E = SeenInsts.rend(); It != E; ++It) {
    if (It->second != unknownRange())
      continue;

    Instruction *I = It->first;
    std::function<ConstantRange(ArrayRef<ConstantRange>)> Op;
    switch (I->getOpcode()) {
      // FIXME: Handle select and phi nodes.
    default:
    case Instruction::UIToFP:
    case Instruction::SIToFP:
      llvm_unreachable("Should have been handled in walkBackwards!");

				case Instruction::FAdd:
				Op = [](ArrayRef<ConstantRange> Ops) {
				hfinkel (inline comment): Why is the FIXME here? Do you want to mark these as unsigned under some circumstances?
				assert(Ops.size() == 2 && "FAdd is a binary operator!");
				return Ops[0].add(Ops[1]);
				};
				break;

				case Instruction::FSub:
				Op = [](ArrayRef<ConstantRange> Ops) {
				assert(Ops.size() == 2 && "FSub is a binary operator!");
				return Ops[0].sub(Ops[1]);
				};
				break;

				case Instruction::FMul:
				Op = [](ArrayRef<ConstantRange> Ops) {
				assert(Ops.size() == 2 && "FMul is a binary operator!");
				return Ops[0].multiply(Ops[1]);
				};
				break;

				//
				// Root-only instructions - we'll only see these if they're the
				// first node in a walk.
				//
				case Instruction::FPToUI:
				case Instruction::FPToSI:
				hfinkel (inline comment): What's this FIXME for? You'd want to mark the full range? (that would make the unioning unnecessary ;) )
				Op = [](ArrayRef<ConstantRange> Ops) {
				assert(Ops.size() == 1 && "FPTo[US]I is a unary operator!");
				return Ops[0];
				};
				break;

				case Instruction::FCmp:
				Op = [](ArrayRef<ConstantRange> Ops) {
				assert(Ops.size() == 2 && "FCmp is a binary operator!");
				return Ops[0].unionWith(Ops[1]);
				};
				break;
				}

				bool Abort = false;
				hfinkel (inline comment): Remove blank line.
				SmallVector<ConstantRange,4> OpRanges;
				for (Value *O : I->operands()) {
				if (Instruction *OI = dyn_cast<Instruction>(O)) {
				assert(SeenInsts.find(OI) != SeenInsts.end() &&
				hfinkel (inline comment): Remove blank line.
				"def not seen before use!");
				OpRanges.push_back(SeenInsts.find(OI)->second);
				hfinkel (inline comment): Remove blank line.
				} else if (ConstantFP *CF = dyn_cast<ConstantFP>(O)) {
				// Work out if the floating point number can be losslessly represented
				// as an integer.
				// APFloat::convertToInteger(&Exact) purports to do what we want, but
				// the exactness can be too precise. For example, negative zero can
				// never be exactly converted to an integer.
				//
				// Instead, we ask APFloat to round itself to an integral value - this
				// preserves sign-of-zero - then compare the result with the original.
				//
				APFloat F = CF->getValueAPF();

				// First, weed out obviously incorrect values. Non-finite numbers
				// can't be represented and neither can negative zero, unless
				// we're in fast math mode.
				if (!F.isFinite() ||
				hfinkel (inline comment): Why are you checking for isFullSet() here? What about if you only have i8 (or something definitively smaller than the mantissa)?
				(F.isZero() && F.isNegative() && isa<FPMathOperator>(I) &&
				!I->hasNoSignedZeros())) {
				seen(I, badRange());
				Abort = true;
				break;
				}

				APFloat NewF = F;
				auto Res = NewF.roundToIntegral(APFloat::rmNearestTiesToEven);
				if (Res != APFloat::opOK || NewF.compare(F) != APFloat::cmpEqual) {
				seen(I, badRange());
				Abort = true;
				break;
				}
				// OK, it's representable. Now get it.
				hfinkel (inline comment): Hrmm, we have other floating-point types in the IR. You need to either exclude them somewhere, or handle them. Either way, the real way to do this is to call: ... = APFloat::semanticsPrecision(ConvertedToTy->getFltSemantics()) - 1;
				APSInt Int(MaxIntegerBW+1, false);
				bool Exact;
				CF->getValueAPF().convertToInteger(Int,
				APFloat::rmNearestTiesToEven,
				&Exact);
				OpRanges.push_back(ConstantRange(Int));
				hfinkel (inline comment): Remove blank line.
				} else {
				llvm_unreachable("Should have already marked this as badRange!");
				hfinkel (inline comment): Exactly where does 23 come from here? I understand that is the number of bits in a single-precision mantissa, but what does it have to do with picking the integer type? Can you make use of Type::getIntNTy?
				}
				}

				// Reduce the operands' ranges to a single range and return.
				if (!Abort)
				seen(I, Op(OpRanges));
				}
				hfinkel (inline comment): Remove blank line.
				}
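
The constant handling in walkForwards rejects non-finite values, rejects negative zero unless signed zeros may be ignored, and then rounds to an integral value and compares against the original. A rough standalone sketch of that same idea in standard C++ follows. It uses `double` and `<cmath>` in place of APFloat, and the name `isExactIntegerSketch` and its `AllowNegZero` parameter are hypothetical stand-ins for the pass's fast-math check.

```cpp
#include <cmath>

// Returns true if F is finite, has no fractional part, and is not negative
// zero (unless AllowNegZero is set, loosely mirroring a "no signed zeros"
// fast-math flag on the using instruction).
bool isExactIntegerSketch(double F, bool AllowNegZero) {
  // Infinities and NaNs have no integer equivalent.
  if (!std::isfinite(F))
    return false;
  // Negative zero compares equal to 0.0, so test the sign bit explicitly.
  if (F == 0.0 && std::signbit(F) && !AllowNegZero)
    return false;
  // Round to an integral value and compare with the original.
  return std::nearbyint(F) == F;
}
```

An integral input is returned unchanged by `std::nearbyint` under any rounding mode, so the comparison fails only when the value has a fractional part.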

				// If there is a valid transform to be done, do it.
				bool Float2Int::validateAndTransform() {
				bool MadeChange = false;

				// Iterate over every disjoint partition of the def-use graph.
				for (auto It = ECs.begin(), E = ECs.end(); It != E; ++It) {
				ConstantRange R(MaxIntegerBW + 1, false);
				bool Fail = false;
				Type *ConvertedToTy = nullptr;

				// For every member of the partition, union all the ranges together.
				for (auto MI = ECs.member_begin(It), ME = ECs.member_end();
				MI != ME; ++MI) {
				Instruction *I = *MI;
				auto SeenI = SeenInsts.find(I);
				assert (SeenI != SeenInsts.end() && "Didn't see this instruction?");

				R = R.unionWith(SeenI->second);
				// We need to ensure I has no users that have not been seen.
				// If it does, transformation would be illegal.
				//
				// Don't count the roots, as they terminate the graphs.
				if (Roots.count(I) == 0) {
				// Set the type of the conversion while we're here.
				if (!ConvertedToTy)
				ConvertedToTy = I->getType();
				for (User *U : I->users()) {
				Instruction *UI = dyn_cast<Instruction>(U);
				if (!UI || SeenInsts.find(UI) == SeenInsts.end()) {
				DEBUG(dbgs() << "F2I: Failing because of " << *U << "\n");
				Fail = true;
				break;
				}
				}
				}
				hfinkel (inline comment): You should not need to explicitly handle the equal-types case here. IRB::CreateZExtOrTrunc will also check this and do nothing if the incoming and outgoing bitwidths are equal. (This obviously applies to the checks below as well.)
				if (Fail)
				break;
				}

				// If the set was empty, or we failed, or the range is poisonous,
				// bail out.
				if (ECs.member_begin(It) == ECs.member_end() || Fail ||
				R.isFullSet() || R.isSignWrappedSet())
				continue;
				assert(ConvertedToTy && "Must have set the convertedtoty by this point!");

				// The number of bits required is the maximum of the upper and
				// lower limits, plus one so it can be signed.
				unsigned MinBW = std::max(R.getLower().getMinSignedBits(),
				R.getUpper().getMinSignedBits()) + 1;
				DEBUG(dbgs() << "F2I: MinBitwidth=" << MinBW << ", R: " << R << "\n");

				// If we've run off the realms of the exactly representable integers,
				// the floating point result will differ from an integer approximation.

				// Do we need more bits than are in the mantissa of the type we converted
				// to? semanticsPrecision returns the number of mantissa bits plus one
				// for the implicit leading bit.
				unsigned MaxRepresentableBits
				= APFloat::semanticsPrecision(ConvertedToTy->getFltSemantics()) - 1;
				if (MinBW > MaxRepresentableBits) {
				DEBUG(dbgs() << "F2I: Value not guaranteed to be representable!\n");
				continue;
				}
				if (MinBW > 64) {
				DEBUG(dbgs() << "F2I: Value requires more than 64 bits to represent!\n");
				continue;
				}

				hfinkel (inline comment): Where do you filter out MinBW > 64?
				// OK, R is known to be representable. Now pick a type for it.
				// FIXME: Pick the smallest legal type that will fit.
				Type *Ty = (MinBW > 32) ? Type::getInt64Ty(*Ctx) : Type::getInt32Ty(*Ctx);
				hfinkel (inline comment): { } not needed here.

				for (auto MI = ECs.member_begin(It), ME = ECs.member_end();
				MI != ME; ++MI)
				convert(*MI, Ty);
				MadeChange = true;
				}

				return MadeChange;
				}
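
The width check in validateAndTransform takes the larger of the signed bit widths of the two range bounds, adds one, and compares the result with the number of explicitly stored mantissa bits. A small standalone sketch of that arithmetic for `float` is shown below. `minSignedBits` and `fitsInFloatMantissa` are hypothetical helpers written for this illustration, 64-bit integers stand in for APInt, and both bounds are treated as plain values (the half-open upper bound of ConstantRange is ignored).

```cpp
#include <algorithm>
#include <cstdint>
#include <limits>

// Minimum number of bits needed to hold V in two's complement.
unsigned minSignedBits(std::int64_t V) {
  // Fold negatives onto non-negatives: -1 -> 0, -2 -> 1, and so on.
  std::uint64_t U = V < 0 ? ~static_cast<std::uint64_t>(V)
                          : static_cast<std::uint64_t>(V);
  unsigned Bits = 1; // The sign bit.
  while (U != 0) {
    ++Bits;
    U >>= 1;
  }
  return Bits;
}

// Mirrors the check in the listing: the widest bound, plus one, must not
// exceed the stored mantissa bits of float (digits - 1, i.e. 23 on IEEE-754).
bool fitsInFloatMantissa(std::int64_t Lower, std::int64_t Upper) {
  unsigned MinBW = std::max(minSignedBits(Lower), minSignedBits(Upper)) + 1;
  unsigned MaxRepresentableBits = std::numeric_limits<float>::digits - 1;
  return MinBW <= MaxRepresentableBits;
}
```

With these definitions a range such as [0, 256] passes, while the product range of two 16-bit values does not, which is consistent with the neg_mulf test further down.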

				Value *Float2Int::convert(Instruction *I, Type *ToTy) {
				if (ConvertedInsts.find(I) != ConvertedInsts.end())
				// Already converted this instruction.
				return ConvertedInsts[I];

				SmallVector<Value*,4> NewOperands;
				for (Value *V : I->operands()) {
				// Don't recurse if we're an instruction that terminates the path.
				if (I->getOpcode() == Instruction::UIToFP ||
				reames (inline comment): Can this be a range loop? Can you use a utility to delete things which are recursively trivially dead?
				I->getOpcode() == Instruction::SIToFP) {
				hfinkel (inline comment): Hrmm. To have converted an instruction, you must have converted all uses, right? If we know these need to be dead, and we just don't know the order, we can use the same technique employed by ADCE: First, loop over all instructions calling I->dropAllReferences. Then, loop over them again, erasing them as you do here.
				NewOperands.push_back(V);
				} else if (Instruction *VI = dyn_cast<Instruction>(V)) {
				NewOperands.push_back(convert(VI, ToTy));
				} else if (ConstantFP *CF = dyn_cast<ConstantFP>(V)) {
				APSInt Val(ToTy->getPrimitiveSizeInBits(), true);
				bool Exact;
				CF->getValueAPF().convertToInteger(Val,
				APFloat::rmNearestTiesToEven,
				&Exact);
				NewOperands.push_back(ConstantInt::get(ToTy, Val));
				} else {
				llvm_unreachable("Unhandled operand type?");
				}
				}

				// Now create a new instruction.
				IRBuilder<> IRB(I);
				Value *NewV = nullptr;
				switch (I->getOpcode()) {
				default: llvm_unreachable("Unhandled instruction!");

				case Instruction::FPToUI:
				NewV = IRB.CreateZExtOrTrunc(NewOperands[0], I->getType());
				break;

				case Instruction::FPToSI:
				NewV = IRB.CreateSExtOrTrunc(NewOperands[0], I->getType());
				break;

				case Instruction::FCmp: {
				CmpInst::Predicate P = mapFCmpPred(cast<CmpInst>(I)->getPredicate());
				hfinkel (inline comment): Unnecessary line break?
				assert(P != CmpInst::BAD_ICMP_PREDICATE && "Unhandled predicate!");
				NewV = IRB.CreateICmp(P, NewOperands[0], NewOperands[1], I->getName());
				break;
				}

				case Instruction::UIToFP:
				NewV = IRB.CreateZExtOrTrunc(NewOperands[0], ToTy);
				break;

				case Instruction::SIToFP:
				NewV = IRB.CreateSExtOrTrunc(NewOperands[0], ToTy);
				break;

				case Instruction::FAdd:
				case Instruction::FSub:
				case Instruction::FMul:
				NewV = IRB.CreateBinOp(mapBinOpcode(I->getOpcode()),
				NewOperands[0], NewOperands[1],
				I->getName());
				break;
				}

				// If we're a root instruction, RAUW.
				if (Roots.count(I))
				I->replaceAllUsesWith(NewV);

				ConvertedInsts[I] = NewV;
				return NewV;
				}

				// Perform dead code elimination on the instructions we just modified.
				void Float2Int::cleanup() {
				for (auto I = ConvertedInsts.rbegin(), E = ConvertedInsts.rend();
				I != E; ++I)
				I->first->eraseFromParent();
				}

				bool Float2Int::runOnFunction(Function &F) {
				DEBUG(dbgs() << "F2I: Looking at function " << F.getName() << "\n");
				// Clear out all state.
				ECs = EquivalenceClasses<Instruction*>();
				SeenInsts.clear();
				ConvertedInsts.clear();
				Roots.clear();

				Ctx = &F.getParent()->getContext();

				findRoots(F, Roots);

				walkBackwards(Roots);
				walkForwards();

				bool Modified = validateAndTransform();
				if (Modified)
				cleanup();
				return Modified;
				}

				FunctionPass *llvm::createFloat2IntPass() {
				return new Float2Int();
				}

lib/Transforms/Scalar/Scalar.cpp

  void llvm::initializeScalarOpts(PassRegistry &Registry) {
  ...
  initializeStructurizeCFGPass(Registry);
  initializeSinkingPass(Registry);
  initializeTailCallElimPass(Registry);
  initializeSeparateConstOffsetFromGEPPass(Registry);
  initializeStraightLineStrengthReducePass(Registry);
  initializeLoadCombinePass(Registry);
  initializePlaceBackedgeSafepointsImplPass(Registry);
  initializePlaceSafepointsPass(Registry);
+ initializeFloat2IntPass(Registry);
  }

  void LLVMInitializeScalarOpts(LLVMPassRegistryRef R) {
  initializeScalarOpts(*unwrap(R));
  }

  void LLVMAddAggressiveDCEPass(LLVMPassManagerRef PM) {
  unwrap(PM)->add(createAggressiveDCEPass());
  ...

test/Transforms/Float2Int/basic.ll

This file was added.

				; RUN: opt < %s -float2int -S | FileCheck %s

				;
				; Positive tests
				hfinkel (inline comment): Please move these descriptions to be with each associated test.
				;

				; CHECK-LABEL: @simple1
				; CHECK: %1 = zext i8 %a to i32
				; CHECK: %2 = add i32 %1, 1
				; CHECK: %3 = trunc i32 %2 to i16
				; CHECK: ret i16 %3
				define i16 @simple1(i8 %a) {
				%1 = uitofp i8 %a to float
				%2 = fadd float %1, 1.0
				%3 = fptoui float %2 to i16
				ret i16 %3
				}

				; CHECK-LABEL: @simple2
				; CHECK: %1 = zext i8 %a to i32
				; CHECK: %2 = sub i32 %1, 1
				; CHECK: %3 = trunc i32 %2 to i8
				; CHECK: ret i8 %3
				define i8 @simple2(i8 %a) {
				%1 = uitofp i8 %a to float
				%2 = fsub float %1, 1.0
				%3 = fptoui float %2 to i8
				ret i8 %3
				}

				; CHECK-LABEL: @simple3
				; CHECK: %1 = zext i8 %a to i32
				; CHECK: %2 = sub i32 %1, 1
				; CHECK: ret i32 %2
				define i32 @simple3(i8 %a) {
				%1 = uitofp i8 %a to float
				%2 = fsub float %1, 1.0
				%3 = fptoui float %2 to i32
				ret i32 %3
				}

				; CHECK-LABEL: @cmp
				; CHECK: %1 = zext i8 %a to i32
				; CHECK: %2 = zext i8 %b to i32
				; CHECK: %3 = icmp slt i32 %1, %2
				; CHECK: ret i1 %3
				define i1 @cmp(i8 %a, i8 %b) {
				%1 = uitofp i8 %a to float
				%2 = uitofp i8 %b to float
				%3 = fcmp ult float %1, %2
				ret i1 %3
				}

				; CHECK-LABEL: @simple4
				; CHECK: %1 = zext i32 %a to i64
				; CHECK: %2 = add i64 %1, 1
				; CHECK: %3 = trunc i64 %2 to i32
				; CHECK: ret i32 %3
				define i32 @simple4(i32 %a) {
				%1 = uitofp i32 %a to double
				%2 = fadd double %1, 1.0
				%3 = fptoui double %2 to i32
				ret i32 %3
				}

				; CHECK-LABEL: @simple5
				; CHECK: %1 = zext i8 %a to i32
				; CHECK: %2 = zext i8 %b to i32
				; CHECK: %3 = add i32 %1, 1
				; CHECK: %4 = mul i32 %3, %2
				; CHECK: ret i32 %4
				define i32 @simple5(i8 %a, i8 %b) {
				%1 = uitofp i8 %a to float
				%2 = uitofp i8 %b to float
				%3 = fadd float %1, 1.0
				%4 = fmul float %3, %2
				%5 = fptoui float %4 to i32
				ret i32 %5
				}

				; The two chains don't interact - failure of one shouldn't
				; cause failure of the other.

				; CHECK-LABEL: @multi1
				; CHECK: %1 = zext i8 %a to i32
				; CHECK: %2 = zext i8 %b to i32
				; CHECK: %fc = uitofp i8 %c to float
				; CHECK: %x1 = add i32 %1, %2
				; CHECK: %z = fadd float %fc, %d
				; CHECK: %w = fptoui float %z to i32
				; CHECK: %r = add i32 %x1, %w
				; CHECK: ret i32 %r
				define i32 @multi1(i8 %a, i8 %b, i8 %c, float %d) {
				%fa = uitofp i8 %a to float
				%fb = uitofp i8 %b to float
				%fc = uitofp i8 %c to float
				%x = fadd float %fa, %fb
				%y = fptoui float %x to i32
				%z = fadd float %fc, %d
				%w = fptoui float %z to i32
				%r = add i32 %y, %w
				ret i32 %r
				}

				; CHECK-LABEL: @simple_negzero
				; CHECK: %1 = zext i8 %a to i32
				; CHECK: %2 = add i32 %1, 0
				; CHECK: %3 = trunc i32 %2 to i16
				; CHECK: ret i16 %3
				define i16 @simple_negzero(i8 %a) {
				%1 = uitofp i8 %a to float
				%2 = fadd fast float %1, -0.0
				%3 = fptoui float %2 to i16
				ret i16 %3
				}

				;
				; Negative tests
				;

				hfinkel (inline comment): Please add negative test using a large integer type (i128, etc.).
				; CHECK-LABEL: @neg_multi1
				; CHECK: %fa = uitofp i8 %a to float
				; CHECK: %fc = uitofp i8 %c to float
				; CHECK: %x = fadd float %fa, %fc
				; CHECK: %y = fptoui float %x to i32
				; CHECK: %z = fadd float %fc, %d
				; CHECK: %w = fptoui float %z to i32
				; CHECK: %r = add i32 %y, %w
				; CHECK: ret i32 %r
				; The two chains intersect, which means because one fails, no
				; transform can occur.
				define i32 @neg_multi1(i8 %a, i8 %b, i8 %c, float %d) {
				%fa = uitofp i8 %a to float
				%fc = uitofp i8 %c to float
				%x = fadd float %fa, %fc
				%y = fptoui float %x to i32
				%z = fadd float %fc, %d
				%w = fptoui float %z to i32
				%r = add i32 %y, %w
				ret i32 %r
				}

				; CHECK-LABEL: @neg_muld
				; CHECK: %fa = uitofp i32 %a to double
				; CHECK: %fb = uitofp i32 %b to double
				; CHECK: %mul = fmul double %fa, %fb
				; CHECK: %r = fptoui double %mul to i64
				; CHECK: ret i64 %r
				; The i32 * i32 = i64, which has 64 bits, which is greater than the 52 bits
				; that can be exactly represented in a double.
				define i64 @neg_muld(i32 %a, i32 %b) {
				%fa = uitofp i32 %a to double
				%fb = uitofp i32 %b to double
				%mul = fmul double %fa, %fb
				%r = fptoui double %mul to i64
				ret i64 %r
				}

				; CHECK-LABEL: @neg_mulf
				; CHECK: %fa = uitofp i16 %a to float
				; CHECK: %fb = uitofp i16 %b to float
				; CHECK: %mul = fmul float %fa, %fb
				; CHECK: %r = fptoui float %mul to i32
				; CHECK: ret i32 %r
				; The i16 * i16 = i32, which can't be represented in a float, but can in a
				; double. This should fail, as the written code uses floats, not doubles so
				; the original result may be inaccurate.
				define i32 @neg_mulf(i16 %a, i16 %b) {
				%fa = uitofp i16 %a to float
				%fb = uitofp i16 %b to float
				%mul = fmul float %fa, %fb
				%r = fptoui float %mul to i32
				ret i32 %r
				}
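
The reasoning behind neg_mulf can be checked numerically outside LLVM. The sketch below multiplies two 16-bit values through `float` and compares against exact integer multiplication. `floatMulIsExact` is a hypothetical helper, and the sketch assumes IEEE-754 single precision with no excess intermediate precision (as on typical x86-64 builds).

```cpp
#include <cstdint>

// Multiply two 16-bit values through float, as the neg_mulf test does,
// and report whether the result matches exact integer multiplication.
bool floatMulIsExact(std::uint16_t A, std::uint16_t B) {
  // Both inputs convert exactly: 16 bits fit in float's 24-bit significand.
  float Prod = static_cast<float>(A) * static_cast<float>(B);
  // The exact product can need up to 32 bits.
  std::uint64_t Exact = static_cast<std::uint64_t>(A) * B;
  return static_cast<std::uint64_t>(Prod) == Exact;
}
```

Small products such as 255 * 255 survive the round trip, but 65535 * 65535 = 4294836225 needs 32 significant bits and is rounded in `float`, so replacing the fmul with an integer mul would change the observable result.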

				; CHECK-LABEL: @neg_cmp
				; CHECK: %1 = uitofp i8 %a to float
				; CHECK: %2 = uitofp i8 %b to float
				; CHECK: %3 = fcmp false float %1, %2
				; CHECK: ret i1 %3
				; "false" doesn't have an icmp equivalent.
				define i1 @neg_cmp(i8 %a, i8 %b) {
				%1 = uitofp i8 %a to float
				%2 = uitofp i8 %b to float
				%3 = fcmp false float %1, %2
				ret i1 %3
				}

				; CHECK-LABEL: @neg_div
				; CHECK: %1 = uitofp i8 %a to float
				; CHECK: %2 = fdiv float %1, 1.0
				; CHECK: %3 = fptoui float %2 to i16
				; CHECK: ret i16 %3
				; Division isn't a supported operator.
				define i16 @neg_div(i8 %a) {
				%1 = uitofp i8 %a to float
				%2 = fdiv float %1, 1.0
				%3 = fptoui float %2 to i16
				ret i16 %3
				}

				; CHECK-LABEL: @neg_remainder
				; CHECK: %1 = uitofp i8 %a to float
				; CHECK: %2 = fadd float %1, 1.2
				; CHECK: %3 = fptoui float %2 to i16
				; CHECK: ret i16 %3
				; 1.25 is not an integer.
				define i16 @neg_remainder(i8 %a) {
				%1 = uitofp i8 %a to float
				%2 = fadd float %1, 1.25
				%3 = fptoui float %2 to i16
				ret i16 %3
				}

				; CHECK-LABEL: @neg_toolarge
				; CHECK: %1 = uitofp i80 %a to fp128
				; CHECK: %2 = fadd fp128 %1, %1
				; CHECK: %3 = fptoui fp128 %2 to i80
				; CHECK: ret i80 %3
				; i80 > i64, which is the largest bitwidth handleable by default.
				define i80 @neg_toolarge(i80 %a) {
				%1 = uitofp i80 %a to fp128
				%2 = fadd fp128 %1, %1
				%3 = fptoui fp128 %2 to i80
				ret i80 %3
				}

test/Transforms/Float2Int/toolarge.ll

This file was added.

				; RUN: opt < %s -float2int -float2int-max-integer-bw=256 -S | FileCheck %s

				; CHECK-LABEL: @neg_toolarge
				; CHECK: %1 = uitofp i80 %a to fp128
				; CHECK: %2 = fadd fp128 %1, %1
				; CHECK: %3 = fptoui fp128 %2 to i80
				; CHECK: ret i80 %3
				; fp128 has a 112-bit mantissa, which can hold an i80. But we only support
				; up to i64, so it should fail (even though the max integer bitwidth is 256).
				define i80 @neg_toolarge(i80 %a) {
				%1 = uitofp i80 %a to fp128
				%2 = fadd fp128 %1, %1
				%3 = fptoui fp128 %2 to i80
				ret i80 %3
				}

Revision Contents

Diff 22181

include/llvm/InitializePasses.h

include/llvm/LinkAllPasses.h

include/llvm/Transforms/Scalar.h

lib/Transforms/IPO/PassManagerBuilder.cpp

lib/Transforms/Scalar/CMakeLists.txt

lib/Transforms/Scalar/Float2Int.cpp

lib/Transforms/Scalar/Scalar.cpp

test/Transforms/Float2Int/basic.ll

test/Transforms/Float2Int/toolarge.ll