This is an archive of the discontinued LLVM Phabricator instance.

Differential D126644

[llvm/CodeGen] Add ExpandLargeDivRem pass
ClosedPublic

Authored by mgehre-amd on May 30 2022, 5:45 AM.

Details

Reviewers

efriedma
FreddyYe
MaskRay
LuoYuanke
pengfei

Commits

rG3e39b2710168: [llvm/CodeGen] Add ExpandLargeDivRem pass

Summary

Adds a pass ExpandLargeDivRem to expand div/rem instructions
with more than 128 bits into auto-generated loops.

For example, urem i129 is expanded into a loop that is
automatically generated to implement
a simple shift-subtract algorithm similar to

define i129 @urem_i129(i129 %0, i129 %1) {        ; hypothetical wrapper name, added so the fragment is complete
entry:
  br label %loop

loop:                                             ; preds = %if.end, %entry
  %i = phi i32 [ 128, %entry ], [ %new_i, %if.end ]
  %r = phi i129 [ 0, %entry ], [ %r3, %if.end ]
  %iext = zext i32 %i to i129
  %2 = lshr i129 %0, %iext
  %3 = trunc i129 %2 to i1
  %new_r = shl i129 %r, 1
  %4 = zext i1 %3 to i129
  %new_r1 = or i129 %new_r, %4
  %loop_exit_cond = icmp eq i32 %i, 0
  %new_i = add i32 %i, -1
  %5 = icmp uge i129 %new_r1, %1
  br i1 %5, label %then, label %if.end

then:                                             ; preds = %loop
  %new_r2 = sub i129 %new_r1, %1
  br label %if.end

if.end:                                           ; preds = %then, %loop
  %r3 = phi i129 [ %new_r2, %then ], [ %new_r1, %loop ]
  br i1 %loop_exit_cond, label %exit, label %loop

exit:                                             ; preds = %if.end
  ; Result is in %r3
  ret i129 %r3
}
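
For readers less familiar with LLVM IR, the loop above can be mirrored in C++. This is only a sketch under assumptions that are not part of the patch: it uses 32-bit operands instead of i129, the function name shiftSubtractURem is hypothetical, and the accumulator is 64 bits wide so that the left shift cannot overflow.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of the patch: computes a % b with the
// same shift-subtract scheme as the IR loop, on 32-bit operands.
uint32_t shiftSubtractURem(uint32_t a, uint32_t b) {
  assert(b != 0 && "remainder by zero is undefined");
  // The accumulator is wider than the operands, so (r << 1) never overflows.
  uint64_t r = 0;
  for (int i = 31; i >= 0; --i) {
    // Shift the next bit of the dividend (most significant first) into r.
    r = (r << 1) | ((a >> i) & 1u);
    // Subtract the divisor whenever it fits.
    if (r >= b)
      r -= b;
  }
  return static_cast<uint32_t>(r);
}
```

The loop runs once per bit of the operand type, which is why the IR version starts its counter at 128 for an i129 value.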

As discussed on https://reviews.llvm.org/D120327, this approach has the advantage
that it is independent of the runtime library. This also helps the clang driver,
which otherwise would need to understand enough about the runtime library
to know whether to allow _BitInts with more than 128 bits.

Targets are still free to disable this pass and instead provide a faster
implementation in a runtime library.

Fixes https://github.com/llvm/llvm-project/issues/44994

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

mgehre-amd created this revision. May 30 2022, 5:45 AM

Herald added a project: Restricted Project. May 30 2022, 5:45 AM

Herald added subscribers: StephenFan, pengfei, hiraditya, mgorny.

mgehre-amd requested review of this revision. May 30 2022, 5:45 AM

Harbormaster completed remote builds in B166896: Diff 432895. May 30 2022, 5:46 AM

mgehre-amd mentioned this in D120327: compiler-rt: Add udivmodei5 to builtins and add bitint library. May 30 2022, 5:46 AM

mgehre-amd edited the summary of this revision. Jun 8 2022, 8:15 AM

Herald added a subscriber: jsji. Jun 8 2022, 8:15 AM

My primary concern here is that we're unconditionally iterating over every instruction in the module, even if the pass has nothing to do. In the grand scheme of things, it isn't *that* expensive, but it's still not cheap, and we seem to have grown a number of passes doing similar walks. Can we merge this with some other walk over the module?

LuoYuanke added a subscriber: LuoYuanke. Jun 13 2022, 7:30 PM

In D126644#3580139, @efriedma wrote:

My primary concern here is that we're unconditionally iterating over every instruction in the module, even if the pass has nothing to do. In the grand scheme of things, it isn't *that* expensive, but it's still not cheap, and we seem to have grown a number of passes doing similar walks. Can we merge this with some other walk over the module?

Hey, thanks for taking the time to look into this PR.
I looked around the pipeline to find passes that look relevant. I saw a few that didn't work because they are only enabled above -O0.
Then I found others that in principle would work, but all of those are so focused (createGCLoweringPass, createLowerConstantIntrinsicsPass, createExpandMemCmpPass, createPreISelIntrinsicLoweringPass, ...) that it felt wrong
to dilute their focus by adding the div/rem transformation.
Under -O2, we currently run 454 passes in the middle end and the backend combined.

Do you have a concrete proposal how I could address your concern?

It's very unusual for a lowering strategy to introduce a new function. I would expect to expand the instruction inline and not emit a separate function, which avoids the need for this to be a module pass.

llvm/lib/CodeGen/ExpandLargeDivRem.cpp
54–60	Can you set these all at once with AttributeList
219	You shouldn't need to pre-accumulate the instructions, just use make_early_inc_range on the instruction iterator?
227–228	dyn_cast instead of isa + implicit cast to IntegerType?
228	Since this is ultimately a workaround for SelectionDAG, ideally this would be a TargetLowering control

Actually, how is this really different from lib/Transforms/Utils/IntegerDivision.cpp? This just goes to 128 but the existing code goes up to 64. Can you merge them, or at least move the implementation bodies to the same place?

In D126644#3581484, @arsenm wrote:

Actually, how is this really different from lib/Transforms/Utils/IntegerDivision.cpp? This just goes to 128 but the existing code goes up to 64. Can you merge them, or at least move the implementation bodies to the same place?

Thanks, I really didn't know about IntegerDivision.cpp! I'll try to merge both implementations. If that works, I can use it here and avoid emitting a new function.

Another problem is this will miss divisions embedded in constant expressions

In D126644#3582347, @arsenm wrote:

Another problem is this will miss divisions embedded in constant expressions

Good point. I wasn't able to create a reproducer from C to a sdiv constant expression, but there is a C++ reproducer: https://godbolt.org/z/v5Tsh8n9x.
My hypothesis is that there is no C/C++ code that creates an sdiv/udiv/srem/urem constant expression in a global variable initializer.
So it would be sufficient to walk the ConstantExpr operands of all instructions, looking for sdiv etc. that need to be converted, and
then use something like replaceConstantUsesInFunction() to turn them into instructions. Does that sound sensible?

Use IntegerDivision.cpp
Do not emit separate functions
Turn into FunctionPass

llvm/lib/CodeGen/ExpandLargeDivRem.cpp
219	I tried, but it crashed. I think that's because we not only remove the current instruction but also insert new basic blocks, which confuses `make_early_inc_range (instructions(F))`.
228	I'm not sure how I would wire TargetLowering into the pass - I didn't find other passes using it. Would TargetTransformInfo also work for you?
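
The collect-first approach described in the reply on line 219 can be illustrated without LLVM. The following is a rough analogy only: Block and Function are hypothetical stand-ins built from std::list, and expandLarge is an invented name. It shows why gathering the candidates before mutating keeps the scan independent of the newly inserted blocks.

```cpp
#include <cstddef>
#include <list>
#include <vector>

using Block = std::list<int>;      // stand-in for a basic block of "instructions"
using Function = std::list<Block>; // stand-in for a function

// Hypothetical sketch: returns the number of expanded "instructions".
std::size_t expandLarge(Function &F, int threshold) {
  // Phase 1: scan without modifying anything and remember the candidates.
  std::vector<int *> worklist;
  for (Block &B : F)
    for (int &I : B)
      if (I > threshold)
        worklist.push_back(&I);

  // Phase 2: mutate. Appending blocks here cannot disturb the finished scan,
  // and std::list keeps the collected element pointers valid.
  for (int *I : worklist) {
    F.push_back(Block{*I}); // the "expansion" adds a new block
    *I = 0;                 // the original instruction is replaced
  }
  return worklist.size();
}
```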

Harbormaster completed remote builds in B170468: Diff 437840. Jun 17 2022, 3:46 AM

With the work around constant expressions (e.g. https://reviews.llvm.org/rG941c8e0ea50b), I might not need to handle div/mod constants anymore. For that reason, I would at least land this PR without support for transforming constant expressions.

In D126644#3634017, @mgehre-amd wrote:

With the work around constant expressions (e.g. https://reviews.llvm.org/rG941c8e0ea50b), I might not need to handle div/mod constants anymore. For that reason, I would at least land this PR without support for transforming constant expressions.

div/rem constant expressions have already been removed. So yes, no need to handle them here anymore.

Ping, how can I get review/approval for this PR? Should I add somebody else as reviewer?

FreddyYe added reviewers: LuoYuanke, pengfei. Jul 12 2022, 2:35 AM

I'm a bit confused, where in the code does this actually create functions? It doesn't look like expandDivision() does that, and I can't spot anything else creating __llvm_udivXXX either.

llvm/lib/CodeGen/ExpandLargeDivRem.cpp
41	You should either do the replacement directly in this loop, or not use make_early_inc_range. Doesn't make sense to use it if you're not modifying anything.
42	Unnecessary newline
69	Could use a vector of BinaryOperator to avoid the casts here.
85	At least for the new pass manager, I don't think it's legal to add new functions in a function pass. I'm not sure about the legacy pass manager.
llvm/lib/CodeGen/TargetPassConfig.cpp
859 ↗	(On Diff #437840)	Spurious change.
llvm/lib/Transforms/Utils/IntegerDivision.cpp
35	Precommit as NFC changes?
llvm/test/CodeGen/X86/urem-seteq.ll
380	What happened here?

arsenm added inline comments. Jul 12 2022, 7:26 AM

llvm/lib/CodeGen/ExpandLargeDivRem.cpp
48	This won't handle vectors
llvm/test/Transforms/ExpandLargeDivRem/urem129.ll
14	This looks out of date?
llvm/test/Transforms/ExpandLargeDivRem/values129.ll
6 ↗	(On Diff #437840)	Shouldn't use -O2 in a test like this
21 ↗	(On Diff #437840)	I'm not sure cases that constant fold are the most helpful tests

In D126644#3645177, @nikic wrote:

I'm a bit confused, where in the code does this actually create functions? It doesn't look like expandDivision() does that, and I can't spot anything else creating __llvm_udivXXX either.

Yes, I changed that approach to directly generate a loop in place of the original instruction as it was requested by one of the initial review comments here. I now also updated the PR description to reflect that.

Implement review comments

Thanks for the review! I updated the PR to reflect your comments.

llvm/lib/CodeGen/ExpandLargeDivRem.cpp
48	True, I'm not concerned about vectors right now. I can add a TODO comment here.
85	Yes, this was one of the early comments I got on this PR, so I changed it to not insert new functions but instead insert the loop in-place.
llvm/test/CodeGen/X86/urem-seteq.ll
380	This IR used to crash LLVM. I don't think that the actual assembly is relevant here, and after my changes it's much longer due to the loops that were inserted.
llvm/test/Transforms/ExpandLargeDivRem/urem129.ll
14	Yes, stupid me. I'll update.
llvm/test/Transforms/ExpandLargeDivRem/values129.ll
21 ↗	(On Diff #437840)	I wrote this test when the implementation of the shift-subtract loop was still mine. It verifies that the value computed by that loop is the same as the value of the urem. This is done by checking that constant-folding the urem directly gives the same value as first replacing the urem via expandlargedivrem and then constant-folding the result. Then I learned that there was an existing implementation of the shift-subtract loop in llvm/lib/Transforms/Utils/IntegerDivision.cpp and switched to that. Because Utils/IntegerDivision.cpp is already tested, I will get rid of this test here.
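
The testing idea described in the last reply (compare the expanded loop against direct evaluation of the remainder) can be sketched outside of LLVM as follows. Everything here is an assumption for illustration: the operands are 8 bits wide so that all inputs can be enumerated, and the names expandedURem and expansionMatchesReference are hypothetical.

```cpp
#include <cstdint>

// Hypothetical stand-in for the expanded shift-subtract loop (8-bit operands).
uint8_t expandedURem(uint8_t a, uint8_t b) {
  unsigned r = 0; // wider than the operands, so the shift cannot overflow
  for (int i = 7; i >= 0; --i) {
    r = (r << 1) | ((static_cast<unsigned>(a) >> i) & 1u);
    if (r >= b)
      r -= b;
  }
  return static_cast<uint8_t>(r);
}

// Returns true if the expansion agrees with the built-in % operator
// for every dividend and every non-zero divisor.
bool expansionMatchesReference() {
  for (unsigned a = 0; a < 256; ++a)
    for (unsigned b = 1; b < 256; ++b)
      if (expandedURem(static_cast<uint8_t>(a), static_cast<uint8_t>(b)) !=
          a % b)
        return false;
  return true;
}
```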

Harbormaster completed remote builds in B175590: Diff 444914. Jul 15 2022, 1:39 AM

arsenm added inline comments. Jul 18 2022, 3:40 PM

llvm/lib/CodeGen/TargetPassConfig.cpp
1115 ↗	(On Diff #444914)	This doesn't belong in this patch. Currently this is lacking in legality checks for the expansion. I would prefer to see a separate patch which adds the pass and checks the legality per target, rather than relying on the arbitrary command line threshold
llvm/test/Transforms/ExpandLargeDivRem/sdiv129.ll
12	The positioning of the checks looks weird. Usually update_test_checks put these comments inside the function?

mgehre-amd marked 2 inline comments as done. Jul 19 2022, 2:57 AM

mgehre-amd added inline comments.

llvm/lib/CodeGen/TargetPassConfig.cpp
1115 ↗	(On Diff #444914)	Ok, I'll make a follow-up PR to add the pass and configure a per-target threshold. I'm thinking about adding an `int MaxLegalDivRemBitSize` member to TargetTransformInfo.
llvm/test/Transforms/ExpandLargeDivRem/sdiv129.ll
12	I guess the position is still from the time where this was an extra function. I regenerated the check-lines, but update_test_checks.py doesn't move them around. I will re-generate them from scratch to get the typical position.

Remove pass from pass pipeline (will be added in next PR)

Harbormaster completed remote builds in B176210: Diff 445759. Jul 19 2022, 3:04 AM

mgehre-amd mentioned this in D130076: [llvm/CodeGen] Enable the ExpandLargeDivRem pass for X86, Arm and AArch64. Jul 19 2022, 4:36 AM

craig.topper added a subscriber: craig.topper. Jul 27 2022, 9:09 PM

craig.topper added inline comments.

llvm/lib/CodeGen/ExpandLargeDivRem.cpp
13	I think but haven't checked that 32-bit x86, ARM, RISCV32, and other 32-bit targets can't lower division with more than 64 bits.
34	This says "<N> or more bits", but the code exits for `<= ExpandDivRemBits`
llvm/lib/Transforms/Utils/IntegerDivision.cpp
145–148	You already have the DivTy, why not go through ConstantInt::get for all of these?
llvm/test/Transforms/ExpandLargeDivRem/urem129.ll
15	Is this code handling the possible poison result from the ctlz's incorrectly? If [A] is zero then [TMP3] is poison, [TMP4] is poison, [TMP5] is poison, [TMP6] being an or won't block the poison. Assuming I understand poison correctly. I don't think that was caused by this patch. It's an issue in the existing code.

craig.topper added inline comments. Jul 27 2022, 9:50 PM

llvm/test/Transforms/ExpandLargeDivRem/urem129.ll
15	Patch for the poison issue https://reviews.llvm.org/D130680

Fix description of expand-div-rem-bits argument
Use ConstantInt::get

llvm/lib/CodeGen/ExpandLargeDivRem.cpp
13	I checked x86_32 and it lowers 128 bit divisions to __divti3: https://godbolt.org/z/Y6hT9n3x5
34	done
llvm/lib/Transforms/Utils/IntegerDivision.cpp
145–148	done
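
The off-by-one raised in the comment on line 34 comes down to whether the limit itself is still legal. A minimal sketch of the documented meaning ("more than <N> bits are expanded") follows; the function name needsExpansion and its parameter names are hypothetical and do not appear in the patch.

```cpp
// Hypothetical predicate mirroring the documented meaning of
// -expand-div-rem-bits: a width equal to the limit is left alone,
// only strictly larger widths are expanded.
bool needsExpansion(unsigned bitWidth, unsigned maxLegalBits = 128) {
  return bitWidth > maxLegalBits;
}
```

With the default of 128, an i128 division is left to the backend and an i129 division is expanded.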

Harbormaster completed remote builds in B178057: Diff 448313. Jul 28 2022, 5:35 AM

craig.topper added inline comments. Jul 28 2022, 10:20 AM

llvm/lib/CodeGen/ExpandLargeDivRem.cpp
13	But it fails to link https://godbolt.org/z/5x38Gxqa9 so I think that's a bug. __divti3 doesn't exist in 32-bit libgcc.

mgehre-amd marked an inline comment as done. Jul 28 2022, 3:02 PM

mgehre-amd added inline comments.

llvm/lib/CodeGen/ExpandLargeDivRem.cpp
13	That's a good observation! I will update the comment. I also need to modify the follow-up PR, https://reviews.llvm.org/D130076, to set the correct limits for those targets.

Clarify that x86_32 cannot lower division with more than 64 bits.

Harbormaster completed remote builds in B178157: Diff 448452. Jul 28 2022, 3:09 PM

craig.topper added a child revision: D130076: [llvm/CodeGen] Enable the ExpandLargeDivRem pass for X86, Arm and AArch64. Jul 28 2022, 7:07 PM

craig.topper added inline comments.

llvm/lib/CodeGen/ExpandLargeDivRem.cpp
13	I thought about disabling that libcall in 32-bit mode, but it is supported by compiler-rt. So I guess a linker error when using libgcc is better than a compiler error when using compiler-rt or libgcc until we get this pass in.

We usually provide a runtime library for specialized functionalities like this. Are such division operations common enough to benefit from compiler generated code?

In D126644#3686479, @hiraditya wrote:

We usually provide a runtime library for specialized functionalities like this. Are such division operations common enough to benefit from compiler generated code?

We explored and prototyped the approach of adding functions to compiler-rt in a different PR. We discovered multiple issues.

One issue is that we don't have a fixed type here. There are existing library functions to do 64 bit or 128 bit division, but here we want to implement divisions with any bitsize above 128. This makes defining an ABI harder.
In addition, the main runtime library on Linux is libgcc, and it lacks such functions. It won't be easy to get them added to libgcc due to the fact that gcc doesn't have _BitInt support yet.
Providing an additional runtime on top of libgcc with just those new functions is also new territory.

So in the end, a lot of things become easier by just doing the transformation on the IR directly.

nikic mentioned this in D121379: [compiler-rt][CMake] Enable TF intrinsics on powerpc32 Linux. Aug 1 2022, 3:12 AM

efriedma mentioned this in D131521: [SDAG] avoid generating libcall to function with same name. Aug 10 2022, 1:00 PM

Friendly ping :-)
I think I addressed all comments. Is somebody able to approve this?

Rebased

Harbormaster completed remote builds in B181260: Diff 452641. Aug 15 2022, 6:16 AM

One issue is that we don't have a fixed type here. There are existing library functions to do 64 bit or 128 bit division, but here we want to implement divisions with any bitsize above 128. This makes defining an ABI harder.

Yeah, the X86 psABI has a new definition for fixed bit sizes. But it doesn't help here.

In addition, the main runtime library on Linux is libgcc, and it lacks such functions. It won't be easy to get them added to libgcc due to the fact that gcc doesn't have _BitInt support yet.

Agreed. Implementing a runtime library per ABI is problematic, and it easily results in backward compatibility issues. So I lean toward this approach. But I didn't look into the details and the long discussion, so I'd like other reviewers to sign off.

llvm/include/llvm/CodeGen/ExpandLargeDivRem.h
2	Update the name.
llvm/include/llvm/CodeGen/MachinePassRegistry.def
46	Nit: Maybe better to use `expand-large-div-rem`.
llvm/lib/CodeGen/ExpandLargeDivRem.cpp
2	Ditto.

Rename expandlargedivrem to expand-large-div-rem
Fix filenames in comments

Harbormaster completed remote builds in B181933: Diff 453569. Aug 18 2022, 2:28 AM

FreddyYe added a parent revision: D130079: Revert "[SelectionDAG] Emit calls to __divei4 and friends for division/remainder of large integers". Aug 25 2022, 4:21 AM

Ping :)

Herald added a subscriber: pcwang-thead. Aug 25 2022, 11:15 PM

I don't find any problems. Maybe we can let it go :)

This revision is now accepted and ready to land. Aug 26 2022, 12:42 AM

This revision was landed with ongoing or failed builds. Aug 26 2022, 3:55 AM

Closed by commit rG3e39b2710168: [llvm/CodeGen] Add ExpandLargeDivRem pass (authored by mgehre-amd).

This revision was automatically updated to reflect the committed changes.

Matthias Gehre <matthias.gehre@xilinx.com> added a commit: rG3e39b2710168: [llvm/CodeGen] Add ExpandLargeDivRem pass.

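Revision Contents

Path                                                     Size
llvm/include/llvm/CodeGen/ExpandLargeDivRem.h            29 lines
llvm/include/llvm/CodeGen/MachinePassRegistry.def        1 line
llvm/include/llvm/CodeGen/Passes.h                       3 lines
llvm/include/llvm/InitializePasses.h                     1 line
llvm/include/llvm/LinkAllPasses.h                        1 line
llvm/lib/CodeGen/CMakeLists.txt                          1 line
llvm/lib/CodeGen/CodeGen.cpp                             1 line
llvm/lib/CodeGen/ExpandLargeDivRem.cpp                   112 lines
llvm/lib/Transforms/Utils/IntegerDivision.cpp            61 lines
llvm/test/CodeGen/X86/urem-seteq.ll                      15 lines
llvm/test/Transforms/ExpandLargeDivRem/sdiv129.ll        67 lines
llvm/test/Transforms/ExpandLargeDivRem/srem129.ll        68 lines
llvm/test/Transforms/ExpandLargeDivRem/udiv129.ll        61 lines
llvm/test/Transforms/ExpandLargeDivRem/urem129.ll        63 lines
llvm/tools/opt/opt.cpp                                   3 lines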

Diff 455866

llvm/include/llvm/CodeGen/ExpandLargeDivRem.h

This file was added.

//===----- ExpandLargeDivRem.h - Expand large div/rem ---------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//

#ifndef LLVM_CODEGEN_EXPANDLARGEDIVREM_H
#define LLVM_CODEGEN_EXPANDLARGEDIVREM_H

#include "llvm/IR/PassManager.h"

namespace llvm {

/// Expands div/rem instructions with a bitwidth above a threshold
/// into a loop.
/// This is useful for backends like x86 that cannot lower divisions
/// with more than 128 bits.
class ExpandLargeDivRemPass : public PassInfoMixin<ExpandLargeDivRemPass> {
public:
  PreservedAnalyses run(Function &F, FunctionAnalysisManager &AM);

  // The backend asserts when seeing large div/rem instructions.
  static bool isRequired() { return true; }
};
} // end namespace llvm

#endif // LLVM_CODEGEN_EXPANDLARGEDIVREM_H

llvm/include/llvm/CodeGen/MachinePassRegistry.def

[...]
  FUNCTION_PASS("mergeicmps", MergeICmpsPass, ())
  FUNCTION_PASS("lower-constant-intrinsics", LowerConstantIntrinsicsPass, ())
  FUNCTION_PASS("unreachableblockelim", UnreachableBlockElimPass, ())
  FUNCTION_PASS("consthoist", ConstantHoistingPass, ())
  FUNCTION_PASS("replace-with-veclib", ReplaceWithVeclib, ())
  FUNCTION_PASS("partially-inline-libcalls", PartiallyInlineLibCallsPass, ())
  FUNCTION_PASS("ee-instrument", EntryExitInstrumenterPass, (false))
  FUNCTION_PASS("post-inline-ee-instrument", EntryExitInstrumenterPass, (true))
+ FUNCTION_PASS("expand-large-div-rem", ExpandLargeDivRemPass, ())
  FUNCTION_PASS("expand-reductions", ExpandReductionsPass, ())
  FUNCTION_PASS("expandvp", ExpandVectorPredicationPass, ())
  FUNCTION_PASS("lowerinvoke", LowerInvokePass, ())
  FUNCTION_PASS("scalarize-masked-mem-intrin", ScalarizeMaskedMemIntrinPass, ())
  FUNCTION_PASS("tlshoist", TLSVariableHoistPass, ())
  FUNCTION_PASS("verify", VerifierPass, ())
  #undef FUNCTION_PASS
[...]

llvm/include/llvm/CodeGen/Passes.h

[...] (context: namespace llvm {)
  // the corresponding function in a vector library (e.g., SVML, libmvec).
  FunctionPass *createReplaceWithVeclibLegacyPass();

  /// This pass expands the vector predication intrinsics into unpredicated
  /// instructions with selects or just the explicit vector length into the
  /// predicate mask.
  FunctionPass *createExpandVectorPredicationPass();

+ // Expands large div/rem instructions.
+ FunctionPass *createExpandLargeDivRemPass();

  // This pass expands memcmp() to load/stores.
  FunctionPass *createExpandMemCmpPass();

  /// Creates Break False Dependencies pass. \see BreakFalseDeps.cpp
  FunctionPass *createBreakFalseDeps();

  // This pass expands indirectbr instructions.
  FunctionPass *createIndirectBrExpandPass();
[...]

llvm/include/llvm/InitializePasses.h

[...]
  void initializeEarlyCSEMemSSALegacyPassPass(PassRegistry&);
  void initializeEarlyIfConverterPass(PassRegistry&);
  void initializeEarlyIfPredicatorPass(PassRegistry &);
  void initializeEarlyMachineLICMPass(PassRegistry&);
  void initializeEarlyTailDuplicatePass(PassRegistry&);
  void initializeEdgeBundlesPass(PassRegistry&);
  void initializeEHContGuardCatchretPass(PassRegistry &);
  void initializeEliminateAvailableExternallyLegacyPassPass(PassRegistry&);
+ void initializeExpandLargeDivRemLegacyPassPass(PassRegistry&);
  void initializeExpandMemCmpPassPass(PassRegistry&);
  void initializeExpandPostRAPass(PassRegistry&);
  void initializeExpandReductionsPass(PassRegistry&);
  void initializeExpandVectorPredicationPass(PassRegistry &);
  void initializeMakeGuardsExplicitLegacyPassPass(PassRegistry&);
  void initializeExternalAAWrapperPassPass(PassRegistry&);
  void initializeFEntryInserterPass(PassRegistry&);
  void initializeFinalizeISelPass(PassRegistry&);
[...]

llvm/include/llvm/LinkAllPasses.h

[...] (context: ForcePassLinking() {)
  (void) llvm::createInstructionNamerPass();
  (void) llvm::createMetaRenamerPass();
  (void) llvm::createAttributorLegacyPass();
  (void) llvm::createAttributorCGSCCLegacyPass();
  (void) llvm::createPostOrderFunctionAttrsLegacyPass();
  (void) llvm::createReversePostOrderFunctionAttrsPass();
  (void) llvm::createMergeFunctionsPass();
  (void) llvm::createMergeICmpsLegacyPass();
+ (void) llvm::createExpandLargeDivRemPass();
  (void) llvm::createExpandMemCmpPass();
  (void) llvm::createExpandVectorPredicationPass();
  std::string buf;
  llvm::raw_string_ostream os(buf);
  (void) llvm::createPrintModulePass(os);
  (void) llvm::createPrintFunctionPass(os);
  (void) llvm::createModuleDebugInfoPrinterPass();
  (void) llvm::createPartialInliningPass();
[...]

llvm/lib/CodeGen/CMakeLists.txt

[...] (context: add_llvm_component_library(LLVMCodeGen)
  DeadMachineInstructionElim.cpp
  DetectDeadLanes.cpp
  DFAPacketizer.cpp
  DwarfEHPrepare.cpp
  EarlyIfConversion.cpp
  EdgeBundles.cpp
  EHContGuardCatchret.cpp
  ExecutionDomainFix.cpp
+ ExpandLargeDivRem.cpp
  ExpandMemCmp.cpp
  ExpandPostRAPseudos.cpp
  ExpandReductions.cpp
  ExpandVectorPredication.cpp
  FaultMaps.cpp
  FEntryInserter.cpp
  FinalizeISel.cpp
  FixupStatepointCallerSaved.cpp
[...]

llvm/lib/CodeGen/CodeGen.cpp

[...] (context: void llvm::initializeCodeGen(PassRegistry &Registry) {)
  initializeDeadMachineInstructionElimPass(Registry);
  initializeDebugifyMachineModulePass(Registry);
  initializeDetectDeadLanesPass(Registry);
  initializeDwarfEHPrepareLegacyPassPass(Registry);
  initializeEarlyIfConverterPass(Registry);
  initializeEarlyIfPredicatorPass(Registry);
  initializeEarlyMachineLICMPass(Registry);
  initializeEarlyTailDuplicatePass(Registry);
+ initializeExpandLargeDivRemLegacyPassPass(Registry);
  initializeExpandMemCmpPassPass(Registry);
  initializeExpandPostRAPass(Registry);
  initializeFEntryInserterPass(Registry);
  initializeFinalizeISelPass(Registry);
  initializeFinalizeMachineBundlesPass(Registry);
  initializeFixupStatepointCallerSavedPass(Registry);
  initializeFuncletLayoutPass(Registry);
  initializeGCMachineCodeAnalysisPass(Registry);
[...]

llvm/lib/CodeGen/ExpandLargeDivRem.cpp

This file was added.

				//===--- ExpandLargeDivRem.cpp - Expand large div/rem ---------------------===//
				//
				pengfeiUnsubmitted Not Done Reply Inline Actions Ditto. pengfei: Ditto.
				// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
				// See https://llvm.org/LICENSE.txt for license information.
				// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
				//
				//===----------------------------------------------------------------------===//
				//
				// This pass expands div/rem instructions with a bitwidth above a threshold
				// into a call to auto-generated functions.
				// This is useful for targets like x86_64 that cannot lower divisions
				// with more than 128 bits or targets like x86_32 that cannot lower divisions
				// with more than 64 bits.
				craig.topper: I think but haven't checked that 32-bit x86, ARM, RISCV32, and other 32-bit targets can't lower division with more than 64 bits.
				mgehre-amd: I checked x86_32 and it lowers 128 bit divisions to __divti3: https://godbolt.org/z/Y6hT9n3x5
				craig.topper: But it fails to link https://godbolt.org/z/5x38Gxqa9 so I think that's a bug. __divti3 doesn't exist in 32-bit libgcc.
				mgehre-amd: That's a good observation! I will update the comment. I also need to modify the follow-up PR, https://reviews.llvm.org/D130076, to set the correct limits for those targets.
				craig.topper: I thought about disabling that libcall in 32-bit mode, but it is supported by compiler-rt. So I guess a linker error when using libgcc is better than a compiler error when using compiler-rt or libgcc until we get this pass in.
				//
				//===----------------------------------------------------------------------===//

				#include "llvm/CodeGen/ExpandLargeDivRem.h"
				#include "llvm/ADT/SmallVector.h"
				#include "llvm/ADT/StringExtras.h"
				#include "llvm/Analysis/GlobalsModRef.h"
				#include "llvm/CodeGen/Passes.h"
				#include "llvm/IR/IRBuilder.h"
				#include "llvm/IR/InstIterator.h"
				#include "llvm/IR/PassManager.h"
				#include "llvm/InitializePasses.h"
				#include "llvm/Pass.h"
				#include "llvm/Support/CommandLine.h"
				#include "llvm/Transforms/Utils/IntegerDivision.h"

				using namespace llvm;

				static cl::opt<unsigned>
				ExpandDivRemBits("expand-div-rem-bits", cl::Hidden, cl::init(128),
				cl::desc("div and rem instructions on integers with "
				craig.topper: This says "<N> or more bits", but the code exits for `<= ExpandDivRemBits`
				mgehre-amd: done
				"more than <N> bits are expanded."));

				static bool runImpl(Function &F) {
				SmallVector<BinaryOperator *, 4> Replace;
				bool Modified = false;

				for (auto &I : instructions(F)) {
				nikic: You should either do the replacement directly in this loop, or not use make_early_inc_range. Doesn't make sense to use it if you're not modifying anything.
				switch (I.getOpcode()) {
				nikic: Unnecessary newline
				case Instruction::UDiv:
				case Instruction::SDiv:
				case Instruction::URem:
				case Instruction::SRem: {
				// TODO: This doesn't handle vectors.
				auto *IntTy = dyn_cast<IntegerType>(I.getType());
				arsenm: This won't handle vectors
				mgehre-amd: True, I'm not concerned about vectors right now. I can add a TODO comment here.
				if (!IntTy || IntTy->getIntegerBitWidth() <= ExpandDivRemBits)
				continue;

				Replace.push_back(&cast<BinaryOperator>(I));
				Modified = true;
				break;
				}
				default:
				break;
				}
				}

				arsenm: Can you set these all at once with AttributeList
				if (Replace.empty())
				return false;

				while (!Replace.empty()) {
				BinaryOperator *I = Replace.pop_back_val();

				if (I->getOpcode() == Instruction::UDiv ||
				I->getOpcode() == Instruction::SDiv) {
				expandDivision(I);
				nikic: Could use a vector of BinaryOperator to avoid the casts here.
				} else {
				expandRemainder(I);
				}
				}

				return Modified;
				}

				PreservedAnalyses ExpandLargeDivRemPass::run(Function &F,
				FunctionAnalysisManager &AM) {
				bool Changed = runImpl(F);

				if (Changed)
				return PreservedAnalyses::none();

				return PreservedAnalyses::all();
				nikic: At least for the new pass manager, I don't think it's legal to add new functions in a function pass. I'm not sure about the legacy pass manager.
				mgehre-amd: Yes, this was one of the early comments I got on this PR, so I changed it to not insert new functions but instead insert the loop in-place.
				}

				class ExpandLargeDivRemLegacyPass : public FunctionPass {
				public:
				static char ID;

				ExpandLargeDivRemLegacyPass() : FunctionPass(ID) {
				initializeExpandLargeDivRemLegacyPassPass(*PassRegistry::getPassRegistry());
				}

				bool runOnFunction(Function &F) override { return runImpl(F); }

				void getAnalysisUsage(AnalysisUsage &AU) const override {
				AU.addPreserved<AAResultsWrapperPass>();
				AU.addPreserved<GlobalsAAWrapperPass>();
				}
				};

				char ExpandLargeDivRemLegacyPass::ID = 0;
				INITIALIZE_PASS_BEGIN(ExpandLargeDivRemLegacyPass, "expand-large-div-rem",
				"Expand large div/rem", false, false)
				INITIALIZE_PASS_END(ExpandLargeDivRemLegacyPass, "expand-large-div-rem",
				"Expand large div/rem", false, false)

				FunctionPass *llvm::createExpandLargeDivRemPass() {
				return new ExpandLargeDivRemLegacyPass();
				}
				arsenm: dyn_cast instead of isa + implicit cast to IntegerType?
				arsenm: You shouldn't need to pre-accumulate the instructions, just use make_early_inc_range on the instruction iterator?
				mgehre-amd: I tried, but it crashed. I think that's because we not only remove the current instruction but also insert new basic blocks, which confuses `make_early_inc_range (instructions(F))`.
				arsenm: Since this is ultimately a workaround for SelectionDAG, ideally this would be a TargetLowering control
				mgehre-amd: I'm not sure how I would wire TargetLowering into the pass - I didn't find other passes using it. Would TargetTransformInfo also work for you?
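
The pass shown above first collects the qualifying instructions and only afterwards expands them, because expansion inserts new basic blocks. A minimal standalone sketch of that collect-then-rewrite shape follows. It uses a toy `Inst` record rather than LLVM's `Instruction`; the names `Op`, `Inst`, `isLargeDivRem`, and `collectLargeDivRem` are hypothetical and not part of the patch.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-ins for LLVM opcodes and instructions (hypothetical, illustration only).
enum class Op { Add, UDiv, SDiv, URem, SRem };

struct Inst {
  Op Opcode;
  unsigned BitWidth; // 0 models a non-scalar-integer result type (e.g. a vector)
};

// Mirrors the filter in the pass: only div/rem strictly wider than the
// threshold qualify, matching "more than <N> bits are expanded".
bool isLargeDivRem(const Inst &I, unsigned Threshold) {
  switch (I.Opcode) {
  case Op::UDiv:
  case Op::SDiv:
  case Op::URem:
  case Op::SRem:
    return I.BitWidth > Threshold;
  default:
    return false;
  }
}

// Phase 1: record which instructions need expansion without mutating anything.
// Phase 2 (not modeled here) would walk the returned worklist and rewrite each
// entry, so the rewrite never runs while the instruction list is being iterated.
std::vector<std::size_t> collectLargeDivRem(const std::vector<Inst> &Insts,
                                            unsigned Threshold) {
  std::vector<std::size_t> Worklist;
  for (std::size_t Idx = 0; Idx < Insts.size(); ++Idx)
    if (isLargeDivRem(Insts[Idx], Threshold))
      Worklist.push_back(Idx);
  return Worklist;
}
```

With a threshold of 128, a 128-bit `udiv` is left alone while a 129-bit `srem` is queued, which is the boundary behaviour raised in the review comment on the option description.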

llvm/lib/Transforms/Utils/IntegerDivision.cpp

Show All 26 Lines
/// remainder, which will have the sign of the dividend. Builder's insert point		/// remainder, which will have the sign of the dividend. Builder's insert point
/// should be pointing where the caller wants code generated, e.g. at the srem		/// should be pointing where the caller wants code generated, e.g. at the srem
/// instruction. This will generate a urem in the process, and Builder's insert		/// instruction. This will generate a urem in the process, and Builder's insert
/// point will be pointing at the uren (if present, i.e. not folded), ready to		/// point will be pointing at the uren (if present, i.e. not folded), ready to
/// be expanded if the user wishes		/// be expanded if the user wishes
static Value *generateSignedRemainderCode(Value *Dividend, Value *Divisor,		static Value *generateSignedRemainderCode(Value *Dividend, Value *Divisor,
IRBuilder<> &Builder) {		IRBuilder<> &Builder) {
unsigned BitWidth = Dividend->getType()->getIntegerBitWidth();		unsigned BitWidth = Dividend->getType()->getIntegerBitWidth();
ConstantInt *Shift;		ConstantInt *Shift = Builder.getIntN(BitWidth, BitWidth - 1);
		nikic: Precommit as NFC changes?

if (BitWidth == 64) {
Shift = Builder.getInt64(63);
} else {
assert(BitWidth == 32 && "Unexpected bit width");
Shift = Builder.getInt32(31);
}

// Following instructions are generated for both i32 (shift 31) and		// Following instructions are generated for both i32 (shift 31) and
// i64 (shift 63).		// i64 (shift 63).

// ; %dividend_sgn = ashr i32 %dividend, 31		// ; %dividend_sgn = ashr i32 %dividend, 31
// ; %divisor_sgn = ashr i32 %divisor, 31		// ; %divisor_sgn = ashr i32 %divisor, 31
// ; %dvd_xor = xor i32 %dividend, %dividend_sgn		// ; %dvd_xor = xor i32 %dividend, %dividend_sgn
// ; %dvs_xor = xor i32 %divisor, %divisor_sgn		// ; %dvs_xor = xor i32 %divisor, %divisor_sgn
▲ Show 20 Lines • Show All 48 Lines • ▼ Show 20 Lines
/// code generated, e.g. at the sdiv instruction. This will generate a udiv in		/// code generated, e.g. at the sdiv instruction. This will generate a udiv in
/// the process, and Builder's insert point will be pointing at the udiv (if		/// the process, and Builder's insert point will be pointing at the udiv (if
/// present, i.e. not folded), ready to be expanded if the user wishes.		/// present, i.e. not folded), ready to be expanded if the user wishes.
static Value *generateSignedDivisionCode(Value *Dividend, Value *Divisor,		static Value *generateSignedDivisionCode(Value *Dividend, Value *Divisor,
IRBuilder<> &Builder) {		IRBuilder<> &Builder) {
// Implementation taken from compiler-rt's __divsi3 and __divdi3		// Implementation taken from compiler-rt's __divsi3 and __divdi3

unsigned BitWidth = Dividend->getType()->getIntegerBitWidth();		unsigned BitWidth = Dividend->getType()->getIntegerBitWidth();
ConstantInt *Shift;		ConstantInt *Shift = Builder.getIntN(BitWidth, BitWidth - 1);

if (BitWidth == 64) {
Shift = Builder.getInt64(63);
} else {
assert(BitWidth == 32 && "Unexpected bit width");
Shift = Builder.getInt32(31);
}

// Following instructions are generated for both i32 (shift 31) and		// Following instructions are generated for both i32 (shift 31) and
// i64 (shift 63).		// i64 (shift 63).

// ; %tmp = ashr i32 %dividend, 31		// ; %tmp = ashr i32 %dividend, 31
// ; %tmp1 = ashr i32 %divisor, 31		// ; %tmp1 = ashr i32 %divisor, 31
// ; %tmp2 = xor i32 %tmp, %dividend		// ; %tmp2 = xor i32 %tmp, %dividend
// ; %u_dvnd = sub nsw i32 %tmp2, %tmp		// ; %u_dvnd = sub nsw i32 %tmp2, %tmp
Show All 28 Lines	static Value *generateUnsignedDivisionCode(Value *Dividend, Value *Divisor,
// The basic algorithm can be found in the compiler-rt project's		// The basic algorithm can be found in the compiler-rt project's
// implementation of __udivsi3.c. Here, we do a lower-level IR based approach		// implementation of __udivsi3.c. Here, we do a lower-level IR based approach
// that's been hand-tuned to lessen the amount of control flow involved.		// that's been hand-tuned to lessen the amount of control flow involved.

// Some helper values		// Some helper values
IntegerType *DivTy = cast<IntegerType>(Dividend->getType());		IntegerType *DivTy = cast<IntegerType>(Dividend->getType());
unsigned BitWidth = DivTy->getBitWidth();		unsigned BitWidth = DivTy->getBitWidth();

ConstantInt *Zero;		ConstantInt *Zero = ConstantInt::get(DivTy, 0);
ConstantInt *One;		ConstantInt *One = ConstantInt::get(DivTy, 1);
ConstantInt *NegOne;		ConstantInt *NegOne = ConstantInt::getSigned(DivTy, -1);
ConstantInt *MSB;		ConstantInt *MSB = ConstantInt::get(DivTy, BitWidth - 1);
		craig.topper: You already have the DivTy, why not go through ConstantInt::get for all of these?
		mgehre-amd: done

if (BitWidth == 64) {
Zero = Builder.getInt64(0);
One = Builder.getInt64(1);
NegOne = ConstantInt::getSigned(DivTy, -1);
MSB = Builder.getInt64(63);
} else {
assert(BitWidth == 32 && "Unexpected bit width");
Zero = Builder.getInt32(0);
One = Builder.getInt32(1);
NegOne = ConstantInt::getSigned(DivTy, -1);
MSB = Builder.getInt32(31);
}

ConstantInt *True = Builder.getTrue();		ConstantInt *True = Builder.getTrue();

BasicBlock *IBB = Builder.GetInsertBlock();		BasicBlock *IBB = Builder.GetInsertBlock();
Function *F = IBB->getParent();		Function *F = IBB->getParent();
Function *CTLZ = Intrinsic::getDeclaration(F->getParent(), Intrinsic::ctlz,		Function *CTLZ = Intrinsic::getDeclaration(F->getParent(), Intrinsic::ctlz,
DivTy);		DivTy);

▲ Show 20 Lines • Show All 178 Lines • ▼ Show 20 Lines	static Value *generateUnsignedDivisionCode(Value *Dividend, Value *Divisor,
Q_5->addIncoming(RetVal, SpecialCases);		Q_5->addIncoming(RetVal, SpecialCases);

return Q_5;		return Q_5;
}		}

/// Generate code to calculate the remainder of two integers, replacing Rem with		/// Generate code to calculate the remainder of two integers, replacing Rem with
/// the generated code. This currently generates code using the udiv expansion,		/// the generated code. This currently generates code using the udiv expansion,
/// but future work includes generating more specialized code, e.g. when more		/// but future work includes generating more specialized code, e.g. when more
/// information about the operands are known. Implements both 32bit and 64bit		/// information about the operands are known.
/// scalar division.
///		///
/// Replace Rem with generated code.		/// Replace Rem with generated code.
bool llvm::expandRemainder(BinaryOperator *Rem) {		bool llvm::expandRemainder(BinaryOperator *Rem) {
assert((Rem->getOpcode() == Instruction::SRem ||		assert((Rem->getOpcode() == Instruction::SRem ||
Rem->getOpcode() == Instruction::URem) &&		Rem->getOpcode() == Instruction::URem) &&
"Trying to expand remainder from a non-remainder function");		"Trying to expand remainder from a non-remainder function");

IRBuilder<> Builder(Rem);		IRBuilder<> Builder(Rem);

assert(!Rem->getType()->isVectorTy() && "Div over vectors not supported");		assert(!Rem->getType()->isVectorTy() && "Div over vectors not supported");
assert((Rem->getType()->getIntegerBitWidth() == 32 ||
Rem->getType()->getIntegerBitWidth() == 64) &&
"Div of bitwidth other than 32 or 64 not supported");

// First prepare the sign if it's a signed remainder		// First prepare the sign if it's a signed remainder
if (Rem->getOpcode() == Instruction::SRem) {		if (Rem->getOpcode() == Instruction::SRem) {
Value *Remainder = generateSignedRemainderCode(Rem->getOperand(0),		Value *Remainder = generateSignedRemainderCode(Rem->getOperand(0),
Rem->getOperand(1), Builder);		Rem->getOperand(1), Builder);

// Check whether this is the insert point while Rem is still valid.		// Check whether this is the insert point while Rem is still valid.
bool IsInsertPoint = Rem->getIterator() == Builder.GetInsertPoint();		bool IsInsertPoint = Rem->getIterator() == Builder.GetInsertPoint();
Show All 23 Lines	bool llvm::expandRemainder(BinaryOperator *Rem) {
if (BinaryOperator *UDiv = dyn_cast<BinaryOperator>(Builder.GetInsertPoint())) {		if (BinaryOperator *UDiv = dyn_cast<BinaryOperator>(Builder.GetInsertPoint())) {
assert(UDiv->getOpcode() == Instruction::UDiv && "Non-udiv in expansion?");		assert(UDiv->getOpcode() == Instruction::UDiv && "Non-udiv in expansion?");
expandDivision(UDiv);		expandDivision(UDiv);
}		}

return true;		return true;
}		}


/// Generate code to divide two integers, replacing Div with the generated		/// Generate code to divide two integers, replacing Div with the generated
/// code. This currently generates code similarly to compiler-rt's		/// code. This currently generates code similarly to compiler-rt's
/// implementations, but future work includes generating more specialized code		/// implementations, but future work includes generating more specialized code
/// when more information about the operands are known. Implements both		/// when more information about the operands are known.
/// 32bit and 64bit scalar division.
///		///
/// Replace Div with generated code.		/// Replace Div with generated code.
bool llvm::expandDivision(BinaryOperator *Div) {		bool llvm::expandDivision(BinaryOperator *Div) {
assert((Div->getOpcode() == Instruction::SDiv ||		assert((Div->getOpcode() == Instruction::SDiv ||
Div->getOpcode() == Instruction::UDiv) &&		Div->getOpcode() == Instruction::UDiv) &&
"Trying to expand division from a non-division function");		"Trying to expand division from a non-division function");

IRBuilder<> Builder(Div);		IRBuilder<> Builder(Div);

assert(!Div->getType()->isVectorTy() && "Div over vectors not supported");		assert(!Div->getType()->isVectorTy() && "Div over vectors not supported");
assert((Div->getType()->getIntegerBitWidth() == 32 ||
Div->getType()->getIntegerBitWidth() == 64) &&
"Div of bitwidth other than 32 or 64 not supported");

// First prepare the sign if it's a signed division		// First prepare the sign if it's a signed division
if (Div->getOpcode() == Instruction::SDiv) {		if (Div->getOpcode() == Instruction::SDiv) {
// Lower the code to unsigned division, and reset Div to point to the udiv.		// Lower the code to unsigned division, and reset Div to point to the udiv.
Value *Quotient = generateSignedDivisionCode(Div->getOperand(0),		Value *Quotient = generateSignedDivisionCode(Div->getOperand(0),
Div->getOperand(1), Builder);		Div->getOperand(1), Builder);

// Check whether this is the insert point while Div is still valid.		// Check whether this is the insert point while Div is still valid.
▲ Show 20 Lines • Show All 84 Lines • ▼ Show 20 Lines	assert((Rem->getOpcode() == Instruction::SRem ||
Rem->getOpcode() == Instruction::URem) &&		Rem->getOpcode() == Instruction::URem) &&
"Trying to expand remainder from a non-remainder function");		"Trying to expand remainder from a non-remainder function");

Type *RemTy = Rem->getType();		Type *RemTy = Rem->getType();
assert(!RemTy->isVectorTy() && "Div over vectors not supported");		assert(!RemTy->isVectorTy() && "Div over vectors not supported");

unsigned RemTyBitWidth = RemTy->getIntegerBitWidth();		unsigned RemTyBitWidth = RemTy->getIntegerBitWidth();

assert(RemTyBitWidth <= 64 && "Div of bitwidth greater than 64 not supported");		if (RemTyBitWidth >= 64)

if (RemTyBitWidth == 64)
return expandRemainder(Rem);		return expandRemainder(Rem);

// If bitwidth smaller than 64 extend inputs, extend output and proceed		// If bitwidth smaller than 64 extend inputs, extend output and proceed
// with 64 bit division.		// with 64 bit division.
IRBuilder<> Builder(Rem);		IRBuilder<> Builder(Rem);

Value *ExtDividend;		Value *ExtDividend;
Value *ExtDivisor;		Value *ExtDivisor;
▲ Show 20 Lines • Show All 78 Lines • ▼ Show 20 Lines	assert((Div->getOpcode() == Instruction::SDiv ||
Div->getOpcode() == Instruction::UDiv) &&		Div->getOpcode() == Instruction::UDiv) &&
"Trying to expand division from a non-division function");		"Trying to expand division from a non-division function");

Type *DivTy = Div->getType();		Type *DivTy = Div->getType();
assert(!DivTy->isVectorTy() && "Div over vectors not supported");		assert(!DivTy->isVectorTy() && "Div over vectors not supported");

unsigned DivTyBitWidth = DivTy->getIntegerBitWidth();		unsigned DivTyBitWidth = DivTy->getIntegerBitWidth();

assert(DivTyBitWidth <= 64 &&		if (DivTyBitWidth >= 64)
"Div of bitwidth greater than 64 not supported");

if (DivTyBitWidth == 64)
return expandDivision(Div);		return expandDivision(Div);

// If bitwidth smaller than 64 extend inputs, extend output and proceed		// If bitwidth smaller than 64 extend inputs, extend output and proceed
// with 64 bit division.		// with 64 bit division.
IRBuilder<> Builder(Div);		IRBuilder<> Builder(Div);

Value *ExtDividend;		Value *ExtDividend;
Value *ExtDivisor;		Value *ExtDivisor;
Show All 21 Lines
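
The signed helpers in this file reduce `sdiv`/`srem` to their unsigned counterparts with a sign mask obtained by an arithmetic shift of `BitWidth - 1`, which is why the diff replaces the hard-coded 31/63 constants with `Builder.getIntN(BitWidth, BitWidth - 1)`. The sketch below shows the same mask arithmetic on plain 64-bit integers. It is an illustration under that assumption only; `signMask`, `absViaMask`, `sdivViaUdiv`, and `sremViaUrem` are hypothetical names, and a zero divisor is undefined here just as it is in the IR.

```cpp
#include <cassert>
#include <cstdint>

// All-ones if V is negative, zero otherwise. In the generated IR this is
// `ashr V, BitWidth - 1`; a comparison is used here to stay portable.
uint64_t signMask(int64_t V) { return V < 0 ? ~uint64_t{0} : uint64_t{0}; }

// Magnitude of V as an unsigned value: (V xor Sign) - Sign.
uint64_t absViaMask(int64_t V, uint64_t Sign) {
  return (static_cast<uint64_t>(V) ^ Sign) - Sign;
}

// Signed division built on unsigned division; the quotient is negated when
// exactly one operand is negative.
int64_t sdivViaUdiv(int64_t Dividend, int64_t Divisor) {
  uint64_t DvdSign = signMask(Dividend);
  uint64_t DvsSign = signMask(Divisor);
  uint64_t Q = absViaMask(Dividend, DvdSign) / absViaMask(Divisor, DvsSign);
  uint64_t QSign = DvdSign ^ DvsSign;
  return static_cast<int64_t>((Q ^ QSign) - QSign);
}

// Signed remainder built on unsigned remainder; the result takes the sign of
// the dividend, as the doc comment on the remainder helper states.
int64_t sremViaUrem(int64_t Dividend, int64_t Divisor) {
  uint64_t DvdSign = signMask(Dividend);
  uint64_t DvsSign = signMask(Divisor);
  uint64_t R = absViaMask(Dividend, DvdSign) % absViaMask(Divisor, DvsSign);
  return static_cast<int64_t>((R ^ DvdSign) - DvdSign);
}
```

The same xor/sub pair appears before and after the unsigned expansion in the i129 test expectations, with 128 as the shift amount.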

llvm/test/CodeGen/X86/urem-seteq.ll

Show First 20 Lines • Show All 356 Lines • ▼ Show 20 Lines	; X64-NEXT: retq
ret i32 %ret		ret i32 %ret
}		}

; Check illegal types.		; Check illegal types.

; https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=34366		; https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=34366
define void @ossfuzz34366() {		define void @ossfuzz34366() {
; X86-LABEL: ossfuzz34366:		; X86-LABEL: ossfuzz34366:
; X86: # %bb.0:
; X86-NEXT: movl (%eax), %eax
; X86-NEXT: movl %eax, %ecx
; X86-NEXT: andl $2147483647, %ecx # imm = 0x7FFFFFFF
; X86-NEXT: orl %eax, %ecx
; X86-NEXT: sete (%eax)
; X86-NEXT: retl
;
; X64-LABEL: ossfuzz34366:		; X64-LABEL: ossfuzz34366:
; X64: # %bb.0:
; X64-NEXT: movq (%rax), %rax
; X64-NEXT: movabsq $9223372036854775807, %rcx # imm = 0x7FFFFFFFFFFFFFFF
; X64-NEXT: andq %rax, %rcx
; X64-NEXT: orq %rax, %rcx
; X64-NEXT: sete (%rax)
; X64-NEXT: retq
nikic: What happened here?
mgehre-amd: This IR used to crash LLVM. I don't think that the actual assembly is relevant here, and after my changes it's much longer due to the loops that were inserted.
%L10 = load i448, ptr undef, align 4		%L10 = load i448, ptr undef, align 4
%B18 = urem i448 %L10, -363419362147803445274661903944002267176820680343659030140745099590319644056698961663095525356881782780381260803133088966767300814307328		%B18 = urem i448 %L10, -363419362147803445274661903944002267176820680343659030140745099590319644056698961663095525356881782780381260803133088966767300814307328
%C13 = icmp ule i448 %B18, 0		%C13 = icmp ule i448 %B18, 0
store i1 %C13, ptr undef, align 1		store i1 %C13, ptr undef, align 1
ret void		ret void
}		}

llvm/test/Transforms/ExpandLargeDivRem/sdiv129.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt -S -expand-large-div-rem < %s | FileCheck %s

				define void @sdiv129(i129* %ptr, i129* %out) nounwind {
				; CHECK-LABEL: @sdiv129(
				; CHECK-NEXT: _udiv-special-cases:
				; CHECK-NEXT: [[A:%.*]] = load i129, i129* [[PTR:%.*]], align 4
				; CHECK-NEXT: [[TMP0:%.*]] = ashr i129 [[A]], 128
				; CHECK-NEXT: [[TMP1:%.*]] = xor i129 [[TMP0]], [[A]]
				; CHECK-NEXT: [[TMP2:%.*]] = sub i129 [[TMP1]], [[TMP0]]
				; CHECK-NEXT: [[TMP3:%.*]] = xor i129 0, [[TMP0]]
				; CHECK-NEXT: [[TMP4:%.*]] = icmp eq i129 [[TMP2]], 0
				arsenm: The positioning of the checks looks weird. Usually update_test_checks put these comments inside the function?
				mgehre-amd: I guess the position is still from the time where this was an extra function. I regenerated the check-lines, but update_test_checks.py doesn't move them around. I will re-generate them from scratch to get the typical position.
				; CHECK-NEXT: [[TMP5:%.*]] = or i1 false, [[TMP4]]
				; CHECK-NEXT: [[TMP6:%.*]] = call i129 @llvm.ctlz.i129(i129 3, i1 true)
				; CHECK-NEXT: [[TMP7:%.*]] = call i129 @llvm.ctlz.i129(i129 [[TMP2]], i1 true)
				; CHECK-NEXT: [[TMP8:%.*]] = sub i129 [[TMP6]], [[TMP7]]
				; CHECK-NEXT: [[TMP9:%.*]] = icmp ugt i129 [[TMP8]], 128
				; CHECK-NEXT: [[TMP10:%.*]] = or i1 [[TMP5]], [[TMP9]]
				; CHECK-NEXT: [[TMP11:%.*]] = icmp eq i129 [[TMP8]], 128
				; CHECK-NEXT: [[TMP12:%.*]] = select i1 [[TMP10]], i129 0, i129 [[TMP2]]
				; CHECK-NEXT: [[TMP13:%.*]] = or i1 [[TMP10]], [[TMP11]]
				; CHECK-NEXT: br i1 [[TMP13]], label [[UDIV_END:%.*]], label [[UDIV_BB1:%.*]]
				; CHECK: udiv-loop-exit:
				; CHECK-NEXT: [[TMP14:%.*]] = phi i129 [ 0, [[UDIV_BB1]] ], [ [[TMP29:%.*]], [[UDIV_DO_WHILE:%.*]] ]
				; CHECK-NEXT: [[TMP15:%.*]] = phi i129 [ [[TMP37:%.*]], [[UDIV_BB1]] ], [ [[TMP26:%.*]], [[UDIV_DO_WHILE]] ]
				; CHECK-NEXT: [[TMP16:%.*]] = shl i129 [[TMP15]], 1
				; CHECK-NEXT: [[TMP17:%.*]] = or i129 [[TMP14]], [[TMP16]]
				; CHECK-NEXT: br label [[UDIV_END]]
				; CHECK: udiv-do-while:
				; CHECK-NEXT: [[TMP18:%.*]] = phi i129 [ 0, [[UDIV_PREHEADER:%.*]] ], [ [[TMP29]], [[UDIV_DO_WHILE]] ]
				; CHECK-NEXT: [[TMP19:%.*]] = phi i129 [ [[TMP35:%.*]], [[UDIV_PREHEADER]] ], [ [[TMP32:%.*]], [[UDIV_DO_WHILE]] ]
				; CHECK-NEXT: [[TMP20:%.*]] = phi i129 [ [[TMP34:%.*]], [[UDIV_PREHEADER]] ], [ [[TMP31:%.*]], [[UDIV_DO_WHILE]] ]
				; CHECK-NEXT: [[TMP21:%.*]] = phi i129 [ [[TMP37]], [[UDIV_PREHEADER]] ], [ [[TMP26]], [[UDIV_DO_WHILE]] ]
				; CHECK-NEXT: [[TMP22:%.*]] = shl i129 [[TMP20]], 1
				; CHECK-NEXT: [[TMP23:%.*]] = lshr i129 [[TMP21]], 128
				; CHECK-NEXT: [[TMP24:%.*]] = or i129 [[TMP22]], [[TMP23]]
				; CHECK-NEXT: [[TMP25:%.*]] = shl i129 [[TMP21]], 1
				; CHECK-NEXT: [[TMP26]] = or i129 [[TMP18]], [[TMP25]]
				; CHECK-NEXT: [[TMP27:%.*]] = sub i129 2, [[TMP24]]
				; CHECK-NEXT: [[TMP28:%.*]] = ashr i129 [[TMP27]], 128
				; CHECK-NEXT: [[TMP29]] = and i129 [[TMP28]], 1
				; CHECK-NEXT: [[TMP30:%.*]] = and i129 [[TMP28]], 3
				; CHECK-NEXT: [[TMP31]] = sub i129 [[TMP24]], [[TMP30]]
				; CHECK-NEXT: [[TMP32]] = add i129 [[TMP19]], -1
				; CHECK-NEXT: [[TMP33:%.*]] = icmp eq i129 [[TMP32]], 0
				; CHECK-NEXT: br i1 [[TMP33]], label [[UDIV_LOOP_EXIT:%.*]], label [[UDIV_DO_WHILE]]
				; CHECK: udiv-preheader:
				; CHECK-NEXT: [[TMP34]] = lshr i129 [[TMP2]], [[TMP35]]
				; CHECK-NEXT: br label [[UDIV_DO_WHILE]]
				; CHECK: udiv-bb1:
				; CHECK-NEXT: [[TMP35]] = add i129 [[TMP8]], 1
				; CHECK-NEXT: [[TMP36:%.*]] = sub i129 128, [[TMP8]]
				; CHECK-NEXT: [[TMP37]] = shl i129 [[TMP2]], [[TMP36]]
				; CHECK-NEXT: [[TMP38:%.*]] = icmp eq i129 [[TMP35]], 0
				; CHECK-NEXT: br i1 [[TMP38]], label [[UDIV_LOOP_EXIT]], label [[UDIV_PREHEADER]]
				; CHECK: udiv-end:
				; CHECK-NEXT: [[TMP39:%.*]] = phi i129 [ [[TMP17]], [[UDIV_LOOP_EXIT]] ], [ [[TMP12]], [[_UDIV_SPECIAL_CASES:%.*]] ]
				; CHECK-NEXT: [[TMP40:%.*]] = xor i129 [[TMP39]], [[TMP3]]
				; CHECK-NEXT: [[TMP41:%.*]] = sub i129 [[TMP40]], [[TMP3]]
				; CHECK-NEXT: store i129 [[TMP41]], i129* [[OUT:%.*]], align 4
				; CHECK-NEXT: ret void
				;
				%a = load i129, i129* %ptr
				%res = sdiv i129 %a, 3
				store i129 %res, i129* %out
				ret void
				}
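
As a rough mental model of what the expanded loop computes, the plain shift-subtract (restoring) division mentioned in the summary can be written for 32-bit operands as below. This is a simplified sketch, not the exact expansion: the generated IR above additionally appears to use `ctlz` on both operands to skip leading zero bits and handles special cases before entering the loop. `UDivRem` and `shiftSubtractDivide` are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

struct UDivRem {
  uint32_t Quotient;
  uint32_t Remainder;
};

// Restoring division: bring down one dividend bit per iteration, from the
// most significant to the least significant, and subtract the divisor
// whenever the running remainder is large enough.
UDivRem shiftSubtractDivide(uint32_t Dividend, uint32_t Divisor) {
  assert(Divisor != 0 && "division by zero is undefined, as in the IR");
  uint64_t R = 0; // wider than the operands so the left shift cannot overflow
  uint32_t Q = 0;
  for (int I = 31; I >= 0; --I) {
    R = (R << 1) | ((Dividend >> I) & 1u);
    if (R >= Divisor) {
      R -= Divisor;
      Q |= uint32_t{1} << I; // this quotient bit is set
    }
  }
  return {Q, static_cast<uint32_t>(R)};
}
```

The loop always runs once per bit of the operand width, which is the cost the `ctlz`-based iteration count in the real expansion is meant to reduce.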

llvm/test/Transforms/ExpandLargeDivRem/srem129.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt -S -expand-large-div-rem < %s | FileCheck %s

				define void @test(i129* %ptr, i129* %out) nounwind {
				; CHECK-LABEL: @test(
				; CHECK-NEXT: _udiv-special-cases:
				; CHECK-NEXT: [[A:%.*]] = load i129, i129* [[PTR:%.*]], align 4
				; CHECK-NEXT: [[TMP0:%.*]] = ashr i129 [[A]], 128
				; CHECK-NEXT: [[TMP1:%.*]] = xor i129 [[A]], [[TMP0]]
				; CHECK-NEXT: [[TMP2:%.*]] = sub i129 [[TMP1]], [[TMP0]]
				; CHECK-NEXT: [[TMP3:%.*]] = icmp eq i129 [[TMP2]], 0
				; CHECK-NEXT: [[TMP4:%.*]] = or i1 false, [[TMP3]]
				; CHECK-NEXT: [[TMP5:%.*]] = call i129 @llvm.ctlz.i129(i129 3, i1 true)
				; CHECK-NEXT: [[TMP6:%.*]] = call i129 @llvm.ctlz.i129(i129 [[TMP2]], i1 true)
				; CHECK-NEXT: [[TMP7:%.*]] = sub i129 [[TMP5]], [[TMP6]]
				; CHECK-NEXT: [[TMP8:%.*]] = icmp ugt i129 [[TMP7]], 128
				; CHECK-NEXT: [[TMP9:%.*]] = or i1 [[TMP4]], [[TMP8]]
				; CHECK-NEXT: [[TMP10:%.*]] = icmp eq i129 [[TMP7]], 128
				; CHECK-NEXT: [[TMP11:%.*]] = select i1 [[TMP9]], i129 0, i129 [[TMP2]]
				; CHECK-NEXT: [[TMP12:%.*]] = or i1 [[TMP9]], [[TMP10]]
				; CHECK-NEXT: br i1 [[TMP12]], label [[UDIV_END:%.*]], label [[UDIV_BB1:%.*]]
				; CHECK: udiv-loop-exit:
				; CHECK-NEXT: [[TMP13:%.*]] = phi i129 [ 0, [[UDIV_BB1]] ], [ [[TMP28:%.*]], [[UDIV_DO_WHILE:%.*]] ]
				; CHECK-NEXT: [[TMP14:%.*]] = phi i129 [ [[TMP36:%.*]], [[UDIV_BB1]] ], [ [[TMP25:%.*]], [[UDIV_DO_WHILE]] ]
				; CHECK-NEXT: [[TMP15:%.*]] = shl i129 [[TMP14]], 1
				; CHECK-NEXT: [[TMP16:%.*]] = or i129 [[TMP13]], [[TMP15]]
				; CHECK-NEXT: br label [[UDIV_END]]
				; CHECK: udiv-do-while:
				; CHECK-NEXT: [[TMP17:%.*]] = phi i129 [ 0, [[UDIV_PREHEADER:%.*]] ], [ [[TMP28]], [[UDIV_DO_WHILE]] ]
				; CHECK-NEXT: [[TMP18:%.*]] = phi i129 [ [[TMP34:%.*]], [[UDIV_PREHEADER]] ], [ [[TMP31:%.*]], [[UDIV_DO_WHILE]] ]
				; CHECK-NEXT: [[TMP19:%.*]] = phi i129 [ [[TMP33:%.*]], [[UDIV_PREHEADER]] ], [ [[TMP30:%.*]], [[UDIV_DO_WHILE]] ]
				; CHECK-NEXT: [[TMP20:%.*]] = phi i129 [ [[TMP36]], [[UDIV_PREHEADER]] ], [ [[TMP25]], [[UDIV_DO_WHILE]] ]
				; CHECK-NEXT: [[TMP21:%.*]] = shl i129 [[TMP19]], 1
				; CHECK-NEXT: [[TMP22:%.*]] = lshr i129 [[TMP20]], 128
				; CHECK-NEXT: [[TMP23:%.*]] = or i129 [[TMP21]], [[TMP22]]
				; CHECK-NEXT: [[TMP24:%.*]] = shl i129 [[TMP20]], 1
				; CHECK-NEXT: [[TMP25]] = or i129 [[TMP17]], [[TMP24]]
				; CHECK-NEXT: [[TMP26:%.*]] = sub i129 2, [[TMP23]]
				; CHECK-NEXT: [[TMP27:%.*]] = ashr i129 [[TMP26]], 128
				; CHECK-NEXT: [[TMP28]] = and i129 [[TMP27]], 1
				; CHECK-NEXT: [[TMP29:%.*]] = and i129 [[TMP27]], 3
				; CHECK-NEXT: [[TMP30]] = sub i129 [[TMP23]], [[TMP29]]
				; CHECK-NEXT: [[TMP31]] = add i129 [[TMP18]], -1
				; CHECK-NEXT: [[TMP32:%.*]] = icmp eq i129 [[TMP31]], 0
				; CHECK-NEXT: br i1 [[TMP32]], label [[UDIV_LOOP_EXIT:%.*]], label [[UDIV_DO_WHILE]]
				; CHECK: udiv-preheader:
				; CHECK-NEXT: [[TMP33]] = lshr i129 [[TMP2]], [[TMP34]]
				; CHECK-NEXT: br label [[UDIV_DO_WHILE]]
				; CHECK: udiv-bb1:
				; CHECK-NEXT: [[TMP34]] = add i129 [[TMP7]], 1
				; CHECK-NEXT: [[TMP35:%.*]] = sub i129 128, [[TMP7]]
				; CHECK-NEXT: [[TMP36]] = shl i129 [[TMP2]], [[TMP35]]
				; CHECK-NEXT: [[TMP37:%.*]] = icmp eq i129 [[TMP34]], 0
				; CHECK-NEXT: br i1 [[TMP37]], label [[UDIV_LOOP_EXIT]], label [[UDIV_PREHEADER]]
				; CHECK: udiv-end:
				; CHECK-NEXT: [[TMP38:%.*]] = phi i129 [ [[TMP16]], [[UDIV_LOOP_EXIT]] ], [ [[TMP11]], [[_UDIV_SPECIAL_CASES:%.*]] ]
				; CHECK-NEXT: [[TMP39:%.*]] = mul i129 3, [[TMP38]]
				; CHECK-NEXT: [[TMP40:%.*]] = sub i129 [[TMP2]], [[TMP39]]
				; CHECK-NEXT: [[TMP41:%.*]] = xor i129 [[TMP40]], [[TMP0]]
				; CHECK-NEXT: [[TMP42:%.*]] = sub i129 [[TMP41]], [[TMP0]]
				; CHECK-NEXT: store i129 [[TMP42]], i129* [[OUT:%.*]], align 4
				; CHECK-NEXT: ret void
				;
				%a = load i129, i129* %ptr
				%res = srem i129 %a, 3
				store i129 %res, i129* %out
				ret void
				}

llvm/test/Transforms/ExpandLargeDivRem/udiv129.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt -S -expand-large-div-rem < %s | FileCheck %s

				define void @test(i129* %ptr, i129* %out) nounwind {
				; CHECK-LABEL: @test(
				; CHECK-NEXT: _udiv-special-cases:
				; CHECK-NEXT: [[A:%.*]] = load i129, i129* [[PTR:%.*]], align 4
				; CHECK-NEXT: [[TMP0:%.*]] = icmp eq i129 [[A]], 0
				; CHECK-NEXT: [[TMP1:%.*]] = or i1 false, [[TMP0]]
				; CHECK-NEXT: [[TMP2:%.*]] = call i129 @llvm.ctlz.i129(i129 3, i1 true)
				; CHECK-NEXT: [[TMP3:%.*]] = call i129 @llvm.ctlz.i129(i129 [[A]], i1 true)
				; CHECK-NEXT: [[TMP4:%.*]] = sub i129 [[TMP2]], [[TMP3]]
				; CHECK-NEXT: [[TMP5:%.*]] = icmp ugt i129 [[TMP4]], 128
				; CHECK-NEXT: [[TMP6:%.*]] = or i1 [[TMP1]], [[TMP5]]
				; CHECK-NEXT: [[TMP7:%.*]] = icmp eq i129 [[TMP4]], 128
				; CHECK-NEXT: [[TMP8:%.*]] = select i1 [[TMP6]], i129 0, i129 [[A]]
				; CHECK-NEXT: [[TMP9:%.*]] = or i1 [[TMP6]], [[TMP7]]
				; CHECK-NEXT: br i1 [[TMP9]], label [[UDIV_END:%.*]], label [[UDIV_BB1:%.*]]
				; CHECK: udiv-loop-exit:
				; CHECK-NEXT: [[TMP10:%.*]] = phi i129 [ 0, [[UDIV_BB1]] ], [ [[TMP25:%.*]], [[UDIV_DO_WHILE:%.*]] ]
				; CHECK-NEXT: [[TMP11:%.*]] = phi i129 [ [[TMP33:%.*]], [[UDIV_BB1]] ], [ [[TMP22:%.*]], [[UDIV_DO_WHILE]] ]
				; CHECK-NEXT: [[TMP12:%.*]] = shl i129 [[TMP11]], 1
				; CHECK-NEXT: [[TMP13:%.*]] = or i129 [[TMP10]], [[TMP12]]
				; CHECK-NEXT: br label [[UDIV_END]]
				; CHECK: udiv-do-while:
				; CHECK-NEXT: [[TMP14:%.*]] = phi i129 [ 0, [[UDIV_PREHEADER:%.*]] ], [ [[TMP25]], [[UDIV_DO_WHILE]] ]
				; CHECK-NEXT: [[TMP15:%.*]] = phi i129 [ [[TMP31:%.*]], [[UDIV_PREHEADER]] ], [ [[TMP28:%.*]], [[UDIV_DO_WHILE]] ]
				; CHECK-NEXT: [[TMP16:%.*]] = phi i129 [ [[TMP30:%.*]], [[UDIV_PREHEADER]] ], [ [[TMP27:%.*]], [[UDIV_DO_WHILE]] ]
				; CHECK-NEXT: [[TMP17:%.*]] = phi i129 [ [[TMP33]], [[UDIV_PREHEADER]] ], [ [[TMP22]], [[UDIV_DO_WHILE]] ]
				; CHECK-NEXT: [[TMP18:%.*]] = shl i129 [[TMP16]], 1
				; CHECK-NEXT: [[TMP19:%.*]] = lshr i129 [[TMP17]], 128
				; CHECK-NEXT: [[TMP20:%.*]] = or i129 [[TMP18]], [[TMP19]]
				; CHECK-NEXT: [[TMP21:%.*]] = shl i129 [[TMP17]], 1
				; CHECK-NEXT: [[TMP22]] = or i129 [[TMP14]], [[TMP21]]
				; CHECK-NEXT: [[TMP23:%.*]] = sub i129 2, [[TMP20]]
				; CHECK-NEXT: [[TMP24:%.*]] = ashr i129 [[TMP23]], 128
				; CHECK-NEXT: [[TMP25]] = and i129 [[TMP24]], 1
				; CHECK-NEXT: [[TMP26:%.*]] = and i129 [[TMP24]], 3
				; CHECK-NEXT: [[TMP27]] = sub i129 [[TMP20]], [[TMP26]]
				; CHECK-NEXT: [[TMP28]] = add i129 [[TMP15]], -1
				; CHECK-NEXT: [[TMP29:%.*]] = icmp eq i129 [[TMP28]], 0
				; CHECK-NEXT: br i1 [[TMP29]], label [[UDIV_LOOP_EXIT:%.*]], label [[UDIV_DO_WHILE]]
				; CHECK: udiv-preheader:
				; CHECK-NEXT: [[TMP30]] = lshr i129 [[A]], [[TMP31]]
				; CHECK-NEXT: br label [[UDIV_DO_WHILE]]
				; CHECK: udiv-bb1:
				; CHECK-NEXT: [[TMP31]] = add i129 [[TMP4]], 1
				; CHECK-NEXT: [[TMP32:%.*]] = sub i129 128, [[TMP4]]
				; CHECK-NEXT: [[TMP33]] = shl i129 [[A]], [[TMP32]]
				; CHECK-NEXT: [[TMP34:%.*]] = icmp eq i129 [[TMP31]], 0
				; CHECK-NEXT: br i1 [[TMP34]], label [[UDIV_LOOP_EXIT]], label [[UDIV_PREHEADER]]
				; CHECK: udiv-end:
				; CHECK-NEXT: [[TMP35:%.*]] = phi i129 [ [[TMP13]], [[UDIV_LOOP_EXIT]] ], [ [[TMP8]], [[_UDIV_SPECIAL_CASES:%.*]] ]
				; CHECK-NEXT: store i129 [[TMP35]], i129* [[OUT:%.*]], align 4
				; CHECK-NEXT: ret void
				;
				%a = load i129, i129* %ptr
				%res = udiv i129 %a, 3
				store i129 %res, i129* %out
				ret void
				}
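
The CHECK lines above pin down the blocks that the pass generates for `udiv i129 %a, 3`. As a rough mental model only, and not the IR that IntegerDivision.cpp actually emits, the shift-subtract idea described in the summary can be sketched in C++ on 32-bit operands. The function name `udivShiftSubtract` and the 32-bit width are assumptions made for illustration; the real expansion also skips leading zero bits via `ctlz`, which this sketch does not do.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration of bitwise shift-subtract (restoring) division.
// The partial remainder is kept in 64 bits so that shifting it left by one
// can never overflow, even for divisors above 2^31.
uint32_t udivShiftSubtract(uint32_t dividend, uint32_t divisor) {
  assert(divisor != 0 && "division by zero is not handled in this sketch");
  uint64_t remainder = 0;
  uint32_t quotient = 0;
  for (int bit = 31; bit >= 0; --bit) {
    // Bring down the next dividend bit, most significant first.
    remainder = (remainder << 1) | ((dividend >> bit) & 1u);
    // If the divisor fits, subtract it and set the matching quotient bit.
    if (remainder >= divisor) {
      remainder -= divisor;
      quotient |= (1u << bit);
    }
  }
  return quotient;
}
```

For instance, `udivShiftSubtract(100u, 3u)` walks all 32 bit positions and yields 33, the same value as `100u / 3u`.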

llvm/test/Transforms/ExpandLargeDivRem/urem129.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_test_checks.py
				; RUN: opt -S -expand-large-div-rem < %s | FileCheck %s

				define void @test(i129* %ptr, i129* %out) nounwind {
				; CHECK-LABEL: @test(
				; CHECK-NEXT: _udiv-special-cases:
				; CHECK-NEXT: [[A:%.*]] = load i129, i129* [[PTR:%.*]], align 4
				; CHECK-NEXT: [[TMP0:%.*]] = icmp eq i129 [[A]], 0
				; CHECK-NEXT: [[TMP1:%.*]] = or i1 false, [[TMP0]]
				; CHECK-NEXT: [[TMP2:%.*]] = call i129 @llvm.ctlz.i129(i129 3, i1 true)
				; CHECK-NEXT: [[TMP3:%.*]] = call i129 @llvm.ctlz.i129(i129 [[A]], i1 true)
				; CHECK-NEXT: [[TMP4:%.*]] = sub i129 [[TMP2]], [[TMP3]]
				; CHECK-NEXT: [[TMP5:%.*]] = icmp ugt i129 [[TMP4]], 128
				; CHECK-NEXT: [[TMP6:%.*]] = or i1 [[TMP1]], [[TMP5]]
				arsenm: This looks out of date?
				mgehre-amd (author): Yes, stupid me. I'll update.
				; CHECK-NEXT: [[TMP7:%.*]] = icmp eq i129 [[TMP4]], 128
				craig.topper: Is this code handling the possible poison result from the ctlz's incorrectly? If [A] is zero then [TMP3] is poison, [TMP4] is poison, [TMP5] is poison, [TMP6] being an or won't block the poison. Assuming I understand poison correctly. I don't think that was caused by this patch. It's an issue in the existing code.
				craig.topper: Patch for the poison issue https://reviews.llvm.org/D130680
				; CHECK-NEXT: [[TMP8:%.*]] = select i1 [[TMP6]], i129 0, i129 [[A]]
				; CHECK-NEXT: [[TMP9:%.*]] = or i1 [[TMP6]], [[TMP7]]
				; CHECK-NEXT: br i1 [[TMP9]], label [[UDIV_END:%.*]], label [[UDIV_BB1:%.*]]
				; CHECK: udiv-loop-exit:
				; CHECK-NEXT: [[TMP10:%.*]] = phi i129 [ 0, [[UDIV_BB1]] ], [ [[TMP25:%.*]], [[UDIV_DO_WHILE:%.*]] ]
				; CHECK-NEXT: [[TMP11:%.*]] = phi i129 [ [[TMP33:%.*]], [[UDIV_BB1]] ], [ [[TMP22:%.*]], [[UDIV_DO_WHILE]] ]
				; CHECK-NEXT: [[TMP12:%.*]] = shl i129 [[TMP11]], 1
				; CHECK-NEXT: [[TMP13:%.*]] = or i129 [[TMP10]], [[TMP12]]
				; CHECK-NEXT: br label [[UDIV_END]]
				; CHECK: udiv-do-while:
				; CHECK-NEXT: [[TMP14:%.*]] = phi i129 [ 0, [[UDIV_PREHEADER:%.*]] ], [ [[TMP25]], [[UDIV_DO_WHILE]] ]
				; CHECK-NEXT: [[TMP15:%.*]] = phi i129 [ [[TMP31:%.*]], [[UDIV_PREHEADER]] ], [ [[TMP28:%.*]], [[UDIV_DO_WHILE]] ]
				; CHECK-NEXT: [[TMP16:%.*]] = phi i129 [ [[TMP30:%.*]], [[UDIV_PREHEADER]] ], [ [[TMP27:%.*]], [[UDIV_DO_WHILE]] ]
				; CHECK-NEXT: [[TMP17:%.*]] = phi i129 [ [[TMP33]], [[UDIV_PREHEADER]] ], [ [[TMP22]], [[UDIV_DO_WHILE]] ]
				; CHECK-NEXT: [[TMP18:%.*]] = shl i129 [[TMP16]], 1
				; CHECK-NEXT: [[TMP19:%.*]] = lshr i129 [[TMP17]], 128
				; CHECK-NEXT: [[TMP20:%.*]] = or i129 [[TMP18]], [[TMP19]]
				; CHECK-NEXT: [[TMP21:%.*]] = shl i129 [[TMP17]], 1
				; CHECK-NEXT: [[TMP22]] = or i129 [[TMP14]], [[TMP21]]
				; CHECK-NEXT: [[TMP23:%.*]] = sub i129 2, [[TMP20]]
				; CHECK-NEXT: [[TMP24:%.*]] = ashr i129 [[TMP23]], 128
				; CHECK-NEXT: [[TMP25]] = and i129 [[TMP24]], 1
				; CHECK-NEXT: [[TMP26:%.*]] = and i129 [[TMP24]], 3
				; CHECK-NEXT: [[TMP27]] = sub i129 [[TMP20]], [[TMP26]]
				; CHECK-NEXT: [[TMP28]] = add i129 [[TMP15]], -1
				; CHECK-NEXT: [[TMP29:%.*]] = icmp eq i129 [[TMP28]], 0
				; CHECK-NEXT: br i1 [[TMP29]], label [[UDIV_LOOP_EXIT:%.*]], label [[UDIV_DO_WHILE]]
				; CHECK: udiv-preheader:
				; CHECK-NEXT: [[TMP30]] = lshr i129 [[A]], [[TMP31]]
				; CHECK-NEXT: br label [[UDIV_DO_WHILE]]
				; CHECK: udiv-bb1:
				; CHECK-NEXT: [[TMP31]] = add i129 [[TMP4]], 1
				; CHECK-NEXT: [[TMP32:%.*]] = sub i129 128, [[TMP4]]
				; CHECK-NEXT: [[TMP33]] = shl i129 [[A]], [[TMP32]]
				; CHECK-NEXT: [[TMP34:%.*]] = icmp eq i129 [[TMP31]], 0
				; CHECK-NEXT: br i1 [[TMP34]], label [[UDIV_LOOP_EXIT]], label [[UDIV_PREHEADER]]
				; CHECK: udiv-end:
				; CHECK-NEXT: [[TMP35:%.*]] = phi i129 [ [[TMP13]], [[UDIV_LOOP_EXIT]] ], [ [[TMP8]], [[_UDIV_SPECIAL_CASES:%.*]] ]
				; CHECK-NEXT: [[TMP36:%.*]] = mul i129 3, [[TMP35]]
				; CHECK-NEXT: [[TMP37:%.*]] = sub i129 [[A]], [[TMP36]]
				; CHECK-NEXT: store i129 [[TMP37]], i129* [[OUT:%.*]], align 4
				; CHECK-NEXT: ret void
				;
				%a = load i129, i129* %ptr
				%res = urem i129 %a, 3
				store i129 %res, i129* %out
				ret void
				}
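
The final CHECK lines before the store show how the remainder is obtained in this test: the expansion reuses the udiv blocks and then computes `%a - 3 * quotient` with a `mul` and a `sub`. A minimal C++ sketch of that identity follows. The name `uremFromQuotient` is hypothetical, and native 32-bit division stands in for the expanded loop.

```cpp
#include <cstdint>

// Hypothetical illustration: derive the unsigned remainder from the quotient,
// mirroring the trailing mul/sub pair in the urem129.ll CHECK lines.
// Precondition: divisor != 0.
uint32_t uremFromQuotient(uint32_t dividend, uint32_t divisor) {
  uint32_t quotient = dividend / divisor;  // stand-in for the udiv expansion
  uint32_t product = divisor * quotient;   // corresponds to the `mul`
  return dividend - product;               // corresponds to the `sub`
}
```

Because `divisor * quotient` never exceeds `dividend`, neither the multiplication nor the subtraction wraps, so the result equals `dividend % divisor`.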

llvm/tools/opt/opt.cpp

std::vector<StringRef> PassNameExact = {
"expand-reductions", "indirectbr-expand",		"expand-reductions", "indirectbr-expand",
"generic-to-nvvm", "expandmemcmp",		"generic-to-nvvm", "expandmemcmp",
"loop-reduce", "lower-amx-type",		"loop-reduce", "lower-amx-type",
"pre-amx-config", "lower-amx-intrinsics",		"pre-amx-config", "lower-amx-intrinsics",
"polyhedral-info", "print-polyhedral-info",		"polyhedral-info", "print-polyhedral-info",
"replace-with-veclib", "jmc-instrument",		"replace-with-veclib", "jmc-instrument",
"dot-regions", "dot-regions-only",		"dot-regions", "dot-regions-only",
"view-regions", "view-regions-only",		"view-regions", "view-regions-only",
"select-optimize"};		"select-optimize", "expand-large-div-rem"};
for (const auto &P : PassNamePrefix)		for (const auto &P : PassNamePrefix)
if (Pass.startswith(P))		if (Pass.startswith(P))
return true;		return true;
for (const auto &P : PassNameContain)		for (const auto &P : PassNameContain)
if (Pass.contains(P))		if (Pass.contains(P))
return true;		return true;
return llvm::is_contained(PassNameExact, Pass);		return llvm::is_contained(PassNameExact, Pass);
}		}
int main(int argc, char **argv) {
initializeAnalysis(Registry);		initializeAnalysis(Registry);
initializeTransformUtils(Registry);		initializeTransformUtils(Registry);
initializeInstCombine(Registry);		initializeInstCombine(Registry);
initializeAggressiveInstCombine(Registry);		initializeAggressiveInstCombine(Registry);
initializeInstrumentation(Registry);		initializeInstrumentation(Registry);
initializeTarget(Registry);		initializeTarget(Registry);
// For codegen passes, only passes that do IR to IR transformation are		// For codegen passes, only passes that do IR to IR transformation are
// supported.		// supported.
		initializeExpandLargeDivRemLegacyPassPass(Registry);
initializeExpandMemCmpPassPass(Registry);		initializeExpandMemCmpPassPass(Registry);
initializeScalarizeMaskedMemIntrinLegacyPassPass(Registry);		initializeScalarizeMaskedMemIntrinLegacyPassPass(Registry);
initializeSelectOptimizePass(Registry);		initializeSelectOptimizePass(Registry);
initializeCodeGenPreparePass(Registry);		initializeCodeGenPreparePass(Registry);
initializeAtomicExpandPass(Registry);		initializeAtomicExpandPass(Registry);
initializeRewriteSymbolsLegacyPassPass(Registry);		initializeRewriteSymbolsLegacyPassPass(Registry);
initializeWinEHPreparePass(Registry);		initializeWinEHPreparePass(Registry);
initializeDwarfEHPrepareLegacyPassPass(Registry);		initializeDwarfEHPrepareLegacyPassPass(Registry);

Diff 455866