This is an archive of the discontinued LLVM Phabricator instance.

Differential D100463

[AArch64][SVEIntrinsicOpts] Fold sve_convert_from_svbool(zero) to zero
Closed, Public

Authored by joechrisellis on Apr 14 2021, 4:13 AM.

Details

Reviewers

paulwalker-arm
kmclaughlin
bsmith
DavidTruby
dmgreen

Commits

rGeffacc15999d: [AArch64] Constant fold sve_convert_from_svbool(zero) to zero

Summary

Co-authored-by: Paul Walker <paul.walker@arm.com>

Diff Detail

Repository: rG LLVM Github Monorepo

Unit Tests: Failed

	Time	Test
	2,300 ms	x64 debian > libarcher.races::lock-unrelated.c

Event Timeline

joechrisellis created this revision. Apr 14 2021, 4:13 AM

Herald added subscribers: hiraditya, kristof.beyls, tschuett. Apr 14 2021, 4:13 AM

joechrisellis requested review of this revision. Apr 14 2021, 4:13 AM

Herald added a project: Restricted Project. Apr 14 2021, 4:13 AM

Herald added a subscriber: llvm-commits.

Harbormaster completed remote builds in B98661: Diff 337400. Apr 14 2021, 4:56 AM

Why is this kind of thing not done in instcombine? Or even constant folding if that is what this is really doing.

Hi @dmgreen! This is SVE-specific, and SVEIntrinsicOpts.cpp is where such transformations are typically placed (at least for now). I did a quick grep and it seems SVE intrinsics don't currently have much of a presence in generic passes like instcombine/constant folding, perhaps because some SVE optimisations are more complex than others and I guess it makes sense to keep them all in the same place.

X86 also has llvm/lib/Target/X86/X86InstCombineIntrinsic.cpp which houses instcombine-like optimisations for X86. 🙂

In D100463#2690659, @joechrisellis wrote:

Hi @dmgreen! This is SVE-specific, and SVEIntrinsicOpts.cpp is where such transformations are typically placed (at least for now). I did a quick grep and it seems SVE intrinsics don't currently have much of a presence in generic passes like instcombine/constant folding, perhaps because some SVE optimisations are more complex than others and I guess it makes sense to keep them all in the same place.

X86 also has llvm/lib/Target/X86/X86InstCombineIntrinsic.cpp which houses instcombine-like optimisations for X86. 🙂

Hmm. "Because we've been doing it wrong" isn't a great reason to _keep_ doing it wrong :-)

This is very similar to https://reviews.llvm.org/rGbecaa6803ab532d15506829f0551a5fa49c39d7e, but special cased for zeroinit vectors. I do see this code at the start of that function, but it seems like it could still be made to work, and presumably in the long run some of the other constant folding there might be useful (like masked loads with zero masks).

// Do not iterate on scalable vector. The number of elements is unknown at
// compile-time.
if (isa<ScalableVectorType>(VTy))
  return nullptr;

More generally, any backend can use instCombineIntrinsic to add intrinsic folds to instcombine. I can appreciate that not everything that SVECombines is doing will fit well there, but the ones that do should be put there if possible. The steady state combining it does will be worthwhile, as well as it being run multiple times through the pipeline. Plus, you know, not re-inventing the wheel.
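The point about the early return can be illustrated with a small stand-alone model. This is only a sketch under stated assumptions: `ToyVecType`, `ToyConstant`, `ToyIntrinsic`, `foldScalable` and `foldVectorCall` are hypothetical names invented for illustration and are not LLVM classes or functions. It shows that a zero-to-zero fold never needs to visit individual lanes, so it can still apply when the lane count is unknown at compile time.

```cpp
#include <cassert>

// Hypothetical toy types for illustration only; these are not LLVM's classes.
struct ToyVecType {
  unsigned MinElts; // known minimum lane count
  bool Scalable;    // true: the real lane count is MinElts * vscale
};

struct ToyConstant {
  ToyVecType Ty;
  bool IsZero; // models a zeroinitializer constant
};

enum class ToyIntrinsic { ConvertFromSVBool, Other };

// Folds that hold for any lane count, so no per-lane iteration is required.
bool foldScalable(ToyIntrinsic ID, const ToyConstant &Src, ToyVecType ResultTy,
                  ToyConstant &Out) {
  assert(ResultTy.Scalable && "expected a scalable result type");
  if (ID != ToyIntrinsic::ConvertFromSVBool || !Src.IsZero)
    return false;
  // An all-false predicate stays all-false whatever the result lane count is.
  Out = ToyConstant{ResultTy, /*IsZero=*/true};
  return true;
}

// Dispatch on the vector kind instead of bailing out for scalable types.
bool foldVectorCall(ToyIntrinsic ID, const ToyConstant &Src,
                    ToyVecType ResultTy, ToyConstant &Out) {
  if (ResultTy.Scalable)
    return foldScalable(ID, Src, ResultTy, Out);
  return false; // per-lane folding of fixed vectors is omitted from this sketch
}
```

In this model, calling `foldVectorCall` with a zero source and a scalable result type yields a zero constant of the result type, while a non-zero source or any other intrinsic is left unfolded.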

dmgreen mentioned this in D100476: [AArch64][SVEIntrinsicOpts] Replace last{a,b} intrinsic calls with extracts.... Apr 15 2021, 2:07 AM

@dmgreen -- I hear you, makes sense to me. There might be some functionality already in SVEIntrinsicOpts.cpp that we can pull out, but I guess that can wait for later. In any case I'll move the functionality here to the other passes to reflect your suggestions.

Appreciate your comments! 😄

junparser added a subscriber: junparser. Apr 16 2021, 1:24 AM

Address review comments.

@dmgreen: move transformation to constant folding.

Harbormaster completed remote builds in B99124: Diff 338044. Apr 16 2021, 4:09 AM

Nice one, thanks. This looks good to me.

llvm/lib/Analysis/ConstantFolding.cpp
3003	Doesn't need a break after the return (unless it's quieting a warning, but I think most compilers treat this sensibly).
llvm/test/Transforms/InstSimplify/ConstProp/AArch64/aarch64-sve-convert-from-svbool.ll
2	If this folder is new it may need (or at least be best) to have a lit.local.cfg to only run it when AArch64 is a registered target.
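The inline note on ConstantFolding.cpp line 3003 concerns a `break` placed directly after a `return` inside a switch case. A minimal sketch of the control flow, using a hypothetical `Kind` enum and `foldKind` function that are not part of LLVM:

```cpp
#include <cassert>

// Hypothetical names for illustration only.
enum class Kind { ConvertFromSVBool, Other };

// Returns 0 when the call folds to an all-false value, -1 for "no fold".
int foldKind(Kind K, bool SrcIsZero) {
  assert((K == Kind::ConvertFromSVBool || K == Kind::Other) &&
         "unexpected kind");
  switch (K) {
  case Kind::ConvertFromSVBool: {
    if (!SrcIsZero)
      break; // leave the switch and report "no fold" below
    return 0;
    // A `break;` here would be unreachable: every path through this case
    // has already left it, either via the break above or via the return.
  }
  default:
    break;
  }
  return -1;
}
```

Since both paths out of the case are already explicit, dropping the trailing `break` does not change behavior.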

This revision is now accepted and ready to land. Apr 16 2021, 5:57 AM

Address review comments.

@dmgreen:
- Add lit.local.cfg file to AArch64 directory.
- Fix minor editing error.

Harbormaster completed remote builds in B99179: Diff 338115. Apr 16 2021, 9:04 AM

paulwalker-arm accepted this revision. Apr 19 2021, 2:11 AM

This revision was landed with ongoing or failed builds. Apr 20 2021, 3:03 AM

Closed by commit rGeffacc15999d: [AArch64] Constant fold sve_convert_from_svbool(zero) to zero (authored by joechrisellis).

This revision was automatically updated to reflect the committed changes.

joechrisellis added a commit: rGeffacc15999d: [AArch64] Constant fold sve_convert_from_svbool(zero) to zero.

Revision Contents

Path	Size
llvm/lib/Analysis/ConstantFolding.cpp	50 lines
llvm/test/Transforms/InstSimplify/ConstProp/AArch64/aarch64-sve-convert-from-svbool.ll	10 lines
llvm/test/Transforms/InstSimplify/ConstProp/AArch64/lit.local.cfg	2 lines

Diff 338115

llvm/lib/Analysis/ConstantFolding.cpp

[... 35 lines not shown ...]
 #include "llvm/IR/Function.h"
 #include "llvm/IR/GlobalValue.h"
 #include "llvm/IR/GlobalVariable.h"
 #include "llvm/IR/InstrTypes.h"
 #include "llvm/IR/Instruction.h"
 #include "llvm/IR/Instructions.h"
 #include "llvm/IR/IntrinsicInst.h"
 #include "llvm/IR/Intrinsics.h"
+#include "llvm/IR/IntrinsicsAArch64.h"
 #include "llvm/IR/IntrinsicsAMDGPU.h"
 #include "llvm/IR/IntrinsicsARM.h"
 #include "llvm/IR/IntrinsicsWebAssembly.h"
 #include "llvm/IR/IntrinsicsX86.h"
 #include "llvm/IR/Operator.h"
 #include "llvm/IR/Type.h"
 #include "llvm/IR/Value.h"
 #include "llvm/Support/Casting.h"
[... 1,433 lines not shown; context: bool llvm::canConstantFoldCallTo(const CallBase *Call, const Function *F) { ...]
   case Intrinsic::vector_reduce_smax:
   case Intrinsic::vector_reduce_umin:
   case Intrinsic::vector_reduce_umax:
   // Target intrinsics
   case Intrinsic::arm_mve_vctp8:
   case Intrinsic::arm_mve_vctp16:
   case Intrinsic::arm_mve_vctp32:
   case Intrinsic::arm_mve_vctp64:
+  case Intrinsic::aarch64_sve_convert_from_svbool:
   // WebAssembly float semantics are always known
   case Intrinsic::wasm_trunc_signed:
   case Intrinsic::wasm_trunc_unsigned:
   case Intrinsic::wasm_trunc_saturate_signed:
   case Intrinsic::wasm_trunc_saturate_unsigned:
     return true;

   // Floating point operations cannot be folded in strictfp functions in
[... 1,361 lines not shown; context: if (Operands.size() == 2) ...]
     return ConstantFoldScalarCall2(Name, IntrinsicID, Ty, Operands, TLI, Call);

   if (Operands.size() == 3)
     return ConstantFoldScalarCall3(Name, IntrinsicID, Ty, Operands, TLI, Call);

   return nullptr;
 }

-static Constant *ConstantFoldVectorCall(StringRef Name,
-                                        Intrinsic::ID IntrinsicID,
-                                        VectorType *VTy,
-                                        ArrayRef<Constant *> Operands,
-                                        const DataLayout &DL,
-                                        const TargetLibraryInfo *TLI,
-                                        const CallBase *Call) {
-  // Do not iterate on scalable vector. The number of elements is unknown at
-  // compile-time.
-  if (isa<ScalableVectorType>(VTy))
-    return nullptr;
-
-  auto *FVTy = cast<FixedVectorType>(VTy);
-
+static Constant *ConstantFoldFixedVectorCall(
[Lint, pre-merge checks: clang-tidy: warning: invalid case style for function 'ConstantFoldFixedVectorCall' [readability-identifier-naming]]
+    StringRef Name, Intrinsic::ID IntrinsicID, FixedVectorType *FVTy,
+    ArrayRef<Constant *> Operands, const DataLayout &DL,
+    const TargetLibraryInfo *TLI, const CallBase *Call) {
   SmallVector<Constant *, 4> Result(FVTy->getNumElements());
   SmallVector<Constant *, 4> Lane(Operands.size());
   Type *Ty = FVTy->getElementType();

   switch (IntrinsicID) {
   case Intrinsic::masked_load: {
     auto *SrcPtr = Operands[0];
     auto *Mask = Operands[2];
[... 100 lines not shown; context: for (unsigned I = 0, E = FVTy->getNumElements(); I != E; ++I) { ...]
     if (!Folded)
       return nullptr;
     Result[I] = Folded;
   }

   return ConstantVector::get(Result);
 }

+static Constant *ConstantFoldScalableVectorCall(
[Lint, pre-merge checks: clang-tidy: warning: invalid case style for function 'ConstantFoldScalableVectorCall' [readability-identifier-naming]]
+    StringRef Name, Intrinsic::ID IntrinsicID, ScalableVectorType *SVTy,
+    ArrayRef<Constant *> Operands, const DataLayout &DL,
+    const TargetLibraryInfo *TLI, const CallBase *Call) {
+  switch (IntrinsicID) {
+  case Intrinsic::aarch64_sve_convert_from_svbool: {
+    auto *Src = dyn_cast<Constant>(Operands[0]);
+    if (!Src || !Src->isNullValue())
+      break;
+
+    return ConstantInt::getFalse(SVTy);
+  }
+  default:
+    break;
+  }
+  return nullptr;
+}
+
 } // end anonymous namespace

 Constant *llvm::ConstantFoldCall(const CallBase *Call, Function *F,
                                  ArrayRef<Constant *> Operands,
                                  const TargetLibraryInfo *TLI) {
   if (Call->isNoBuiltin())
     return nullptr;
   if (!F->hasName())
     return nullptr;
   StringRef Name = F->getName();

   Type *Ty = F->getReturnType();

-  if (auto *VTy = dyn_cast<VectorType>(Ty))
-    return ConstantFoldVectorCall(Name, F->getIntrinsicID(), VTy, Operands,
-                                  F->getParent()->getDataLayout(), TLI, Call);
+  if (auto *FVTy = dyn_cast<FixedVectorType>(Ty))
+    return ConstantFoldFixedVectorCall(
+        Name, F->getIntrinsicID(), FVTy, Operands,
+        F->getParent()->getDataLayout(), TLI, Call);
+
+  if (auto *SVTy = dyn_cast<ScalableVectorType>(Ty))
+    return ConstantFoldScalableVectorCall(
+        Name, F->getIntrinsicID(), SVTy, Operands,
+        F->getParent()->getDataLayout(), TLI, Call);

   return ConstantFoldScalarCall(Name, F->getIntrinsicID(), Ty, Operands, TLI,
                                 Call);
 }

 bool llvm::isMathLibCallNoop(const CallBase *Call,
                              const TargetLibraryInfo *TLI) {
   // FIXME: Refactor this code; this duplicates logic in LibCallsShrinkWrap
[... 145 lines not shown ...]

llvm/test/Transforms/InstSimplify/ConstProp/AArch64/aarch64-sve-convert-from-svbool.ll

This file was added.

; RUN: opt -instsimplify -S -o - < %s | FileCheck %s

define <vscale x 2 x i1> @reinterpret_zero() {
; CHECK-LABEL: @reinterpret_zero(
; CHECK: ret <vscale x 2 x i1> zeroinitializer
  %pg = tail call <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1> zeroinitializer)
  ret <vscale x 2 x i1> %pg
}

declare <vscale x 2 x i1> @llvm.aarch64.sve.convert.from.svbool.nxv2i1(<vscale x 16 x i1>)

llvm/test/Transforms/InstSimplify/ConstProp/AArch64/lit.local.cfg

This file was added.

if not 'AArch64' in config.root.targets:
    config.unsupported = True