This is an archive of the discontinued LLVM Phabricator instance.

[GlobalISel] Move the truncstore_merge combine to the LoadStoreOpt pass and add support for an extra case.
ClosedPublic

Authored by aemerson on Apr 12 2023, 9:56 AM.

Details

Reviewers

• jpaquette
arsenm
paquette
aemerson

Summary

If we have a set of mergeable stores of shifts, but the original source value being shifted
is wider than the merged size, we should still be able to merge if we truncate first. To do this,
however, we need to search for stores speculatively up the block, without knowing exactly how
many stores we should see before we stop. The old algorithm had to match the exact number of
stores needed to fill the wide type, or it gave up. The new one sets the wide type to however
many stores were found in the upward block traversal and relies on later checks to verify that
they form a valid mergeable set.
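
As a rough illustration of the extra case (not code from this patch), the sketch below writes the low four bytes of a 64-bit value with four shifted byte stores, then writes the same bytes with a truncate followed by a single 32-bit store. The names `storeNarrow`, `storeMerged`, and `mergedMatchesNarrow` are hypothetical, and the comparison assumes a little-endian host.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>

// Four separate byte stores. Each shift is applied to the full 64-bit
// source value, which is wider than the 32 bits actually written.
void storeNarrow(std::uint64_t val, unsigned char *p) {
  assert(p != nullptr);
  p[0] = static_cast<unsigned char>(val >> 0);
  p[1] = static_cast<unsigned char>(val >> 8);
  p[2] = static_cast<unsigned char>(val >> 16);
  p[3] = static_cast<unsigned char>(val >> 24);
}

// The merged form: truncate the 64-bit source to 32 bits, then store once.
// Assumption: little-endian host, so byte 0 of `narrow` lands at p[0].
void storeMerged(std::uint64_t val, unsigned char *p) {
  assert(p != nullptr);
  std::uint32_t narrow = static_cast<std::uint32_t>(val); // the truncate
  std::memcpy(p, &narrow, sizeof(narrow));                // one 32-bit store
}

// Returns true if both forms leave identical bytes in memory.
bool mergedMatchesNarrow(std::uint64_t val) {
  unsigned char a[4], b[4];
  storeNarrow(val, a);
  storeMerged(val, b);
  return std::memcmp(a, b, sizeof(a)) == 0;
}
```

Only four of the eight byte stores that would cover the 64-bit source exist here, so an exact-count match against the source width cannot succeed, while truncating the source to 32 bits first makes the merge valid.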

I need to move this to LoadStoreOpt because the combiner works top-down inside a block,
which means that we would end up doing partial merges because we haven't seen all
the possible stores before we mutate the MIR. In LoadStoreOpt we can go bottom-up.
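
The ordering problem can be pictured with a toy model, which is only a sketch and not how either pass is implemented. A block is a list of store widths in bytes, and the hypothetical `mergeAt` helper merges a store with the same-width stores directly above it, however many it finds. Legality checks, offsets, and the instruction search limit are all ignored, and `Block`, `topDown`, and `bottomUp` are made-up names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Each entry is the width in bytes of one store, in program order.
using Block = std::vector<unsigned>;

// Merge the store at index `last` with the run of same-width stores
// immediately above it. Returns the index of the resulting store.
std::size_t mergeAt(Block &b, std::size_t last) {
  assert(last < b.size());
  std::size_t first = last;
  while (first > 0 && b[first - 1] == b[last])
    --first;
  const std::size_t count = last - first + 1;
  if (count < 2)
    return last; // nothing above to merge with
  const unsigned merged = b[last] * static_cast<unsigned>(count);
  const auto from = b.begin() + static_cast<std::ptrdiff_t>(first);
  const auto to = b.begin() + static_cast<std::ptrdiff_t>(last) + 1;
  b.insert(b.erase(from, to), merged);
  return first;
}

// Visit stores from the top of the block, mutating as soon as a merge
// is possible (models a top-down combiner).
Block topDown(Block b) {
  for (std::size_t i = 0; i < b.size(); ++i)
    i = mergeAt(b, i);
  return b;
}

// Visit stores from the bottom of the block, so each upward search sees
// every earlier store before anything is mutated.
Block bottomUp(Block b) {
  std::size_t i = b.size();
  while (i > 0)
    i = mergeAt(b, i - 1);
  return b;
}
```

For four adjacent one-byte stores, `topDown` leaves two 2-byte stores because the first pair is merged before the later stores are visited, while `bottomUp` produces a single 4-byte store.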

As a side effect of this change, we also end up doing better on an existing test case (missing_store)
since we manage to do a partial merge there.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

aemerson created this revision. Apr 12 2023, 9:56 AM

Herald added a reviewer: paquette. Apr 12 2023, 9:56 AM

Herald added a project: Restricted Project.

Herald added a subscriber: hiraditya.

aemerson requested review of this revision. Apr 12 2023, 9:56 AM

Herald added a subscriber: wdng. Apr 12 2023, 9:56 AM

@arsenm Does AMDGPU need this combine? If so you'll need to add LoadStoreOpt to your pipeline somewhere.

Harbormaster completed remote builds in B225108: Diff 512882. Apr 12 2023, 11:05 AM

Oops, I accidentally committed this in 29c851f4e2ff because I forgot to reset my branch before doing another commit.

If you think this is the wrong approach I’m happy to revert, otherwise I’ll leave it.

This revision is now accepted and ready to land. Apr 12 2023, 5:17 PM

aemerson closed this revision. Apr 12 2023, 5:17 PM

Revision Contents

Path

Size

llvm/

include/

llvm/

CodeGen/

GlobalISel/

CombinerHelper.h

11 lines

LoadStoreOpt.h

5 lines

Target/

GlobalISel/

Combine.td

10 lines

lib/

CodeGen/

GlobalISel/

CombinerHelper.cpp

269 lines

LoadStoreOpt.cpp

297 lines

test/

CodeGen/

AArch64/

GlobalISel/

merge-stores-truncating.ll

47 lines

merge-stores-truncating.mir

22 lines

store-merging-debug.mir

34 lines

store-merging.mir

365 lines

Diff 512882

llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h

(73 lines hidden)	struct ShiftOfShiftedLogic {
MachineInstr *Logic;		MachineInstr *Logic;
MachineInstr *Shift2;		MachineInstr *Shift2;
Register LogicNonShiftReg;		Register LogicNonShiftReg;
uint64_t ValSum;		uint64_t ValSum;
};		};

using BuildFnTy = std::function<void(MachineIRBuilder &)>;		using BuildFnTy = std::function<void(MachineIRBuilder &)>;

struct MergeTruncStoresInfo {
SmallVector<GStore *> FoundStores;
GStore *LowestIdxStore = nullptr;
Register WideSrcVal;
bool NeedBSwap = false;
bool NeedRotate = false;
};

using OperandBuildSteps =		using OperandBuildSteps =
SmallVector<std::function<void(MachineInstrBuilder &)>, 4>;		SmallVector<std::function<void(MachineInstrBuilder &)>, 4>;
struct InstructionBuildSteps {		struct InstructionBuildSteps {
unsigned Opcode = 0; /// The opcode for the produced instruction.		unsigned Opcode = 0; /// The opcode for the produced instruction.
OperandBuildSteps OperandFns; /// Operands to be added to the instruction.		OperandBuildSteps OperandFns; /// Operands to be added to the instruction.
InstructionBuildSteps() = default;		InstructionBuildSteps() = default;
InstructionBuildSteps(unsigned Opcode, const OperandBuildSteps &OperandFns)		InstructionBuildSteps(unsigned Opcode, const OperandBuildSteps &OperandFns)
: Opcode(Opcode), OperandFns(OperandFns) {}		: Opcode(Opcode), OperandFns(OperandFns) {}
(474 lines hidden)	public:
/// sN *a = ...		/// sN *a = ...
/// sM val = a[0] | (a[1] << N) | (a[2] << 2N) | (a[3] << 3N) ...		/// sM val = a[0] | (a[1] << N) | (a[2] << 2N) | (a[3] << 3N) ...
/// \endcode		/// \endcode
///		///
/// And check if the tree can be replaced with a M-bit load + possibly a		/// And check if the tree can be replaced with a M-bit load + possibly a
/// bswap.		/// bswap.
bool matchLoadOrCombine(MachineInstr &MI, BuildFnTy &MatchInfo);		bool matchLoadOrCombine(MachineInstr &MI, BuildFnTy &MatchInfo);

bool matchTruncStoreMerge(MachineInstr &MI, MergeTruncStoresInfo &MatchInfo);
void applyTruncStoreMerge(MachineInstr &MI, MergeTruncStoresInfo &MatchInfo);

bool matchExtendThroughPhis(MachineInstr &MI, MachineInstr *&ExtMI);		bool matchExtendThroughPhis(MachineInstr &MI, MachineInstr *&ExtMI);
void applyExtendThroughPhis(MachineInstr &MI, MachineInstr *&ExtMI);		void applyExtendThroughPhis(MachineInstr &MI, MachineInstr *&ExtMI);

bool matchExtractVecEltBuildVec(MachineInstr &MI, Register &Reg);		bool matchExtractVecEltBuildVec(MachineInstr &MI, Register &Reg);
void applyExtractVecEltBuildVec(MachineInstr &MI, Register &Reg);		void applyExtractVecEltBuildVec(MachineInstr &MI, Register &Reg);

bool matchExtractAllEltsFromBuildVector(		bool matchExtractAllEltsFromBuildVector(
MachineInstr &MI,		MachineInstr &MI,
(296 lines hidden)

llvm/include/llvm/CodeGen/GlobalISel/LoadStoreOpt.h

(9 lines hidden)
/// Specifically, it focuses on merging stores and loads to consecutive		/// Specifically, it focuses on merging stores and loads to consecutive
/// addresses.		/// addresses.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#ifndef LLVM_CODEGEN_GLOBALISEL_LOADSTOREOPT_H		#ifndef LLVM_CODEGEN_GLOBALISEL_LOADSTOREOPT_H
#define LLVM_CODEGEN_GLOBALISEL_LOADSTOREOPT_H		#define LLVM_CODEGEN_GLOBALISEL_LOADSTOREOPT_H

#include "llvm/ADT/BitVector.h"		#include "llvm/ADT/BitVector.h"
		#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/SmallVector.h"		#include "llvm/ADT/SmallVector.h"
#include "llvm/Analysis/AliasAnalysis.h"		#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/CodeGen/GlobalISel/MachineIRBuilder.h"		#include "llvm/CodeGen/GlobalISel/MachineIRBuilder.h"
#include "llvm/CodeGen/MachineFunction.h"		#include "llvm/CodeGen/MachineFunction.h"
#include "llvm/CodeGen/MachineFunctionPass.h"		#include "llvm/CodeGen/MachineFunctionPass.h"

namespace llvm {		namespace llvm {
// Forward declarations.		// Forward declarations.
(100 lines hidden)	private:
/// Erases the old stores from the block when finished.		/// Erases the old stores from the block when finished.
/// \returns true if merging was done. It may fail to perform a merge if		/// \returns true if merging was done. It may fail to perform a merge if
/// there are issues with materializing legal wide values.		/// there are issues with materializing legal wide values.
bool doSingleStoreMerge(SmallVectorImpl<GStore *> &Stores);		bool doSingleStoreMerge(SmallVectorImpl<GStore *> &Stores);
bool processMergeCandidate(StoreMergeCandidate &C);		bool processMergeCandidate(StoreMergeCandidate &C);
bool mergeBlockStores(MachineBasicBlock &MBB);		bool mergeBlockStores(MachineBasicBlock &MBB);
bool mergeFunctionStores(MachineFunction &MF);		bool mergeFunctionStores(MachineFunction &MF);

		bool mergeTruncStore(GStore &StoreMI,
		SmallPtrSetImpl<GStore *> &DeletedStores);
		bool mergeTruncStoresBlock(MachineBasicBlock &MBB);

/// Initialize some target-specific data structures for the store merging		/// Initialize some target-specific data structures for the store merging
/// optimization. \p AddrSpace indicates which address space to use when		/// optimization. \p AddrSpace indicates which address space to use when
/// probing the legalizer info for legal stores.		/// probing the legalizer info for legal stores.
void initializeStoreMergeTargetInfo(unsigned AddrSpace = 0);		void initializeStoreMergeTargetInfo(unsigned AddrSpace = 0);
/// A map between address space numbers and a bitvector of supported stores		/// A map between address space numbers and a bitvector of supported stores
/// sizes. Each bit in the bitvector represents whether a store size of		/// sizes. Each bit in the bitvector represents whether a store size of
/// that bit's value is legal. E.g. if bit 64 is set, then 64 bit scalar		/// that bit's value is legal. E.g. if bit 64 is set, then 64 bit scalar
/// stores are legal.		/// stores are legal.
(24 lines hidden)

llvm/include/llvm/Target/GlobalISel/Combine.td

(709 lines hidden)	def combine_insert_vec_elts_build_vector : GICombineRule<
(apply [{ Helper.applyCombineInsertVecElts(*${root}, ${info}); }])>;		(apply [{ Helper.applyCombineInsertVecElts(*${root}, ${info}); }])>;

def load_or_combine : GICombineRule<		def load_or_combine : GICombineRule<
(defs root:$root, build_fn_matchinfo:$info),		(defs root:$root, build_fn_matchinfo:$info),
(match (wip_match_opcode G_OR):$root,		(match (wip_match_opcode G_OR):$root,
[{ return Helper.matchLoadOrCombine(*${root}, ${info}); }]),		[{ return Helper.matchLoadOrCombine(*${root}, ${info}); }]),
(apply [{ Helper.applyBuildFn(*${root}, ${info}); }])>;		(apply [{ Helper.applyBuildFn(*${root}, ${info}); }])>;


def truncstore_merge_matcdata : GIDefMatchData<"MergeTruncStoresInfo">;
def truncstore_merge : GICombineRule<
(defs root:$root, truncstore_merge_matcdata:$info),
(match (wip_match_opcode G_STORE):$root,
[{ return Helper.matchTruncStoreMerge(*${root}, ${info}); }]),
(apply [{ Helper.applyTruncStoreMerge(*${root}, ${info}); }])>;

def extend_through_phis_matchdata: GIDefMatchData<"MachineInstr*">;		def extend_through_phis_matchdata: GIDefMatchData<"MachineInstr*">;
def extend_through_phis : GICombineRule<		def extend_through_phis : GICombineRule<
(defs root:$root, extend_through_phis_matchdata:$matchinfo),		(defs root:$root, extend_through_phis_matchdata:$matchinfo),
(match (wip_match_opcode G_PHI):$root,		(match (wip_match_opcode G_PHI):$root,
[{ return Helper.matchExtendThroughPhis(*${root}, ${matchinfo}); }]),		[{ return Helper.matchExtendThroughPhis(*${root}, ${matchinfo}); }]),
(apply [{ Helper.applyExtendThroughPhis(*${root}, ${matchinfo}); }])>;		(apply [{ Helper.applyExtendThroughPhis(*${root}, ${matchinfo}); }])>;

// Currently only the one combine above.		// Currently only the one combine above.
(366 lines hidden)	def all_combines : GICombineGroup<[trivial_combines, insert_vec_elt_combines,
shl_ashr_to_sext_inreg, sext_inreg_of_load,		shl_ashr_to_sext_inreg, sext_inreg_of_load,
width_reduction_combines, select_combines,		width_reduction_combines, select_combines,
known_bits_simplifications, ext_ext_fold,		known_bits_simplifications, ext_ext_fold,
not_cmp_fold, opt_brcond_by_inverting_cond,		not_cmp_fold, opt_brcond_by_inverting_cond,
unmerge_merge, unmerge_cst, unmerge_dead_to_trunc,		unmerge_merge, unmerge_cst, unmerge_dead_to_trunc,
unmerge_zext_to_zext, merge_unmerge, trunc_ext_fold, trunc_shift,		unmerge_zext_to_zext, merge_unmerge, trunc_ext_fold, trunc_shift,
const_combines, xor_of_and_with_same_reg, ptr_add_with_zero,		const_combines, xor_of_and_with_same_reg, ptr_add_with_zero,
shift_immed_chain, shift_of_shifted_logic_chain, load_or_combine,		shift_immed_chain, shift_of_shifted_logic_chain, load_or_combine,
truncstore_merge, div_rem_to_divrem, funnel_shift_combines,		div_rem_to_divrem, funnel_shift_combines,
form_bitfield_extract, constant_fold, fabs_fneg_fold,		form_bitfield_extract, constant_fold, fabs_fneg_fold,
intdiv_combines, mulh_combines, redundant_neg_operands,		intdiv_combines, mulh_combines, redundant_neg_operands,
and_or_disjoint_mask, fma_combines, fold_binop_into_select,		and_or_disjoint_mask, fma_combines, fold_binop_into_select,
sub_add_reg, select_to_minmax, redundant_binop_in_equality,		sub_add_reg, select_to_minmax, redundant_binop_in_equality,
fsub_to_fneg, commute_constant_to_rhs]>;		fsub_to_fneg, commute_constant_to_rhs]>;

// A combine group used to for prelegalizer combiners at -O0. The combines in		// A combine group used to for prelegalizer combiners at -O0. The combines in
// this group have been selected based on experiments to balance code size and		// this group have been selected based on experiments to balance code size and
// compile time performance.		// compile time performance.
def optnone_combines : GICombineGroup<[trivial_combines,		def optnone_combines : GICombineGroup<[trivial_combines,
ptr_add_immed_chain, combines_for_extload,		ptr_add_immed_chain, combines_for_extload,
not_cmp_fold, opt_brcond_by_inverting_cond]>;		not_cmp_fold, opt_brcond_by_inverting_cond]>;

llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp

(3,619 lines hidden)	MatchInfo = [=](MachineIRBuilder &MIB) {
Register LoadDst = NeedsBSwap ? MRI.cloneVirtualRegister(Dst) : Dst;		Register LoadDst = NeedsBSwap ? MRI.cloneVirtualRegister(Dst) : Dst;
MIB.buildLoad(LoadDst, Ptr, *NewMMO);		MIB.buildLoad(LoadDst, Ptr, *NewMMO);
if (NeedsBSwap)		if (NeedsBSwap)
MIB.buildBSwap(Dst, LoadDst);		MIB.buildBSwap(Dst, LoadDst);
};		};
return true;		return true;
}		}

/// Check if the store \p Store is a truncstore that can be merged. That is,
/// it's a store of a shifted value of \p SrcVal. If \p SrcVal is an empty
/// Register then it does not need to match and SrcVal is set to the source
/// value found.
/// On match, returns the start byte offset of the \p SrcVal that is being
/// stored.
static std::optional<int64_t>
getTruncStoreByteOffset(GStore &Store, Register &SrcVal,
MachineRegisterInfo &MRI) {
Register TruncVal;
if (!mi_match(Store.getValueReg(), MRI, m_GTrunc(m_Reg(TruncVal))))
return std::nullopt;

// The shift amount must be a constant multiple of the narrow type.
// It is translated to the offset address in the wide source value "y".
//
// x = G_LSHR y, ShiftAmtC
// s8 z = G_TRUNC x
// store z, ...
Register FoundSrcVal;
int64_t ShiftAmt;
if (!mi_match(TruncVal, MRI,
m_any_of(m_GLShr(m_Reg(FoundSrcVal), m_ICst(ShiftAmt)),
m_GAShr(m_Reg(FoundSrcVal), m_ICst(ShiftAmt))))) {
if (!SrcVal.isValid() || TruncVal == SrcVal) {
if (!SrcVal.isValid())
SrcVal = TruncVal;
return 0; // If it's the lowest index store.
}
return std::nullopt;
}

unsigned NarrowBits = Store.getMMO().getMemoryType().getScalarSizeInBits();
if (ShiftAmt % NarrowBits!= 0)
return std::nullopt;
const unsigned Offset = ShiftAmt / NarrowBits;

if (SrcVal.isValid() && FoundSrcVal != SrcVal)
return std::nullopt;

if (!SrcVal.isValid())
SrcVal = FoundSrcVal;
else if (MRI.getType(SrcVal) != MRI.getType(FoundSrcVal))
return std::nullopt;
return Offset;
}

/// Match a pattern where a wide type scalar value is stored by several narrow
/// stores. Fold it into a single store or a BSWAP and a store if the targets
/// supports it.
///
/// Assuming little endian target:
/// i8 *p = ...
/// i32 val = ...
/// p[0] = (val >> 0) & 0xFF;
/// p[1] = (val >> 8) & 0xFF;
/// p[2] = (val >> 16) & 0xFF;
/// p[3] = (val >> 24) & 0xFF;
/// =>
/// *((i32)p) = val;
///
/// i8 *p = ...
/// i32 val = ...
/// p[0] = (val >> 24) & 0xFF;
/// p[1] = (val >> 16) & 0xFF;
/// p[2] = (val >> 8) & 0xFF;
/// p[3] = (val >> 0) & 0xFF;
/// =>
/// *((i32)p) = BSWAP(val);
bool CombinerHelper::matchTruncStoreMerge(MachineInstr &MI,
MergeTruncStoresInfo &MatchInfo) {
auto &StoreMI = cast<GStore>(MI);
LLT MemTy = StoreMI.getMMO().getMemoryType();

// We only handle merging simple stores of 1-4 bytes.
if (!MemTy.isScalar())
return false;
switch (MemTy.getSizeInBits()) {
case 8:
case 16:
case 32:
break;
default:
return false;
}
if (!StoreMI.isSimple())
return false;

// We do a simple search for mergeable stores prior to this one.
// Any potential alias hazard along the way terminates the search.
SmallVector<GStore *> FoundStores;

// We're looking for:
// 1) a (store(trunc(...)))
// 2) of an LSHR/ASHR of a single wide value, by the appropriate shift to get
// the partial value stored.
// 3) where the offsets form either a little or big-endian sequence.

auto &LastStore = StoreMI;

// The single base pointer that all stores must use.
Register BaseReg;
int64_t LastOffset;
if (!mi_match(LastStore.getPointerReg(), MRI,
m_GPtrAdd(m_Reg(BaseReg), m_ICst(LastOffset)))) {
BaseReg = LastStore.getPointerReg();
LastOffset = 0;
}

GStore *LowestIdxStore = &LastStore;
int64_t LowestIdxOffset = LastOffset;

Register WideSrcVal;
auto LowestShiftAmt = getTruncStoreByteOffset(LastStore, WideSrcVal, MRI);
if (!LowestShiftAmt)
return false; // Didn't match a trunc.
assert(WideSrcVal.isValid());

LLT WideStoreTy = MRI.getType(WideSrcVal);
// The wide type might not be a multiple of the memory type, e.g. s48 and s32.
if (WideStoreTy.getSizeInBits() % MemTy.getSizeInBits() != 0)
return false;
const unsigned NumStoresRequired =
WideStoreTy.getSizeInBits() / MemTy.getSizeInBits();

SmallVector<int64_t, 8> OffsetMap(NumStoresRequired, INT64_MAX);
OffsetMap[*LowestShiftAmt] = LastOffset;
FoundStores.emplace_back(&LastStore);

// Search the block up for more stores.
// We use a search threshold of 10 instructions here because the combiner
// works top-down within a block, and we don't want to search an unbounded
// number of predecessor instructions trying to find matching stores.
// If we moved this optimization into a separate pass then we could probably
// use a more efficient search without having a hard-coded threshold.
const int MaxInstsToCheck = 10;
int NumInstsChecked = 0;
for (auto II = ++LastStore.getReverseIterator();
II != LastStore.getParent()->rend() && NumInstsChecked < MaxInstsToCheck;
++II) {
NumInstsChecked++;
GStore *NewStore;
if ((NewStore = dyn_cast<GStore>(&*II))) {
if (NewStore->getMMO().getMemoryType() != MemTy || !NewStore->isSimple())
break;
} else if (II->isLoadFoldBarrier() || II->mayLoad()) {
break;
} else {
continue; // This is a safe instruction we can look past.
}

Register NewBaseReg;
int64_t MemOffset;
// Check we're storing to the same base + some offset.
if (!mi_match(NewStore->getPointerReg(), MRI,
m_GPtrAdd(m_Reg(NewBaseReg), m_ICst(MemOffset)))) {
NewBaseReg = NewStore->getPointerReg();
MemOffset = 0;
}
if (BaseReg != NewBaseReg)
break;

auto ShiftByteOffset = getTruncStoreByteOffset(*NewStore, WideSrcVal, MRI);
if (!ShiftByteOffset)
break;
if (MemOffset < LowestIdxOffset) {
LowestIdxOffset = MemOffset;
LowestIdxStore = NewStore;
}

// Map the offset in the store and the offset in the combined value, and
// early return if it has been set before.
if (ShiftByteOffset < 0 || ShiftByteOffset >= NumStoresRequired ||
OffsetMap[*ShiftByteOffset] != INT64_MAX)
break;
OffsetMap[*ShiftByteOffset] = MemOffset;

FoundStores.emplace_back(NewStore);
// Reset counter since we've found a matching inst.
NumInstsChecked = 0;
if (FoundStores.size() == NumStoresRequired)
break;
}

if (FoundStores.size() != NumStoresRequired) {
return false;
}

const auto &DL = LastStore.getMF()->getDataLayout();
auto &C = LastStore.getMF()->getFunction().getContext();
// Check that a store of the wide type is both allowed and fast on the target
unsigned Fast = 0;
bool Allowed = getTargetLowering().allowsMemoryAccess(
C, DL, WideStoreTy, LowestIdxStore->getMMO(), &Fast);
if (!Allowed || !Fast)
return false;

// Check if the pieces of the value are going to the expected places in memory
// to merge the stores.
unsigned NarrowBits = MemTy.getScalarSizeInBits();
auto checkOffsets = [&](bool MatchLittleEndian) {
if (MatchLittleEndian) {
for (unsigned i = 0; i != NumStoresRequired; ++i)
if (OffsetMap[i] != i * (NarrowBits / 8) + LowestIdxOffset)
return false;
} else { // MatchBigEndian by reversing loop counter.
for (unsigned i = 0, j = NumStoresRequired - 1; i != NumStoresRequired;
++i, --j)
if (OffsetMap[j] != i * (NarrowBits / 8) + LowestIdxOffset)
return false;
}
return true;
};

// Check if the offsets line up for the native data layout of this target.
bool NeedBswap = false;
bool NeedRotate = false;
if (!checkOffsets(DL.isLittleEndian())) {
// Special-case: check if byte offsets line up for the opposite endian.
if (NarrowBits == 8 && checkOffsets(DL.isBigEndian()))
NeedBswap = true;
else if (NumStoresRequired == 2 && checkOffsets(DL.isBigEndian()))
NeedRotate = true;
else
return false;
}

if (NeedBswap &&
!isLegalOrBeforeLegalizer({TargetOpcode::G_BSWAP, {WideStoreTy}}))
return false;
if (NeedRotate &&
!isLegalOrBeforeLegalizer({TargetOpcode::G_ROTR, {WideStoreTy}}))
return false;

MatchInfo.NeedBSwap = NeedBswap;
MatchInfo.NeedRotate = NeedRotate;
MatchInfo.LowestIdxStore = LowestIdxStore;
MatchInfo.WideSrcVal = WideSrcVal;
MatchInfo.FoundStores = std::move(FoundStores);
return true;
}

void CombinerHelper::applyTruncStoreMerge(MachineInstr &MI,
MergeTruncStoresInfo &MatchInfo) {

Builder.setInstrAndDebugLoc(MI);
Register WideSrcVal = MatchInfo.WideSrcVal;
LLT WideStoreTy = MRI.getType(WideSrcVal);

if (MatchInfo.NeedBSwap) {
WideSrcVal = Builder.buildBSwap(WideStoreTy, WideSrcVal).getReg(0);
} else if (MatchInfo.NeedRotate) {
assert(WideStoreTy.getSizeInBits() % 2 == 0 &&
"Unexpected type for rotate");
auto RotAmt =
Builder.buildConstant(WideStoreTy, WideStoreTy.getSizeInBits() / 2);
WideSrcVal =
Builder.buildRotateRight(WideStoreTy, WideSrcVal, RotAmt).getReg(0);
}

Builder.buildStore(WideSrcVal, MatchInfo.LowestIdxStore->getPointerReg(),
MatchInfo.LowestIdxStore->getMMO().getPointerInfo(),
MatchInfo.LowestIdxStore->getMMO().getAlign());

// Erase the old stores.
for (auto *ST : MatchInfo.FoundStores)
ST->eraseFromParent();
}

bool CombinerHelper::matchExtendThroughPhis(MachineInstr &MI,		bool CombinerHelper::matchExtendThroughPhis(MachineInstr &MI,
MachineInstr *&ExtMI) {		MachineInstr *&ExtMI) {
assert(MI.getOpcode() == TargetOpcode::G_PHI);		assert(MI.getOpcode() == TargetOpcode::G_PHI);

Register DstReg = MI.getOperand(0).getReg();		Register DstReg = MI.getOperand(0).getReg();

// TODO: Extending a vector may be expensive, don't do this until heuristics		// TODO: Extending a vector may be expensive, don't do this until heuristics
// are better.		// are better.
(2,315 lines hidden)

llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp

//===- LoadStoreOpt.cpp ----------- Generic memory optimizations -- C++ --==//		//===- LoadStoreOpt.cpp ----------- Generic memory optimizations -- C++ --==//
//		//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.		// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.		// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception		// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
/// \file		/// \file
/// This file implements the LoadStoreOpt optimization pass.		/// This file implements the LoadStoreOpt optimization pass.
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "llvm/CodeGen/GlobalISel/LoadStoreOpt.h"		#include "llvm/CodeGen/GlobalISel/LoadStoreOpt.h"
		#include "llvm/ADT/STLExtras.h"
		#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/Statistic.h"		#include "llvm/ADT/Statistic.h"
#include "llvm/Analysis/AliasAnalysis.h"		#include "llvm/Analysis/AliasAnalysis.h"
#include "llvm/Analysis/MemoryLocation.h"		#include "llvm/Analysis/MemoryLocation.h"
#include "llvm/Analysis/OptimizationRemarkEmitter.h"		#include "llvm/Analysis/OptimizationRemarkEmitter.h"
#include "llvm/CodeGen/GlobalISel/GenericMachineInstrs.h"		#include "llvm/CodeGen/GlobalISel/GenericMachineInstrs.h"
#include "llvm/CodeGen/GlobalISel/LegalizerInfo.h"		#include "llvm/CodeGen/GlobalISel/LegalizerInfo.h"
#include "llvm/CodeGen/GlobalISel/MIPatternMatch.h"		#include "llvm/CodeGen/GlobalISel/MIPatternMatch.h"
#include "llvm/CodeGen/GlobalISel/Utils.h"		#include "llvm/CodeGen/GlobalISel/Utils.h"
(591 lines hidden)	bool LoadStoreOpt::mergeBlockStores(MachineBasicBlock &MBB) {

// Erase instructions now that we're no longer iterating over the block.		// Erase instructions now that we're no longer iterating over the block.
for (auto *MI : InstsToErase)		for (auto *MI : InstsToErase)
MI->eraseFromParent();		MI->eraseFromParent();
InstsToErase.clear();		InstsToErase.clear();
return Changed;		return Changed;
}		}

		/// Check if the store \p Store is a truncstore that can be merged. That is,
		/// it's a store of a shifted value of \p SrcVal. If \p SrcVal is an empty
		/// Register then it does not need to match and SrcVal is set to the source
		/// value found.
		/// On match, returns the start byte offset of the \p SrcVal that is being
		/// stored.
		static std::optional<int64_t>
		getTruncStoreByteOffset(GStore &Store, Register &SrcVal,
		MachineRegisterInfo &MRI) {
		Register TruncVal;
		if (!mi_match(Store.getValueReg(), MRI, m_GTrunc(m_Reg(TruncVal))))
		return std::nullopt;

		// The shift amount must be a constant multiple of the narrow type.
		// It is translated to the offset address in the wide source value "y".
		//
		// x = G_LSHR y, ShiftAmtC
		// s8 z = G_TRUNC x
		// store z, ...
		Register FoundSrcVal;
		int64_t ShiftAmt;
		if (!mi_match(TruncVal, MRI,
		m_any_of(m_GLShr(m_Reg(FoundSrcVal), m_ICst(ShiftAmt)),
		m_GAShr(m_Reg(FoundSrcVal), m_ICst(ShiftAmt))))) {
		if (!SrcVal.isValid() || TruncVal == SrcVal) {
		if (!SrcVal.isValid())
		SrcVal = TruncVal;
		return 0; // If it's the lowest index store.
		}
		return std::nullopt;
		}

		unsigned NarrowBits = Store.getMMO().getMemoryType().getScalarSizeInBits();
		if (ShiftAmt % NarrowBits != 0)
		return std::nullopt;
		const unsigned Offset = ShiftAmt / NarrowBits;

		if (SrcVal.isValid() && FoundSrcVal != SrcVal)
		return std::nullopt;

		if (!SrcVal.isValid())
		SrcVal = FoundSrcVal;
		else if (MRI.getType(SrcVal) != MRI.getType(FoundSrcVal))
		return std::nullopt;
		return Offset;
		}

		/// Match a pattern where a wide type scalar value is stored by several narrow
		/// stores. Fold it into a single store or a BSWAP and a store if the targets
		/// supports it.
		///
		/// Assuming little endian target:
		/// i8 *p = ...
		/// i32 val = ...
		/// p[0] = (val >> 0) & 0xFF;
		/// p[1] = (val >> 8) & 0xFF;
		/// p[2] = (val >> 16) & 0xFF;
		/// p[3] = (val >> 24) & 0xFF;
		/// =>
		/// *((i32)p) = val;
		///
		/// i8 *p = ...
		/// i32 val = ...
		/// p[0] = (val >> 24) & 0xFF;
		/// p[1] = (val >> 16) & 0xFF;
		/// p[2] = (val >> 8) & 0xFF;
		/// p[3] = (val >> 0) & 0xFF;
		/// =>
		/// *((i32)p) = BSWAP(val);
		bool LoadStoreOpt::mergeTruncStore(GStore &StoreMI,
		SmallPtrSetImpl<GStore *> &DeletedStores) {
		LLT MemTy = StoreMI.getMMO().getMemoryType();

		// We only handle merging simple stores of 1-4 bytes.
		if (!MemTy.isScalar())
		return false;
		switch (MemTy.getSizeInBits()) {
		case 8:
		case 16:
		case 32:
		break;
		default:
		return false;
		}
		if (!StoreMI.isSimple())
		return false;

		// We do a simple search for mergeable stores prior to this one.
		// Any potential alias hazard along the way terminates the search.
		SmallVector<GStore *> FoundStores;

		// We're looking for:
		// 1) a (store(trunc(...)))
		// 2) of an LSHR/ASHR of a single wide value, by the appropriate shift to get
		// the partial value stored.
		// 3) where the offsets form either a little or big-endian sequence.

		auto &LastStore = StoreMI;

		// The single base pointer that all stores must use.
		Register BaseReg;
		int64_t LastOffset;
		if (!mi_match(LastStore.getPointerReg(), *MRI,
		m_GPtrAdd(m_Reg(BaseReg), m_ICst(LastOffset)))) {
		BaseReg = LastStore.getPointerReg();
		LastOffset = 0;
		}

		GStore *LowestIdxStore = &LastStore;
		int64_t LowestIdxOffset = LastOffset;

		Register WideSrcVal;
		auto LowestShiftAmt = getTruncStoreByteOffset(LastStore, WideSrcVal, *MRI);
		if (!LowestShiftAmt)
		return false; // Didn't match a trunc.
		assert(WideSrcVal.isValid());

		LLT WideStoreTy = MRI->getType(WideSrcVal);
		// The wide type might not be a multiple of the memory type, e.g. s48 and s32.
		if (WideStoreTy.getSizeInBits() % MemTy.getSizeInBits() != 0)
		return false;
		const unsigned NumStoresRequired =
		WideStoreTy.getSizeInBits() / MemTy.getSizeInBits();

		SmallVector<int64_t, 8> OffsetMap(NumStoresRequired, INT64_MAX);
		OffsetMap[*LowestShiftAmt] = LastOffset;
		FoundStores.emplace_back(&LastStore);

		const int MaxInstsToCheck = 10;
		int NumInstsChecked = 0;
		for (auto II = ++LastStore.getReverseIterator();
		II != LastStore.getParent()->rend() && NumInstsChecked < MaxInstsToCheck;
		++II) {
		NumInstsChecked++;
		GStore *NewStore;
		if ((NewStore = dyn_cast<GStore>(&*II))) {
		if (NewStore->getMMO().getMemoryType() != MemTy || !NewStore->isSimple())
		break;
		} else if (II->isLoadFoldBarrier() || II->mayLoad()) {
		break;
		} else {
		continue; // This is a safe instruction we can look past.
		}

		Register NewBaseReg;
		int64_t MemOffset;
		// Check we're storing to the same base + some offset.
		if (!mi_match(NewStore->getPointerReg(), *MRI,
		m_GPtrAdd(m_Reg(NewBaseReg), m_ICst(MemOffset)))) {
		NewBaseReg = NewStore->getPointerReg();
		MemOffset = 0;
		}
		if (BaseReg != NewBaseReg)
		break;

		auto ShiftByteOffset = getTruncStoreByteOffset(*NewStore, WideSrcVal, *MRI);
		if (!ShiftByteOffset)
		break;
		if (MemOffset < LowestIdxOffset) {
		LowestIdxOffset = MemOffset;
		LowestIdxStore = NewStore;
		}

		// Map the offset in the store and the offset in the combined value, and
		// early return if it has been set before.
		if (ShiftByteOffset < 0 || ShiftByteOffset >= NumStoresRequired ||
		OffsetMap[*ShiftByteOffset] != INT64_MAX)
		break;
		OffsetMap[*ShiftByteOffset] = MemOffset;

		FoundStores.emplace_back(NewStore);
		// Reset counter since we've found a matching inst.
		NumInstsChecked = 0;
		if (FoundStores.size() == NumStoresRequired)
		break;
		}

		if (FoundStores.size() != NumStoresRequired) {
		if (FoundStores.size() == 1)
		return false;
		// We didn't find enough stores to merge into the size of the original
		// source value, but we may be able to generate a smaller store if we
		// truncate the source value.
		WideStoreTy = LLT::scalar(FoundStores.size() * MemTy.getScalarSizeInBits());
		}

		unsigned NumStoresFound = FoundStores.size();

		const auto &DL = LastStore.getMF()->getDataLayout();
		auto &C = LastStore.getMF()->getFunction().getContext();
		// Check that a store of the wide type is both allowed and fast on the target
		unsigned Fast = 0;
		bool Allowed = TLI->allowsMemoryAccess(
		C, DL, WideStoreTy, LowestIdxStore->getMMO(), &Fast);
		if (!Allowed || !Fast)
		return false;

		// Check if the pieces of the value are going to the expected places in memory
		// to merge the stores.
		unsigned NarrowBits = MemTy.getScalarSizeInBits();
		auto checkOffsets = [&](bool MatchLittleEndian) {
		if (MatchLittleEndian) {
		for (unsigned i = 0; i != NumStoresFound; ++i)
		if (OffsetMap[i] != i * (NarrowBits / 8) + LowestIdxOffset)
		return false;
		} else { // MatchBigEndian by reversing loop counter.
		for (unsigned i = 0, j = NumStoresFound - 1; i != NumStoresFound;
		++i, --j)
		if (OffsetMap[j] != i * (NarrowBits / 8) + LowestIdxOffset)
		return false;
		}
		return true;
		};

		// Check if the offsets line up for the native data layout of this target.
		bool NeedBswap = false;
		bool NeedRotate = false;
		if (!checkOffsets(DL.isLittleEndian())) {
		// Special-case: check if byte offsets line up for the opposite endian.
		if (NarrowBits == 8 && checkOffsets(DL.isBigEndian()))
		NeedBswap = true;
		else if (NumStoresFound == 2 && checkOffsets(DL.isBigEndian()))
		NeedRotate = true;
		else
		return false;
		}

		if (NeedBswap &&
		!isLegalOrBeforeLegalizer({TargetOpcode::G_BSWAP, {WideStoreTy}}, *MF))
		return false;
		if (NeedRotate &&
		!isLegalOrBeforeLegalizer(
		{TargetOpcode::G_ROTR, {WideStoreTy, WideStoreTy}}, *MF))
		return false;

		Builder.setInstrAndDebugLoc(StoreMI);

		if (WideStoreTy != MRI->getType(WideSrcVal))
		WideSrcVal = Builder.buildTrunc(WideStoreTy, WideSrcVal).getReg(0);

		if (NeedBswap) {
		WideSrcVal = Builder.buildBSwap(WideStoreTy, WideSrcVal).getReg(0);
		} else if (NeedRotate) {
		assert(WideStoreTy.getSizeInBits() % 2 == 0 &&
		"Unexpected type for rotate");
		auto RotAmt =
		Builder.buildConstant(WideStoreTy, WideStoreTy.getSizeInBits() / 2);
		WideSrcVal =
		Builder.buildRotateRight(WideStoreTy, WideSrcVal, RotAmt).getReg(0);
		}

		Builder.buildStore(WideSrcVal, LowestIdxStore->getPointerReg(),
		LowestIdxStore->getMMO().getPointerInfo(),
		LowestIdxStore->getMMO().getAlign());

		// Erase the old stores.
		for (auto *ST : FoundStores) {
		ST->eraseFromParent();
		DeletedStores.insert(ST);
		}
		return true;
		}

		bool LoadStoreOpt::mergeTruncStoresBlock(MachineBasicBlock &BB) {
		bool Changed = false;
		SmallVector<GStore *, 16> Stores;
		SmallPtrSet<GStore *, 8> DeletedStores;
		// Walk up the block so we can see the most eligible stores.
		for (MachineInstr &MI : llvm::reverse(BB))
		if (auto *StoreMI = dyn_cast<GStore>(&MI))
		Stores.emplace_back(StoreMI);

		for (auto *StoreMI : Stores) {
		if (DeletedStores.count(StoreMI))
		continue;
		if (mergeTruncStore(*StoreMI, DeletedStores))
		Changed = true;
		}
		return Changed;
		}

bool LoadStoreOpt::mergeFunctionStores(MachineFunction &MF) {		bool LoadStoreOpt::mergeFunctionStores(MachineFunction &MF) {
bool Changed = false;		bool Changed = false;
for (auto &BB : MF) {		for (auto &BB : MF){
Changed |= mergeBlockStores(BB);		Changed |= mergeBlockStores(BB);
		Changed |= mergeTruncStoresBlock(BB);
		}

		// Erase all dead instructions left over by the merging.
		if (Changed) {
		for (auto &BB : MF) {
		for (auto &I : make_early_inc_range(make_range(BB.rbegin(), BB.rend()))) {
		if (isTriviallyDead(I, *MRI))
		I.eraseFromParent();
}		}
		}
		}

return Changed;		return Changed;
}		}

void LoadStoreOpt::initializeStoreMergeTargetInfo(unsigned AddrSpace) {		void LoadStoreOpt::initializeStoreMergeTargetInfo(unsigned AddrSpace) {
// Query the legalizer info to record what store types are legal.		// Query the legalizer info to record what store types are legal.
// We record this because we don't want to bother trying to merge stores into		// We record this because we don't want to bother trying to merge stores into
// illegal ones, which would just result in being split again.		// illegal ones, which would just result in being split again.

[... 44 lines not shown ...]
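
The offset check in `mergeTruncStore` above can be read in isolation. What follows is a minimal standalone sketch, not the LLVM code: it uses a plain `std::vector` in place of the pass's `OffsetMap`, and the `MergeKind` enum and the `offsetsMatch` and `classify` helpers are hypothetical names introduced only for illustration (the patch itself tracks the outcome with the `NeedBswap` and `NeedRotate` booleans).

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical result type for this sketch only.
enum class MergeKind { Native, NeedBswap, NeedRotate, NoMerge };

// OffsetMap[I] is the byte offset in memory at which piece I of the wide
// value is stored; piece 0 holds the least significant bits.
static bool offsetsMatch(const std::vector<int64_t> &OffsetMap,
                         unsigned NarrowBits, int64_t LowestIdxOffset,
                         bool MatchLittleEndian) {
  assert(NarrowBits % 8 == 0 && "pieces are whole bytes in this sketch");
  const unsigned N = static_cast<unsigned>(OffsetMap.size());
  for (unsigned I = 0; I != N; ++I) {
    // Little endian: piece I lands in slot I.
    // Big endian: piece I lands in slot N - 1 - I.
    const unsigned Slot = MatchLittleEndian ? I : N - 1 - I;
    if (OffsetMap[I] != int64_t(Slot) * (NarrowBits / 8) + LowestIdxOffset)
      return false;
  }
  return true;
}

// Decide whether the pieces can be merged as-is, merged after a byte swap,
// merged after a rotate by half the width, or not merged at all.
static MergeKind classify(const std::vector<int64_t> &OffsetMap,
                          unsigned NarrowBits, int64_t LowestIdxOffset,
                          bool TargetIsLittleEndian) {
  if (offsetsMatch(OffsetMap, NarrowBits, LowestIdxOffset,
                   TargetIsLittleEndian))
    return MergeKind::Native;
  // Otherwise the layout must at least match the opposite endianness.
  if (!offsetsMatch(OffsetMap, NarrowBits, LowestIdxOffset,
                    !TargetIsLittleEndian))
    return MergeKind::NoMerge;
  if (NarrowBits == 8)
    return MergeKind::NeedBswap; // byte-sized pieces in reversed order
  if (OffsetMap.size() == 2)
    return MergeKind::NeedRotate; // two halves swapped
  return MergeKind::NoMerge;
}
```

For example, on a little-endian target four byte stores at offsets `{3, 2, 1, 0}` classify as `NeedBswap`, while two 16-bit stores at offsets `{2, 0}` classify as `NeedRotate`.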

llvm/test/CodeGen/AArch64/GlobalISel/merge-stores-truncating.ll

[... 281 lines not shown ...]	; CHECK-NEXT: ret
%t2 = trunc i16 %sh to i8		%t2 = trunc i16 %sh to i8
store i8 %t1, ptr %p, align 1		store i8 %t1, ptr %p, align 1
%p1 = getelementptr inbounds i8, ptr %p, i64 1		%p1 = getelementptr inbounds i8, ptr %p, i64 1
store i8 %t2, ptr %p1, align 1		store i8 %t2, ptr %p1, align 1
ret void		ret void
}		}

define dso_local void @missing_store(i32 %x, ptr %p) {		define dso_local void @missing_store(i32 %x, ptr %p) {
		; The missing store of shift 16 means we can't merge to 32 bit store,
		; but we can still partially merge to a 16 bit one.
; CHECK-LABEL: missing_store:		; CHECK-LABEL: missing_store:
; CHECK: ; %bb.0:		; CHECK: ; %bb.0:
; CHECK-NEXT: lsr w8, w0, #8		; CHECK-NEXT: lsr w8, w0, #24
; CHECK-NEXT: lsr w9, w0, #24		; CHECK-NEXT: strh w0, [x1]
; CHECK-NEXT: strb w0, [x1]		; CHECK-NEXT: strb w8, [x1, #3]
; CHECK-NEXT: strb w8, [x1, #1]
; CHECK-NEXT: strb w9, [x1, #3]
; CHECK-NEXT: ret		; CHECK-NEXT: ret
%t1 = trunc i32 %x to i8		%t1 = trunc i32 %x to i8
%sh1 = lshr i32 %x, 8		%sh1 = lshr i32 %x, 8
%t2 = trunc i32 %sh1 to i8		%t2 = trunc i32 %sh1 to i8
%sh3 = lshr i32 %x, 24		%sh3 = lshr i32 %x, 24
%t4 = trunc i32 %sh3 to i8		%t4 = trunc i32 %sh3 to i8
store i8 %t1, ptr %p, align 1		store i8 %t1, ptr %p, align 1
%p1 = getelementptr inbounds i8, ptr %p, i64 1		%p1 = getelementptr inbounds i8, ptr %p, i64 1
[... 29 lines not shown ...]	; CHECK-NEXT: ret
%t1 = trunc i16 %x to i8		%t1 = trunc i16 %x to i8
%sh = lshr i16 %x, 8		%sh = lshr i16 %x, 8
%t2 = trunc i16 %sh to i8		%t2 = trunc i16 %sh to i8
store volatile i8 %t1, ptr %p, align 1		store volatile i8 %t1, ptr %p, align 1
%p1 = getelementptr inbounds i8, ptr %p, i64 1		%p1 = getelementptr inbounds i8, ptr %p, i64 1
store i8 %t2, ptr %p1, align 1		store i8 %t2, ptr %p1, align 1
ret void		ret void
}		}

		declare void @use_ptr(ptr)

		define dso_local void @trunc_from_larger_src_val(i64 %hold.4.lcssa, ptr %check1792) {
		; Here we can merge these i8 stores into a single i32 store, but first we need
		; to truncate the i64 value to i32.
		; CHECK-LABEL: trunc_from_larger_src_val:
		; CHECK: ; %bb.0:
		; CHECK-NEXT: sub sp, sp, #32
		; CHECK-NEXT: .cfi_def_cfa_offset 32
		; CHECK-NEXT: stp x29, x30, [sp, #16] ; 16-byte Folded Spill
		; CHECK-NEXT: .cfi_offset w30, -8
		; CHECK-NEXT: .cfi_offset w29, -16
		; CHECK-NEXT: str w0, [sp, #12]
		; CHECK-NEXT: add x0, sp, #12
		; CHECK-NEXT: bl _use_ptr
		; CHECK-NEXT: ldp x29, x30, [sp, #16] ; 16-byte Folded Reload
		; CHECK-NEXT: add sp, sp, #32
		; CHECK-NEXT: ret
		%hbuf = alloca [4 x i8], align 1
		%arrayidx177 = getelementptr inbounds [4 x i8], ptr %hbuf, i64 0, i64 1
		%arrayidx234 = getelementptr inbounds [4 x i8], ptr %hbuf, i64 0, i64 2
		%arrayidx237 = getelementptr inbounds [4 x i8], ptr %hbuf, i64 0, i64 3
		%conv227 = trunc i64 %hold.4.lcssa to i8
		store i8 %conv227, ptr %hbuf, align 1
		%shr229 = lshr i64 %hold.4.lcssa, 8
		%conv230 = trunc i64 %shr229 to i8
		store i8 %conv230, ptr %arrayidx177, align 1
		%shr232 = lshr i64 %hold.4.lcssa, 16
		%conv233 = trunc i64 %shr232 to i8
		store i8 %conv233, ptr %arrayidx234, align 1
		%shr235 = lshr i64 %hold.4.lcssa, 24
		%conv236 = trunc i64 %shr235 to i8
		store i8 %conv236, ptr %arrayidx237, align 1
		call void @use_ptr(ptr noundef nonnull %hbuf)
		ret void
		}
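
The `trunc_from_larger_src_val` test above depends on one equivalence: writing bytes 0 to 3 of a 64-bit value with four narrow stores leaves memory in the same state as truncating to 32 bits and doing one wide store. The sketch below models that on a plain byte buffer with an explicit little-endian layout, so it does not depend on the host's byte order. `storeBytewise`, `storeTruncated`, and `mergedWidthBits` are hypothetical helper names for this illustration, not functions from the patch.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

using Buf = std::array<uint8_t, 4>;

// Four narrow stores, as in the IR before merging: byte I of the buffer
// receives bits [8*I, 8*I + 8) of the 64-bit source.
static Buf storeBytewise(uint64_t Src) {
  Buf B{};
  for (unsigned I = 0; I != 4; ++I)
    B[I] = static_cast<uint8_t>(Src >> (8 * I));
  return B;
}

// One wide store after merging: truncate first (models G_TRUNC s64 -> s32),
// then lay the 32-bit value out in little-endian byte order.
static Buf storeTruncated(uint64_t Src) {
  const uint32_t Wide = static_cast<uint32_t>(Src);
  Buf B{};
  for (unsigned I = 0; I != 4; ++I)
    B[I] = static_cast<uint8_t>(Wide >> (8 * I));
  return B;
}

// Width of the merged store when only NumFound narrow stores were matched,
// mirroring `FoundStores.size() * MemTy.getScalarSizeInBits()` in the patch.
static unsigned mergedWidthBits(unsigned NumFound, unsigned NarrowBits) {
  assert(NumFound >= 2 && "a single store is left alone");
  return NumFound * NarrowBits;
}
```

With four matched `i8` stores the merged width is 32 bits, as in this test. With only two, as in `missing_store`, it is 16 bits, which corresponds to the `strh` in the updated CHECK lines.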

llvm/test/CodeGen/AArch64/GlobalISel/merge-stores-truncating.mir

# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py		# NOTE: Assertions have been autogenerated by utils/update_mir_test_checks.py
# RUN: llc -mtriple aarch64 -run-pass=aarch64-prelegalizer-combiner -verify-machineinstrs %s -o - | FileCheck %s		# RUN: llc -mtriple aarch64 -run-pass=loadstore-opt -verify-machineinstrs %s -o - | FileCheck %s
---		---
name: trunc_i16_to_i8		name: trunc_i16_to_i8
alignment: 4		alignment: 4
tracksRegLiveness: true		tracksRegLiveness: true
liveins:		liveins:
- { reg: '$w0' }		- { reg: '$w0' }
- { reg: '$x1' }		- { reg: '$x1' }
body: |		body: |
[... 621 lines not shown ...]	body: |
bb.1:		bb.1:
liveins: $w0, $x1		liveins: $w0, $x1

; CHECK-LABEL: name: missing_store		; CHECK-LABEL: name: missing_store
; CHECK: liveins: $w0, $x1		; CHECK: liveins: $w0, $x1
; CHECK-NEXT: {{ $}}		; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $w0		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(s32) = COPY $w0
; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(p0) = COPY $x1		; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(p0) = COPY $x1
; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 8		; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 24
; CHECK-NEXT: [[TRUNC:%[0-9]+]]:_(s8) = G_TRUNC [[COPY]](s32)
; CHECK-NEXT: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[COPY]], [[C]](s32)		; CHECK-NEXT: [[LSHR:%[0-9]+]]:_(s32) = G_LSHR [[COPY]], [[C]](s32)
; CHECK-NEXT: [[TRUNC1:%[0-9]+]]:_(s8) = G_TRUNC [[LSHR]](s32)		; CHECK-NEXT: [[TRUNC:%[0-9]+]]:_(s8) = G_TRUNC [[LSHR]](s32)
; CHECK-NEXT: [[LSHR1:%[0-9]+]]:_(s32) = G_LSHR [[COPY]], [[C1]](s32)		; CHECK-NEXT: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[COPY]](s32)
; CHECK-NEXT: [[TRUNC2:%[0-9]+]]:_(s8) = G_TRUNC [[LSHR1]](s32)		; CHECK-NEXT: G_STORE [[TRUNC1]](s16), [[COPY1]](p0) :: (store (s16), align 1)
; CHECK-NEXT: G_STORE [[TRUNC]](s8), [[COPY1]](p0) :: (store (s8))		; CHECK-NEXT: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 3
; CHECK-NEXT: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 1		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY1]], [[C1]](s64)
; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY1]], [[C2]](s64)		; CHECK-NEXT: G_STORE [[TRUNC]](s8), [[PTR_ADD]](p0) :: (store (s8))
; CHECK-NEXT: G_STORE [[TRUNC1]](s8), [[PTR_ADD]](p0) :: (store (s8))
; CHECK-NEXT: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 3
; CHECK-NEXT: [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY1]], [[C3]](s64)
; CHECK-NEXT: G_STORE [[TRUNC2]](s8), [[PTR_ADD1]](p0) :: (store (s8))
; CHECK-NEXT: RET_ReallyLR		; CHECK-NEXT: RET_ReallyLR
%0:_(s32) = COPY $w0		%0:_(s32) = COPY $w0
%1:_(p0) = COPY $x1		%1:_(p0) = COPY $x1
%3:_(s32) = G_CONSTANT i32 8		%3:_(s32) = G_CONSTANT i32 8
%6:_(s32) = G_CONSTANT i32 24		%6:_(s32) = G_CONSTANT i32 24
%2:_(s8) = G_TRUNC %0(s32)		%2:_(s8) = G_TRUNC %0(s32)
%4:_(s32) = G_LSHR %0, %3(s32)		%4:_(s32) = G_LSHR %0, %3(s32)
%5:_(s8) = G_TRUNC %4(s32)		%5:_(s8) = G_TRUNC %4(s32)
[... 127 lines not shown ...]

llvm/test/CodeGen/AArch64/GlobalISel/store-merging-debug.mir

[... 97 lines not shown ...]	body: |
bb.0:		bb.0:
liveins: $x0		liveins: $x0

; CHECK-LABEL: name: test_simple_4xs16		; CHECK-LABEL: name: test_simple_4xs16
; CHECK: liveins: $x0		; CHECK: liveins: $x0
; CHECK-NEXT: {{ $}}		; CHECK-NEXT: {{ $}}
; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x0, debug-location !11		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x0, debug-location !11
; CHECK-NEXT: DBG_VALUE [[COPY]](p0), $noreg, !9, !DIExpression(), debug-location !11		; CHECK-NEXT: DBG_VALUE [[COPY]](p0), $noreg, !9, !DIExpression(), debug-location !11
; CHECK-NEXT: [[C:%[0-9]+]]:_(s16) = G_CONSTANT i16 4, debug-location !DILocation(line: 2, column: 1, scope: !5)		; CHECK-NEXT: DBG_VALUE %1:_(s16), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 2, column: 1, scope: !5)
; CHECK-NEXT: DBG_VALUE [[C]](s16), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 2, column: 1, scope: !5)		; CHECK-NEXT: DBG_VALUE %4:_(s16), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 3, column: 1, scope: !5)
; CHECK-NEXT: [[C1:%[0-9]+]]:_(s16) = G_CONSTANT i16 5, debug-location !DILocation(line: 3, column: 1, scope: !5)		; CHECK-NEXT: DBG_VALUE %7:_(s16), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 4, column: 1, scope: !5)
; CHECK-NEXT: DBG_VALUE [[C1]](s16), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 3, column: 1, scope: !5)		; CHECK-NEXT: DBG_VALUE %10:_(s16), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 5, column: 1, scope: !5)
; CHECK-NEXT: [[C2:%[0-9]+]]:_(s16) = G_CONSTANT i16 9, debug-location !DILocation(line: 4, column: 1, scope: !5)
; CHECK-NEXT: DBG_VALUE [[C2]](s16), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 4, column: 1, scope: !5)
; CHECK-NEXT: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 14, debug-location !DILocation(line: 5, column: 1, scope: !5)
; CHECK-NEXT: DBG_VALUE [[C3]](s16), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 5, column: 1, scope: !5)
; CHECK-NEXT: DBG_VALUE 0, $noreg, !9, !DIExpression(), debug-location !DILocation(line: 6, column: 1, scope: !5)		; CHECK-NEXT: DBG_VALUE 0, $noreg, !9, !DIExpression(), debug-location !DILocation(line: 6, column: 1, scope: !5)
; CHECK-NEXT: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 2, debug-location !DILocation(line: 7, column: 1, scope: !5)		; CHECK-NEXT: DBG_VALUE %2:_(s64), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 7, column: 1, scope: !5)
; CHECK-NEXT: DBG_VALUE [[C4]](s64), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 7, column: 1, scope: !5)		; CHECK-NEXT: DBG_VALUE %3:_(p0), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 8, column: 1, scope: !5)
; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C4]](s64), debug-location !DILocation(line: 8, column: 1, scope: !5)
; CHECK-NEXT: DBG_VALUE [[PTR_ADD]](p0), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 8, column: 1, scope: !5)
; CHECK-NEXT: DBG_VALUE 1, $noreg, !9, !DIExpression(), debug-location !DILocation(line: 9, column: 1, scope: !5)		; CHECK-NEXT: DBG_VALUE 1, $noreg, !9, !DIExpression(), debug-location !DILocation(line: 9, column: 1, scope: !5)
; CHECK-NEXT: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 4, debug-location !DILocation(line: 10, column: 1, scope: !5)		; CHECK-NEXT: DBG_VALUE %5:_(s64), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 10, column: 1, scope: !5)
; CHECK-NEXT: DBG_VALUE [[C5]](s64), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 10, column: 1, scope: !5)		; CHECK-NEXT: DBG_VALUE %6:_(p0), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 11, column: 1, scope: !5)
; CHECK-NEXT: [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C5]](s64), debug-location !DILocation(line: 11, column: 1, scope: !5)
; CHECK-NEXT: DBG_VALUE [[PTR_ADD1]](p0), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 11, column: 1, scope: !5)
; CHECK-NEXT: DBG_VALUE 2, $noreg, !9, !DIExpression(), debug-location !DILocation(line: 12, column: 1, scope: !5)		; CHECK-NEXT: DBG_VALUE 2, $noreg, !9, !DIExpression(), debug-location !DILocation(line: 12, column: 1, scope: !5)
; CHECK-NEXT: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 6, debug-location !DILocation(line: 13, column: 1, scope: !5)		; CHECK-NEXT: DBG_VALUE %8:_(s64), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 13, column: 1, scope: !5)
; CHECK-NEXT: DBG_VALUE [[C6]](s64), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 13, column: 1, scope: !5)		; CHECK-NEXT: DBG_VALUE %9:_(p0), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 14, column: 1, scope: !5)
; CHECK-NEXT: [[PTR_ADD2:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C6]](s64), debug-location !DILocation(line: 14, column: 1, scope: !5)		; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 3940688328982532
; CHECK-NEXT: DBG_VALUE [[PTR_ADD2]](p0), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 14, column: 1, scope: !5)		; CHECK-NEXT: G_STORE [[C]](s64), [[COPY]](p0), debug-location !DILocation(line: 9, scope: !5) :: (store (s64), align 2)
; CHECK-NEXT: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 3940688328982532
; CHECK-NEXT: G_STORE [[C7]](s64), [[COPY]](p0), debug-location !DILocation(line: 9, scope: !5) :: (store (s64), align 2)
; CHECK-NEXT: DBG_VALUE 3, $noreg, !9, !DIExpression(), debug-location !DILocation(line: 15, column: 1, scope: !5)		; CHECK-NEXT: DBG_VALUE 3, $noreg, !9, !DIExpression(), debug-location !DILocation(line: 15, column: 1, scope: !5)
; CHECK-NEXT: RET_ReallyLR debug-location !DILocation(line: 16, column: 1, scope: !5)		; CHECK-NEXT: RET_ReallyLR debug-location !DILocation(line: 16, column: 1, scope: !5)
%0:_(p0) = COPY $x0, debug-location !11		%0:_(p0) = COPY $x0, debug-location !11
DBG_VALUE %0(p0), $noreg, !9, !DIExpression(), debug-location !11		DBG_VALUE %0(p0), $noreg, !9, !DIExpression(), debug-location !11
%1:_(s16) = G_CONSTANT i16 4, debug-location !DILocation(line: 2, column: 1, scope: !5)		%1:_(s16) = G_CONSTANT i16 4, debug-location !DILocation(line: 2, column: 1, scope: !5)
DBG_VALUE %1(s16), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 2, column: 1, scope: !5)		DBG_VALUE %1(s16), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 2, column: 1, scope: !5)
%4:_(s16) = G_CONSTANT i16 5, debug-location !DILocation(line: 3, column: 1, scope: !5)		%4:_(s16) = G_CONSTANT i16 5, debug-location !DILocation(line: 3, column: 1, scope: !5)
DBG_VALUE %4(s16), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 3, column: 1, scope: !5)		DBG_VALUE %4(s16), $noreg, !9, !DIExpression(), debug-location !DILocation(line: 3, column: 1, scope: !5)
[... 27 lines not shown ...]

llvm/test/CodeGen/AArch64/GlobalISel/store-merging.mir

[... 172 lines not shown ...]	frameInfo:
maxAlignment: 1		maxAlignment: 1
machineFunctionInfo: {}		machineFunctionInfo: {}
body: |		body: |
bb.1 (%ir-block.0):		bb.1 (%ir-block.0):
liveins: $x0		liveins: $x0

; CHECK-LABEL: name: test_simple_2xs8		; CHECK-LABEL: name: test_simple_2xs8
; CHECK: liveins: $x0		; CHECK: liveins: $x0
; CHECK: [[COPY:%[0-9]+]]:_(p0) = COPY $x0		; CHECK-NEXT: {{ $}}
; CHECK: [[C:%[0-9]+]]:_(s8) = G_CONSTANT i8 4		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x0
; CHECK: [[C1:%[0-9]+]]:_(s8) = G_CONSTANT i8 5		; CHECK-NEXT: [[C:%[0-9]+]]:_(s8) = G_CONSTANT i8 4
; CHECK: G_STORE [[C]](s8), [[COPY]](p0) :: (store (s8) into %ir.addr11)		; CHECK-NEXT: [[C1:%[0-9]+]]:_(s8) = G_CONSTANT i8 5
; CHECK: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 1		; CHECK-NEXT: G_STORE [[C]](s8), [[COPY]](p0) :: (store (s8) into %ir.addr11)
; CHECK: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C2]](s64)		; CHECK-NEXT: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
; CHECK: G_STORE [[C1]](s8), [[PTR_ADD]](p0) :: (store (s8) into %ir.addr2)		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C2]](s64)
; CHECK: RET_ReallyLR		; CHECK-NEXT: G_STORE [[C1]](s8), [[PTR_ADD]](p0) :: (store (s8) into %ir.addr2)
		; CHECK-NEXT: RET_ReallyLR
%0:_(p0) = COPY $x0		%0:_(p0) = COPY $x0
%1:_(s8) = G_CONSTANT i8 4		%1:_(s8) = G_CONSTANT i8 4
%4:_(s8) = G_CONSTANT i8 5		%4:_(s8) = G_CONSTANT i8 5
G_STORE %1(s8), %0(p0) :: (store (s8) into %ir.addr11)		G_STORE %1(s8), %0(p0) :: (store (s8) into %ir.addr11)
%2:_(s64) = G_CONSTANT i64 1		%2:_(s64) = G_CONSTANT i64 1
%3:_(p0) = G_PTR_ADD %0, %2(s64)		%3:_(p0) = G_PTR_ADD %0, %2(s64)
G_STORE %4(s8), %3(p0) :: (store (s8) into %ir.addr2)		G_STORE %4(s8), %3(p0) :: (store (s8) into %ir.addr2)
RET_ReallyLR		RET_ReallyLR
[... 9 lines not shown ...]	frameInfo:
maxAlignment: 1		maxAlignment: 1
machineFunctionInfo: {}		machineFunctionInfo: {}
body: |		body: |
bb.1 (%ir-block.0):		bb.1 (%ir-block.0):
liveins: $x0		liveins: $x0

; CHECK-LABEL: name: test_simple_2xs16		; CHECK-LABEL: name: test_simple_2xs16
; CHECK: liveins: $x0		; CHECK: liveins: $x0
; CHECK: [[COPY:%[0-9]+]]:_(p0) = COPY $x0		; CHECK-NEXT: {{ $}}
; CHECK: [[C:%[0-9]+]]:_(s16) = G_CONSTANT i16 4		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x0
; CHECK: [[C1:%[0-9]+]]:_(s16) = G_CONSTANT i16 5		; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 327684
; CHECK: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 2		; CHECK-NEXT: G_STORE [[C]](s32), [[COPY]](p0) :: (store (s32) into %ir.addr11, align 2)
; CHECK: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C2]](s64)		; CHECK-NEXT: RET_ReallyLR
; CHECK: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 327684
; CHECK: G_STORE [[C3]](s32), [[COPY]](p0) :: (store (s32) into %ir.addr11, align 2)
; CHECK: RET_ReallyLR
%0:_(p0) = COPY $x0		%0:_(p0) = COPY $x0
%1:_(s16) = G_CONSTANT i16 4		%1:_(s16) = G_CONSTANT i16 4
%4:_(s16) = G_CONSTANT i16 5		%4:_(s16) = G_CONSTANT i16 5
G_STORE %1(s16), %0(p0) :: (store (s16) into %ir.addr11)		G_STORE %1(s16), %0(p0) :: (store (s16) into %ir.addr11)
%2:_(s64) = G_CONSTANT i64 2		%2:_(s64) = G_CONSTANT i64 2
%3:_(p0) = G_PTR_ADD %0, %2(s64)		%3:_(p0) = G_PTR_ADD %0, %2(s64)
G_STORE %4(s16), %3(p0) :: (store (s16) into %ir.addr2)		G_STORE %4(s16), %3(p0) :: (store (s16) into %ir.addr2)
RET_ReallyLR		RET_ReallyLR
[... 9 lines not shown ...]	frameInfo:
maxAlignment: 1		maxAlignment: 1
machineFunctionInfo: {}		machineFunctionInfo: {}
body: |		body: |
bb.1 (%ir-block.0):		bb.1 (%ir-block.0):
liveins: $x0		liveins: $x0

; CHECK-LABEL: name: test_simple_4xs16		; CHECK-LABEL: name: test_simple_4xs16
; CHECK: liveins: $x0		; CHECK: liveins: $x0
; CHECK: [[COPY:%[0-9]+]]:_(p0) = COPY $x0		; CHECK-NEXT: {{ $}}
; CHECK: [[C:%[0-9]+]]:_(s16) = G_CONSTANT i16 4		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x0
; CHECK: [[C1:%[0-9]+]]:_(s16) = G_CONSTANT i16 5		; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 3940688328982532
; CHECK: [[C2:%[0-9]+]]:_(s16) = G_CONSTANT i16 9		; CHECK-NEXT: G_STORE [[C]](s64), [[COPY]](p0) :: (store (s64) into %ir.addr11, align 2)
; CHECK: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 14		; CHECK-NEXT: RET_ReallyLR
; CHECK: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
; CHECK: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C4]](s64)
; CHECK: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
; CHECK: [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C5]](s64)
; CHECK: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 6
; CHECK: [[PTR_ADD2:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C6]](s64)
; CHECK: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 3940688328982532
; CHECK: G_STORE [[C7]](s64), [[COPY]](p0) :: (store (s64) into %ir.addr11, align 2)
; CHECK: RET_ReallyLR
%0:_(p0) = COPY $x0		%0:_(p0) = COPY $x0
%1:_(s16) = G_CONSTANT i16 4		%1:_(s16) = G_CONSTANT i16 4
%4:_(s16) = G_CONSTANT i16 5		%4:_(s16) = G_CONSTANT i16 5
%7:_(s16) = G_CONSTANT i16 9		%7:_(s16) = G_CONSTANT i16 9
%10:_(s16) = G_CONSTANT i16 14		%10:_(s16) = G_CONSTANT i16 14
G_STORE %1(s16), %0(p0) :: (store (s16) into %ir.addr11)		G_STORE %1(s16), %0(p0) :: (store (s16) into %ir.addr11)
%2:_(s64) = G_CONSTANT i64 2		%2:_(s64) = G_CONSTANT i64 2
%3:_(p0) = G_PTR_ADD %0, %2(s64)		%3:_(p0) = G_PTR_ADD %0, %2(s64)
[... 17 lines not shown ...]	frameInfo:
maxAlignment: 1		maxAlignment: 1
machineFunctionInfo: {}		machineFunctionInfo: {}
body: |		body: |
bb.1 (%ir-block.0):		bb.1 (%ir-block.0):
liveins: $x0		liveins: $x0

; CHECK-LABEL: name: test_simple_2xs32		; CHECK-LABEL: name: test_simple_2xs32
; CHECK: liveins: $x0		; CHECK: liveins: $x0
; CHECK: [[COPY:%[0-9]+]]:_(p0) = COPY $x0		; CHECK-NEXT: {{ $}}
; CHECK: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 4		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x0
; CHECK: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 5		; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 21474836484
; CHECK: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 4		; CHECK-NEXT: G_STORE [[C]](s64), [[COPY]](p0) :: (store (s64) into %ir.addr11, align 4)
; CHECK: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C2]](s64)		; CHECK-NEXT: RET_ReallyLR
; CHECK: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 21474836484
; CHECK: G_STORE [[C3]](s64), [[COPY]](p0) :: (store (s64) into %ir.addr11, align 4)
; CHECK: RET_ReallyLR
%0:_(p0) = COPY $x0		%0:_(p0) = COPY $x0
%1:_(s32) = G_CONSTANT i32 4		%1:_(s32) = G_CONSTANT i32 4
%4:_(s32) = G_CONSTANT i32 5		%4:_(s32) = G_CONSTANT i32 5
G_STORE %1(s32), %0(p0) :: (store (s32) into %ir.addr11)		G_STORE %1(s32), %0(p0) :: (store (s32) into %ir.addr11)
%2:_(s64) = G_CONSTANT i64 4		%2:_(s64) = G_CONSTANT i64 4
%3:_(p0) = G_PTR_ADD %0, %2(s64)		%3:_(p0) = G_PTR_ADD %0, %2(s64)
G_STORE %4(s32), %3(p0) :: (store (s32) into %ir.addr2)		G_STORE %4(s32), %3(p0) :: (store (s32) into %ir.addr2)
RET_ReallyLR		RET_ReallyLR
[... 9 lines not shown ...]	frameInfo:
maxAlignment: 1		maxAlignment: 1
machineFunctionInfo: {}		machineFunctionInfo: {}
body: |		body: |
bb.1 (%ir-block.0):		bb.1 (%ir-block.0):
liveins: $x0		liveins: $x0

; CHECK-LABEL: name: test_simple_2xs64_illegal		; CHECK-LABEL: name: test_simple_2xs64_illegal
; CHECK: liveins: $x0		; CHECK: liveins: $x0
; CHECK: [[COPY:%[0-9]+]]:_(p0) = COPY $x0		; CHECK-NEXT: {{ $}}
; CHECK: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 4		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x0
; CHECK: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 5		; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
; CHECK: G_STORE [[C]](s64), [[COPY]](p0) :: (store (s64) into %ir.addr11)		; CHECK-NEXT: [[C1:%[0-9]+]]:_(s64) = G_CONSTANT i64 5
; CHECK: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 8		; CHECK-NEXT: G_STORE [[C]](s64), [[COPY]](p0) :: (store (s64) into %ir.addr11)
; CHECK: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C2]](s64)		; CHECK-NEXT: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 8
; CHECK: G_STORE [[C1]](s64), [[PTR_ADD]](p0) :: (store (s64) into %ir.addr2)		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C2]](s64)
; CHECK: RET_ReallyLR		; CHECK-NEXT: G_STORE [[C1]](s64), [[PTR_ADD]](p0) :: (store (s64) into %ir.addr2)
		; CHECK-NEXT: RET_ReallyLR
%0:_(p0) = COPY $x0		%0:_(p0) = COPY $x0
%1:_(s64) = G_CONSTANT i64 4		%1:_(s64) = G_CONSTANT i64 4
%4:_(s64) = G_CONSTANT i64 5		%4:_(s64) = G_CONSTANT i64 5
G_STORE %1(s64), %0(p0) :: (store (s64) into %ir.addr11)		G_STORE %1(s64), %0(p0) :: (store (s64) into %ir.addr11)
%2:_(s64) = G_CONSTANT i64 8		%2:_(s64) = G_CONSTANT i64 8
%3:_(p0) = G_PTR_ADD %0, %2(s64)		%3:_(p0) = G_PTR_ADD %0, %2(s64)
G_STORE %4(s64), %3(p0) :: (store (s64) into %ir.addr2)		G_STORE %4(s64), %3(p0) :: (store (s64) into %ir.addr2)
RET_ReallyLR		RET_ReallyLR
[... 9 lines not shown ...]	frameInfo:
maxAlignment: 1		maxAlignment: 1
machineFunctionInfo: {}		machineFunctionInfo: {}
body: |		body: |
bb.1 (%ir-block.0):		bb.1 (%ir-block.0):
liveins: $x0		liveins: $x0

; CHECK-LABEL: name: test_simple_vector		; CHECK-LABEL: name: test_simple_vector
; CHECK: liveins: $x0		; CHECK: liveins: $x0
; CHECK: [[COPY:%[0-9]+]]:_(p0) = COPY $x0		; CHECK-NEXT: {{ $}}
; CHECK: [[C:%[0-9]+]]:_(s16) = G_CONSTANT i16 4		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x0
; CHECK: [[C1:%[0-9]+]]:_(s16) = G_CONSTANT i16 7		; CHECK-NEXT: [[C:%[0-9]+]]:_(s16) = G_CONSTANT i16 4
; CHECK: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[C]](s16), [[C1]](s16)		; CHECK-NEXT: [[C1:%[0-9]+]]:_(s16) = G_CONSTANT i16 7
; CHECK: [[C2:%[0-9]+]]:_(s16) = G_CONSTANT i16 5		; CHECK-NEXT: [[BUILD_VECTOR:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[C]](s16), [[C1]](s16)
; CHECK: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 8		; CHECK-NEXT: [[C2:%[0-9]+]]:_(s16) = G_CONSTANT i16 5
; CHECK: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[C2]](s16), [[C3]](s16)		; CHECK-NEXT: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 8
; CHECK: G_STORE [[BUILD_VECTOR]](<2 x s16>), [[COPY]](p0) :: (store (<2 x s16>) into %ir.addr11)		; CHECK-NEXT: [[BUILD_VECTOR1:%[0-9]+]]:_(<2 x s16>) = G_BUILD_VECTOR [[C2]](s16), [[C3]](s16)
; CHECK: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 4		; CHECK-NEXT: G_STORE [[BUILD_VECTOR]](<2 x s16>), [[COPY]](p0) :: (store (<2 x s16>) into %ir.addr11)
; CHECK: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C4]](s64)		; CHECK-NEXT: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
; CHECK: G_STORE [[BUILD_VECTOR1]](<2 x s16>), [[PTR_ADD]](p0) :: (store (<2 x s16>) into %ir.addr2)		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C4]](s64)
; CHECK: RET_ReallyLR		; CHECK-NEXT: G_STORE [[BUILD_VECTOR1]](<2 x s16>), [[PTR_ADD]](p0) :: (store (<2 x s16>) into %ir.addr2)
		; CHECK-NEXT: RET_ReallyLR
%0:_(p0) = COPY $x0		%0:_(p0) = COPY $x0
%2:_(s16) = G_CONSTANT i16 4		%2:_(s16) = G_CONSTANT i16 4
%3:_(s16) = G_CONSTANT i16 7		%3:_(s16) = G_CONSTANT i16 7
%1:_(<2 x s16>) = G_BUILD_VECTOR %2(s16), %3(s16)		%1:_(<2 x s16>) = G_BUILD_VECTOR %2(s16), %3(s16)
%7:_(s16) = G_CONSTANT i16 5		%7:_(s16) = G_CONSTANT i16 5
%8:_(s16) = G_CONSTANT i16 8		%8:_(s16) = G_CONSTANT i16 8
%6:_(<2 x s16>) = G_BUILD_VECTOR %7(s16), %8(s16)		%6:_(<2 x s16>) = G_BUILD_VECTOR %7(s16), %8(s16)
G_STORE %1(<2 x s16>), %0(p0) :: (store (<2 x s16>) into %ir.addr11)		G_STORE %1(<2 x s16>), %0(p0) :: (store (<2 x s16>) into %ir.addr11)
[... 14 lines not shown ...]	frameInfo:
maxAlignment: 1		maxAlignment: 1
machineFunctionInfo: {}		machineFunctionInfo: {}
body: |		body: |
bb.1 (%ir-block.0):		bb.1 (%ir-block.0):
liveins: $x0, $x1		liveins: $x0, $x1

; CHECK-LABEL: name: test_unknown_alias		; CHECK-LABEL: name: test_unknown_alias
; CHECK: liveins: $x0, $x1		; CHECK: liveins: $x0, $x1
; CHECK: [[COPY:%[0-9]+]]:_(p0) = COPY $x0		; CHECK-NEXT: {{ $}}
; CHECK: [[COPY1:%[0-9]+]]:_(p0) = COPY $x1		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x0
; CHECK: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 4		; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(p0) = COPY $x1
; CHECK: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 5		; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
; CHECK: G_STORE [[C]](s32), [[COPY]](p0) :: (store (s32) into %ir.addr11)		; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 5
; CHECK: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY1]](p0) :: (load (s32) from %ir.aliasptr)		; CHECK-NEXT: G_STORE [[C]](s32), [[COPY]](p0) :: (store (s32) into %ir.addr11)
; CHECK: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 4		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[COPY1]](p0) :: (load (s32) from %ir.aliasptr)
; CHECK: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C2]](s64)		; CHECK-NEXT: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
; CHECK: G_STORE [[C1]](s32), [[PTR_ADD]](p0) :: (store (s32) into %ir.addr2)		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C2]](s64)
; CHECK: $w0 = COPY [[LOAD]](s32)		; CHECK-NEXT: G_STORE [[C1]](s32), [[PTR_ADD]](p0) :: (store (s32) into %ir.addr2)
; CHECK: RET_ReallyLR implicit $w0		; CHECK-NEXT: $w0 = COPY [[LOAD]](s32)
		; CHECK-NEXT: RET_ReallyLR implicit $w0
%0:_(p0) = COPY $x0		%0:_(p0) = COPY $x0
%1:_(p0) = COPY $x1		%1:_(p0) = COPY $x1
%2:_(s32) = G_CONSTANT i32 4		%2:_(s32) = G_CONSTANT i32 4
%6:_(s32) = G_CONSTANT i32 5		%6:_(s32) = G_CONSTANT i32 5
G_STORE %2(s32), %0(p0) :: (store (s32) into %ir.addr11)		G_STORE %2(s32), %0(p0) :: (store (s32) into %ir.addr11)
%3:_(s32) = G_LOAD %1(p0) :: (load (s32) from %ir.aliasptr)		%3:_(s32) = G_LOAD %1(p0) :: (load (s32) from %ir.aliasptr)
%4:_(s64) = G_CONSTANT i64 4		%4:_(s64) = G_CONSTANT i64 4
%5:_(p0) = G_PTR_ADD %0, %4(s64)		%5:_(p0) = G_PTR_ADD %0, %4(s64)
[... 13 lines not shown ...]	frameInfo:
maxAlignment: 1		maxAlignment: 1
machineFunctionInfo: {}		machineFunctionInfo: {}
body: |		body: |
bb.1 (%ir-block.0):		bb.1 (%ir-block.0):
liveins: $x0, $x1		liveins: $x0, $x1

; CHECK-LABEL: name: test_2x_2xs32		; CHECK-LABEL: name: test_2x_2xs32
; CHECK: liveins: $x0, $x1		; CHECK: liveins: $x0, $x1
; CHECK: [[COPY:%[0-9]+]]:_(p0) = COPY $x0		; CHECK-NEXT: {{ $}}
; CHECK: [[COPY1:%[0-9]+]]:_(p0) = COPY $x1		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x0
; CHECK: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 4		; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(p0) = COPY $x1
; CHECK: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 5		; CHECK-NEXT: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 4
; CHECK: [[C2:%[0-9]+]]:_(s32) = G_CONSTANT i32 9		; CHECK-NEXT: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 5
; CHECK: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 17		; CHECK-NEXT: G_STORE [[C]](s32), [[COPY]](p0) :: (store (s32) into %ir.addr11)
; CHECK: G_STORE [[C]](s32), [[COPY]](p0) :: (store (s32) into %ir.addr11)		; CHECK-NEXT: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
; CHECK: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 4		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C2]](s64)
; CHECK: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C4]](s64)		; CHECK-NEXT: G_STORE [[C1]](s32), [[PTR_ADD]](p0) :: (store (s32) into %ir.addr2)
; CHECK: G_STORE [[C1]](s32), [[PTR_ADD]](p0) :: (store (s32) into %ir.addr2)		; CHECK-NEXT: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 73014444041
; CHECK: [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY1]], [[C4]](s64)		; CHECK-NEXT: G_STORE [[C3]](s64), [[COPY1]](p0) :: (store (s64) into %ir.addr32, align 4)
; CHECK: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 73014444041		; CHECK-NEXT: RET_ReallyLR
; CHECK: G_STORE [[C5]](s64), [[COPY1]](p0) :: (store (s64) into %ir.addr32, align 4)
; CHECK: RET_ReallyLR
%0:_(p0) = COPY $x0		%0:_(p0) = COPY $x0
%1:_(p0) = COPY $x1		%1:_(p0) = COPY $x1
%2:_(s32) = G_CONSTANT i32 4		%2:_(s32) = G_CONSTANT i32 4
%5:_(s32) = G_CONSTANT i32 5		%5:_(s32) = G_CONSTANT i32 5
%6:_(s32) = G_CONSTANT i32 9		%6:_(s32) = G_CONSTANT i32 9
%8:_(s32) = G_CONSTANT i32 17		%8:_(s32) = G_CONSTANT i32 17
G_STORE %2(s32), %0(p0) :: (store (s32) into %ir.addr11)		G_STORE %2(s32), %0(p0) :: (store (s32) into %ir.addr11)
%3:_(s64) = G_CONSTANT i64 4		%3:_(s64) = G_CONSTANT i64 4
Show All 17 Lines	frameInfo:
maxAlignment: 1		maxAlignment: 1
machineFunctionInfo: {}		machineFunctionInfo: {}
body: \|		body: \|
bb.1 (%ir-block.0):		bb.1 (%ir-block.0):
liveins: $w1, $w2, $x0		liveins: $w1, $w2, $x0

; CHECK-LABEL: name: test_simple_var_2xs8		; CHECK-LABEL: name: test_simple_var_2xs8
; CHECK: liveins: $w1, $w2, $x0		; CHECK: liveins: $w1, $w2, $x0
; CHECK: [[COPY:%[0-9]+]]:_(p0) = COPY $x0		; CHECK-NEXT: {{ $}}
; CHECK: [[COPY1:%[0-9]+]]:_(s32) = COPY $w1		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x0
; CHECK: [[TRUNC:%[0-9]+]]:_(s8) = G_TRUNC [[COPY1]](s32)		; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
; CHECK: [[COPY2:%[0-9]+]]:_(s32) = COPY $w2		; CHECK-NEXT: [[TRUNC:%[0-9]+]]:_(s8) = G_TRUNC [[COPY1]](s32)
; CHECK: [[TRUNC1:%[0-9]+]]:_(s8) = G_TRUNC [[COPY2]](s32)		; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $w2
; CHECK: G_STORE [[TRUNC]](s8), [[COPY]](p0) :: (store (s8) into %ir.addr11)		; CHECK-NEXT: [[TRUNC1:%[0-9]+]]:_(s8) = G_TRUNC [[COPY2]](s32)
; CHECK: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1		; CHECK-NEXT: G_STORE [[TRUNC]](s8), [[COPY]](p0) :: (store (s8) into %ir.addr11)
; CHECK: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C]](s64)		; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 1
; CHECK: G_STORE [[TRUNC1]](s8), [[PTR_ADD]](p0) :: (store (s8) into %ir.addr2)		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C]](s64)
; CHECK: RET_ReallyLR		; CHECK-NEXT: G_STORE [[TRUNC1]](s8), [[PTR_ADD]](p0) :: (store (s8) into %ir.addr2)
		; CHECK-NEXT: RET_ReallyLR
%0:_(p0) = COPY $x0		%0:_(p0) = COPY $x0
%3:_(s32) = COPY $w1		%3:_(s32) = COPY $w1
%1:_(s8) = G_TRUNC %3(s32)		%1:_(s8) = G_TRUNC %3(s32)
%4:_(s32) = COPY $w2		%4:_(s32) = COPY $w2
%2:_(s8) = G_TRUNC %4(s32)		%2:_(s8) = G_TRUNC %4(s32)
G_STORE %1(s8), %0(p0) :: (store (s8) into %ir.addr11)		G_STORE %1(s8), %0(p0) :: (store (s8) into %ir.addr11)
%5:_(s64) = G_CONSTANT i64 1		%5:_(s64) = G_CONSTANT i64 1
%6:_(p0) = G_PTR_ADD %0, %5(s64)		%6:_(p0) = G_PTR_ADD %0, %5(s64)
Show All 13 Lines	frameInfo:
maxAlignment: 1		maxAlignment: 1
machineFunctionInfo: {}		machineFunctionInfo: {}
body: \|		body: \|
bb.1 (%ir-block.0):		bb.1 (%ir-block.0):
liveins: $w1, $w2, $x0		liveins: $w1, $w2, $x0

; CHECK-LABEL: name: test_simple_var_2xs16		; CHECK-LABEL: name: test_simple_var_2xs16
; CHECK: liveins: $w1, $w2, $x0		; CHECK: liveins: $w1, $w2, $x0
; CHECK: [[COPY:%[0-9]+]]:_(p0) = COPY $x0		; CHECK-NEXT: {{ $}}
; CHECK: [[COPY1:%[0-9]+]]:_(s32) = COPY $w1		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x0
; CHECK: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[COPY1]](s32)		; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
; CHECK: [[COPY2:%[0-9]+]]:_(s32) = COPY $w2		; CHECK-NEXT: [[TRUNC:%[0-9]+]]:_(s16) = G_TRUNC [[COPY1]](s32)
; CHECK: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[COPY2]](s32)		; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $w2
; CHECK: G_STORE [[TRUNC]](s16), [[COPY]](p0) :: (store (s16) into %ir.addr11)		; CHECK-NEXT: [[TRUNC1:%[0-9]+]]:_(s16) = G_TRUNC [[COPY2]](s32)
; CHECK: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2		; CHECK-NEXT: G_STORE [[TRUNC]](s16), [[COPY]](p0) :: (store (s16) into %ir.addr11)
; CHECK: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C]](s64)		; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
; CHECK: G_STORE [[TRUNC1]](s16), [[PTR_ADD]](p0) :: (store (s16) into %ir.addr2)		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C]](s64)
; CHECK: RET_ReallyLR		; CHECK-NEXT: G_STORE [[TRUNC1]](s16), [[PTR_ADD]](p0) :: (store (s16) into %ir.addr2)
		; CHECK-NEXT: RET_ReallyLR
%0:_(p0) = COPY $x0		%0:_(p0) = COPY $x0
%3:_(s32) = COPY $w1		%3:_(s32) = COPY $w1
%1:_(s16) = G_TRUNC %3(s32)		%1:_(s16) = G_TRUNC %3(s32)
%4:_(s32) = COPY $w2		%4:_(s32) = COPY $w2
%2:_(s16) = G_TRUNC %4(s32)		%2:_(s16) = G_TRUNC %4(s32)
G_STORE %1(s16), %0(p0) :: (store (s16) into %ir.addr11)		G_STORE %1(s16), %0(p0) :: (store (s16) into %ir.addr11)
%5:_(s64) = G_CONSTANT i64 2		%5:_(s64) = G_CONSTANT i64 2
%6:_(p0) = G_PTR_ADD %0, %5(s64)		%6:_(p0) = G_PTR_ADD %0, %5(s64)
Show All 13 Lines	frameInfo:
maxAlignment: 1		maxAlignment: 1
machineFunctionInfo: {}		machineFunctionInfo: {}
body: \|		body: \|
bb.1 (%ir-block.0):		bb.1 (%ir-block.0):
liveins: $w1, $w2, $x0		liveins: $w1, $w2, $x0

; CHECK-LABEL: name: test_simple_var_2xs32		; CHECK-LABEL: name: test_simple_var_2xs32
; CHECK: liveins: $w1, $w2, $x0		; CHECK: liveins: $w1, $w2, $x0
; CHECK: [[COPY:%[0-9]+]]:_(p0) = COPY $x0		; CHECK-NEXT: {{ $}}
; CHECK: [[COPY1:%[0-9]+]]:_(s32) = COPY $w1		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x0
; CHECK: [[COPY2:%[0-9]+]]:_(s32) = COPY $w2		; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(s32) = COPY $w1
; CHECK: G_STORE [[COPY1]](s32), [[COPY]](p0) :: (store (s32) into %ir.addr11)		; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(s32) = COPY $w2
; CHECK: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 4		; CHECK-NEXT: G_STORE [[COPY1]](s32), [[COPY]](p0) :: (store (s32) into %ir.addr11)
; CHECK: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C]](s64)		; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
; CHECK: G_STORE [[COPY2]](s32), [[PTR_ADD]](p0) :: (store (s32) into %ir.addr2)		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C]](s64)
; CHECK: RET_ReallyLR		; CHECK-NEXT: G_STORE [[COPY2]](s32), [[PTR_ADD]](p0) :: (store (s32) into %ir.addr2)
		; CHECK-NEXT: RET_ReallyLR
%0:_(p0) = COPY $x0		%0:_(p0) = COPY $x0
%1:_(s32) = COPY $w1		%1:_(s32) = COPY $w1
%2:_(s32) = COPY $w2		%2:_(s32) = COPY $w2
G_STORE %1(s32), %0(p0) :: (store (s32) into %ir.addr11)		G_STORE %1(s32), %0(p0) :: (store (s32) into %ir.addr11)
%3:_(s64) = G_CONSTANT i64 4		%3:_(s64) = G_CONSTANT i64 4
%4:_(p0) = G_PTR_ADD %0, %3(s64)		%4:_(p0) = G_PTR_ADD %0, %3(s64)
G_STORE %2(s32), %4(p0) :: (store (s32) into %ir.addr2)		G_STORE %2(s32), %4(p0) :: (store (s32) into %ir.addr2)
RET_ReallyLR		RET_ReallyLR
Show All 13 Lines	body: \|
bb.1 (%ir-block.0):		bb.1 (%ir-block.0):
liveins: $x0, $x1		liveins: $x0, $x1

; The store to ptr2 prevents merging into a single store.		; The store to ptr2 prevents merging into a single store.
; We can still merge the stores into addr1 and addr2.		; We can still merge the stores into addr1 and addr2.

; CHECK-LABEL: name: test_alias_4xs16		; CHECK-LABEL: name: test_alias_4xs16
; CHECK: liveins: $x0, $x1		; CHECK: liveins: $x0, $x1
; CHECK: [[COPY:%[0-9]+]]:_(p0) = COPY $x0		; CHECK-NEXT: {{ $}}
; CHECK: [[COPY1:%[0-9]+]]:_(p0) = COPY $x1		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x0
; CHECK: [[C:%[0-9]+]]:_(s16) = G_CONSTANT i16 4		; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(p0) = COPY $x1
; CHECK: [[C1:%[0-9]+]]:_(s16) = G_CONSTANT i16 5		; CHECK-NEXT: [[C:%[0-9]+]]:_(s16) = G_CONSTANT i16 9
; CHECK: [[C2:%[0-9]+]]:_(s16) = G_CONSTANT i16 9		; CHECK-NEXT: [[C1:%[0-9]+]]:_(s16) = G_CONSTANT i16 0
; CHECK: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 0		; CHECK-NEXT: [[C2:%[0-9]+]]:_(s16) = G_CONSTANT i16 14
; CHECK: [[C4:%[0-9]+]]:_(s16) = G_CONSTANT i16 14		; CHECK-NEXT: [[C3:%[0-9]+]]:_(s32) = G_CONSTANT i32 327684
; CHECK: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 2		; CHECK-NEXT: G_STORE [[C3]](s32), [[COPY]](p0) :: (store (s32) into %ir.addr11, align 2)
; CHECK: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C5]](s64)		; CHECK-NEXT: [[C4:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
; CHECK: [[C6:%[0-9]+]]:_(s32) = G_CONSTANT i32 327684		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C4]](s64)
; CHECK: G_STORE [[C6]](s32), [[COPY]](p0) :: (store (s32) into %ir.addr11, align 2)		; CHECK-NEXT: G_STORE [[C]](s16), [[PTR_ADD]](p0) :: (store (s16) into %ir.addr3)
; CHECK: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 4		; CHECK-NEXT: G_STORE [[C1]](s16), [[COPY1]](p0) :: (store (s16) into %ir.ptr2)
; CHECK: [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C7]](s64)		; CHECK-NEXT: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 6
; CHECK: G_STORE [[C2]](s16), [[PTR_ADD1]](p0) :: (store (s16) into %ir.addr3)		; CHECK-NEXT: [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C5]](s64)
; CHECK: G_STORE [[C3]](s16), [[COPY1]](p0) :: (store (s16) into %ir.ptr2)		; CHECK-NEXT: G_STORE [[C2]](s16), [[PTR_ADD1]](p0) :: (store (s16) into %ir.addr4)
; CHECK: [[C8:%[0-9]+]]:_(s64) = G_CONSTANT i64 6		; CHECK-NEXT: RET_ReallyLR
; CHECK: [[PTR_ADD2:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C8]](s64)
; CHECK: G_STORE [[C4]](s16), [[PTR_ADD2]](p0) :: (store (s16) into %ir.addr4)
; CHECK: RET_ReallyLR
%0:_(p0) = COPY $x0		%0:_(p0) = COPY $x0
%1:_(p0) = COPY $x1		%1:_(p0) = COPY $x1
%2:_(s16) = G_CONSTANT i16 4		%2:_(s16) = G_CONSTANT i16 4
%5:_(s16) = G_CONSTANT i16 5		%5:_(s16) = G_CONSTANT i16 5
%8:_(s16) = G_CONSTANT i16 9		%8:_(s16) = G_CONSTANT i16 9
%9:_(s16) = G_CONSTANT i16 0		%9:_(s16) = G_CONSTANT i16 0
%12:_(s16) = G_CONSTANT i16 14		%12:_(s16) = G_CONSTANT i16 14
G_STORE %2(s16), %0(p0) :: (store (s16) into %ir.addr11)		G_STORE %2(s16), %0(p0) :: (store (s16) into %ir.addr11)
Show All 22 Lines	frameInfo:
maxAlignment: 1		maxAlignment: 1
machineFunctionInfo: {}		machineFunctionInfo: {}
body: \|		body: \|
bb.1 (%ir-block.0):		bb.1 (%ir-block.0):
liveins: $x0, $x1, $x2		liveins: $x0, $x1, $x2
; Here store of 5 and 9 can be merged, others have aliasing barriers.		; Here store of 5 and 9 can be merged, others have aliasing barriers.
; CHECK-LABEL: name: test_alias2_4xs16		; CHECK-LABEL: name: test_alias2_4xs16
; CHECK: liveins: $x0, $x1, $x2		; CHECK: liveins: $x0, $x1, $x2
; CHECK: [[COPY:%[0-9]+]]:_(p0) = COPY $x0		; CHECK-NEXT: {{ $}}
; CHECK: [[COPY1:%[0-9]+]]:_(p0) = COPY $x1		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x0
; CHECK: [[COPY2:%[0-9]+]]:_(p0) = COPY $x2		; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(p0) = COPY $x1
; CHECK: [[C:%[0-9]+]]:_(s16) = G_CONSTANT i16 4		; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(p0) = COPY $x2
; CHECK: [[C1:%[0-9]+]]:_(s16) = G_CONSTANT i16 0		; CHECK-NEXT: [[C:%[0-9]+]]:_(s16) = G_CONSTANT i16 4
; CHECK: [[C2:%[0-9]+]]:_(s16) = G_CONSTANT i16 5		; CHECK-NEXT: [[C1:%[0-9]+]]:_(s16) = G_CONSTANT i16 0
; CHECK: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 9		; CHECK-NEXT: [[C2:%[0-9]+]]:_(s16) = G_CONSTANT i16 14
; CHECK: [[C4:%[0-9]+]]:_(s16) = G_CONSTANT i16 14		; CHECK-NEXT: G_STORE [[C]](s16), [[COPY]](p0) :: (store (s16) into %ir.addr11)
; CHECK: G_STORE [[C]](s16), [[COPY]](p0) :: (store (s16) into %ir.addr11)		; CHECK-NEXT: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
; CHECK: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 2		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C3]](s64)
; CHECK: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C5]](s64)		; CHECK-NEXT: G_STORE [[C1]](s16), [[COPY2]](p0) :: (store (s16) into %ir.ptr3)
; CHECK: G_STORE [[C1]](s16), [[COPY2]](p0) :: (store (s16) into %ir.ptr3)		; CHECK-NEXT: [[C4:%[0-9]+]]:_(s32) = G_CONSTANT i32 589829
; CHECK: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 4		; CHECK-NEXT: G_STORE [[C4]](s32), [[PTR_ADD]](p0) :: (store (s32) into %ir.addr2, align 2)
; CHECK: [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C6]](s64)		; CHECK-NEXT: G_STORE [[C1]](s16), [[COPY1]](p0) :: (store (s16) into %ir.ptr2)
; CHECK: [[C7:%[0-9]+]]:_(s32) = G_CONSTANT i32 589829		; CHECK-NEXT: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 6
; CHECK: G_STORE [[C7]](s32), [[PTR_ADD]](p0) :: (store (s32) into %ir.addr2, align 2)		; CHECK-NEXT: [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C5]](s64)
; CHECK: G_STORE [[C1]](s16), [[COPY1]](p0) :: (store (s16) into %ir.ptr2)		; CHECK-NEXT: G_STORE [[C2]](s16), [[PTR_ADD1]](p0) :: (store (s16) into %ir.addr4)
; CHECK: [[C8:%[0-9]+]]:_(s64) = G_CONSTANT i64 6		; CHECK-NEXT: RET_ReallyLR
; CHECK: [[PTR_ADD2:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C8]](s64)
; CHECK: G_STORE [[C4]](s16), [[PTR_ADD2]](p0) :: (store (s16) into %ir.addr4)
; CHECK: RET_ReallyLR
%0:_(p0) = COPY $x0		%0:_(p0) = COPY $x0
%1:_(p0) = COPY $x1		%1:_(p0) = COPY $x1
%2:_(p0) = COPY $x2		%2:_(p0) = COPY $x2
%3:_(s16) = G_CONSTANT i16 4		%3:_(s16) = G_CONSTANT i16 4
%6:_(s16) = G_CONSTANT i16 0		%6:_(s16) = G_CONSTANT i16 0
%7:_(s16) = G_CONSTANT i16 5		%7:_(s16) = G_CONSTANT i16 5
%10:_(s16) = G_CONSTANT i16 9		%10:_(s16) = G_CONSTANT i16 9
%13:_(s16) = G_CONSTANT i16 14		%13:_(s16) = G_CONSTANT i16 14
Show All 27 Lines
body: \|		body: \|
bb.1 (%ir-block.0):		bb.1 (%ir-block.0):
liveins: $x0, $x1, $x2, $x3		liveins: $x0, $x1, $x2, $x3

; No merging can be done here.		; No merging can be done here.

; CHECK-LABEL: name: test_alias3_4xs16		; CHECK-LABEL: name: test_alias3_4xs16
; CHECK: liveins: $x0, $x1, $x2, $x3		; CHECK: liveins: $x0, $x1, $x2, $x3
; CHECK: [[COPY:%[0-9]+]]:_(p0) = COPY $x0		; CHECK-NEXT: {{ $}}
; CHECK: [[COPY1:%[0-9]+]]:_(p0) = COPY $x1		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x0
; CHECK: [[COPY2:%[0-9]+]]:_(p0) = COPY $x2		; CHECK-NEXT: [[COPY1:%[0-9]+]]:_(p0) = COPY $x1
; CHECK: [[COPY3:%[0-9]+]]:_(p0) = COPY $x3		; CHECK-NEXT: [[COPY2:%[0-9]+]]:_(p0) = COPY $x2
; CHECK: [[C:%[0-9]+]]:_(s16) = G_CONSTANT i16 4		; CHECK-NEXT: [[COPY3:%[0-9]+]]:_(p0) = COPY $x3
; CHECK: [[C1:%[0-9]+]]:_(s16) = G_CONSTANT i16 0		; CHECK-NEXT: [[C:%[0-9]+]]:_(s16) = G_CONSTANT i16 4
; CHECK: [[C2:%[0-9]+]]:_(s16) = G_CONSTANT i16 5		; CHECK-NEXT: [[C1:%[0-9]+]]:_(s16) = G_CONSTANT i16 0
; CHECK: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 9		; CHECK-NEXT: [[C2:%[0-9]+]]:_(s16) = G_CONSTANT i16 5
; CHECK: [[C4:%[0-9]+]]:_(s16) = G_CONSTANT i16 14		; CHECK-NEXT: [[C3:%[0-9]+]]:_(s16) = G_CONSTANT i16 9
; CHECK: G_STORE [[C]](s16), [[COPY]](p0) :: (store (s16) into %ir.addr11)		; CHECK-NEXT: [[C4:%[0-9]+]]:_(s16) = G_CONSTANT i16 14
; CHECK: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 2		; CHECK-NEXT: G_STORE [[C]](s16), [[COPY]](p0) :: (store (s16) into %ir.addr11)
; CHECK: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C5]](s64)		; CHECK-NEXT: [[C5:%[0-9]+]]:_(s64) = G_CONSTANT i64 2
; CHECK: G_STORE [[C1]](s16), [[COPY2]](p0) :: (store (s16) into %ir.ptr3)		; CHECK-NEXT: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C5]](s64)
; CHECK: G_STORE [[C2]](s16), [[PTR_ADD]](p0) :: (store (s16) into %ir.addr2)		; CHECK-NEXT: G_STORE [[C1]](s16), [[COPY2]](p0) :: (store (s16) into %ir.ptr3)
; CHECK: G_STORE [[C1]](s16), [[COPY3]](p0) :: (store (s16) into %ir.ptr4)		; CHECK-NEXT: G_STORE [[C2]](s16), [[PTR_ADD]](p0) :: (store (s16) into %ir.addr2)
; CHECK: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 4		; CHECK-NEXT: G_STORE [[C1]](s16), [[COPY3]](p0) :: (store (s16) into %ir.ptr4)
; CHECK: [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C6]](s64)		; CHECK-NEXT: [[C6:%[0-9]+]]:_(s64) = G_CONSTANT i64 4
; CHECK: G_STORE [[C3]](s16), [[PTR_ADD1]](p0) :: (store (s16) into %ir.addr3)		; CHECK-NEXT: [[PTR_ADD1:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C6]](s64)
; CHECK: G_STORE [[C1]](s16), [[COPY1]](p0) :: (store (s16) into %ir.ptr2)		; CHECK-NEXT: G_STORE [[C3]](s16), [[PTR_ADD1]](p0) :: (store (s16) into %ir.addr3)
; CHECK: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 6		; CHECK-NEXT: G_STORE [[C1]](s16), [[COPY1]](p0) :: (store (s16) into %ir.ptr2)
; CHECK: [[PTR_ADD2:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C7]](s64)		; CHECK-NEXT: [[C7:%[0-9]+]]:_(s64) = G_CONSTANT i64 6
; CHECK: G_STORE [[C4]](s16), [[PTR_ADD2]](p0) :: (store (s16) into %ir.addr4)		; CHECK-NEXT: [[PTR_ADD2:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C7]](s64)
; CHECK: RET_ReallyLR		; CHECK-NEXT: G_STORE [[C4]](s16), [[PTR_ADD2]](p0) :: (store (s16) into %ir.addr4)
		; CHECK-NEXT: RET_ReallyLR
%0:_(p0) = COPY $x0		%0:_(p0) = COPY $x0
%1:_(p0) = COPY $x1		%1:_(p0) = COPY $x1
%2:_(p0) = COPY $x2		%2:_(p0) = COPY $x2
%3:_(p0) = COPY $x3		%3:_(p0) = COPY $x3
%4:_(s16) = G_CONSTANT i16 4		%4:_(s16) = G_CONSTANT i16 4
%7:_(s16) = G_CONSTANT i16 0		%7:_(s16) = G_CONSTANT i16 0
%8:_(s16) = G_CONSTANT i16 5		%8:_(s16) = G_CONSTANT i16 5
%11:_(s16) = G_CONSTANT i16 9		%11:_(s16) = G_CONSTANT i16 9
Show All 29 Lines
body: \|		body: \|
bb.1 (%ir-block.0):		bb.1 (%ir-block.0):
liveins: $x0		liveins: $x0

; Can merge because the load is from a different alloca and can't alias.		; Can merge because the load is from a different alloca and can't alias.

; CHECK-LABEL: name: test_alias_allocas_2xs32		; CHECK-LABEL: name: test_alias_allocas_2xs32
; CHECK: liveins: $x0		; CHECK: liveins: $x0
; CHECK: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 4		; CHECK-NEXT: {{ $}}
; CHECK: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 5		; CHECK-NEXT: [[FRAME_INDEX:%[0-9]+]]:_(p0) = G_FRAME_INDEX %stack.0.a1
; CHECK: [[FRAME_INDEX:%[0-9]+]]:_(p0) = G_FRAME_INDEX %stack.0.a1		; CHECK-NEXT: [[FRAME_INDEX1:%[0-9]+]]:_(p0) = G_FRAME_INDEX %stack.1.a2
; CHECK: [[FRAME_INDEX1:%[0-9]+]]:_(p0) = G_FRAME_INDEX %stack.1.a2		; CHECK-NEXT: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[FRAME_INDEX1]](p0) :: (dereferenceable load (s32) from %ir.a2)
; CHECK: [[LOAD:%[0-9]+]]:_(s32) = G_LOAD [[FRAME_INDEX1]](p0) :: (dereferenceable load (s32) from %ir.a2)		; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 21474836484
; CHECK: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 4		; CHECK-NEXT: G_STORE [[C]](s64), [[FRAME_INDEX]](p0) :: (store (s64) into %ir.addr11, align 4)
; CHECK: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[FRAME_INDEX]], [[C2]](s64)		; CHECK-NEXT: $w0 = COPY [[LOAD]](s32)
; CHECK: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 21474836484		; CHECK-NEXT: RET_ReallyLR implicit $w0
; CHECK: G_STORE [[C3]](s64), [[FRAME_INDEX]](p0) :: (store (s64) into %ir.addr11, align 4)
; CHECK: $w0 = COPY [[LOAD]](s32)
; CHECK: RET_ReallyLR implicit $w0
%3:_(s32) = G_CONSTANT i32 4		%3:_(s32) = G_CONSTANT i32 4
%7:_(s32) = G_CONSTANT i32 5		%7:_(s32) = G_CONSTANT i32 5
%1:_(p0) = G_FRAME_INDEX %stack.0.a1		%1:_(p0) = G_FRAME_INDEX %stack.0.a1
%2:_(p0) = G_FRAME_INDEX %stack.1.a2		%2:_(p0) = G_FRAME_INDEX %stack.1.a2
G_STORE %3(s32), %1(p0) :: (store (s32) into %ir.addr11)		G_STORE %3(s32), %1(p0) :: (store (s32) into %ir.addr11)
%4:_(s32) = G_LOAD %2(p0) :: (dereferenceable load (s32) from %ir.a2)		%4:_(s32) = G_LOAD %2(p0) :: (dereferenceable load (s32) from %ir.a2)
%5:_(s64) = G_CONSTANT i64 4		%5:_(s64) = G_CONSTANT i64 4
%6:_(p0) = G_PTR_ADD %1, %5(s64)		%6:_(p0) = G_PTR_ADD %1, %5(s64)
Show All 12 Lines	frameInfo:
maxAlignment: 1		maxAlignment: 1
machineFunctionInfo: {}		machineFunctionInfo: {}
body: \|		body: \|
bb.1 (%ir-block.0):		bb.1 (%ir-block.0):
liveins: $x0		liveins: $x0

; CHECK-LABEL: name: test_simple_2xs32_with_align		; CHECK-LABEL: name: test_simple_2xs32_with_align
; CHECK: liveins: $x0		; CHECK: liveins: $x0
; CHECK: [[COPY:%[0-9]+]]:_(p0) = COPY $x0		; CHECK-NEXT: {{ $}}
; CHECK: [[C:%[0-9]+]]:_(s32) = G_CONSTANT i32 4		; CHECK-NEXT: [[COPY:%[0-9]+]]:_(p0) = COPY $x0
; CHECK: [[C1:%[0-9]+]]:_(s32) = G_CONSTANT i32 5		; CHECK-NEXT: [[C:%[0-9]+]]:_(s64) = G_CONSTANT i64 21474836484
; CHECK: [[C2:%[0-9]+]]:_(s64) = G_CONSTANT i64 4		; CHECK-NEXT: G_STORE [[C]](s64), [[COPY]](p0) :: (store (s64) into %ir.addr11, align 2)
; CHECK: [[PTR_ADD:%[0-9]+]]:_(p0) = G_PTR_ADD [[COPY]], [[C2]](s64)		; CHECK-NEXT: RET_ReallyLR
; CHECK: [[C3:%[0-9]+]]:_(s64) = G_CONSTANT i64 21474836484
; CHECK: G_STORE [[C3]](s64), [[COPY]](p0) :: (store (s64) into %ir.addr11, align 2)
; CHECK: RET_ReallyLR
%0:_(p0) = COPY $x0		%0:_(p0) = COPY $x0
%1:_(s32) = G_CONSTANT i32 4		%1:_(s32) = G_CONSTANT i32 4
%4:_(s32) = G_CONSTANT i32 5		%4:_(s32) = G_CONSTANT i32 5
G_STORE %1(s32), %0(p0) :: (store (s32) into %ir.addr11, align 2)		G_STORE %1(s32), %0(p0) :: (store (s32) into %ir.addr11, align 2)
%2:_(s64) = G_CONSTANT i64 4		%2:_(s64) = G_CONSTANT i64 4
%3:_(p0) = G_PTR_ADD %0, %2(s64)		%3:_(p0) = G_PTR_ADD %0, %2(s64)
G_STORE %4(s32), %3(p0) :: (store (s32) into %ir.addr2, align 2)		G_STORE %4(s32), %3(p0) :: (store (s32) into %ir.addr2, align 2)
RET_ReallyLR		RET_ReallyLR

...		...
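Note on the merged constants in the CHECK lines above: the wide G_CONSTANT values look like the narrow constants packed in little-endian order, with the store at the lowest address in the low bits. The sketch below reproduces them under that assumption. The helper name `merge_constants` is hypothetical and is not part of LoadStoreOpt; it only illustrates the arithmetic.

```python
def merge_constants(values, elem_bits):
    """Pack narrow constants into one wide integer.

    Assumes a little-endian target: values[0] is the store at the lowest
    address and therefore lands in the least significant bits.
    """
    mask = (1 << elem_bits) - 1
    merged = 0
    for index, value in enumerate(values):
        merged |= (value & mask) << (index * elem_bits)
    return merged


# test_alias_4xs16: s16 stores of 4 and 5 become one s32 store.
print(merge_constants([4, 5], 16))    # 327684
# test_alias2_4xs16: s16 stores of 5 and 9 become one s32 store.
print(merge_constants([5, 9], 16))    # 589829
# test_alias_allocas_2xs32 and test_simple_2xs32_with_align: s32 stores of 4 and 5.
print(merge_constants([4, 5], 32))    # 21474836484
# test_2x_2xs32: s32 stores of 9 and 17 through the second pointer.
print(merge_constants([9, 17], 32))   # 73014444041
```

This is a quick way to sanity-check an updated CHECK line by hand when a test's constants change.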

Revision Contents

Diff 512882

llvm/include/llvm/CodeGen/GlobalISel/CombinerHelper.h

llvm/include/llvm/CodeGen/GlobalISel/LoadStoreOpt.h

llvm/include/llvm/Target/GlobalISel/Combine.td

llvm/lib/CodeGen/GlobalISel/CombinerHelper.cpp

llvm/lib/CodeGen/GlobalISel/LoadStoreOpt.cpp

llvm/test/CodeGen/AArch64/GlobalISel/merge-stores-truncating.ll

llvm/test/CodeGen/AArch64/GlobalISel/merge-stores-truncating.mir

llvm/test/CodeGen/AArch64/GlobalISel/store-merging-debug.mir

llvm/test/CodeGen/AArch64/GlobalISel/store-merging.mir
