This is an archive of the discontinued LLVM Phabricator instance.

Differential D62301

Fold Address Computations into Load/Store instructions for AArch64
Needs Review · Public

Authored by ramred01 on May 23 2019, 3:54 AM.

Details

Reviewers

t.p.northover
evandro

Summary

When compiling with -Oz, indexed or offset address computations (with or without scaling) for load/store instructions are not folded into the load/store instructions on AArch64, even though the AArch64 ISA has load/store addressing modes that allow indexed/offset addresses with or without scaling.

This has been fixed by identifying the load/store instructions and then scanning backwards for the ADD/SUB instructions that perform the address computation for the load/store instructions. Thereafter, we replace the ADD/SUB and the Load/Store instruction with an appropriate Load/Store instruction, which has the address computation folded in.
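
The backward scan described in the summary can be sketched on a toy instruction model. This is an illustrative sketch only: `Inst`, `Kind`, `Folded` and `findFoldableAdd` are hypothetical names that do not appear in the patch or in LLVM, and the sketch assumes the address register has no other users, so the ADD can be deleted once it is folded.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-ins for machine instructions; these are NOT LLVM types.
enum class Kind { AddShifted, Load, Store, Other };

struct Inst {
  Kind kind;
  int dst;        // register written; for Store, the register being stored
  int base;       // AddShifted: first source; Load/Store: address register
  int index;      // AddShifted: shifted second source; -1 otherwise
  unsigned shift; // AddShifted: left-shift amount applied to `index`
};

struct Folded {
  std::size_t addPos; // position of the ADD that can be folded
  int base;
  int index;
  unsigned shift;
};

// Scan backwards from the memory operation at `memPos`, looking at no more
// than `limit` instructions, for the shifted ADD that defines its address
// register. The fold is rejected if an instruction in between overwrites
// the address register or one of the ADD's source registers.
bool findFoldableAdd(const std::vector<Inst> &block, std::size_t memPos,
                     unsigned limit, Folded &out) {
  assert(memPos < block.size() && "memPos must index into the block");
  const Inst &mem = block[memPos];
  if (mem.kind != Kind::Load && mem.kind != Kind::Store)
    return false;

  std::vector<int> clobbered; // registers written after the candidate
  unsigned scanned = 0;
  for (std::size_t i = memPos; i-- > 0 && scanned < limit; ++scanned) {
    const Inst &cand = block[i];
    if (cand.kind == Kind::AddShifted && cand.dst == mem.base) {
      for (int r : clobbered)
        if (r == cand.base || r == cand.index)
          return false; // a source changed before the memory operation
      out = Folded{i, cand.base, cand.index, cand.shift};
      return true;
    }
    if (cand.kind != Kind::Store && cand.dst >= 0) {
      if (cand.dst == mem.base)
        return false; // address register defined by something else
      clobbered.push_back(cand.dst);
    }
  }
  return false;
}
```

For the pair `add x8, x9, x8, lsl #3` followed by `str xzr, [x8]`, the sketch reports base register 9, index register 8 and shift 3, which corresponds to `str xzr, [x9, x8, lsl #3]`.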

Diff Detail

Event Timeline

ramred01 created this revision. May 23 2019, 3:54 AM

Herald added subscribers: kristof.beyls, javed.absar. May 23 2019, 3:54 AM

t.p.northover added inline comments. May 23 2019, 5:32 AM

lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp
1388–1389	I think these names could be improved. Maybe `AddrCalcI` and `MemI`?
1401–1402	I don't see why we're checking ADDWrs here. This is an address computation so it'll always be at 64-bits. Even if it hypothetically wasn't (say with 32-bit pointers) the size of the arithmetic has no bearing on the type being stored, which is what the W in STRWui and LDRWui refer to.
1419–1420	Are you planning to add more variants in the immediate future? If not, it's probably best to convert this to an assertion.
1437	Can we start sentences with a capital letter please.
1449	This is not a useful comment.
1473	I don't believe this is reachable. It's certainly wrong: no ldp or stp instructions allow the sum of two registers, let alone with a shift.
1499	What's the benefit of doing the transformation at all if we don't get to remove the update instruction? I think there should be an earlier check that the load/store is the only user before doing anything. If it remains, the test should probably be whether the register is killed by its use in `I`.
1511–1513	I don't think this is the right place for your code. What you're looking for is a fundamentally but not necessarily different set of mergeability criteria. You could easily imagine someone extending this function to support `ADDXrs` with the intention to apply it in the same way as ADDXri, and that's what this function should be saved for. Instead, I think we need a distinctive pair of names for what you're doing and the existing transformation. Perhaps `mergeUpdateToIndexedMemOp` and `mergeUpdateToSingleUser`, though I'm not entirely happy with those and I'll think some more.
1518–1519	LLVM style braces start on the same line as the case.
1523–1524	I think this needs significantly better validation. The acceptable shifts depend strongly on what load or store this is being folded into. Non-paired loads only allow shifts of 0 or the data width, and paired loads don't allow any shift.
1785–1786	I think this needs refactoring, we shouldn't be hard-coding what `mergeUpdateInsn` supports outside that function. Instead, the merge should be able to bail gracefully when it doesn't understand ADDWrs or whatever.
test/CodeGen/AArch64/fold_addressing_modes_aarch64.ll
32	You can prune all the metadata out of this, and I think we also need more than just one test. There are many different ways this transformation could potentially go wrong.
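
The shift validation requested in the comment on lines 1523–1524 could look roughly like the following. This is a hedged sketch, not code from the patch: `isFoldableShift` is a hypothetical helper, and it takes "data width" to mean log2 of the access size in bytes, which is how the AArch64 register-offset addressing forms scale the index register.

```cpp
#include <cassert>

// Hypothetical helper (not part of the patch or of LLVM).
// accessBytes: size of a single memory access in bytes (1, 2, 4, 8 or 16).
// shift:       LSL amount applied to the index register by the ADD.
// isPaired:    true for ldp/stp, which have no register-offset form.
bool isFoldableShift(unsigned accessBytes, unsigned shift, bool isPaired) {
  if (isPaired)
    return false;
  // Only power-of-two access sizes up to 16 bytes are considered.
  if (accessBytes == 0 || accessBytes > 16 ||
      (accessBytes & (accessBytes - 1)) != 0)
    return false;
  // No shift is always acceptable for a register-offset access.
  if (shift == 0)
    return true;
  // Otherwise the shift must equal log2 of the access size.
  return shift <= 4 && (1u << shift) == accessBytes;
}
```

Under these assumptions an 8-byte access accepts shifts of 0 and 3 only, so a check such as `ShiftValue > 4` on its own would still admit shifts that the folded instruction cannot encode.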

Revision Contents

Path	Size
include/llvm/CodeGen/MachineOperand.h	9 lines
lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp	194 lines
test/CodeGen/AArch64/fold_addressing_modes_aarch64.ll	47 lines

Diff 200915

include/llvm/CodeGen/MachineOperand.h

(913 lines not shown)
private:

/// isOnRegUseList - Return true if this operand is on a register use/def		/// isOnRegUseList - Return true if this operand is on a register use/def
/// list or false if not. This can only be called for register operands		/// list or false if not. This can only be called for register operands
/// that are part of a machine instruction.		/// that are part of a machine instruction.
bool isOnRegUseList() const {		bool isOnRegUseList() const {
assert(isReg() && "Can only add reg operand to use lists");		assert(isReg() && "Can only add reg operand to use lists");
return Contents.Reg.Prev != nullptr;		return Contents.Reg.Prev != nullptr;
}		}

		public:
		bool isOnRegUseListNext() const {
		assert(isReg() && "Can only add reg operand to use lists");
		if(Contents.Reg.Next)
		return Contents.Reg.Next->Contents.Reg.Next != nullptr;
		else
		return false;
		}
};		};

template <> struct DenseMapInfo<MachineOperand> {		template <> struct DenseMapInfo<MachineOperand> {
static MachineOperand getEmptyKey() {		static MachineOperand getEmptyKey() {
return MachineOperand(static_cast<MachineOperand::MachineOperandType>(		return MachineOperand(static_cast<MachineOperand::MachineOperandType>(
MachineOperand::MO_Empty));		MachineOperand::MO_Empty));
}		}
static MachineOperand getTombstoneKey() {		static MachineOperand getTombstoneKey() {
(27 lines not shown)

lib/Target/AArch64/AArch64LoadStoreOptimizer.cpp

(28 lines not shown)
#include "llvm/CodeGen/TargetRegisterInfo.h"		#include "llvm/CodeGen/TargetRegisterInfo.h"
#include "llvm/IR/DebugLoc.h"		#include "llvm/IR/DebugLoc.h"
#include "llvm/MC/MCRegisterInfo.h"		#include "llvm/MC/MCRegisterInfo.h"
#include "llvm/Pass.h"		#include "llvm/Pass.h"
#include "llvm/Support/CommandLine.h"		#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
#include "llvm/Support/ErrorHandling.h"		#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
		#include "llvm/IR/Value.h"
#include <cassert>		#include <cassert>
#include <cstdint>		#include <cstdint>
#include <iterator>		#include <iterator>
#include <limits>		#include <limits>

using namespace llvm;		using namespace llvm;

#define DEBUG_TYPE "aarch64-ldst-opt"		#define DEBUG_TYPE "aarch64-ldst-opt"

STATISTIC(NumPairCreated, "Number of load/store pair instructions generated");		STATISTIC(NumPairCreated, "Number of load/store pair instructions generated");
STATISTIC(NumPostFolded, "Number of post-index updates folded");		STATISTIC(NumPostFolded, "Number of post-index updates folded");
STATISTIC(NumPreFolded, "Number of pre-index updates folded");		STATISTIC(NumPreFolded, "Number of pre-index updates folded");
STATISTIC(NumUnscaledPairCreated,		STATISTIC(NumUnscaledPairCreated,
"Number of load/store from unscaled generated");		"Number of load/store from unscaled generated");
STATISTIC(NumZeroStoresPromoted, "Number of narrow zero stores promoted");		STATISTIC(NumZeroStoresPromoted, "Number of narrow zero stores promoted");
STATISTIC(NumLoadsFromStoresPromoted, "Number of loads from stores promoted");		STATISTIC(NumLoadsFromStoresPromoted, "Number of loads from stores promoted");
		STATISTIC(NumAddressComputation, "Number of times Address Computation happened");

// The LdStLimit limits how far we search for load/store pairs.		// The LdStLimit limits how far we search for load/store pairs.
static cl::opt<unsigned> LdStLimit("aarch64-load-store-scan-limit",		static cl::opt<unsigned> LdStLimit("aarch64-load-store-scan-limit",
cl::init(20), cl::Hidden);		cl::init(20), cl::Hidden);

// The UpdateLimit limits how far we search for update instructions when we form		// The UpdateLimit limits how far we search for update instructions when we form
// pre-/post-index instructions.		// pre-/post-index instructions.
static cl::opt<unsigned> UpdateLimit("aarch64-update-scan-limit", cl::init(100),		static cl::opt<unsigned> UpdateLimit("aarch64-update-scan-limit", cl::init(100),
(92 lines not shown)
struct AArch64LoadStoreOpt : public MachineFunctionPass {
bool isMatchingUpdateInsn(MachineInstr &MemMI, MachineInstr &MI,		bool isMatchingUpdateInsn(MachineInstr &MemMI, MachineInstr &MI,
unsigned BaseReg, int Offset);		unsigned BaseReg, int Offset);

// Merge a pre- or post-index base register update into a ld/st instruction.		// Merge a pre- or post-index base register update into a ld/st instruction.
MachineBasicBlock::iterator		MachineBasicBlock::iterator
mergeUpdateInsn(MachineBasicBlock::iterator I,		mergeUpdateInsn(MachineBasicBlock::iterator I,
MachineBasicBlock::iterator Update, bool IsPreIdx);		MachineBasicBlock::iterator Update, bool IsPreIdx);

		// Merge Add instruction with ld/st instruction.
		MachineBasicBlock::iterator
		mergeAddWithLDSTInstruction(MachineBasicBlock::iterator I,
		MachineBasicBlock::iterator Update, bool IsPreIdx);

// Find and merge zero store instructions.		// Find and merge zero store instructions.
bool tryToMergeZeroStInst(MachineBasicBlock::iterator &MBBI);		bool tryToMergeZeroStInst(MachineBasicBlock::iterator &MBBI);

// Find and pair ldr/str instructions.		// Find and pair ldr/str instructions.
bool tryToPairLdStInst(MachineBasicBlock::iterator &MBBI);		bool tryToPairLdStInst(MachineBasicBlock::iterator &MBBI);

// Find and promote load instructions which read directly from store.		// Find and promote load instructions which read directly from store.
bool tryToPromoteLoadFromStore(MachineBasicBlock::iterator &MBBI);		bool tryToPromoteLoadFromStore(MachineBasicBlock::iterator &MBBI);
(1,199 lines not shown)
AArch64LoadStoreOpt::mergeUpdateInsn(MachineBasicBlock::iterator I,

// Erase the old instructions for the block.		// Erase the old instructions for the block.
I->eraseFromParent();		I->eraseFromParent();
Update->eraseFromParent();		Update->eraseFromParent();

return NextI;		return NextI;
}		}

		// this API will return true if both instructions can fold.
		// e.g: ADDXrs x8, x9, x8, lsl #3
		// STRXui xzr, [x8]
		// can merge to STRXrox xzr, [x9, x8, lsl #3]
		// Both instructions opcodes should 'X' or 'W'
		static bool isAddrFoldableInst(MachineBasicBlock::iterator Update,
		MachineBasicBlock::iterator I) {
		unsigned IOpc = I->getOpcode();
		// will return true if both instructions are ADDXrs and (STRXui or LDRXui).
		if (Update->getOpcode() == AArch64::ADDXrs) {
		switch(IOpc) {
		default:
		return false;
		case AArch64::STRXui:
		case AArch64::LDRXui:
		return true;
		}
		}
		// will return true if both instructions are ADDWrs and (STRWui and LDRWui).
		else if (Update->getOpcode() == AArch64::ADDWrs) {
		switch(IOpc) {
		default:
		return false;
		case AArch64::STRWui:
		case AArch64::LDRWui:
		return true;
		}
		}
		return false;
		}

		// will return aarch64 target opcode for merged(new) instruction.
		static unsigned getTargetOpcodeForFoldInst(MachineBasicBlock::iterator Update,
		MachineBasicBlock::iterator I) {
		unsigned UpdateOpc = Update->getOpcode();
		unsigned IOpc = I->getOpcode();
		if(UpdateOpc == AArch64::ADDXrs ||
		UpdateOpc == AArch64::ADDWrs) {
		switch(IOpc) {
		default:
		break;
		case AArch64::STRXui:
		return AArch64::STRXroX;
		case AArch64::STRWui:
		return AArch64::STRWroX;
		case AArch64::LDRXui:
		return AArch64::LDRXroX;
		case AArch64::LDRWui:
		return AArch64::LDRWroX;
		}
		}
		return false;
		}

		// here will merge the both ADD and STR/LDR instructions if able to merge both
		// and will create new instruction.
		MachineBasicBlock::iterator
		AArch64LoadStoreOpt::mergeAddWithLDSTInstruction(MachineBasicBlock::iterator I,
		MachineBasicBlock::iterator Update,
		bool IsPreIdx) {
		MachineBasicBlock::iterator NextI = I;
		// Return the instruction following the merged instruction, which is
		// the instruction following our unmerged store. Unless that's the add/sub
		// instruction we're merging, in which case it's the one after that.
		if (++NextI == Update)
		++NextI;
		// getting the immediate value of operand three
		int Value = Update->getOperand(3).getImm();
		//calculating the scale value
		unsigned int ScaleVal = getMemScale(*I) / Value;
		// getting the matching target opcode.
		unsigned Opc = getTargetOpcodeForFoldInst(Update, I);
		// here changing the second operand kill status for future use.
		// otherwise this operand is deleting.
		Update->getOperand(1).setIsKill(false);
		Update->getOperand(2).setIsKill(false);

		MachineInstrBuilder MIB;
		if (!isPairedLdSt(*I)) {
		// Non-paired instruction.
		MIB = BuildMI(*I->getParent(), I, I->getDebugLoc(), TII->get(Opc))
		.add(I->getOperand(0))
		.add(Update->getOperand(1))
		.add(Update->getOperand(2))
		.addImm(0)
		.addImm(Value)
		.setMemRefs(I->memoperands())
		.setMIFlags(I->mergeFlagsWith(*Update));
		} else {
		// Paired instruction.
		MIB = BuildMI(*I->getParent(), I, I->getDebugLoc(), TII->get(Opc))
		.add(I->getOperand(0))
		.add(getLdStRegOp(*Update, 0))
		.add(getLdStRegOp(*Update, 1))
		.add(Update->getOperand(2))
		.addImm(0)
		.addImm(ScaleVal)
		.setMemRefs(I->memoperands())
		.setMIFlags(I->mergeFlagsWith(*I));
		}

		++NumAddressComputation;
		LLVM_DEBUG(dbgs() << "Creating Address Computation.");
		LLVM_DEBUG(dbgs() << " Replacing instructions:\n ");
		LLVM_DEBUG(I->print(dbgs()));
		LLVM_DEBUG(dbgs() << " ");
		LLVM_DEBUG(Update->print(dbgs()));
		LLVM_DEBUG(dbgs() << " with instruction:\n ");
		LLVM_DEBUG(((MachineInstr *)MIB)->print(dbgs()));
		LLVM_DEBUG(dbgs() << "\n");

		// Operand one is using in any instructions in below,
		// we have to change the position of ADD(Update) instruction.
		// because of this adding the new instruction(same as ADD)
		// after new merged instruction.

		if(I->getOperand(1).isOnRegUseListNext()) {
		MachineBasicBlock *MBB = I->getParent();
		MBB->splice(I, MBB, &*Update);
		}
		else {
		Update->eraseFromParent();
		}
		// Erase the old instructions for the block.
		I->eraseFromParent();
		return NextI;
		}

bool AArch64LoadStoreOpt::isMatchingUpdateInsn(MachineInstr &MemMI,		bool AArch64LoadStoreOpt::isMatchingUpdateInsn(MachineInstr &MemMI,
MachineInstr &MI,		MachineInstr &MI,
unsigned BaseReg, int Offset) {		unsigned BaseReg, int Offset) {
switch (MI.getOpcode()) {		switch (MI.getOpcode()) {
default:		default:
break;		break;
		case AArch64::ADDWrs:
		case AArch64::ADDXrs:
		{
		if (!MI.getOperand(1).isReg() || !MI.getOperand(2).isReg() || !MI.getOperand(3).isImm())
		break;
		int ShiftValue = MI.getOperand(3).getImm();
		if (ShiftValue > 4)
		break;
		// The update instruction source and destination register must be the
		// same as the load/store base register.
		if (MI.getOperand(0).getReg() != BaseReg || Offset != 0)
		break;
		return true;
		}
case AArch64::SUBXri:		case AArch64::SUBXri:
case AArch64::ADDXri:		case AArch64::ADDXri:
// Make sure it's a vanilla immediate operand, not a relocation or		// Make sure it's a vanilla immediate operand, not a relocation or
// anything else we can't handle.		// anything else we can't handle.
if (!MI.getOperand(2).isImm())		if (!MI.getOperand(2).isImm())
break;		break;
// Watch out for 1 << 12 shifted value.		// Watch out for 1 << 12 shifted value.
if (AArch64_AM::getShiftValue(MI.getOperand(3).getImm()))		if (AArch64_AM::getShiftValue(MI.getOperand(3).getImm()))
(238 lines not shown)
bool AArch64LoadStoreOpt::tryToMergeLdStUpdate

// Look forward to try to form a post-index instruction. For example,		// Look forward to try to form a post-index instruction. For example,
// ldr x0, [x20]		// ldr x0, [x20]
// add x20, x20, #32		// add x20, x20, #32
// merged into:		// merged into:
// ldr x0, [x20], #32		// ldr x0, [x20], #32
Update = findMatchingUpdateInsnForward(MBBI, 0, UpdateLimit);		Update = findMatchingUpdateInsnForward(MBBI, 0, UpdateLimit);
if (Update != E) {		if (Update != E) {
		if (Update->getOpcode() == AArch64::ADDXri ||
		Update->getOpcode() == AArch64::SUBXri) {
// Merge the update into the ld/st.		// Merge the update into the ld/st.
MBBI = mergeUpdateInsn(MBBI, Update, /*IsPreIdx=*/false);		MBBI = mergeUpdateInsn(MBBI, Update, /*IsPreIdx=*/false);
return true;		return true;
}		}
		}

// Don't know how to handle unscaled pre/post-index versions below, so bail.		// Don't know how to handle unscaled pre/post-index versions below, so bail.
if (TII->isUnscaledLdSt(MI.getOpcode()))		if (TII->isUnscaledLdSt(MI.getOpcode()))
return false;		return false;

// Look back to try to find a pre-index instruction. For example,		// Look back to try to find a pre-index instruction. For example,
		Update = findMatchingUpdateInsnBackward(MBBI, UpdateLimit);
		if (Update != E) {
		// if Update Inst Opcode is ADDXrs,
		// add x8, x9, x8, lsl #3
		// str xzr, [x8]
		// merged into:
		// str xzr, [x9,x8, lsl #3]
		// if Update(ADD inst) opcode is either ADDXrs or ADDWrs
		if (isAddrFoldableInst(Update, MBBI)) {
		MBBI = mergeAddWithLDSTInstruction(MBBI, Update, true);
		return true;
		}
// add x0, x0, #8		// add x0, x0, #8
// ldr x1, [x0]		// ldr x1, [x0]
// merged into:		// merged into:
// ldr x1, [x0, #8]!		// ldr x1, [x0, #8]!
Update = findMatchingUpdateInsnBackward(MBBI, UpdateLimit);		else if (Update->getOpcode() == AArch64::ADDXri ||
if (Update != E) {		Update->getOpcode() == AArch64::SUBXri) {
// Merge the update into the ld/st.		// Merge the update into the ld/st.
MBBI = mergeUpdateInsn(MBBI, Update, /*IsPreIdx=*/true);		MBBI = mergeUpdateInsn(MBBI, Update, /*IsPreIdx=*/true);
return true;		return true;
}		}
		}

// The immediate in the load/store is scaled by the size of the memory		// The immediate in the load/store is scaled by the size of the memory
// operation. The immediate in the add we're looking for,		// operation. The immediate in the add we're looking for,
// however, is not, so adjust here.		// however, is not, so adjust here.
int UnscaledOffset = getLdStOffsetOp(MI).getImm() * getMemScale(MI);		int UnscaledOffset = getLdStOffsetOp(MI).getImm() * getMemScale(MI);

// Look forward to try to find a post-index instruction. For example,		// Look forward to try to find a post-index instruction. For example,
// ldr x1, [x0, #64]		// ldr x1, [x0, #64]
// add x0, x0, #64		// add x0, x0, #64
// merged into:		// merged into:
// ldr x1, [x0, #64]!		// ldr x1, [x0, #64]!
Update = findMatchingUpdateInsnForward(MBBI, UnscaledOffset, UpdateLimit);		Update = findMatchingUpdateInsnForward(MBBI, UnscaledOffset, UpdateLimit);
if (Update != E) {		if (Update != E) {
		if(Update->getOpcode() == AArch64::ADDXri ||
		Update->getOpcode() == AArch64::SUBXri) {
// Merge the update into the ld/st.		// Merge the update into the ld/st.
MBBI = mergeUpdateInsn(MBBI, Update, /*IsPreIdx=*/true);		MBBI = mergeUpdateInsn(MBBI, Update, /*IsPreIdx=*/true);
return true;		return true;
}		}
		}

return false;		return false;
}		}

bool AArch64LoadStoreOpt::optimizeBlock(MachineBasicBlock &MBB,		bool AArch64LoadStoreOpt::optimizeBlock(MachineBasicBlock &MBB,
bool EnableNarrowZeroStOpt) {		bool EnableNarrowZeroStOpt) {
bool Modified = false;		bool Modified = false;
// Four tranformations to do here:		// Four tranformations to do here:
(105 lines not shown)

test/CodeGen/AArch64/fold_addressing_modes_aarch64.ll

This file was added.

				; RUN: llc -o - %s -mtriple=aarch64-arm-none-eabi -verify-machineinstrs | FileCheck %s
				; ModuleID = './test_51309.c'
				source_filename = "./test_51309.c"
				target datalayout = "e-m:e-i8:8:32-i16:16:32-i64:64-i128:128-n32:64-S128"
				target triple = "aarch64-arm-none-eabi"

				%struct.As = type { i32, i32 }
				%struct.Bs = type { i16 }

				@A = external dso_local local_unnamed_addr global [4 x %struct.As], align 4
				@B = external dso_local local_unnamed_addr global %struct.Bs*, align 8

				; Function Attrs: minsize norecurse nounwind optsize

				; CHECK_LABEL: @test
				; CHECK: adrp
				; CHECK: ldr
				; CHECK-NEXT: ldrsh
				; CHECK-NOT: add
				; CHECK-NEXT: str
				define dso_local void @test() local_unnamed_addr #0 {
				%1 = load %struct.Bs, %struct.Bs* @B, align 8, !tbaa !2
				%2 = getelementptr inbounds %struct.Bs, %struct.Bs* %1, i64 0, i32 0
				%3 = load i16, i16* %2, align 2, !tbaa !6
				%4 = sext i16 %3 to i64
				%5 = getelementptr inbounds [4 x %struct.As], [4 x %struct.As]* @A, i64 0, i64 %4, i32 0
				%6 = bitcast i32* %5 to <2 x i32>*
				store <2 x i32> zeroinitializer, <2 x i32>* %6, align 4, !tbaa !9
				ret void
				}

				attributes #0 = { minsize norecurse nounwind optsize "correctly-rounded-divide-sqrt-fp-math"="false" "disable-tail-calls"="false" "less-precise-fpmad"="false" "min-legal-vector-width"="0" "no-frame-pointer-elim"="true" "no-frame-pointer-elim-non-leaf" "no-infs-fp-math"="false" "no-jump-tables"="false" "no-nans-fp-math"="false" "no-signed-zeros-fp-math"="false" "no-trapping-math"="false" "stack-protector-buffer-size"="8" "target-cpu"="cortex-a53" "target-features"="+aes,+crc,+crypto,+fp-armv8,+neon,+sha2" "unsafe-fp-math"="false" "use-soft-float"="false" }

				!llvm.module.flags = !{!0}
				!llvm.ident = !{!1}

				!0 = !{i32 1, !"wchar_size", i32 4}
				!1 = !{!"clang version 9.0.0 (https://git.llvm.org/git/clang.git/ 268b249f1d4cbc212d1853ac9821194f868eef36) (https://git.llvm.org/git/llvm.git/ 772398facdeaf5e5f4f8ca641e06f354441ad9ac)"}
				!2 = !{!3, !3, i64 0}
				!3 = !{!"any pointer", !4, i64 0}
				!4 = !{!"omnipotent char", !5, i64 0}
				!5 = !{!"Simple C/C++ TBAA"}
				!6 = !{!7, !8, i64 0}
				!7 = !{!"Bs", !8, i64 0}
				!8 = !{!"short", !4, i64 0}
				!9 = !{!10, !10, i64 0}
				!10 = !{!"int", !4, i64 0}