This is an archive of the discontinued LLVM Phabricator instance.

Differential D54409

PowerPC/SPE: Fix load/store handling for SPE
Closed, Public

Authored by jhibbits on Nov 11 2018, 2:58 PM.

Details

Reviewers

nemanjai
hfinkel
joerg

Commits

rG5214956eaaa1: PowerPC/SPE: Fix load/store handling for SPE
rL366318: PowerPC/SPE: Fix load/store handling for SPE

Summary

As pointed out in a comment on D49754, register spilling will currently
spill SPE registers at almost any offset. However, the instructions
evstdd and evldd require a) 8-byte alignment, and b) an offset of less
than 256 (unsigned) bytes from the base register, as the offset must fit
into a 5-bit field, which ranges from 0-31 (indexed in double-words).

This enforces the alignment in offsetMinAlign(), and enforces the spill
range check in PPCRegisterInfo::eliminateFrameIndex().
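
The two constraints named above (8-byte alignment and a 5-bit field counted in double-words) can be sketched outside of LLVM as follows. The helper names `fitsSPEImmOffset` and `encodeSPEImmOffset` are hypothetical and are not part of the patch; the sketch only assumes the encoding as described in this summary.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper, not part of LLVM: models the immediate form of
// evldd/evstdd as described in the summary. Legal byte offsets are
// 0, 8, ..., 248 (a 5-bit field counted in 8-byte double-words).
constexpr bool fitsSPEImmOffset(int64_t ByteOffset) {
  return ByteOffset >= 0 && ByteOffset < 256 && ByteOffset % 8 == 0;
}

// Writes the 5-bit field value for a legal offset and returns true.
// Returns false when the offset would instead have to be placed in a
// register and used with the indexed forms (evlddx/evstddx).
inline bool encodeSPEImmOffset(int64_t ByteOffset, uint8_t &Field) {
  if (!fitsSPEImmOffset(ByteOffset))
    return false;
  Field = static_cast<uint8_t>(ByteOffset / 8);
  assert(Field < 32 && "field must fit in 5 bits");
  return true;
}
```

Under these assumptions an offset of 248 is the largest that fits, while 256 (the value used in the updated test_spill checks) does not.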

The update to the register spill test is taken partially from the test
case shown in D49754.

Additionally, as pointed out by Kei Thomsen, globals currently use
evldd/evstdd even though the offset isn't known at compile time, so it
may exceed the 8-bit (unsigned) offset permitted. This patch fixes that
as well, by forcing evlddx/evstddx to always be used when accessing globals.

Part of the patch contributed by Kei Thomsen.

Diff Detail

Repository

rL LLVM

Build Status

Buildable 34039
Build 34038: arc lint + arc unit

Event Timeline

jhibbits created this revision. Nov 11 2018, 2:58 PM

Herald added subscribers: llvm-commits, jsji, kbarton. Nov 11 2018, 2:58 PM

Harbormaster completed remote builds in B24868: Diff 173593. Nov 11 2018, 2:58 PM

jhibbits mentioned this in D49754: Add -m(no-)spe, and e500 CPU definitions and support to clang. Nov 15 2018, 8:40 AM

glaubitz added a subscriber: glaubitz. Dec 4 2018, 7:31 AM

I have applied this patch to the llvm-toolchain-7 package in Debian and did not see any regressions on x86_64 or 32-bit PowerPC. Additionally, I have included the patches from https://reviews.llvm.org/D49754 and https://reviews.llvm.org/D54583 and saw no regressions on x86_64 and 32-bit PowerPC.

All three patches will be part of the next upload of the llvm-toolchain-7 package in Debian unstable which will be version 1:7.0.1~+rc2-9.

Other than the minor nit about the test case, LGTM.

test/CodeGen/PowerPC/spe.ll
535	Can you maybe filter this test case through something like `opt -mem2reg` to get rid of the extraneous `alloca`'s to aid readability?

This revision is now accepted and ready to land. Dec 29 2018, 1:43 PM

In D54409#1342394, @nemanjai wrote:

Other than the minor nit about the test case, LGTM.

Sorry for hi-jacking this, but could you maybe also review this small PowerPC-related change? https://reviews.llvm.org/D55326

kthomsen added a subscriber: kthomsen. Jan 16 2019, 1:47 AM

jhibbits mentioned this in D54583: PowerPC: Optimize SPE double parameter calling setup. Jan 21 2019, 9:48 AM

Incorporate patch from @kthomsen for handling globals.

Herald added a project: Restricted Project. Feb 5 2019, 8:31 PM

I thought this would've been auto-marked as needs review when I updated, but apparently not.

Herald added a subscriber: jdoerfert. Feb 19 2019, 12:40 PM

I can't really review this without context. Could you please re-upload with context?

This revision now requires changes to proceed. Feb 20 2019, 3:46 AM

Address feedback. Provide full context for diffs.

I am only requesting changes for the refactoring of the function. As I mentioned in the inline comment, I am not necessarily requesting that you rework the addressing mode computation functions as I mentioned, just that you clean up the code in SelectAddressRegReg().

lib/Target/PowerPC/PPCISelLowering.cpp
2260	I really don't think this belongs here. This should probably be in another function, perhaps something like `SelectAddressEVXRegReg(...)`. This function would need to decide when r+r mode needs to be used.
    Furthermore, it is probably buggy that we use `iaddr` for `EVLDD`, because it might go into a function like this where the node that computes the address is not an `ISD::ADD`, and it then reverts to default handling (which is set up to handle 16-bit displacements). Even if you added the code to the `ISD::OR` section below, it is possible that we would get it wrong as the address came from a different node that `SelectAddressRegImm()` ends up knowing about. Ultimately, this is a different addressing mode and it should use different functions to compute when to use the r+r form and when to use the r+i form.
    That being said, I understand that this solves a current problem whereas what I am suggesting solves a future problem that may never actually occur. So I won't stand in the way of this patch; you just might want to think about re-working this. At the very least, this code needs to be pulled out into a separate function, and this can look like:
        if (hasSPE() && usedForDPMemOp(N)) {
          Base = N.getOperand(0);
          Index = N.getOperand(1);
          return true;
        }
2264	This is not necessary. You should be able to do something like:
        if (MemSDNode *Memop = dyn_cast<MemSDNode>(*UI)) {
          if (Memop->getMemoryVT() != MVT::f64)
            // ...
        }
lib/Target/PowerPC/PPCRegisterInfo.cpp
1066–1069	A more descriptive name might be something like `OffsetFitsMnemonic`. Notice the capitalization of the variable name as per coding guidelines.

This revision now requires changes to proceed. Mar 31 2019, 5:14 AM

jhibbits marked 2 inline comments as done. Apr 29 2019, 7:26 AM

jhibbits added inline comments.

lib/Target/PowerPC/PPCRegisterInfo.cpp
1066–1069	Oops, I always forget about the capitalizations. Sorry.

Address nemanjai's feedback. Move the EVX check into a separate callable
function. In the future it could possibly be used as a separate addressing
mode, but a naive approach didn't work, and this solves the problem at hand.

Harbormaster completed remote builds in B34039: Diff 206985. Jun 27 2019, 6:56 PM

I integrated the last patch (yes, it is working) and saw that there can be another optimization. It now checks whether the offset fits into the 8-bit offset of EVLDD / EVSTDD, including an alignment of 8. This reduces the effort when variables are stored on the stack and a variable is within a range short enough to be addressed directly.

--- PPCISelLowering.orig.h      2019-06-28 09:45:05.061923700 +0200
+++ PPCISelLowering.h   2019-06-28 09:07:19.143760400 +0200
@@ -1200,6 +1200,9 @@ namespace llvm {
   bool isIntS16Immediate(SDNode *N, int16_t &Imm);
   bool isIntS16Immediate(SDValue Op, int16_t &Imm);
 
+  bool isIntS8Immediate(SDNode *N, int8_t &Imm);
+  bool isIntS8Immediate(SDValue Op, int8_t &Imm);
+
 } // end namespace llvm
 
 #endif // LLVM_TARGET_POWERPC_PPC32ISELLOWERING_H
--- PPCISelLowering.orig.cpp    2019-06-28 09:45:47.495524500 +0200
+++ PPCISelLowering.cpp 2019-06-28 10:07:31.750129100 +0200
@@ -2227,6 +2227,23 @@ bool llvm::isIntS16Immediate(SDValue Op,
   return isIntS16Immediate(Op.getNode(), Imm);
 }
 
+/// isIntS8Immediate - This method tests to see if the node is either a 32-bit
+/// or 64-bit immediate, and if the value can be accurately represented as a
+/// sign extension from a 8-bit value.  If so, this returns true and the
+/// immediate.
+bool llvm::isIntS8Immediate(SDNode *N, int8_t &Imm) {
+  if (!isa<ConstantSDNode>(N))
+    return false;
+  Imm = (int8_t)cast<ConstantSDNode>(N)->getZExtValue();
+  if (N->getValueType(0) == MVT::i32)
+    return Imm == (int32_t)cast<ConstantSDNode>(N)->getZExtValue();
+  else
+    return Imm == (int64_t)cast<ConstantSDNode>(N)->getZExtValue();
+}
+bool llvm::isIntS8Immediate(SDValue Op, int8_t &Imm) {
+  return isIntS8Immediate(Op.getNode(), Imm);
+}
+
 /// SelectAddressEVXRegReg - Given the specified address, check to see if it can
 /// be represented as an indexed [r+r] operation.
 bool PPCTargetLowering::SelectAddressEVXRegReg(SDValue N, SDValue &Base,
@@ -2259,8 +2276,15 @@ bool PPCTargetLowering::SelectAddressReg
     if (hasSPE())
       // Is there any SPE load/store (f64), which can't handle 16bit offset?
       // SPE load/store can only handle 8-bit offsets.
-      if (SelectAddressEVXRegReg(N, Base, Index, DAG))
-        return true;
+      if (SelectAddressEVXRegReg(N, Base, Index, DAG)) {
+        int8_t imm8 = 0;
+        // Check if the offset is small enough to fit into the 8-bit
+        // e.g. a value on the stack
+        if (isIntS8Immediate(N.getOperand(1), imm8) && !(imm8 % 8))
+          return false; // offset is okay for the 8-bit offset 
+        else
+          return true; // offset is > 8-bit or unknown yet
+      }
     if (isIntS16Immediate(N.getOperand(1), imm) &&
         (!EncodingAlignment || !(imm % EncodingAlignment)))
       return false; // r+i

What do you think?
Best regards, Kei

The immediate offsets for the evldd/evstdd instructions are UInt8, not SInt8, but otherwise your change looks fine. Do you have additional tests for it? You had mentioned before about problematic relocations, is that still the case with your patch?
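
The difference between the signed and unsigned 8-bit checks can be illustrated with a small standalone sketch. The functions `fitsSigned8` and `fitsUnsigned8` are hypothetical stand-ins that work on plain integers rather than on an `SDNode`; they are not the helpers from the patch.

```cpp
#include <cstdint>

// Hypothetical stand-ins for the two range checks under discussion.
// A signed 8-bit test accepts -128..127, an unsigned one accepts 0..255.
constexpr bool fitsSigned8(int64_t V) {
  return V >= INT8_MIN && V <= INT8_MAX;
}

constexpr bool fitsUnsigned8(int64_t V) {
  return V >= 0 && V <= UINT8_MAX;
}
```

With a signed test, aligned offsets from 128 to 248 would be rejected even though they fit the unsigned range, and negative offsets such as -8 would be accepted even though they do not.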

I found an even better way to check for the unsigned (thank you for the hint) 8-bit offset with an alignment of 8. It has now been integrated into the SelectAddressEVXRegReg() function. Therefore the calling function SelectAddressRegReg() doesn't need to deal with any EVx instruction specifics; it is all done in the EVXRegReg function.
Explanation for the SelectAddressEVXRegReg function: if it finds an MVT::f64 (there can be at most one MVT::f64), the offset used here is checked to see whether it is already known and lies in 0..255 with an alignment of 8. If so, false is returned and the calling SelectAddressRegReg() continues with isIntS16Immediate(), which will be true as well (if it fits into 8 bits unsigned, it will fit into 16 bits signed as well).
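
As a rough standalone model of the decision just described (not the LLVM code itself), the choice for an f64 access can be written as below. `AddrMode` and `selectF64AddrMode` are hypothetical names, and `OffsetKnown` stands in for "the operand is a constant node".

```cpp
#include <cstdint>

enum class AddrMode { RegImm, RegReg };

// Hypothetical model of the proposed check for an f64 access under SPE.
// OffsetKnown is false when the displacement is not a compile-time
// constant, for example when it comes from a global address.
inline AddrMode selectF64AddrMode(bool OffsetKnown, int64_t Offset) {
  const int64_t EVXEncodingAlignment = 8; // name taken from the patch
  if (OffsetKnown && Offset >= 0 && Offset <= 255 &&
      Offset % EVXEncodingAlignment == 0)
    return AddrMode::RegImm; // small, aligned, known: immediate form
  return AddrMode::RegReg;   // unknown or too large: indexed form
}
```

In this model an unknown offset always falls through to the indexed form, which matches the handling of globals described in the summary.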

--- PPCISelLowering.orig.cpp    2019-07-01 09:08:11.444438400 +0200
+++ PPCISelLowering.cpp 2019-07-01 08:45:16.911244700 +0200
@@ -2227,9 +2227,27 @@ bool llvm::isIntS16Immediate(SDValue Op,
   return isIntS16Immediate(Op.getNode(), Imm);
 }
 
+/// isIntU8Immediate - This method tests to see if the node is either a 32-bit
+/// or 64-bit immediate, and if the value can be accurately represented as a
+/// zero (unsigned) extension from a 8-bit value.  If so, this returns true 
+/// and the immediate.
+bool llvm::isIntU8Immediate(SDNode *N, uint8_t &Imm) {
+  if (!isa<ConstantSDNode>(N))
+    return false;
+  Imm = (uint8_t)cast<ConstantSDNode>(N)->getZExtValue();
+  if (N->getValueType(0) == MVT::i32)
+    return Imm == (int32_t)cast<ConstantSDNode>(N)->getZExtValue();
+  else
+    return Imm == (int64_t)cast<ConstantSDNode>(N)->getZExtValue();
+}
+bool llvm::isIntU8Immediate(SDValue Op, uint8_t &Imm) {
+  return isIntU8Immediate(Op.getNode(), Imm);
+}
 
-/// SelectAddressEVXRegReg - Given the specified address, check to see if it can
-/// be represented as an indexed [r+r] operation.
+/// SelectAddressEVXRegReg - Given the specified address, check to see if it must
+/// be represented as an indexed [r+r] operation for EVLDD and EVSTD instruction.
+/// If the address offset is known now, it will be checked if it fits into the
+/// 8-bit offset, with an alignment of 8.
 bool PPCTargetLowering::SelectAddressEVXRegReg(SDValue N, SDValue &Base,
                                                SDValue &Index,
                                                SelectionDAG &DAG) const {
@@ -2237,9 +2255,13 @@ bool PPCTargetLowering::SelectAddressEVX
       UI != E; ++UI) {
     if (MemSDNode *Memop = dyn_cast<MemSDNode>(*UI)) {
       if (Memop->getMemoryVT() == MVT::f64) {
-          Base = N.getOperand(0);
-          Index = N.getOperand(1);
-          return true;
+           uint8_t imm = 0;
+        if (isIntU8Immediate(N.getOperand(1), imm)
+            && !(imm % EVXEncodingAlignment))
+          return false; // offset is okay for the 8-bit offset
+        Base = N.getOperand(0);
+        Index = N.getOperand(1);
+        return true; // offset is unknown or too large. Use reg-reg
       }
     }
   }
--- PPCISelLowering.orig.h      2019-07-01 09:08:11.462574400 +0200
+++ PPCISelLowering.h   2019-07-01 08:39:14.625343900 +0200
@@ -1199,6 +1200,11 @@ namespace llvm {
   bool isIntS16Immediate(SDNode *N, int16_t &Imm);
   bool isIntS16Immediate(SDValue Op, int16_t &Imm);
 
+  bool isIntU8Immediate(SDNode *N, uint8_t &Imm);
+  bool isIntU8Immediate(SDValue Op, uint8_t &Imm);
+
+  // The EVLDD and EVSTDD use an 8-bit offset with an alignment of 8
+  const unsigned EVXEncodingAlignment = 8;
 } // end namespace llvm
 
 #endif // LLVM_TARGET_POWERPC_PPC32ISELLOWERING_H

I'm not sure about the

const unsigned EVXEncodingAlignment = 8;

in the header file. I want to use a "name" for the alignment, instead of a hardcoded "8". What/where is the correct usage for such constants?

I don't have additional tests for it, as I haven't run the compiler tests and don't know how to specify the tests. My local tests are to compile all the libraries I have for OS-9, several benchmarks (e.g. whetstone, dhrystone) and other tests to see if the result is correct. I know that this test information is not enough here, but I can't handle it differently at the moment.

There are no open relocation issues seen from my side.

The only thing I haven't seen is the patch for the compares that I made in PPCISelDAGToDAG.cpp in D54583 on March 27. Without this patch, the FPU (SPE) compares are not correct. Or has this been addressed in a different way somewhere else (which I might not have seen)?

I found one more patch, where I'm not sure if we already had this one and removed it, or if I simply forgot to post it.

--- PPCISelLowering.orig.cpp    2019-07-01 09:08:11.444438400 +0200
+++ PPCISelLowering.cpp 2019-07-01 08:45:16.911244700 +0200
@@ -3011,7 +3114,7 @@ SDValue PPCTargetLowering::LowerVAARG(SD
                                     VAListPtr, MachinePointerInfo(SV), MVT::i8);
   InChain = GprIndex.getValue(1);
 
-  if (VT == MVT::i64) {
+  if ((VT == MVT::i64) || (hasSPE() && (VT == MVT::f64))) {
     // Check if GprIndex is even
     SDValue GprAnd = DAG.getNode(ISD::AND, dl, MVT::i32, GprIndex,
                                  DAG.getConstant(1, dl, MVT::i32));

This was done to make sure that the register pair for an f64 starts at an even register, the same as when using i64.
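
The even-register adjustment can be pictured with a one-line sketch. It assumes that an odd index is bumped to the next even one, which is what the "Check if GprIndex is even" context in the diff suggests; `alignGprIndexToEven` is a hypothetical name and does not appear in LLVM.

```cpp
#include <cstdint>

// Hypothetical helper: round a GPR index up to the next even value so that
// a 64-bit value (i64, or f64 under SPE) starts on an even register.
// (GprIndex & 1) mirrors the AND-with-1 test visible in the diff context.
constexpr uint32_t alignGprIndexToEven(uint32_t GprIndex) {
  return GprIndex + (GprIndex & 1u);
}
```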

LGTM other than a few minor nits.

lib/Target/PowerPC/PPCISelLowering.cpp
2239	Am I reading this correctly? If this node is used for any memory operation that is accessing an `f64`, we will never use r+i addressing. Is that the desired behaviour? So if I'm not mistaken, we have the following semantics:
    - If we are accessing an `MVT::f64`, always use r+r
    - If we are accessing any other type, use r+i if the offset is a 16-bit signed value
    - Otherwise use r+r addressing
2263	We can probably combine this condition with the one above into a single `if`, but that can be done on the commit - no need for another review.
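
The three rules listed in the comment on line 2239 can be restated as a small standalone model. This is only a sketch: `Mode` and `selectMode` are hypothetical names, the f64 rule is assumed to apply only when SPE is enabled, and the `EncodingAlignment` and `PPCISD::Lo` cases of the real function are left out.

```cpp
#include <cstdint>

enum class Mode { RegImm, RegReg };

// Hypothetical model of the semantics as committed, for an SPE target:
//  1. f64 access: always r+r.
//  2. other types: r+i if the offset is a known signed 16-bit value.
//  3. otherwise: r+r.
inline Mode selectMode(bool IsF64Access, bool OffsetKnown, int64_t Offset) {
  if (IsF64Access)
    return Mode::RegReg;
  if (OffsetKnown && Offset >= INT16_MIN && Offset <= INT16_MAX)
    return Mode::RegImm;
  return Mode::RegReg;
}
```

The optimization proposed earlier in the thread would relax rule 1 for small, aligned, known offsets; as noted in the reply, that is left for a separate change.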

This revision is now accepted and ready to land. Jul 16 2019, 2:05 PM

Herald added subscribers: wuzish, MaskRay. Jul 16 2019, 2:05 PM

jhibbits marked an inline comment as done. Jul 16 2019, 6:45 PM

jhibbits added inline comments.

lib/Target/PowerPC/PPCISelLowering.cpp
2239	Yes, that's it for right now. @kthomsen has a fix (commented above), but I wanted to get this in first, and then optimize it after, so @kthomsen's patch will be a separate change.

Herald added a subscriber: shchenz. Jul 16 2019, 6:45 PM

Closed by commit rL366318: PowerPC/SPE: Fix load/store handling for SPE (authored by jhibbits). Jul 17 2019, 5:30 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                                       Size
lib/Target/PowerPC/PPCISelLowering.h       5 lines
lib/Target/PowerPC/PPCISelLowering.cpp     24 lines
lib/Target/PowerPC/PPCRegisterInfo.cpp     8 lines
test/CodeGen/PowerPC/spe.ll                25 lines

Diff 206985

lib/Target/PowerPC/PPCISelLowering.h

@@ -662,16 +662,21 @@ public:
     /// getPreIndexedAddressParts - returns true by value, base pointer and
     /// offset pointer and addressing mode by reference if the node's address
     /// can be legally represented as pre-indexed load / store address.
     bool getPreIndexedAddressParts(SDNode *N, SDValue &Base,
                                    SDValue &Offset,
                                    ISD::MemIndexedMode &AM,
                                    SelectionDAG &DAG) const override;
 
+    /// SelectAddressEVXRegReg - Given the specified addressed, check to see if
+    /// it can be more efficiently represented as [r+imm].
+    bool SelectAddressEVXRegReg(SDValue N, SDValue &Base, SDValue &Index,
+                                SelectionDAG &DAG) const;
+
     /// SelectAddressRegReg - Given the specified addressed, check to see if it
     /// can be more efficiently represented as [r+imm]. If \p EncodingAlignment
     /// is non-zero, only accept displacement which is not suitable for [r+imm].
     /// Returns false if it can be represented by [r+imm], which are preferred.
     bool SelectAddressRegReg(SDValue N, SDValue &Base, SDValue &Index,
                              SelectionDAG &DAG,
                              unsigned EncodingAlignment = 0) const;
 

lib/Target/PowerPC/PPCISelLowering.cpp

@@ -2222,27 +2222,51 @@ if (N->getValueType(0) == MVT::i32)
     return Imm == (int32_t)cast<ConstantSDNode>(N)->getZExtValue();
   else
     return Imm == (int64_t)cast<ConstantSDNode>(N)->getZExtValue();
 }
 bool llvm::isIntS16Immediate(SDValue Op, int16_t &Imm) {
   return isIntS16Immediate(Op.getNode(), Imm);
 }
 
+
+/// SelectAddressEVXRegReg - Given the specified address, check to see if it can
+/// be represented as an indexed [r+r] operation.
+bool PPCTargetLowering::SelectAddressEVXRegReg(SDValue N, SDValue &Base,
+                                               SDValue &Index,
+                                               SelectionDAG &DAG) const {
+  for (SDNode::use_iterator UI = N->use_begin(), E = N->use_end();
+      UI != E; ++UI) {
+    if (MemSDNode *Memop = dyn_cast<MemSDNode>(*UI)) {
+      if (Memop->getMemoryVT() == MVT::f64) {
+          Base = N.getOperand(0);
+          Index = N.getOperand(1);
+          return true;
+      }
+    }
+  }
+  return false;
+}
+
 /// SelectAddressRegReg - Given the specified addressed, check to see if it
 /// can be represented as an indexed [r+r] operation.  Returns false if it
 /// can be more efficiently represented as [r+imm]. If \p EncodingAlignment is
 /// non-zero and N can be represented by a base register plus a signed 16-bit
 /// displacement, make a more precise judgement by checking (displacement % \p
 /// EncodingAlignment).
 bool PPCTargetLowering::SelectAddressRegReg(SDValue N, SDValue &Base,
                                             SDValue &Index, SelectionDAG &DAG,
                                             unsigned EncodingAlignment) const {
   int16_t imm = 0;
   if (N.getOpcode() == ISD::ADD) {
+    if (hasSPE())
+      // Is there any SPE load/store (f64), which can't handle 16bit offset?
+      // SPE load/store can only handle 8-bit offsets.
+      if (SelectAddressEVXRegReg(N, Base, Index, DAG))
+        return true;
     if (isIntS16Immediate(N.getOperand(1), imm) &&
         (!EncodingAlignment || !(imm % EncodingAlignment)))
       return false; // r+i
     if (N.getOperand(1).getOpcode() == PPCISD::Lo)
       return false; // r+i
 
     Base = N.getOperand(0);
     Index = N.getOperand(1);

lib/Target/PowerPC/PPCRegisterInfo.cpp

@@ -933,16 +933,19 @@ static unsigned offsetMinAlignForOpcode(unsigned OpC) {
   case PPC::DFLOADf64:
   case PPC::DFSTOREf32:
   case PPC::DFSTOREf64:
   case PPC::LXSD:
   case PPC::LXSSP:
   case PPC::STXSD:
   case PPC::STXSSP:
     return 4;
+  case PPC::EVLDD:
+  case PPC::EVSTDD:
+    return 8;
   case PPC::LXV:
   case PPC::STXV:
     return 16;
   }
 }
 
 // If the offset must be a multiple of some value, return what that value is.
 static unsigned offsetMinAlign(const MachineInstr &MI) {
@@ -1055,17 +1058,20 @@ PPCRegisterInfo::eliminateFrameIndex(MachineBasicBlock::iterator II,
   // If we can, encode the offset directly into the instruction. If this is a
   // normal PPC "ri" instruction, any 16-bit value can be safely encoded. If
   // this is a PPC64 "ix" instruction, only a 16-bit value with the low two bits
   // clear can be encoded. This is extremely uncommon, because normally you
   // only "std" to a stack slot that is at least 4-byte aligned, but it can
   // happen in invalid code.
   assert(OpC != PPC::DBG_VALUE &&
          "This should be handled in a target-independent way");
-  if (!noImmForm && ((isInt<16>(Offset) &&
+  bool OffsetFitsMnemonic = (OpC == PPC::EVSTDD || OpC == PPC::EVLDD) ?
+                            isUInt<8>(Offset) :
+                            isInt<16>(Offset);
+  if (!noImmForm && ((OffsetFitsMnemonic &&
                        ((Offset % offsetMinAlign(MI)) == 0)) ||
                       OpC == TargetOpcode::STACKMAP ||
                       OpC == TargetOpcode::PATCHPOINT)) {
     MI.getOperand(OffsetOperandNo).ChangeToImmediate(Offset);
     return;
   }
 
   // The offset doesn't fit into a single register, scavenge one to build the

test/CodeGen/PowerPC/spe.ll

@@ -518,23 +518,42 @@ entry:
   ret i32 %1
 ; CHECK-LABEL: test_dasmconst
 ; CHECK: evmergelo
 ; CHECK: #APP
 ; CHECK: efdctsi
 ; CHECK: #NO_APP
 }
 
-define double @test_spill(double %a) nounwind {
+declare double @test_spill_spe_regs(double, double);
+define dso_local void @test_func2() #0 {
 entry:
+  ret void
+}
+
+declare void @test_memset(i8* nocapture writeonly, i8, i32, i1)
+@global_var1 = global i32 0, align 4
+define double @test_spill(double %a, i32 %a1, i64 %a2, i8 * %a3, i32 %a4, i32 %a5) nounwind {
+entry:
+  %v1 = alloca [13 x i32], align 4
+  %v2 = alloca [11 x i32], align 4
   %0 = fadd double %a, %a
-  call void asm sideeffect "","~{r0},~{r3},~{s4},~{r5},~{r6},~{r7},~{r8},~{r9},~{r10},~{r11},~{r12},~{r13},~{r14},~{r15},~{r16},~{r17},~{r18},~{r19},~{r20},~{r21},~{r22},~{r23},~{r24},~{r25},~{r26},~{r27},~{r28},~{r29},~{r30},~{r31}"() nounwind
+  call void asm sideeffect "","~{s0},~{s3},~{s4},~{s5},~{s6},~{s7},~{s8},~{s9},~{s10},~{s11},~{s12},~{s13},~{s14},~{s15},~{s16},~{s17},~{s18},~{s19},~{s20},~{s21},~{s22},~{s23},~{s24},~{s25},~{s26},~{s27},~{s28},~{s29},~{s30},~{s31}"() nounwind
   %1 = fadd double %0, 3.14159
+  %2 = bitcast [13 x i32]* %v1 to i8*
+  call void @test_memset(i8* align 4 %2, i8 0, i32 24, i1 true)
+  store i32 0, i32* %a5, align 4
+  call void @test_func2()
+  %3 = bitcast [11 x i32]* %v2 to i8*
+  call void @test_memset(i8* align 4 %3, i8 0, i32 20, i1 true)
   br label %return
 
 return:
   ret double %1
 
 ; CHECK-LABEL: test_spill
-; CHECK: efdadd
+; CHECK: li [[VREG:[0-9]+]], 256
+; CHECK: evstddx {{[0-9]+}}, {{[0-9]+}}, [[VREG]]
+; CHECK-NOT: evstdd {{[0-9]+}}, 256({{[0-9]+}}
 ; CHECK: evstdd
+; CHECK: efdadd
 ; CHECK: evldd
 }