This is an archive of the discontinued LLVM Phabricator instance.

Differential D24011

[ConstantFold] Add a flag for ppc_fp128 constant folding, since APFloat doesn't support double-double semantic
Abandoned · Public

Authored by timshen on Aug 29 2016, 4:11 PM.

Details

Reviewers

echristo
kbarton
iteratee
hfinkel

Summary

There is a FIXME in APFloat noting that ppc_fp128 is supported through IEEE 128-bit semantics, not IBM double-double semantics. Add a flag to turn off the constant folding and leave the computation to runtime.

Event Timeline

timshen updated this revision to Diff 69633. Aug 29 2016, 4:11 PM

timshen retitled this revision from to [ConstantFold] Add a flag for ppc_fp128 constant folding, since APFloat doesn't support double-double semantic.

timshen updated this object.

timshen added reviewers: hfinkel, kbarton, echristo, iteratee.

timshen added a subscriber: llvm-commits.

Herald added subscribers: nemanjai, mehdi_amini. Aug 29 2016, 4:11 PM

This looks super suspicious. Why can't we fix this for real in APFloat?

I'd be more comfortable with a patch that didn't attempt to fold in the face of PPCDoubleDouble without a flag to control it.

In the past, we've been comfortable leaving this as an open known issue, especially given that the semantics of PPC long double are not IEEE anyway - the real fix involves creating an APFloat-based double-double implementation which can be used to evaluate constants. How important is it that we make this kind of change now?

In D24011#528387, @majnemer wrote:

I'd be more comfortable with a patch that didn't attempt to fold in the face of PPCDoubleDouble without a flag to control it.

We (me and iteratee) don't mind removing the flags. @hfinkel and @kbarton?

In D24011#528401, @hfinkel wrote:

In the past, we've been comfortable leaving this as an open known issue, especially given that the semantics of PPC long double are not IEEE anyway - the real fix involves creating an APFloat-based double-double implementation which can be used to evaluate constants.

The problem this patch fixes is not IEEE conformance but consistency. Currently libstdc++'s std::numeric_limits<long double>::epsilon() returns an epsilon value that may be suitable for double-double semantics, but not for the compile-time IEEE 128 semantics.

How important is it that we make this kind of change now?

By "this kind of change" you mean being able to turn off constant folding for ppc_fp128? We observe internal test failures that are due to the mix of std::numeric_limits<long double>::epsilon() and LLVM constant folding, and we'd like to make the tests green, especially when they seem to use the standard library correctly. :)

In D24011#528429, @timshen wrote:

In D24011#528387, @majnemer wrote:

I'd be more comfortable with a patch that didn't attempt to fold in the face of PPCDoubleDouble without a flag to control it.

We (me and iteratee) don't mind removing the flags. @hfinkel and @kbarton?

Agreed. We should either do this or not.

In D24011#528401, @hfinkel wrote:

In the past, we've been comfortable leaving this as an open known issue, especially given that the semantics of PPC long double are not IEEE anyway - the real fix involves creating an APFloat-based double-double implementation which can be used to evaluate constants.

The problem this patch fixes is not IEEE conformance but consistency. Currently libstdc++'s std::numeric_limits<long double>::epsilon() returns an epsilon value that may be suitable for double-double semantics, but not for the compile-time IEEE 128 semantics.

Yes, I completely understand that.

How important is it that we make this kind of change now?

By "this kind of change" you mean being able to turn off constant folding for ppc_fp128? We observe internal test failures that are due to the mix of std::numeric_limits<long double>::epsilon() and LLVM constant folding, and we'd like to make the tests green, especially when they seem to use the standard library correctly. :)

Yes. I'm certainly sympathetic to wanting correctness tests to pass. However, this has been a known issue in LLVM for nearly a decade, and we could have made this change at any time. We did not because we judged the benefit to be not worth the performance cost to applications that benefit from the constant folding. I'm not against doing this - I suspect few applications are depending on long-double calculations on PPC for high performance computation, but I think we need to understand the rationale in light of the history here.

In D24011#528490, @hfinkel wrote:

Yes. I'm certainly sympathetic to wanting correctness tests to pass. However, this has been a known issue in LLVM for nearly a decade, and we could have made this change at any time. We did not because we judged the benefit to be not worth the performance cost to applications that benefit from the constant folding. I'm not against doing this - I suspect few applications are depending on long-double calculations on PPC for high performance computation, but I think we need to understand the rationale in light of the history here.

Currently Google is transitioning from GCC to LLVM, and cares about PowerPC, and we care about correctness over performance gain due to ppc long double constant folding. This makes a difference from the past decade, from our perspective.

If few applications seem to depend on this optimization, I wonder if we can turn off the constant folding for now, and fix APFloat later if someone really cares about it?

In D24011#529508, @timshen wrote:

If few applications seem to depend on this optimization, I wonder if we can turn off the constant folding for now, and fix APFloat later if someone really cares about it?

Just wanted to note that the front-end would still encounter the issue with APFloat, and I would be concerned if the (front-end) constant expression evaluation for ppc_fp128 is disabled entirely.

In D24011#529529, @hubert.reinterpretcast wrote:

In D24011#529508, @timshen wrote:

If few applications seem to depend on this optimization, I wonder if we can turn off the constant folding for now, and fix APFloat later if someone really cares about it?

Just wanted to note that the front-end would still encounter the issue with APFloat, and I would be concerned if the (front-end) constant expression evaluation for ppc_fp128 is disabled entirely.

Good note. No, we don't touch the front-end currently. :)

In D24011#529603, @timshen wrote:

In D24011#529529, @hubert.reinterpretcast wrote:

In D24011#529508, @timshen wrote:

If few applications seem to depend on this optimization, I wonder if we can turn off the constant folding for now, and fix APFloat later if someone really cares about it?

Just wanted to note that the front-end would still encounter the issue with APFloat, and I would be concerned if the (front-end) constant expression evaluation for ppc_fp128 is disabled entirely.

Good note. No, we don't touch the front-end currently. :)

Yes, but Hubert is right. To fully make this change, you'll need to disable constexpr evaluation of long doubles in Clang also (which I suspect might make it non-conforming).

In the end, I don't think that going for the easy fix here is all that useful. We'll need to invest the time in making an APFloat-based double-double implementation and use that for constant folding.

If the implementations in compiler-rt/lib/builtins/ppc/gcc_q*.c are good, then we could base the implementation, algorithmically, on those.
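For readers unfamiliar with the representation: a double-double value is the unevaluated sum of two doubles, hi + lo. The following is a minimal sketch of double-double addition built on the textbook TwoSum error-free transformation. It is not the compiler-rt or GCC code, and the names DoubleDouble, twoSum, and addDD are hypothetical. It assumes strict IEEE double arithmetic (no -ffast-math, no x87 excess precision, no flush-to-zero) and ignores overflow, infinities, and NaNs.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical illustration type: the value represented is hi + lo.
struct DoubleDouble {
  double hi;
  double lo;
};

// TwoSum (Knuth): s is the rounded sum and e the exact rounding error,
// so a + b == s + e holds exactly when no overflow occurs.
inline DoubleDouble twoSum(double a, double b) {
  double s = a + b;
  double bb = s - a;
  double e = (a - (s - bb)) + (b - bb);
  return {s, e};
}

// Simplified double-double addition: add the high parts exactly, fold in
// the low parts, then renormalize so that hi carries the leading bits.
inline DoubleDouble addDD(DoubleDouble x, DoubleDouble y) {
  assert(std::isfinite(x.hi) && std::isfinite(y.hi));
  DoubleDouble s = twoSum(x.hi, y.hi);
  double lo = s.lo + (x.lo + y.lo);
  return twoSum(s.hi, lo);
}
```

With this sketch, adding {1.0, 0.0} and {eps, 0.0}, where eps is the smallest positive subnormal double, yields {1.0, eps}, and adding {-1.0, 0.0} to that result gives back eps. A single 106-bit significand cannot hold 1.0 + eps, which is the mismatch discussed in this review.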

Yes, but Hubert is right. To fully make this change, you'll need to disable constexpr evaluation of long doubles in Clang also (which I suspect might make it non-conforming).

An incorrect frontend behavior isn't a good justification for keeping an incorrect backend behavior.

In the end, I don't think that going for the easy fix here is all that useful. We'll need to invest the time in making an APFloat-based double-double implementation and use that for constant folding.

Correctness is its own reward.
As a plus, this patch doesn't interfere with someone else adding double double support to APFloat and using that in both the frontend and the backend.

If the implementations in compiler-rt/lib/builtins/ppc/gcc_q*.c are good, then we could base the implementation, algorithmically, on those.

Yes, that would be the way forward.

I think the important question here is if this patch is an improvement on the way to the best solution you've outlined here.
Tim and I both think that it is.

In D24011#529901, @iteratee wrote:

Yes, but Hubert is right. To fully make this change, you'll need to disable constexpr evaluation of long doubles in Clang also (which I suspect might make it non-conforming).

An incorrect frontend behavior isn't a good justification for keeping an incorrect backend behavior.

I didn't say that it was. My point is that I think you'll end up needing the full solution regardless, because the behavior will surface in places, such as constexpr evaluation in Clang, that you can't (as easily) turn off.

In the end, I don't think that going for the easy fix here is all that useful. We'll need to invest the time in making an APFloat-based double-double implementation and use that for constant folding.

Correctness is its own reward.
As a plus, this patch doesn't interfere with someone else adding double double support to APFloat and using that in both the frontend and the backend.

It is the "someone else" I have a problem with here. I'm not okay with this patch as the desired end state of the work on this. Sure, you can disable constant folding, but that doesn't change the fact that LLVM has an APFloat with a PPCDoubleDouble mode which does not match the runtime implementation, and other uses of the LLVM libraries (i.e. Clang) will still use that implementation.

If the implementations in compiler-rt/lib/builtins/ppc/gcc_q*.c are good, then we could base the implementation, algorithmically, on those.

Yes, that would be the way forward.

Yes, *if* they exactly match the behavior of the GCC implementations? Or do they just need to be consistent with the properties implied by numeric_limits<long double>? What are your requirements here?

I think the important question here is if this patch is an improvement on the way to the best solution you've outlined here.
Tim and I both think that it is.

It leaves the overall infrastructure only more-subtly broken. We might do it to make our lives easier, but it is not an improvement. As a community, sometimes we decide to say, "this has been a known problem for a long time, and so we don't want the band-aid, we'll wait for a real solution." This has certainly been a known problem for a long time, and an important question here is: Do we need to leave this as it is to motivate work on a real solution?

Here's another possible short-term approach: actually make the APFloat representation used for PPC long double large enough to represent ((1.0 + epsilon) - 1.0), where epsilon == 4.94066e-324. I'm under the impression that this is the property that is failing for you, and it is indicative of the fact that the current 128-bit representation cannot represent all of the numbers that the runtime long double type can represent.

As a quick experiment, I'm assuming that we don't want constant folding to change the output of this program:

$ cat ~/eps.cpp 
#include <iostream>
#include <limits>
using namespace std;

int main() {
  long double x = 1.0;
  cout << (x + numeric_limits<long double>::epsilon()) - x << "\n";
}

which can be accomplished by this patch:

diff --git a/lib/Support/APFloat.cpp b/lib/Support/APFloat.cpp
index f9370b8..81cef0e 100644
--- a/lib/Support/APFloat.cpp
+++ b/lib/Support/APFloat.cpp
@@ -76,7 +76,7 @@ namespace llvm {
      to represent all possible values held by a PPC double-double number,
      for example: (long double) 1.0 + (long double) 0x1p-106
      Should this be replaced by a full emulation of PPC double-double?  */
-  const fltSemantics APFloat::PPCDoubleDouble = { 1023, -1022 + 53, 53 + 53, 128 };
+  const fltSemantics APFloat::PPCDoubleDouble = { 1023, -1022 + 53, 20*(53 + 53), 128 };
 
   /* A tight upper bound on number of parts required to hold the value
      pow(5, power) is
@@ -2926,7 +2926,7 @@ APInt
 APFloat::convertPPCDoubleDoubleAPFloatToAPInt() const
 {
   assert(semantics == (const llvm::fltSemantics*)&PPCDoubleDouble);
-  assert(partCount()==2);
+  // assert(partCount()==2);
 
   uint64_t words[2];
   opStatus fs;
@@ -2947,7 +2947,7 @@ APFloat::convertPPCDoubleDoubleAPFloatToAPInt() const
 
   APFloat u(extended);
   fs = u.convert(IEEEdouble, rmNearestTiesToEven, &losesInfo);
-  assert(fs == opOK || fs == opInexact);
+  // assert(fs == opOK || fs == opInexact);
   (void)fs;
   words[0] = *u.convertDoubleAPFloatToAPInt().getRawData();
 
@@ -2963,7 +2963,7 @@ APFloat::convertPPCDoubleDoubleAPFloatToAPInt() const
     APFloat v(extended);
     v.subtract(u, rmNearestTiesToEven);
     fs = v.convert(IEEEdouble, rmNearestTiesToEven, &losesInfo);
-    assert(fs == opOK && !losesInfo);
+    // assert(fs == opOK && !losesInfo);
     (void)fs;
     words[1] = *v.convertDoubleAPFloatToAPInt().getRawData();
   } else {

and with this patch applied, even when compiled with optimizations enabled, the test program still prints "4.94066e-324" like it should. This is obviously also a work-around, but:

It should only cause APFloat answers to differ from runtime answers by being more accurate, not less.
It isolates the work-around to the APFloat implementation, instead of trying to spread the knowledge of the APFloat deficiencies around the rest of the infrastructure.
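As a sanity check on the precision chosen in the patch above: holding 1.0 + epsilon exactly in one significand requires bits spanning from 2^0 down to the exponent of epsilon. A minimal sketch, assuming epsilon is the smallest positive subnormal double (the 4.94066e-324 quoted above); bitsToHoldSum is a hypothetical helper for illustration and is not part of APFloat.

```cpp
#include <cassert>
#include <cmath>

// Number of significand bits a single binary floating-point value needs in
// order to hold hiTerm + loTerm exactly, where both arguments are positive
// powers of two and hiTerm >= loTerm. Hypothetical helper, illustration only.
inline int bitsToHoldSum(double hiTerm, double loTerm) {
  assert(loTerm > 0.0 && hiTerm >= loTerm);
  // ilogb returns the unbiased binary exponent, including for subnormals.
  return std::ilogb(hiTerm) - std::ilogb(loTerm) + 1;
}
```

For hiTerm = 1.0 and loTerm = 2^-1074 this evaluates to 1075, so the original 53 + 53 = 106 bits fall short, while 20*(53 + 53) = 2120 bits suffice. The same helper gives 107 for 1.0 + 0x1p-106, the example in the APFloat comment quoted in the patch.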

P.S. With this patch applied, all Clang regression tests pass, and all LLVM regression tests pass except for the ADT/ADTTests/APFloatTest.PPCDoubleDouble unit test (for a few checks that are checking the representation itself).

timshen planned changes to this revision. Aug 31 2016, 12:00 PM

Somewhat sadly I think the best move forward is going to be getting ppc_fp128 support in APFloat. Moving forward with IEEE754r support in Power9 it might make sense to switch the platform to support that by default, but we're not there yet.

-eric

timshen abandoned this revision. Oct 25 2016, 4:45 PM

Revision Contents

Path                                                    Size
llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp          72 lines
llvm/lib/IR/ConstantFold.cpp                            45 lines
llvm/test/CodeGen/PowerPC/ppc128-constant-folding.ll    21 lines

Diff 69633

llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp

(47 lines not shown)
#include "llvm/Target/TargetRegisterInfo.h"		#include "llvm/Target/TargetRegisterInfo.h"
#include "llvm/Target/TargetSubtargetInfo.h"		#include "llvm/Target/TargetSubtargetInfo.h"
#include <algorithm>		#include <algorithm>
#include <cmath>		#include <cmath>
#include <utility>		#include <utility>

using namespace llvm;		using namespace llvm;

		extern cl::opt<bool> EnablePPCFp128ConstantFold;

/// makeVTList - Return an instance of the SDVTList struct initialized with the		/// makeVTList - Return an instance of the SDVTList struct initialized with the
/// specified members.		/// specified members.
static SDVTList makeVTList(const EVT *VTs, unsigned NumVTs) {		static SDVTList makeVTList(const EVT *VTs, unsigned NumVTs) {
SDVTList Res = {VTs, NumVTs};		SDVTList Res = {VTs, NumVTs};
return Res;		return Res;
}		}

// Default null implementations of the callbacks.		// Default null implementations of the callbacks.
(3,696 lines not shown)
	if (SDValue SV =
return SV;		return SV;

// Constant fold FP operations.		// Constant fold FP operations.
bool HasFPExceptions = TLI->hasFloatingPointExceptions();		bool HasFPExceptions = TLI->hasFloatingPointExceptions();
if (N1CFP) {		if (N1CFP) {
if (N2CFP) {		if (N2CFP) {
APFloat V1 = N1CFP->getValueAPF(), V2 = N2CFP->getValueAPF();		APFloat V1 = N1CFP->getValueAPF(), V2 = N2CFP->getValueAPF();
APFloat::opStatus s;		APFloat::opStatus s;

		if ((&V1.getSemantics() != &APFloat::PPCDoubleDouble &&
		&V2.getSemantics() != &APFloat::PPCDoubleDouble) ||
		EnablePPCFp128ConstantFold) {
switch (Opcode) {		switch (Opcode) {
case ISD::FADD:		case ISD::FADD:
s = V1.add(V2, APFloat::rmNearestTiesToEven);		s = V1.add(V2, APFloat::rmNearestTiesToEven);
if (!HasFPExceptions || s != APFloat::opInvalidOp)		if (!HasFPExceptions || s != APFloat::opInvalidOp)
return getConstantFP(V1, DL, VT);		return getConstantFP(V1, DL, VT);
break;		break;
case ISD::FSUB:		case ISD::FSUB:
s = V1.subtract(V2, APFloat::rmNearestTiesToEven);		s = V1.subtract(V2, APFloat::rmNearestTiesToEven);
if (!HasFPExceptions || s!=APFloat::opInvalidOp)		if (!HasFPExceptions || s != APFloat::opInvalidOp)
return getConstantFP(V1, DL, VT);		return getConstantFP(V1, DL, VT);
break;		break;
case ISD::FMUL:		case ISD::FMUL:
s = V1.multiply(V2, APFloat::rmNearestTiesToEven);		s = V1.multiply(V2, APFloat::rmNearestTiesToEven);
if (!HasFPExceptions || s!=APFloat::opInvalidOp)		if (!HasFPExceptions || s != APFloat::opInvalidOp)
return getConstantFP(V1, DL, VT);		return getConstantFP(V1, DL, VT);
break;		break;
case ISD::FDIV:		case ISD::FDIV:
s = V1.divide(V2, APFloat::rmNearestTiesToEven);		s = V1.divide(V2, APFloat::rmNearestTiesToEven);
if (!HasFPExceptions || (s!=APFloat::opInvalidOp &&		if (!HasFPExceptions ||
s!=APFloat::opDivByZero)) {		(s != APFloat::opInvalidOp && s != APFloat::opDivByZero)) {
return getConstantFP(V1, DL, VT);		return getConstantFP(V1, DL, VT);
}		}
break;		break;
case ISD::FREM :		case ISD::FREM:
s = V1.mod(V2);		s = V1.mod(V2);
if (!HasFPExceptions || (s!=APFloat::opInvalidOp &&		if (!HasFPExceptions ||
s!=APFloat::opDivByZero)) {		(s != APFloat::opInvalidOp && s != APFloat::opDivByZero)) {
return getConstantFP(V1, DL, VT);		return getConstantFP(V1, DL, VT);
}		}
break;		break;
case ISD::FCOPYSIGN:		case ISD::FCOPYSIGN:
V1.copySign(V2);		V1.copySign(V2);
return getConstantFP(V1, DL, VT);		return getConstantFP(V1, DL, VT);
default: break;		default:
		break;
		}
}		}
}		}

if (Opcode == ISD::FP_ROUND) {		if (Opcode == ISD::FP_ROUND) {
APFloat V = N1CFP->getValueAPF(); // make copy		APFloat V = N1CFP->getValueAPF(); // make copy
bool ignored;		bool ignored;
// This can return overflow, underflow, or inexact; we don't care.		// This can return overflow, underflow, or inexact; we don't care.
// FIXME need to be more flexible about rounding mode.		// FIXME need to be more flexible about rounding mode.
(3,481 lines not shown)

llvm/lib/IR/ConstantFold.cpp

(28 lines not shown)
#include "llvm/IR/Operator.h"		#include "llvm/IR/Operator.h"
#include "llvm/IR/PatternMatch.h"		#include "llvm/IR/PatternMatch.h"
#include "llvm/Support/ErrorHandling.h"		#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/ManagedStatic.h"		#include "llvm/Support/ManagedStatic.h"
#include "llvm/Support/MathExtras.h"		#include "llvm/Support/MathExtras.h"
using namespace llvm;		using namespace llvm;
using namespace llvm::PatternMatch;		using namespace llvm::PatternMatch;

		cl::opt<bool> EnablePPCFp128ConstantFold(
		"enable-ppcfp128-constant-fold",
		cl::desc("Enable inaccurate ppcfp128 constant folding"), cl::init(true));

//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
// ConstantFold*Instruction Implementations		// ConstantFold*Instruction Implementations
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

/// Convert the specified vector Constant node to the specified vector type.		/// Convert the specified vector Constant node to the specified vector type.
/// At this point, we know that the elements of the input vector constant are		/// At this point, we know that the elements of the input vector constant are
/// all simple integer or FP values.		/// all simple integer or FP values.
static Constant *BitCastConstantVector(Constant *CV, VectorType *DstTy) {		static Constant *BitCastConstantVector(Constant *CV, VectorType *DstTy) {
(1,134 lines not shown)
	if (ConstantInt *CI1 = dyn_cast<ConstantInt>(C1)) {
default:		default:
break;		break;
}		}
} else if (ConstantFP *CFP1 = dyn_cast<ConstantFP>(C1)) {		} else if (ConstantFP *CFP1 = dyn_cast<ConstantFP>(C1)) {
if (ConstantFP *CFP2 = dyn_cast<ConstantFP>(C2)) {		if (ConstantFP *CFP2 = dyn_cast<ConstantFP>(C2)) {
const APFloat &C1V = CFP1->getValueAPF();		const APFloat &C1V = CFP1->getValueAPF();
const APFloat &C2V = CFP2->getValueAPF();		const APFloat &C2V = CFP2->getValueAPF();
APFloat C3V = C1V; // copy for modification		APFloat C3V = C1V; // copy for modification

		if ((&C3V.getSemantics() != &APFloat::PPCDoubleDouble &&
		&C2V.getSemantics() != &APFloat::PPCDoubleDouble) ||
		EnablePPCFp128ConstantFold) {
switch (Opcode) {		switch (Opcode) {
default:		default:
break;		break;
case Instruction::FAdd:		case Instruction::FAdd:
(void)C3V.add(C2V, APFloat::rmNearestTiesToEven);		(void)C3V.add(C2V, APFloat::rmNearestTiesToEven);
return ConstantFP::get(C1->getContext(), C3V);		return ConstantFP::get(C1->getContext(), C3V);
case Instruction::FSub:		case Instruction::FSub:
(void)C3V.subtract(C2V, APFloat::rmNearestTiesToEven);		(void)C3V.subtract(C2V, APFloat::rmNearestTiesToEven);
return ConstantFP::get(C1->getContext(), C3V);		return ConstantFP::get(C1->getContext(), C3V);
case Instruction::FMul:		case Instruction::FMul:
(void)C3V.multiply(C2V, APFloat::rmNearestTiesToEven);		(void)C3V.multiply(C2V, APFloat::rmNearestTiesToEven);
return ConstantFP::get(C1->getContext(), C3V);		return ConstantFP::get(C1->getContext(), C3V);
case Instruction::FDiv:		case Instruction::FDiv:
(void)C3V.divide(C2V, APFloat::rmNearestTiesToEven);		(void)C3V.divide(C2V, APFloat::rmNearestTiesToEven);
return ConstantFP::get(C1->getContext(), C3V);		return ConstantFP::get(C1->getContext(), C3V);
case Instruction::FRem:		case Instruction::FRem:
(void)C3V.mod(C2V);		(void)C3V.mod(C2V);
return ConstantFP::get(C1->getContext(), C3V);		return ConstantFP::get(C1->getContext(), C3V);
}		}
}		}
		}
} else if (VectorType *VTy = dyn_cast<VectorType>(C1->getType())) {		} else if (VectorType *VTy = dyn_cast<VectorType>(C1->getType())) {
// Perform elementwise folding.		// Perform elementwise folding.
SmallVector<Constant*, 16> Result;		SmallVector<Constant*, 16> Result;
Type *Ty = IntegerType::get(VTy->getContext(), 32);		Type *Ty = IntegerType::get(VTy->getContext(), 32);
for (unsigned i = 0, e = VTy->getNumElements(); i != e; ++i) {		for (unsigned i = 0, e = VTy->getNumElements(); i != e; ++i) {
Constant *LHS =		Constant *LHS =
ConstantExpr::getExtractElement(C1, ConstantInt::get(Ty, i));		ConstantExpr::getExtractElement(C1, ConstantInt::get(Ty, i));
Constant *RHS =		Constant *RHS =
(1,072 lines not shown)

llvm/test/CodeGen/PowerPC/ppc128-constant-folding.ll

This file was added.

				; RUN: llc -O0 -enable-ppcfp128-constant-fold=true < %s | FileCheck %s -check-prefix=ENABLE
				; RUN: llc -O0 -enable-ppcfp128-constant-fold=false < %s | FileCheck %s -check-prefix=DISABLE
				target triple = "powerpc64le-linux-gnu"

				; The code returns the second double in integer, which is able to be constant-folded.
				; int64_t Foo() {
				; long double sum = 1.0L + std::numeric_limits<long double>::epsilon();
				; return *((int64_t*)(&sum) + 1);
				; }

				; ENABLE-NOT: __gcc_qadd
				; DISABLE: __gcc_qadd
				define i64 @Foo() {
				%1 = alloca ppc_fp128, align 16
				%2 = fadd ppc_fp128 0xM3FF00000000000000000000000000000, 0xM00000000000000010000000000000000
				store ppc_fp128 %2, ppc_fp128* %1, align 16
				%3 = bitcast ppc_fp128* %1 to i64*
				%4 = getelementptr inbounds i64, i64* %3, i64 1
				%5 = load i64, i64* %4, align 8
				ret i64 %5
				}
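
To read the two ppc_fp128 operands of the fadd in this test, it may help to decode each 64-bit half as an IEEE double. A minimal sketch, under the assumption that the first 16 hexadecimal digits after 0xM hold the high double and the last 16 the low double; bitsToDouble is a hypothetical helper, not an LLVM API.

```cpp
#include <cstdint>
#include <cstring>

// Reinterpret a 64-bit pattern as an IEEE-754 double without violating
// strict-aliasing rules.
inline double bitsToDouble(std::uint64_t bits) {
  double d;
  static_assert(sizeof(d) == sizeof(bits), "double must be 64 bits wide");
  std::memcpy(&d, &bits, sizeof(d));
  return d;
}
```

Under that assumption, the first operand decodes to the pair (bitsToDouble(0x3FF0000000000000), bitsToDouble(0)), which is 1.0 with a zero low part, and the second to (bitsToDouble(0x0000000000000001), bitsToDouble(0)), the smallest positive subnormal double with a zero low part. This matches the 1.0L + epsilon expression in the test's comment.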