
Details

Reviewers

jholewinski
jingyue

Summary

Make NVPTXISelDAGToDAG able to emit cached loads (LDG) for pointer induction variables.

Also fix latent bug where LDG was not restricted to kernel functions. I believe that this could not be triggered so far since we do not currently infer that a pointer is global outside a kernel function, and only loads of global pointers are considered for cached loads.

This brings a 30% performance gain on some eigen3-based Google-internal CUDA benchmarks, where LLVM introduces a pointer induction variable and then previously couldn't use LDG.
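As a rough illustration of why a pointer induction variable blocked LDG before this change, the sketch below models the address computation with a toy IR. `Node`, `stripGEPs`, `collectUnderlying`, `allNoAliasReadOnlyArgs`, and `demoPointerInduction` are hypothetical names, not LLVM's API. The one assumption carried over from the patch is that a single-object walk stops at a phi node, while a multi-object walk looks through it.

```cpp
#include <cassert>
#include <set>
#include <utility>
#include <vector>

// Toy IR for illustration only; these are not LLVM classes.
enum class Kind { Argument, GEP, Phi };

struct Node {
  Kind kind;
  std::vector<const Node *> operands; // GEP: {base pointer}; Phi: incoming values
  bool noalias;
  bool readonly;
};

// Single-object walk: strips address arithmetic but stops at a phi.
const Node *stripGEPs(const Node *v) {
  while (v->kind == Kind::GEP)
    v = v->operands[0];
  return v;
}

// Multi-object walk: also looks through phis and collects every root.
void collectUnderlying(const Node *v, std::set<const Node *> &visited,
                       std::vector<const Node *> &out) {
  v = stripGEPs(v);
  if (!visited.insert(v).second)
    return; // already seen; this is what breaks the loop-carried cycle
  if (v->kind == Kind::Phi) {
    for (const Node *in : v->operands)
      collectUnderlying(in, visited, out);
    return;
  }
  out.push_back(v);
}

// True only if every root is a noalias, readonly argument.
// The non-empty check is a safeguard of this toy model, not taken from the patch.
bool allNoAliasReadOnlyArgs(const Node *ptr) {
  std::set<const Node *> visited;
  std::vector<const Node *> objs;
  collectUnderlying(ptr, visited, objs);
  if (objs.empty())
    return false;
  for (const Node *o : objs)
    if (o->kind != Kind::Argument || !o->noalias || !o->readonly)
      return false;
  return true;
}

// Models: %ptr = phi [%from, entry], [%next, loop]; %next = gep %ptr, 1
// Returns {single-object walk reaches an argument, multi-object walk approves}.
std::pair<bool, bool> demoPointerInduction() {
  Node from{Kind::Argument, {}, true, true};
  Node phi{Kind::Phi, {}, false, false};
  Node next{Kind::GEP, {&phi}, false, false};
  phi.operands = {&from, &next};
  bool single = stripGEPs(&next)->kind == Kind::Argument;
  bool multi = allNoAliasReadOnlyArgs(&next);
  return {single, multi};
}
```

In this model the single-object walk ends at the phi and never reaches the argument, whereas the multi-object walk finds `from` as the only root and accepts the load.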

Diff Detail

Event Timeline

broune updated this revision to Diff 31382. Aug 5 2015, 11:33 AM

broune retitled this revision to [NVPTX] Use LDG for pointer induction variables.

broune updated this object.

broune added reviewers: jholewinski, jingyue.

broune added subscribers: llvm-commits, meheff, eliben.

Herald added a subscriber: jholewinski. Aug 5 2015, 11:33 AM

broune added a subscriber: wengxt. Aug 5 2015, 11:33 AM

eliben added inline comments. Aug 5 2015, 11:40 AM

lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp
551	Can't you get the data layout from F now?
571	maybe Objs for coding style (elsewhere too)?
572	How come const_cast is needed now but not before?
575	Seems like a good candidate for auto, since the type is repeated in the cast?

jingyue added inline comments. Aug 5 2015, 11:48 AM

test/CodeGen/NVPTX/load-with-non-coherent-cache.ll
216	This comment doesn't parse to me. Do you mean we can only know that the parameter is global but don't know it's never written to from the caller?

wengxt added inline comments. Aug 5 2015, 11:59 AM

test/CodeGen/NVPTX/load-with-non-coherent-cache.ll
222	%from needs to be a global address space pointer to make this check meaningful.

Address eliben's comments.

lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp
551	Good point. Done.
571	Done. Actually lots of variables in this file are leading-lower-case independent of my change. I'll fix that later, but it will change lots of lines, so I'll keep it separate from this change.
572	GetUnderlyingObject has an overload that takes a const Value * and just wraps the non-const version using const_cast. There is no such overload for GetUnderlyingObjects, which makes sense, since one of the parameters is a container of Value * that would have to become a container of const Value *. Converting between such containers is problematic. So I think it's reasonable not to have the const overload, even though it does push const_cast into the callers. I checked previously what other callers of GetUnderlyingObjects do in this case and they also use const_cast.
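The container problem described above can be sketched in isolation. The code below uses stand-in names (`Value`, `rootOf`, `collectObjects`, `countObjects`) that are hypothetical and unrelated to LLVM's real signatures. It only shows why a single-result query can offer a const overload cheaply while a container-filling query pushes `const_cast` to its callers.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>
#include <vector>

struct Value { int id; }; // stand-in type, not llvm::Value

// Single-result query: a const overload is a one-line wrapper.
Value *rootOf(Value *v) { return v; }
const Value *rootOf(const Value *v) {
  return rootOf(const_cast<Value *>(v));
}

// Container-filling query: only the non-const form exists.
void collectObjects(Value *v, std::vector<Value *> &out) { out.push_back(v); }

// The two vector types are unrelated, so a const overload could not simply
// forward a caller's std::vector<const Value *> to the function above.
static_assert(!std::is_convertible<std::vector<Value *> &,
                                   std::vector<const Value *> &>::value,
              "vector<Value *> does not convert to vector<const Value *>");

// A caller holding a const pointer therefore casts at the call site.
// Nothing is written through the pointer, so the cast is safe here.
std::size_t countObjects(const Value *v) {
  std::vector<Value *> objs;
  collectObjects(const_cast<Value *>(v), objs);
  return objs.size();
}
```

The trade-off is the one stated in the reply: the library avoids a second container type, and each const caller carries one visible `const_cast`.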

Add @notkernel2 test.

test/CodeGen/NVPTX/load-with-non-coherent-cache.ll
216	We also don't know that the parameter is global. "only" applies to both sides of the "and", so the statement is that we don't know either one of those, since @notkernel is not a kernel function. A more detailed way of stating what I'm getting at: In order to know that a parameter is a global pointer from within a non-kernel function, we would have to do inter-procedural analysis to see where the values for that parameter come from. This present function is not a kernel function, and we don't do inter-procedural analysis (and it doesn't have any callers anyway), so we cannot infer that it is global in this case. Thus we cannot use LDG. The same statement applies if you replace "global pointer" with "never written to". "never written to" is intended to include everything that happens for the duration of the kernel call (not just this function call), though I suppose "never" actually covers more than that, so I clarified it in the comment.
222	The meaning of the test is to capture what actually happens for a device function, where there are currently two independent reasons that LDG cannot be used. Though it's a good suggestion to have a test where the parameter is explicitly marked as global, so I added that. Thanks for the suggestion.
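The conditions discussed in this thread can be summarized as a small predicate. This is a sketch under stated assumptions, not the code in NVPTXISelDAGToDAG.cpp: `LoadFacts`, `canUseCachedLoad`, and `loadPrefix` are hypothetical names, and each field stands for a fact that the real selector derives from the subtarget, the address space, the function kind, and the underlying objects of the pointer.

```cpp
#include <cassert>
#include <string>

// Hypothetical summary of what the selector knows about one load.
struct LoadFacts {
  bool subtargetHasLDG;        // target supports the non-coherent load
  bool globalAddrSpace;        // pointer is known to be in global memory
  bool inKernelFunction;       // enclosing function is a kernel entry point
  bool fromNoAliasReadOnlyArg; // every underlying object is a noalias readonly argument
};

// All four conditions must hold; failing any one falls back to a plain load.
bool canUseCachedLoad(const LoadFacts &f) {
  return f.subtargetHasLDG && f.globalAddrSpace && f.inKernelFunction &&
         f.fromNoAliasReadOnlyArg;
}

// Instruction prefix without the type suffix, mirroring the CHECK lines in the test.
std::string loadPrefix(const LoadFacts &f) {
  if (canUseCachedLoad(f))
    return "ld.global.nc";
  return f.globalAddrSpace ? "ld.global" : "ld";
}
```

Under this model, `@notkernel` fails on two fields (not global, not a kernel) and gets `ld`, while `@notkernel2` fails only on the kernel condition and gets `ld.global`, which matches the expectations in load-with-non-coherent-cache.ll.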

LGTM

lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp
552	I'd copy some comments at load-with-non-coherent-cache.ll:216 here too, so that people reading source code will quickly understand why we only do this for kernel functions.
test/CodeGen/NVPTX/load-with-non-coherent-cache.ll
216	Thanks. Makes sense now.

This revision is now accepted and ready to land. Aug 5 2015, 3:36 PM

jingyue added inline comments. Aug 5 2015, 3:39 PM

test/CodeGen/NVPTX/load-with-non-coherent-cache.ll
243	FYI, adding the keyword ptx_kernel to the function definition also makes `isKernelFunction` return true. That way, you don't need a long list of metadata. I noticed that only recently.

Tiny comment improvement. Thanks to Jingyue for pointing out that the comment was ambiguous.

broune added inline comments. Aug 5 2015, 3:43 PM

test/CodeGen/NVPTX/load-with-non-coherent-cache.ll
243	Thanks, that's good to know. I'll fix that for the whole file in my follow-up clean-up CL.

jholewinski added inline comments. Aug 5 2015, 3:44 PM

test/CodeGen/NVPTX/load-with-non-coherent-cache.ll
243	That works, but I would like to deprecate it. The PTX calling conventions are legacy cruft from the old back-end.

Another tiny improvement to the comments.

lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp
552	I added a more thorough comment explaining the conditions for using cached loads.

broune added inline comments. Aug 5 2015, 4:03 PM

test/CodeGen/NVPTX/load-with-non-coherent-cache.ll
243	OK, then I'll leave it out of my follow-up clean-up patch :)

Hi Justin,

Why are we deprecating ptx_kernel? What's bad about it?

Jingyue

Closed by commit at http://reviews.llvm.org/rL244166 .

Diff 31411

lib/Target/NVPTX/NVPTXISelDAGToDAG.cpp

//===-- NVPTXISelDAGToDAG.cpp - A dag to dag inst selector for NVPTX ------===//		//===-- NVPTXISelDAGToDAG.cpp - A dag to dag inst selector for NVPTX ------===//
//		//
// The LLVM Compiler Infrastructure		// The LLVM Compiler Infrastructure
//		//
// This file is distributed under the University of Illinois Open Source		// This file is distributed under the University of Illinois Open Source
// License. See LICENSE.TXT for details.		// License. See LICENSE.TXT for details.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//
//		//
// This file defines an instruction selector for the NVPTX target.		// This file defines an instruction selector for the NVPTX target.
//		//
//===----------------------------------------------------------------------===//		//===----------------------------------------------------------------------===//

#include "NVPTXISelDAGToDAG.h"		#include "NVPTXISelDAGToDAG.h"
		#include "NVPTXUtilities.h"
#include "llvm/Analysis/ValueTracking.h"		#include "llvm/Analysis/ValueTracking.h"
#include "llvm/IR/GlobalValue.h"		#include "llvm/IR/GlobalValue.h"
#include "llvm/IR/Instructions.h"		#include "llvm/IR/Instructions.h"
#include "llvm/Support/CommandLine.h"		#include "llvm/Support/CommandLine.h"
#include "llvm/Support/Debug.h"		#include "llvm/Support/Debug.h"
#include "llvm/Support/ErrorHandling.h"		#include "llvm/Support/ErrorHandling.h"
#include "llvm/Support/raw_ostream.h"		#include "llvm/Support/raw_ostream.h"
#include "llvm/Target/TargetIntrinsicInfo.h"		#include "llvm/Target/TargetIntrinsicInfo.h"
[... 518 lines not shown ...]	if (auto *PT = dyn_cast<PointerType>(Src->getType())) {
case llvm::ADDRESS_SPACE_CONST: return NVPTX::PTXLdStInstCode::CONSTANT;		case llvm::ADDRESS_SPACE_CONST: return NVPTX::PTXLdStInstCode::CONSTANT;
default: break;		default: break;
}		}
}		}
return NVPTX::PTXLdStInstCode::GENERIC;		return NVPTX::PTXLdStInstCode::GENERIC;
}		}

static bool canLowerToLDG(MemSDNode *N, const NVPTXSubtarget &Subtarget,		static bool canLowerToLDG(MemSDNode *N, const NVPTXSubtarget &Subtarget,
unsigned codeAddrSpace, const DataLayout &DL) {		unsigned CodeAddrSpace, MachineFunction *F) {
if (!Subtarget.hasLDG() || codeAddrSpace != NVPTX::PTXLdStInstCode::GLOBAL) {		// To use non-coherent caching, the load has to be from global
		// memory and we have to prove that the memory area is not written
		// to anywhere for the duration of the kernel call, not even after
		// the load.
		//
		// To ensure that there are no writes to the memory, we require the
		// underlying pointer to be a noalias (__restrict) kernel parameter
		// that is never used for a write. We can only do this for kernel
		// functions since from within a device function, we cannot know if
		// there were or will be writes to the memory from the caller - or we
		// could, but then we would have to do inter-procedural analysis.
		if (!Subtarget.hasLDG() || CodeAddrSpace != NVPTX::PTXLdStInstCode::GLOBAL ||
		!isKernelFunction(*F->getFunction())) {
return false;		return false;
}		}

// Check whether load operates on a readonly argument.		// We use GetUnderlyingObjects() here instead of
bool canUseLDG = false;		// GetUnderlyingObject() mainly because the former looks through phi
if (const Argument *A = dyn_cast<const Argument>(		// nodes while the latter does not. We need to look through phi
GetUnderlyingObject(N->getMemOperand()->getValue(), DL)))		// nodes to handle pointer induction variables.
canUseLDG = A->onlyReadsMemory() && A->hasNoAliasAttr();		SmallVector<Value *, 8> Objs;
		GetUnderlyingObjects(const_cast<Value *>(N->getMemOperand()->getValue()),
		Objs, F->getDataLayout());
		for (Value *Obj : Objs) {
		auto *A = dyn_cast<const Argument>(Obj);
		if (!A || !A->onlyReadsMemory() || !A->hasNoAliasAttr()) return false;
		}

return canUseLDG;		return true;
}		}

SDNode *NVPTXDAGToDAGISel::SelectIntrinsicNoChain(SDNode *N) {		SDNode *NVPTXDAGToDAGISel::SelectIntrinsicNoChain(SDNode *N) {
unsigned IID = cast<ConstantSDNode>(N->getOperand(0))->getZExtValue();		unsigned IID = cast<ConstantSDNode>(N->getOperand(0))->getZExtValue();
switch (IID) {		switch (IID) {
default:		default:
return nullptr;		return nullptr;
case Intrinsic::nvvm_texsurf_handle_internal:		case Intrinsic::nvvm_texsurf_handle_internal:
[... 80 lines not shown ...]	if (LD->isIndexed())
return nullptr;		return nullptr;

if (!LoadedVT.isSimple())		if (!LoadedVT.isSimple())
return nullptr;		return nullptr;

// Address Space Setting		// Address Space Setting
unsigned int codeAddrSpace = getCodeAddrSpace(LD);		unsigned int codeAddrSpace = getCodeAddrSpace(LD);

if (canLowerToLDG(LD, *Subtarget, codeAddrSpace, CurDAG->getDataLayout())) {		if (canLowerToLDG(LD, *Subtarget, codeAddrSpace, MF)) {
return SelectLDGLDU(N);		return SelectLDGLDU(N);
}		}

// Volatile Setting		// Volatile Setting
// - .volatile is only availalble for .global and .shared		// - .volatile is only availalble for .global and .shared
bool isVolatile = LD->isVolatile();		bool isVolatile = LD->isVolatile();
if (codeAddrSpace != NVPTX::PTXLdStInstCode::GLOBAL &&		if (codeAddrSpace != NVPTX::PTXLdStInstCode::GLOBAL &&
codeAddrSpace != NVPTX::PTXLdStInstCode::SHARED &&		codeAddrSpace != NVPTX::PTXLdStInstCode::SHARED &&
[... 221 lines not shown ...]	SDNode *NVPTXDAGToDAGISel::SelectLoadVector(SDNode *N) {
EVT LoadedVT = MemSD->getMemoryVT();		EVT LoadedVT = MemSD->getMemoryVT();

if (!LoadedVT.isSimple())		if (!LoadedVT.isSimple())
return nullptr;		return nullptr;

// Address Space Setting		// Address Space Setting
unsigned int CodeAddrSpace = getCodeAddrSpace(MemSD);		unsigned int CodeAddrSpace = getCodeAddrSpace(MemSD);

if (canLowerToLDG(MemSD, *Subtarget, CodeAddrSpace, CurDAG->getDataLayout())) {		if (canLowerToLDG(MemSD, *Subtarget, CodeAddrSpace, MF)) {
return SelectLDGLDU(N);		return SelectLDGLDU(N);
}		}

// Volatile Setting		// Volatile Setting
// - .volatile is only availalble for .global and .shared		// - .volatile is only availalble for .global and .shared
bool IsVolatile = MemSD->isVolatile();		bool IsVolatile = MemSD->isVolatile();
if (CodeAddrSpace != NVPTX::PTXLdStInstCode::GLOBAL &&		if (CodeAddrSpace != NVPTX::PTXLdStInstCode::GLOBAL &&
CodeAddrSpace != NVPTX::PTXLdStInstCode::SHARED &&		CodeAddrSpace != NVPTX::PTXLdStInstCode::SHARED &&
[... 4,202 lines not shown ...]

test/CodeGen/NVPTX/load-with-non-coherent-cache.ll

	[... 183 lines not shown ...]
	; SM35-LABEL: .visible .entry foo18(			; SM35-LABEL: .visible .entry foo18(
	; SM35: ld.global.nc.u64			; SM35: ld.global.nc.u64
	define void @foo18(float ** noalias readonly %from, float ** %to) {			define void @foo18(float ** noalias readonly %from, float ** %to) {
	%1 = load float *, float ** %from			%1 = load float *, float ** %from
	store float * %1, float ** %to			store float * %1, float ** %to
	ret void			ret void
	}			}

	!nvvm.annotations = !{!1 ,!2 ,!3 ,!4 ,!5 ,!6, !7 ,!8 ,!9 ,!10 ,!11 ,!12, !13, !14, !15, !16, !17, !18}			; Test that we can infer a cached load for a pointer induction variable.
				; SM20-LABEL: .visible .entry foo19(
				; SM20: ld.global.f32
				; SM35-LABEL: .visible .entry foo19(
				; SM35: ld.global.nc.f32
				define void @foo19(float * noalias readonly %from, float * %to, i32 %n) {
				entry:
				br label %loop

				loop:
				%i = phi i32 [ 0, %entry ], [ %nexti, %loop ]
				%sum = phi float [ 0.0, %entry ], [ %nextsum, %loop ]
				%ptr = getelementptr inbounds float, float * %from, i32 %i
				%value = load float, float * %ptr, align 4
				%nextsum = fadd float %value, %sum
				%nexti = add nsw i32 %i, 1
				%exitcond = icmp eq i32 %nexti, %n
				br i1 %exitcond, label %exit, label %loop

				exit:
				store float %nextsum, float * %to
				ret void
				}

				; This test captures the case of a non-kernel function. In a
				; non-kernel function, without interprocedural analysis, we do not
				; know that the parameter is global. We also do not know that the
				; pointed-to memory is never written to (for the duration of the
				; kernel). For both reasons, we cannot use a cached load here.
				; SM20-LABEL: notkernel(
				; SM20: ld.f32
				; SM35-LABEL: notkernel(
				; SM35: ld.f32
				define void @notkernel(float * noalias readonly %from, float * %to) {
				%1 = load float, float * %from
				store float %1, float * %to
				ret void
				}

				; As @notkernel, but with the parameter explicitly marked as global. We still
				; do not know that the parameter is never written to (for the duration of the
				; kernel). This case does not currently come up normally since we do not infer
				; that pointers are global interprocedurally as of 2015-08-05.
				; SM20-LABEL: notkernel2(
				; SM20: ld.global.f32
				; SM35-LABEL: notkernel2(
				; SM35: ld.global.f32
				define void @notkernel2(float addrspace(1) * noalias readonly %from, float * %to) {
				%1 = load float, float addrspace(1) * %from
				store float %1, float * %to
				ret void
				}

				!nvvm.annotations = !{!1 ,!2 ,!3 ,!4 ,!5 ,!6, !7 ,!8 ,!9 ,!10 ,!11 ,!12, !13, !14, !15, !16, !17, !18, !19}
	!1 = !{void (float *, float *)* @foo1, !"kernel", i32 1}			!1 = !{void (float *, float *)* @foo1, !"kernel", i32 1}
	!2 = !{void (double *, double *)* @foo2, !"kernel", i32 1}			!2 = !{void (double *, double *)* @foo2, !"kernel", i32 1}
	!3 = !{void (i16 *, i16 *)* @foo3, !"kernel", i32 1}			!3 = !{void (i16 *, i16 *)* @foo3, !"kernel", i32 1}
	!4 = !{void (i32 *, i32 *)* @foo4, !"kernel", i32 1}			!4 = !{void (i32 *, i32 *)* @foo4, !"kernel", i32 1}
	!5 = !{void (i64 *, i64 *)* @foo5, !"kernel", i32 1}			!5 = !{void (i64 *, i64 *)* @foo5, !"kernel", i32 1}
	!6 = !{void (i128 *, i128 *)* @foo6, !"kernel", i32 1}			!6 = !{void (i128 *, i128 *)* @foo6, !"kernel", i32 1}
	!7 = !{void (<2 x i8> *, <2 x i8> *)* @foo7, !"kernel", i32 1}			!7 = !{void (<2 x i8> *, <2 x i8> *)* @foo7, !"kernel", i32 1}
	!8 = !{void (<2 x i16> *, <2 x i16> *)* @foo8, !"kernel", i32 1}			!8 = !{void (<2 x i16> *, <2 x i16> *)* @foo8, !"kernel", i32 1}
	!9 = !{void (<2 x i32> *, <2 x i32> *)* @foo9, !"kernel", i32 1}			!9 = !{void (<2 x i32> *, <2 x i32> *)* @foo9, !"kernel", i32 1}
	!10 = !{void (<2 x i64> *, <2 x i64> *)* @foo10, !"kernel", i32 1}			!10 = !{void (<2 x i64> *, <2 x i64> *)* @foo10, !"kernel", i32 1}
	!11 = !{void (<2 x float> *, <2 x float> *)* @foo11, !"kernel", i32 1}			!11 = !{void (<2 x float> *, <2 x float> *)* @foo11, !"kernel", i32 1}
	!12 = !{void (<2 x double> *, <2 x double> *)* @foo12, !"kernel", i32 1}			!12 = !{void (<2 x double> *, <2 x double> *)* @foo12, !"kernel", i32 1}
	!13 = !{void (<4 x i8> *, <4 x i8> *)* @foo13, !"kernel", i32 1}			!13 = !{void (<4 x i8> *, <4 x i8> *)* @foo13, !"kernel", i32 1}
	!14 = !{void (<4 x i16> *, <4 x i16> *)* @foo14, !"kernel", i32 1}			!14 = !{void (<4 x i16> *, <4 x i16> *)* @foo14, !"kernel", i32 1}
	!15 = !{void (<4 x i32> *, <4 x i32> *)* @foo15, !"kernel", i32 1}			!15 = !{void (<4 x i32> *, <4 x i32> *)* @foo15, !"kernel", i32 1}
	!16 = !{void (<4 x float> *, <4 x float> *)* @foo16, !"kernel", i32 1}			!16 = !{void (<4 x float> *, <4 x float> *)* @foo16, !"kernel", i32 1}
	!17 = !{void (<4 x double> *, <4 x double> *)* @foo17, !"kernel", i32 1}			!17 = !{void (<4 x double> *, <4 x double> *)* @foo17, !"kernel", i32 1}
	!18 = !{void (float **, float **)* @foo18, !"kernel", i32 1}			!18 = !{void (float **, float **)* @foo18, !"kernel", i32 1}
				!19 = !{void (float *, float *, i32)* @foo19, !"kernel", i32 1}

This is an archive of the discontinued LLVM Phabricator instance.

[NVPTX] Use LDG for pointer induction variables
ClosedPublic
