This is an archive of the discontinued LLVM Phabricator instance.

Differential D73644

[Mips] Add intrinsics for 4-byte and 8-byte MSA loads/stores.
ClosedPublic

Authored by mbrkusanin on Jan 29 2020, 10:04 AM.

Details

Reviewers

atanasyan
petarj
sdardis
mstojanovic

Commits

rG5ba931a84a34: [Mips] Add intrinsics for 4-byte and 8-byte MSA loads/stores.

Summary

New intrinsics are implemented for cases where we need to port SIMD code from other
architectures and only load or store portions of MSA registers.

The following intrinsics are added, which only load/store element 0 of a vector:
v4i32 __builtin_msa_ldr_w (const void *, imm_n2048_2044);
v2i64 __builtin_msa_ldr_d (const void *, imm_n4096_4088);
void __builtin_msa_str_w (v4i32, void *, imm_n2048_2044);
void __builtin_msa_str_d (v2i64, void *, imm_n4096_4088);
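
The intended memory behaviour of the word variants can be sketched in portable C++. This is only a model under stated assumptions, not the MSA implementation: `V4i32`, `model_ldr_w` and `model_str_w` are hypothetical names, the real builtins need a MIPS MSA target and a constant offset, and the model zeroes lanes 1 to 3 purely to stay deterministic (the summary only specifies element 0).

```cpp
#include <array>
#include <cstdint>
#include <cstring>

// Hypothetical stand-in for the MSA v4i32 type: four 32-bit lanes.
using V4i32 = std::array<std::int32_t, 4>;

// Model of the ldr_w intrinsic: read exactly 4 bytes at base + offset into
// lane 0. The base pointer does not have to be aligned. Lanes 1..3 are zeroed
// here as a modelling assumption only.
V4i32 model_ldr_w(const void *base, int offset) {
  V4i32 v{};
  std::memcpy(&v[0], static_cast<const unsigned char *>(base) + offset, 4);
  return v;
}

// Model of the str_w intrinsic: write exactly 4 bytes taken from lane 0 to
// base + offset and touch nothing else.
void model_str_w(const V4i32 &v, void *base, int offset) {
  std::memcpy(static_cast<unsigned char *>(base) + offset, &v[0], 4);
}
```

A round trip through an odd base address, for example `model_str_w(model_ldr_w(src + 1, 4), dst + 1, 4)`, copies bytes 5 to 8 of `src` and leaves every other byte of `dst` unchanged.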

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

mbrkusanin created this revision. Jan 29 2020, 10:04 AM

Herald added subscribers: cfe-commits, jrtc27, hiraditya, arichardson. Jan 29 2020, 10:04 AM

A few notes/questions:

Generated code was tested with Qemu:
- For mips32r5 Qemu provides p5600
- For mips64r6 Qemu provides i6400
- For mips64r5 there is no cpu on Qemu with MSA and it appears that there won't be any hardware with Mips64r5 and MSA.
- For mips32r6 Qemu only provides a cpu called mips32r6-generic which does not support MSA. I tested the code for this on mips64r6.

Names of the new intrinsics can be explained in the following way:

__builtin_msa_ldr_d (load right half)
__builtin_msa_ldrq_w (load right quarter)
__builtin_msa_str_d (store right half)
__builtin_msa_strq_w (store right quarter)
Other proposed names are: ld1_d/ld1_w/st1_d/st1_w and ldc1/lwc1/sdc1/swc1. I have no strong preference and would not mind changing them if someone thinks they would fit better.

I did not add any Clang tests (C/C++) since there are no tests for the other intrinsics. Also, should these new intrinsics be documented somewhere? Most of the others correspond to an instruction that already exists, but these are replaced with pseudos.

emitLDRQ_W() and emitLDR_D() could be combined into one function, but that decreases readability. The same applies to emitting stores: emitSTRQ_W() and emitSTR_D(). I already tried this and have the code ready if that would be preferable.

draganm added a subscriber: draganm. Jan 29 2020, 4:32 PM

Is it possible to emulate these new intrinsics using existing ones and some additional code? Is the code generated in this case much larger/slower than the code generated by the new intrinsics?

We could do that for loads. For example, on Mips32r5 (where we need the most instructions), for the ldr_d intrinsic, instead of:

	lwr	$1, 16($5)
	lwl	$1, 19($5)
	lwr	$2, 20($5)
	lwl	$2, 23($5)
	fill.w	$w0, $1
	insert.w	$w0[1], $2

We could use the already available ld.d and then fix up $w0[2] and $w0[3] manually (when working with MSA128WRegClass / v4i32). ld.d has no alignment restrictions.

	ld.d	$w0, 16($5)
	copy_s.w	$1, $w0[0]
	insert.w	$w0[2], $1
	insert.w	$w0[3], $1

Optionally, if we don't care what values are loaded in elements other than the first, we could just use ld.d and ld.w for ldr_d and ldrq_w respectively.

For stores however we cannot use st.d or st.w because we would write to memory we are not supposed to (we write to void* not necessarily v2i64 or v4i32).
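
The concern about stores can be illustrated with a small portable model. It is a sketch under assumptions rather than MSA code: `model_st_w`, `model_str_w` and `clobbered_bytes` are hypothetical helpers, and the 16-byte `memcpy` merely stands in for a full-width st.w.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <cstring>

using V4i32 = std::array<std::int32_t, 4>;
static_assert(sizeof(V4i32) == 16, "model assumes a packed 128-bit vector");

// Full-width store, standing in for st.w: always writes all 16 bytes.
void model_st_w(const V4i32 &v, unsigned char *dst) {
  std::memcpy(dst, v.data(), 16);
}

// Partial store, standing in for str_w: writes only the 4 bytes of element 0.
void model_str_w(const V4i32 &v, unsigned char *dst) {
  std::memcpy(dst, v.data(), 4);
}

// Counts the bytes after the first `owned` bytes that no longer hold `guard`,
// i.e. memory the caller never asked to have written.
std::size_t clobbered_bytes(const unsigned char *buf, std::size_t size,
                            std::size_t owned, unsigned char guard) {
  std::size_t count = 0;
  for (std::size_t i = owned; i < size; ++i)
    if (buf[i] != guard)
      ++count;
  return count;
}
```

With a buffer pre-filled with a guard value such as 0xAA and a vector `{1, 2, 3, 4}`, the partial store leaves all guard bytes intact, while the full-width store overwrites the 12 bytes that follow the 4 the caller owns.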

I see, thanks. Is there the same or similar functionality in GCC?

Rebase.

Not yet, a proposal was made to both GCC and LLVM and as far as I can tell no work was done on GCC yet. If we accept these names I'll let them know so we end up with matching names.

As for the 4/8-byte loads, if they are implemented as ld plus some extra instructions, I don't really see the point of making sure the other vector elements have the same value as the first. If we ignore those elements, we are left with only ld. In that case we could simply not implement these loads and have the user call __builtin_msa_ld_w and __builtin_msa_ld_d instead. But if we do decide to implement them, it would make more sense to have them read only 4/8 bytes instead of all 16. That way you can use both, since ld is already available.

Looking good to me as-is.

The current naming is okay. But what do you think about shortening the names of the quarter intrinsics: __builtin_msa_ldr_w instead of __builtin_msa_ldrq_w? Would that clash with the names of any future intrinsics?
There is almost no documentation on target specific intrinsics. Some articles like Using ARM NEON instructions in big endian mode cover specific use cases. It's up to you to write an article for these new intrinsics.

This revision is now accepted and ready to land. Feb 10 2020, 4:22 AM

Rebase
Rename ldrq_w to ldr_w; Rename strq_w to str_w.

Closed by commit rG5ba931a84a34: [Mips] Add intrinsics for 4-byte and 8-byte MSA loads/stores. (authored by mbrkusanin). Feb 11 2020, 2:55 AM

This revision was automatically updated to reflect the committed changes.

Revision Contents

Path                                           Size
clang/include/clang/Basic/BuiltinsMips.def     6 lines
clang/lib/Headers/msa.h                        4 lines
clang/lib/Sema/SemaChecking.cpp                4 lines
llvm/include/llvm/IR/IntrinsicsMips.td         14 lines
llvm/lib/Target/Mips/MipsISelLowering.h        4 lines
llvm/lib/Target/Mips/MipsISelLowering.cpp      279 lines
llvm/lib/Target/Mips/MipsMSAInstrInfo.td       20 lines
llvm/lib/Target/Mips/MipsSEISelDAGToDAG.cpp    75 lines
llvm/test/CodeGen/Mips/msa/ldr_str.ll          224 lines

Diff 243783

clang/include/clang/Basic/BuiltinsMips.def

[...]
 BUILTIN(__builtin_msa_insve_w, "V4SiV4SiIUiV4Si", "nc")
 BUILTIN(__builtin_msa_insve_d, "V2SLLiV2SLLiIUiV2SLLi", "nc")

 BUILTIN(__builtin_msa_ld_b, "V16Scv*Ii", "nc")
 BUILTIN(__builtin_msa_ld_h, "V8Ssv*Ii", "nc")
 BUILTIN(__builtin_msa_ld_w, "V4Siv*Ii", "nc")
 BUILTIN(__builtin_msa_ld_d, "V2SLLiv*Ii", "nc")

+BUILTIN(__builtin_msa_ldr_d, "V2SLLiv*Ii", "nc")
+BUILTIN(__builtin_msa_ldr_w, "V4Siv*Ii", "nc")
+
 BUILTIN(__builtin_msa_ldi_b, "V16cIi", "nc")
 BUILTIN(__builtin_msa_ldi_h, "V8sIi", "nc")
 BUILTIN(__builtin_msa_ldi_w, "V4iIi", "nc")
 BUILTIN(__builtin_msa_ldi_d, "V2LLiIi", "nc")

 BUILTIN(__builtin_msa_madd_q_h, "V8SsV8SsV8SsV8Ss", "nc")
 BUILTIN(__builtin_msa_madd_q_w, "V4SiV4SiV4SiV4Si", "nc")

[...]
 BUILTIN(__builtin_msa_srlri_w, "V4iV4iIUi", "nc")
 BUILTIN(__builtin_msa_srlri_d, "V2LLiV2LLiIUi", "nc")

 BUILTIN(__builtin_msa_st_b, "vV16Scv*Ii", "nc")
 BUILTIN(__builtin_msa_st_h, "vV8Ssv*Ii", "nc")
 BUILTIN(__builtin_msa_st_w, "vV4Siv*Ii", "nc")
 BUILTIN(__builtin_msa_st_d, "vV2SLLiv*Ii", "nc")

+BUILTIN(__builtin_msa_str_d, "vV2SLLiv*Ii", "nc")
+BUILTIN(__builtin_msa_str_w, "vV4Siv*Ii", "nc")
+
 BUILTIN(__builtin_msa_subs_s_b, "V16ScV16ScV16Sc", "nc")
 BUILTIN(__builtin_msa_subs_s_h, "V8SsV8SsV8Ss", "nc")
 BUILTIN(__builtin_msa_subs_s_w, "V4SiV4SiV4Si", "nc")
 BUILTIN(__builtin_msa_subs_s_d, "V2SLLiV2SLLiV2SLLi", "nc")

 BUILTIN(__builtin_msa_subs_u_b, "V16UcV16UcV16Uc", "nc")
 BUILTIN(__builtin_msa_subs_u_h, "V8UsV8UsV8Us", "nc")
 BUILTIN(__builtin_msa_subs_u_w, "V4UiV4UiV4Ui", "nc")
[...]

clang/lib/Headers/msa.h

[...]
 #define __msa_clei_u_b __builtin_msa_clei_u_b
 #define __msa_clei_u_h __builtin_msa_clei_u_h
 #define __msa_clei_u_w __builtin_msa_clei_u_w
 #define __msa_clei_u_d __builtin_msa_clei_u_d
 #define __msa_ld_b __builtin_msa_ld_b
 #define __msa_ld_h __builtin_msa_ld_h
 #define __msa_ld_w __builtin_msa_ld_w
 #define __msa_ld_d __builtin_msa_ld_d
+#define __msa_ldr_d __builtin_msa_ldr_d
+#define __msa_ldr_w __builtin_msa_ldr_w
 #define __msa_st_b __builtin_msa_st_b
 #define __msa_st_h __builtin_msa_st_h
 #define __msa_st_w __builtin_msa_st_w
 #define __msa_st_d __builtin_msa_st_d
+#define __msa_str_d __builtin_msa_str_d
+#define __msa_str_w __builtin_msa_str_w
 #define __msa_sat_s_b __builtin_msa_sat_s_b
 #define __msa_sat_s_h __builtin_msa_sat_s_h
 #define __msa_sat_s_w __builtin_msa_sat_s_w
 #define __msa_sat_s_d __builtin_msa_sat_s_d
 #define __msa_sat_u_b __builtin_msa_sat_u_b
 #define __msa_sat_u_h __builtin_msa_sat_u_h
 #define __msa_sat_u_w __builtin_msa_sat_u_w
 #define __msa_sat_u_d __builtin_msa_sat_u_d
[...]

clang/lib/Sema/SemaChecking.cpp

 bool Sema::CheckMipsBuiltinArgument(unsigned BuiltinID, CallExpr *TheCall) {
[...]
   case Mips::BI__builtin_msa_ldi_b: i = 0; l = -128; u = 255; break;
   case Mips::BI__builtin_msa_ldi_h:
   case Mips::BI__builtin_msa_ldi_w:
   case Mips::BI__builtin_msa_ldi_d: i = 0; l = -512; u = 511; break;
   case Mips::BI__builtin_msa_ld_b: i = 1; l = -512; u = 511; m = 1; break;
   case Mips::BI__builtin_msa_ld_h: i = 1; l = -1024; u = 1022; m = 2; break;
   case Mips::BI__builtin_msa_ld_w: i = 1; l = -2048; u = 2044; m = 4; break;
   case Mips::BI__builtin_msa_ld_d: i = 1; l = -4096; u = 4088; m = 8; break;
+  case Mips::BI__builtin_msa_ldr_d: i = 1; l = -4096; u = 4088; m = 8; break;
+  case Mips::BI__builtin_msa_ldr_w: i = 1; l = -2048; u = 2044; m = 4; break;
   case Mips::BI__builtin_msa_st_b: i = 2; l = -512; u = 511; m = 1; break;
   case Mips::BI__builtin_msa_st_h: i = 2; l = -1024; u = 1022; m = 2; break;
   case Mips::BI__builtin_msa_st_w: i = 2; l = -2048; u = 2044; m = 4; break;
   case Mips::BI__builtin_msa_st_d: i = 2; l = -4096; u = 4088; m = 8; break;
+  case Mips::BI__builtin_msa_str_d: i = 2; l = -4096; u = 4088; m = 8; break;
+  case Mips::BI__builtin_msa_str_w: i = 2; l = -2048; u = 2044; m = 4; break;
   }

   if (!m)
     return SemaBuiltinConstantArgRange(TheCall, i, l, u);

   return SemaBuiltinConstantArgRange(TheCall, i, l, u) ||
          SemaBuiltinConstantArgMultiple(TheCall, i, m);
 }
[...]
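
For the new builtins, the checks above amount to a range test followed by a multiple-of test on the constant offset. A minimal standalone sketch of that rule, assuming the bounds listed in the case labels; the helper names are hypothetical and are not part of Sema:

```cpp
#include <cstdint>

// Hypothetical helper mirroring the two checks: the constant must lie in
// [lo, hi] and be a multiple of `multiple`.
constexpr bool is_valid_msa_offset(std::int64_t value, std::int64_t lo,
                                   std::int64_t hi, std::int64_t multiple) {
  return value >= lo && value <= hi && value % multiple == 0;
}

// Bounds taken from the ldr_w/str_w cases: l = -2048, u = 2044, m = 4.
constexpr bool is_valid_word_offset(std::int64_t value) {
  return is_valid_msa_offset(value, -2048, 2044, 4);
}

// Bounds taken from the ldr_d/str_d cases: l = -4096, u = 4088, m = 8.
constexpr bool is_valid_doubleword_offset(std::int64_t value) {
  return is_valid_msa_offset(value, -4096, 4088, 8);
}
```

Under this model an offset of 2044 is accepted for the word variants, while 2042 (not a multiple of 4) and 2048 (out of range) are rejected.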

llvm/include/llvm/IR/IntrinsicsMips.td

 def int_mips_ld_h : GCCBuiltin<"__builtin_msa_ld_h">,
[...]
   [IntrReadMem, IntrArgMemOnly]>;
 def int_mips_ld_w : GCCBuiltin<"__builtin_msa_ld_w">,
   Intrinsic<[llvm_v4i32_ty], [llvm_ptr_ty, llvm_i32_ty],
   [IntrReadMem, IntrArgMemOnly]>;
 def int_mips_ld_d : GCCBuiltin<"__builtin_msa_ld_d">,
   Intrinsic<[llvm_v2i64_ty], [llvm_ptr_ty, llvm_i32_ty],
   [IntrReadMem, IntrArgMemOnly]>;

+def int_mips_ldr_d : GCCBuiltin<"__builtin_msa_ldr_d">,
+  Intrinsic<[llvm_v2i64_ty], [llvm_ptr_ty, llvm_i32_ty],
+  [IntrReadMem, IntrArgMemOnly]>;
+def int_mips_ldr_w : GCCBuiltin<"__builtin_msa_ldr_w">,
+  Intrinsic<[llvm_v4i32_ty], [llvm_ptr_ty, llvm_i32_ty],
+  [IntrReadMem, IntrArgMemOnly]>;
+
 def int_mips_ldi_b : GCCBuiltin<"__builtin_msa_ldi_b">,
   Intrinsic<[llvm_v16i8_ty], [llvm_i32_ty], [IntrNoMem, ImmArg<0>]>;
 def int_mips_ldi_h : GCCBuiltin<"__builtin_msa_ldi_h">,
   Intrinsic<[llvm_v8i16_ty], [llvm_i32_ty], [IntrNoMem, ImmArg<0>]>;
 def int_mips_ldi_w : GCCBuiltin<"__builtin_msa_ldi_w">,
   Intrinsic<[llvm_v4i32_ty], [llvm_i32_ty], [IntrNoMem, ImmArg<0>]>;
 def int_mips_ldi_d : GCCBuiltin<"__builtin_msa_ldi_d">,
   Intrinsic<[llvm_v2i64_ty], [llvm_i32_ty], [IntrNoMem, ImmArg<0>]>;
[...]
 def int_mips_st_h : GCCBuiltin<"__builtin_msa_st_h">,
[...]
   [IntrArgMemOnly]>;
 def int_mips_st_w : GCCBuiltin<"__builtin_msa_st_w">,
   Intrinsic<[], [llvm_v4i32_ty, llvm_ptr_ty, llvm_i32_ty],
   [IntrArgMemOnly]>;
 def int_mips_st_d : GCCBuiltin<"__builtin_msa_st_d">,
   Intrinsic<[], [llvm_v2i64_ty, llvm_ptr_ty, llvm_i32_ty],
   [IntrArgMemOnly]>;

+def int_mips_str_d : GCCBuiltin<"__builtin_msa_str_d">,
+  Intrinsic<[], [llvm_v2i64_ty, llvm_ptr_ty, llvm_i32_ty],
+  [IntrArgMemOnly]>;
+def int_mips_str_w : GCCBuiltin<"__builtin_msa_str_w">,
+  Intrinsic<[], [llvm_v4i32_ty, llvm_ptr_ty, llvm_i32_ty],
+  [IntrArgMemOnly]>;
+
 def int_mips_subs_s_b : GCCBuiltin<"__builtin_msa_subs_s_b">,
   Intrinsic<[llvm_v16i8_ty], [llvm_v16i8_ty, llvm_v16i8_ty], [IntrNoMem]>;
 def int_mips_subs_s_h : GCCBuiltin<"__builtin_msa_subs_s_h">,
   Intrinsic<[llvm_v8i16_ty], [llvm_v8i16_ty, llvm_v8i16_ty], [IntrNoMem]>;
 def int_mips_subs_s_w : GCCBuiltin<"__builtin_msa_subs_s_w">,
   Intrinsic<[llvm_v4i32_ty], [llvm_v4i32_ty, llvm_v4i32_ty], [IntrNoMem]>;
 def int_mips_subs_s_d : GCCBuiltin<"__builtin_msa_subs_s_d">,
   Intrinsic<[llvm_v2i64_ty], [llvm_v2i64_ty, llvm_v2i64_ty], [IntrNoMem]>;
[...]

llvm/lib/Target/Mips/MipsISelLowering.h

 private:
[...]
   MachineBasicBlock *emitAtomicCmpSwapPartword(MachineInstr &MI,
                                                MachineBasicBlock *BB,
                                                unsigned Size) const;
   MachineBasicBlock *emitSEL_D(MachineInstr &MI, MachineBasicBlock *BB) const;
   MachineBasicBlock *emitPseudoSELECT(MachineInstr &MI, MachineBasicBlock *BB,
                                       bool isFPCmp, unsigned Opc) const;
   MachineBasicBlock *emitPseudoD_SELECT(MachineInstr &MI,
                                         MachineBasicBlock *BB) const;
+  MachineBasicBlock *emitLDR_W(MachineInstr &MI, MachineBasicBlock *BB) const;
+  MachineBasicBlock *emitLDR_D(MachineInstr &MI, MachineBasicBlock *BB) const;
+  MachineBasicBlock *emitSTR_W(MachineInstr &MI, MachineBasicBlock *BB) const;
+  MachineBasicBlock *emitSTR_D(MachineInstr &MI, MachineBasicBlock *BB) const;
 };

 /// Create MipsTargetLowering objects.
 const MipsTargetLowering *
 createMips16TargetLowering(const MipsTargetMachine &TM,
                            const MipsSubtarget &STI);
 const MipsTargetLowering *
 createMipsSETargetLowering(const MipsTargetMachine &TM,
[...]

llvm/lib/Target/Mips/MipsISelLowering.cpp

 MipsTargetLowering::EmitInstrWithCustomInserter(MachineInstr &MI,
[...]
   case Mips::PseudoSELECTFP_T_I64:
   case Mips::PseudoSELECTFP_T_S:
   case Mips::PseudoSELECTFP_T_D32:
   case Mips::PseudoSELECTFP_T_D64:
     return emitPseudoSELECT(MI, BB, true, Mips::BC1T);
   case Mips::PseudoD_SELECT_I:
   case Mips::PseudoD_SELECT_I64:
     return emitPseudoD_SELECT(MI, BB);
+  case Mips::LDR_W:
+    return emitLDR_W(MI, BB);
+  case Mips::LDR_D:
+    return emitLDR_D(MI, BB);
+  case Mips::STR_W:
+    return emitSTR_W(MI, BB);
+  case Mips::STR_D:
+    return emitSTR_D(MI, BB);
   }
 }

 // This function also handles Mips::ATOMIC_SWAP_I32 (when BinOpcode == 0), and
 // Mips::ATOMIC_LOAD_NAND_I32 (when Nand == true)
 MachineBasicBlock *
 MipsTargetLowering::emitAtomicBinary(MachineInstr &MI,
                                      MachineBasicBlock *BB) const {
[...]
   if (Subtarget.isGP64bit()) {
[...]
     Register Reg = StringSwitch<Register>(RegName)
                        .Case("$28", Mips::GP)
                        .Default(Register());
     if (Reg)
       return Reg;
   }
   report_fatal_error("Invalid register name global variable");
 }

+MachineBasicBlock *MipsTargetLowering::emitLDR_W(MachineInstr &MI,
+                                                 MachineBasicBlock *BB) const {
+  MachineFunction *MF = BB->getParent();
+  MachineRegisterInfo &MRI = MF->getRegInfo();
+  const TargetInstrInfo *TII = Subtarget.getInstrInfo();
+  const bool IsLittle = Subtarget.isLittle();
+  DebugLoc DL = MI.getDebugLoc();
+
+  Register Dest = MI.getOperand(0).getReg();
+  Register Address = MI.getOperand(1).getReg();
+  unsigned Imm = MI.getOperand(2).getImm();
+
+  MachineBasicBlock::iterator I(MI);
+
+  if (Subtarget.hasMips32r6() || Subtarget.hasMips64r6()) {
+    // Mips release 6 can load from an address that is not naturally-aligned.
+    Register Temp = MRI.createVirtualRegister(&Mips::GPR32RegClass);
+    BuildMI(*BB, I, DL, TII->get(Mips::LW))
+        .addDef(Temp)
+        .addUse(Address)
+        .addImm(Imm);
+    BuildMI(*BB, I, DL, TII->get(Mips::FILL_W)).addDef(Dest).addUse(Temp);
+  } else {
+    // Mips release 5 needs to use instructions that can load from an unaligned
+    // memory address.
+    Register LoadHalf = MRI.createVirtualRegister(&Mips::GPR32RegClass);
+    Register LoadFull = MRI.createVirtualRegister(&Mips::GPR32RegClass);
+    Register Undef = MRI.createVirtualRegister(&Mips::GPR32RegClass);
+    BuildMI(*BB, I, DL, TII->get(Mips::IMPLICIT_DEF)).addDef(Undef);
+    BuildMI(*BB, I, DL, TII->get(Mips::LWR))
+        .addDef(LoadHalf)
+        .addUse(Address)
+        .addImm(Imm + (IsLittle ? 0 : 3))
+        .addUse(Undef);
+    BuildMI(*BB, I, DL, TII->get(Mips::LWL))
+        .addDef(LoadFull)
+        .addUse(Address)
+        .addImm(Imm + (IsLittle ? 3 : 0))
+        .addUse(LoadHalf);
+    BuildMI(*BB, I, DL, TII->get(Mips::FILL_W)).addDef(Dest).addUse(LoadFull);
+  }
+
+  MI.eraseFromParent();
+  return BB;
+}

+MachineBasicBlock *MipsTargetLowering::emitLDR_D(MachineInstr &MI,
+                                                 MachineBasicBlock *BB) const {
+  MachineFunction *MF = BB->getParent();
+  MachineRegisterInfo &MRI = MF->getRegInfo();
+  const TargetInstrInfo *TII = Subtarget.getInstrInfo();
+  const bool IsLittle = Subtarget.isLittle();
+  DebugLoc DL = MI.getDebugLoc();
+
+  Register Dest = MI.getOperand(0).getReg();
+  Register Address = MI.getOperand(1).getReg();
+  unsigned Imm = MI.getOperand(2).getImm();
+
+  MachineBasicBlock::iterator I(MI);
+
+  if (Subtarget.hasMips32r6() || Subtarget.hasMips64r6()) {
+    // Mips release 6 can load from an address that is not naturally-aligned.
+    if (Subtarget.isGP64bit()) {
+      Register Temp = MRI.createVirtualRegister(&Mips::GPR64RegClass);
+      BuildMI(*BB, I, DL, TII->get(Mips::LD))
+          .addDef(Temp)
+          .addUse(Address)
+          .addImm(Imm);
+      BuildMI(*BB, I, DL, TII->get(Mips::FILL_D)).addDef(Dest).addUse(Temp);
+    } else {
+      Register Wtemp = MRI.createVirtualRegister(&Mips::MSA128WRegClass);
+      Register Lo = MRI.createVirtualRegister(&Mips::GPR32RegClass);
+      Register Hi = MRI.createVirtualRegister(&Mips::GPR32RegClass);
+      BuildMI(*BB, I, DL, TII->get(Mips::LW))
+          .addDef(Lo)
+          .addUse(Address)
+          .addImm(Imm + (IsLittle ? 0 : 4));
+      BuildMI(*BB, I, DL, TII->get(Mips::LW))
+          .addDef(Hi)
+          .addUse(Address)
+          .addImm(Imm + (IsLittle ? 4 : 0));
+      BuildMI(*BB, I, DL, TII->get(Mips::FILL_W)).addDef(Wtemp).addUse(Lo);
+      BuildMI(*BB, I, DL, TII->get(Mips::INSERT_W), Dest)
+          .addUse(Wtemp)
+          .addUse(Hi)
+          .addImm(1);
+    }
+  } else {
+    // Mips release 5 needs to use instructions that can load from an unaligned
+    // memory address.
+    Register LoHalf = MRI.createVirtualRegister(&Mips::GPR32RegClass);
+    Register LoFull = MRI.createVirtualRegister(&Mips::GPR32RegClass);
+    Register LoUndef = MRI.createVirtualRegister(&Mips::GPR32RegClass);
+    Register HiHalf = MRI.createVirtualRegister(&Mips::GPR32RegClass);
+    Register HiFull = MRI.createVirtualRegister(&Mips::GPR32RegClass);
+    Register HiUndef = MRI.createVirtualRegister(&Mips::GPR32RegClass);
+    Register Wtemp = MRI.createVirtualRegister(&Mips::MSA128WRegClass);
+    BuildMI(*BB, I, DL, TII->get(Mips::IMPLICIT_DEF)).addDef(LoUndef);
+    BuildMI(*BB, I, DL, TII->get(Mips::LWR))
+        .addDef(LoHalf)
+        .addUse(Address)
+        .addImm(Imm + (IsLittle ? 0 : 7))
+        .addUse(LoUndef);
+    BuildMI(*BB, I, DL, TII->get(Mips::LWL))
+        .addDef(LoFull)
+        .addUse(Address)
+        .addImm(Imm + (IsLittle ? 3 : 4))
+        .addUse(LoHalf);
+    BuildMI(*BB, I, DL, TII->get(Mips::IMPLICIT_DEF)).addDef(HiUndef);
+    BuildMI(*BB, I, DL, TII->get(Mips::LWR))
+        .addDef(HiHalf)
+        .addUse(Address)
+        .addImm(Imm + (IsLittle ? 4 : 3))
+        .addUse(HiUndef);
+    BuildMI(*BB, I, DL, TII->get(Mips::LWL))
+        .addDef(HiFull)
+        .addUse(Address)
+        .addImm(Imm + (IsLittle ? 7 : 0))
+        .addUse(HiHalf);
+    BuildMI(*BB, I, DL, TII->get(Mips::FILL_W)).addDef(Wtemp).addUse(LoFull);
+    BuildMI(*BB, I, DL, TII->get(Mips::INSERT_W), Dest)
+        .addUse(Wtemp)
+        .addUse(HiFull)
+        .addImm(1);
+  }
+
+  MI.eraseFromParent();
+  return BB;
+}

+MachineBasicBlock *MipsTargetLowering::emitSTR_W(MachineInstr &MI,
+                                                 MachineBasicBlock *BB) const {
+  MachineFunction *MF = BB->getParent();
+  MachineRegisterInfo &MRI = MF->getRegInfo();
+  const TargetInstrInfo *TII = Subtarget.getInstrInfo();
+  const bool IsLittle = Subtarget.isLittle();
+  DebugLoc DL = MI.getDebugLoc();
+
+  Register StoreVal = MI.getOperand(0).getReg();
+  Register Address = MI.getOperand(1).getReg();
+  unsigned Imm = MI.getOperand(2).getImm();
+
+  MachineBasicBlock::iterator I(MI);
+
+  if (Subtarget.hasMips32r6() || Subtarget.hasMips64r6()) {
+    // Mips release 6 can store to an address that is not naturally-aligned.
+    Register BitcastW = MRI.createVirtualRegister(&Mips::MSA128WRegClass);
+    Register Tmp = MRI.createVirtualRegister(&Mips::GPR32RegClass);
+    BuildMI(*BB, I, DL, TII->get(Mips::COPY)).addDef(BitcastW).addUse(StoreVal);
+    BuildMI(*BB, I, DL, TII->get(Mips::COPY_S_W))
+        .addDef(Tmp)
+        .addUse(BitcastW)
+        .addImm(0);
+    BuildMI(*BB, I, DL, TII->get(Mips::SW))
+        .addUse(Tmp)
+        .addUse(Address)
+        .addImm(Imm);
+  } else {
+    // Mips release 5 needs to use instructions that can store to an unaligned
+    // memory address.
+    Register Tmp = MRI.createVirtualRegister(&Mips::GPR32RegClass);
+    BuildMI(*BB, I, DL, TII->get(Mips::COPY_S_W))
+        .addDef(Tmp)
+        .addUse(StoreVal)
+        .addImm(0);
+    BuildMI(*BB, I, DL, TII->get(Mips::SWR))
+        .addUse(Tmp)
+        .addUse(Address)
+        .addImm(Imm + (IsLittle ? 0 : 3));
+    BuildMI(*BB, I, DL, TII->get(Mips::SWL))
+        .addUse(Tmp)
+        .addUse(Address)
+        .addImm(Imm + (IsLittle ? 3 : 0));
+  }
+
+  MI.eraseFromParent();
+
+  return BB;
+}

+MachineBasicBlock *MipsTargetLowering::emitSTR_D(MachineInstr &MI,
+                                                 MachineBasicBlock *BB) const {
+  MachineFunction *MF = BB->getParent();
+  MachineRegisterInfo &MRI = MF->getRegInfo();
+  const TargetInstrInfo *TII = Subtarget.getInstrInfo();
+  const bool IsLittle = Subtarget.isLittle();
+  DebugLoc DL = MI.getDebugLoc();
+
+  Register StoreVal = MI.getOperand(0).getReg();
+  Register Address = MI.getOperand(1).getReg();
+  unsigned Imm = MI.getOperand(2).getImm();
+
+  MachineBasicBlock::iterator I(MI);
+
+  if (Subtarget.hasMips32r6() || Subtarget.hasMips64r6()) {
+    // Mips release 6 can store to an address that is not naturally-aligned.
+    if (Subtarget.isGP64bit()) {
+      Register BitcastD = MRI.createVirtualRegister(&Mips::MSA128DRegClass);
+      Register Lo = MRI.createVirtualRegister(&Mips::GPR64RegClass);
+      BuildMI(*BB, I, DL, TII->get(Mips::COPY))
+          .addDef(BitcastD)
+          .addUse(StoreVal);
+      BuildMI(*BB, I, DL, TII->get(Mips::COPY_S_D))
+          .addDef(Lo)
+          .addUse(BitcastD)
+          .addImm(0);
+      BuildMI(*BB, I, DL, TII->get(Mips::SD))
+          .addUse(Lo)
+          .addUse(Address)
+          .addImm(Imm);
+    } else {
+      Register BitcastW = MRI.createVirtualRegister(&Mips::MSA128WRegClass);
+      Register Lo = MRI.createVirtualRegister(&Mips::GPR32RegClass);
+      Register Hi = MRI.createVirtualRegister(&Mips::GPR32RegClass);
+      BuildMI(*BB, I, DL, TII->get(Mips::COPY))
+          .addDef(BitcastW)
+          .addUse(StoreVal);
+      BuildMI(*BB, I, DL, TII->get(Mips::COPY_S_W))
+          .addDef(Lo)
+          .addUse(BitcastW)
+          .addImm(0);
+      BuildMI(*BB, I, DL, TII->get(Mips::COPY_S_W))
+          .addDef(Hi)
+          .addUse(BitcastW)
+          .addImm(1);
+      BuildMI(*BB, I, DL, TII->get(Mips::SW))
+          .addUse(Lo)
+          .addUse(Address)
+          .addImm(Imm + (IsLittle ? 0 : 4));
+      BuildMI(*BB, I, DL, TII->get(Mips::SW))
+          .addUse(Hi)
+          .addUse(Address)
+          .addImm(Imm + (IsLittle ? 4 : 0));
+    }
+  } else {
+    // Mips release 5 needs to use instructions that can store to an unaligned
+    // memory address.
+    Register Bitcast = MRI.createVirtualRegister(&Mips::MSA128WRegClass);
+    Register Lo = MRI.createVirtualRegister(&Mips::GPR32RegClass);
+    Register Hi = MRI.createVirtualRegister(&Mips::GPR32RegClass);
+    BuildMI(*BB, I, DL, TII->get(Mips::COPY)).addDef(Bitcast).addUse(StoreVal);
+    BuildMI(*BB, I, DL, TII->get(Mips::COPY_S_W))
+        .addDef(Lo)
+        .addUse(Bitcast)
+        .addImm(0);
+    BuildMI(*BB, I, DL, TII->get(Mips::COPY_S_W))
+        .addDef(Hi)
+        .addUse(Bitcast)
+        .addImm(1);
+    BuildMI(*BB, I, DL, TII->get(Mips::SWR))
+        .addUse(Lo)
+        .addUse(Address)
+        .addImm(Imm + (IsLittle ? 0 : 3));
+    BuildMI(*BB, I, DL, TII->get(Mips::SWL))
+        .addUse(Lo)
+        .addUse(Address)
+        .addImm(Imm + (IsLittle ? 3 : 0));
+    BuildMI(*BB, I, DL, TII->get(Mips::SWR))
+        .addUse(Hi)
+        .addUse(Address)
+        .addImm(Imm + (IsLittle ? 4 : 7));
+    BuildMI(*BB, I, DL, TII->get(Mips::SWL))
+        .addUse(Hi)
+        .addUse(Address)
+        .addImm(Imm + (IsLittle ? 7 : 4));
+  }
+
+  MI.eraseFromParent();
+  return BB;
+}
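
The release 5 paths above rely on LWR/LWL pairs to assemble a word from an address that may be unaligned; in the little-endian case the offsets are Imm + 0 for LWR and Imm + 3 for LWL. A rough software model of that pairing, assuming little-endian byte order and a 16-byte array standing in for memory (`Memory`, `model_lwr`, `model_lwl` and `model_unaligned_load` are hypothetical names, not LLVM or ISA definitions):

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

using Memory = std::array<std::uint8_t, 16>;

// Reads the naturally aligned 32-bit little-endian word that contains `addr`.
std::uint32_t aligned_word(const Memory &mem, std::size_t addr) {
  const std::size_t a = addr & ~std::size_t{3};
  return std::uint32_t{mem[a]} | (std::uint32_t{mem[a + 1]} << 8) |
         (std::uint32_t{mem[a + 2]} << 16) | (std::uint32_t{mem[a + 3]} << 24);
}

// Little-endian LWR model: fills the low-order bytes of rt with memory from
// `addr` up to the end of its aligned word; the remaining high bytes are kept.
std::uint32_t model_lwr(std::uint32_t rt, const Memory &mem, std::size_t addr) {
  const unsigned shift = 8 * static_cast<unsigned>(addr & 3);
  const std::uint32_t mask = 0xFFFFFFFFu >> shift;
  return (rt & ~mask) | (aligned_word(mem, addr) >> shift);
}

// Little-endian LWL model: fills the high-order bytes of rt with memory from
// the start of the aligned word up to and including `addr`.
std::uint32_t model_lwl(std::uint32_t rt, const Memory &mem, std::size_t addr) {
  const unsigned shift = 8 * (3 - static_cast<unsigned>(addr & 3));
  const std::uint32_t mask = 0xFFFFFFFFu << shift;
  return (rt & ~mask) | (aligned_word(mem, addr) << shift);
}

// Unaligned 32-bit load built the same way as the little-endian lowering:
// LWR at addr + 0, then LWL at addr + 3 using the LWR result as input.
std::uint32_t model_unaligned_load(const Memory &mem, std::size_t addr) {
  std::uint32_t r = 0; // stands in for the IMPLICIT_DEF input register
  r = model_lwr(r, mem, addr);
  return model_lwl(r, mem, addr + 3);
}
```

For any `addr` with `addr + 3` inside the array, the result equals the four bytes starting at `addr` read as a little-endian word, whether or not `addr` is a multiple of 4.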

llvm/lib/Target/Mips/MipsMSAInstrInfo.td

	Show First 20 Lines • Show All 2,333 Lines • ▼ Show 20 Lines
	class LD_D_DESC : LD_DESC_BASE<"ld.d", load, v2i64, MSA128DOpnd,			class LD_D_DESC : LD_DESC_BASE<"ld.d", load, v2i64, MSA128DOpnd,
	mem_simm10_lsl3, addrimm10lsl3>;			mem_simm10_lsl3, addrimm10lsl3>;

	class LDI_B_DESC : MSA_I10_LDI_DESC_BASE<"ldi.b", MSA128BOpnd>;			class LDI_B_DESC : MSA_I10_LDI_DESC_BASE<"ldi.b", MSA128BOpnd>;
	class LDI_H_DESC : MSA_I10_LDI_DESC_BASE<"ldi.h", MSA128HOpnd>;			class LDI_H_DESC : MSA_I10_LDI_DESC_BASE<"ldi.h", MSA128HOpnd>;
	class LDI_W_DESC : MSA_I10_LDI_DESC_BASE<"ldi.w", MSA128WOpnd>;			class LDI_W_DESC : MSA_I10_LDI_DESC_BASE<"ldi.w", MSA128WOpnd>;
	class LDI_D_DESC : MSA_I10_LDI_DESC_BASE<"ldi.d", MSA128DOpnd>;			class LDI_D_DESC : MSA_I10_LDI_DESC_BASE<"ldi.d", MSA128DOpnd>;

				class MSA_LOAD_PSEUDO_BASE<SDPatternOperator intrinsic, RegisterOperand RO> :
				PseudoSE<(outs RO:$dst), (ins PtrRC:$ptr, GPR32:$imm),
				[(set RO:$dst, (intrinsic iPTR:$ptr, GPR32:$imm))]> {
				let hasNoSchedulingInfo = 1;
				let usesCustomInserter = 1;
				}

				def LDR_D : MSA_LOAD_PSEUDO_BASE<int_mips_ldr_d, MSA128DOpnd>;
				def LDR_W : MSA_LOAD_PSEUDO_BASE<int_mips_ldr_w, MSA128WOpnd>;

	class LSA_DESC_BASE<string instr_asm, RegisterOperand RORD,
	InstrItinClass itin = NoItinerary> {
	dag OutOperandList = (outs RORD:$rd);
	dag InOperandList = (ins RORD:$rs, RORD:$rt, uimm2_plus1:$sa);
	string AsmString = !strconcat(instr_asm, "\t$rd, $rs, $rt, $sa");
	list<dag> Pattern = [(set RORD:$rd, (add RORD:$rt,
	(shl RORD:$rs,
	immZExt2Lsa:$sa)))];
	[...]
	class ST_B_DESC : ST_DESC_BASE<"st.b", store, v16i8, MSA128BOpnd, mem_simm10>;
	class ST_H_DESC : ST_DESC_BASE<"st.h", store, v8i16, MSA128HOpnd,
	mem_simm10_lsl1, addrimm10lsl1>;
	class ST_W_DESC : ST_DESC_BASE<"st.w", store, v4i32, MSA128WOpnd,
	mem_simm10_lsl2, addrimm10lsl2>;
	class ST_D_DESC : ST_DESC_BASE<"st.d", store, v2i64, MSA128DOpnd,
	mem_simm10_lsl3, addrimm10lsl3>;

				class MSA_STORE_PSEUDO_BASE<SDPatternOperator intrinsic, RegisterOperand RO> :
				PseudoSE<(outs), (ins RO:$dst, PtrRC:$ptr, GPR32:$imm),
				[(intrinsic RO:$dst, iPTR:$ptr, GPR32:$imm)]> {
				let hasNoSchedulingInfo = 1;
				let usesCustomInserter = 1;
				}

				def STR_D : MSA_STORE_PSEUDO_BASE<int_mips_str_d, MSA128DOpnd>;
				def STR_W : MSA_STORE_PSEUDO_BASE<int_mips_str_w, MSA128WOpnd>;

	class SUBS_S_B_DESC : MSA_3R_DESC_BASE<"subs_s.b", int_mips_subs_s_b,
	MSA128BOpnd>;
	class SUBS_S_H_DESC : MSA_3R_DESC_BASE<"subs_s.h", int_mips_subs_s_h,
	MSA128HOpnd>;
	class SUBS_S_W_DESC : MSA_3R_DESC_BASE<"subs_s.w", int_mips_subs_s_w,
	MSA128WOpnd>;
	class SUBS_S_D_DESC : MSA_3R_DESC_BASE<"subs_s.d", int_mips_subs_s_d,
	MSA128DOpnd>;
	[...]

llvm/lib/Target/Mips/MipsSEISelDAGToDAG.cpp

[...]
for (++Inst; Inst != Seq.end(); ++Inst) {
SDValue(RegOpnd, 0), ImmOpnd);
}

ReplaceNode(Node, RegOpnd);
return true;
}

case ISD::INTRINSIC_W_CHAIN: {
switch (cast<ConstantSDNode>(Node->getOperand(1))->getZExtValue()) {		const unsigned IntrinsicOpcode =
		cast<ConstantSDNode>(Node->getOperand(1))->getZExtValue();
		switch (IntrinsicOpcode) {
default:
break;

case Intrinsic::mips_cfcmsa: {
SDValue ChainIn = Node->getOperand(0);
SDValue RegIdx = Node->getOperand(2);
SDValue Reg = CurDAG->getCopyFromReg(ChainIn, DL,
getMSACtrlReg(RegIdx), MVT::i32);
ReplaceNode(Node, Reg.getNode());
return true;
}
		case Intrinsic::mips_ldr_d:
		case Intrinsic::mips_ldr_w: {
		unsigned Op = (IntrinsicOpcode == Intrinsic::mips_ldr_d) ? Mips::LDR_D
		: Mips::LDR_W;

		SDLoc DL(Node);
		assert(Node->getNumOperands() == 4 && "Unexpected number of operands.");
		const SDValue &Chain = Node->getOperand(0);
		const SDValue &Intrinsic = Node->getOperand(1);
		const SDValue &Pointer = Node->getOperand(2);
		const SDValue &Constant = Node->getOperand(3);

		assert(Chain.getValueType() == MVT::Other);
		assert(Intrinsic.getOpcode() == ISD::TargetConstant &&
		Constant.getOpcode() == ISD::Constant &&
		"Invalid instruction operand.");

		// Convert Constant to TargetConstant.
		const ConstantInt *Val =
		cast<ConstantSDNode>(Constant)->getConstantIntValue();
		SDValue Imm =
		CurDAG->getTargetConstant(*Val, DL, Constant.getValueType());

		SmallVector<SDValue, 3> Ops{Pointer, Imm, Chain};

		assert(Node->getNumValues() == 2);
		assert(Node->getValueType(0).is128BitVector());
		assert(Node->getValueType(1) == MVT::Other);
		SmallVector<EVT, 2> ResTys{Node->getValueType(0), Node->getValueType(1)};

		ReplaceNode(Node, CurDAG->getMachineNode(Op, DL, ResTys, Ops));

		return true;
		}
}
break;
}

case ISD::INTRINSIC_WO_CHAIN: {
switch (cast<ConstantSDNode>(Node->getOperand(0))->getZExtValue()) {
default:
break;

case Intrinsic::mips_move_v:
// Like an assignment but will always produce a move.v even if
// unnecessary.
ReplaceNode(Node, CurDAG->getMachineNode(Mips::MOVE_V, DL,
Node->getValueType(0),
Node->getOperand(1)));
return true;
}
break;
}

case ISD::INTRINSIC_VOID: {
switch (cast<ConstantSDNode>(Node->getOperand(1))->getZExtValue()) {		const unsigned IntrinsicOpcode =
		cast<ConstantSDNode>(Node->getOperand(1))->getZExtValue();
		switch (IntrinsicOpcode) {
default:
break;

case Intrinsic::mips_ctcmsa: {
SDValue ChainIn = Node->getOperand(0);
SDValue RegIdx = Node->getOperand(2);
SDValue Value = Node->getOperand(3);
SDValue ChainOut = CurDAG->getCopyToReg(ChainIn, DL,
getMSACtrlReg(RegIdx), Value);
ReplaceNode(Node, ChainOut.getNode());
return true;
}
		case Intrinsic::mips_str_d:
		case Intrinsic::mips_str_w: {
		unsigned Op = (IntrinsicOpcode == Intrinsic::mips_str_d) ? Mips::STR_D
		: Mips::STR_W;

		SDLoc DL(Node);
		assert(Node->getNumOperands() == 5 && "Unexpected number of operands.");
		const SDValue &Chain = Node->getOperand(0);
		const SDValue &Intrinsic = Node->getOperand(1);
		const SDValue &Vec = Node->getOperand(2);
		const SDValue &Pointer = Node->getOperand(3);
		const SDValue &Constant = Node->getOperand(4);

		assert(Chain.getValueType() == MVT::Other);
		assert(Intrinsic.getOpcode() == ISD::TargetConstant &&
		Constant.getOpcode() == ISD::Constant &&
		"Invalid instruction operand.");

		// Convert Constant to TargetConstant.
		const ConstantInt *Val =
		cast<ConstantSDNode>(Constant)->getConstantIntValue();
		SDValue Imm =
		CurDAG->getTargetConstant(*Val, DL, Constant.getValueType());

		SmallVector<SDValue, 4> Ops{Vec, Pointer, Imm, Chain};

		assert(Node->getNumValues() == 1);
		assert(Node->getValueType(0) == MVT::Other);
		SmallVector<EVT, 1> ResTys{Node->getValueType(0)};

		ReplaceNode(Node, CurDAG->getMachineNode(Op, DL, ResTys, Ops));
		return true;
		}
}
break;
}

// Manually match MipsISD::Ins nodes to get the correct instruction. It has
// to be done in this fashion so that we respect the differences between
// dins and dinsm, as the difference is that the size operand has the range
// 0 < size <= 32 for dins while dinsm has the range 2 <= size <= 64 which
[...]

llvm/test/CodeGen/Mips/msa/ldr_str.ll

This file was added.

				; NOTE: Assertions have been autogenerated by utils/update_llc_test_checks.py
				; RUN: llc -march=mips -mcpu=mips32r5 -mattr=+msa,+fp64 -O0 < %s | FileCheck %s --check-prefix=MIPS32R5-EB
				; RUN: llc -march=mipsel -mcpu=mips32r5 -mattr=+msa,+fp64 -O0 < %s | FileCheck %s --check-prefix=MIPS32R5-EL
				; RUN: llc -march=mips -mcpu=mips32r6 -mattr=+msa,+fp64 -O0 < %s | FileCheck %s --check-prefix=MIPS32R6-EB
				; RUN: llc -march=mipsel -mcpu=mips32r6 -mattr=+msa,+fp64 -O0 < %s | FileCheck %s --check-prefix=MIPS32R6-EL
				; RUN: llc -march=mips64 -mcpu=mips64r6 -mattr=+msa,+fp64 -O0 < %s | FileCheck %s --check-prefix=MIPS64R6
				; RUN: llc -march=mips64el -mcpu=mips64r6 -mattr=+msa,+fp64 -O0 < %s | FileCheck %s --check-prefix=MIPS64R6

				; Test intrinsics for 4-byte and 8-byte MSA load and stores.

				define void @llvm_mips_ldr_d_test(<2 x i64>* %val, i8* %ptr) nounwind {
				; MIPS32R5-EB-LABEL: llvm_mips_ldr_d_test:
				; MIPS32R5-EB: # %bb.0: # %entry
				; MIPS32R5-EB-NEXT: # implicit-def: $at
				; MIPS32R5-EB-NEXT: lwr $1, 23($5)
				; MIPS32R5-EB-NEXT: lwl $1, 20($5)
				; MIPS32R5-EB-NEXT: # implicit-def: $v0
				; MIPS32R5-EB-NEXT: lwr $2, 19($5)
				; MIPS32R5-EB-NEXT: lwl $2, 16($5)
				; MIPS32R5-EB-NEXT: fill.w $w0, $1
				; MIPS32R5-EB-NEXT: insert.w $w0[1], $2
				; MIPS32R5-EB-NEXT: st.d $w0, 0($4)
				; MIPS32R5-EB-NEXT: jr $ra
				; MIPS32R5-EB-NEXT: nop
				;
				; MIPS32R5-EL-LABEL: llvm_mips_ldr_d_test:
				; MIPS32R5-EL: # %bb.0: # %entry
				; MIPS32R5-EL-NEXT: # implicit-def: $at
				; MIPS32R5-EL-NEXT: lwr $1, 16($5)
				; MIPS32R5-EL-NEXT: lwl $1, 19($5)
				; MIPS32R5-EL-NEXT: # implicit-def: $v0
				; MIPS32R5-EL-NEXT: lwr $2, 20($5)
				; MIPS32R5-EL-NEXT: lwl $2, 23($5)
				; MIPS32R5-EL-NEXT: fill.w $w0, $1
				; MIPS32R5-EL-NEXT: insert.w $w0[1], $2
				; MIPS32R5-EL-NEXT: st.d $w0, 0($4)
				; MIPS32R5-EL-NEXT: jr $ra
				; MIPS32R5-EL-NEXT: nop
				;
				; MIPS32R6-EB-LABEL: llvm_mips_ldr_d_test:
				; MIPS32R6-EB: # %bb.0: # %entry
				; MIPS32R6-EB-NEXT: lw $1, 20($5)
				; MIPS32R6-EB-NEXT: lw $2, 16($5)
				; MIPS32R6-EB-NEXT: fill.w $w0, $1
				; MIPS32R6-EB-NEXT: insert.w $w0[1], $2
				; MIPS32R6-EB-NEXT: st.d $w0, 0($4)
				; MIPS32R6-EB-NEXT: jrc $ra
				;
				; MIPS32R6-EL-LABEL: llvm_mips_ldr_d_test:
				; MIPS32R6-EL: # %bb.0: # %entry
				; MIPS32R6-EL-NEXT: lw $1, 16($5)
				; MIPS32R6-EL-NEXT: lw $2, 20($5)
				; MIPS32R6-EL-NEXT: fill.w $w0, $1
				; MIPS32R6-EL-NEXT: insert.w $w0[1], $2
				; MIPS32R6-EL-NEXT: st.d $w0, 0($4)
				; MIPS32R6-EL-NEXT: jrc $ra
				;
				; MIPS64R6-LABEL: llvm_mips_ldr_d_test:
				; MIPS64R6: # %bb.0: # %entry
				; MIPS64R6-NEXT: ld $1, 16($5)
				; MIPS64R6-NEXT: fill.d $w0, $1
				; MIPS64R6-NEXT: st.d $w0, 0($4)
				; MIPS64R6-NEXT: jrc $ra
				entry:
				%0 = tail call <2 x i64> @llvm.mips.ldr.d(i8* %ptr, i32 16)
				store <2 x i64> %0, <2 x i64>* %val
				ret void
				}

				declare <2 x i64> @llvm.mips.ldr.d(i8*, i32) nounwind

				define void @llvm_mips_ldr_w_test(<4 x i32>* %val, i8* %ptr) nounwind {
				; MIPS32R5-EB-LABEL: llvm_mips_ldr_w_test:
				; MIPS32R5-EB: # %bb.0: # %entry
				; MIPS32R5-EB-NEXT: # implicit-def: $at
				; MIPS32R5-EB-NEXT: lwr $1, 19($5)
				; MIPS32R5-EB-NEXT: lwl $1, 16($5)
				; MIPS32R5-EB-NEXT: fill.w $w0, $1
				; MIPS32R5-EB-NEXT: st.w $w0, 0($4)
				; MIPS32R5-EB-NEXT: jr $ra
				; MIPS32R5-EB-NEXT: nop
				;
				; MIPS32R5-EL-LABEL: llvm_mips_ldr_w_test:
				; MIPS32R5-EL: # %bb.0: # %entry
				; MIPS32R5-EL-NEXT: # implicit-def: $at
				; MIPS32R5-EL-NEXT: lwr $1, 16($5)
				; MIPS32R5-EL-NEXT: lwl $1, 19($5)
				; MIPS32R5-EL-NEXT: fill.w $w0, $1
				; MIPS32R5-EL-NEXT: st.w $w0, 0($4)
				; MIPS32R5-EL-NEXT: jr $ra
				; MIPS32R5-EL-NEXT: nop
				;
				; MIPS32R6-EB-LABEL: llvm_mips_ldr_w_test:
				; MIPS32R6-EB: # %bb.0: # %entry
				; MIPS32R6-EB-NEXT: lw $1, 16($5)
				; MIPS32R6-EB-NEXT: fill.w $w0, $1
				; MIPS32R6-EB-NEXT: st.w $w0, 0($4)
				; MIPS32R6-EB-NEXT: jrc $ra
				;
				; MIPS32R6-EL-LABEL: llvm_mips_ldr_w_test:
				; MIPS32R6-EL: # %bb.0: # %entry
				; MIPS32R6-EL-NEXT: lw $1, 16($5)
				; MIPS32R6-EL-NEXT: fill.w $w0, $1
				; MIPS32R6-EL-NEXT: st.w $w0, 0($4)
				; MIPS32R6-EL-NEXT: jrc $ra
				;
				; MIPS64R6-LABEL: llvm_mips_ldr_w_test:
				; MIPS64R6: # %bb.0: # %entry
				; MIPS64R6-NEXT: lw $1, 16($5)
				; MIPS64R6-NEXT: fill.w $w0, $1
				; MIPS64R6-NEXT: st.w $w0, 0($4)
				; MIPS64R6-NEXT: jrc $ra
				entry:
				%0 = tail call <4 x i32> @llvm.mips.ldr.w(i8* %ptr, i32 16)
				store <4 x i32> %0, <4 x i32>* %val
				ret void
				}

				declare <4 x i32> @llvm.mips.ldr.w(i8*, i32) nounwind

				define void @llvm_mips_str_d_test(<2 x i64>* %val, i8* %ptr) nounwind {
				; MIPS32R5-EB-LABEL: llvm_mips_str_d_test:
				; MIPS32R5-EB: # %bb.0: # %entry
				; MIPS32R5-EB-NEXT: ld.d $w0, 0($4)
				; MIPS32R5-EB-NEXT: copy_s.w $1, $w0[0]
				; MIPS32R5-EB-NEXT: copy_s.w $2, $w0[1]
				; MIPS32R5-EB-NEXT: swr $1, 19($5)
				; MIPS32R5-EB-NEXT: swl $1, 16($5)
				; MIPS32R5-EB-NEXT: swr $2, 23($5)
				; MIPS32R5-EB-NEXT: swl $2, 20($5)
				; MIPS32R5-EB-NEXT: jr $ra
				; MIPS32R5-EB-NEXT: nop
				;
				; MIPS32R5-EL-LABEL: llvm_mips_str_d_test:
				; MIPS32R5-EL: # %bb.0: # %entry
				; MIPS32R5-EL-NEXT: ld.d $w0, 0($4)
				; MIPS32R5-EL-NEXT: copy_s.w $1, $w0[0]
				; MIPS32R5-EL-NEXT: copy_s.w $2, $w0[1]
				; MIPS32R5-EL-NEXT: swr $1, 16($5)
				; MIPS32R5-EL-NEXT: swl $1, 19($5)
				; MIPS32R5-EL-NEXT: swr $2, 20($5)
				; MIPS32R5-EL-NEXT: swl $2, 23($5)
				; MIPS32R5-EL-NEXT: jr $ra
				; MIPS32R5-EL-NEXT: nop
				;
				; MIPS32R6-EB-LABEL: llvm_mips_str_d_test:
				; MIPS32R6-EB: # %bb.0: # %entry
				; MIPS32R6-EB-NEXT: ld.d $w0, 0($4)
				; MIPS32R6-EB-NEXT: copy_s.w $1, $w0[0]
				; MIPS32R6-EB-NEXT: copy_s.w $2, $w0[1]
				; MIPS32R6-EB-NEXT: sw $1, 20($5)
				; MIPS32R6-EB-NEXT: sw $2, 16($5)
				; MIPS32R6-EB-NEXT: jrc $ra
				;
				; MIPS32R6-EL-LABEL: llvm_mips_str_d_test:
				; MIPS32R6-EL: # %bb.0: # %entry
				; MIPS32R6-EL-NEXT: ld.d $w0, 0($4)
				; MIPS32R6-EL-NEXT: copy_s.w $1, $w0[0]
				; MIPS32R6-EL-NEXT: copy_s.w $2, $w0[1]
				; MIPS32R6-EL-NEXT: sw $1, 16($5)
				; MIPS32R6-EL-NEXT: sw $2, 20($5)
				; MIPS32R6-EL-NEXT: jrc $ra
				;
				; MIPS64R6-LABEL: llvm_mips_str_d_test:
				; MIPS64R6: # %bb.0: # %entry
				; MIPS64R6-NEXT: ld.d $w0, 0($4)
				; MIPS64R6-NEXT: copy_s.d $1, $w0[0]
				; MIPS64R6-NEXT: sd $1, 16($5)
				; MIPS64R6-NEXT: jrc $ra
				entry:
				%0 = load <2 x i64>, <2 x i64>* %val
				tail call void @llvm.mips.str.d(<2 x i64> %0, i8* %ptr, i32 16)
				ret void
				}

				declare void @llvm.mips.str.d(<2 x i64>, i8*, i32) nounwind

				define void @llvm_mips_str_w_test(<4 x i32>* %val, i8* %ptr) nounwind {
				; MIPS32R5-EB-LABEL: llvm_mips_str_w_test:
				; MIPS32R5-EB: # %bb.0: # %entry
				; MIPS32R5-EB-NEXT: ld.w $w0, 0($4)
				; MIPS32R5-EB-NEXT: copy_s.w $1, $w0[0]
				; MIPS32R5-EB-NEXT: swr $1, 19($5)
				; MIPS32R5-EB-NEXT: swl $1, 16($5)
				; MIPS32R5-EB-NEXT: jr $ra
				; MIPS32R5-EB-NEXT: nop
				;
				; MIPS32R5-EL-LABEL: llvm_mips_str_w_test:
				; MIPS32R5-EL: # %bb.0: # %entry
				; MIPS32R5-EL-NEXT: ld.w $w0, 0($4)
				; MIPS32R5-EL-NEXT: copy_s.w $1, $w0[0]
				; MIPS32R5-EL-NEXT: swr $1, 16($5)
				; MIPS32R5-EL-NEXT: swl $1, 19($5)
				; MIPS32R5-EL-NEXT: jr $ra
				; MIPS32R5-EL-NEXT: nop
				;
				; MIPS32R6-EB-LABEL: llvm_mips_str_w_test:
				; MIPS32R6-EB: # %bb.0: # %entry
				; MIPS32R6-EB-NEXT: ld.w $w0, 0($4)
				; MIPS32R6-EB-NEXT: copy_s.w $1, $w0[0]
				; MIPS32R6-EB-NEXT: sw $1, 16($5)
				; MIPS32R6-EB-NEXT: jrc $ra
				;
				; MIPS32R6-EL-LABEL: llvm_mips_str_w_test:
				; MIPS32R6-EL: # %bb.0: # %entry
				; MIPS32R6-EL-NEXT: ld.w $w0, 0($4)
				; MIPS32R6-EL-NEXT: copy_s.w $1, $w0[0]
				; MIPS32R6-EL-NEXT: sw $1, 16($5)
				; MIPS32R6-EL-NEXT: jrc $ra
				;
				; MIPS64R6-LABEL: llvm_mips_str_w_test:
				; MIPS64R6: # %bb.0: # %entry
				; MIPS64R6-NEXT: ld.w $w0, 0($4)
				; MIPS64R6-NEXT: copy_s.w $1, $w0[0]
				; MIPS64R6-NEXT: sw $1, 16($5)
				; MIPS64R6-NEXT: jrc $ra
				entry:
				%0 = load <4 x i32>, <4 x i32>* %val
				tail call void @llvm.mips.str.w(<4 x i32> %0, i8* %ptr, i32 16)
				ret void
				}

				declare void @llvm.mips.str.w(<4 x i32>, i8*, i32) nounwind
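
For readers porting SIMD code, the intent of the four intrinsics (only element 0 of the vector is loaded or stored) can be modelled in portable C++. This is a sketch under stated assumptions, not the MSA implementation: `V4I32`, `V2I64`, `modelLdrW`, `modelStrW`, `modelLdrD` and `modelStrD` are hypothetical names, the lanes other than element 0 are zeroed here purely for determinism, and the lwl/lwr and swl/swr sequences in the checks above are taken as a hint that unaligned addresses are meant to work.

```cpp
#include <array>
#include <cassert>
#include <cstdint>
#include <cstring>

// Stand-ins for the MSA vector types; real code would use the types from msa.h.
using V4I32 = std::array<std::int32_t, 4>;
using V2I64 = std::array<std::int64_t, 2>;

// Model of ldr.w: read 4 bytes at Ptr + Imm into element 0.
V4I32 modelLdrW(const void *Ptr, int Imm) {
  V4I32 V{}; // other lanes zeroed only in this model
  std::memcpy(&V[0], static_cast<const unsigned char *>(Ptr) + Imm, sizeof(V[0]));
  return V;
}

// Model of str.w: write element 0 (4 bytes) to Ptr + Imm.
void modelStrW(const V4I32 &V, void *Ptr, int Imm) {
  std::memcpy(static_cast<unsigned char *>(Ptr) + Imm, &V[0], sizeof(V[0]));
}

// Model of ldr.d: read 8 bytes at Ptr + Imm into element 0.
V2I64 modelLdrD(const void *Ptr, int Imm) {
  V2I64 V{}; // other lane zeroed only in this model
  std::memcpy(&V[0], static_cast<const unsigned char *>(Ptr) + Imm, sizeof(V[0]));
  return V;
}

// Model of str.d: write element 0 (8 bytes) to Ptr + Imm.
void modelStrD(const V2I64 &V, void *Ptr, int Imm) {
  std::memcpy(static_cast<unsigned char *>(Ptr) + Imm, &V[0], sizeof(V[0]));
}
```

The model uses memcpy so that it stays well-defined for any pointer alignment; it says nothing about the instruction sequences the backend actually selects.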
