New intrinsics are implemented for cases where we need to port SIMD code from
other architectures and load or store only portions of MSA registers.
The following intrinsics are added; they load/store only element 0 of a vector:
v4i32 __builtin_msa_ldr_w (const void *, imm_n2048_2044);
v2i64 __builtin_msa_ldr_d (const void *, imm_n4096_4088);
void __builtin_msa_str_w (v4i32, void *, imm_n2048_2044);
void __builtin_msa_str_d (v2i64, void *, imm_n4096_4088);
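
To illustrate the "element 0 only" behaviour of the ldr_w/str_w pair listed above, here is a minimal portable sketch in plain C. It is a model, not the real intrinsics: model_v4i32, model_ldr_w and model_str_w are hypothetical names, and the code runs on any host without MSA support. It assumes the immediate is a byte offset in the range -2048..2044 in steps of 4, as the type name imm_n2048_2044 suggests. Elements 1..3 are zeroed on load purely to keep the model deterministic; the text above does not say what the real intrinsic leaves in them.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the MSA v4i32 type (four 32-bit lanes). */
typedef struct { int32_t e[4]; } model_v4i32;

/* Model of ldr_w: read a single 32-bit word at base + offset into element 0.
   Assumption: offset is a byte offset, a multiple of 4 in [-2048, 2044].
   Assumption: elements 1..3 are set to zero; the real contents are not
   specified by the description above. */
static model_v4i32 model_ldr_w(const void *base, int offset)
{
    assert(offset % 4 == 0 && offset >= -2048 && offset <= 2044);
    model_v4i32 v = { { 0, 0, 0, 0 } };
    memcpy(&v.e[0], (const char *)base + offset, sizeof v.e[0]);
    return v;
}

/* Model of str_w: write only element 0 (4 bytes) to base + offset.
   The bytes around the destination word are left untouched. */
static void model_str_w(model_v4i32 v, void *base, int offset)
{
    assert(offset % 4 == 0 && offset >= -2048 && offset <= 2044);
    memcpy((char *)base + offset, &v.e[0], sizeof v.e[0]);
}
```

In this model, loading from an int32_t array at offset 4 picks up only the second word, and a following store at offset 8 overwrites only the third word of the destination. The ldr_d/str_d pair would follow the same pattern with one 64-bit element and, under the same assumption, offsets in steps of 8.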