Add test files and enable the streaming-mode flag for:
bit-counting.ll
bitselect.ll
insert-vector-elt.ll
subvector.ll
vector-shuffle.ll
int-immediates.ll
int-minmax.ll
int-reduce.ll
trunc.ll
int-compare.ll
int-vselect.ll
mask-opt.ll
masked-scatter.ll
masked-gather.ll
fp-compares.ll
fp-extend-trunc.ll
addressing-modes.ll
fp-arith.ll
int-select.ll
log-reduce.ll
ld2-alloca.ll
Add the changes needed to force generation of code compatible with streaming mode:
1- enable custom lowering for CTLZ and CTPOP, (needed for bit-counting.ll test).
2- enable custom lowering for insert_vector_elt, (needed for insert-vector-elt.ll test).
3- enable custom lowering for vector SETCC, (needed for subvector.ll and int-compare.ll tests).
4- enable custom lowering for SMIN, SMAX, UMIN, UMAX, (needed for int-minmax.ll and int-immediates.ll tests).
5- enable custom lowering for vecreduce_smin/smax/umin/umax/add, (needed for int-reduce.ll test).
6- enable custom lowering for truncate, (needed for trunc.ll test).
7- enable custom lowering for truncStore, (needed for fp-extend-trunc.ll test).
8- enable expanding setueq to avoid custom lowering of setcc to setcc_merge_zero, which causes a crash during instruction selection because no pattern matches it, (needed for fp-compares.ll test).
9- disable combining OR into BSL, (needed for bitselect.ll test).
10- disable lowering of interleaved loads to avoid generating an invalid NEON intrinsic, (needed for ld2-alloca.ll test).
11- use the SVE OR instruction instead of the NEON OR when copying physical registers in AArch64InstrInfo::copyPhysReg, (needed for vector-shuffle.ll test).
12- force scalarisation for masked gather/scatter, because they are not supported in streaming mode.
I think we can remove this test because the input vector is wider than 256 bits.