Expand large or unknown-size memory intrinsics into loops in the
default lowering pipeline if the target doesn't have the corresponding
libfunc. Previously, AMDGPU had a custom pass that existed to call the
expansion utilities.
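
As a rough source-level picture of what "expanding into a loop" means, the
sketch below copies bytes one at a time instead of calling memcpy. It is
only an illustration: the function name `expanded_memcpy_sketch` is
hypothetical, and the real expansion works on IR and may well use wider
loads and stores plus a residual loop rather than a plain byte loop.

```cpp
#include <cstddef>

// Hypothetical illustration, not code from this patch: the shape of a
// loop that stands in for a memcpy whose size is only known at run time.
// Assumes the two buffers do not overlap, as memcpy itself requires.
void expanded_memcpy_sketch(unsigned char *dst, const unsigned char *src,
                            std::size_t n) {
  // One byte per iteration; n == 0 copies nothing.
  for (std::size_t i = 0; i != n; ++i)
    dst[i] = src[i];
}
```

Note that an optimizing compiler may recognize a loop like this at the
source level and turn it back into a memcpy call, which is the kind of
transformation the LoopIdiomRecognize libfunc checks are guarding.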
With a default no-libcall option, we can remove the libfunc checks in
LoopIdiomRecognize for these, which never made any sense. This also
provides a path to lifting the immarg restriction on
llvm.memcpy.inline.
There seems to be a bug where TLI reports functions as available if
you use -march and not -mtriple.
What happens when someone compiles a libc?
Will all the calls to memcpy in libc files be converted into for loops?