We should query the subtarget of the calling function, not of the intrinsic.
This probably makes no functional difference, since libcalls are unlikely to vary across subtargets, but it fixes minor compile-time regressions caused by unnecessary subtarget instantiations.
Followup to D157567.