AArch64 and X86 do the exact same thing for this.
Rather than requiring that the target-cpu match exactly,
check the subtarget features when target info is available.
This fixes OpenCL library functions not being inlined on AMDGPU,
because they don't have an explicitly set target-cpu.
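
For readers unfamiliar with the check, the subset rule described above can be sketched roughly as follows. This is a standalone illustration and not the code in this patch: `FeatureMask`, `FeatureA`/`FeatureB`/`FeatureC`, and `calleeIsSubsetOfCaller` are hypothetical names, and a plain integer mask stands in for the real subtarget feature bits.

```cpp
#include <cstdint>

// Hypothetical feature bits, for illustration only; real subtarget
// features are defined per backend.
using FeatureMask = std::uint32_t;
constexpr FeatureMask FeatureA = 1u << 0;
constexpr FeatureMask FeatureB = 1u << 1;
constexpr FeatureMask FeatureC = 1u << 2;

// Assumed rule: inlining is allowed when every feature the callee was
// compiled with is also enabled in the caller, i.e. the callee's
// feature set is a subset of the caller's.
constexpr bool calleeIsSubsetOfCaller(FeatureMask Caller, FeatureMask Callee) {
  return (Caller & Callee) == Callee;
}
```

Under this rule a callee with no features set (such as a library function built without an explicit target-cpu) is compatible with any caller, while a callee that needs `FeatureC` is rejected by a caller that lacks it.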
I think this implementation is not conservative enough for some targets. For example, the ARM backend has a more conservative implementation: https://github.com/llvm-mirror/llvm/blob/master/lib/Target/ARM/ARMTargetTransformInfo.cpp#L18
Some target-features in the ARM backend affect the generated code (e.g. thumb-mode). So if we allow inlining for all subsets, we could change the "thumbness" of the inlined function. Other backends might have similar tricky target-features to deal with.
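
To make the concern concrete, here is a minimal sketch of what a stricter check could look like: features that change code generation must match exactly, and only an explicitly listed set is allowed to follow the subset rule. All names here (`FeatureThumbMode`, `FeatureSimd`, `FeatureCrypto`, `SubsetSafe`, `areInlineCompatibleSketch`) and the bit assignments are assumptions for illustration; this is not the ARM backend's actual code.

```cpp
#include <cstdint>

using FeatureMask = std::uint32_t;

// Hypothetical feature bits, for illustration only.
constexpr FeatureMask FeatureThumbMode = 1u << 0; // changes how code is generated
constexpr FeatureMask FeatureSimd      = 1u << 1;
constexpr FeatureMask FeatureCrypto    = 1u << 2;

// Assumed list of features for which "callee is a subset of caller" is safe.
constexpr FeatureMask SubsetSafe = FeatureSimd | FeatureCrypto;

constexpr bool areInlineCompatibleSketch(FeatureMask Caller, FeatureMask Callee) {
  return
      // Features outside the safe list (e.g. thumb-mode) must match exactly.
      ((Caller & ~SubsetSafe) == (Callee & ~SubsetSafe)) &&
      // Features inside the safe list only need the callee to be a subset.
      ((Caller & Callee & SubsetSafe) == (Callee & SubsetSafe));
}
```

With a plain subset check, a callee without the thumb-mode bit would be accepted by a caller that has it; the exact-match term above rejects that pair, which is the "thumbness" case described in the comment.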