TTI: Use a better default for areInlineCompatibl

Authored by arsenm on Aug 7 2017, 8:06 AM.



AArch64 and X86 do the exact same thing for this.
Rather than requiring that the target-cpu match exactly,
check the subtarget features if target info is available.

Fixes not inlining OpenCL library functions on AMDGPU,
which don't have an explicitly set target-cpu.
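
The subset check described in the summary can be sketched roughly as follows. This is a minimal, self-contained illustration and not the patch itself: `FeatureBits`, `NumFeatures`, and `calleeFeaturesAreSubset` are hypothetical names, and a plain `std::bitset` stands in for a target's real subtarget feature set.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical stand-in for a subtarget feature set. Real targets define
// their own feature enumerations; the size here is arbitrary.
constexpr std::size_t NumFeatures = 8;
using FeatureBits = std::bitset<NumFeatures>;

// Returns true when every feature enabled for the callee is also enabled
// for the caller, i.e. the callee's features are a subset of the caller's.
// The target-cpu name is deliberately not compared.
bool calleeFeaturesAreSubset(const FeatureBits &CallerBits,
                             const FeatureBits &CalleeBits) {
  return (CallerBits & CalleeBits) == CalleeBits;
}
```

Under this sketch, a callee with no features set (for example, a library function compiled without an explicit target-cpu) is compatible with any caller, which is the AMDGPU case the summary mentions.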

Event Timeline

arsenm created this revision. Aug 7 2017, 8:06 AM
fhahn added a subscriber: fhahn. Aug 7 2017, 8:19 AM
fhahn added inline comments.

I think this implementation is not conservative enough for some targets. For example, the ARM backend has a more conservative implementation.

Some target-features in the ARM backend have an impact on the generated code (e.g. thumb-mode). So if we allow inlining for all subsets, we could change the "thumbness" of the inlined function. Other backends might have similar tricky target-features to deal with.
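
To make the concern concrete, here is a rough sketch of a more conservative check in which mode-like features must match exactly and only the remaining features use the subset rule. It is an illustration under stated assumptions, not the ARM backend's actual code: `ModeAwareBits`, `ModeFeatureCount`, `MustMatchMask`, and `conservativeInlineCheck` are hypothetical names, and which features belong in the mask is a per-target decision.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical feature-set type; the size is arbitrary for illustration.
constexpr std::size_t ModeFeatureCount = 8;
using ModeAwareBits = std::bitset<ModeFeatureCount>;

// MustMatchMask marks features that change how code is generated (for
// example, a thumb-mode bit) rather than merely adding capabilities.
bool conservativeInlineCheck(const ModeAwareBits &CallerBits,
                             const ModeAwareBits &CalleeBits,
                             const ModeAwareBits &MustMatchMask) {
  // Mode-like features must be identical in caller and callee.
  if ((CallerBits & MustMatchMask) != (CalleeBits & MustMatchMask))
    return false;

  // All other features are treated as additive: the callee's set must be
  // a subset of the caller's set.
  const ModeAwareBits AdditiveMask = ~MustMatchMask;
  return (CallerBits & CalleeBits & AdditiveMask) ==
         (CalleeBits & AdditiveMask);
}
```

With bit 0 standing in for thumb-mode, a caller with bits `0x7` and a callee with bits `0x6` would pass a plain subset check but is rejected here, because the two functions disagree on the mode bit.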

MatzeB requested changes to this revision. Aug 15 2017, 11:14 AM

I also think this patch is too optimistic. Not comparing the CPU name/feature attributes seems daring but doable to me. But assuming that all features are additive, so that we only need to check whether the callee's flags are a subset of the caller's flags, seems too optimistic to me.

This revision now requires changes to proceed. Aug 15 2017, 11:14 AM
arsenm abandoned this revision. Aug 28 2017, 4:06 PM