Previously the time complexity is O(|number of paths from the root to an
implied feature| * CPU_FWATURE_MAX) where CPU_FEATURE_MAX is 92.
The number of paths can be large (theoretically exponential).
For an inline asm statement, there is a code path
clang::Parser::ParseAsmStatement -> clang::Sema::ActOnGCCAsmStmt -> ASTContext::getFunctionFeatureMap
leading to potentially many calls of getImpliedEnabledFeatures (41 for my -march=native case).
We should improve the performance a bit in case the number of inline asm
statements is large (Linux kernel builds).
are these more concise to return llvm::any_of here and above?