This is an archive of the discontinued LLVM Phabricator instance.

Expression eval lookup - prune methods without parsing
AbandonedPublic

Authored by jarin on Nov 5 2019, 6:43 AM.

Details

Summary

In large codebases, we sometimes see Module::FindFunctions (when called from
ClangExpressionDeclMap::FindExternalVisibleDecls) returning huge amounts of
functions. For example, a simple unreal engine example contains

10000 of functions with the name 'operator*'. Evaluating an expression "*x",

where x is an instance of a class, will cause all those function's decl
contexts to be parsed, taking multiple seconds. However, most of those
parsed contexts will be immediately thrown away because most of the
functions are methods.

With this patch, I am trying to avoid the parsing by checking method-ness
directly from debug info rather than from parsed contexts. This helps a lot:
our unreal engine expression evaluation pause goes roughly from 4.5s to 0.3s.

However, my patch feels wrong - ideally, we would ignore methods already
during lookup (in ManualDWARFIndex::GetFunctions).

A resonable solution would be to change the meaning of
eFunctionNameTypeFull to only mean mangled names, and if the caller
wanted methods and basenames, they would have to call with
eFunctionNameTypeFull | eFunctionNameTypeMethod |
eFunctionNameTypeBase. This would require some amount churn at
(Get|Find)Functions call sites. Also, I am not sure about implications
for Apple's lookup index.

Thoughts?

Diff Detail

Event Timeline

jarin created this revision.Nov 5 2019, 6:43 AM

A resonable solution would be to change the meaning of
eFunctionNameTypeFull to only mean mangled names, and if the caller
wanted methods and basenames, they would have to call with
eFunctionNameTypeFull | eFunctionNameTypeMethod |
eFunctionNameTypeBase. This would require some amount churn at
(Get|Find)Functions call sites. Also, I am not sure about implications
for Apple's lookup index.

This sounds like a good path forward to me. I remember being bothered by the fact that we were returning ns::foo, when someone asked for a function with a "full" name foo when I worked on this code last time. However, I didn't do anything about that back when I was working on this code, as it wasn't that relevant for me then...

Unfortunately, just simply removing the

m_set.function_basenames.Find(name, offsets);
m_set.function_methods.Find(name, offsets);

lines from the manual dwarf index causes a number of tests to fail, so it looks like there are thing which do depend on us returning the extra things here. These need to be tracked down and fixed (possibly by adding | eFunctionNameTypeMethod | eFunctionNameTypeBase to the FindFunctions call, possibly in another way...). As for the Apple (and debug_names) indexes, they can not avoid looking at the dwarf DIE to eliminate methods (because the accelerator tables don't store this information), but they can definitely check whether something is a method *after* looking at the DWARF but *before* returning from the GetFunctions method (which should still be a major improvement over status quo).

I thought about something similar in the context of https://reviews.llvm.org/D68678 (that's for FIndTypes, but it we the same kind of problems there). Like Pavel said, it would be valuable to be able to filter after the accelerator table lookup, but before building, e.g., clang AST nodes.

jarin added a comment.Nov 6 2019, 4:50 AM

Thanks for the feedback! We will experiment with filtering in GetFunctions sometime next week.

Regarding the FindTypes patch, it would be really nice to have that for Linux, as well. I see the type pruning (TypeMap::RemoveMismatchedTypes) taking several seconds for some expression evaluations; it is likely to be the same problem you are solving.

jarin abandoned this revision.Jan 23 2020, 2:06 AM