The current getFoldedSizeOf() implementation uses naive recursion, which can be very slow for complex struct types because the same subtypes are visited repeatedly.
This issue was first brought up in http://llvm.org/bugs/show_bug.cgi?id=8281; this change fixes it by adding memoization.
This looks up Ty in the cache twice; you can do it once, e.g., by remembering the result of find.
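To make the suggestion concrete, here is a minimal sketch of the single-lookup pattern. It is not the patch itself: ToyType, SizeCache, and foldedSizeOf are hypothetical stand-ins, and std::unordered_map is used in place of whatever map the real code uses.

```cpp
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a type node; this is not LLVM's Type class.
struct ToyType {
  std::uint64_t ScalarSize = 0;          // size of a leaf type; 0 for aggregates
  std::vector<const ToyType *> Elements; // member types, possibly shared
};

using SizeCache = std::unordered_map<const ToyType *, std::uint64_t>;

// Memoized size computation with one lookup on the cache-hit path.
std::uint64_t foldedSizeOf(const ToyType *Ty, SizeCache &Cache) {
  // Keep the find() result instead of calling count() and then operator[].
  auto It = Cache.find(Ty);
  if (It != Cache.end())
    return It->second;

  std::uint64_t Size = Ty->ScalarSize;
  for (const ToyType *Elt : Ty->Elements)
    Size += foldedSizeOf(Elt, Cache); // may insert into Cache and rehash

  // It is not reused here: the recursive calls above may have invalidated it.
  Cache.emplace(Ty, Size);
  return Size;
}
```

Note that the miss path still does a find plus an insert. The iterator from find is deliberately not carried across the recursive calls, since insertions made during recursion can invalidate it.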