The existing logic was to discard any symbols representing function template instantiations, as the definitions were assumed to be inline. But there are three explicit specializations of clang::Type::getAs that are only defined in Clang's lib/AST/Type.cpp, and at least the plugin used by the LibreOffice build (https://wiki.documentfoundation.org/Development/Clang_plugins) uses those functions.
Details
Diff Detail
Event Timeline
utils/extract_symbols.py | ||
---|---|---|
136 | Instead of listing each symbol specifically it would be better to use a regex, e.g. re.match('??$getAs.+Type@clang@@') (note: I haven't tested this regex to check that it works). | |
203 | Same here, though here you can just check that the symbol starts with _ZNK5clang4Type5getAs. |
utils/extract_symbols.py | ||
---|---|---|
203 | We still want to filter out the majority of clang::Type::getAs intstantiations (as they're implicit instantiations of the inline function template), so I thought listing the three symbols explicitly in the two should_keep_*_symbol functions is the best way. An alternative would be to use a regex that matches just those three cases---I can do that change if you prefer it that way. |
utils/extract_symbols.py | ||
---|---|---|
203 | My concern here is that if things change in the future (e.g. more specializations of clang::Type::getAs are added in the future) then the code here would have to change, but a regex would hopefully be more robust. Also from a quick check (nm clang | grep 4Type5getAs | wc -l) there are 37 instances of this function, which isn't many. |
utils/extract_symbols.py | ||
---|---|---|
203 | "My concern here is that if things change in the future (e.g. more specializations of clang::Type::getAs are added in the future) then the code here would have to change": as it would have to when other non-inline explicit specializations are added; those are apparently always special cases that need some sort of manual maintenance here (unless all function template instantiations should be "filtered in") "but a regex would hopefully be more robust": more robust in what sense? "there are 37 instances of this function, which isn't many": I'm not sure I get what you want to imply with that (that it would be OK to "filter in" all 37 of them, implicit instantiations as well as explicit specializations?) |
utils/extract_symbols.py | ||
---|---|---|
203 |
If we check for symbols that start with _ZNK5clang4Type5getAs (or the equivalent regex on the microsoft mangling side of things) then that will match all instantiations of the clang::type::getAs function template (both of the generic version in the header file and any explicit specializations). Currently there are three explicit specializations but if more were added in the future then exporting all clang::type::getAs symbols would mean these new specializations would also be exported, whereas exporting only the three listed symbols would mean any new specializations would not be exported. (I have no idea if this is even remotely likely, so this may be an entirely theoretical concern.)
The reason we're doing all this stuff with deciding whether or not to export certain symbols is that we have a limit of 65535 symbols that can be exported from a dll/exe on Windows (when using MSVC), and 37 isn't many compared to 65535. Exporting symbols that we don't need to (e.g. the instances of clang::type::getAs that get instantiated from the generic definition in the header file) is perfectly fine, just so long as we don't go over that limit of 65535. This does mean that should_keep_itanium_symbol is, to some degree, pointless, as we don't have that limit when building using gcc, but it's there for the sake of consistency which is valuable in that it reduces platform-specific sources of failure (e.g. if something isn't working due to a missing symbol on Windows then you can then debug that on Linux). |
utils/extract_symbols.py | ||
---|---|---|
203 | Thanks, got you now. I had somehow started to operate under the wrong assumption that we'd need to be careful to filter out any weak (and whatever's the matching term for MSVC) inline functions. |
Instead of listing each symbol specifically it would be better to use a regex, e.g. re.match('??$getAs.+Type@clang@@') (note: I haven't tested this regex to check that it works).