Add new APIs needed to use the demangler for C++ demangling in IR passes.
Details
Diff Detail
- Repository
- rG LLVM Github Monorepo
Event Timeline
I don't have a problem with adding an API like this in theory, but what exactly are you planning on using these for in an IR pass? If its to infer some semantic property of the symbol then there probably are better ways of expressing that, like getting the frontend to add metadata or a function attribute or something.
Yes, I infer semantics properties. I use demangler to implement lowering of "C++ intrinsics" - Itanium-mangled C++ name (e.g. templated function instantiation) - for my SPIRV-based target. Lowering depends on the name, template parameter values and argument types (if overloaded). https://github.com/intel/llvm/pull/1881
If its to infer some semantic property of the symbol then there probably are better ways of expressing that, like getting the frontend to add metadata or a function attribute or something.
It is not obvious to me that metadata + attributes is better than demangling (as long as it implemented correctly) - with demangling implementation is more self-encapsulated and simpler. But that's pretty much orthogonal to this code change.
Had to add few more fixes to get my code including ItanumDemangler.h compiled. Plus returning NodeArray by const ref is more efficient.
Hmm, I would tend to agree with Erik here, that directly declaring the semantics via an attribute (or metadata) is better than encoding it in the name. I think it will be easier to reason about if you declare the semantics in the IR.
@hfinkel do you have thoughts?
I agree. We want to encode semantics using the facilities that are intended to convey semantics (e.g., attributes). I don't think that we want to go down the road of imbuing the function name itself with IR-level semantics. That would mean that passes that change function names (outliners, specializers, etc.) need to be aware of special rules not to disturb those encoded semantics. I realize that, in some sense, this may be a theoretical concern (because these semantics are likely encoded in the middle of the string or at the beginning, and the names are normally changed by appending to the end of the string). Regardless, I'd prefer we leave the names free of special meaning in the IR.
The patch by itself is just ItaniumDemangler.h interface enhancements (+ compilation error fix - objcProto->getProtocol()) :-)
If the change is OK by itself (not taking into account potential future uses), I would appreciate approval.
Still I would like to find the best solution for the intrinsic translation (also because we'll come to llvm.org some day with upstream request :-) ), so let me try to explain what I'm doing exactly and understand objections in more details. Or maybe my explanation will address them.
This IR pass
- is not a part of LLVM optimizer, instead it is invoked as clang code generation "epilog", so LLVM optimizer passes don't see/interpret the original C++ names, they operate on usual IR with "@llvm.xxx" intrinsics
- is doing basically the same thing done by clang when it lowers built-ins: given the built-in name, it translates it to LLVM IR
The main difference from clang is that the built-in name is a C++ mangled name. For example, in our SIMD library we declare the following C++ intrinsic:
template < sycl::intel::gpu::EsimdAtomicOpType Op, typename Ty, int N, sycl::intel::gpu::CacheHint L1H = sycl::intel::gpu::CacheHint::Default, sycl::intel::gpu::CacheHint L3H = sycl::intel::gpu::CacheHint::Default> SYCL_EXTERNAL sycl::intel::gpu::vector_type_t<Ty, N> __esimd_flat_atomic0(sycl::intel::gpu::vector_type_t<uint64_t, N> addrs, sycl::intel::gpu::vector_type_t<uint16_t, N> pred);
and then use the __esimd_flat_atomic0 in the C++ library code to implement higher-level operations. Particular instantiation of __esimd_flat_atomic0 appears as C++ mangled name in the LLVM IR output of the clang after CodeGen.
Before FE finishes and hand's IR off to the optimizer, we run a pass which "lowers" those calls to "@llvm.xxx" intrinsic calls. There is no actually any real "semantics interpretation", just intrinsic translation, and this is entirely encapsulated into the Intel GPU-specific IR pass.
So I think those things @hfinkel expressed concerns about won't happen:
That would mean that passes that change function names (outliners, specializers, etc.) need to be aware of special rules not to disturb those encoded semantics
This IR pass is part of the Front End.
If I understand the suggestion to use attributes correctly, that would mean the following:
Suppose there are two instantiations of __esimd_flat_atomic0 and clang CodeGen produced a function declaration @AAA__esimd_flat_atomic0_XXX for the first one and @BBB__esimd_flat_atomic0_YYY - for the second. Then clang FE CodeGen should also emit attributes encoding all the template parameters of each instantiation. Then there would be and IR lowering pass which would be the same as my IR lowering pass except that it would interpret <name, attributes> information to generate proper "@llvm.xxx" intrinsic instead of using demangler for that. So it seems with attribute-based approach there would be more code and changes to two components (FE + LLVM pass) instead of one (LLVM Pass).
Please upload the patch with full context.
Still I would like to find the best solution for the intrinsic translation (also because we'll come to llvm.org some day with upstream request :-) ), so let me try to explain what I'm doing exactly and understand objections in more details. Or maybe my explanation will address them.
This IR pass
- is not a part of LLVM optimizer, instead it is invoked as clang code generation "epilog", so LLVM optimizer passes don't see/interpret the original C++ names, they operate on usual IR with "@llvm.xxx" intrinsics
- is doing basically the same thing done by clang when it lowers built-ins: given the built-in name, it translates it to LLVM IR
The main difference from clang is that the built-in name is a C++ mangled name. For example, in our SIMD library we declare the following C++ intrinsic:
template < sycl::intel::gpu::EsimdAtomicOpType Op, typename Ty, int N, sycl::intel::gpu::CacheHint L1H = sycl::intel::gpu::CacheHint::Default, sycl::intel::gpu::CacheHint L3H = sycl::intel::gpu::CacheHint::Default> SYCL_EXTERNAL sycl::intel::gpu::vector_type_t<Ty, N> __esimd_flat_atomic0(sycl::intel::gpu::vector_type_t<uint64_t, N> addrs, sycl::intel::gpu::vector_type_t<uint16_t, N> pred);and then use the __esimd_flat_atomic0 in the C++ library code to implement higher-level operations. Particular instantiation of __esimd_flat_atomic0 appears as C++ mangled name in the LLVM IR output of the clang after CodeGen.
Before FE finishes and hand's IR off to the optimizer, we run a pass which "lowers" those calls to "@llvm.xxx" intrinsic calls. There is no actually any real "semantics interpretation", just intrinsic translation, and this is entirely encapsulated into the Intel GPU-specific IR pass.
So I think those things @hfinkel expressed concerns about won't happen:That would mean that passes that change function names (outliners, specializers, etc.) need to be aware of special rules not to disturb those encoded semantics
This IR pass is part of the Front End.
If I understand the suggestion to use attributes correctly, that would mean the following:
Suppose there are two instantiations of __esimd_flat_atomic0 and clang CodeGen produced a function declaration @AAA__esimd_flat_atomic0_XXX for the first one and @BBB__esimd_flat_atomic0_YYY - for the second. Then clang FE CodeGen should also emit attributes encoding all the template parameters of each instantiation. Then there would be and IR lowering pass which would be the same as my IR lowering pass except that it would interpret <name, attributes> information to generate proper "@llvm.xxx" intrinsic instead of using demangler for that. So it seems with attribute-based approach there would be more code and changes to two components (FE + LLVM pass) instead of one (LLVM Pass).
I'm inclined to say that we should just enhance Clang to support templated builtins, and then in CGBulitin, just generate the correct LLVM intrinsic directly. No IR pre-processing pass required.
Please upload the patch with full context.
Done. Sorry for missing that procedure detail.
I'm inclined to say that we should just enhance Clang to support templated builtins, and then in CGBulitin, just generate the correct LLVM intrinsic directly. No IR pre-processing pass required.
I agree that that would be useful feature - C++ templated built-ins. But this seems like pretty big infrastructural effort. I don't feel myself familiar with Clang builtin ifra enough to do that. I'll be happy to re-implement intrinsic lowering once the Clang support is available.
Until then, does current approach with using IR pass seem reasonable work-around?
For the record: using existing non-templated built-ins mechanism would require separate built-in per each template parameters combination, which does not seem practical.
Thanks. Regarding this patch itself, did you explain how this API is to be used?
I'm inclined to say that we should just enhance Clang to support templated builtins, and then in CGBulitin, just generate the correct LLVM intrinsic directly. No IR pre-processing pass required.
I agree that that would be useful feature - C++ templated built-ins. But this seems like pretty big infrastructural effort. I don't feel myself familiar with Clang builtin ifra enough to do that. I'll be happy to re-implement intrinsic lowering once the Clang support is available.
Until then, does current approach with using IR pass seem reasonable work-around?
No. We should actually implement the features that we need. That having been said, did you try declaring the builtin and then declaring it as a function template in the header file? I just spent a few minutes looking through all of the places where BuiltinID appears in clang/lib/Sema, etc., and I didn't see any obvious place where this breaks (especially if you use custom type checking for the builtin -- if you don't, there's one place that checks the declaration return type against the builtin return type that would need to be changed). In short, I don't actually think that it's a large infrastructure change.
For the record: using existing non-templated built-ins mechanism would require separate built-in per each template parameters combination, which does not seem practical.
Agreed.
Thanks. Regarding this patch itself, did you explain how this API is to be used?
Not specifically. I just mentioned it will be used in LLVM passes which use the demangler.
Example for IntegerLiteral::getType() and getValue():
switch (Args[N]->getKind()) { case id::Node::KIntegerLiteral: { auto *ValL = castNode(Args[N], IntegerLiteral); const id::StringView &TyStr = ValL->getType(); Ty = TyStr.size() == 0 ? IntegerType::getInt32Ty(Ctx) : parsePrimitiveTypeString( StringRef(TyStr.begin(), TyStr.size()), Ctx); Val = ValL->getValue(); break; } ... return APInt(Ty->getPrimitiveSizeInBits(), StringRef(Val.begin(), Val.size()), 10);
That having been said, did you try declaring the builtin and then declaring it as a function template in the header file?
I haven't. It seems that you mean I should declare in Builins.def something like
LIBBUILTIN(__esimd_flat_atomic0, <signature>, <effects>, "esimd_builtins.h", CXX_LANG)
and insert the C++ templated declaration of __esimd_flat_atomic0 into esimd_builtins.h, correct? I have a question then - the signature may include const references to vectors, how to represent those?
More generally, if we want "C++ intrinsics" there should probably be a way to represent C++ types in the signature - ?
A vector is vector_type::type from
template <typename Ty, int N> struct vector_type { static constexpr int length = N; using type = Ty __attribute__((ext_vector_type(N))); };
and insert the C++ templated declaration of __esimd_flat_atomic0 into esimd_builtins.h, correct? I have a question then - the signature may include const references to vectors, how to represent those?
More generally, if we want "C++ intrinsics" there should probably be a way to represent C++ types in the signature - ?A vector is vector_type::type from
template <typename Ty, int N> struct vector_type { static constexpr int length = N; using type = Ty __attribute__((ext_vector_type(N))); };
According to {{clang/include/clang/Basic/Builtins.def}}, you can represent pointers and references via * and &; for vectors - there are V, q, and E. I don't know how to deal with template parameter here, but reference to 4-component vector should be something like &E4i
According to {{clang/include/clang/Basic/Builtins.def}}, you can represent pointers and references via * and &; for vectors - there are V, q, and E. I don't know how to deal with template parameter here, but reference to 4-component vector should be something like &E4i
Thanks. Somehow missed the '&' in the doc. I assume it should appear on the other side - "E4i&", though.
I still would like to understand what to do with C++ classes - do we forbid them in the "C++ intrinsics" or how do we represent them otherwise ("v&"?)
If you use custom type checking, I don't think it matters what you put in for types in the builtins file.
clang-format: please reformat the code