This document describes how uniquing of internal names is done. This
name uniquing is done to support the constraints and invariants of the FIR
dialect of MLIR.
Could you comment on whether this mangling will have any effect on interfacing with C/C++? Will it have any effect on LTO? What happens if a bind name is specified?
http://web.mit.edu/tibbetts/Public/inside-c/www/mangling.html
Hi Kiran,
Good questions and thanks for asking.
The hope is that this mangling will not conflict with C and C++, of course. None of the languages (C, C++, or Fortran) has a standard mangling. C reserves the underscore, double underscore, and underscore capital letter prefixes [1],[2]. A description of a common C++ name mangling scheme is [3],[your link]. It seems like the only common thing about Fortran name mangling implementations is that different vendors have their own, as can be seen by experimenting with [4].
The uniquing scheme described in this document has some similarities and differences to other mangling schemes, but it was designed to minimize collisions with those spaces.
As for bind C names, the plan is to use the bind C name directly. It should never have the prefix marker "_Q", so it will be recognized as a symbol name that was not uniqued. That may or may not be sufficient, depending on other unknowns. (We are similarly targeting LLVM intrinsic functions in our present work.) The fallback plan would be to unique the name and then relabel it when lowering to LLVM.
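To make the bind C plan concrete, here is a minimal Python sketch (not flang code). The helper names `is_uniqued` and `symbol_for` are hypothetical, and raising an error when a bind name happens to start with "_Q" is only an assumption standing in for the "other unknowns" mentioned above.

```python
from typing import Optional

# Prefix marker that identifies a uniqued name, as stated in the comment above.
UNIQUED_PREFIX = "_Q"


def is_uniqued(symbol: str) -> bool:
    """Return True if the symbol carries the uniquing prefix marker."""
    return symbol.startswith(UNIQUED_PREFIX)


def symbol_for(uniqued_name: str, bind_name: Optional[str] = None) -> str:
    """Pick the symbol to emit (hypothetical helper, not a flang API).

    A bind C name, when present, is used directly and is expected not to
    look like a uniqued name. Rejecting a "_Q"-prefixed bind name is an
    assumption of this sketch; the thread leaves that case open.
    """
    if bind_name is None:
        return uniqued_name
    if is_uniqued(bind_name):
        raise ValueError(f"bind name {bind_name!r} collides with the uniquing prefix")
    return bind_name
```

For example, `symbol_for("_QMmodSs2mod", "c_func")` yields the bind name unchanged, while `symbol_for("_QMmodSs2mod")` keeps the uniqued name.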
The bidirectional ability and flexibility are key objectives. It means that it may be the case that these names are never exposed to the LLVM layer, in the object files, to the linker, etc. Since this scheme can recover the symbols from the front-end, the symbols can themselves be lowered as a conversion and in a target-dependent manner.
[1] https://stackoverflow.com/questions/39625352/why-do-some-functions-in-c-have-an-underscore-prefix
[2] http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf
[3] https://itanium-cxx-abi.github.io/cxx-abi/abi.html#mangling
[4] https://fortran.godbolt.org/
Reminder: flang should follow the well-established naming conventions when creating external names for "f77" entities. There's nothing in this proposal that blocks that; this is just an FYI.
flang/documentation/BijectiveInternalNameUniquing.md

| Line | Comment |
|---|---|
| 8 | "goal" could be "feature" or "requirement" -- not important. |
Thanks @schweitz for the detailed reply.
I have a couple more questions.
-> Isn't this necessary for the Block construct?
-> Will uniquing all names lead to lower readability of the IR? (Assuming that is what is being proposed here.)
-> Yeah, renaming local variables back to original names (during lowering to LLVM IR) when there is no clash seems better for readability.
The uniquing is required in the context of the MLIR Module symbol space. (Artifacts with a process lifetime such as functions, globals, etc.) Locals need not be uniqued as they have a unique identity as ssa-values. (Their names are tracked with name attributes attached to the Op.)
Thanks @schweitz. Adding the above information to the doc might be helpful.
flang/documentation/BijectiveInternalNameUniquing.md

| Line | Comment |
|---|---|
| 35 | Submodules of an ancestor module have to have distinct names, so you don't need to include s1mod in the unique name for s2mod (though it doesn't hurt). So the unique name for s2mod could be just _QMmodSs2mod. |
| 50 | What about the blank common block? Is its name just _QB? |
| 59 | On line 30 it says F is the prefix. P makes more sense to me. |
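
The submodule suggestion in the first inline comment can be illustrated with a small Python sketch. It is inferred only from the example `_QMmodSs2mod` in this thread: the helper `unique_submodule_name` is hypothetical, and the longer form that also lists `s1mod` is an assumption about what the document under review currently produces.

```python
from typing import Sequence


def unique_submodule_name(module: str, submodules: Sequence[str],
                          shortest: bool = True) -> str:
    """Compose a unique name for the innermost submodule (hypothetical helper).

    "_Q" is the prefix marker, "M" introduces the ancestor module and "S"
    introduces a submodule, following the example in the comment above.
    With shortest=True only the innermost submodule is listed, which is
    enough because submodules of one ancestor module have distinct names.
    """
    if not submodules:
        raise ValueError("at least one submodule name is required")
    # Either keep only the innermost submodule or the full chain (assumed form).
    parts = submodules[-1:] if shortest else submodules
    return "_Q" + "M" + module + "".join("S" + name for name in parts)
```

With `module="mod"` and `submodules=["s1mod", "s2mod"]`, the short form is `_QMmodSs2mod`; passing `shortest=False` keeps the intermediate submodule in the name.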