diff --git a/flang/docs/RuntimeTypeInfo.md b/flang/docs/RuntimeTypeInfo.md
new file mode 100644
--- /dev/null
+++ b/flang/docs/RuntimeTypeInfo.md
@@ -0,0 +1,271 @@
+
+
+# The derived type runtime information table
+
+```eval_rst
+.. contents::
+   :local:
+```
+
+## Overview
+
+Many operations on derived types must be implemented, or can be
+implemented, with calls to the runtime support library rather than
+directly with generated code.
+Some operations might be initially implemented in the runtime library
+and then reimplemented later in generated code for compelling
+performance gains in optimized compilations.
+
+The runtime library uses *derived type description* tables to represent
+the relevant characteristics of derived types.
+This note summarizes the requirements for these descriptions.
+
+The semantics phase of the F18 frontend constructs derived type
+descriptions from its scoped symbol table after name resolution
+and semantic constraint checking have succeeded.
+The lowering phase then transfers the tables to the static
+read-only data section of the generated program by translating them into
+initialized objects.
+During execution, references to the tables occur by passing their addresses
+as arguments to relevant runtime library APIs and as pointers in
+the addenda of descriptors.
+
+## Requirements
+
+The following Fortran language features require, or may require, the use of
+derived type descriptions in the runtime library.
+
+### Components
+
+The components of a derived type need to be described in component
+order (7.4.7), but when there is a parent component, its components
+can be described by reference to the description of the type of the
+parent component.
+
+The ordered component descriptions are needed to implement:
+* default initialization
+* `ALLOCATE`, with and without `SOURCE=`
+* intrinsic assignment of derived types with `ALLOCATABLE` and
+  automatic components
+* intrinsic I/O of derived type instances
+* `NAMELIST` I/O of derived type instances
+* "same type" tests
+
+The characteristics of data components include their names, types,
+offsets, bounds, cobounds, derived type descriptions when appropriate,
+default component initializers, and flags for `ALLOCATABLE`, `POINTER`,
+`PRIVATE`, and automatic components (implicit allocatables).
+Procedure pointer components require only their offsets and address(es).
+
+### Calls to type-bound procedures
+
+Only extensible derived types -- those without `SEQUENCE` or `BIND(C)`
+-- are allowed to have type-bound procedures.
+Calls to these bindings will be resolved at compilation time when
+the binding is `NON_OVERRIDABLE` or when an object is not polymorphic.
+Calls to overridable bindings of polymorphic objects require the
+use of a runtime table of procedure addresses.
+
+Each derived type (or instantiation of a parameterized derived type)
+will have a complete type-bound procedure table in which all of the
+bindings of its ancestor types appear first.
+(Specifically, the table offsets of any inherited bindings must be
+the same as they are in the ancestral type's table.)
+These ancestral bindings reflect their overrides, if any.
+
+The non-inherited bindings of a type then follow the inherited
+bindings, and they do so in alphabetical order of binding name.
+(This is an arbitrary choice -- we could also define them to
+appear in binding declaration order, I suppose -- but a consistent
+ordering should be used so that relocatables generated by distinct
+versions of the F18 compiler will have a better chance to interoperate.)
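The table layout rule described above can be sketched in C++ as follows. This is only an illustration under stated assumptions, not the runtime's actual data structure: `Binding` and `BuildBindingTable` are hypothetical names, procedure addresses are modeled as strings, and binding names are assumed to be already normalized to lower case.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical model of one entry in a type-bound procedure table.
struct Binding {
  std::string name;   // binding name, assumed already lower-cased
  std::string target; // stands in for the procedure address
};

// Builds the table of an extended type from its parent's complete table.
// Inherited bindings keep their parent offsets (with overrides applied);
// bindings that are new in this type follow in alphabetical order.
std::vector<Binding> BuildBindingTable(
    const std::vector<Binding> &parentTable, std::vector<Binding> ownBindings) {
  std::vector<Binding> table{parentTable};
  std::vector<Binding> added;
  for (Binding &own : ownBindings) {
    auto inherited = std::find_if(table.begin(), table.end(),
        [&](const Binding &b) { return b.name == own.name; });
    if (inherited != table.end()) {
      inherited->target = own.target; // override in place, offset unchanged
    } else {
      added.push_back(std::move(own)); // new binding, placed after the loop
    }
  }
  std::sort(added.begin(), added.end(),
      [](const Binding &x, const Binding &y) { return x.name < y.name; });
  table.insert(table.end(), added.begin(), added.end());
  return table;
}
```

With a parent table of `area`, `draw` and an extension that overrides `draw` and adds `rotate` and `bounds`, this sketch yields `area`, `draw` (overridden), `bounds`, `rotate`, so a call through the parent's offset for `draw` reaches the override.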
+
+### Type parameter values and "same type" testing
+
+The values of the `KIND` and `LEN` parameters of a particular derived type
+instance can be obtained to implement type parameter inquiries without
+requiring derived type information tables.
+In the case of a `KIND` type parameter, it's a constant value known at
+compilation time, and in the case of a `LEN` type parameter, it's a
+member of the addendum to the object's descriptor.
+
+The runtime library will have an API (TBD) to be called as
+part of the implementation of `TYPE IS` and `CLASS IS` guards
+of the `SELECT TYPE` construct.
+This language support predicate returns a true result when
+an object's type matches a particular type specification and
+`KIND` (but not `LEN`) type parameter values.
+
+Note that this "is same type as" predicate is *not* the same as
+the one to be called to implement the `SAME_TYPE_AS()` intrinsic function,
+which is specified so as to *ignore* the values of `KIND` type
+parameters.
+
+Subclause 7.5.2 defines what being the "same" derived type means
+in Fortran.
+In short, each definition of a derived type defines a distinct type,
+so type equality testing can usually compare addresses of derived
+type descriptions at runtime.
+The exceptions are `SEQUENCE` types and interoperable (`BIND(C)`)
+types.
+Independent definitions of each of these are considered to be the "same type"
+when these definitions match in terms of names, types, and attributes,
+both being either `SEQUENCE` or `BIND(C)`, and containing
+no `PRIVATE` components.
+These "sequence" derived types cannot have type parameters, type-bound
+procedures, an absence of components, or components that are not themselves
+of a sequence type, so we can use a static hash code to implement
+their "same type" tests.
+
+### FINAL subroutines
+
+When an instance of a derived type is deallocated or goes out of scope,
+one of its `FINAL` subroutines may be called.
+Subclause 7.5.6.3 defines when finalization occurs -- it doesn't happen
+in all situations.
+
+The subroutines named in a derived type's `FINAL` statements are not
+bindings, so their arguments are not passed object dummy arguments and
+do not have to satisfy the constraints of a passed object.
+Specifically, they can be arrays, and cannot be polymorphic.
+If a `FINAL` subroutine's dummy argument is an array, it may be
+assumed-shape or assumed-rank, but it could also be an explicit-shape
+or assumed-size argument.
+This means that it may or may not be passed by means of a descriptor.
+
+Note that a `FINAL` subroutine with a scalar argument does not define
+a finalizer for array objects unless the subroutine is elemental
+(and probably `IMPURE`).
+This seems to be a language pitfall and F18 will emit a
+warning when an array of a finalizable derived type is declared
+with a rank lacking a `FINAL` subroutine when other ranks do have one.
+
+So the necessary information in the derived type table for a `FINAL`
+subroutine comprises:
+* address(es) of the subroutine
+* rank of the argument, or whether it is assumed-rank
+* for rank 0, whether the subroutine is elemental
+* for rank > 0, whether the argument requires a descriptor
+
+This descriptor flag is needed to handle a difficult case with
+`FINAL` subroutines that most other implementations of Fortran
+fail to get right: a `FINAL` subroutine
+whose argument is an explicit-shape or assumed-size array may
+have to be called upon the parent component of an array of
+an extended derived type.
+
+```
+  module m
+    type :: parent
+      integer :: n
+     contains
+      final :: subr
+    end type
+    type, extends(parent) :: extended
+      integer :: m
+    end type
+   contains
+    subroutine subr(a)
+      type(parent) :: a(1)  ! explicit shape: passed without a descriptor
+    end subroutine
+  end module
+  subroutine demo
+    use m
+    type(extended) :: arr(1)  ! arr%parent is finalized by subr on return
+  end subroutine
+```
+
+If the `FINAL` subroutine doesn't use a descriptor -- and it
+will not if there are no `LEN` type parameters -- the runtime
+will have to allocate and populate a temporary array of copies
+of the elements of the parent component of the array so that it can
+be passed by reference to the `FINAL` subroutine.
+
+### Defined assignment
+
+A defined assignment subroutine for a derived type can be declared
+by means of a generic `INTERFACE ASSIGNMENT(=)` and by means of
+a generic type-bound procedure.
+Defined assignments with non-type-bound generic interfaces are
+resolved to specific subroutines at compilation time.
+Most cases of type-bound defined assignment are resolved to their
+bindings at compilation time as well (with possible runtime
+resolution of overridable bindings).
+
+Intrinsic assignment of derived types with components that have
+derived types with type-bound generic assignments is specified
+by subclause 10.2.1.3 paragraph 13 as invoking defined assignment
+subroutines, however.
+
+This seems to be the only case of defined assignment that may be of
+interest to the runtime library.
+If this is correct, then the requirements are somewhat constrained;
+we know that the rank of the target of the assignment must match
+the rank of the source, and that one of the dummy arguments of the
+bound subroutine is a passed object dummy argument and satisfies
+all of the constraints of one -- in particular, it's scalar and
+polymorphic.
+
+So the derived type information for a defined assignment needs to
+comprise:
+* address(es) of the subroutine
+* whether the first, second, or both arguments are descriptors
+* whether the subroutine is elemental
+
+### User defined derived type I/O
+
+Fortran programs can specify subroutines that implement formatted and
+unformatted `READ` and `WRITE` operations for derived types.
+These defined I/O subroutines may be specified with an explicit `INTERFACE`
+or with a type-bound generic.
+When specified with an `INTERFACE`, the first argument must not be
+polymorphic, but when specified with a type-bound generic, the first
+argument is a passed-object dummy argument and is required to be so.
+In any case, the argument is scalar.
+
+Nearly all invocations of user defined derived type I/O subroutines
+are resolved at compilation time to specific procedures or to
+overridable bindings.
+(The I/O library APIs for acquiring their arguments remain to be
+designed, however.)
+The case that is of interest to the runtime library is that of
+`NAMELIST` I/O, which is specified to invoke user defined derived
+type I/O subroutines if they have been defined.
+
+The derived type information for a user defined derived type I/O
+subroutine comprises:
+* address(es) of the subroutine
+* whether it is for a read or a write
+* whether it is formatted or unformatted
+* whether the first argument is a descriptor (true if it is a
+  binding of the derived type, or has a `LEN` type parameter)
+
+## Exporting derived type descriptions from module relocatables
+
+Subclause 7.5.2 requires that two objects be considered as having the
+same derived type if they are declared "with reference to the same
+derived type definition".
+For derived types that are defined in modules and accessed by means
+of use association, we need to be able to describe the type in the
+read-only static data section of the module and access the description
+as a link-time external.
+
+This is not always possible to achieve in the case of instantiations
+of parameterized derived types, however.
+Two identical instantiations in distinct compilation units of the same
+use associated parameterized derived type seem impractical to implement
+using the same address.
+(Perhaps some linkers would support unification of global objects
+with "mangled" names and identical contents, but this seems unportable.)
+
+Derived type descriptions therefore will contain pointers to
+their "uninstantiated" original derived types.
+For derived types with no `KIND` type parameters, these pointers
+will be null; for uninstantiated derived types, these pointers
+will point at themselves.
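
One way these pointers might be used is sketched below in C++. It is an assumption-laden illustration rather than the runtime's API: `DerivedTypeDesc`, `TypeIdentity`, and `SameDerivedType` are hypothetical names, and the sketch only answers "same derived type definition", so the `KIND` parameter values needed by `TYPE IS` and `CLASS IS` guards would have to be compared separately.

```cpp
#include <cassert>

// Hypothetical, heavily reduced derived type description.
struct DerivedTypeDesc {
  const char *name;
  // Null when the type has no KIND type parameters, the description's own
  // address for an uninstantiated type, and otherwise the address of the
  // uninstantiated original that this instantiation came from.
  const DerivedTypeDesc *uninstantiated;
};

// Maps a description to a single address that identifies its definition.
const DerivedTypeDesc *TypeIdentity(const DerivedTypeDesc &desc) {
  return desc.uninstantiated ? desc.uninstantiated : &desc;
}

// True when both descriptions come from the same derived type definition,
// even if they are separate instantiations at different addresses.
bool SameDerivedType(const DerivedTypeDesc &a, const DerivedTypeDesc &b) {
  return TypeIdentity(a) == TypeIdentity(b);
}
```

Under this sketch, two instantiations emitted by different compilation units compare equal because both refer to the one exported uninstantiated description, while unparameterized types still compare by their own addresses.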