This is an archive of the discontinued LLVM Phabricator instance.

[flang] Fail at link time if derived type descriptors were not generated
ClosedPublic

Authored by jeanPerier on Feb 11 2022, 9:18 AM.

Details

Summary

Currently, code generation creates weak symbols for the derived type
descriptor globals it cannot find in the current compilation unit.
The rationale is that:

  • the derived type descriptors of external module derived types are generated in the compilation unit that compiled the module so that the type descriptor address is uniquely associated with the type.
  • some types do not have derived type descriptors: the builtin derived types used to create derived type descriptors. The runtime knows about them and does not need them to accomplish the feat of describing themselves. Hence, not every unresolved derived type descriptor in codegen can be assumed to be resolved at link time.

However, this caused immense debugging pain when, for some reason, derived
type descriptors that should have been generated were not. This caused random
runtime failures instead of a much cleaner link time failure.

Improve this situation by allowing codegen to detect the builtin derived
types that have no derived type descriptors and requiring all other
unresolved derived type descriptors to be resolved at link time.
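The resulting policy can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not the code from this patch: `DescriptorAction`, `resolveTypeDescriptor`, and the set of defined globals are hypothetical names, and what codegen actually emits for a builtin type is not shown.

```cpp
#include <set>
#include <string>

// Hypothetical outcome of looking up a derived type descriptor in codegen.
enum class DescriptorAction {
  UseDefinedGlobal, // descriptor was generated in this compilation unit
  NoDescriptor,     // builtin type info type: the runtime needs no descriptor
  DeclareExternal   // strong reference: the linker must resolve it
};

// Hypothetical helper: decide what to do for one descriptor name.
DescriptorAction
resolveTypeDescriptor(const std::string &descriptorName,
                      bool isBuiltinTypeInfoType,
                      const std::set<std::string> &definedGlobals) {
  if (definedGlobals.count(descriptorName))
    return DescriptorAction::UseDefinedGlobal;
  if (isBuiltinTypeInfoType)
    return DescriptorAction::NoDescriptor;
  // A weak symbol here would silently resolve to null and only fail at
  // run time; a plain external declaration fails at link time instead.
  return DescriptorAction::DeclareExternal;
}
```

The point of the third case is that a missing descriptor surfaces as an undefined symbol at link time rather than as a null address at run time.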

Also make derived type descriptors constant data since this was a TODO
and makes the situation even cleaner. This requires telling lowering
which compiler created symbols can be placed in read only memory. I
considered using PARAMETER, but I have mixed feelings about it since that
would cause the initializer expressions of derived type descriptors to
be invalid from a Fortran point of view, since pointer targets cannot be
parameters. I do not want to start misusing Fortran attributes, even if
I think it is quite unlikely semantics would currently complain. I also
do not want to rely on the fact that all object symbols with the
CompilerCreated flag are currently constant data. This could easily
change in the future and cause runtime bugs if lowering relies on this
while the assumption is not loud and clear in semantics.
Instead, add a ReadOnly symbol flag to tell lowering that a compiler
generated symbol can be placed in read only memory.
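A minimal sketch of that design choice follows. It assumes a reduced, hypothetical symbol model: `SymbolSketch`, `SymbolFlag`, and `canBeConstantData` are illustrative names and not the declarations in flang/include/flang/Semantics/symbol.h.

```cpp
#include <bitset>
#include <cstddef>

// Hypothetical, reduced stand-in for the flags a symbol can carry.
enum class SymbolFlag : std::size_t { CompilerCreated, ReadOnly, Count };

struct SymbolSketch {
  std::bitset<static_cast<std::size_t>(SymbolFlag::Count)> flags;
  bool hasParameterAttr = false; // stands in for the PARAMETER attribute

  void set(SymbolFlag f) { flags.set(static_cast<std::size_t>(f)); }
  bool test(SymbolFlag f) const {
    return flags.test(static_cast<std::size_t>(f));
  }
};

// Lowering places a global in read only memory only when semantics says so
// explicitly. CompilerCreated alone is deliberately not enough.
bool canBeConstantData(const SymbolSketch &sym) {
  return sym.hasParameterAttr || sym.test(SymbolFlag::ReadOnly);
}
```

Keeping the decision tied to an explicit flag means a future compiler created symbol that is mutable will not be placed in read only memory by accident.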

Event Timeline

jeanPerier created this revision. Feb 11 2022, 9:18 AM
jeanPerier requested review of this revision. Feb 11 2022, 9:18 AM
Herald added a project: Restricted Project. Feb 11 2022, 9:18 AM
klausler accepted this revision. Feb 11 2022, 9:27 AM
klausler added inline comments.
flang/include/flang/Common/builtin-modules.h
1 ↗(On Diff #407908)

I'm not sure that /Common is the right place for this -- the name could be defined in semantics and referenced from lowering via the SemanticsContext.

flang/include/flang/Semantics/symbol.h
509

Could go on the same line

511

"the PARAMETER attribute."

flang/lib/Semantics/runtime-type-info.cpp
250

Fortran

This revision is now accepted and ready to land. Feb 11 2022, 9:27 AM
jeanPerier marked an inline comment as done.

Remove flang/Common/builtin-modules.h and fix typos.

jeanPerier marked 2 inline comments as done. Feb 14 2022, 2:11 AM
jeanPerier added inline comments.
flang/include/flang/Common/builtin-modules.h
1 ↗(On Diff #407908)

OK, a new header is actually not that useful, so I moved this definition directly into Semantics/runtime-type-info.h. However, it is not possible to use the SemanticsContext in the part where I need to know if a type comes from the type info module.

The current mapping between derived types and their type descriptors happens late: in the pass from MLIR to LLVM IR. At that stage, we have a FIR operation fir.embox ... : fir.box<fir.type<a_derived_type>> that means "create a descriptor for an object of type a_derived_type", and the translation pass to LLVM IR finds the type descriptor global object from the type name (plus some scope information mangled in the type name). The semantic context is not available anymore at that stage (at the MLIR level, the compilation could stop and dump the IR, and the LLVM translation pass can restart from that dump).
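Because only the mangled type name survives to that stage, the check for "does this type come from the type info module" has to be done on the name itself. The sketch below illustrates the idea only; the module name, the `_QM<module>T<type>` mangling shape, and the helper `belongsToTypeInfoModule` are assumptions made for this example, not the scheme or API used by the patch.

```cpp
#include <string>

// Assumed for illustration: name of the module defining the builtin
// derived types used to build type descriptors.
const std::string kTypeInfoModule = "__fortran_type_info";

// Hypothetical mangling shape: "_QM<module>T<type>".
// Returns true when the mangled type name places the type directly in the
// type info module, using nothing but the string.
bool belongsToTypeInfoModule(const std::string &mangledTypeName) {
  const std::string prefix = "_QM" + kTypeInfoModule + "T";
  return mangledTypeName.compare(0, prefix.size(), prefix) == 0;
}
```

A check of this kind works even when compilation restarts from dumped IR, since it needs no information beyond what is stored in the type name.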