This is an archive of the discontinued LLVM Phabricator instance.

[flang] Detect circularly defined interfaces of procedures
ClosedPublic

Authored by PeteSteinfeld on Mar 2 2021, 8:01 AM.

Details

Summary

It's possible to define a procedure whose interface depends on a procedure
which has an interface that depends on the original procedure. Such a circular
definition was causing the compiler to fall into an infinite loop when
resolving the name of the second procedure. It's also possible to create
circular dependency chains of more than two procedures.

I fixed this by adding the function HasCycle() to the class DeclarationVisitor
and calling it from DeclareProcEntity() to detect procedures with such
circularly defined interfaces. I marked the associated symbols of such
procedures by calling SetError() on them. When processing subsequent
procedures, I called HasError() before attempting to analyze their interfaces.
Unfortunately, this did not work.

With help from Tim, we determined that the SymbolSet used to track the
erroneous symbols was instantiated using a "<" operator which was defined using
the location of the name of the procedure. But the location of the procedure
name was being changed by a call to ReplaceName() between the times that the
calls to SetError() and HasError() were made. This caused HasError() to
incorrectly report that a symbol was not in the set of erroneous symbols.

I fixed this by changing SymbolSet to be an unordered set that uses the
contents of the name of the symbol as the basis for its hash function. This
works because the contents of the name of the symbol is preserved by
ReplaceName() even though its location changes.

I also fixed the error message used when reporting recursively defined
dummy procedure arguments by removing extra apostrophes and sorting the
list of symbols.

I also added tests that will crash the compiler without this change.

Note that the "<" operator is used in other contexts, for example, in the map
of characterized procedures, maps of items in equivalence sets, maps of
structure constructor values, ... All of these situations happen after name
resolution has been completed and all calls to ReplaceName() have already
happened and thus are not subject to the problem I ran into when ReplaceName()
was called when processing procedure entities.

Note also that the implementation of the "<" operator uses the relative
location in the cooked character stream as the basis of its implementation.
This is potentially problematic when symbols from diffent compilation units
(for example symbols originating in .mod files) are put into the same map since
their names will appear in two different source streams which may not be
allocated in the same relative positions in memory. But I was unable to create
a test that caused a problem. Using a direct comparison of the content of the
name of the symbol in the "<" operator has problems. Symbols in enclosing or
parallel scopes can have the same name. Also using the location of the symbol
in the cooked character stream has the advantage that it preserves the the
order of the symbols in a structure constructor constant, which makes matching
the values with the symbols relatively easy.

This patch supersedes D97749.

Diff Detail

Event Timeline

PeteSteinfeld created this revision.Mar 2 2021, 8:01 AM
PeteSteinfeld requested review of this revision.Mar 2 2021, 8:01 AM
Herald added a project: Restricted Project. · View Herald TranscriptMar 2 2021, 8:01 AM
PeteSteinfeld added a project: Restricted Project.Mar 2 2021, 8:01 AM
awarzynski accepted this revision.Mar 2 2021, 8:47 AM

Thank you for fixing this so quickly @PeteSteinfeld !

The patch was originally reviewed and accepted in https://reviews.llvm.org/D97201. This an updated version to make sure that this change works fine regardless of the underlying platform.

This change was previously reverted because resolve102.f90 was failing in our public Flang builders (the AAtch64 and PowerPC ones). I tested this updated patch on an AArch64 machine and can confirm that resolve102.f90 is no longer failing. The non-determinism in the output that was causing the failure has been removed by updating GetSeenProcs (see https://reviews.llvm.org/D97749 for context).

LGTM!

This revision is now accepted and ready to land.Mar 2 2021, 8:47 AM
This revision was landed with ongoing or failed builds.Mar 2 2021, 10:54 AM
This revision was automatically updated to reflect the committed changes.

I see a failure in the buildbot.
https://lab.llvm.org/buildbot/#/builders/33/builds/2845

/home/flang/flang-aarch64-ubuntu-clang/flang-aarch64-ubuntu-clang/llvm-project/flang/lib/Semantics/resolve-names.cpp:3655:25: error: loop variable 'procInCycle' of type 'const Fortran::common::Reference<const Fortran::semantics::Symbol>' creates a copy from type 'const Fortran::common::Reference<const Fortran::semantics::Symbol>' [-Werror,-Wrange-loop-construct]

for (const auto procInCycle : procsInCycle) {
                ^

/home/flang/flang-aarch64-ubuntu-clang/flang-aarch64-ubuntu-clang/llvm-project/flang/lib/Semantics/resolve-names.cpp:3655:14: note: use reference type 'const Fortran::common::Reference<const Fortran::semantics::Symbol> &' to prevent copying

for (const auto procInCycle : procsInCycle) {
     ^~~~~~~~~~~~~~~~~~~~~~~~
                &

I see a failure in the buildbot.
https://lab.llvm.org/buildbot/#/builders/33/builds/2845

/home/flang/flang-aarch64-ubuntu-clang/flang-aarch64-ubuntu-clang/llvm-project/flang/lib/Semantics/resolve-names.cpp:3655:25: error: loop variable 'procInCycle' of type 'const Fortran::common::Reference<const Fortran::semantics::Symbol>' creates a copy from type 'const Fortran::common::Reference<const Fortran::semantics::Symbol>' [-Werror,-Wrange-loop-construct]

for (const auto procInCycle : procsInCycle) {
                ^

/home/flang/flang-aarch64-ubuntu-clang/flang-aarch64-ubuntu-clang/llvm-project/flang/lib/Semantics/resolve-names.cpp:3655:14: note: use reference type 'const Fortran::common::Reference<const Fortran::semantics::Symbol> &' to prevent copying

for (const auto procInCycle : procsInCycle) {
     ^~~~~~~~~~~~~~~~~~~~~~~~
                &

Thanks, @kiranchandramohan. I'll fix this.

I see a failure in the buildbot.
https://lab.llvm.org/buildbot/#/builders/33/builds/2845

/home/flang/flang-aarch64-ubuntu-clang/flang-aarch64-ubuntu-clang/llvm-project/flang/lib/Semantics/resolve-names.cpp:3655:25: error: loop variable 'procInCycle' of type 'const Fortran::common::Reference<const Fortran::semantics::Symbol>' creates a copy from type 'const Fortran::common::Reference<const Fortran::semantics::Symbol>' [-Werror,-Wrange-loop-construct]

for (const auto procInCycle : procsInCycle) {
                ^

/home/flang/flang-aarch64-ubuntu-clang/flang-aarch64-ubuntu-clang/llvm-project/flang/lib/Semantics/resolve-names.cpp:3655:14: note: use reference type 'const Fortran::common::Reference<const Fortran::semantics::Symbol> &' to prevent copying

for (const auto procInCycle : procsInCycle) {
     ^~~~~~~~~~~~~~~~~~~~~~~~
                &

Thanks, @kiranchandramohan. I'll fix this.

I just pushed a change that should fix this.

tskeith added inline comments.Mar 3 2021, 3:24 PM
flang/include/flang/Semantics/symbol.h
600

I don't like keeping an operator< that is only valid in some cases. It took a long time to find the bug with SymbolSet and errorSymbols_ and the presence of this function makes it easy to introduce another similar bug.

I think it would be better not to provide a default ordering of symbols, or to provide one that is always valid. Ordering by symbol address is valid, though may result in an unexpected order. Or we could add a serial number to symbols and sort them using that, i.e. in order of creation.

klausler added inline comments.Mar 3 2021, 3:34 PM
flang/include/flang/Semantics/symbol.h
600

I concur, and will soon propose replacing SymbolSet with multiple types whose collating order (if any) is more clear.

Most SymbolSet use cases really just need to be an unordered collection of Symbol pointers. The other cases might be better served as maps and multimaps from name positions or name values to Symbols.

Note that we could use the initial address in the cooked character stream as the basis for the "<" operator. I used that in an earlier implementation, and it worked fine and was immune to changes caused by calls to ReplaceName(). But it has the potential problem that there may be cases where we're comparing symbols from .mod files. Peter's solution of serial numbers would fix this.

Note also that I believe that currently, all sets of symbols in the compiler are declared as SymbolSet, which is an unordered set.