This is an archive of the discontinued LLVM Phabricator instance.

[flang] Allow large and erroneous ac-implied-do's
ClosedPublic

Authored by PeteSteinfeld on May 10 2021, 7:48 PM.

Details

Summary

We sometimes unroll an ac-implied-do of an array constructor into a flat list
of values. We then re-analyze the array constructor that contains the
resulting list of expressions. Such a list may or may not contain errors.

But when processing an array constructor with an unrolled ac-implied-do, the
compiler was building an expression to represent the extent of the resulting
array constructor containing the list of values. The number of operands
in this extent expression was based on the number of elements in the
unrolled list of values. For very large lists, this created an
expression so large that it could not be evaluated by the compiler
without overflowing the stack.

I fixed this by continuously folding the extent expression as each operand is
added to it. I added the test .../flang/test/Semantics/array-constr-big.f90
that will cause the compiler to seg fault without this change.

Also, when the unrolled ac-implied-do expression contains errors, we were
repeating the same error message referencing the same source line for every
instance of the erroneous expression in the unrolled list. This potentially
resulted in a very long list of messages for a single error in the source code.

I fixed this by comparing the message being emitted to the previously emitted
message. If they are the same, I do not emit the message. This change is also
tested by the new test array-constr-big.f90.

Several of the existing tests had duplicate error messages for the same source
line, and this change caused differences in their output. So I adjusted the
tests to match the new message emitting behavior.

Diff Detail

Event Timeline

PeteSteinfeld created this revision.May 10 2021, 7:48 PM
PeteSteinfeld requested review of this revision.May 10 2021, 7:48 PM
Herald added a project: Restricted Project. · View Herald TranscriptMay 10 2021, 7:48 PM
PeteSteinfeld added a project: Restricted Project.
jeanPerier accepted this revision.May 11 2021, 4:36 AM
jeanPerier added inline comments.
flang/include/flang/Evaluate/shape.h
187

Are they not cases where the additions cannot be folded and the array constructor could still grow big ?
If so, then we will still eventually hit the overflow, and every Fold might actually get expensive.

Maybe GetArrayConstructorExtent should give up if the non constant part grows too much (this could be done by keeping constants n and non constants n in a different accumulators for which the number of operands would be tracked, and GetArrayConstructorExtent would return nullopt after a certain threshold is reached).

I am not sure if there are actually cases where non constant n could still lead to an overflow (I suppose ac-implied-do containing such dynamic extents expression would not be unrolled). So what you have may be enough until we hit a scenario where more is needed here.

flang/lib/Parser/message.cpp
330

I am not opposed to having this here. I am wondering if the places that are raising the same messages over and over should not also be be prevented from doing so. Maybe some checkers in ArrayConstructors should stop after x errors to avoid wasting time and filling messages_.
Again, if you considered it and it turned out unpractical, what you have here already looks like an improvement to me, and it had the benefit to clean some other small error duplication cases.

This revision is now accepted and ready to land.May 11 2021, 4:36 AM

Thanks for the feedback, Jean!

flang/include/flang/Evaluate/shape.h
187

I don't believe that there are cases where the additions cannot be folded. One of the conditions for folding is that the initial, final, and stride expressions of the ac-implied-do are all constants. The large extent expression that gets produced is based on these constant values. So I don't believe that, with this change, it's possible to create an extent expression that will overflow the stack no matter how large the unrolled list of values is. Note that the values in the unrolled list are not necessarily constant and are not used in the folding. It's only the extent expression.

I like your suggestion of giving up on expression processing if an expression grows to large. I think that this should be done generally for all expression processing. I'm working on a more general way to do this.

flang/lib/Parser/message.cpp
330

Here's an example of code that will produce many messages:

integer(foo),parameter :: jval4(*) = (/ (0_foo,ii=1,10) /)

Without this change, the compiler will produce 10 identical messages for the one erroneous instance of "0_foo":

./j1.f90:2:46: error: Must be a constant value
    integer(foo),parameter :: jval4(*) = (/ (0_foo,ii=1,10) /)
                                               ^^^
./j1.f90:2:46: error: Must be a constant value
    integer(foo),parameter :: jval4(*) = (/ (0_foo,ii=1,10) /)
                                               ^^^
    ...

Basing the elimination of messages on the number of messages seems incorrect to me. Rather, we should only produce one message for each group of unrolled messages. It might be possible to do that. But the current code solves the same problem in a more general way. Let me know if you disagree.

klausler accepted this revision.May 11 2021, 9:08 AM
klausler added inline comments.
flang/lib/Parser/message.cpp
214

Messages*

This revision was automatically updated to reflect the committed changes.
PeteSteinfeld added inline comments.May 11 2021, 10:22 AM
flang/lib/Parser/message.cpp
214

Thanks. I was (unsuccessfully) trying to distinguish between the name of the type and the general English language term. I'll clean this up.