Page MenuHomePhabricator

[Flang][OpenMP] Avoid abort when collapse clause value is negative
ClosedPublic

Authored by clementval on Apr 9 2020, 12:29 PM.

Details

Summary

If the value in the collapse close is negative f18 abort without the correct error message. This PR change the size_t in name resolution to a int64_t and check appropriately for negative or zero before the privatization of induction variable.
The correct error is then catch by the OpenMP structure check.

This diff is migrated from the GitHub pull request https://github.com/flang-compiler/f18/pull/1098

Diff Detail

Event Timeline

clementval created this revision.Apr 9 2020, 12:29 PM
Herald added a project: Restricted Project. · View Herald Transcript
clementval edited the summary of this revision. (Show Details)Apr 9 2020, 12:30 PM
jdoerfert added a comment.EditedApr 9 2020, 1:00 PM

Why is this not caught by the parser? I mean, this seems a lot of code to accommodate negative values and some conditions to not trip over them even though they are never allowed/useful.

Why is this not caught by the parser? I mean, this seems a lot of code to accommodate negative values and some conditions to not trip over them even though they are never allowed/useful.

The parser doesn’t enforce any semantic checking or name resolution so any int expression is valid at parsing time.

Why is this not caught by the parser? I mean, this seems a lot of code to accommodate negative values and some conditions to not trip over them even though they are never allowed/useful.

Why is this not caught by the parser? I mean, this seems a lot of code to accommodate negative values and some conditions to not trip over them even though they are never allowed/useful.

The argument is only an Expr to the parser. To check whether it's an integer or constant, we need to do that in Semantics.

ichoyjx accepted this revision.Apr 9 2020, 2:53 PM
This revision is now accepted and ready to land.Apr 9 2020, 2:53 PM
clementval updated this revision to Diff 256451.Apr 9 2020, 5:23 PM

Update diff

Why is this not caught by the parser? I mean, this seems a lot of code to accommodate negative values and some conditions to not trip over them even though they are never allowed/useful.

The argument is only an Expr to the parser. To check whether it's an integer or constant, we need to do that in Semantics.

Thanks for the review. I just updated the patch with correct clang-format. If you have write rights, can you land the patch. I do not have rights now.

Why is this not caught by the parser? I mean, this seems a lot of code to accommodate negative values and some conditions to not trip over them even though they are never allowed/useful.

The argument is only an Expr to the parser. To check whether it's an integer or constant, we need to do that in Semantics.

As far as I can tell, the collapse argument is a scalarIntConstantExpr to the parser. So conceptually one could make it a scalarPositiveIntConstantExpr and issue an error early.
One should consider this to avoid implicit dependences expressed as TODOs that require checks in places that may be visited before the collapse is verified.

​// TODO: This assumes that the do-loops association for collapse/ordered
//       clause has been performed (the number of nested do-loops >= n).
void OmpAttributeVisitor::PrivatizeAssociatedLoopIndex(

As far as I can tell, the collapse argument is a scalarIntConstantExpr to the parser. So conceptually one could make it a scalarPositiveIntConstantExpr and issue an error early.

Fortran has scalar-int-constant-expr but not scalar-positive-int-constant-expr as nonterminals.

The f18 parser doesn't do any semantic checking in any case. The various constrained nonterminals of Fortran are represented in the parse tree and checked later in expression semantics.

As far as I can tell, the collapse argument is a scalarIntConstantExpr to the parser. So conceptually one could make it a scalarPositiveIntConstantExpr and issue an error early.

Fortran has scalar-int-constant-expr but not scalar-positive-int-constant-expr as nonterminals.

The f18 parser doesn't do any semantic checking in any case.

This is arguably not true. The OpenMP grammar does not restrict the the collapse value to be a constant, the description does *in the same sentence* it is restricted to be positive.
If you would not perform any semantic check you would allow an arbitrary expression here and check it later in sema. My argument now is that this does make as much sense as checking that it is a constant but not if it is positive.

The various constrained nonterminals of Fortran are represented in the parse tree and checked later in expression semantics.

But apparently not before you access/use input not validated yet. Take the TODO that I quoted (which is above the method that now duplicates the semantic check for the collapse clause value).
It already noted that there is an implicit dependence here. Resolving that by duplicating checks seems conceptually the wrong approach. (You also now have signed values floating around even though you want conceptually unsigned values.)

As far as I can tell, the collapse argument is a scalarIntConstantExpr to the parser. So conceptually one could make it a scalarPositiveIntConstantExpr and issue an error early.

Fortran has scalar-int-constant-expr but not scalar-positive-int-constant-expr as nonterminals.

The f18 parser doesn't do any semantic checking in any case.

This is arguably not true. The OpenMP grammar does not restrict the the collapse value to be a constant, the description does *in the same sentence* it is restricted to be positive.
If you would not perform any semantic check you would allow an arbitrary expression here and check it later in sema. My argument now is that this does make as much sense as checking that it is a constant but not if it is positive.

The various constrained nonterminals of Fortran are represented in the parse tree and checked later in expression semantics.

But apparently not before you access/use input not validated yet. Take the TODO that I quoted (which is above the method that now duplicates the semantic check for the collapse clause value).
It already noted that there is an implicit dependence here. Resolving that by duplicating checks seems conceptually the wrong approach. (You also now have signed values floating around even though you want conceptually unsigned values.)

I can't figure out what you're trying to express, or what part of my comment was "arguably not true". The Fortran 2018 grammar has some conventions that allow the names of nonterminals to be qualified with semantic restriction prefixes (4.1.3 for "scalar"). So a "scalar-int-constant-expr" is an "int-constant-expr" (R1031) that is semantically constrained to be scalar and constant (C1031) in the Fortran sense of a "constant expression" (10.1.12) and integer (C1008). The f18 parse tree represents constraints like this with template classes (Scalar<>, Integer<>, &c.) that wrap a subtree with a representation of a check that is applied after parsing during the semantic analysis and validation of expressions and variable references. The expressions and variable references in the parse tree are analyzed and validated after parsing has recognized a syntactically valid program, name resolution has created the symbols and scopes and types, and any necessary corrections to ambiguous parses have been applied to the parse tree. The semantic analysis of expressions and variables generates (in the absence of errors) a strongly-typed internal representation of those objects, and attaches those representations to the parse tree so that the later semantic passes have access to them. All compiler messages from all phases get buffered, then sorted in source position order and emitted later after all messages have been created.

As far as I can tell, the collapse argument is a scalarIntConstantExpr to the parser. So conceptually one could make it a scalarPositiveIntConstantExpr and issue an error early.

Fortran has scalar-int-constant-expr but not scalar-positive-int-constant-expr as nonterminals.

The f18 parser doesn't do any semantic checking in any case.

This is arguably not true. The OpenMP grammar does not restrict the the collapse value to be a constant, the description does *in the same sentence* it is restricted to be positive.
If you would not perform any semantic check you would allow an arbitrary expression here and check it later in sema. My argument now is that this does make as much sense as checking that it is a constant but not if it is positive.

The various constrained nonterminals of Fortran are represented in the parse tree and checked later in expression semantics.

But apparently not before you access/use input not validated yet. Take the TODO that I quoted (which is above the method that now duplicates the semantic check for the collapse clause value).
It already noted that there is an implicit dependence here. Resolving that by duplicating checks seems conceptually the wrong approach. (You also now have signed values floating around even though you want conceptually unsigned values.)

I can't figure out what you're trying to express,

Hm, unsure what was confusing in my response. Maybe you say which of these is unclear or if it something else entirely:

  • My concerns with this kind of "fix", e.g., how it duplicates the verification logic and ties into the TODO above he function.
  • My suggested alternative "fix", that is parsing positive-constants.
  • My argument that the line drawn in the current parser is arbitrary, e.g., it cannot be justified with the OpenMP standard, and therefor not a good reason to keep it this way.

or what part of my comment was "arguably not true".

The part I quoted. To make it clear, this part: *The f18 parser doesn't do any semantic checking in any case.*
I then explained why I think that with respect to the OpenMP standard. From your answer below I sense you might
draw a line between parsing as building (an annotated) parse tree and parsing as all the effects of the code in the
flang/parser folder. FWIW, I was talking about the latter.

The Fortran 2018 grammar has some conventions that allow the names of nonterminals to be qualified with semantic restriction prefixes (4.1.3 for "scalar"). So a "scalar-int-constant-expr" is an "int-constant-expr" (R1031) that is semantically constrained to be scalar and constant (C1031) in the Fortran sense of a "constant expression" (10.1.12) and integer (C1008). The f18 parse tree represents constraints like this with template classes (Scalar<>, Integer<>, &c.) that wrap a subtree with a representation of a check that is applied after parsing during the semantic analysis and validation of expressions and variable references. The expressions and variable references in the parse tree are analyzed and validated after parsing has recognized a syntactically valid program, name resolution has created the symbols and scopes and types, and any necessary corrections to ambiguous parses have been applied to the parse tree. The semantic analysis of expressions and variables generates (in the absence of errors) a strongly-typed internal representation of those objects, and attaches those representations to the parse tree so that the later semantic passes have access to them. All compiler messages from all phases get buffered, then sorted in source position order and emitted later after all messages have been created.

Thank you for the detailed explanation. Unfortunately, now I am the one unsure what you want to tell me.

I never argued about the Fortran grammar but the OpenMP one, this is about OpenMP parsing/sema after all and the Fortran grammar does not mention the collapse clause (I guess at least). I also did not say you should do anything different *except* what is basically exposed with this patch. That is, PrivatizeAssociatedLoopIndex is run on an AST nodes in which the associated loops has an invalid value. This patch "fixes" PrivatizeAssociatedLoopIndex so it can cope with such invalid input. My point is that this duplicates the burden of validating the associated loops count.


To recap my argument:

As far as I can tell, when PrivatizeAssociatedLoopIndex is entered the collapse clause argument has been evaluated already (, which is why the test failed). If that argument was defined as scalarPositiveIntConstantExpr instead of a scalarIntConstantExpr the pre-condition mentioned in the TODO above PrivatizeAssociatedLoopIndex would be satisfied by construction and we would not need to worry about invalid inputs here or elsewhere for that matter. The OpenMP standard defines the argument of a collapse clause as positive constant. Expecting a positive constant already during parsing (=the code in flang/parser) seems more reasonable to me as parsing a constant (which is also only taken from the OpenMP standard) and then verifying the value in a second step.

To recap my argument:

As far as I can tell, when PrivatizeAssociatedLoopIndex is entered the collapse clause argument has been evaluated already (, which is why the test failed). If that argument was defined as scalarPositiveIntConstantExpr instead of a scalarIntConstantExpr the pre-condition mentioned in the TODO above PrivatizeAssociatedLoopIndex would be satisfied by construction and we would not need to worry about invalid inputs here or elsewhere for that matter. The OpenMP standard defines the argument of a collapse clause as positive constant. Expecting a positive constant already during parsing (=the code in flang/parser) seems more reasonable to me as parsing a constant (which is also only taken from the OpenMP standard) and then verifying the value in a second step.

The code that was moved into the flang/lib/Parser directory comprises preprocessing, prescanning into a normalized source form, parsing proper with construction of a parse tree, message formatting and source provenance, and utilities for formatting the parse tree in various ways. None of the code in flang/lib/Parser performs semantic analysis. It is not possible to validate any semantic constraints in the parser, including the existing constraints (scalar, integer, constant, default character, logical, &c.) and the new one you suggest, because the parser has no symbol table and cannot evaluate expressions. All of that semantic checking is performed by code that is not located in the flang/lib/Parser and that runs after a successful parse of the entire source file is complete.

ChinouneMehdi added a project: Restricted Project.Apr 15 2020, 5:44 AM
sscalpone accepted this revision.Apr 15 2020, 12:03 PM
sscalpone added a subscriber: sscalpone.

In all cases, the compiler must check that the expression used in collapse is positive. Seems like overkill to create a new expr parser to catch a subset of this check.

The change looks good. Thanks.

To recap my argument:

As far as I can tell, when PrivatizeAssociatedLoopIndex is entered the collapse clause argument has been evaluated already (, which is why the test failed). If that argument was defined as scalarPositiveIntConstantExpr instead of a scalarIntConstantExpr the pre-condition mentioned in the TODO above PrivatizeAssociatedLoopIndex would be satisfied by construction and we would not need to worry about invalid inputs here or elsewhere for that matter. The OpenMP standard defines the argument of a collapse clause as positive constant. Expecting a positive constant already during parsing (=the code in flang/parser) seems more reasonable to me as parsing a constant (which is also only taken from the OpenMP standard) and then verifying the value in a second step.

The code that was moved into the flang/lib/Parser directory comprises preprocessing, prescanning into a normalized source form, parsing proper with construction of a parse tree, message formatting and source provenance, and utilities for formatting the parse tree in various ways. None of the code in flang/lib/Parser performs semantic analysis. It is not possible to validate any semantic constraints in the parser, including the existing constraints (scalar, integer, constant, default character, logical, &c.) and the new one you suggest, because the parser has no symbol table and cannot evaluate expressions. All of that semantic checking is performed by code that is not located in the flang/lib/Parser and that runs after a successful parse of the entire source file is complete.

You seem to not address any of my concerns but eager to lecture me about something different. This is not helping at all.

In all cases, the compiler must check that the expression used in collapse is positive. Seems like overkill to create a new expr parser to catch a subset of this check.

The change looks good. Thanks.

I mentioned this before, the change addresses this TODO which should be removed or modified accordingly:

// TODO: This assumes that the do-loops association for collapse/ordered
//       clause has been performed (the number of nested do-loops >= n).

@jdoerfert I have spoken to the original develop who will prepare an update for the TODO comment. if @clementval wants to fix it up, see my comment.

sscalpone added inline comments.Apr 15 2020, 4:40 PM
flang/lib/Semantics/resolve-names.cpp
6572

Perhaps replace the TODO in the comment with:
// TODO: revisit after semantics checks are completed for do-loop association of collapse and ordered

Update todo

clang-fromat

In all cases, the compiler must check that the expression used in collapse is positive. Seems like overkill to create a new expr parser to catch a subset of this check.

The change looks good. Thanks.

I mentioned this before, the change addresses this TODO which should be removed or modified accordingly:

// TODO: This assumes that the do-loops association for collapse/ordered
//       clause has been performed (the number of nested do-loops >= n).

I have updated the comment. I'll have a look to address the semantic check regarding the loop association. Nevertheless, the structure check for OpenMP is currently done after the name resolution so the error is only caught after this is executed.

@jdoerfert I have spoken to the original develop who will prepare an update for the TODO comment. if @clementval wants to fix it up, see my comment.

@sscalpone I'll have a look at that.

DavidTruby requested changes to this revision.Apr 16 2020, 8:01 AM
This revision now requires changes to proceed.Apr 16 2020, 8:01 AM

@DavidTruby You requested change without any comment. Which changes do you want?

My mistake, I think I forgot to click "Save Draft" on the comment! I've added it now, sorry about that.

flang/lib/Semantics/resolve-names.cpp
1242

Should this be std::int64_t? I don't think int64_t is guaranteed to be provided outside the std:: namespace in C++.

Use std::int64_t

clementval marked 2 inline comments as done.Apr 16 2020, 6:39 PM

@sscalpone Just to be sure what we are talking about. Semantic check for ordered and collapse clause are done in check-omp-structure.cpp. Those checks are executed after the name resolution. So the check here is still necessary unless semantic check are done before name resolution.

DavidTruby accepted this revision.Apr 17 2020, 7:44 AM

Looks good to me now, thanks! I would like to see @jdoerfert approve this before it lands since he expressed concerns in previous comments.

This revision is now accepted and ready to land.Apr 17 2020, 7:44 AM

Looks good to me now, thanks! I would like to see @jdoerfert approve this before it lands since he expressed concerns in previous comments.

Thanks for the review! Sure let's see if @jdoerfert has still some concerns. There was also a question to @sscalpone that is still unanswered.

Yes, @clementval, I understood your comment, and it got me thinking about the structure of the compiler's evaluation of OpenMP semantics. Looking at the code, I believe the assert at line 6517 will trigger if the collapse value is greater than the loop nest. Unfortunately the check for collapse(n) cannot be moved earlier to the structure checker because symbols have not yet been resolved so 'n' cannot be evaluated. I've talked with @ichoyjx about keeping the symbol resolution here but moving the association of attributes to semantics, at which time all of the required information can be available.

In the meantime, this change is a reasonable workaround to prevent the assertion until the code is restructured.

Yes, @clementval, I understood your comment, and it got me thinking about the structure of the compiler's evaluation of OpenMP semantics. Looking at the code, I believe the assert at line 6517 will trigger if the collapse value is greater than the loop nest. Unfortunately the check for collapse(n) cannot be moved earlier to the structure checker because symbols have not yet been resolved so 'n' cannot be evaluated. I've talked with @ichoyjx about keeping the symbol resolution here but moving the association of attributes to semantics, at which time all of the required information can be available.

In the meantime, this change is a reasonable workaround to prevent the assertion until the code is restructured.

Yeah it makes sense

@jdoerfert Ping. Do you agree with the current changes? If not, we can just let it as is until the order of semantic checks is redone to avoid a double check.

sscalpone accepted this revision.May 4 2020, 11:26 PM

Looks good to me now, thanks! I would like to see @jdoerfert approve this before it lands since he expressed concerns in previous comments.

@jdoerfert anything to add here?

@DavidTruby @sscalpone Since we have no news for a while on this, should a land this patch?

I'm gonna land this patch if there is no objection in the next 24h.

This revision was automatically updated to reflect the committed changes.

Apologies for not responding, I did not see the emails and accepted patches hide in my dashboard pretty well. TBH, from the responses I assumed you went ahead with this a while ago.

Apologies for not responding, I did not see the emails and accepted patches hide in my dashboard pretty well. TBH, from the responses I assumed you went ahead with this a while ago.

No worries, this was not blocking on my side so I thought I would give enough time to give a feedback if needed.