This is an archive of the discontinued LLVM Phabricator instance.

[flang][hlfir] Lower forall to HLFIR
ClosedPublic

Authored by jeanPerier on May 4 2023, 11:21 AM.

Details

Summary

Lower Forall to the previously added hlfir.forall, hlfir.forall_mask.
hlfir.forall_index, and hlfir.region_assign operations.

The HLFIR assignment code lowering is moved into genDataAssignment for
more readability and so that user defined assignment (still a TODO),
will be able to share most of the logic.

Depends on D149852, D149836

Diff Detail

Event Timeline

jeanPerier created this revision.May 4 2023, 11:21 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 4 2023, 11:21 AM
jeanPerier requested review of this revision.May 4 2023, 11:21 AM
vzakhari accepted this revision.May 4 2023, 7:17 PM

Looks great!

flang/lib/Lower/Bridge.cpp
2048

Why cannot we do the same for the masked case?

3130

typo: LHS -> RHS

3159

typo: right -> left

This revision is now accepted and ready to land.May 4 2023, 7:17 PM
jeanPerier marked an inline comment as done.May 5 2023, 1:29 AM
jeanPerier added inline comments.
flang/lib/Lower/Bridge.cpp
2048

Because it may, and most likely will, depend on the forall indices being defined in the concurrent header.
Typically, this will be forall(i=lb:ub, some_predicate(i)) x(i) = y(i). These symbols depends on the hlfir.forall block arguments, so the expression can only be evaluated inside it.

We could check if the mask expression depends on the forall symbols via CollectSymbols, but I think we would be wasting compilation time because I do not think it is usual to have a scalar mask that does not depend on the indices (it would be equivalent to having "if (cdt) { forall {} })".

Note that the bounds/step expressions of forall indices in the same forall may not depend on any of the indices of the same concurrent header (Fortran 2018 C1123 constraints), so it is safe to outline them.

I added a comment.

jeanPerier updated this revision to Diff 519765.May 5 2023, 1:30 AM

Thanks a lot for the review!

  • Update genDataAssignment to incorporate genImplicitLogicalConvert.
  • Add comment to explain why mask expression evalution is not outlined
  • Fix right/left mix-ups in comments
tblah added inline comments.May 5 2023, 3:23 AM
flang/test/Lower/HLFIR/forall.f90
115–116

As I understand it, the limits to these iterations can be any integer expression (R1127). So jfoo() and jbar() might have side effects and so can't be hoisted outside of the outer forall.

jeanPerier added inline comments.May 5 2023, 5:35 AM
flang/test/Lower/HLFIR/forall.f90
115–116

Thanks for pointing this out! I actually thought I was doing an optimization by hoisting these out, but after reading the standard again, it turns out I think we are required to hoist them precisely because they may have side effects...

First, you are correct that they may have side effects, however, I think the standard tells in 11.1.7.4.2 that the concurrent-limits/steps can be evaluated in any order, and that we must not evaluate jfoo()/jbar() for each i (that we evaluate them exactly and only once).

Rational:

10.2.4.2.2 Determination of the values for index variables.

  1. The values of the index variables are determined as they are for the DO CONCURRENT statement (11.1.7.4.2).

11.1.7.4.2: DO CONCURRENT loop control

  1. The concurrent-limit and concurrent-step expressions in the concurrent-control-list are evaluated. > These expressions may be evaluated in any order.

Plus the standard explicitly adds a constraint that the concurrent control for j cannot depend on i, so that they can be evaluated only once.

C1123 (R1126) A concurrent-limit or concurrent-step in a concurrent-control shall not contain a reference to any index-name in the concurrent-control-list in which it appears.

Regarding the side effects they can have:

10.1.4 Evaluation of operations
Execution of a function reference in the logical expression in [...] the concurrent-limits and concurrent-steps in a FORALL statement (10.2.4) is permitted to define variables in the subsidiary [...] forall-assignment-stmt.

While this allows concurrent-limits/step (of the outer forall only(*) ) expression to have side effect on the variables used in the nested forall-assignment statements, this does not allow the concurrent-limits/step to have side effect that would affect other concurrent-limit/setp. The standard is quite clear here we could even evaluate jfoo()/jbar() before ifoo() and ibar().

(*) All this discussions only impact the outer Forall, since C1037 requires every procedure reference inside a Forall body to be pure, which includes the concurrent-control of the nested Forall. So we can evaluate the nested control again an again or never without any correctness implications.

Do you agree with the reading of the standard?

tblah accepted this revision.May 5 2023, 5:41 AM

Looks great to me!

flang/test/Lower/HLFIR/forall.f90
115–116

Ahh yes I think you are right. Thank you for checking this

vzakhari added inline comments.May 5 2023, 8:41 AM
flang/lib/Lower/Bridge.cpp
2048

Thanks!

This revision was automatically updated to reflect the committed changes.