This is an archive of the discontinued LLVM Phabricator instance.

[flang] Support C1553 about BIND(C) function result
ClosedPublic

Authored by peixin on Nov 2 2022, 7:31 AM.

Details

Summary

As Fortran 2018 C1553, if with BIND(C), the function result shall be an
interoperable scalar variable. As Fortran 2018 18.3.4(1), the
interoperable scalar variable is not a coarray, has neither the
ALLOCATABLE nor the POINTER attribute, and if it is of type character its
length is not assumed or declared by an expression that is not a constant
expression.

As Fortran 2018 18.3.1(1), if the type is character, the length type
parameter is interoperable if and only if its value is one.

Diff Detail

Event Timeline

peixin created this revision.Nov 2 2022, 7:31 AM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a subscriber: jdoerfert. · View Herald Transcript
peixin requested review of this revision.Nov 2 2022, 7:31 AM
PeteSteinfeld accepted this revision.Nov 2 2022, 1:31 PM

All builds and tests correctly and looks good.

This revision is now accepted and ready to land.Nov 2 2022, 1:31 PM
jeanPerier added inline comments.Nov 3 2022, 1:47 AM
flang/lib/Semantics/check-declarations.cpp
433

Thanks for adding the check. I think that when it comes to the function result, a character can only be interoperable if its length is 1. I think character arguments are more relaxed because a character scalar can be sequence associated (15.5.2.11) to a character array that is interoperable with a char[]. 15.5.2.11 does not apply to result however as far as I know.

I find the standard not cristal clear here. But I am not quite sure how the interoperability of a character(10) result would work in practice for instance. What do you think ?

peixin added inline comments.Nov 3 2022, 2:25 AM
flang/lib/Semantics/check-declarations.cpp
433

I find the standard not cristal clear here.

It's true that character arguments can be sequence associated. And it should be OK to pass it to one C string. One problem is that C string should have one character '\0' at the end position, but Fortran does not have. So it is not safe to support it.

Please check 18.3.1 (1): If the type is character, the length type parameter is interoperable if and only if its value is one. gfortran and ifort do not allow character length more than 1 for both argument and function result.

So should we enforce the length 1 for both argument and function result?

peixin added inline comments.Nov 8 2022, 2:52 AM
flang/lib/Semantics/check-declarations.cpp
433

@jeanPerier Can we also enforce the length 1 of character for BIND(C) procedure?

jeanPerier added inline comments.Nov 16 2022, 12:27 AM
flang/lib/Semantics/check-declarations.cpp
433

Sorry for the late reply, yes, I think we should.

peixin updated this revision to Diff 475807.Nov 16 2022, 5:53 AM
peixin retitled this revision from [flang] Support C1553 about BIND(C) function result to [flang] Support C1553 about BIND(C) function result and procedure argument.
peixin edited the summary of this revision. (Show Details)

Enforce character length 1 for BIND(C) procedure.

jeanPerier requested changes to this revision.Dec 20 2022, 4:47 AM

Sorry for the delay, I am bad at watching already approved patches. My main comment is I think this would now reject legal program because of the dummy check.

flang/lib/Semantics/check-declarations.cpp
418

The message seems a bit complex to me. What about BIND(C) character function result must have length one ? @PeteSteinfeld for opinion.

421

I do not think all BIND(C) character must have length one. I think this only holds true if the argument has the VALUE attribute.

Otherwise, 18.3.6 tells:

(5) any dummy argument without the VALUE attribute corresponds to a formal parameter of the prototype
that is of a pointer type, and [...] the dummy argument is a nonallocatable nonpointer variable of type CHARACTER with assumed character length and the formal parameter is a pointer to CFI_cdesc_t,
[...]
(6) each allocatable or pointer dummy argument of type CHARACTER has deferred character length
`

It bothers me a little because this means that 18.3.4 and 18.3.5 do not apply to dummy objects, which is not very clear IMHO, but I do not think any error should be raised for:

subroutine foo(c1, c2, c3, p1, p2) bind(c)
  character(*) :: c1, c2(:), c3(100)
  character(:), pointer :: p1, p2(:)
end subroutine
This revision now requires changes to proceed.Dec 20 2022, 4:47 AM
peixin added inline comments.Dec 20 2022, 6:08 PM
flang/lib/Semantics/check-declarations.cpp
421

Thanks for this check. I did some more investigations and have the following results:

  1. gfortran 11.3.0 and early version do not support assumed-length or deferred-length character scalar (https://godbolt.org/z/GY3869j11). But from gfortran 12, these two are supported. ifort 2021.7.1 and ifx 2022.2.1 support these two, too.

For assumed-length or deferred-length character array, gfortran 12 supports it. ifort reports ICE, and ifx reports Feature found on this line is not yet supported in ifx . (https://godbolt.org/z/9795MYMv1)

BTW, flang-new supports the above two test cases in execution.

  1. I also think the standard is not very clear here. As 18.3.1(1), If the type is character, the length type parameter is interoperable if and only if its value is one. To my understanding, the standard only wants users to use character with length one to be interoperable even if it is assumed-length or deferred-length. As I mentioned before, C string has zero terminator, but Fortran character does not have. It should be unsafe to be interoperable for character with length more than 1. I tried gfortran 12 with -fcheck=bounds, and it did not report runtime errors.
  1. All of gfortran 12.2, ifort 2021.7.1 and ifx 2022.2.1 reports semantic errors when the length is constant and more than 1 (violate 18.3.1(1)).

All in all, for BIND(C) procedure character arugment, we may only report semantic error when the length is constant and more than 1. For assumed-length or deferred-length, we can allow it for now. @jeanPerier What do you think?

jeanPerier added inline comments.Dec 21 2022, 1:23 AM
flang/lib/Semantics/check-declarations.cpp
421

To my understanding, the standard only wants users to use character with length one to be interoperable even if it is assumed-length or deferred-length. As I mentioned before, C string has zero terminator, but Fortran character does not have. It should be unsafe to be interoperable for character with length more than 1.

I am not sure about that because the standard mandates to use a CFI descriptor for assumed-length/deferred length, so the (byte) length of the character is safely accessible from C elem_len descriptor field and user may very well manipulate it using all the n versions of the C library (e.g. strncmp).

By the way, I notice that even though we compile and run "correctly" BIND(C) for scalar assumed length with a Fortran caller and callee, we are not using a CFI descriptor (fir.box) as we should.

All in all, for BIND(C) procedure character arugment, we may only report semantic error when the length is constant and more than 1. For assumed-length or deferred-length, we can allow it for now. @jeanPerier What do you think?

Sounds fine to me. In a very pedantic way, I think we could also allow "assumed-shape, assumed-rank without the CONTIGUOUS" attribute with explicit length since it is covered by the third bullet point in (5) and (6) only applies to pointer/allocatable. The latest gfortran allow this (e.g character(10) :: c2(:)) (but not scalar or explicit shape with explicit length). Adding @klausler in the discussion to validate this.

peixin added inline comments.Dec 21 2022, 6:17 PM
flang/lib/Semantics/check-declarations.cpp
421

I am not sure about that because the standard mandates to use a CFI descriptor for assumed-length/deferred length, so the (byte) length of the character is safely accessible from C elem_len descriptor field and user may very well manipulate it using all the n versions of the C library (e.g. strncmp).

By the way, I notice that even though we compile and run "correctly" BIND(C) for scalar assumed length with a Fortran caller and callee, we are not using a CFI descriptor (fir.box) as we should.

Right. I missed this part.

The latest gfortran allow this (e.g character(10) :: c2(:)) (but not scalar or explicit shape with explicit length).

Check this example (https://godbolt.org/z/Krdnf3c6d), it seems gfortran uses CFI descriptor for this case. If we can support the explicit length in lowering using CFI decriptor, maybe one portability warning is better instead of following the support in gfortran (gfortran 13 may support more)?

BTW, it seems the standard does not disallow the explicit length (more than 1) character dummy argument without the VALUE attribute explicitly.

peixin updated this revision to Diff 487139.Jan 7 2023, 10:14 PM
peixin retitled this revision from [flang] Support C1553 about BIND(C) function result and procedure argument to [flang] Support C1553 about BIND(C) function result.
peixin edited the summary of this revision. (Show Details)
peixin set the repository for this revision to rG LLVM Github Monorepo.

Restrict this patch to C1553. Leave BIND(C) argument work (lowering and semantic check) in https://github.com/llvm/llvm-project/issues/59881.

peixin updated this revision to Diff 487146.Jan 7 2023, 11:29 PM

Fix clang-format issue. It seems that ../build/bin/clang-format is not consistent with buildbot clang-format check.

Looks good, thanks for all the rework @peixin

jeanPerier accepted this revision.Jan 9 2023, 5:06 AM
This revision is now accepted and ready to land.Jan 9 2023, 5:06 AM
This revision was automatically updated to reflect the committed changes.