This is an archive of the discontinued LLVM Phabricator instance.

[flang][RFC] Add lowering design for procedure pointers
ClosedPublic

Authored by peixin on Oct 27 2022, 5:42 AM.

Details

Diff Detail

Event Timeline

peixin created this revision.Oct 27 2022, 5:42 AM
Herald added a project: Restricted Project. · View Herald Transcript
Herald added a subscriber: jdoerfert. · View Herald Transcript
peixin requested review of this revision.Oct 27 2022, 5:42 AM

Thanks for adding me.
NFC comment: you have a typo in the title - procdure -> procedure

peixin retitled this revision from [flang][RFC] Add lowering design for procdure pointers to [flang][RFC] Add lowering design for procedure pointers.Oct 27 2022, 5:46 AM
peixin edited the summary of this revision. (Show Details)

Thanks for writing this up !

I think there are few additional things you may need to cover here about Procedure pointers:

  1. What will happen for procedure pointers pointing to internal procedures.

Currently, the flang-procedure-pointer pass (see lib/Optimizer/CodeGen/BoxedProcedure.cpp) rewrites fir.boxproc<T> and deals with emboxproc, the idea behind is that function address in Fortran can point to internal procedures, and there are several ways this could be dealt with. For instance, procedure pointers could carry a pointer to the host stack if it points to an internal procedure, and calls made to it would need to retrieve and hoist this as an extra arguments. For procedure dummy, the chosen design was to create thunks (which has the drawback or requiring the stack to be executable). Some solutions with runtime managed trampolines defined outside of the stack could also be imagined.

  1. Procedure pointers to functions returning characters

For dummy procedures, the length of the function may be assumed in the procedure making the call. Given the length is required to make the call, it must be provided as an extra argument (see https://github.com/llvm/llvm-project/blob/ffc7f9d542370eb72ad1f4bf79f763ca685bab8b/flang/include/flang/Optimizer/Builder/Character.h#L198). Is it also possible for a dummy procedure pointer to have a function type with an assumed length ? If so, something might need to be done to carry the length information (as an extra argument ?).

  1. Is it possible to pass procedure pointers in BIND(C) procedure ? (If yes, what is the related C type).
flang/docs/ProcedurePointer.md
171

I get that you are following the local object POINTER lowering here where we both have a fir.box<> and a fir.ptr<> being allocated on the stack and do sync operations. I am thinking this may away with HLFIR, at least when lowering, and that a pass would instead simplify local fir.box into a set of address allocations.

I would if possible stay away from it for procedure pointers and just start with a single fir.alloca !fir.boxproc<(!fir.ref<i32>) -> !fir.ref<f32>>.

193

Lack Fortran input illustration ?

229

Note that you may need to relax T to be () -> () in this case to account for the fact that function types are being rewritten by certain pass (e.g: https://github.com/llvm/llvm-project/blob/main/flang/lib/Optimizer/CodeGen/TargetRewrite.cpp and https://github.com/llvm/llvm-project/blob/main/flang/lib/Optimizer/Transforms/AbstractResult.cpp).

The issue I see there with having a type !fir.boxproc<T> inside a fir.type is that if T function type needs to be rewritten, so must the fir.type, and currently, I do not think there is a way to change the types of the fir.type members after it was created without creating a new type with a different name. See this TODO: https://github.com/llvm/llvm-project/blob/67d0d7ac0acb0665d6a09f61278fbcf51f0114c2/flang/lib/Optimizer/CodeGen/BoxedProcedure.cpp#L128.

The same issue will probably happens with fir.global of procedure pointers, where the fir.global might type needs to be changed if the interface is transformed.

  1. What will happen for procedure pointers pointing to internal procedures.

Currently, the flang-procedure-pointer pass (see lib/Optimizer/CodeGen/BoxedProcedure.cpp) rewrites fir.boxproc<T> and deals with emboxproc, the idea behind is that function address in Fortran can point to internal procedures, and there are several ways this could be dealt with. For instance, procedure pointers could carry a pointer to the host stack if it points to an internal procedure, and calls made to it would need to retrieve and hoist this as an extra arguments. For procedure dummy, the chosen design was to create thunks (which has the drawback or requiring the stack to be executable). Some solutions with runtime managed trampolines defined outside of the stack could also be imagined.

An extra argument doesn't work for internal procedures passed as actual arguments to F'77 procedures unless you want to break ABI compatibility.

A stack thunk is a security hole. Please don't require an executable stack segment for Fortran.

I don't think that there's any alternative to having heap-resident thunks for internal procedures when they are used as procedure pointer targets and/or actual arguments. These impose a clean-up burden on lowering but are otherwise straightforward.

peixin updated this revision to Diff 471732.Oct 29 2022, 2:44 AM
  1. Add more related sections and constraints.
  2. Address the comments.

Thanks @jeanPerier and @klausler a lot for the detailed comments and explanations.

Note that you may need to relax T to be () -> () in this case to account for the fact that function types are being rewritten by certain pass (e.g: https://github.com/llvm/llvm-project/blob/main/flang/lib/Optimizer/CodeGen/TargetRewrite.cpp and https://github.com/llvm/llvm-project/blob/main/flang/lib/Optimizer/Transforms/AbstractResult.cpp).

The issue I see there with having a type !fir.boxproc<T> inside a fir.type is that if T function type needs to be rewritten, so must the fir.type, and currently, I do not think there is a way to change the types of the fir.type members after it was created without creating a new type with a different name. See this TODO: https://github.com/llvm/llvm-project/blob/67d0d7ac0acb0665d6a09f61278fbcf51f0114c2/flang/lib/Optimizer/CodeGen/BoxedProcedure.cpp#L128.

Fixed.

The same issue will probably happens with fir.global of procedure pointers, where the fir.global might type needs to be changed if the interface is transformed.

Right, I hit this previously. Add it in TODO lists.

  1. What will happen for procedure pointers pointing to internal procedures.

Add explanations in doc.

Is it also possible for a dummy procedure pointer to have a function type with an assumed length ? If so, something might need to be done to carry the length information (as an extra argument ?).

Yes. Add explanations in doc.

  1. Is it possible to pass procedure pointers in BIND(C) procedure ? (If yes, what is the related C type).

Yes. Add Case 4 and Case 5. Please check that.

An extra argument doesn't work for internal procedures passed as actual arguments to F'77 procedures unless you want to break ABI compatibility.
A stack thunk is a security hole. Please don't require an executable stack segment for Fortran.
I don't think that there's any alternative to having heap-resident thunks for internal procedures when they are used as procedure pointer targets and/or actual arguments. These impose a clean-up burden on lowering but are otherwise straightforward.

Thanks for these explanations. Added in doc.

flang/docs/ProcedurePointer.md
171

Got it. Thanks. Fixed.

jeanPerier added inline comments.Oct 31 2022, 2:52 AM
flang/docs/ProcedurePointer.md
267
274

Do you mean tuple<fir.ref<!fir.boxproc<T>>, i64> or fir.ref<!fir.boxproc<T>>, i64> ? (The latter probably only makes sense it it is possible to change the length of the procedure pointer dummy, which I am not sure is possible).

peixin updated this revision to Diff 471974.Oct 31 2022, 5:24 AM
peixin added inline comments.
flang/docs/ProcedurePointer.md
267

Fixed.

274

Sorry, my bad. I meant to use tuple<!fir.ref<!fir.boxproc<T>>, i64>.

PeteSteinfeld requested changes to this revision.Oct 31 2022, 1:29 PM

Thanks for doing this!

I'd like to see information on how you're going to test this. There should be executable tests, which, unfortunately, we don't have much support for. It would be good to list all of the areas that need to be tested -- functions, subroutines, internal procedures, component procedure pointers, interfacing with C, varying types for arguments and function returns, ...

flang/docs/ProcedurePointer.md
81

"intrinsic NULL" should read "the NULL() intrinsic"

208–211

I didn't understand this section. I assume that the sentence that starts with "With the explicit interface ..." should read something like "If the C function has an explicit interface and a dummy argument that is a procedure pointer, the code passes a C function pointer as the actual argument". But I'm not sure what happens for the case where the interface is not explicit. Can you please clarify?

270

Should read "to a function returning a character type"

272–274

Is this section specific to functions that return a character type with an assumed length? Is so the section title should mention the fact that the character type is assumed length. Also the section does not actually mention characters. Best to be as specific as possible.

278–283

I don't understand the reference to Fortran 77. Internal procedures first appeared in Fortran 2008. But that doesn't seem relevant, either. I suggest just getting rid of references to the standard here.

I assume that you want this paragraph to read something like: "Initially the current plan is to implement pointers to internal procedures using the LLVM Trampoline intrinsics. This has the drawback of requiring the stack to be executable, which is a security hole. To avoid this, we will need improve the implementation to use heap-resident thunks.

291

Should read "that of an external ..."

376

Should read "components"

378–379

Should read "Having procedure pointers in derived types permits methods to be dynamically bound to objects. Such procedure pointer components will have the type !fir.boxproc<T>.

438

Should read "... possible implementations." That is, don't have the final "possible".

This revision now requires changes to proceed.Oct 31 2022, 1:29 PM
peixin updated this revision to Diff 472227.Nov 1 2022, 1:15 AM

Address the comments.

peixin marked 3 inline comments as done.Nov 1 2022, 1:26 AM

I'd like to see information on how you're going to test this. There should be executable tests, which, unfortunately, we don't have much support for. It would be good to list all of the areas that need to be tested -- functions, subroutines, internal procedures, component procedure pointers, interfacing with C, varying types for arguments and function returns, ...

I agree that execution test is important for this feature. Usually, I test end-to-end with some fortran examples locally. For this feature, I think LLVM IR check is the first round. Regarding to execution tests, is it OK to attach the issue created in llvm-project to the future lowering/codegen patch, or can I upload the Fortran/C test examples somewhere?

flang/docs/ProcedurePointer.md
208–211

I changed the description. The added testing part can clarify it more clearly.

A C function pointer is semantically equivalent to a Fortran procedure in LLVM IR level, and a pointer to a C function pointer is semantically equivalent to a Fortran procedure pointer in LLVM IR level. That is, a Fortran procedure will be converted to a opaque pointer in LLVM IR level, which is the same for a C function pointer; a Fortran procedure pointer will be converted to a opaque pointer pointing to a opaque pointer, which is the same for a pointer to a C function pointer.

272–274

Remove the case of assumed length character.

Is it also possible for a dummy procedure pointer to have a function type with an assumed length ?

@jeanPerier I was wrong for your question. It cannot be that case. C721 does not allow it. Check the following:

  interface
    character(*) function func()
    end
  end interface
  procedure(), pointer :: p=>func
  call bar(p)
contains
  subroutine bar(proc)
    procedure(func), pointer :: proc
    character(10) :: a! = "aaaa"
    a = proc()
    print *, a
  end
end
character(*) function func()
  func = "abc"
end

gfortran can compile it, but the running result is wrong. LLVM Flang reports the semantic error, which is expected.

278–283

Thanks a lot. English is not my mother language, so I really appreciate you make it more clear.

438

Sorry, it was one typo. Fixed.

PeteSteinfeld accepted this revision.Nov 1 2022, 7:50 AM

Thanks for all of the updates, @peixin. I only had one small nit, which was my fault.

I like your idea of creating an issue in llvm-project to cover the execution testing of the implementation. Please do so.

I'm approving this patch since it has all of the changes I've requested. But I recommend waiting for @jeanPerier to approve since he has had so many comments.

flang/docs/ProcedurePointer.md
280–281

"need improve" should read "need to improve". My mistake!

This revision is now accepted and ready to land.Nov 1 2022, 7:50 AM
jeanPerier added inline comments.Nov 2 2022, 3:00 AM
flang/docs/ProcedurePointer.md
272–274

C721 makes your example illegal, but I am not sure it makes it illegal to have dummy procedure pointers with assumed length.

Here is an example that shows an usage of assumed length character function with pointers that is, I think, legal. I wrote it in a way that neither the procedure pointer call site, nor the called target know about the result length at compile time. The only way to get the result length is for it to be propagated alongside the procedure pointer argument when calling "foo":

gfortran and ifort compile and print the expected:

ab
abcd
module test
contains
subroutine foo(proc_pointer)
  character(*), external, pointer :: proc_pointer
  print *, proc_pointer()
end subroutine

subroutine as_length2()
 character(2), external :: assumed_length
 character(2), external, pointer :: p
 p => assumed_length
 call foo(p)
end subroutine

subroutine as_length4()
 character(4), external :: assumed_length
 character(4), external, pointer :: p
 p => assumed_length
 call foo(p)
end subroutine
end module

character(*) function assumed_length()
 assumed_length = "abcdefghi"
end function

 use test
 call as_length2()
 call as_length4()
end

Given nvfortran, xlf, and nag seem to have various issues with this program, I would not prioritize this feature, but adding a clean todo seems the minimum to me.

jeanPerier added inline comments.Nov 2 2022, 3:04 AM
flang/docs/ProcedurePointer.md
272–274

Mmh, never mind. Reading the fine prints of C723 I read "A function name declared with an asterisk type-param-value shall not be [...], a pointer, [...]", so it might just be illegal to have procedure pointers with assumed length. But flang still lacks a semantic check to enforce this.

peixin updated this revision to Diff 472553.Nov 2 2022, 3:16 AM

Address the comment from @PeteSteinfeld . need -> need to

peixin updated this revision to Diff 472554.Nov 2 2022, 3:17 AM

Add C723 in the description. C721 -> C721 and C723.

flang/docs/ProcedurePointer.md
272–274

Thanks for finding this. I filed one ticket for it. https://github.com/llvm/llvm-project/issues/58757

kiranchandramohan added a comment.EditedNov 2 2022, 5:37 AM

The conversion to LLVM of the procedure-pointer-related operations (emboxproc, hostboxproc,unboxproc) are marked as TODO. The examples only show the usage of emboxproc. Does that mean hostboxproc and unboxproc are not going to be used? Could you discuss their conversions to LLVM as well?
https://github.com/llvm/llvm-project/blob/d839f654586a4f3a84b334fcc2c986343a1d7f98/flang/lib/Optimizer/CodeGen/CodeGen.cpp#L631
https://github.com/llvm/llvm-project/blob/d839f654586a4f3a84b334fcc2c986343a1d7f98/flang/lib/Optimizer/CodeGen/CodeGen.cpp#L2085
https://github.com/llvm/llvm-project/blob/d839f654586a4f3a84b334fcc2c986343a1d7f98/flang/lib/Optimizer/CodeGen/CodeGen.cpp#L3171

flang/docs/ProcedurePointer.md
276

Could you clarify that this will not be part of the initial implementation?

peixin added a comment.Nov 2 2022, 7:23 AM

The conversion to LLVM of the procedure-pointer-related operations (emboxproc, hostboxproc,unboxproc) are marked as TODO. The examples only show the usage of emboxproc. Does that mean hostboxproc and unboxproc are not going to be used? Could you discuss their conversions to LLVM as well?

As I mentioned in the NOTE after Current list of TODOs in code generation, there are any number of possible implementations of the conversion to LLVM for procedure pointer. Actually, using BoxedProcedure pass has already supported most of usage scenarios of procedure pointers, but there are several implementations not finished as I listed there (BoxProcType type conversion, record type with a boxproc type, and fir.global).

My current plan is to use BoxedProcedure pass to complete procedure pointer. But it is free to anyone who want to try to implement it using the conversion of EmboxProc, BoxProc_Host, Len_Param_Index, UnboxProc operations without the need of BoxedProcedure pass.

flang/docs/ProcedurePointer.md
276

I don't get it. This will be implemented, but following the method of processing internal procedure. I may not describe it clearly during the meeting.

The final implementation for both procedure pointer to internal procedure and internal procedure should use heap-resident thunks. That's what the following description says.

kiranchandramohan accepted this revision.Nov 2 2022, 8:24 AM

The conversion to LLVM of the procedure-pointer-related operations (emboxproc, hostboxproc,unboxproc) are marked as TODO. The examples only show the usage of emboxproc. Does that mean hostboxproc and unboxproc are not going to be used? Could you discuss their conversions to LLVM as well?

As I mentioned in the NOTE after Current list of TODOs in code generation, there are any number of possible implementations of the conversion to LLVM for procedure pointer. Actually, using BoxedProcedure pass has already supported most of usage scenarios of procedure pointers, but there are several implementations not finished as I listed there (BoxProcType type conversion, record type with a boxproc type, and fir.global).

My current plan is to use BoxedProcedure pass to complete procedure pointer. But it is free to anyone who want to try to implement it using the conversion of EmboxProc, BoxProc_Host, Len_Param_Index, UnboxProc operations without the need of BoxedProcedure pass.

Thanks @peixin for the explanation. Could you explicitly state in the beginning of the document that you will use/extend the BoxedProcedure pass and thus will not be lowering the procedure-pointer-related options to LLVM?

peixin updated this revision to Diff 472817.Nov 2 2022, 6:06 PM

Thanks @kiranchandramohan. Good point. It's more clear for readers. Add the descriptions as suggested.

This revision was landed with ongoing or failed builds.Nov 2 2022, 6:07 PM
This revision was automatically updated to reflect the committed changes.