This is an archive of the discontinued LLVM Phabricator instance.

[flang][runtime] New APIs for copyin/copyout of non-contiguous objects.
Closed, Public

Authored by vzakhari on Oct 20 2022, 1:59 PM.

Details

Summary

The intention is to use these APIs for copyin/copyout of subprogram
arguments at the call sites. Currently, Flang generates loop nests
to do this, and in some corner cases this results in very long
compilation times due to LLVM loop optimizations.

For example, Flang produces 25245 loops for 521.wrf/module_dm.f90.
If we extract the copyin/copyout loops into the runtime, Flang will only
produce 207 loops, and the compilation time may be reduced by 47x.

Given that the copyin/copyout loop nests cannot be fused with other
loop nests, extracting them into runtime functions should not reduce
performance, provided the runtime optimizes the copies along the
leading contiguous dimension.

The implementation will come in separate patches.
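
To illustrate the idea in the summary, here is a minimal sketch of a copy-in/copy-out pair for a two-dimensional, column-major array section. It is not the API added by this patch: `Section2D`, `copyIn`, and `copyOut` are hypothetical names, and the real runtime works on Flang descriptors rather than this simplified struct. The sketch only shows how a runtime routine can replace a generated loop nest and use a bulk copy when the leading dimension is contiguous.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

// Hypothetical description of a 2-D column-major array section.
// Strides are counted in elements, not bytes. This is an illustration
// only and is not the Flang runtime descriptor type.
struct Section2D {
  double *base;             // address of element (0, 0) of the section
  std::size_t extent[2];    // number of elements in each dimension
  std::ptrdiff_t stride[2]; // element distance between neighbors
};

// Copy-in: gather the (possibly non-contiguous) section into a packed
// temporary that can be passed to a callee expecting contiguous data.
std::vector<double> copyIn(const Section2D &s) {
  std::vector<double> temp(s.extent[0] * s.extent[1]);
  if (temp.empty())
    return temp;
  for (std::size_t j = 0; j < s.extent[1]; ++j) {
    const double *src = s.base + static_cast<std::ptrdiff_t>(j) * s.stride[1];
    double *dst = temp.data() + j * s.extent[0];
    if (s.stride[0] == 1) {
      // Leading dimension is contiguous: one bulk copy per column.
      std::memcpy(dst, src, s.extent[0] * sizeof(double));
    } else {
      // Strided leading dimension: fall back to element-wise copies.
      for (std::size_t i = 0; i < s.extent[0]; ++i)
        dst[i] = src[static_cast<std::ptrdiff_t>(i) * s.stride[0]];
    }
  }
  return temp;
}

// Copy-out: scatter the packed temporary back into the original section.
void copyOut(const Section2D &s, const std::vector<double> &temp) {
  assert(temp.size() == s.extent[0] * s.extent[1]);
  if (temp.empty())
    return;
  for (std::size_t j = 0; j < s.extent[1]; ++j) {
    double *dst = s.base + static_cast<std::ptrdiff_t>(j) * s.stride[1];
    const double *src = temp.data() + j * s.extent[0];
    if (s.stride[0] == 1) {
      std::memcpy(dst, src, s.extent[0] * sizeof(double));
    } else {
      for (std::size_t i = 0; i < s.extent[0]; ++i)
        dst[static_cast<std::ptrdiff_t>(i) * s.stride[0]] = src[i];
    }
  }
}
```

With routines like these, the compiler would emit two calls around the subprogram call instead of two loop nests per argument, which is where the reduction in generated loops would come from.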

Event Timeline

vzakhari created this revision. Oct 20 2022, 1:59 PM
Herald added a project: Restricted Project. Oct 20 2022, 1:59 PM
Herald added a subscriber: jdoerfert.
vzakhari requested review of this revision. Oct 20 2022, 1:59 PM

Other than the Windows build issue, LGTM.

flang/include/flang/Runtime/support.h
39

I think the std::size_t usage here is breaking the Windows build. As far as I understand, std::size_t is not defined in <cstdint> (which defines uintmax_t instead), but in <cstddef> and some other headers.
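
As a small illustration of the point above, a header that uses std::size_t can include <cstddef> directly instead of relying on another header pulling it in. The helper `byteCount` below is hypothetical and is not taken from support.h.

```cpp
#include <cassert>
#include <cstddef> // declares std::size_t
#include <cstdint> // declares fixed-width integer types and std::uintmax_t

// Hypothetical helper, used only to show std::size_t in a declaration.
inline std::size_t byteCount(std::size_t elements, std::size_t elementBytes) {
  // Guard against overflow of the multiplication.
  assert(elementBytes == 0 ||
         elements <= static_cast<std::size_t>(-1) / elementBytes);
  return elements * elementBytes;
}
```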

jeanPerier accepted this revision. Oct 21 2022, 7:04 AM
This revision is now accepted and ready to land. Oct 21 2022, 7:04 AM
vzakhari updated this revision to Diff 469710. Oct 21 2022, 11:55 AM

Thank you, Jean! It should be fixed now.

jeanPerier accepted this revision. Oct 24 2022, 12:32 AM
tschuett added inline comments.
flang/include/flang/Runtime/support.h
74

Is uint8_t a bool? Do you need shouldFree to be input and output?

vzakhari added inline comments. Oct 24 2022, 3:55 PM
flang/include/flang/Runtime/support.h
74

It is a boolean output. I did not want to use bool because its size is implementation-defined, so for the purpose of allocating and reading the value of shouldFree in the Flang-generated code I want to rely on a fixed-size integer.
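
The design choice described here can be sketched as follows. The functions `sketchCopyIn` and `sketchRelease` are hypothetical stand-ins and do not reproduce the signatures in support.h. They only show a one-byte output flag that compiler-generated code could allocate and read without knowing the target's sizeof(bool). The "non-contiguous" branch is simplified to a plain byte copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>

// Hypothetical entry point: returns a pointer to contiguous data.
// On return, *shouldFree is 1 if a temporary was allocated and must be
// released by the caller, and 0 if the original storage was returned.
extern "C" void *sketchCopyIn(void *src, std::size_t bytes, int isContiguous,
                              std::uint8_t *shouldFree) {
  *shouldFree = 0;
  if (isContiguous || bytes == 0)
    return src; // nothing to pack, no temporary
  void *temp = std::malloc(bytes);
  if (!temp)
    return nullptr; // allocation failed, nothing to free
  std::memcpy(temp, src, bytes); // stand-in for a strided gather
  *shouldFree = 1;
  return temp;
}

// Hypothetical counterpart: releases the temporary only when the flag is set.
extern "C" void sketchRelease(void *temp, std::uint8_t shouldFree) {
  if (shouldFree)
    std::free(temp);
}
```

Because the flag is a std::uint8_t, the caller can reserve exactly one byte for it and load it as an 8-bit integer on every target.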