This is an archive of the discontinued LLVM Phabricator instance.

[flang] Don't try to run the newly built flang-new when cross compiling
Closed, Public

Authored by mstorsjo on Jul 22 2022, 4:49 AM.

Details

Summary

If CMAKE_CROSSCOMPILING is set, the newly built flang-new executable was cross compiled, so it can't be executed on the build system and can't be used for generating module files.
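
A minimal sketch of the kind of guard this describes, not the actual diff. The flang-new invocation and the FLANG_INTRINSIC_MODULES_DIR and FLANG_SOURCE_DIR variables are taken from the discussion in this review; the MODULES list, the .mod output names, and the module_files target are assumptions made for illustration.

```cmake
# Hypothetical list of intrinsic module sources (illustrative names only).
set(MODULES "iso_c_binding" "iso_fortran_env")

# When cross compiling, flang-new is built for the target system and cannot
# be executed on the build machine, so module generation is skipped entirely.
if(NOT CMAKE_CROSSCOMPILING)
  set(module_outputs "")
  foreach(filename ${MODULES})
    # Assumption: each source produces one .mod file named after it.
    set(output ${FLANG_INTRINSIC_MODULES_DIR}/${filename}.mod)
    add_custom_command(OUTPUT ${output}
      COMMAND ${CMAKE_COMMAND} -E make_directory ${FLANG_INTRINSIC_MODULES_DIR}
      COMMAND flang-new -fc1 -fsyntax-only -module-dir ${FLANG_INTRINSIC_MODULES_DIR}
              ${FLANG_SOURCE_DIR}/module/${filename}.f90
      DEPENDS flang-new ${FLANG_SOURCE_DIR}/module/${filename}.f90
    )
    list(APPEND module_outputs ${output})
  endforeach()
  # Hypothetical target that pulls the generated files into the default build.
  add_custom_target(module_files ALL DEPENDS ${module_outputs})
endif()
```

With a guard like this, a cross-compiled build simply produces no module files, and they have to be supplied some other way.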

Event Timeline

mstorsjo created this revision. Jul 22 2022, 4:49 AM
Herald added projects: Restricted Project, Restricted Project.
Herald added a subscriber: mgorny.
mstorsjo requested review of this revision. Jul 22 2022, 4:49 AM

Hi @mstorsjo, thanks for looking into this! AFAIK, we haven't had anyone either x-compile flang-new or use it to x-compile (we do test --target for X86, AArch64 and PPC, but not extensively). Comments like this indicate that some work is required. I'm OK with this change, but would like to understand a bit better the scenarios for which this is needed. Also, note that without these module files, flang-new becomes rather limited.

mstorsjo added a comment. Edited Jul 22 2022, 9:43 AM

Comments like this indicate that some work is required. I'm OK with this change, but would like to understand a bit better the scenarios for which this is needed.

My toolchain distribution, https://github.com/mstorsjo/llvm-mingw, is quite focused on cross compilation. I start out on Linux, and build clang (and maybe in the future, flang too). Clang can generate code for any enabled target, so this Linux-based clang build can generate code for Windows for i686, x86_64, armv7, aarch64.

To enable flang in that use case, this patch isn’t needed; only D130351 and D130352 are. Those should, in general, allow cross compiling Fortran code from Linux, targeting Windows.

Then secondly, I use this cross compiling toolchain to cross compile LLVM itself, producing clang binaries that will run on Windows. (This allows building Windows-based toolchains from scratch, in a clean environment, entirely independent of any previous Windows toolchain.) To make flang build successfully in that setup, I need this patch and D130350.

Thanks for that pointer - so the cross compilation in flang might not be entirely correct yet. Still, I guess that’s a goal that you’re willing to work towards, at some point, and this would be an initial step.

Also, note that without these module files, flang-new becomes rather limited.

Yep. From the first step, where I build a Linux based flang cross compiler (where these module files are generated), I’d copy those files over to the second stage where I cross compile the toolchain (which I can’t run at that point). So in the end, I’d still have a complete toolchain. I already do the same for all c/c++ runtimes that ship with the toolchain too.
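
A rough sketch of that copy step as a CMake script, under the assumption that the stage 1 module files are usable as-is in the stage 2 toolchain. The script name, the STAGE1_MODULE_DIR and STAGE2_MODULE_DIR variables, and the .mod file extension are all hypothetical choices for illustration, not part of this patch.

```cmake
# copy_modules.cmake (hypothetical helper script). Example invocation:
#   cmake -DSTAGE1_MODULE_DIR=<stage1-module-dir> \
#         -DSTAGE2_MODULE_DIR=<stage2-module-dir> -P copy_modules.cmake

# Both directories must be passed on the command line.
foreach(var STAGE1_MODULE_DIR STAGE2_MODULE_DIR)
  if(NOT DEFINED ${var})
    message(FATAL_ERROR "${var} must be set with -D${var}=<path>")
  endif()
endforeach()

# Collect the module files produced by the natively runnable stage 1 flang-new.
file(GLOB module_files "${STAGE1_MODULE_DIR}/*.mod")
if(NOT module_files)
  message(FATAL_ERROR "No .mod files found in ${STAGE1_MODULE_DIR}")
endif()

# Copy them into the cross-compiled stage 2 toolchain, which cannot
# generate them itself because its flang-new cannot run on the build machine.
file(MAKE_DIRECTORY "${STAGE2_MODULE_DIR}")
file(COPY ${module_files} DESTINATION "${STAGE2_MODULE_DIR}")
```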

Thanks for this very comprehensive reply and for working on this and on llvm-mingw!

Then secondly, I use this cross compiling toolchain to cross compile LLVM itself, producing clang binaries that will run on Windows. (This allows building Windows-based toolchains from scratch, in a clean environment, entirely independent of any previous Windows toolchain.) To make flang build successfully in that setup, I need this patch and D130350.

Right, so alternatively you could replace:

COMMAND flang-new -fc1 -fsyntax-only -module-dir ${FLANG_INTRINSIC_MODULES_DIR}
        ${FLANG_SOURCE_DIR}/module/${filename}.f90

with (added -triple):

COMMAND flang-new -fc1 -triple <windows-triple> -fsyntax-only -module-dir ${FLANG_INTRINSIC_MODULES_DIR}
        ${FLANG_SOURCE_DIR}/module/${filename}.f90

Could this work? Or would it fail, since at this stage flang-new would be cross-compiled for Windows and this CMake COMMAND would be run on the host (e.g. Linux)? Asking hypothetically.

Thanks for that pointer - so the cross compilation in flang might not be entirely correct yet. Still, I guess that’s a goal that you’re willing to work towards, at some point, and this would be an initial step.

I'm not aware of any active development in this area - you are paving the way for Flang, thanks!

Also, note that without these module files, flang-new becomes rather limited.

Yep. From the first step, where I build a Linux based flang cross compiler (where these module files are generated), I’d copy those files over to the second stage where I cross compile the toolchain (which I can’t run at that point). So in the end, I’d still have a complete toolchain. I already do the same for all c/c++ runtimes that ship with the toolchain too.

So instead of re-generating the module files (and c/c++ runtimes) in the 2nd stage, you simply copy them from Stage 1, right? But this way your toolchain will contain module files generated for Linux, i.e. incorrect files, right? What am I missing here? Unless these module files are cross-compiled for Windows in Stage 1?

Also, in Stage 2 CMAKE_CROSSCOMPILING is set, right?

Thanks for this very comprehensive reply and for working on this and on llvm-mingw!

Then secondly, I use this cross compiling toolchain to cross compile LLVM itself, producing clang binaries that will run on Windows. (This allows building Windows-based toolchains from scratch, in a clean environment, entirely independent of any previous Windows toolchain.) To make flang build successfully in that setup, I need this patch and D130350.

Right, so alternatively you could replace:

COMMAND flang-new -fc1 -fsyntax-only -module-dir ${FLANG_INTRINSIC_MODULES_DIR}
        ${FLANG_SOURCE_DIR}/module/${filename}.f90

with (added -triple):

COMMAND flang-new -fc1 -triple <windows-triple> -fsyntax-only -module-dir ${FLANG_INTRINSIC_MODULES_DIR}
        ${FLANG_SOURCE_DIR}/module/${filename}.f90

Could this work? Or would it fail, since at this stage flang-new would be cross-compiled for Windows and this CMake COMMAND would be run on the host (e.g. Linux)? Asking hypothetically.

This would fail indeed. E.g. we'd be running on x86_64 linux and building an aarch64 windows flang-new.exe, so we can't execute it.

Similarly, when we've cross compiled a new clang.exe, we can't use that to build libc++; instead we need a similarly cross compiled libc++ that we manually reassemble into the toolchain.

Also, note that without these module files, flang-new becomes rather limited.

Yep. From the first step, where I build a Linux based flang cross compiler (where these module files are generated), I’d copy those files over to the second stage where I cross compile the toolchain (which I can’t run at that point). So in the end, I’d still have a complete toolchain. I already do the same for all c/c++ runtimes that ship with the toolchain too.

So instead of re-generating the module files (and c/c++ runtimes) in the 2nd stage, you simply copy them from Stage 1, right? But this way your toolchain will contain module files generated for Linux, i.e. incorrect files, right? What am I missing here? Unless these module files are cross-compiled for Windows in Stage 1?

Oh, indeed - if those module files are target dependent, then that wouldn't work (or would seem to work long enough to give strange results for unsuspecting users). Then we'd need some way to cross-build those in stage 1, for use with the generic cross compilation too. Could this be done with something like D130352, i.e. where I can set up cmake with my compiler tools of choice and point it at some subdirectory which would run the necessary steps?

It's not as simple as the diff you suggested above, because when I build the cross compiler, I set it up as a cross compiler for 4 architectures at once (although only 2 of them are relevant for flang afaik, as it doesn't support 32 bit), so we'd need to generate this for N different targets - much like what is done with LLVM_RUNTIME_TARGETS.
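
A rough sketch of what generating the module files once per target might look like in stage 1, where flang-new can actually run. It reuses the -triple invocation suggested above; the FLANG_MODULE_TARGETS and MODULES lists, the example triples, the per-triple output directory layout, and the target name are all hypothetical, and whether the generated files actually differ per target is an open question.

```cmake
# Hypothetical list of target triples to generate module files for.
set(FLANG_MODULE_TARGETS "x86_64-w64-mingw32" "aarch64-w64-mingw32")
# Hypothetical list of intrinsic module sources (illustrative names only).
set(MODULES "iso_c_binding" "iso_fortran_env")

set(all_module_outputs "")
foreach(triple ${FLANG_MODULE_TARGETS})
  # Assumption: one module directory per triple, so targets don't share files.
  set(module_dir ${CMAKE_BINARY_DIR}/modules/${triple})
  foreach(filename ${MODULES})
    set(output ${module_dir}/${filename}.mod)
    add_custom_command(OUTPUT ${output}
      COMMAND ${CMAKE_COMMAND} -E make_directory ${module_dir}
      COMMAND flang-new -fc1 -triple ${triple} -fsyntax-only -module-dir ${module_dir}
              ${FLANG_SOURCE_DIR}/module/${filename}.f90
      DEPENDS flang-new ${FLANG_SOURCE_DIR}/module/${filename}.f90
    )
    list(APPEND all_module_outputs ${output})
  endforeach()
endforeach()

# Hypothetical umbrella target that builds the module files for every triple.
add_custom_target(module_files_all_targets ALL DEPENDS ${all_module_outputs})
```

Keeping a separate directory per triple sidesteps the shared include directory, at the cost of needing the compiler to look up modules by target.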

Also, in Stage 2 CMAKE_CROSSCOMPILING is set, right?

That's correct, yes.

But this way your toolchain will contain module files generated for Linux, i.e. incorrect files, right? What am I missing here? Unless these module files are cross-compiled for Windows in Stage 1?

Oh, indeed - if those module files are target dependent, then that wouldn't work (or would seem to work long enough to give strange results for unsuspecting users). Then we'd need some way to cross-build those in stage 1, for use with the generic cross compilation too.

This is actually more problematic for my setup than I had initially expected. This means you have target-specific code under include. In my toolchain, where there are 4 cross compile targets bundled in the same toolchain, the include directory is shared between all 4 targets, while they only have unique lib directories. Or is there any difference between architectures in the module files, or does only the OS make a difference?

Thanks for the explanation, makes sense! Could you add some comments inline in CMakeLists.txt and in the summary? That would really help! The fact that:

  • CMAKE_CROSSCOMPILING is set ==> "flang-new was cross-compiled and hence cannot be used to generate module files"

is key IMO. This is not obvious unless one is familiar with this CMake var (I wasn't).

But this way your toolchain will contain module files generated for Linux, i.e. incorrect files, right? What am I missing here? Unless these module files are cross-compiled for Windows in Stage 1?

Oh, indeed - if those module files are target dependent, then that wouldn't work (or would seem to work long enough to give strange results for unsuspecting users). Then we'd need some way to cross-build those in stage 1, for use with the generic cross compilation too.

This is actually more problematic for my setup than I had initially expected. This means you have target-specific code under include. In my toolchain, where there are 4 cross compile targets bundled in the same toolchain, the include directory is shared between all 4 targets, while they only have unique lib directories. Or is there any difference between architectures in the module files, or does only the OS make a difference?

I don't know the full answer to this, but the format of module files in Fortran is different for different compilers. In LLVM Flang, these are effectively textual files (see this example). AFAIK, that's not the case for GFortran and may also change in LLVM Flang in the future. As you can see, in LLVM Flang these files are generated during the semantic analysis (-fsyntax-only means "stop after semantic checks"). I've not worked in that area of LLVM Flang enough to know whether anything target-specific happens there. This would be a good question for either Discourse, GitHub or Slack.

Could this be done with something like D130352, i.e. where I can set up cmake with my compiler tools of choice and point it at some subdirectory which would run the necessary steps?

Yes, I think that would be safer long-term. And it's yet another good reason to make the Fortran runtime and module files a separate sub-project 🤔 . But this will require consulting the community.

As Flang is WIP, it might be tricky to get all the necessary answers to design a proper future-proof solution. You might prefer to open a GitHub issue asking about all this (e.g. "Are module files in LLVM Flang platform/target agnostic?") and add a TODO in the code referring to that. Mostly to unblock this.

mstorsjo updated this revision to Diff 448160. Jul 27 2022, 2:17 PM

Added a comment in CMakeLists.txt as requested, and filed a github issue to discuss the compatibility of modules in https://github.com/llvm/llvm-project/issues/56764.

mstorsjo edited the summary of this revision. Jul 27 2022, 2:18 PM
awarzynski accepted this revision. Jul 28 2022, 2:16 AM

LGTM, thank you for working on this and for your patience answering all my questions :)

This revision is now accepted and ready to land. Jul 28 2022, 2:16 AM

the format of module files in Fortran is different for different compilers. In LLVM Flang, these are effectively textual files (see this example). AFAIK, that's not the case for GFortran and may also change in LLVM Flang in the future.

I'm sorry that you don't recognize the format of f18's module files. They are free source form Fortran, and are processed when referenced by USE statements by the same parser and semantic analyzer that are normally applied to regular source files.