This is an archive of the discontinued LLVM Phabricator instance.

[flang/mlir] Fix "file too big" build error on CYGWIN.
ClosedPublic

Authored by carlo-bramini on Sep 2 2023, 6:38 AM.

Details

Summary

Attached patch fixes issues #63582 and #57718 when building my port to CYGWIN of llvm-project.
https://github.com/llvm/llvm-project/issues/63582
https://github.com/llvm/llvm-project/issues/57718

I have not seen this error also on LLVM as described in issue #51143, but perhaps this patch could be used for fixing it, since it is the same problem.
https://github.com/llvm/llvm-project/issues/51143

Diff Detail

Event Timeline

carlo-bramini created this revision.Sep 2 2023, 6:38 AM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptSep 2 2023, 6:38 AM
carlo-bramini requested review of this revision.Sep 2 2023, 6:38 AM
mstorsjo added inline comments.Sep 2 2023, 11:49 AM
flang/CMakeLists.txt
339 ↗(On Diff #555590)

As far as I know, you don't usually need this flag with MSVC - both MSVC and Clang/LLVM automatically enable the bigobj mode when the object file becomes too large for the regular form.

I still don't understand the whole patch.
The compilation error either comes from the compiler or the OS/Platform.
I don't understand CYGWIN+MSVC?!
How about building on MinGW for MinGW?
I think you should replace if(CYGWIN) by if(CYGWIN OR WIN32)

Also the mlir options should inculded after LLVM_CMAKE_DIR get included in STANDALONE mode.

I still don't understand the whole patch.
The compilation error either comes from the compiler or the OS/Platform.
I don't understand CYGWIN+MSVC?!
How about building on MinGW for MinGW?
I think you should replace if(CYGWIN) by if(CYGWIN OR WIN32)

Excuse me, please check the post here:

Unfortunately, I don't know if this applies also to WIN32.
I think that it does, but unfortunately I cannot test it.
At least in my opinion, it should be verified by people that found this issue on WIN32, actually.

flang/CMakeLists.txt
339 ↗(On Diff #555590)

I don't know about Clang/LLVM, but it doesn't seem to me that MSVC changes automatically the output format.
By default, MSVC has a limit of 2^16 sections into an object file and if you exceeded this limit, then error C1128 will be emitted:
https://learn.microsoft.com/en-us/cpp/error-messages/compiler-errors-1/fatal-error-c1128?view=msvc-170
According to the documentation, this option is automatically enabled only for UWP projects, otherwise you need to manually add it:
https://learn.microsoft.com/en-us/cpp/build/reference/bigobj-increase-number-of-sections-in-dot-obj-file?view=msvc-170

You didn't answer my questions. I haven't find anything related to them in the post linked above!

You didn't answer my questions. I haven't find anything related to them in the post linked above!

I'm sorry, probably it was not explained correctly. I try to do better.

I still don't understand the whole patch.
The compilation error either comes from the compiler or the OS/Platform.

The build error is emitted by the compiler, when it reaches a limit of the file format used by the object files.
This is the explanation written at MSDN:
https://learn.microsoft.com/en-us/cpp/build/reference/bigobj-increase-number-of-sections-in-dot-obj-file?view=msvc-170

I don't understand CYGWIN+MSVC?!

Although it may not be so usual, you can use Visual Studio with CYGWIN.
Actually, I regularly use it and I would say that it is the best tool for writing, compiling and debugging applications with CYGWIN (personal opinion, of course).
However, adding MSVC inside the patch was driven by a different intent from me.
As described into the first post of this topic, some issues about this compile time error already exists, but for a different environment.
The target of this patch is fixing support for CYGWIN, but in my opinion it could be used also for fixing MinGW.
All you need to do is splitting the new code, by leaving the addition of _GNU_SOURCE alone and adding an if (CYGWIN OR WIN32), as explained into the comment that I signaled.
The result code would be something like this:

if (CYGWIN)
  # On most UNIX like platforms g++ and clang++ define _GNU_SOURCE as a
  # predefined macro, which turns on all of the wonderful libc extensions.
  # However g++ doesn't do this in Cygwin so we have to define it ourselfs
  # since we depend on GNU/POSIX/BSD extensions.
  add_definitions(-D_GNU_SOURCE=1)
endif()

if (CYGWIN OR WIN32)
  # Compiling large source files that use templates intensively, you may
  # receive the "File too big/too many sections" error. For MSVC you can
  # use the /bigobj flag while GNU Assembler provides the -mbig-obj flag.
  if( MSVC )
    append("/bigobj" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
  endif()
  if ( LLVM_COMPILER_IS_GCC_COMPATIBLE )
    append("-Wa,-mbig-obj" CMAKE_C_FLAGS CMAKE_CXX_FLAGS)
  endif( LLVM_COMPILER_IS_GCC_COMPATIBLE )
endif()

How about building on MinGW for MinGW?

Unfortunately, I don't know.
I have not tested a fully native build.
By reading issue #63582, #51143 and #57718, CMake probably uses "Mingw Makefiles" generator, while I'm using "Unix Makefiles".
However, CYGWIN also provides cross compilers for i686-w64-mingw32 and x86_64-w64-mingw32, probably a test could be done but it won't be exactly the same thing.
The solution may work since it is an error strictly related to the compiler and the generated code, but actually the testing conditions would be different.
I was not sure that the fix for CYGWIN could be also accepted for WIN32, that's why it is written as it is, to be used only for CYGWIN, by leaving the addition of if (CYGWIN OR WIN32) to the authors of the above issues.

I think you should replace if(CYGWIN) by if(CYGWIN OR WIN32)

Perhaps, I could do it anyways... according to the issues, it doesn't already work right now, if it doesn't fix the error, at worst it won't change anything.
I was not just sure that it could be accepted since it was not tested.

I hope that this explains.
Thank you very much for your time.

Also the mlir options should inculded after LLVM_CMAKE_DIR get included in STANDALONE mode.

I'm sorry, I have not understood what you mean here.

Also the mlir options should inculded after LLVM_CMAKE_DIR get included in STANDALONE mode.

I'm sorry, I have not understood what you mean here.

There are two ways to build MLIR, As part of the whole project (with llvm, clang...) or standalone (where llvm tools should be preinstalled). Your patch breaks the later because 'append' command is unknown to cmake unless llvm_cmake_dir get included (lines after)

MehdiChinoune added a comment.EditedSep 4 2023, 8:58 AM

Although it may not be so usual, you can use Visual Studio with CYGWIN.
Actually, I regularly use it and I would say that it is the best tool for writing, compiling and debugging applications with CYGWIN (personal opinion, of course).

It seems like you don't you understand what do CYGWIN and MSVC mean in CMake.
Please read cmake docs.

Prove me wrong and test this code in cmake:

if( CYGWIN AND MSVC )
message( STATUS "CYGWIN AND MSVC" )
endif()
mstorsjo added inline comments.Sep 8 2023, 3:45 AM
flang/CMakeLists.txt
339 ↗(On Diff #555590)

You're right, I misremembered - MSVC does indeed error out and need an explicit /bigobj flag. And for Clang (either in mingw or MSVC mode), the bigobj flags -Wa,-mbig-obj or /bigobj are both no-ops.

With that in mind, this is probably fine, but I wouldn't enclose the bigobj flag handling, especially if you're adding it for MSVC too, in an outer if (CYGWIN), I'd rather make it if (WIN32 OR CYGWIN) which covers all of them, then have the MSVC vs LLVM_COMPILER_IS_GCC_COMPATIBLE inside of that.

Prove me wrong and test this code in cmake:

Well, this is not the way I'm using MSVC for building CYGWIN applications but perhaps you could do it by using a toolchain script file. Probably, with a simple implementation, CMake won't be even able to configure because MSDOS and UNIX paths have different syntax, which can be bypassed by writing something similar to VCPKG so, in any case, this wouldn't be so immediate. But it seems it is no more important.

In the last days, I compiled llvm-project with MinGW-w64 from CYGWIN.
This is a cross-compilation actually, so this is the toolchain file used for configuring CMake.
I made it available for reference here, if somebody will want to use this build environment:

# The name of the target operating system.
SET(CMAKE_SYSTEM_NAME Windows)

# Force the version of the system.
#SET(CMAKE_SYSTEM_VERSION 10)

# The classical MinGW-32 (http://www.mingw.org/)
#SET(TOOLCHAIN_PREFIX "i586-mingw32msvc")

# 32 or 64 bits MinGW-w64 (http://mingw-w64.sourceforge.net/)
#SET(TOOLCHAIN_PREFIX "i686-w64-mingw32")
SET(TOOLCHAIN_PREFIX "x86_64-w64-mingw32")

# Toolchain path prefix (CYGWIN cross compiler)
SET(COMPILER_PREFIX ${TOOLCHAIN_PREFIX}/sys-root/mingw)

# Toolchain path prefix (standard prefix)
#SET(COMPILER_PREFIX ${TOOLCHAIN_PREFIX})

SET(PKG_CONFIG_EXECUTABLE ${TOOLCHAIN_PREFIX}-pkg-config)

# Compilers to use for C and C++
FIND_PROGRAM(CMAKE_C_COMPILER NAMES ${TOOLCHAIN_PREFIX}-gcc)
FIND_PROGRAM(CMAKE_CXX_COMPILER NAMES ${TOOLCHAIN_PREFIX}-g++)
FIND_PROGRAM(CMAKE_RC_COMPILER NAMES ${TOOLCHAIN_PREFIX}-windres)

# Path to the target environment
SET(CMAKE_FIND_ROOT_PATH /usr/${COMPILER_PREFIX})

# Adjust the default behaviour of the FIND_XXX() commands:
# search headers and libraries in the target environment, search 
# programs in the host environment
SET(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
SET(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)
SET(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)

#SET(CMAKE_SYSTEM_PROCESSOR x86)
SET(CMAKE_SYSTEM_PROCESSOR x64)

This is made for building for 64bit, but it possible to compile for 32bit by changing some comments inside.

With MinGW-w64, the sources were compiled successfully, no more "too many sections" and "File too big" errors.
I got other errors and I have seen some strange things when doing this cross compilation, but not the ones signaled into the above issues.
Running Ninja in verbose mode, I discovered that "-Wa,-mbig-obj" is already feed into the flags.
Searching into the sources, I found this:

https://github.com/llvm/llvm-project/blob/88348252a60e22dacde9cf92a0767e9f89c2e287/llvm/cmake/modules/HandleLLVMOptions.cmake#L466

Here, the "-Wa,-mbig-obj" option is added and here, few lines above this point:

https://github.com/llvm/llvm-project/blob/88348252a60e22dacde9cf92a0767e9f89c2e287/llvm/cmake/modules/HandleLLVMOptions.cmake#L461

there is the if statement that allows to add it.
So, I replaced it with:

elseif(MINGW OR CYGWIN)

I recompiled the sources after this change and it worked fine also on CYGWIN.
So, about these issues:

https://github.com/llvm/llvm-project/issues/63582
https://github.com/llvm/llvm-project/issues/57718
https://github.com/llvm/llvm-project/issues/51143

it looks like that they were already fixed, but nobody tested and/or closed them. Is it possible?
I removed the definition of _GNU_SOURCE, which was a copy/paste from here:

https://github.com/llvm/llvm-project/blob/53602e6193d98b1e814b76a27e99ef7e18c9769c/third-party/benchmark/CMakeLists.txt#L230

I will add it into a different patch, since it is probably off-topic here.
About the existing "-Wa,-mbig-obj" option into llvm/cmake/modules/HandleLLVMOptions.cmake, I don't know if it is good for both normal and standalone builds, but if it had been accepted for MINGW, I expect that it would be ok also for CYGWIN.

It also seems to me that this global "-Wa,-mbig-obj" increased the build time of llvm.
From 3 hours, now it required about 4 hours for completing the build on my PC, but perhaps it's just normal.

Herald added a project: Restricted Project. · View Herald TranscriptSep 12 2023, 2:19 AM
mstorsjo accepted this revision.Sep 12 2023, 2:30 AM

This version of the change looks good to me, thanks! And here we also see that we're already enabling /bigobj for MSVC on line 587.

For landing your patch, what's youre preferred git author line, Real Name <email@address>?

This revision is now accepted and ready to land.Sep 12 2023, 2:30 AM

For landing your patch, what's youre preferred git author line, Real Name <email@address>?

Real name: Carlo Bramini
email: carlo_bramini@users.sourceforge.net

BTW: excuse me, what about the 3 issues #63582, #57718 and #51143?
I was not able to reproduce those bugs with MinGW-w64, actually.

For landing your patch, what's youre preferred git author line, Real Name <email@address>?

Real name: Carlo Bramini
email: carlo_bramini@users.sourceforge.net

Ok, I'll try to land the change later today.

BTW: excuse me, what about the 3 issues #63582, #57718 and #51143?
I was not able to reproduce those bugs with MinGW-w64, actually.

I haven't run into those either. Looks like I've commented on https://github.com/llvm/llvm-project/issues/57718 at least. Not sure if that's caused by the same (too many sections, fixed by -Wa,-mbig-obj) or if it's a case where the string table exceeds 10 MB, which LLVM/MSVC handles fine but I think binutils hasn't implemented. If it is the latter, there's no fix other than implementing the >10 MB string table offset handling in binutils, or changing the build configuration to produce less data (e.g. not building in debug mode perhaps?).

This revision was landed with ongoing or failed builds.Sep 14 2023, 2:42 AM
This revision was automatically updated to reflect the committed changes.