For the Itanium ABI, we emit an initializer function for each module. It is
responsible for initializing the module's global variables. Relates to P1874R1.
The initializer has a known mangling and is automatically called from any TU that
imports a module. Since, at present, the importer has no way to determine that an
imported module does not require an initializer, we generate the initializer for
all cases (even when it is empty).
Initializers must be run exactly once, in an order consistent with the import
graph; the current code ensures the run-once property by adding a guard variable.
In the case that a module has no requirement for global initializers, and also does
not import any other modules, we can elide the guard variable.
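
To make the scheme above concrete, here is a rough C++ sketch of what the emitted initializers amount to. It is an illustration only, not the code the patch generates: the function and guard names are hypothetical (real initializers use the ABI-mangled name), the `init_trace` vector exists only so the order can be observed, and it assumes each initializer calls the initializers of its imports before running its own global initialization. A plain `bool` guard is used for brevity.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical trace, only here so the run order is observable.
std::vector<std::string> init_trace;

// Module "Leaf": has global initializers, imports nothing. Guard required.
static bool leaf_guard = false;
void leaf_initializer() {
  if (leaf_guard) return;         // already run on behalf of another importer
  leaf_guard = true;
  init_trace.push_back("Leaf");   // stands in for Leaf's global initializers
}

// Module "Empty": no global initializers and no imports. The function is
// still emitted (importers cannot know it is empty), but the guard is elided.
void empty_initializer() {}

// Module "Mid": imports Leaf and Empty, and has global initializers.
static bool mid_guard = false;
void mid_initializer() {
  if (mid_guard) return;
  mid_guard = true;
  leaf_initializer();             // imported modules are initialized first
  empty_initializer();
  assert(leaf_guard && "imports must be initialized before this module");
  init_trace.push_back("Mid");    // stands in for Mid's global initializers
}

// Two importing TUs: each calls the initializer of every module it imports.
void tu1_global_init() { mid_initializer(); }
void tu2_global_init() { leaf_initializer(); mid_initializer(); }
```

Running both TU-level functions in either order leaves exactly one "Leaf" entry followed by one "Mid" entry in the trace, which is the run-once, import-ordered behaviour the guard variable is there to provide.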
It's not just a 'current implementation' requirement, it's an ABI requirement. Remember, one could generate the interface object file with one compiler and then generate (and consume) the CMIs with a different compiler. This achieves object-file interoperability but does not require CMI compatibility. We have no expectation that any particular compiler implements the optimization you refer to.
Feel free to note this at your discretion.