This is an archive of the discontinued LLVM Phabricator instance.

Differential D139816

[LTO] Don't generate invalid modules if "LTOPostLink" MD already exists
Abandoned · Public

Authored by Pierre-vh on Dec 12 2022, 1:12 AM.

Details

Reviewers

pcc
arsenm
ostannard
MaskRay
tejohnson
itf
smeenai
steven_wu

Summary

Prevents the LTO library from generating an invalid module when the
"LTOPostLink" module flag (added in D63932) is already present.

Diff Detail

Repository: rG LLVM Github Monorepo

Event Timeline

Pierre-vh created this revision. Dec 12 2022, 1:12 AM

Herald added a project: Restricted Project. Dec 12 2022, 1:12 AM

Herald added subscribers: ormris, steven_wu, hiraditya, inglorion.

Pierre-vh requested review of this revision. Dec 12 2022, 1:12 AM

Herald added a project: Restricted Project. Dec 12 2022, 1:12 AM

Herald added subscribers: llvm-commits, wdng.

Harbormaster completed remote builds in B202508: Diff 482023. Dec 12 2022, 2:18 AM

Can you explain the motivation for this change? In the original change, the flag indicates whether post-link optimization has been done. Overwriting it just hides the actual problem, which is that post-link is run twice (so perhaps one of the runs is done without the full information, and the result is not correct).

In D139816#3988961, @steven_wu wrote:

Can you explain the motivation for this change? In the original change, the flag indicates whether post-link optimization has been done. Overwriting it just hides the actual problem, which is that post-link is run twice (so perhaps one of the runs is done without the full information, and the result is not correct).

I'm not sure if running the post-link optimization twice is an issue in itself. I assumed it wasn't (I don't see why running the passes multiple times would be an issue?) and that's why I proposed this change, but if we determine that running post-link optimizations twice is an issue/something undesirable, then it's fine for me to put a diagnostic or something else (e.g. skip the pipeline if the flag is present) instead.
For instance, a case where this came up involved re-running IR obtained from -save-temps through LLD, which crashed due to the verification error (duplicate flag).

The motivation is just to prevent an avoidable verifier error. Whether we do it by adding the flag more carefully, or by checking for its presence earlier and exiting more elegantly, I don't mind

(cc @arsenm)

Ping + rebase

Harbormaster completed remote builds in B205618: Diff 486184. Jan 4 2023, 3:02 AM

Ping - can someone please take a look?

Gentle ping :)
Can someone please take a quick look?

Pierre-vh added a reviewer: MaskRay. Jan 19 2023, 12:24 AM

Please review

My understanding is that the flag prevents you from doing the following:

1. run the devirtualization pass
2. add some new function
3. run the devirtualization pass again

The LTO devirtualization passes require visibility of all the functions when the passes are run.

In D139816#4080610, @steven_wu wrote:

My understanding is that the flag prevents you from doing the following:

1. run the devirtualization pass
2. add some new function
3. run the devirtualization pass again

The LTO devirtualization passes require visibility of all the functions when the passes are run.

Can the pass cause a miscompilation or even crash if it's run twice?
If yes, then this patch is indeed not the right solution and we should probably emit an error instead.
If no, then I don't think this patch is a bad idea to prevent the crash, but maybe we should print a warning/remark if the flag is already set?

I am not the expert to answer that question. I would think this is assertion-like behavior. It tells you, as a compiler developer, that your pipeline configuration is wrong, in the sense that you should add all the functions before running post-link passes. You should delay the post-link pass when you hit this error, rather than disable the check without a compelling reason.

For example, the legacy LTO pipeline used to hit the same error when you ran ld -r (the linking option in ld64 that produces another object file for later use) and then linked the result with other bitcode objects in the final link. The fix was to not run the post-link passes during ld -r, so they run once, at the end.

In D139816#4083630, @steven_wu wrote:

I am not the expert to answer that question. I would think this is assertion-like behavior. It tells you, as a compiler developer, that your pipeline configuration is wrong, in the sense that you should add all the functions before running post-link passes. You should delay the post-link pass when you hit this error, rather than disable the check without a compelling reason.

For example, the legacy LTO pipeline used to hit the same error when you ran ld -r (the linking option in ld64 that produces another object file for later use) and then linked the result with other bitcode objects in the final link. The fix was to not run the post-link passes during ld -r, so they run once, at the end.

I think this violates a core tenet of being a modular, reusable IR. It shouldn't be wrong to run a pass twice. It certainly shouldn't error by way of verifier error

I think this violates a core tenet of being a modular, reusable IR. It shouldn't be wrong to run a pass twice. It certainly shouldn't error by way of verifier error

There are plenty of verifiers that will error on wrong usage of IR. And there is a reason why the ModuleFlag has a Behavior field and it will just error out when it sees incompatible values. This is conceptually not different.

There are two parts to the changes in this commit. The first adds an API to allow overwriting a module flag, which is generally fine. The second is that LTOPostLink should be overwritten with a new value. I think that goes against the reason it is set the way it is now. I am against it because:

1. You didn't explain why this should not be fixed in the toolchain setup, as was done for the legacy LTO API. Any insight into the actual problem you ran into would help evaluate the situation.
2. I think overwriting it basically kills its function. In that case, it should just be removed.

This flag just seems to be used to tell GlobalDCE it's running Post-Link. It's not used to guard against multiple runs of the pass.
I also agree with Matt that running a pass or even a pipeline twice shouldn't result in a verifier error or an assertion failure. If the workflow is indeed bad, then diagnostics are a much better way to communicate that to the user. (Or just skip the pipeline and emit a warning on the second run if it's harmless.)

In this particular case I see no compelling argument why this pipeline can't run multiple times on the same output, so it should work. Whether it should work _and_ run the passes a second time is another question (instead of this, we can add another flag to skip the pipeline on the second+ run if there's a good reason to)

arsenm added inline comments. Jan 30 2023, 3:31 AM

llvm/lib/LTO/LTOCodeGenerator.cpp
632	Do you really need a new method? Can you just use ModFlagBehavior::Override?

Pierre-vh marked an inline comment as done. Jan 30 2023, 6:55 AM

Pierre-vh added inline comments.

llvm/lib/LTO/LTOCodeGenerator.cpp
632	I tried using `addModuleFlag` with override and got `module flag identifiers must be unique (or of 'require' type)`. Note that the new function is just for convenience. I could also just call `setModuleFlag` and create a ConstantInt/ConstantAsMetadata here, but it'd be uglier.
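
To illustrate why switching the behavior to Override does not avoid the diagnostic quoted above, here is a minimal, self-contained sketch. It is a toy model, not LLVM's verifier: `Flag`, `Behavior`, and `flagsAreUnique` are hypothetical names, and the only rule modeled is the one the diagnostic states (identifiers must be unique unless the entry is of 'require' type).

```cpp
#include <cstdint>
#include <set>
#include <string>
#include <vector>

// Toy stand-ins, NOT LLVM's real classes: a flag is (behavior, key, value).
// The numeric values follow the module flag behaviors used in the tests
// (i32 1 is Error); everything else here is an assumption for illustration.
enum class Behavior : uint32_t { Error = 1, Require = 3, Override = 4 };

struct Flag {
  Behavior B;
  std::string Key;
  uint32_t Val;
};

// Hypothetical check modeled on the quoted diagnostic: every key that is not
// of 'require' type may appear at most once, whatever its behavior is.
inline bool flagsAreUnique(const std::vector<Flag> &Flags) {
  std::set<std::string> Seen;
  for (const Flag &F : Flags) {
    if (F.B == Behavior::Require)
      continue; // 'require' entries are exempt from the uniqueness rule
    if (!Seen.insert(F.Key).second)
      return false; // the same key is listed twice in one module
  }
  return true;
}
```

In this model, a list holding "LTOPostLink" once passes the check, while appending a second "LTOPostLink" entry fails it even when the second entry uses `Behavior::Override`, because the check looks at keys within one module, not at how flags merge across modules.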

Gentle ping :)

Ping

In D139816#4083672, @steven_wu wrote:

I think this violates a core tenet of being a modular, reusable IR. It shouldn't be wrong to run a pass twice. It certainly shouldn't error by way of verifier error

There are plenty of verifiers that will error on wrong usage of IR. And there is a reason why the ModuleFlag has a Behavior field and it will just error out when it sees incompatible values. This is conceptually not different.

This is conceptually very different. Verifier errors are for malformed IR by construction. It's for catching API errors and bugs in compiler passes, not tool usage. By running any combination of tools or passes, it should not be possible to hit a verifier error.

I don't really see the reason to have the verifier check this at all; can we just drop it? I can see the use of emitting an informative module flag, but this doesn't need to be semantically enforced. It is still possible to introduce new code after this point in ways that aren't wrong.

steven_wu resigned from this revision. Feb 28 2023, 9:33 AM

Ping

arsenm added reviewers: tejohnson, akyrtzi, itf, smeenai. Jun 22 2023, 7:13 AM

From the semantic meaning of this module flag as described in D63932, and from its usage, it seems it should not block IR from being sent through the LTO pipeline multiple times.

Adding a new interface works, but if you do that please add a test of it to llvm/unittests/IR/ModuleTest.cpp (see where the other setModuleFlag interface is tested there). The other alternative would be to simply guard the addModuleFlag with a check of getModuleFlag("LTOPostLink") returning null (and if non-null assert that the value is '1').
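
The guard described above can be sketched as follows. This is a toy model under stated assumptions, not LLVM code: `ToyModule`, `getFlag`, `addFlag`, and `markPostLink` are hypothetical stand-ins for the module and for the `getModuleFlag`/`addModuleFlag` calls mentioned in the comment.

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <utility>
#include <vector>

// Toy model, NOT LLVM's Module: flags are kept as an ordered key/value list.
struct ToyModule {
  std::vector<std::pair<std::string, uint32_t>> Flags;

  // Returns a pointer to the value of the first flag named Key, or nullptr
  // when no such flag exists.
  const uint32_t *getFlag(const std::string &Key) const {
    for (const auto &KV : Flags)
      if (KV.first == Key)
        return &KV.second;
    return nullptr;
  }

  // Unconditional append: calling this twice with one key creates duplicates.
  void addFlag(const std::string &Key, uint32_t Val) {
    Flags.emplace_back(Key, Val);
  }
};

// Hypothetical helper: add "LTOPostLink" only when it is absent, and assert
// that an existing entry already holds the expected value 1.
inline void markPostLink(ToyModule &M) {
  if (const uint32_t *Existing = M.getFlag("LTOPostLink")) {
    assert(*Existing == 1 && "unexpected LTOPostLink value");
    (void)Existing; // avoids an unused-variable warning when asserts are off
    return;
  }
  M.addFlag("LTOPostLink", 1);
}
```

Calling `markPostLink` twice on the same `ToyModule` leaves a single "LTOPostLink" entry, whereas two raw `addFlag` calls would leave two.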

After looking at the use of this module flag again, I think it would actually be better to remove it completely and replace it by passing a flag down to the GlobalDCEPass constructor when it is called from PassBuilder::buildLTODefaultPipeline (looks like there are a few calls there). That is the more typical way of communicating this information.
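
As a rough illustration of that alternative, the sketch below passes the LTO phase to a pass through its constructor instead of reading it from the module. It is a toy model: `ToyGlobalDCE`, `mayUseWholeProgramInfo`, and the two builder functions are hypothetical names, not the actual GlobalDCEPass or PassBuilder interfaces.

```cpp
// Toy stand-in for a pass, NOT LLVM's GlobalDCEPass.
class ToyGlobalDCE {
  bool InLTOPostLink; // supplied by whoever builds the pipeline

public:
  explicit ToyGlobalDCE(bool InLTOPostLink = false)
      : InLTOPostLink(InLTOPostLink) {}

  // Work that needs the whole program is enabled by the constructor argument,
  // so the module itself carries no phase marker that could be duplicated.
  bool mayUseWholeProgramInfo() const { return InLTOPostLink; }
};

// Hypothetical pipeline builders: only the LTO pipeline sets the parameter.
inline ToyGlobalDCE buildDefaultPipelinePass() { return ToyGlobalDCE(); }

inline ToyGlobalDCE buildLTOPipelinePass() {
  return ToyGlobalDCE(/*InLTOPostLink=*/true);
}
```

With this shape, sending the same IR through the LTO pipeline a second time changes nothing in the module's flags, since the phase information lives in the pipeline configuration.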

akyrtzi removed a reviewer: akyrtzi. Jun 22 2023, 8:52 AM

Removing new APIs, use existing ones

I'm not sure about removing the flag yet. It's present in a lot of tests, and the intention with this patch is to be almost NFC and just fix an edge-case compiler crash.

Harbormaster completed remote builds in B240727: Diff 533907. Jun 23 2023, 4:31 AM

arsenm added inline comments. Jun 23 2023, 6:15 AM

llvm/test/LTO/ARM/lto-linking-metadata-already-present.ll
2	Might as well use llvm-as for this part

In D139816#4443836, @Pierre-vh wrote:

Removing new APIs, use existing ones

I'm not sure about removing the flag yet. It's present in a lot of tests, and the intention with this patch is to be almost NFC and just fix an edge-case compiler crash.

All but 2 of the test uses are just incidental. Only 2 tests fail when this is removed - one is obsolete if the flag is removed (it was specifically just to test that the flag gets generated), and the other just needs to switch to using the LTO default pipeline. I have a small fix to GlobalDCEPass to get this passed down from the pass manager, I'll send it for review once I rerun all the tests.

tejohnson mentioned this in D153655: [LTO][GlobalDCE] Use pass parameter instead of module flag for LTO phase. Jun 23 2023, 11:43 AM

In D139816#4444791, @tejohnson wrote:

In D139816#4443836, @Pierre-vh wrote:

Removing new APIs, use existing ones

I'm not sure about removing the flag yet. It's present in a lot of tests, and the intention with this patch is to be almost NFC and just fix an edge-case compiler crash.

All but 2 of the test uses are just incidental. Only 2 tests fail when this is removed - one is obsolete if the flag is removed (it was specifically just to test that the flag gets generated), and the other just needs to switch to using the LTO default pipeline. I have a small fix to GlobalDCEPass to get this passed down from the pass manager, I'll send it for review once I rerun all the tests.

D153655

Can the summary be changed to summarize the useful prior discussions?

In D139816#4445502, @MaskRay wrote:

Can the summary be changed to summarize the useful prior discussions?

Are you talking about this patch (which is about to be obsolete after D153655), or D153655? If the latter, I did summarize the issue there in its summary, but pointed back to this one for full context.

tejohnson mentioned this in rG200cc952a28a: [LTO][GlobalDCE] Use pass parameter instead of module flag for LTO phase. Jun 23 2023, 5:05 PM

Pierre-vh abandoned this revision. Jun 26 2023, 12:10 AM

Revision Contents

Path                                                        Size
llvm/include/llvm/IR/Module.h                               2 lines
llvm/lib/IR/Module.cpp                                      11 lines
llvm/lib/LTO/LTO.cpp                                        2 lines
llvm/lib/LTO/LTOCodeGenerator.cpp                           2 lines
llvm/test/LTO/ARM/lto-linking-metadata-already-present.ll   24 lines
llvm/test/LTO/ARM/lto-linking-metadata-overwrite.ll         26 lines

Diff 486184

llvm/include/llvm/IR/Module.h

@@ /// @{
   /// Add a module-level flag to the module-level flags metadata. It will create
   /// the module-level flags named metadata if it doesn't already exist.
   void addModuleFlag(ModFlagBehavior Behavior, StringRef Key, Metadata *Val);
   void addModuleFlag(ModFlagBehavior Behavior, StringRef Key, Constant *Val);
   void addModuleFlag(ModFlagBehavior Behavior, StringRef Key, uint32_t Val);
   void addModuleFlag(MDNode *Node);
   /// Like addModuleFlag but replaces the old module flag if it already exists.
   void setModuleFlag(ModFlagBehavior Behavior, StringRef Key, Metadata *Val);
+  void setModuleFlag(ModFlagBehavior Behavior, StringRef Key, Constant *Val);
+  void setModuleFlag(ModFlagBehavior Behavior, StringRef Key, uint32_t Val);

   /// @}
   /// @name Materialization
   /// @{

   /// Sets the GVMaterializer to GVM. This module must not yet have a
   /// Materializer. To reset the materializer for a module that already has one,
   /// call materializeAll first. Destroying this module will destroy

llvm/lib/IR/Module.cpp

@@ for (unsigned I = 0, E = ModFlags->getNumOperands(); I != E; ++I) {
     if (isValidModuleFlag(*Flag, MFB, K, V) && K->getString() == Key) {
       Flag->replaceOperandWith(2, Val);
       return;
     }
   }
   addModuleFlag(Behavior, Key, Val);
 }

+void Module::setModuleFlag(ModFlagBehavior Behavior, StringRef Key,
+                           Constant *Val) {
+  setModuleFlag(Behavior, Key, ConstantAsMetadata::get(Val));
+}
+
+void Module::setModuleFlag(ModFlagBehavior Behavior, StringRef Key,
+                           uint32_t Val) {
+  Type *Int32Ty = Type::getInt32Ty(Context);
+  setModuleFlag(Behavior, Key, ConstantInt::get(Int32Ty, Val));
+}
+
 void Module::setDataLayout(StringRef Desc) {
   DL.reset(Desc);
 }

 void Module::setDataLayout(const DataLayout &Other) { DL = Other; }

 const DataLayout &Module::getDataLayout() const { return DL; }

llvm/lib/LTO/LTO.cpp

@@ for (const auto &R : GlobalResolutions) {
       if (!GV || GV->hasLocalLinkage() || GV->isDeclaration())
         continue;
       GV->setUnnamedAddr(R.second.UnnamedAddr ? GlobalValue::UnnamedAddr::Global
                                               : GlobalValue::UnnamedAddr::None);
       if (EnableLTOInternalization && R.second.Partition == 0)
         GV->setLinkage(GlobalValue::InternalLinkage);
     }

-    RegularLTO.CombinedModule->addModuleFlag(Module::Error, "LTOPostLink", 1);
+    RegularLTO.CombinedModule->setModuleFlag(Module::Error, "LTOPostLink", 1);

     if (Conf.PostInternalizeModuleHook &&
         !Conf.PostInternalizeModuleHook(0, *RegularLTO.CombinedModule))
       return finalizeOptimizationRemarks(std::move(DiagnosticOutputFile));
   }

   if (!RegularLTO.EmptyCombinedModule || Conf.AlwaysEmitRegularLTOObj) {
     if (Error Err =

llvm/lib/LTO/LTOCodeGenerator.cpp

@@ bool LTOCodeGenerator::optimize() {
   // We always run the verifier once on the merged module, the `DisableVerify`
   // parameter only applies to subsequent verify.
   verifyMergedModuleOnce();

   // Mark which symbols can not be internalized
   this->applyScopeRestrictions();

   // Write LTOPostLink flag for passes that require all the modules.
-  MergedModule->addModuleFlag(Module::Error, "LTOPostLink", 1);
+  MergedModule->setModuleFlag(Module::Error, "LTOPostLink", 1);

arsenm (inline comment): Do you really need a new method? Can you just use ModFlagBehavior::Override?

Pierre-vh (inline comment): I tried using `addModuleFlag` with override and got `module flag identifiers must be unique (or of 'require' type)`. Note that the new function is just for convenience. I could also just call `setModuleFlag` and create a ConstantInt/ConstantAsMetadata here, but it'd be uglier.

   // Add an appropriate DataLayout instance for this module...
   MergedModule->setDataLayout(TargetMach->createDataLayout());

   if (!SaveIRBeforeOptPath.empty()) {
     std::error_code EC;
     raw_fd_ostream OS(SaveIRBeforeOptPath, EC, sys::fs::OF_None);
     if (EC)

llvm/test/LTO/ARM/lto-linking-metadata-already-present.ll

This file was added.

; RUN: opt %s -o %t1.bc

arsenm (inline comment): Might as well use llvm-as for this part

; RUN: llvm-lto %t1.bc -o %t1.save.opt -save-linked-module -save-merged-module -O1 --exported-symbol=foo
; RUN: llvm-dis < %t1.save.opt.merged.bc | FileCheck %s

; RUN: llvm-lto2 run %t1.bc -o %t.out.o -save-temps \
; RUN: -r=%t1.bc,foo,pxl
; RUN: llvm-dis < %t.out.o.0.2.internalize.bc | FileCheck %s

; Tests that LTO won't add LTOPostLink twice.

target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
target triple = "armv7a-unknown-linux"

define void @foo() {
entry:
  ret void
}

!llvm.module.flags = !{!1}
!1 = !{i32 1, !"LTOPostLink", i32 1}

; CHECK: !llvm.module.flags = !{[[MD_NUM:![0-9]+]]}
; CHECK: [[MD_NUM]] = !{i32 1, !"LTOPostLink", i32 1}

llvm/test/LTO/ARM/lto-linking-metadata-overwrite.ll

This file was added.

; RUN: opt %s -o %t1.bc

; RUN: llvm-lto %t1.bc -o %t1.save.opt -save-linked-module -save-merged-module -O1 --exported-symbol=foo
; RUN: llvm-dis < %t1.save.opt.merged.bc | FileCheck %s

; RUN: llvm-lto2 run %t1.bc -o %t.out.o -save-temps \
; RUN: -r=%t1.bc,foo,pxl
; RUN: llvm-dis < %t.out.o.0.2.internalize.bc | FileCheck %s

; Tests that LTO won't add LTOPostLink twice and will overwrite
; the existing flag with the correct value.

target datalayout = "e-m:e-p:32:32-Fi8-i64:64-v128:64:128-a:0:32-n32-S64"
target triple = "armv7a-unknown-linux"

define void @foo() {
entry:
  ret void
}

!llvm.module.flags = !{!1}
!1 = !{i32 1, !"LTOPostLink", i32 0}

; CHECK: !llvm.module.flags = !{[[MD_NUM:![0-9]+]]}
; CHECK: [[MD_NUM]] = !{i32 1, !"LTOPostLink", i32 1}
