This is an archive of the discontinued LLVM Phabricator instance.

[WIP][llvm] A Unified LTO Bitcode Frontend
ClosedPublic

Authored by ormris on Apr 14 2022, 9:39 AM.

Details

Summary

Here's a high level summary of the changes in this patch. For more information
on rational, see the RFC (https://discourse.llvm.org/t/rfc-a-unified-lto-bitcode-frontend/61774).

  • Add config parameter to LTO backend, specifying which LTO mode is desired when using unified LTO.
  • Add unified LTO flag to the summary index for efficiency. Unified LTO modules can be detected without parsing the module.
  • Make sure that the ModuleID is generated by incorporating more types of symbols.

Diff Detail

Event Timeline

ormris created this revision.Apr 14 2022, 9:39 AM
ormris requested review of this revision.Apr 14 2022, 9:39 AM
ormris edited the summary of this revision. (Show Details)Apr 14 2022, 9:48 AM
ormris edited the summary of this revision. (Show Details)Apr 14 2022, 10:16 AM
ormris edited the summary of this revision. (Show Details)
ychen added a subscriber: ychen.Apr 14 2022, 11:06 AM
mehdi_amini added inline comments.Apr 15 2022, 10:43 AM
llvm/lib/Passes/PassBuilder.cpp
1144

It is concerning to me that we add one mode different code path / behavior to maintain instead of unifying everything.

If UnifiedLTO is able to use the LTO pipeline effectively, what would be the reason for ThinLTO to not align?

tejohnson added inline comments.Apr 15 2022, 11:23 AM
llvm/lib/Passes/PassBuilder.cpp
1144

If UnifiedLTO is able to use the LTO pipeline effectively, what would be the reason for ThinLTO to not align?

Perhaps it can eventually, but I would not want to make a major change to the ThinLTO pipelines without a lot of experimentation. I don't personally have the bandwidth to do that right now, but if this was in as an alternative mode under an option, it could be done more easily at some point on a wider range of applications. I'd be concerned for example of side effects on importing behavior which is based on instruction count thresholds.

mehdi_amini added inline comments.Apr 15 2022, 11:27 AM
llvm/lib/Passes/PassBuilder.cpp
1144

Right, but your objection is exactly the root of my concerned with this new mode in the first place right now.

tejohnson added inline comments.Apr 15 2022, 11:41 AM
llvm/lib/Passes/PassBuilder.cpp
1144

Yeah, it isn't ideal to have added complexity, but I do understand the different constraints. The new mode seems to work well enough for Sony's needs, but for users such as mine at Google that want to maximize performance from ThinLTO, it may not be the best approach (or may be ok, but needs to be carefully evaluated). Unfortunately, I don't have a good immediate solution to balancing those two sets of needs at the moment, other than supporting different modes.

I wonder if we can get partly to a more common approach but just have a flag to switch between the different pass managers in the pre and post LTO optimization pipelines. I haven't had a chance to look closely at the patches yet, but my sense is that the other major change is enabling "split" LTO bitcode files always, for which I don't yet have a good understanding of the implications. I'll try to spend some time looking at the patches in more detail in the next few days.

tejohnson added inline comments.Apr 15 2022, 1:46 PM
llvm/lib/Passes/PassBuilder.cpp
1144

Per discussion on the RFC, the unified LTO mode added here requires split thin/regular LTO units. This is not something we have been able to use internally because of the scalability of the regular LTO portions. So we will need to keep the usual "pure" ThinLTO mode operational.

tejohnson added inline comments.Apr 18 2022, 10:05 AM
llvm/lib/LTO/LTO.cpp
739

Why is this needed?

1128

Why is this needed?

1145

Why is this needed?

1544

Why is this needed?

llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
212

Rather than adding the many checks in the file below, can the Perform* and PrepareFor* options just be initialized differently under the UnifiedLTO mode?

llvm/lib/Transforms/Utils/ModuleUtils.cpp
225 ↗(On Diff #422894)

As noted elsewhere, I'd prefer the UnifiedLTO not imply splitting. But regardless, this seems like a useful change whenever LTO unit is enabled - why not always compute the module hash this way?

ormris added inline comments.Apr 18 2022, 1:38 PM
llvm/lib/LTO/LTO.cpp
739

If cfi.functions isn't removed, LowerTypeTests will rename local functions in the merged module as "<function name>.1" when the regular LTO backend is used. This causes linking errors, since other parts of the module expect the original function name. We saw this happen in internal testing.

1128

Looks like it's not needed. I'll remove it.

llvm/lib/Transforms/IPO/PassManagerBuilder.cpp
212

No, I don't think so. The UnifiedLTO flag doesn't match any of those variables. I don't see a combination Perform*/PrepareFor* that would cleanly produce the result we want. I would also worry that reusing these variables would make this code less clear. Looking at it now, I wonder if it should be called PrepareForUnifiedLTO, though.

ormris updated this revision to Diff 523215.May 17 2023, 4:23 PM

Changelog:

  • Rebased
  • Remove legacy pipeline
nikic added a subscriber: nikic.May 18 2023, 2:03 PM
nikic added inline comments.
llvm/lib/Passes/PassBuilder.cpp
1144

I feel like I'm missing something here. Why do we need to force the use of the (known-broken, lower quality) full LTO pre-link pipeline here, rather than sticking to the thin LTO pre-link pipeline?

ormris updated this revision to Diff 523644.May 18 2023, 8:18 PM

Attempt to fix pre-merge checks

mehdi_amini added inline comments.May 19 2023, 12:57 AM
llvm/lib/Passes/PassBuilder.cpp
1144

Can you elaborate on what is known-broken with the full LTO pre-link pipeline? And if we were to adopt the ThinLTO pipeline here for FullLTO, what does it mean for the FullLTO pipeline at link time? The two goes hand-in-hand somehow and the current situation (as far as I remember) balances compile time between the two phases (which is much more sensitive for FullLTO since the link phase is sequential).

The new mode seems to work well enough for Sony's needs, but for users such as mine at Google that want to maximize performance from ThinLTO, it may not be the best approach (or may be ok, but needs to be carefully evaluated). Unfortunately, I don't have a good immediate solution to balancing those two sets of needs at the moment, other than supporting different modes.

I am still concerned with divergence that wouldn't be just temporary: what would be the timeline to reconcile the paths? I understand you may not have time just now, but I don't think it is reasonable to just keep code in-tree forever "because Google can't evaluate changes to the pipeline", it is akin to have a dedicated pipeline in-tree and a clang option -flto=google-pipeline (or -Ogoogle instead of -O2). You're getting into "this belongs to your downstream fork" territory IMO.
The point of having a limited set of configuration in-tree is that every user contribute also to the testing of these pipelines. Having a feature for "unified LTO" that isn't orthogonal to the optimization pipelines doesn't seem right to me in term of product.

tejohnson added inline comments.May 19 2023, 7:27 AM
llvm/lib/Passes/PassBuilder.cpp
1144

And if we were to adopt the ThinLTO pipeline here for FullLTO, what does it mean for the FullLTO pipeline at link time? The two goes hand-in-hand somehow and the current situation (as far as I remember) balances compile time between the two phases (which is much more sensitive for FullLTO since the link phase is sequential).

The reverse is also a question - if we are to adopt the full LTO pipeline here, what does it mean for ThinLTO performance (and compile time, given what appears to be a requirement that split modules be used which means that ThinLTO now would be required to include some amount of full LTO)? The current ThinLTO pipeline attempts to maximize performance since we don't have to worry about the full LTO scalability issues.

I am still concerned with divergence that wouldn't be just temporary: what would be the timeline to reconcile the paths? I understand you may not have time just now, but I don't think it is reasonable to just keep code in-tree forever "because Google can't evaluate changes to the pipeline", it is akin to have a dedicated pipeline in-tree and a clang option -flto=google-pipeline (or -Ogoogle instead of -O2). You're getting into "this belongs to your downstream fork" territory IMO.

Google is not the one asking for a major change to the ThinLTO pipelines, which have been set up roughly this way since inception. While we certainly rely on ThinLTO for performance with scalability, we're also certainly not the only users of ThinLTO. IMO a major change such as this should go in under an experimental option, so that existing users are easily able to try it out, without being expected to patch in multiple patches and do that manually. It will be a lot easier to try it out if this is under an option in the upstream sources.

Having a feature for "unified LTO" that isn't orthogonal to the optimization pipelines doesn't seem right to me in term of product.

Since Unified LTO is an intermediate between Thin and Full LTO, which have their own pipelines already to balance their different needs, having a different pipeline for a different LTO mode with different needs doesn't seem like a terrible thing to me.

What happens if Unified LTO does degrade performance and/or compile time for existing ThinLTO users?

1146

It is a bit odd to see that under unified LTO the regular LTO "pre-link" pipeline is used during the post link phase. I don't remember the reasons for this, maybe it is in the RFC, but it at least needs a clear comment.

mehdi_amini added inline comments.May 19 2023, 1:25 PM
llvm/lib/Passes/PassBuilder.cpp
1144

Google is not the one asking for a major change to the ThinLTO pipelines,

Right, but it came across to me that you were blocking it by lack of time for testing: it is fine to ask about a testing plan and some plan ahead of time on resources to commit, but it didn't seem like the dynamic at play here.

which have been set up roughly this way since inception. While we certainly rely on ThinLTO for performance with scalability, we're also certainly not the only users of ThinLTO. IMO a major change such as this should go in under an experimental option, so that existing users are easily able to try it out, without being expected to patch in multiple patches and do that manually. It will be a lot easier to try it out if this is under an option in the upstream sources.

So basically, IIUC, we should:

  1. add an option to use ThinLTO with an new pipeline
  2. have plan and a timeline for users to test this pipeline, and criteria of acceptation.
  3. either graduate this pipeline to replace the existing one, or kill this option if unsuccessful.

This seems very reasonable to me, but the stakeholder in keeping the feature working should be ready to participate in 2).

What happens if Unified LTO does degrade performance and/or compile time for existing ThinLTO users?

Isn't the premise of the proposal that the author believe they can get the same performance as ThinLTO? Re-reading the original RFC, it does not say much about the performance claim, hence my impression that UnifiedLTO was proposed as an "orthogonal feature" to the compilation pipelines. Some clarifications may be needed on this?

nikic added inline comments.May 19 2023, 1:39 PM
llvm/lib/Passes/PassBuilder.cpp
1144

Can you elaborate on what is known-broken with the full LTO pre-link pipeline? And if we were to adopt the ThinLTO pipeline here for FullLTO, what does it mean for the FullLTO pipeline at link time? The two goes hand-in-hand somehow and the current situation (as far as I remember) balances compile time between the two phases (which is much more sensitive for FullLTO since the link phase is sequential).

Basically, the only difference between the thin LTO and the full LTO pre-link pipelines is that full LTO runs module optimization pre-link, while thin LTO does not. Running module optimization pre-link is detrimental to both performance and compile time. The full LTO pre-link pipeline will be made the same as the thin LTO pre-link pipeline in D148010, but it might take a while until we're ready to land that change.

Once that change lands this question won't matter anymore as the pipelines will be the same, but until that time it would make a lot more sense to me to use the thin LTO pre-link pipeline here, as that's the one we're ultimately going to adopt.

mehdi_amini added inline comments.May 19 2023, 2:00 PM
llvm/lib/Passes/PassBuilder.cpp
1144

Running module optimization pre-link is detrimental to both performance and compile time

I have a different experience: I tried to align FullLTO on the ThinLTO pipeline while we were building ThinLTO (circa 2016), and ninja clang (with FullLTO enabled) would take basically twice more time. This is because you're basically shifting compile time from the parallel compile phase to the sequential link-time phase.

I ended up proposing a patch here: https://reviews.llvm.org/D29376 which was tested on the performance aspect on games and embedded system (see the comment thread), without a good conclusion.
The compile-time impact was deemed too high for it to be worthwhile to pursue at the time.

nikic added inline comments.May 19 2023, 2:04 PM
llvm/lib/Passes/PassBuilder.cpp
1144

From a quick look, what you were trying to do is align both the pre-link and the post-link full LTO pipelines. I'm talking only about the pre-link pipeline here. Making the post-link full LTO pipeline the same as the thin LTO pipeline would indeed likely run into compile-time issues.

tejohnson added inline comments.May 19 2023, 2:42 PM
llvm/lib/Passes/PassBuilder.cpp
1144

Basically, the only difference between the thin LTO and the full LTO pre-link pipelines is that full LTO runs module optimization pre-link, while thin LTO does not. Running module optimization pre-link is detrimental to both performance and compile time. The full LTO pre-link pipeline will be made the same as the thin LTO pre-link pipeline in D148010, but it might take a while until we're ready to land that change.

Also, the odd thing here (see my comment a couple lines below), is that this case is where the post-link pipeline has been requested, where we normally run the "ThinLTODefault" pipeline (not the pre-link). With UnifiedLTO the code is instead running the full LTO pre-link, however. But this code is just used for pipeline testing via opt I believe. The pipeline setup code in the companion clang patch seems to be doing the intended thing (using full LTO pre-link instead of thin LTO pre-link under the unified LTO option). However, as you note, it isn't clear whether that is what we want.

@ormris is this a bug here?

1144

Right, but it came across to me that you were blocking it by lack of time for testing: it is fine to ask about a testing plan and some plan ahead of time on resources to commit, but it didn't seem like the dynamic at play here.

Not trying to block, I was just trying to agree with the approach here in putting it upstream under an option. If it is in under an option, it is a lot easier for a wider range of people to try it out in parallel. For example, it will be a lot easier to send it through our various pre-release compile-time and performance testing suites with potentially multiple people looking at it.

So basically, IIUC, we should:

  1. add an option to use ThinLTO with an new pipeline
  2. have plan and a timeline for users to test this pipeline, and criteria of acceptation.
  3. either graduate this pipeline to replace the existing one, or kill this option if unsuccessful.

Agree, although regarding 3 my understanding is that they are trying to solve a specific problem, by allowing the decision about thin vs full LTO to be delayed until the LTO link time and simplifying deploying bitcode libraries. So the criteria for success for Sony and anyone else who wants these benefits is likely going to be different than from ThinLTO users who don't care about this and just want the best performance/build time tradeoff.

ormris added inline comments.May 19 2023, 5:04 PM
llvm/lib/Passes/PassBuilder.cpp
1146

After looking into this, it appears that this was added for testing purposes a while back, but is no longer in use. The correct pipelines are setup by the various frontends. While it's technically not a necessary part of this patch, I'd like to make sure that opt --passes="thinlto-pre-link<O1>" --unified-lto does the right thing, so moving it to the prelink condition seems best.

ormris added inline comments.May 22 2023, 3:14 PM
llvm/lib/Passes/PassBuilder.cpp
1144

They are trying to solve a specific problem, by allowing the decision about thin vs full LTO to be delayed until the LTO link time and simplifying deploying bitcode libraries.

Correct. There are specific aspects of the LTO UX that we wanted to change, as noted in the RFC.

it is a lot easier for a wider range of people to try it out in parallel

A few others have expressed interest on discourse in using this pipeline for other projects. I think it's likely we would see other projects testing this pipeline if it was committed.

mehdi_amini added inline comments.May 22 2023, 3:31 PM
llvm/lib/Passes/PassBuilder.cpp
1144

They are trying to solve a specific problem, by allowing the decision about thin vs full LTO to be delayed until the LTO link time and simplifying deploying bitcode libraries.

Correct. There are specific aspects of the LTO UX that we wanted to change, as noted in the RFC.

That isn't answering the performance goals questions with respect to current ThinLTO as well as the long term alignment of the pipelines?

ormris updated this revision to Diff 524762.May 23 2023, 9:22 AM

Update opt to set the UnifiedLTO pipeline tuning option correctly. Fix pipeline parsing. Add testing for UnifiedLTO prelink pipeline.

ormris marked an inline comment as done.May 23 2023, 6:06 PM
ormris added inline comments.
llvm/lib/Passes/PassBuilder.cpp
1144

That isn't answering the performance goals questions with respect to current ThinLTO as well as the long term alignment of the pipelines?

Our goal was to make these UX changes without severely impacting ThinLTO compile time and runtime performance. Our performance testing showed that runtime performance was the same or better, and that compile time performance was about 1% worse. So there is an impact on compile time performance, but it's far from severe. On the alignment question, this patch is able to optionally provide limited alignment. This alignment has consistently provided good performance for us, so we think it's in a good state for broader testing. I'm not sure that replacing the current ThinLTO pipeline with this pipeline makes sense at the moment. This pipeline provides different advantages and disadvantages to the current pipelines, and I think they can co-exist with minimal maintenance overhead. That's definitely been our experience maintaining this feature downstream. In the long term, full alignment of all LTO pipelines could be a good route, but it seems like the proposal is still being explored. We're ready to get more concrete feedback on our approach and we think it's likely to be useful in its current state.

1146

Fixed.

mehdi_amini added inline comments.May 23 2023, 6:50 PM
llvm/lib/Passes/PassBuilder.cpp
1144

I raised the discussion in the RFC, it seems more appropriate to discuss design discussions like this there. There is definitely a tradeoff to explore there, but I don't feel I've seen it called out in the RFC and enough data provided to justify it going one way or the other.

ormris updated this revision to Diff 526176.May 26 2023, 1:16 PM

clang-format

ormris updated this revision to Diff 527693.Jun 1 2023, 6:18 PM

Use the ThinLTO pre-link pipeline as the Unified LTO pre-link pipeline, as discussed in the RFC.

ormris added a comment.Jun 9 2023, 2:15 PM

gentle ping

The alignment in the pass pipeline LGTM, I don't know about all the aspects involved here so probably not the best person to approve the patch.

Looks pretty good but I have some mostly minor comments and questions.

Patch summary needs slight change to remove this note since that got refactored out of this patch:

Make sure that the ModuleID is generated by incorporating more types of symbols.

Also, is there still a requirement that split LTO units be enabled for Unified LTO? I cannot remember why that was the case, and I see some tests specify split LTO units and some don't. IMO it is better for split LTO and Unified LTO to be orthogonal if possible.

llvm/include/llvm/LTO/Config.h
174

Document. However, the only use I could find of this field is immediately after it is set, in the same scope. Does it need to be a Config field?

llvm/include/llvm/LTO/LTO.h
242

Document

416

Document

llvm/include/llvm/Passes/PassBuilder.h
74

s/our/the/ ?

llvm/include/llvm/Transforms/IPO/ThinLTOBitcodeWriter.h
34

This isn't used - remove?

llvm/lib/Bitcode/Reader/BitcodeReader.cpp
7433

Comment needs update. Also, what should the value of UnifiedLTO be set to in this case? I suppose it defaults to false, which seems correct, but it would be good to explicitly set/note that.

I think though it might be better to change this function to return a tuple of the 2 flags, since there are other fields in the BitcodeLTOInfo that are not being set here. I see that they are set by the caller, but this is a bit confusing IMO. Alternatively, change this function to a name "setEnable..." (s/get/set), and note explicitly in a comment above the function that the caller is expected to set the other fields.

llvm/lib/LTO/LTO.cpp
739

Please document this rationale in a comment, and note that this metadata is only needed for ThinLTO (which appears to be the case).

1128

Ping on this question, I think it should be removed? EnableLTOInternalization is an internal option that defaults to true anyway.

1145

Ping on this question - please add comment about why this is needed.

1544

Ping on this question. Please add comment about why needed.

llvm/lib/Passes/PassBuilder.cpp
1148

Add comment summarizing the decision/rationale for this.

llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
265

Since you are asserting that UnifiedLTO is false a few lines up, can this just be a constant false?

271

Diito

llvm/test/LTO/Resolution/X86/local-def-dllimport.ll
1–2

Why change this test? I assume it should still work with the old options. If you want to test also with Unified LTO, just duplicate the RUN lines so that it tests in both modes.

llvm/test/LTO/X86/unified-cfi.ll
89

Does the test need to include the textual summary, or will the correct summary be generated with -thinlto-bc?

llvm/test/LTO/X86/whole-program-no-crash.ll
3

Is this comment about crashing in previous versions of the compiler copied from another test? Is the crash related to unified LTO somehow?

(also typo in "devirtualizaiton")

91

Similar question to the other test - do we need to include the textual summary or does it get automatically generated by -thinlto-bc?

llvm/test/ThinLTO/X86/dup-cgprofile-flag.ll
2

Add comment at the top about what the test is testing (it isn't clear to me).

llvm/test/Transforms/ThinLTOBitcodeWriter/split-unified.ll
3

Can you add some checking of the generated minimized bitcode file %t2? Also, it is not just without the debug metadata, it is without all IR.

7

Why this checking?

llvm/tools/llvm-lto2/llvm-lto2.cpp
158

Note the 2 accepted values in the message. Should it also accept "default"? Looks like the code will not, but we might want to for completeness.

338

Needs comment about why

llvm/tools/opt/opt.cpp
121

Note that it is ignored unless -thinlto-bc specified

ormris added inline comments.Jun 12 2023, 4:24 PM
llvm/include/llvm/LTO/Config.h
174

I can derive this from other values. Removed.

llvm/include/llvm/LTO/LTO.h
242

Fixed.

416

Fixed.

llvm/include/llvm/Passes/PassBuilder.h
74

Fixed.

llvm/include/llvm/Transforms/IPO/ThinLTOBitcodeWriter.h
34

Fixed

llvm/lib/Bitcode/Reader/BitcodeReader.cpp
7433

Yeah, I see what you mean. I think it would be best to make this function return a tuple. Fixed.

llvm/lib/LTO/LTO.cpp
739

Fixed.

1128

Fixed.

1145

Fixed.

1544

Now that we're using the ThinLTO pre-link pipeline, we can remove this.

llvm/lib/Passes/PassBuilder.cpp
1148

Fixed

llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
265

Fixed

271

Fixed

llvm/test/LTO/Resolution/X86/local-def-dllimport.ll
1–2

Fixed

llvm/test/LTO/X86/unified-cfi.ll
89

No, that's not needed. Fixed.

llvm/test/LTO/X86/whole-program-no-crash.ll
3

Yes, it was. This is a very old test for a crash that's long been fixed. Essentially, the issue was that type test instructions were not being removed, and that caused crashes during codegen. Honestly, I think it would be best to keep this test private. It's good for our internal test suite, but I'm not sure it adds value here.

91

It's auto-generated. I'll remove this.

llvm/test/ThinLTO/X86/dup-cgprofile-flag.ll
2

Fixed.

llvm/tools/llvm-lto2/llvm-lto2.cpp
158

Yes, that makes sense. Fixed.

llvm/tools/opt/opt.cpp
121

Fixed

ormris updated this revision to Diff 530699.Jun 12 2023, 4:25 PM

Address feedback.

There are a few comments that don't appear to be addressed, noted in the patch below, but also the below comment and question about split LTO - can you clarify whether that is still required for unified LTO? I also realized after going through the changes that there are a few things that need more testing, see comments.

Looks pretty good but I have some mostly minor comments and questions.

Patch summary needs slight change to remove this note since that got refactored out of this patch:

Make sure that the ModuleID is generated by incorporating more types of symbols.

Also, is there still a requirement that split LTO units be enabled for Unified LTO? I cannot remember why that was the case, and I see some tests specify split LTO units and some don't. IMO it is better for split LTO and Unified LTO to be orthogonal if possible.

llvm/lib/LTO/LTO.cpp
1145

Specifically, why only in UnifiedRegular LTO mode?

llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
265

Document const parameter

271

Ditto

llvm/test/LTO/Resolution/X86/unified-lto-check.ll
3

Is this a correct description? It seems to give an error, not silently handle it.

llvm/test/ThinLTO/X86/dup-cgprofile-flag.ll
2

I'm still unclear as to what is happening in the test. It seems that there is an error when running LTO without specifying a --lto= option. It isn't clear to me why, or why specifically that case duplicates a module flag.

This raises a couple questions:

  1. is it unsupported to LTO link unifiedLTO IR objects without specifying a non-default LTO mode via --lto=[thin|full]?
  2. is it unsupported to specify an LTO mode other than default via --lto=[thin|full] for non-unifiedLTO IR objects?

The unified-lto-check.ll test does test a few of these option combinations, but not all of them. There should be a test for all combinations, with a clear error for any that are not supported. IMO it might be nice to handle case 1 silently with a reasonable default (probably ThinLTO since that's the pre-link pipeline used).

It would also be good to have a test that more explicitly ensures that we get the expected ThinLTO vs RegularLTO backend handling for unifiedLTO IR objects with both --lto=thin and --lto=full (maybe this exists, but I don't see such a test right now scanning the patch again).

llvm/test/Transforms/ThinLTOBitcodeWriter/split-unified.ll
3

ping on the comments in this test.

llvm/tools/llvm-lto2/llvm-lto2.cpp
158

Needs a test.

338

Ping on why the CallGraphProfile is related to the Unified LTO setting. Especially now that the similar handling was removed from LTO.cpp.

ormris added inline comments.Jun 20 2023, 2:13 PM
llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
265

Fixed? Does this need more explanation?

271

Fixed?

llvm/test/LTO/Resolution/X86/unified-lto-check.ll
3

Fixed

llvm/test/ThinLTO/X86/dup-cgprofile-flag.ll
2

OK. I've added further details to the comment. Let me know if that makes sense.

is it unsupported to LTO link unifiedLTO IR objects without specifying a non-default LTO mode via --lto=[thin|full]?

I think that should be unsupported. Otherwise, small pipeline differences could catch users by surprise. A default of ThinLTO does make sense. Fixed.

is it unsupported to specify an LTO mode other than default via --lto=[thin|full] for non-unifiedLTO IR objects?

It probably should be. There's a chance that someone could use the switch by accident and get a strange result. Fixed.

There should be a test for all combinations

Agreed. I've added the rest of these cases to unified-lto-check.ll.

It would also be good to have a test that more explicitly ensures that we get the expected ThinLTO vs RegularLTO backend

Yes, that would be useful. Fixed.

llvm/test/Transforms/ThinLTOBitcodeWriter/split-unified.ll
3

Sorry for the delay here. This test is named incorrectly. It was intended to test the case where the ModuleID is not generated. Since we've removed that case from discussion, I've changed this test to cover the normal case.

7

See above

llvm/tools/llvm-lto2/llvm-lto2.cpp
338

This can also be removed, since we're using the ThinLTO pre-link pipeline.

ormris updated this revision to Diff 533041.Jun 20 2023, 2:14 PM

Address comments

n-omer added a subscriber: n-omer.Jun 21 2023, 8:03 AM

Sorry for the delay, couple more follow ups below.

llvm/lib/LTO/LTO.cpp
1147

We can have split LTO units without UnifiedLTO, however.

llvm/lib/Transforms/IPO/ThinLTOBitcodeWriter.cpp
265

Yep, this is what I meant.

271

yep

llvm/test/LTO/Resolution/X86/unified-lto-check.ll
15

Since this option is only about UnifiedLTO and will give an error in this usage, it would be better to rename it to --unified-lto=.

llvm/test/ThinLTO/X86/dup-cgprofile-flag.ll
2

is it unsupported to LTO link unifiedLTO IR objects without specifying a non-default LTO mode via --lto=[thin|full]?

I think that should be unsupported. Otherwise, small pipeline differences could catch users by surprise. A default of ThinLTO does make sense. Fixed.

The first 2 sentences contradict the second 2 I think? In any case, I think it makes sense to have a reasonable default, which seems to be implemented now.

is it unsupported to specify an LTO mode other than default via --lto=[thin|full] for non-unifiedLTO IR objects?

It probably should be. There's a chance that someone could use the switch by accident and get a strange result. Fixed.

See my comment in one of the tests, I think the option name should be clearer that it is just about UnifiedLTO.

It would also be good to have a test that more explicitly ensures that we get the expected ThinLTO vs RegularLTO backend

Yes, that would be useful. Fixed.

I don't think the new debug messages being emitted and tested are correctly testing this, however, since runRegularLTO and runThinLTO are both essentially unconditionally invoked (see callsites in LTO::run). It would be better to add the messages to lto::backend and lto::thinBackend.

Address feedback

llvm/lib/LTO/LTO.cpp
1147

True. Added a bit more to the comment.

llvm/test/LTO/Resolution/X86/unified-lto-check.ll
15

That makes sense. Fixed.

llvm/test/ThinLTO/X86/dup-cgprofile-flag.ll
2

Fixed

ormris updated this revision to Diff 536317.Jun 30 2023, 10:28 AM

Address feedback

This revision is now accepted and ready to land.Jun 30 2023, 2:27 PM

Thanks @tejohnson and @mehdi_amini for all the feedback and review!

This revision was landed with ongoing or failed builds.Jul 5 2023, 2:54 PM
This revision was automatically updated to reflect the committed changes.