Add a flag to the LLVMContext to disable name for Value other than GlobalValue
ClosedPublic

Authored by mehdi_amini on Mar 7 2016, 4:07 PM.

Details

Summary

This is intended to be a performance flag, on the same level as clang
cc1 option "--disable-free". LLVM will never initialize it by default,
it will be up to the client creating the LLVMContext to request this
behavior. Clang will do it by default in Release build (just like
--disable-free).

"opt" and "llc" can opt-in using -disable-named-value command line
option.

When performing LTO on llvm-tblgen, the initial merging of IR peaks
at 92MB without this patch, and 86MB after this patch,setNameImpl()
drops from 6.5MB to 0.5MB.
The total link time goes from ~29.5s to ~27.8s.

Diff Detail

mehdi_amini retitled this revision from to Add a flag to the LLVMContext to disable name for Value other than GlobalValue.Mar 7 2016, 4:07 PM
mehdi_amini updated this object.
mehdi_amini added a subscriber: llvm-commits.

What about from llvm-lto, do you want an option there too? Right now it will use the NDEBUG behavior in LTOCodeGenerator's constructor.

Note to self: add similar flag and set on context from gold plugin too when this goes in.

lib/AsmParser/LLParser.cpp
52

Do you mean "Can't read..."?

Good point, I haven't thought about llvm-lto. Looks like I need to find some way to push this out of the LTOCodeGenerator (I'll wait to see what others think about the direction taken by this patch first).

lib/AsmParser/LLParser.cpp
52

Indeed :-)

Note: this would make the first template parameter of the IR builder "PreserveName" obsolete.
Some passes such as SROA were doing:

/// \brief Provide a typedef for IRBuilder that drops names in release builds.
#ifndef NDEBUG
typedef llvm::IRBuilder<true, ConstantFolder, IRBuilderPrefixedInserter<true>>
    IRBuilderTy;
#else
typedef llvm::IRBuilder<false, ConstantFolder, IRBuilderPrefixedInserter<false>>
    IRBuilderTy;
#endif
}

I thought about applying this runtime check in the IRBuilder itself, but many passes are not going through the IRBuilder and creates value directly.

rafael added a subscriber: rafael.Mar 8 2016, 5:52 AM

This looks like a good idea, but you should delete the PreserveName
option in a followup commit :-)

mehdi_amini updated this revision to Diff 50104.Mar 8 2016, 9:15 PM

Rename preserveNames() to preserveNonGlobalValueNames()

chandlerc added inline comments.Mar 9 2016, 1:33 AM
lib/AsmParser/LLParser.cpp
52

FWIW, latest version of patch still has "Can read ..." here. =]

lib/LTO/LTOCodeGenerator.cpp
83–85

It'd be nice to make this be flag-controlled as well instead of NDEBUG controlled. Is that not possible?

tools/llc/llc.cpp
107

"disable-named-values" maybe?

213

No need for the comment, the code seems quite clear.

tools/opt/opt.cpp
200

Same here.

Update, trying to account for comments (hopefully I didn't forget any)

I propose to name these: discardValueNames() / setDiscardValueNames(), Justin what do you think?

Chandler: I tries to time on clang -O3 on a large .cpp file but it didn't have much impact (setValueName is a very small part in this case apparently)

mehdi_amini marked 7 inline comments as done.Mar 9 2016, 11:12 AM
chandlerc accepted this revision.Mar 9 2016, 11:25 AM

LGTM with the nits below fixed and with whatever name we finally settle on. (We can also tweak the name later if needed). But sync up with Justin before you land it to be sure he's OK.

include/llvm/IR/LLVMContext.h
115

The parameter name seems... confusing. ;] Probably want this to be "Discard" now.

lib/IR/LLVMContextImpl.h
1037–1039

I would make this consistent with the interface (DiscardValueNames).

lib/LTO/LTOCodeGenerator.cpp
68–69

I would probably name the flags to exactly match the context: '-discard-value-names' (or whatever that ends up being).

This revision is now accepted and ready to land.Mar 9 2016, 11:25 AM

Take Chandler's comment into account

mehdi_amini marked 3 inline comments as done.Mar 9 2016, 12:02 PM
mehdi_amini added inline comments.
lib/IR/LLVMContextImpl.h
1037–1039

Ohhhh it is event worse than a just name bikeshedding, there is unintended behavior change here! The default would have been to Discard names...
(I didn't notice in the tests because I only ran opt/llc tests with this last change and these will always explicitly set it to false based on the cl::opt, it would have impacted other llvm clients)

I don't really care if this lands in it's current form, but I'm somewhat of the belief this is the wrong approach to the problem.

An alternate approach which would seem more natural:

  • We introduce an invariant for the optimizer wherein we only introduce named instructions if the instruction being replaced was named. Same with basic blocks.
  • We add a utility pass which drops names from instructions and basic blocks.
  • Clang optionally schedules the utility pass early in it's pipeline.
mehdi_amini marked an inline comment as done.Mar 9 2016, 12:59 PM

Hey Philip,

Thanks for chiming in.
Can you articulate what you have in mind for "an invariant for the optimizer wherein we only introduce named instructions if the instruction being replaced was named. Same with basic blocks."?
I don't visualize how it would be expressed in practice, unless you intend to audit all the passes and change all the places where prefix/suffix are introduced?

mehdi_amini updated this revision to Diff 50218.Mar 9 2016, 5:34 PM

Update to committed rev

In D17946#371188, @joker.eph wrote:

Hey Philip,

Thanks for chiming in.
Can you articulate what you have in mind for "an invariant for the optimizer wherein we only introduce named instructions if the instruction being replaced was named. Same with basic blocks."?
I don't visualize how it would be expressed in practice, unless you intend to audit all the passes and change all the places where prefix/suffix are introduced?

Yes. If we have code which unconditionally adds a suffix, it would become if (hasName() ? getName() + "suffix" : ""). This seems entirely reasonable.

LGTM with the nits below fixed and with whatever name we finally settle on. (We can also tweak the name later if needed). But sync up with Justin before you land it to be sure he's OK.

In D17946#371188, @joker.eph wrote:

Hey Philip,

Thanks for chiming in.
Can you articulate what you have in mind for "an invariant for the optimizer wherein we only introduce named instructions if the instruction being replaced was named. Same with basic blocks."?
I don't visualize how it would be expressed in practice, unless you intend to audit all the passes and change all the places where prefix/suffix are introduced?

Yes. If we have code which unconditionally adds a suffix, it would become if (hasName() ? getName() + "suffix" : ""). This seems entirely reasonable.

I totally understand why you want the names -- I agree they're incredibly important for debugging.

I'm surprised you don't find the approach Mehdi is pursuing here strictly superior though. This essentially lets us completely remove the NDEBUG disabling of names. That means you can always pass a flag to Clang and get names in the output for debugging.

I find this specifically nicer than keying on whether the original value had a name because there are a number of passes that produce values without names. Even when they do that, I would very much like to have the name suffixes introduced by subsequent passes when I'm debugging.

Does that make sense?

(I'd rather not go converting all of LLVM and Clang to rely on this approach unless you're comfortable with it...)

LGTM with the nits below fixed and with whatever name we finally settle on. (We can also tweak the name later if needed). But sync up with Justin before you land it to be sure he's OK.

In D17946#371188, @joker.eph wrote:

Hey Philip,

Thanks for chiming in.
Can you articulate what you have in mind for "an invariant for the optimizer wherein we only introduce named instructions if the instruction being replaced was named. Same with basic blocks."?
I don't visualize how it would be expressed in practice, unless you intend to audit all the passes and change all the places where prefix/suffix are introduced?

Yes. If we have code which unconditionally adds a suffix, it would become if (hasName() ? getName() + "suffix" : ""). This seems entirely reasonable.

I totally understand why you want the names -- I agree they're incredibly important for debugging.

I'm surprised you don't find the approach Mehdi is pursuing here strictly superior though. This essentially lets us completely remove the NDEBUG disabling of names. That means you can always pass a flag to Clang and get names in the output for debugging.

I find this specifically nicer than keying on whether the original value had a name because there are a number of passes that produce values without names. Even when they do that, I would very much like to have the name suffixes introduced by subsequent passes when I'm debugging.

I hadn't thought about this aspect. Good point.

Does that make sense?

(I'd rather not go converting all of LLVM and Clang to rely on this approach unless you're comfortable with it...)

I don't have a strong opinion here. I was pointing out what seemed like a natural approach to me, but I'm also more than comfortable deferring to you and others who have thought about this more. Provided that the default is to have names even in release mode (default for LLVM, not necessarily clang), I'm comfortable with whatever direction you want to take this.

LGTM with the nits below fixed and with whatever name we finally settle on. (We can also tweak the name later if needed). But sync up with Justin before you land it to be sure he's OK.

In D17946#371188, @joker.eph wrote:

Hey Philip,

Thanks for chiming in.
Can you articulate what you have in mind for "an invariant for the optimizer wherein we only introduce named instructions if the instruction being replaced was named. Same with basic blocks."?
I don't visualize how it would be expressed in practice, unless you intend to audit all the passes and change all the places where prefix/suffix are introduced?

Yes. If we have code which unconditionally adds a suffix, it would become if (hasName() ? getName() + "suffix" : ""). This seems entirely reasonable.

I totally understand why you want the names -- I agree they're incredibly important for debugging.

I'm surprised you don't find the approach Mehdi is pursuing here strictly superior though. This essentially lets us completely remove the NDEBUG disabling of names. That means you can always pass a flag to Clang and get names in the output for debugging.

I find this specifically nicer than keying on whether the original value had a name because there are a number of passes that produce values without names. Even when they do that, I would very much like to have the name suffixes introduced by subsequent passes when I'm debugging.

I hadn't thought about this aspect. Good point.

Does that make sense?

(I'd rather not go converting all of LLVM and Clang to rely on this approach unless you're comfortable with it...)

I don't have a strong opinion here. I was pointing out what seemed like a natural approach to me, but I'm also more than comfortable deferring to you and others who have thought about this more. Provided that the default is to have names even in release mode (default for LLVM, not necessarily clang), I'm comfortable with whatever direction you want to take this.

Awesome, and thanks for following up.

I think this will end up with essentially the best of all worlds in that at all places where we can have extra information we will, and when we don't need it, we can turn off the cost of preserving it.