This patch replaces the copy of MCSubtargetInfo in MCRelaxableFragment, which accounted for nearly 40% of MCRelaxableFragment, with a const reference to MCSubtargetInfo to make MCRelaxableFragment smaller.
Details
Diff Detail
- Repository
- rL LLVM
Event Timeline
- In ARMAsmParser::parseDirectiveArch, declare STI as a reference.
- Rebase to r252145.
- Remove trailing whitespace.
We will get invalid references if SubtargetCopies is reallocated.
You need to keep the index or use an allocator when giving out copies.
What are the memory savings the you get with this patch?
Cheers, Rafael
Ah, thanks. I've made changes to use SpecificBumpPtrAllocator instead of SmallVector for copies of MCSubtargetInfo.
With this patch is applied, the memory usage when I run llc on verify-uselistorder.lto.bc (without debug info) decreased about 4%.
Can you please first commit a change just ading the getSti() and rebase? It should make this patch a lot easier to read.
lib/CodeGen/AsmPrinter/AsmPrinter.cpp | ||
---|---|---|
239 ↗ | (On Diff #39425) | Why do you need this copy? The idea is that we create a copy every time we toggle a bit during parsing, no? If so, this should not need a copy. |
lib/CodeGen/AsmPrinter/AsmPrinter.cpp | ||
---|---|---|
239 ↗ | (On Diff #39425) | getSubtargetCopy is called here because STI is a unique_ptr and gets destructed when it goes out of scope. We don't need a copy here, but we need a subtarget object that is kept alive until it is used. |
getSubtargetCopy is called here because STI is a unique_ptr and gets destructed when it goes out of scope. We don't need a copy here, but we need a subtarget object that is kept alive until it is used.
Oh, OK.
LGTM with the patch split in two.
Cheers,
Rafael
This seems like it could also be solved by keeping a cache on the MCTargetMachine that the mips guys are working up rather than on the context like this. Then we could just use a pointer to the cache location?
Thoughts?
-eric
Currently, MCTargetMachine doesn't seem to have a cache for MCSubtargetInfo (based on what I saw in the patches under review), but I think we can move MCSubtargetAllocator to MCTargetMachine as long as there is a way for target parsers to access MCTargetMachine. What is the advantage of keeping the cache in MCTargetMachine over keeping it in MCContext? It seems like you'll still be creating a new MCSubtargetInfo every time the feature bits are toggled, is that right?
It doesn't have a cache yet but it should be easy to add one.
but I think we can move MCSubtargetAllocator to MCTargetMachine as long as there is a way for target parsers to access MCTargetMachine.
MCTargetMachine::createMCAsmParser() can pass an MCTargetMachine pointer/reference to the target parser for all targets or (once it's been virtualized) for an individual target.
OK, it sounds like it won't be hard to move the cache to MCTargetMachine if we decide to do so later.
Eric, do you have any other comments on this patch?
Nope, just didn't want to make it harder for the work in the future, this is pretty obviously goodness.
Thanks!
-eric