This is an archive of the discontinued LLVM Phabricator instance.

[lldb/DWARF] Refactor to prefer emplace_back() vs. push_back()
AcceptedPublic

Authored by dmantipov on Sep 15 2020, 2:13 AM.

Details

Summary

Prefer emplace_back() over push_back() where appropriate to avoid extra calls to copy and/or move constructors.

Diff Detail

Event Timeline

dmantipov created this revision.Sep 15 2020, 2:13 AM
dmantipov updated this revision to Diff 291838.Sep 15 2020, 2:31 AM

Minor style adjustments.

Using emplacement instead of construction is a good idea, where it makes sense. However, I am not convinced that adding constructors to otherwise-trivial structures is worth it. These structures are usually trivially copyable and/or movable and so an emplacement does not really save much.

And I am definitely sure that the constructors we do add should be taking (lvalue) references -- that precludes construction with anything other than an lvalue, which is just weird. They also force at least one copy, whereas a equivalent (properly written) push_back call could use no copies.

So, I'd just limit this patch to the cases where one does not need to add additional constructors to make emplace_back work.

dmantipov updated this revision to Diff 291870.Sep 15 2020, 5:24 AM

Use std::pair rather than an ad-hoc two-member trivial structures. Since std::pair provides the default constructor to initialize both members, no extra C++ glue is required for emplace_back()'ing the pairs into containers.

Use std::pair rather than an ad-hoc two-member trivial structures. Since std::pair provides the default constructor to initialize both members, no extra C++ glue is required for emplace_back()'ing the pairs into containers.

I am sorry, but I don't consider that an improvement. The reason these were not pairs in the first place is because the first/second accessors are very nondescript. If you can demonstrate that these actually improve performance, then we can talk about the best way to do this. However, I'd be very surprised if any of these show up in a profile.

dmantipov updated this revision to Diff 291894.Sep 15 2020, 6:52 AM

OK let's reduce it to minimum minimorum.

As for the profiling, l can see possible improvement around DWARFAttributes::Append() - inlining even at -O2 and eliminating extra constructor call makes

10.61%  intern-state     liblldb.so.12.0.0git   [.] DWARFDebugInfoEntry::GetAttributes
7.28%  intern-state     liblldb.so.12.0.0git   [.] DWARFDebugInfoEntry::Extract
6.88%  intern-state     liblldb.so.12.0.0git   [.] lldb_private::ManualDWARFIndex::IndexUnitImpl
5.76%  intern-state     libpthread-2.31.so     [.] __pthread_rwlock_rdlock
5.43%  intern-state     liblldb.so.12.0.0git   [.] lldb_private::DataExtractor::GetU32
4.70%  intern-state     liblldb.so.12.0.0git   [.] DWARFUnit::GetDIE
4.49%  intern-state     liblldb.so.12.0.0git   [.] lldb_private::DWARFContext::LoadOrGetSection
4.37%  intern-state     libLLVM-12git.so       [.] llvm::StringMapImpl::FindKey
3.50%  intern-state     liblldb.so.12.0.0git   [.] DWARFFormValue::GetFixedSize
2.73%  intern-state     liblldb.so.12.0.0git   [.] DWARFUnit::ExtractDIEsRWLocked
2.61%  intern-state     liblldb.so.12.0.0git   [.] Pool::GetConstCStringWithStringRef
2.08%  intern-state     liblldb.so.12.0.0git   [.] lldb_private::DataExtractor::GetULEB128
1.82%  intern-state     ld-2.31.so             [.] __tls_get_addr
1.77%  intern-state     liblldb.so.12.0.0git   [.] std::__introsort_loop<__gnu_cxx::__normal_iterator<lldb_private::UniqueCStringMap<DIERef>::Entry*, std::vector<lldb_private::UniqueCStringMap<DIERef>::Entry, std::allocator<lldb_private::UniqueCStringMap<DIERef>::Entry> > >, long, __gnu_cxx::__ops::_Iter_comp_iter<lldb_private::Uniqu
eCStringMap<DIERef>::Compare> >
1.60%  intern-state     liblldb.so.12.0.0git   [.] DWARFFormValue::ExtractValue
1.55%  intern-state     libpthread-2.31.so     [.] __pthread_rwlock_unlock
1.23%  intern-state     libc-2.31.so           [.] __strlen_avx2
1.15%  intern-state     libc-2.31.so           [.] _int_free
1.15%  intern-state     libc-2.31.so           [.] malloc
1.14%  intern-state     liblldb.so.12.0.0git   [.] DWARFAttributes::ExtractFormValueAtIndex

from:

7.68%  intern-state     liblldb.so.12.0.0git   [.] DWARFAttributes::Append
7.25%  intern-state     liblldb.so.12.0.0git   [.] DWARFDebugInfoEntry::Extract
6.34%  intern-state     liblldb.so.12.0.0git   [.] lldb_private::ManualDWARFIndex::IndexUnitImpl
6.30%  intern-state     liblldb.so.12.0.0git   [.] DWARFDebugInfoEntry::GetAttributes
5.55%  intern-state     libpthread-2.31.so     [.] __pthread_rwlock_rdlock
5.51%  intern-state     liblldb.so.12.0.0git   [.] lldb_private::DataExtractor::GetU32
4.50%  intern-state     liblldb.so.12.0.0git   [.] DWARFUnit::GetDIE
4.34%  intern-state     liblldb.so.12.0.0git   [.] lldb_private::DWARFContext::LoadOrGetSection
4.27%  intern-state     libLLVM-12git.so       [.] llvm::StringMapImpl::FindKey
2.92%  intern-state     liblldb.so.12.0.0git   [.] DWARFFormValue::GetFixedSize
2.65%  intern-state     liblldb.so.12.0.0git   [.] DWARFUnit::ExtractDIEsRWLocked
2.51%  intern-state     liblldb.so.12.0.0git   [.] Pool::GetConstCStringWithStringRef
1.93%  intern-state     liblldb.so.12.0.0git   [.] lldb_private::DataExtractor::GetULEB128
1.79%  intern-state     liblldb.so.12.0.0git   [.] std::__introsort_loop<__gnu_cxx::__normal_iterator<lldb_private::UniqueCStringMap<DIERef>::Entry*, std::vector<lldb_private::UniqueCStringMap<DIERef>::Entry, std::allocator<lldb_private::UniqueCStringMap<DIERef>::Entry> > >, long, __gnu_cxx::__ops::_Iter_comp_iter<lldb_private::UniqueCStringMap<DIERef>::Compare> >
1.78%  intern-state     ld-2.31.so             [.] __tls_get_addr
1.46%  intern-state     libpthread-2.31.so     [.] __pthread_rwlock_unlock
1.45%  intern-state     liblldb.so.12.0.0git   [.] DWARFFormValue::ExtractValue
1.22%  intern-state     libc-2.31.so           [.] __strlen_avx2
1.16%  intern-state     libc-2.31.so           [.] malloc
1.11%  intern-state     libc-2.31.so           [.] _int_free
1.08%  intern-state     liblldb.so.12.0.0git   [.] DWARFAttributes::ExtractFormValueAtIndex

when loading symbols from clang + libLLVM.so (release build with '-O2 -g').

labath accepted this revision.Sep 16 2020, 1:18 AM

Ok, this is something we can talk about. dwarf parsing is a big bottle neck, so spending some time to optimize that (and even sacrifice some readability/mantainability/etc. if needed) is worthwhile.

That said, I am pretty sure that any performance gains you're seeing (you didn't explain what/how you're measuring so it's hard to say for sure), are due to the additional inlining opportunities afforded by moving this function to the header, and _not_ because of the use of emplace_back. In my comparison, the two implementations ended up looking fairly different, but I don't think the differences cannot be explained by the use of emplace_back. In both cases, the compiler was able to fully optimize away all the push_backs, constructor calls, etc., and the fast path for both functions consists of a bunch of move instructions. If one of the two versions ends up being slower, then this indicates a missed optimization opportunity on the side of the compiler (which could be removed or even reversed by using a different compiler), and not a fundamental problem with using push_back with trivially copyable structs.

_That_ said, I don't think adding the constructor is necessarily bad in this case, so I think it's fine to stick with the current implementation.

This revision is now accepted and ready to land.Sep 16 2020, 1:18 AM