Page MenuHomePhabricator

Optimize GSymCreator::finalize.
ClosedPublic

Authored by simon.giesecke on May 11 2021, 12:38 AM.

Details

Summary

The algorithm removing duplicates from the Funcs list used to have
amortized quadratic time complexity because it was potentially
removing each entry using std::vector::erase individually. This
patch is now using a erase-remove idiom with an adapted
removeIfBinary algorithm.

Probably this was made under the assumption that these removals are
rare, but there are cases where the case of duplicate entries is
occurring frequently. In these cases, the actual runtime was very
poor, taking hours to process a single binary of around 1 GiB size
including debug info. Another factor contributing to that is the
frequent output of the warning, which is now removed.

It seems this is particularly an issue with GCC-compiled binaries,
rather than clang-built binaries.

Diff Detail

Event Timeline

simon.giesecke requested review of this revision.May 11 2021, 12:38 AM
Herald added a project: Restricted Project. · View Herald TranscriptMay 11 2021, 12:38 AM
clayborg accepted this revision.May 11 2021, 3:02 PM

There are unit tests that could be added if you would like to test a bad case, but existing unit tests should still pass with this.

This revision is now accepted and ready to land.May 11 2021, 3:02 PM

There are unit tests that could be added if you would like to test a bad case, but existing unit tests should still pass with this.

I had a look at the test cases in llvm/unittests/DebugInfo/GSYM/GSYMTest.cpp. The TestDWARF* test cases seem to provide a template for such a test, but unfortunately I am not familiar with the details of DWARF to come up with a representative test case. With some advice, I could do that in another patch?

Also note this is my first contribution to LLVM and I don't have commit access, could you land the patch on my behalf?

Also note this is my first contribution to LLVM and I don't have commit access, could you land the patch on my behalf?

Sure thing!

There are unit tests that could be added if you would like to test a bad case, but existing unit tests should still pass with this.

I had a look at the test cases in llvm/unittests/DebugInfo/GSYM/GSYMTest.cpp. The TestDWARF* test cases seem to provide a template for such a test, but unfortunately I am not familiar with the details of DWARF to come up with a representative test case. With some advice, I could do that in another patch?

No worries. We have existing tests that catch this. But just to clarify, you ran llvm-gsymutil on a GCC binary that caused a major slowdown. It would be great if you could verify that if you run llvm-gsymutil with "--num-threads=1" using the current llvm-gsymutil and with the modified llvm-gsymutil, can you verify that they binaries match? If they don't then we do need to add a test. You have to run with --num-threads=1 to make sure threading doesn't make things change in the final GSYM. So if you can run this test for me and verify things match up, then I will commit this diff.

There are unit tests that could be added if you would like to test a bad case, but existing unit tests should still pass with this.

I had a look at the test cases in llvm/unittests/DebugInfo/GSYM/GSYMTest.cpp. The TestDWARF* test cases seem to provide a template for such a test, but unfortunately I am not familiar with the details of DWARF to come up with a representative test case. With some advice, I could do that in another patch?

No worries. We have existing tests that catch this. But just to clarify, you ran llvm-gsymutil on a GCC binary that caused a major slowdown. It would be great if you could verify that if you run llvm-gsymutil with "--num-threads=1" using the current llvm-gsymutil and with the modified llvm-gsymutil, can you verify that they binaries match? If they don't then we do need to add a test. You have to run with --num-threads=1 to make sure threading doesn't make things change in the final GSYM. So if you can run this test for me and verify things match up, then I will commit this diff.

Sure. I checked this with a gcc-built ninja, there are no differences:

[sgiesecke@DEVVM-sgiesecke ninja]$ diff buildDebug/ninja.gsym.new buildDebug/ninja.gsym.orig 
[sgiesecke@DEVVM-sgiesecke ninja]$

The baseline version produced lots of warnings such as:

warning: duplicate function info entries for range: [0x0000000000429ef4 - 0x0000000000429f0e)
warning: duplicate function info entries for range: [0x0000000000429f0e - 0x0000000000429f3f)

while the version with my patch didn't produce any warnings:

[sgiesecke@DEVVM-sgiesecke ninja]$ llvm-gsymutil --num-threads 1 --convert buildDebug/ninja
Input file: buildDebug/ninja
Output file (x86_64): buildDebug/ninja.gsym
Loaded 5424 functions from DWARF.
Loaded 11 functions from symbol table.
Pruned 2485 functions, ended with 2950 total
simon.giesecke added a comment.EditedMay 12 2021, 12:33 PM

FWIW, and unrelated to this patch: processing other, more complex gcc-built binaries, e.g. clang, produces a lot of the other warnings (with and without the patch). Not sure if that's concerning.

FWIW, and unrelated to this patch: processing other, more complex gcc-built binaries, e.g. clang, produces a lot of the other warnings (with and without the patch). Not sure if that's concerning.

This is par for the course. Everyone produces debug info and the debug info usually has some issues. llvm-gsymutil is merely pointing out issues it runs into when parsing this information. llvm-gsymutil is also smart enough to ignore debug info that should have been stripped, but can't. This stems from the fact that linkers don't know anything about DWARF and they like to just concatenate all DWARF sections from each .o file and relocate any addresses that were not stripped. This means for debug info that had no relocations will still remain in the DWARF. For those functions, they typically get their addresses set to zero or -1. llvm-gsymutil knows to ignore these when it calls GsymCreator::SetValidTextRanges(...) after it parses the sections of the object file. Any function infos whose start address doesn't start in a valid text range will be omitted.

You can also build "llvm-dwarfdump" and then run it using the "--verify" option. It will point out any issues that exist in the DWARF.

So warnings are normal because there are many issues with the DWARF produced by all compilers and linkers. Some are worse than others, but in general we try to work around these issues and still produce a GSYM. Link Time Optimizations (LTO) also tends to outline many functions and also combine functions with common code, so you can have multiple DWARF functions point to the same address. If the line entries are all the same and the function name matches, that is easily detected, but sometimes we have templated code that is inlined, like std::shared_ptr::~shared_ptr<T>() that ends up creating the same code for multiple different instances and they get combined even though the line tables would look different.

This revision was automatically updated to reflect the committed changes.

FWIW, and unrelated to this patch: processing other, more complex gcc-built binaries, e.g. clang, produces a lot of the other warnings (with and without the patch). Not sure if that's concerning.

This is par for the course. Everyone produces debug info and the debug info usually has some issues. llvm-gsymutil is merely pointing out issues it runs into when parsing this information. llvm-gsymutil is also smart enough to ignore debug info that should have been stripped, but can't. This stems from the fact that linkers don't know anything about DWARF and they like to just concatenate all DWARF sections from each .o file and relocate any addresses that were not stripped. This means for debug info that had no relocations will still remain in the DWARF. For those functions, they typically get their addresses set to zero or -1. llvm-gsymutil knows to ignore these when it calls GsymCreator::SetValidTextRanges(...) after it parses the sections of the object file. Any function infos whose start address doesn't start in a valid text range will be omitted.

You can also build "llvm-dwarfdump" and then run it using the "--verify" option. It will point out any issues that exist in the DWARF.

So warnings are normal because there are many issues with the DWARF produced by all compilers and linkers. Some are worse than others, but in general we try to work around these issues and still produce a GSYM. Link Time Optimizations (LTO) also tends to outline many functions and also combine functions with common code, so you can have multiple DWARF functions point to the same address. If the line entries are all the same and the function name matches, that is easily detected, but sometimes we have templated code that is inlined, like std::shared_ptr::~shared_ptr<T>() that ends up creating the same code for multiple different instances and they get combined even though the line tables would look different.

TIL, that's very good to know. I increasingly notice how little I know about DWARF.

Would it make sense to add an option to llvm-gsymutil to disable these warnings, if they are normal? When running this inside a larger build script, the presence of such normal warnings may be confusing, yet suppressing all output from llvm-gsymutil doesn't seem to be a good idea.

! In D102219#2755080, @simon.giesecke wrote:

Would it make sense to add an option to llvm-gsymutil to disable these warnings, if they are normal? When running this inside a larger build script, the presence of such normal warnings may be confusing, yet suppressing all output from llvm-gsymutil doesn't seem to be a good idea.

I would be fine with adding a --quiet option or --suppress-warnings to llvm-gsymutil. I would look around to try and match any other tools that already exist. In a separate patch of course.