This is an archive of the discontinued LLVM Phabricator instance.

Cache memory regions in ProcessMinidump and use the linux maps as the source of the information if available.
ClosedPublic

Authored by clayborg on Dec 10 2018, 10:08 AM.

Details

Summary

Breakpad creates minidump files that sometimes have:

  • linux maps textual content
  • no MemoryInfoList

Right now unless the file has a MemoryInfoList we get no region information.

This patch:

  • reads and caches the memory region info one time and sorts it for easy subsequent access
  • get the region info from the best source in this order:
    • linux maps info (if available)
    • MemoryInfoList (if available)
    • MemoryList or Memory64List
  • returns memory region info for the gaps between regions (before the first and after the last)

This patch is a different patch that would replace: https://reviews.llvm.org/D55472

If we decide to go with this patch, then I will add tests.

Diff Detail

Repository
rL LLVM

Event Timeline

clayborg created this revision.Dec 10 2018, 10:08 AM
clayborg retitled this revision from Cache memory regions and use the linux maps as the source of the information if available. to Cache memory regions in ProcessMinidump and use the linux maps as the source of the information if available..Dec 10 2018, 10:09 AM
clayborg updated this revision to Diff 177555.Dec 10 2018, 10:19 AM

Add sorting functions needed for MemoryRegionInfo

zturner added inline comments.Dec 10 2018, 10:39 AM
source/Plugins/Process/minidump/MinidumpParser.cpp
409 ↗(On Diff #177550)

You can write this as auto text = llvm::toStringRef(data);

411–412 ↗(On Diff #177550)

No reason to have both constexpr and const, since the former implies the latter.

As an aside, I find this quite strange. I'm not sure why we don't just use llvm::Optional<bool> and delete MemoryRegionInfo::OptionalBool. The former is an enum, which defaults to int, while sizeof(Optional<bool>) == 2, so it seems strictly better, as well as being more intuitive.

426 ↗(On Diff #177550)

line.take_front(4); is a little more idiomatic, but this is fine anyway. If we need to handle these strings being an invalid format we should also check for length before any take, drop, or substr operation

447–449 ↗(On Diff #177550)

This is a good example where Optional<bool> would be nice. You could just eliminate the ternary and pass the result of the equality comparison and it would work fine.

467–474 ↗(On Diff #177550)

Maybe we could add some member functions to MinidumpMemoryInfo such as isExecutable(), isReadable(), etc

541 ↗(On Diff #177550)

Can we early out here if m_regions.empty() is true?

547 ↗(On Diff #177550)

Since the previous line is a return statement, we don't need the else.

555–556 ↗(On Diff #177550)

This condition doesn't seem possible based on the code, because it would imply the list of regions is empty, but then we would either not be in this branch (current code), or have already returned (based on suggestion above).

I think I see what it's trying to do though. I think we need to move this check up in between the previous if/else block. So it should be something like:

if (pos != end && pos->GetRange().Contains(load_addr))
  return *pos;

if (pos == begin)
  return llvm::None;

--pos;
if (pos->GetRange().Contains(load_addr))
  return pos;

However, at this point, haven't we handled every possible case? Either the user passed the exact address of offset 0 of a memory region (case 0), they passed an address that is not in any region (case 2), or they passed an address that is in the middle of a region (case 3).

Can't we just write return llvm::None at this point? What is the rest of the code for?

582–583 ↗(On Diff #177550)

Actually, we're already checking if it's empty here before we call the function. Then inside the function we check if it's empty again. Since it's part of the function's private interface, we have full control over the inputs, we should assert inside of the implementation of FindMemoryRegion that it's not empty.

source/Plugins/Process/minidump/MinidumpParser.h
97–100 ↗(On Diff #177550)

Minor nit, but wherever possible, it's nice if we can eliminate private functions from a class's definition and move them to static internal-linkage functions in the implementation file. For example, I'm imagining you could re-write these as:

static bool CreateRegionsCacheFromLinuxMaps(ArrayRef<uint8_t> linux_maps, std::vector<MemoryRegionInfo> &regions) {
}

// etc

in the implementation file. It makes the header file smaller which is helpful when trying to read it and understand the functionality of a class, and also improves link time.

This patch certainly provides a more comprehensive treatment of memory region information in minidump. However, there might be some value in removing the excessive shared_ptrs in the other patch. I think that should be evaluated separately.

I was going to write some additional comments, but they largely coincide with the ones Zachary made already.

clayborg updated this revision to Diff 177566.Dec 10 2018, 11:10 AM
clayborg marked 2 inline comments as done.

Fixed Zach's issues.

clayborg updated this revision to Diff 177567.Dec 10 2018, 11:11 AM

Removed Xcode scheme changes.

clayborg marked 6 inline comments as done.Dec 10 2018, 11:12 AM
clayborg added inline comments.
source/Plugins/Process/minidump/MinidumpParser.cpp
536 ↗(On Diff #177567)

Fixed the logic in here to be able to have empty m_regions.

412 ↗(On Diff #177555)

I agree on the Optional<bool>, but since this enum is exposed via the public API, I would leave it alone for now?

583 ↗(On Diff #177555)

I fixed the logic so FindMemoryRegion work with the empty region. And will set a bool to indicate we parsed the memory regions and not rely on m_regions being empty for that.

This patch certainly provides a more comprehensive treatment of memory region information in minidump. However, there might be some value in removing the excessive shared_ptrs in the other patch. I think that should be evaluated separately.

Agreed. I will make that patch following this one.

Without overriding Process::GetMemoryRegions still does a lot of unnecessary job (FindMemoryRegion for each element in the vector).
With these changes, it may just return m_regions (if not empty), doesn't it?

source/Plugins/Process/minidump/MinidumpParser.cpp
475 ↗(On Diff #177555)

Here may be m_regions.reserve(mem_info_list.size())

484 ↗(On Diff #177555)

You can move region here.

500 ↗(On Diff #177555)

m_regions.reserve(memory_list.size()) and m_regions.shrink_to_fit() after the loop.

526 ↗(On Diff #177555)

Same

Don't you mind to take overridden GetMemoryRegions in this patch so that I can remove all minidump-related changes from D55472?

Don't you mind to take overridden GetMemoryRegions in this patch so that I can remove all minidump-related changes from D55472?

That is fine with me.

clayborg updated this revision to Diff 177592.Dec 10 2018, 1:16 PM

Add reserve and shrink to fit when parsing memory regions.

It just occurred to me that we have another copy of /proc/<pid>/maps -> MemoryRegionInfo conversion code. It lives in NativeProcessLinux.cpp (ParseMemoryRegionInfoFromProcMapsLine). It would be nice to extract this parser into a some place (Plugins/Process/Utility ?), where it can be shared by the two process plugins. As a bonus, that should make it easy to write unit tests for that function.

It just occurred to me that we have another copy of /proc/<pid>/maps -> MemoryRegionInfo conversion code. It lives in NativeProcessLinux.cpp (ParseMemoryRegionInfoFromProcMapsLine). It would be nice to extract this parser into a some place (Plugins/Process/Utility ?), where it can be shared by the two process plugins. As a bonus, that should make it easy to write unit tests for that function.

Nice! I should have looked around. I will move it to Plugins/Process/Utility as suggested

clayborg updated this revision to Diff 178168.Dec 13 2018, 7:15 PM

Fixed errors, fully test, and make static function that are local to MinidumpParser.cpp that parse the region info from linux maps, memory info list, memory list and memory 64 list.

labath accepted this revision.Dec 14 2018, 4:19 AM

Looks good to me. Thanks for moving the function to a more generic place. I can get rid of the copy in NativeProcessLinux once this lands.

source/Plugins/Process/Utility/LinuxProcMaps.cpp
19 ↗(On Diff #178168)

It doesn't look like this maps_line needs to be a reference.

This revision is now accepted and ready to land.Dec 14 2018, 4:19 AM
This revision was automatically updated to reflect the committed changes.

Fixed the test inputs CMakeLists.txt file:

$ svn commit
Sending CMakeLists.txt
Transmitting file data .done
Committing transaction...
Committed revision 349183.