This is an archive of the discontinued LLVM Phabricator instance.

[lldb] [Target] Support fallback to file address in ReadMemory()
AbandonedPublic

Authored by mgorny on Nov 28 2021, 6:41 AM.

Details

Summary

Add a fallback to GetFileAddress() when GetLoadAddress() fails
in ReadMemory(). This is consistent with how expression evaluation
behaves, and fixes the inconsistency between the two following commands
run on top of FreeBSD vmcore:

  • p *(int *)&hz (prints correct value)
  • memory read &hz (prints zeros)

Diff Detail

Event Timeline

mgorny created this revision.Nov 28 2021, 6:41 AM
mgorny requested review of this revision.Nov 28 2021, 6:41 AM
labath requested changes to this revision.Nov 29 2021, 12:26 AM
labath added a reviewer: jingham.

I don't think this is the right solution to this problem.

The way I see it, there are two separate questions here:

  • why is the module containing hz not loaded? The core file contains a proper memory address for it, does it not? We need to figure out why is that not being applied. I took a brief look, but I couldn't actually find the place where elf file modules are created and where their load addresses are set. I think the first step should be to figure out how this actually works (or why does it not work). Without a load address, a lot of things will not operate properly. If it helps you can look at ProcessMinidump::ReadModuleList to see how it works there, but be aware that this uses a different (I am not saying that it's better or worse) approach to create and load modules -- it does it directly, whereas the elf class seems to rely on DynamicLoaderStatic to do the loading.
  • should "memory read" work on unloaded modules. I think this was deliberately disabled because the concept of "memory" only really makes sense on a "running process" (a core file is considered a running process here). However, your example kinda makes sense even on an unloaded module (a not-yet-started process if the typical situation where you get those). So, we could consider making this work as well. I'm adding Jim to see what he thinks about this. However, if we do decide to do that, then the right solution is to use the Target version of ReadMemory which accepts an Address object directly. D113098 might be relevant/interesting here.
This revision now requires changes to proceed.Nov 29 2021, 12:26 AM

I don't think this is the right solution to this problem.

If it's not, then the expression parser should probably be changed for consistency.

The way I see it, there are two separate questions here:

  • why is the module containing hz not loaded? The core file contains a proper memory address for it, does it not?

Not sure what you're asking about. The symbol's address (as determined from the kernel executable) is found in one of the PT_LOAD segments.

I don't think this is the right solution to this problem.

If it's not, then the expression parser should probably be changed for consistency.

I don't think we'd want to do that (*), but we could definitely make sure that the memory read command produces a meaningful error message instead of a bunch of zeroes. But lets wait to see what Jim thinks about that.

(*) My reasoning is that the expression evaluator can be used to view the static contents of the module before the process is even launched, whereas the memory read command is low level, and has "memory" in its name -- if there's no process, then there's no memory.

The way I see it, there are two separate questions here:

  • why is the module containing hz not loaded? The core file contains a proper memory address for it, does it not?

Not sure what you're asking about. The symbol's address (as determined from the kernel executable) is found in one of the PT_LOAD segments.

Yes, but is that information making its way to the right place (Target::GetSectionLoadList)? That's the thing which provides load addresses for section+offset pairs. What does "image list" say in this case?

(lldb) image list
[  0] 199D1036-3762-54C0-1775-361BD9AF5B17-C58273BB 0xffffffff80200000 /home/mgorny/git/libfbsdvmcore/kernel
jingham added a comment.EditedNov 29 2021, 5:26 PM

Is this the proposal: If you are talking to a live process (including a Core File process) and the plugin you are using says it knows nothing about 0x1234, then as a fallback you ask some file if it has that address? Why do you need to do this? Are you guessing that the core file is just busted, and there should have been records in it for some loaded section but they were left out, and so you are guessing that they are going to be in the same place as if that section loaded rigidly in its place in the original file (not necessarily true)?

Once you have a process, the process is the truth of the memory layout. Unless there's a real reason why the process lied to you about what was in the binary, I don't think we should be making up other data sources. And it also seems like you would have to understand WHY this particular plugin was lying to you to know how to actually recover that memory correctly. So solving it at this layer also seems wrong.

This doesn't seem like a good idea to me, and also seems like it's covering some other bug.

Well, that's what happens with the core file I have and what you're asking for is really above my paygrade. That said, this is kernel core dump, so load addresses should probably be the same as vm-addresses in the file. If adding something simple to the core file will achieve this, I suppose I can modify the generator instead.

Well, that's what happens with the core file I have and what you're asking for is really above my paygrade.

Maybe you should ask for a raise then. :P

That said, this is kernel core dump, so load addresses should probably be the same as vm-addresses in the file.

Yes, and the real question why doesn't lldb know that. The point of the load addresses is to tell lldb where a particular section is loaded, so the problem/bug is that that is not happening. Normally this is the job of the dynamic loader plugin. ProcessElfCore is using DynamicLoaderPOSIXDYLD, but that is probably not correct for the kernel cores. If the file addresses in the kernel image are always correct, then it may be sufficient to just make it use DynamicLoaderStatic for these kinds of cores. If not, you may want to try implementing something like the existing minidump plugin, which directly sets load addresses based on the information in the minidump file (bypassing the dynamic loader plugin, which wouldn't really work anyway, as the relevant information would not be captured in the minidump).

Blindly falling back to file addresses is not a solution, because it will be wrong for the cases where the file and load addresses do not match (e.g. an elf pie binary will have a file address of zero, but that's definitely not where it will be loaded).

mgorny abandoned this revision.Dec 6 2021, 2:47 AM

Closing in favor of D114911.