This is an archive of the discontinued LLVM Phabricator instance.

Merge target triple into module triple when constructing module from memory
Closed · Public

Authored by xiaobai on Feb 19 2019, 1:46 PM.

Details

Summary

While debugging an Android process remotely from a Windows machine, I
noticed that the modules constructed from an object file in memory only had
information about the architecture. Without knowledge of the OS or environment,
expression evaluation sometimes leads to incorrectly generated code or a
debugger crash. While we cannot know for certain what triple a module
constructed from an in-memory object file will have, we can use the triple
from the target to try and fill in the missing details.

Diff Detail

Repository
rL LLVM

Event Timeline

xiaobai created this revision. Feb 19 2019, 1:46 PM

a good guess is the triple that the target executable was compiled for.

What does this do when the executable has multiple slices, such as a Mach-O universal binary?

Hmm, I hadn't considered that case. I'll test that and see what happens. Thanks for pointing that out.

clayborg added inline comments. Feb 19 2019, 2:05 PM
source/Target/Process.cpp
2638–2639 (On Diff #187438)

Is the MergeFrom in the first part not enough? I am worried about the case where we don't have even an executable, no one has set the architecture on the target, or worse yet, they have set the wrong architecture on the target. We want to correct the architecture on the target if we didn't specify it or the target was wrong. I am worried if we do this here we might hose up things in those cases.

xiaobai marked an inline comment as done. Feb 19 2019, 2:10 PM
xiaobai added inline comments.
source/Target/Process.cpp
2638–2639 (On Diff #187438)

MergeFrom is not enough. When debugging an android-aarch64 binary, the triple was just set to aarch64--- for modules constructed from in-memory object files, which is not enough info to do anything meaningful.

However, thinking about this further, MergeFrom might not even be what we want here. Specifically, this is *just* a guess, and the information from the in-memory object file is likely more reliable. It would probably be better not to merge but to overwrite when information is available.

clayborg added inline comments. Feb 19 2019, 2:28 PM
source/Target/Process.cpp
2639 (On Diff #187438)

So if you have a target that is set to "aarch64-linux-android", and you have "aarch64---" in your object file/module, you would want to merge the missing bits from the target to augment the module's architecture so it grabs the "linux-android" bits. So I do believe MergeFrom should be used to augment the Module's architecture.
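
The merge direction described above can be sketched in isolation. This is a minimal illustration, not lldb's ArchSpec::MergeFrom: the SimpleTriple type, its Parse, MergeFrom, and Str members, and the Example function are all hypothetical, and the sketch assumes triples are written in the full four-component form (so the target is spelled "aarch64--linux-android" here rather than the three-component "aarch64-linux-android").

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Hypothetical stand-in for a triple. This is NOT lldb's ArchSpec or
// llvm::Triple; an empty string means "component not specified".
struct SimpleTriple {
  std::string arch, vendor, os, env;

  // Parse the four-component form "arch-vendor-os-env".
  static SimpleTriple Parse(const std::string &text) {
    SimpleTriple t;
    std::string *fields[] = {&t.arch, &t.vendor, &t.os, &t.env};
    std::istringstream in(text);
    for (std::string *field : fields) {
      if (!std::getline(in, *field, '-'))
        break; // fewer than four components: the rest stay empty
    }
    return t;
  }

  // Fill in only the components this triple is missing; anything the
  // module already knows about itself is left untouched.
  void MergeFrom(const SimpleTriple &other) {
    if (arch.empty()) arch = other.arch;
    if (vendor.empty()) vendor = other.vendor;
    if (os.empty()) os = other.os;
    if (env.empty()) env = other.env;
  }

  std::string Str() const { return arch + "-" + vendor + "-" + os + "-" + env; }
};

// Usage: the Android case from this thread. The module only knows its
// architecture, and the target supplies the OS and environment.
inline void Example() {
  SimpleTriple mod = SimpleTriple::Parse("aarch64---");
  mod.MergeFrom(SimpleTriple::Parse("aarch64--linux-android"));
  assert(mod.Str() == "aarch64--linux-android");
}
```

Because the module's triple is the receiver of the merge, information read from the object file wins whenever it is present, and the target only fills the gaps.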

xiaobai marked an inline comment as done. Feb 19 2019, 2:35 PM
xiaobai added inline comments.
source/Target/Process.cpp
2639 (On Diff #187438)

Ah, yes! Merging from the other direction is definitely what I wanted. Thanks for pointing that out.

xiaobai updated this revision to Diff 187458. Feb 19 2019, 3:27 PM

Updating based on feedback from clayborg

xiaobai updated this revision to Diff 187460. Feb 19 2019, 3:28 PM

Minor change

xiaobai retitled this revision from "Make educated guess when constructing module from memory" to "Merge target triple into module triple when constructing module from memory". Feb 19 2019, 3:30 PM
xiaobai edited the summary of this revision.
Harbormaster completed remote builds in B28302: Diff 187460.

Can you think of a way to test this?

Probably just a replay of GDB protocol packets connected to a testing server would work, but I don't know if there is infrastructure for that yet.

Perhaps a unit test?

labath accepted this revision. Feb 19 2019, 11:50 PM

Augmenting the architecture with the target information sounds right to me. If we're reading a module from memory, then we already have a running target, and I consider that information more reliable (since it comes from os/lldb-server) than the object file architecture (which is just random bits in process memory). I might even go so far as to say that if we detect that the module and target ArchSpecs substantially differ (for some meaning of "substantially"), we should reject the attempt to create the module in the first place. Maybe we even do that already at some level?
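
As an illustration only, here is one possible reading of "substantially differ": two triples conflict when some component is specified on both sides and the values disagree. The TripleParts type and both functions are hypothetical and do not reflect any check that lldb actually performs.

```cpp
#include <cassert>
#include <string>

// Hypothetical triple; an empty string means "component not specified".
struct TripleParts {
  std::string arch, vendor, os, env;
};

// True when both sides specify a component and the values disagree.
static bool Conflicts(const std::string &a, const std::string &b) {
  return !a.empty() && !b.empty() && a != b;
}

// Assumed definition of "substantially differ": at least one component
// that both triples specify has a different value. Unspecified
// components never count as a difference.
bool SubstantiallyDiffer(const TripleParts &mod, const TripleParts &target) {
  return Conflicts(mod.arch, target.arch) ||
         Conflicts(mod.vendor, target.vendor) ||
         Conflicts(mod.os, target.os) ||
         Conflicts(mod.env, target.env);
}

// Usage: an arch-only module is compatible with an Android target, but a
// module claiming a different architecture is not.
inline void Example() {
  TripleParts target{"aarch64", "", "linux", "android"};
  assert(!SubstantiallyDiffer(TripleParts{"aarch64", "", "", ""}, target));
  assert(SubstantiallyDiffer(TripleParts{"x86_64", "", "", ""}, target));
}
```

Under this reading, a module that passes the check can be merged with the target's triple without overwriting anything it already specifies.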

Unfortunately, I think testing this is going to be hard. The infrastructure for testing in-memory object files is very underdeveloped. I think the best attempt would be to have a test program which creates some object files in memory (via the LLVM JIT, or perhaps by loading pre-prepared binaries from disk) and then uses the GDB JIT interface to notify lldb about them. But I don't think we have anything similar to this currently.
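
For context, the inferior side of such a test program could look roughly like the sketch below. The declarations follow the GDB JIT compilation interface as documented by GDB (jit_code_entry, jit_descriptor, __jit_debug_register_code, __jit_debug_descriptor). The RegisterInMemoryObject helper is a hypothetical name, the noinline attribute and empty asm statement assume GCC or Clang, and a real test would pass a valid object file image rather than arbitrary bytes.

```cpp
#include <cassert>
#include <cstdint>

extern "C" {
// Declarations from the GDB JIT compilation interface.
typedef enum { JIT_NOACTION = 0, JIT_REGISTER_FN, JIT_UNREGISTER_FN } jit_actions_t;

struct jit_code_entry {
  struct jit_code_entry *next_entry;
  struct jit_code_entry *prev_entry;
  const char *symfile_addr; // start of the in-memory object file
  uint64_t symfile_size;    // its size in bytes
};

struct jit_descriptor {
  uint32_t version;
  uint32_t action_flag; // holds a jit_actions_t value
  struct jit_code_entry *relevant_entry;
  struct jit_code_entry *first_entry;
};

// The debugger places a breakpoint on this function. It must not be
// inlined or optimized away (GCC/Clang syntax assumed).
void __attribute__((noinline)) __jit_debug_register_code() {
  asm volatile("" ::: "memory");
}

// The debugger reads this descriptor when the breakpoint is hit.
struct jit_descriptor __jit_debug_descriptor = {1, 0, nullptr, nullptr};
}

// Hypothetical helper: link `entry` at the head of the list and notify the
// debugger that a new in-memory object file is available.
void RegisterInMemoryObject(jit_code_entry &entry, const char *data,
                            uint64_t size) {
  assert(data != nullptr && size > 0);
  entry.symfile_addr = data;
  entry.symfile_size = size;
  entry.prev_entry = nullptr;
  entry.next_entry = __jit_debug_descriptor.first_entry;
  if (entry.next_entry)
    entry.next_entry->prev_entry = &entry;
  __jit_debug_descriptor.first_entry = &entry;
  __jit_debug_descriptor.relevant_entry = &entry;
  __jit_debug_descriptor.action_flag = JIT_REGISTER_FN;
  __jit_debug_register_code();
}
```

A test built on this would still need a debugger-side check that the resulting module's triple picked up the OS and environment from the target, which is the part the thread notes has no existing infrastructure.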

a good guess is the triple that the target executable was compiled for.

What does this do when the executable has multiple slices, such as a Mach-O universal binary?

I don't think this should be an issue since a Module object always points to a specific slice in the fat binary.

This revision is now accepted and ready to land. Feb 19 2019, 11:50 PM
clayborg accepted this revision. Feb 20 2019, 1:38 PM

Looks good.

This revision was automatically updated to reflect the committed changes.
Herald added a project: Restricted Project. Feb 20 2019, 3:12 PM