Page MenuHomePhabricator

Adding a RegisterContextMinidump_x86_64 converter

Authored by dvlahovski on Sep 26 2016, 8:53 AM.



This is a register context converter from Minidump to Linux reg context.
This knows the layout of the register context in the Minidump file
(which is the same as in Windows FYI) and as a result emits a binary data
buffer that matches the Linux register context binary layout.
This way we can reuse the existing RegisterContextLinux_x86_64 and
RegisterContextCorePOSIX_x86_64 classes.

Diff Detail


Event Timeline

dvlahovski updated this revision to Diff 72500.Sep 26 2016, 8:53 AM
dvlahovski retitled this revision from to Adding a RegisterContextMinidump_x86_64 converter.
dvlahovski updated this object.
dvlahovski added reviewers: labath, zturner.
dvlahovski added subscribers: amccarth, lldb-commits.

I tested this on the yet-to-be-submitted plugin code and it works fine on Linux (should work fine everywhere for that matter). I can do a backtrace, go up/down the stack frames and, of course, see all of the registers with register read

labath added inline comments.Sep 26 2016, 9:20 AM
42 ↗(On Diff #72500)

Replace these two functions with llvm::ArrayRef<uint8_t> GetData() const

55 ↗(On Diff #72500)

This function does not reflect the actual memory layout of the data, and you are not using it. I think it should go away.

161 ↗(On Diff #72500)

This looks like a private utility function. It should not be in the header.

186 ↗(On Diff #72500)

Could you add // end namespace minidump comments. Without them it's hard to tell what is going on here.

176 ↗(On Diff #72500)


Also i think having the register tests in this file is fine.

zturner added inline comments.Sep 26 2016, 9:48 AM
46–47 ↗(On Diff #72500)

Can this be a const MinidumpContext_x86_64_Flags*? Then you wouldn't have to do the uint32_t conversion each time since you could just & the flag you're checking directly against this value.

48 ↗(On Diff #72500)

Can we use a sizeof() here?

37 ↗(On Diff #72500)

Should these be endian specific types like ulittle16_t? If someone writes a minidump on a little endian system and runs this code on a big endian system, it would fail to parse the minidump correctly.

161 ↗(On Diff #72500)

The order of parameters here is confusing. the output buffer should be adjacent to its length. So it should be:

void writeRegister(ArrayRef<>, uint8_t*, size_t, const RegisterInfo*)

That said, passing around pointers and lengths is a pattern that needs to die. I would much rather this be:

void writeRegister(ArrayRef<uint8_t> &reg_src, MutableArrayRef<uint8_t> reg_dest, const RegisterInfo &reg);

177 ↗(On Diff #72500)

I would make the conversion a macro like this:

#define REG_VAL(x) *(reinterpret_cast<uint64_t *>(x))

Then write the asserts inline. The reason for this is that when you do it this way, when a test fails the test runner will report that the failure happened in registerEqualToVal, which won't give you any clue about what test it failed in. if the assertion is kept in the body of the test, the failure will report which test it was in, so you can more easily see.

dvlahovski marked 4 inline comments as done.

Updating the CL regarding Pavel's comments

I will fix the comments that Zachary made in the next revision

labath edited edge metadata.Sep 26 2016, 10:05 AM

lgtm, after Zachary is happy.

dvlahovski added inline comments.Sep 26 2016, 10:07 AM
47–48 ↗(On Diff #72509)

If it is then when I do a & the result is an enum class of type MinidumpContext_x86_64_Flags. And the compiler complains that this is not convertible to bool

dvlahovski added inline comments.Sep 26 2016, 10:10 AM
49 ↗(On Diff #72509)

sizeof(uint16_t), sizeof(uint32_t), etc ?

zturner added inline comments.Sep 26 2016, 10:36 AM
49 ↗(On Diff #72509)

I think my comments got out of line and this is no longer at the position I suggested it. that said, I actually didn't notice all these constants being passed into writeRegister. I would propose changing this. Find the RegisterInfo class in lldb-private-types.h and add the following methods:

llvm::ArrayRef<uint8_t> data(const uint8_t *context_base) const { 
  return llvm::ArrayRef<uint8_t>(context_base + byte_offset, byte_size);

llvm::MutableArrayRef<uint8_t> mutable_data(uint8_t *context_base) const {
  return llvm::MutableArrayRef<uint8_t>(context_base + byte_offset, byte_size);

Then you can write:

writeRegister(source_data, reg_info[lldb_cs_x86_64].mutable_data(result_base));
amccarth added inline comments.Sep 26 2016, 10:45 AM
1 ↗(On Diff #72509)

Should match file name.

31 ↗(On Diff #72509)

It might be better to put this comment with the declaration in the header file, since it explains what to pass in and what it does. Comments in .cpp files should contain implementation details that the callers don't necessarily need to know.

47–48 ↗(On Diff #72509)

I think what Zach means is that you could locally define a uint32_t const, initialized with the value from the enum. Then each if statement could use that constant without a cast.

Also, is this right? MinidumpContext_x86_x64_Flags::Control has two bits set, so the condition will be true if either of them is set. Is that the intended behavior? Or should you be ensuring that they're both set like this:

const utin32_t ControlFlags = MinidumpContext_x86_64::Control;
if ((*context_flags & ControlFlags) == ControlFlags) {


1 ↗(On Diff #72509)

Please make this line match the file name.

177 ↗(On Diff #72509)

+1 to Zach's idea.

Also, consider whether EXPECT_EQ is more appropriate than ASSERT_EQ for these.

dvlahovski added inline comments.Sep 26 2016, 11:08 AM
47–48 ↗(On Diff #72509)

Now that I think about I should check the arch flag bit at the beggining of this.
But then this if's are OK I think - in them I only want to check if that specific bit is set.

dvlahovski marked 5 inline comments as done.Sep 27 2016, 8:31 AM
dvlahovski added inline comments.
49 ↗(On Diff #72509)

This will make the code very clean indeed, but there is a problem.

From the RegisterInfo I only get the byte_offset and I don't use the byte_size because some of the registers in the Minidump context are smaller (e.g. cs is 16 bit). All of the registers in the Linux context are 64 bits in size. So that's why I basically hard-coded the size in each of the funciton calls. Now with the next revision I tried to make the code clearer.

38 ↗(On Diff #72509)

So I had the actual struct for the register context in the previous revision, but now I've removed it, because I actually don't use it. So added a comment describing it's layout.

For the endianness part - because this code only permutes the bytes in the layout the Linux register context expects them, I don't think that here is the place to handle that. I was hoping that the Linux register context will handle it, but I don't think that's the case when I look at the code in RegisterContextCorePOSIX_x86_64::ReadRegister.

Am I right - do the Linux context assume little endian? If so maybe I can also fix that.

177 ↗(On Diff #72509)

Probably EXPECT_EQ is better because if there is a mismatch in the reg values, one can see all of the values.

I still don't have the sense when to use EXPECT_EQ and when ASSERT_EQ.

dvlahovski updated this revision to Diff 72659.Sep 27 2016, 8:33 AM
dvlahovski edited edge metadata.

Addressing Zachary's and Adrian's comments

zturner edited edge metadata.EditedSep 27 2016, 8:51 AM

ASSERT_EQ causes the test method to return immediately if the condition fails. EXPECT_EQ will continue running the same test body. I like to use ASSERT_EQ, for example, if you are checking the value of a pointer returned by a function. First you want to check if it's null, then you want to check that its value is the value you expect. So I would do

ASSERT_NE(p, nullptr);
EXPECT_EQ(*p, 42);

Because if it's null, the rest of the statements in the test body are undefined.

Another example is if a method returns some value, say number of items in a list, and then you want to call another method to process that many items. Say:

int x = foo.GetItemCount();
const Item *items = foo.GetItems();
bool Success = ProcessItems(items, x);

If the number of items is not what you expect, it doesn't make sense to run the next statement. So you could write:

int x = foo.GetItemCount();
ASSERT_EQ(42, x);
const Item *items = foo.GetItems();
EXPECT_TRUE(ProcessItems(items, x));

On the other hand, if each thing you are testing is independent of the previous statement, everything can be expect. For example:

EXPECT_EQ("0", Foo(0));
EXPECT_EQ("1", Foo(1));
EXPECT_EQ("2", Foo(2));
EXPECT_EQ("3", Foo(3));

This way you can see as much diagnostic information as possible in the output (for example you might be able to spot a pattern in which values are failing.

Thanks for the explanation and examples! :) Will keep this in mind in the future.

amccarth added inline comments.Sep 27 2016, 9:03 AM
177 ↗(On Diff #72659)

EXPECT_xxx will check the condition and report if it's wrong. But then the test proceeds.

ASSERT_xxx will check the condition, and, if it's wrong, it will report _and_ terminate the current test.

So my rule of thumb is:

  1. Prefer EXPECT_xxx when checking that the code does what it's supposed to do.
  2. Use ASSERT_xxx when the rest of the test depends on this condition.
dvlahovski added inline comments.Sep 27 2016, 9:15 AM
177 ↗(On Diff #72659)

Thanks :)

zturner accepted this revision.Sep 27 2016, 9:28 AM
zturner edited edge metadata.
zturner added inline comments.
50 ↗(On Diff #72659)

If they are just smaller (but never bigger), you can call take_front(N) on the result. So you could say:

writeRegister(source_data, reg_info[lldb_cs_x86_64].mutable_data(result_base).take_front(2));

Now the line is starting to get long so might look ugly if it wraps, so it's up to you. It is somewhat safer at least since you will assert if you accidentally specify an invalid size, and you don't have to worry about getting the computation right.

Perhaps you could combine this with the approach you've already taken here. Add the functions to RegisterInfo, then change getDestRegister to look like this:

llvm::MutableArrayRef<uint8_t> getDestRegister(const RegisterInfo &reg,
                                               uint8_t *context, uint32_t lldb_reg_num) {
  auto bytes = reg.mutable_data(context);

  switch (lldb_reg_num) {
  case lldb_cs_x86_64:
  case lldb_ds_x86_64:
  case lldb_es_x86_64:
  case lldb_fs_x86_64:
  case lldb_gs_x86_64:
  case lldb_ss_x86_64:
    return bytes.take_front(2);
  case lldb_rflags_x86_64:
    return bytes.take_front(4);
    return bytes.take_front(8);

then you could call writeRegister like this:

writeRegister(source_data, getDestRegister(reg_info, result_base, lldb_cs_x86_64));

At this point though I'm not sure what's better. So don't consider this a requirement, up to you :)

This revision is now accepted and ready to land.Sep 27 2016, 9:28 AM
dvlahovski edited edge metadata.

I like the combined approach more. So this is the implementation

This revision was automatically updated to reflect the committed changes.