This is an archive of the discontinued LLVM Phabricator instance.

BreakpadRecords: Address post-commit feedback

Authored by labath on Jan 21 2019, 2:06 PM.



This addresses the issues raised in D56844. It removes the accessors from the
breakpad record structures by making the fields public. Also, I refactor the
UUID parsing code to remove hard-coded constants.

Diff Detail


Event Timeline

labath created this revision.Jan 21 2019, 2:06 PM

Let me know what you think of the UUID code and if you want to include that in this patch


rename to "consume_hex_integer" or an extra parameter for the base instead of hard coding to 16?


For Apple platforms Breakpad actually incorrectly byte swaps the first 32 bits and the next two 16 bit values. I have a patch for this, but since this is moving, it would be great to get that fix in here. Also, many linux breakpad file have the UUID field present but set to all zeroes which is bad as UUID parsing code will cause multiple modules to claim a UUID of all zeroes is valid and causes all modules with such a UUID to just use the first one it finds. The code I have in my unsubmitted patch is:

if (pdb70_uuid->Age == 0) {
  bool all_zeroes = true;
  for (size_t i=0; all_zeroes && i<sizeof(pdb70_uuid->Uuid); ++i)
    all_zeroes = pdb70_uuid->Uuid[i] == 0;
  // Many times UUIDs are not filled in at all, so avoid claiming that
  // all such libraries have a valid UUID that is all zeroes.
  if (all_zeroes)
    return UUID();
  if (arch.GetTriple().getVendor() == llvm::Triple::Apple) {
    // Breakpad incorrectly byte swaps the first 32 bit and next 2 16 bit
    // values in the UUID field. Undo this so we can match things up
    // with our symbol files
    uint8_t apple_uuid[16];
    // Byte swap the first 32 bits
    apple_uuid[0] = pdb70_uuid->Uuid[3];
    apple_uuid[1] = pdb70_uuid->Uuid[2];
    apple_uuid[2] = pdb70_uuid->Uuid[1];
    apple_uuid[3] = pdb70_uuid->Uuid[0];
    // Byte swap the next 16 bit value
    apple_uuid[4] = pdb70_uuid->Uuid[5];
    apple_uuid[5] = pdb70_uuid->Uuid[4];
    // Byte swap the next 16 bit value
    apple_uuid[6] = pdb70_uuid->Uuid[7];
    apple_uuid[7] = pdb70_uuid->Uuid[6];
    for (size_t i=8; i<sizeof(pdb70_uuid->Uuid); ++i)
      apple_uuid[i] = pdb70_uuid->Uuid[i];
    return UUID::fromData(apple_uuid, sizeof(apple_uuid));
  } else
    return UUID::fromData(pdb70_uuid->Uuid, sizeof(pdb70_uuid->Uuid));
} else
  return UUID::fromData(pdb70_uuid, sizeof(*pdb70_uuid));
labath marked 2 inline comments as done.Jan 22 2019, 8:48 AM
labath added inline comments.

I'll rename the function before submitting.


The byte swapping is already handled in this code (that's why it's this complicated). My impression was that the swapping is not apple-specific, but rather depends where you read the UUID from (MODULE record has it swapped, INFO record doesn't). Though that may depend on how we chose to interpret the raw bytes in other UUID sources (object files, pdb files, ...).

As for the all-zero case, that should be easily fixable, by changing fromData to fromOptionalData, but I'm surprised that this is necessary, as I was under the impression that breakpad invent's its own UUIDs in case the original object file doesn't have them. (This would actually be the better case, as otherwise we'd have to figure out how to tell the fictional and non-fictional uuids apart.) @lemo: Can you shed any light on this?

clayborg added inline comments.Jan 22 2019, 8:59 AM

This is OK as long as the UUIDs for ELF don't fall into this category. I am able to match up UUIDs for ELF just fine for breakpad files for Android.

labath marked an inline comment as done.Jan 22 2019, 9:05 AM
labath added inline comments.

Normally on linux you should always have the INFO record, which will have the unmangled UUID, and which we will give preference to if it is available.

If for some reason we don't find an INFO record, then we will use the version from the MODULE record, which we will manually unmangle. But that should be the right thing to do as it matches what the breakpad generator does. (You can see this by looking at a file which has both of these records -- they will differ in that the first one will be mangled and will have an extra zero at the end).

lemo accepted this revision.Jan 23 2019, 12:32 PM

Looks good - the all-zero UUID case is a bit unfortunate but it seems we have to handle it (like Greg suggested)


0; ?


coding-convention-wise: should these definitions use struct instead of class?


const method qualifier?



This revision is now accepted and ready to land.Jan 23 2019, 12:32 PM
labath marked 5 inline comments as done.Jan 23 2019, 1:23 PM

After some internal discussion, it seems that the situation with the all-zero UUIDs is as follows:

  • breakpad symbol files do not attach a special meaning to a zero UUID - if a file does not have a build-id, the dump_syms tool will use a hash of the first page of the text section (or something equally silly)
  • minidump files may treat the missing build-id by replacing it with zeroes depending on the tool used to produce the minidump: breakpad doesn't do that (it does the same hash as above), crashpad does.

So it seems like there is nothing to do here. Maybe the UUID reading code in ProcessMinidump needs revising though.


That is not necessary, as to_integer initializes it. Perhaps more importantly, not initializing this allows tools like msan and valgrind to actually detect the cases when you end up using an uninitialized value.


I don't think we have a strict rule about this case, but personally, when something starts using inheritance, I tend to think of it as a class.


This is a free function, not a method. :)

This revision was automatically updated to reflect the committed changes.
labath marked an inline comment as done.