This is an archive of the discontinued LLVM Phabricator instance.

Treat a UUID of all zeros consistently to mean "Invalid UUID"
ClosedPublic

Authored by jingham on Aug 18 2022, 5:34 PM.

Details

Summary

At present, there are two ways to get data into a UUID object, an unchecked version and what's called the "Optional" version. So for instance, there's setFromData and setFromOptionalData. The difference is that the Optional version looks at the UUID bytes and if they are all 0's it clears the data store for the UUID, rendering it an Invalid UUID.

However, there was no apparatus for determining which of these two variants to use. The PDB version always uses the Optional version, the Darwin support code used both somewhat inconsistently. The ELF object file code uses the unchecked version. And the generic parts of the code sometimes use one and sometimes another. So the current approach is not very coherent.

On Darwin, treating all-zero's UUID's as significant is actually wrong behavior - so that needs to be changed. But that still leaves the system seeming arbitrary. For instance, SBModuleSpec::SetUUIDBytes uses the checked method. So even if some ObjectFile format treats UUID's of 0 as significant, you wouldn't be able to find it with SBModuleSpec.

My guess is this is all just historical and it would be fine to treat UUID's of 0 as meaning "not valid for matching", which is what this patch does.

If this is not the case, we'll have to add something like ObjectFile::AllZerosUUIDIsValid, and then check it everywhere outside the object file reader that we build UUID's. But I'd rather not do that if there isn't a compelling reason to do so.

Diff Detail

Event Timeline

jingham created this revision.Aug 18 2022, 5:34 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 18 2022, 5:34 PM
jingham requested review of this revision.Aug 18 2022, 5:34 PM
Herald added a project: Restricted Project. · View Herald TranscriptAug 18 2022, 5:34 PM
clayborg requested changes to this revision.Aug 18 2022, 7:52 PM

Minidump files have UUID values that are zeroed out. We will need to do something for these that can sense all zeroes and return a default constructed one.

lldb/source/Plugins/Process/minidump/MinidumpParser.cpp
72

Minidump files created by Google's Breakpad generates UUIDs with zeros:

Modules[2].CvRecord.location.DataSize = 0x00000032 (50)
Modules[2].CvRecord.location.Rva = 0x000698e0
Modules[2].CvRecord.signature    = 0x53445352 (Pdb70)
Modules[2].CvRecord.uuid         = 00000000-0000-0000-0000-000000000000
Modules[2].CvRecord.age          = 0x00000000
Modules[2].CvRecord.pdb_name     = system@framework@boot.art

It would be fine to just check for the "pdb70_uuid->Uuid" being all zeroes here and return a default constructed UUID object

This revision now requires changes to proceed.Aug 18 2022, 7:52 PM
clayborg added inline comments.Aug 18 2022, 7:57 PM
lldb/source/API/SBModuleSpec.cpp
137

If we want this API to be consistent to what it was doing before, then we need to check for all zeroes here too?

lldb/source/Plugins/DynamicLoader/Darwin-Kernel/DynamicLoaderDarwinKernel.cpp
1392

Have you checked if the kernel ever just specifies all zeroes here? It it does, this will change things right?

lldb/source/Plugins/DynamicLoader/MacOSX-DYLD/DynamicLoaderMacOSXDYLD.cpp
888

More LC_UUID values are valid AFAIK, so this should be ok.

lldb/source/Plugins/Process/minidump/MinidumpParser.cpp
70–73

Fix the UUID UUID::fromCvRecord(UUID::CvRecordPdb70) function and call that function from here.

lldb/source/Utility/UUID.cpp
45

Ditto above comment from the MinidumpParser. Actually this function should be used, the implementation from the MinidumpParser should use this function.

Minidump files have UUID values that are zeroed out. We will need to do something for these that can sense all zeroes and return a default constructed one.

I'm confused. The MinidumParser.cpp already used fromOptionalData and fromOptionalStringRef to create its UUID's. So this patch won't change the Minidump behavior.

Remember, all this patch does is make a universal rule that "All zero UUID's are placeholders and not meant to be matched against." The only platforms/ObjectFile readers that weren't already following that rule were Darwin - where this was definitely just historical accident - and ObjectFileELF. Then there were a few places in generic code where we seem to have tossed a coin for which behavior to use. Windows & Breakpad were already calling the "Optional" version of the set functions. So this is only a change in behavior for ELF.

lldb/source/Plugins/DynamicLoader/Darwin-Kernel/DynamicLoaderDarwinKernel.cpp
1392

On Darwin, UUID's of all zeros always means "I had to put in a UUID, but I don't actually want you to use it". I don't know whether the Kernel currently produces this or not - it's currently used for the stub binaries that Playgrounds uses.

lldb/source/Plugins/Process/minidump/MinidumpParser.cpp
70–73

Note that fromCvRecord already used the fromOptional version, so its behavior doesn't change with this patch. I don't think anything else needs to be done here.

clayborg accepted this revision.Aug 19 2022, 1:52 PM
This revision is now accepted and ready to land.Aug 19 2022, 1:52 PM
labath accepted this revision.Aug 22 2022, 1:12 AM
labath added inline comments.
lldb/include/lldb/Utility/UUID.h
50–51

The main reason this was a factory function was so that we could use the factory name to disambiguate between the two interpretations of zero "uuids".

As we're removing that, we could also go back to using regular constructors for object creation.

jingham updated this revision to Diff 456473.Aug 29 2022, 3:22 PM

Remove the static factories and replace them with constructors.

This was mostly mechanical, but a biggish patch against the previous version, so I thought I'd put this up for one more look in case folks were interested.

This revision was automatically updated to reflect the committed changes.