This is an archive of the discontinued LLVM Phabricator instance.

[LLDB] Enable 64 bit debug/type offset
ClosedPublic

Authored by ayermolo on Nov 23 2022, 3:17 PM.

Details

Summary

This came out of from https://discourse.llvm.org/t/dwarf-dwp-4gb-limit/63902
With big binaries we can have .dwp files where .debug_info.dwo section can grow
beyond 4GB. We would like to support this in LLVM and in LLDB.

The plan is to enable manual parsing of cu/tu index in DWARF library
(https://reviews.llvm.org/D137882), and then
switch internal index data structure to 64 bit.
For the second part is to enable 64bit offset support in LLDB with
this patch.

Diff Detail

Event Timeline

ayermolo created this revision.Nov 23 2022, 3:17 PM
Herald added a project: Restricted Project. · View Herald Transcript
ayermolo published this revision for review.Nov 23 2022, 3:19 PM
Herald added a project: Restricted Project. · View Herald Transcript
labath added a subscriber: labath.Nov 24 2022, 2:17 AM

I am puzzled by the OSO changes in the DIERef class. How do they tie in with the increase in the offset size? It seems like it should at best be a separate patch...

ayermolo added a comment.EditedNov 24 2022, 7:40 AM

I am puzzled by the OSO changes in the DIERef class. How do they tie in with the increase in the offset size? It seems like it should at best be a separate patch...

That part was more about having a consistent interface for setting and getting lldb::user_id_t to avoid bugs, and having implicit assumptions what bit layout looks like.

For example here:
user_id_t SymbolFileDWARF::GetUID(DIERef ref) {

if (GetDebugMapSymfile())
  return GetID() | ref.die_offset();

I am puzzled by the OSO changes in the DIERef class. How do they tie in with the increase in the offset size? It seems like it should at best be a separate patch...

I have been helping Alexander get this patch ready for open source. We needed to do these changes or this patch doesn't work and would break mac debugging. The reason is some people were manually creating lldb::user_id_t IDs and then manually decoding them. If we change the DIE offset size, then the people that were manually creating user_id_t would now be encoding bits into the wrong bits if they were every put into a DIERef we would extract the wrong information.

Part of what this patch is doing is allowing a DIERef to get a user_id_t from the object and also create itself from a user_id_t. This allows a single consistent interface. No one should be manually encoding user_id_t values now, and it should always be done through DIERef.

labath added a comment.Dec 1 2022, 7:40 AM

The explanation makes sense, and I *think* the patch is ok, but it's hard to review it with all the noise. I still believe the DIERef change would be better off as a separate patch, so that the change is not obscured by the (hopefully mechanical) aspects of increasing the size of the offset field.

The explanation makes sense, and I *think* the patch is ok, but it's hard to review it with all the noise. I still believe the DIERef change would be better off as a separate patch, so that the change is not obscured by the (hopefully mechanical) aspects of increasing the size of the offset field.

I don't think that would be mechanical, because of implicit and explicit assumptions of bit layout, and trying to keep the size of DIERef from doubling.

The explanation makes sense, and I *think* the patch is ok, but it's hard to review it with all the noise. I still believe the DIERef change would be better off as a separate patch, so that the change is not obscured by the (hopefully mechanical) aspects of increasing the size of the offset field.

I am saying we can't make the DIERef change separately because it breaks the buildbots. Without modifying the OSO stuff, we can't make this patch work. If we need to change the max DIE offsets size, we need all these fixes.

We have modified the DIERef class over the years and made it work for both DWARF in .o files (OSO) for mac and then for fission (.dwo). The two made different changes that never affected either one but with these changes they need to coexist and actually work with DIERef in the same way with no assumptions on the DIE offset being in the lower 32 bits of a user_id_t.

labath added a comment.Dec 2 2022, 6:33 AM

I don't believe there's no way to split this patch up. I mean, just half of it is dedicated to changing PRIx32 into PRIx64. Surely that can be a patch of it's own, even if it meant that DWARFDIE::GetOffset temporarily returns something other than dw_offset_t. And if we first changed that to use the llvm::formatv function (which does not require width specifiers), then changing the size of dw_offset_t would be a no-op there. And the fact that the patch description can be summed up into one sentence (enable 64-bit offsets) means that I as a reviewer have to reverse-engineer from scratch what all of these (very subtle) changes do and how they interact.

For example, I only now realized that this patch makes the DIERef class essentially interchangable with the user_id_t type (in has an (implicit!) constructor and a get_id() function). That's not necessarily bad, if every DIERef can really be converted to a user_id_t. However, if that's the case, then why does SymbolFileDWARF::GetUID still exist?

Previously the DIERef class did not encode information about which "OSO" DWARF file in the debug map it is referring to, and they way that was enforced was by having all conversions go through that function (which was the only way to convert from one to the other). If every DIERef really does carry the OSO information then this function (and it's counterpart, DecodeUID) should not exist. If it doesn't carry that information, then we're losing some type safety because we now have two ways to do the conversion, and one of them is (sometimes?) incorrect. Maybe there's no way to avoid that, it's definitely worth discussing, and it that would be a lot easier without the other changes in the way.

As for the discussion, I am still undecided about whether encoding the OSO information into the DIERef is a good thing. In some ways, it is very similar to dwo files (whose information is encoded there), but OTOH, they are also very different. An OSO is essentially a completely self-contained dwarf file, and we represent it in lldb as such (with its own Module, SymbolFile objects, etc.). A DWO file is only syntactically independent (e.g. its DIE tree can be parsed independently), but there's no way to interpret the information inside it without accessing the parent object file (as that contains all the address information). This is also reflected in how they're represented in LLDB. The don't have their own Module objects, and the SymbolFileDWARFDwo class is just a very thin wrapper that forwards everything to the real symbol file. Therefore, it does not seem *un*reasonable to have one way/object to reference a DIE inside a single SymbolFileDWARF (and all the DWO files it references), and another to reference *any* DIE in the set of all SymbolFileDWARFs (either a single object, or multiple objects managed by a SymbolFileDWARFDebugMap) which provide the information for this module.

lldb/source/Plugins/SymbolFile/DWARF/DIERef.h
51

At least make this explicit so it can't be constructed from any random integer. I'd consider even making this a named (static) function (e.g. DIERef fromUID(user_id_t)), as one should be extra careful around these conversions.

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
15

This would look much better in the block on line 60, next to the other includes from this directory.
Or, even better, if you just delete all the empty lines between the includes, then clang-format will automatically sort the whole thing.

ayermolo updated this revision to Diff 479769.Dec 2 2022, 4:00 PM

Addressed inlined comments.

I don't believe there's no way to split this patch up. I mean, just half of it is dedicated to changing PRIx32 into PRIx64. Surely that can be a patch of it's own, even if it meant that DWARFDIE::GetOffset temporarily returns something other than dw_offset_t. And if we first changed that to use the llvm::formatv function (which does not require width specifiers), then changing the size of dw_offset_t would be a no-op there. And the fact that the patch description can be summed up into one sentence (enable 64-bit offsets) means that I as a reviewer have to reverse-engineer from scratch what all of these (very subtle) changes do and how they interact.

For example, I only now realized that this patch makes the DIERef class essentially interchangable with the user_id_t type (in has an (implicit!) constructor and a get_id() function). That's not necessarily bad, if every DIERef can really be converted to a user_id_t. However, if that's the case, then why does SymbolFileDWARF::GetUID still exist?

Previously the DIERef class did not encode information about which "OSO" DWARF file in the debug map it is referring to, and they way that was enforced was by having all conversions go through that function (which was the only way to convert from one to the other). If every DIERef really does carry the OSO information then this function (and it's counterpart, DecodeUID) should not exist. If it doesn't carry that information, then we're losing some type safety because we now have two ways to do the conversion, and one of them is (sometimes?) incorrect. Maybe there's no way to avoid that, it's definitely worth discussing, and it that would be a lot easier without the other changes in the way.

As for the discussion, I am still undecided about whether encoding the OSO information into the DIERef is a good thing. In some ways, it is very similar to dwo files (whose information is encoded there), but OTOH, they are also very different. An OSO is essentially a completely self-contained dwarf file, and we represent it in lldb as such (with its own Module, SymbolFile objects, etc.). A DWO file is only syntactically independent (e.g. its DIE tree can be parsed independently), but there's no way to interpret the information inside it without accessing the parent object file (as that contains all the address information). This is also reflected in how they're represented in LLDB. The don't have their own Module objects, and the SymbolFileDWARFDwo class is just a very thin wrapper that forwards everything to the real symbol file. Therefore, it does not seem *un*reasonable to have one way/object to reference a DIE inside a single SymbolFileDWARF (and all the DWO files it references), and another to reference *any* DIE in the set of all SymbolFileDWARFs (either a single object, or multiple objects managed by a SymbolFileDWARFDebugMap) which provide the information for this module.

How would you like me to break up this diff? Is factoring OSO into another diff enough, or do you want more granular one?

labath added a comment.Dec 5 2022, 2:58 AM

How would you like me to break up this diff? Is factoring OSO into another diff enough, or do you want more granular one?

Hard to say without seeing what the patches would look like, but yes, in general, I'd say that the OSO/DIERef integration is the most fundamental part of this patch, and I'd optimize things such that this part stands out as much as possible. If you can do that, then maybe everything else can be in the second patch (sequenced either before or after it). Another obvious patch could be to create formatv-based version of the Module::ReportWarning function, and all of the PRIx64 parts of this patch to call that instead. That will reduce the blast radius of the subsequent dw_offset_t size change.

ayermolo updated this revision to Diff 482550.Dec 13 2022, 10:37 AM

Separated format patch, and oso patch. Although without OSO changes a lot of tests fail on mac.

Thanks for splitting this up. We still need to figure out what to do with the first patch, but these two are looking very good now.

At least, in the sense that one can clearly see what is happening -- I'd still like to have to discussion about the DIERef->user_id conversion. Essentially, I'd like to see if we can preserve the property that the conversion always produces a valid user_id. With this patch that is not true anymore, because the OSO DIERefs need to be passed through the SymbolFileDWARF::GetUID function, even though it is _very_ tempting to just call get_id() on them. Previously, there was no get_id function, so one had no choice but to call GetUID. If we can make sure that the OSO symbol files always construct DIERefs with the oso field populated correctly, then I think this approach would be fine (and we should be able to delete the GetUID function). Otherwise, I'd like to explore some options of keeping the DIERef->user_id conversion tightly controlled. Personally, I am still not convinced that doing the conversion in the GetUID function is a bad thing. IIUC, the main problem is the part where we do a bitwise or of the DIERef (the die offset part) and the OSO ID (GetID() | ref.die_offset()), but I don't see why we couldn't do something about that. Basically we could just create an inverse function of GetOSOIndexFromUserID (which extracts the OSO symbol file from the user id) -- and make sure the functions are close to each other and their implementation matches.

Thanks for splitting this up. We still need to figure out what to do with the first patch, but these two are looking very good now.

At least, in the sense that one can clearly see what is happening -- I'd still like to have to discussion about the DIERef->user_id conversion. Essentially, I'd like to see if we can preserve the property that the conversion always produces a valid user_id. With this patch that is not true anymore, because the OSO DIERefs need to be passed through the SymbolFileDWARF::GetUID function, even though it is _very_ tempting to just call get_id() on them. Previously, there was no get_id function, so one had no choice but to call GetUID. If we can make sure that the OSO symbol files always construct DIERefs with the oso field populated correctly, then I think this approach would be fine (and we should be able to delete the GetUID function). Otherwise, I'd like to explore some options of keeping the DIERef->user_id conversion tightly controlled. Personally, I am still not convinced that doing the conversion in the GetUID function is a bad thing. IIUC, the main problem is the part where we do a bitwise or of the DIERef (the die offset part) and the OSO ID (GetID() | ref.die_offset()), but I don't see why we couldn't do something about that. Basically we could just create an inverse function of GetOSOIndexFromUserID (which extracts the OSO symbol file from the user id) -- and make sure the functions are close to each other and their implementation matches.

I don't think creation of ID was that tightly controlled.
For example
oso_symfile->SetID(((uint64_t)m_cu_idx + 1ull) << 32ull);
in SymbolFileDWARFDebugMap.cpp
But on more general note all the assumptions about bit layout scattered through few places:
GetOSOIndexFromUserID
GetUID
SymbolFileDWARFDwo constructor
GetDwoNum
Encode/Decode UID
and even something like
const dw_offset_t function_die_offset = func.GetID();
return DecodedUID{

*dwarf, {std::nullopt, DIERef::Section::DebugInfo, dw_offset_t(uid)}};

So probably moving to something uniform to construct uid for dwo and oso cases, and extract relevant fields is a step in a right direction.

I do see your point that it's currently not quite there.

This

if (GetDebugMapSymfile()) {
    DIERef die_ref(GetID());
    die_ref.set_die_offset(ref.die_offset());
    return die_ref.get_id();
  }

Is kind of not ideal.
In other places it does work.

static uint32_t GetOSOIndexFromUserID(lldb::user_id_t uid) {
    llvm::Optional<uint32_t> OsoNum = DIERef(uid).oso_num();
    lldbassert(OsoNum && "Invalid OSO Index");
    return *OsoNum;
  }
oso_symfile->SetID(DIERef(DIERef::IndexType::OSONum, m_cu_idx,
                                      DIERef::Section(0), 0)
                                   .get_id());
llvm::Optional<uint32_t> GetDwoNum() override {
    return DIERef(GetID()).dwo_num();
  }

I guess main issue with GetUID is that we rely on an internal state of SymbolFileDWARF to

  1. figure out if we are dealing with dwo or oso with check for GetDebugMapSymfile
  2. get extra data GetDwoNum(), and GetID()

We can either push that logic on the caller side of things (not I deal I would think) and remove GetUID, or extend the constructor to be a bit more explicit. This way all the bit settings are still consolidated, but the logic of when to create what is still hidden in GetDebugMapSymfile.

What do you think?

I think that the main reason we've arrived at such different conclusions is that I treat the "user IDs of DIEs" and and "user IDs of symbol files" as essentially two different things (namespaces if you will). Obviously, one needs the symbol file ID in order to compute the DIERef ID, but theoretically those two can use completely different encodings. With this take on things, I stand by my assertion that DIERef->user_id conversions are tightly controlled. The symbol file ID computations are a mess.

You, if I understand correctly, see the ID of a symbol file as a special case of an all-encompassing user id -- essentially a user_id (or a DIERef) pointing to the first byte of the symbol file. with this world view, the entirety of user ID computation is a mess. :)
I can definitely see the appeal of viewing the world that way. It's nice and uniform and unambiguous (since you can't have a DIE at offset zero) -- it's just not the view I had when I was writing this code a couple of years ago. :) And it has the disadvantage of obscuring the DIERef->user_id transition (for DIEs at least), and that's what I'm weight against the other advantages of that approach.

I guess main issue with GetUID is that we rely on an internal state of SymbolFileDWARF to

  1. figure out if we are dealing with dwo or oso with check for GetDebugMapSymfile
  2. get extra data GetDwoNum(), and GetID()

We can either push that logic on the caller side of things (not I deal I would think) and remove GetUID, or extend the constructor to be a bit more explicit. This way all the bit settings are still consolidated, but the logic of when to create what is still hidden in GetDebugMapSymfile.

What do you think?

I'm not entirely sure what you mean by that, but I think either of those could be fine. Essentally, what I'm trying to achieve is to make sure is that if the DIERef<->user_id conversion is trivial, then it is always valid to perform it (i.e. there are no partially constructed DIERefs). Ideally, there wouldn't be partially constructed DIERefs in any case, but that is not as important if one is forced to provide that additional information in order to do the conversion.

However, I also want to throw out this alternative. This one goes in the completely opposite direction. Instead of centralizing the conversions, it federates it (which is I think is roughly what I had in mind when I worked on this last time). There is no single place which controls the conversion, but there are multiple disjoint places which do that:

  • one for the OSO case. This includes the following problematic lines you've listed:

GetOSOIndexFromUserID
GetUID (1/2)
Encode/Decode UID (1/2)
return DecodedUID{

*dwarf, {std::nullopt, DIERef::Section::DebugInfo, dw_offset_t(uid)}};
  • one for the DWO case:

GetUID (1/2)
Encode/Decode UID (1/2)

  • one for Symbol File IDs (which is does a +1 on the internal index -- bacause the main object file has ID 0)

oso_symfile->SetID(((uint64_t)m_cu_idx + 1ull) << 32ull);
SymbolFileDWARFDwo constructor
GetDwoNum (cancels out the previous one)

And I don't think it's an obstacle for making the die offsets larger -- I've included comments on how I think that could happen.

It doesn't handle this one, which seems just wrong, and should be made to use GetUID/DecodeUID

const dw_offset_t function_die_offset = func.GetID();

You, if I understand correctly, see the ID of a symbol file as a special case of an all-encompassing user id -- essentially a user_id (or a DIERef) pointing to the first byte of the symbol file. with this world view, the entirety of user ID computation is a mess. :)

Ah I see. Thanks for providing context. Yeah that is kind of the way I see it after talking with Greg. Looking over your proposal it makes sense with the way you described how you see this. You and @clayborg have a historical context for this code. He is on PTO right now, lets see what he thinks when he is back in a couple weeks? :)

In meantime what do you think about closing on D139955, so it can land? I think it is fully independent of design decision in this diff, and can land separately. Would be great to juggle one less diff on a stack.

I think that the main reason we've arrived at such different conclusions is that I treat the "user IDs of DIEs" and and "user IDs of symbol files" as essentially two different things (namespaces if you will). Obviously, one needs the symbol file ID in order to compute the DIERef ID, but theoretically those two can use completely different encodings. With this take on things, I stand by my assertion that DIERef->user_id conversions are tightly controlled. The symbol file ID computations are a mess.

You, if I understand correctly, see the ID of a symbol file as a special case of an all-encompassing user id -- essentially a user_id (or a DIERef) pointing to the first byte of the symbol file. with this world view, the entirety of user ID computation is a mess. :)
I can definitely see the appeal of viewing the world that way. It's nice and uniform and unambiguous (since you can't have a DIE at offset zero) -- it's just not the view I had when I was writing this code a couple of years ago. :) And it has the disadvantage of obscuring the DIERef->user_id transition (for DIEs at least), and that's what I'm weight against the other advantages of that approach.

FWIW: User IDs of symbol files is not part of any API. It was added to SymbolFileDWARF to allow us to identify .o files for mac without dSYM and was used by the fission code that Pavel wrote as well. Since this is internal, it doesn't matter at all how we make or use the IDs. No public interface will ever expect a SymboFile to have a lldb::user_id_t. Therefore it is just down to how to use these IDs within the DWARF symbol plug-ins for both mac with .o files and for fission with .dwo files.

Things that are created by the DWARF, like lldb_private::CompileUnit, lldb_private::Function, lldb_private::Block, and lldb_private::Type do have lldb::user_id_t and they are expected to make IDs that make sense for the individual SymbolFile plug-in to be able to easily match up with something in the DWARF. All of these objects have DIEs in the DWARF, so we must be able to make a lldb::user_id_t that allows us to easily answer more questions about something at a later date in the DWARF. Like we can make a lldb_private::CompileUnit without making any functions, blocks or types. If we are later asked to find all functions for a compile unit, we should be able to take the lldb::user_id_t of the compile unit and easily do this.

So how DIERef is used is solely up to the DWARF symbol file plug-in.

So we could just assign the user IDs of each SymbolFileDWARF to be the index of the .o file for mac, or the index of the .dwo file for fission. It doesn't really matter. As long as we can easily take a user_id_t from a virtual interface and track it back to the DIE we care about.

I guess main issue with GetUID is that we rely on an internal state of SymbolFileDWARF to

  1. figure out if we are dealing with dwo or oso with check for GetDebugMapSymfile
  2. get extra data GetDwoNum(), and GetID()

We can either push that logic on the caller side of things (not I deal I would think) and remove GetUID, or extend the constructor to be a bit more explicit. This way all the bit settings are still consolidated, but the logic of when to create what is still hidden in GetDebugMapSymfile.

What do you think?

I'm not entirely sure what you mean by that, but I think either of those could be fine. Essentally, what I'm trying to achieve is to make sure is that if the DIERef<->user_id conversion is trivial, then it is always valid to perform it (i.e. there are no partially constructed DIERefs). Ideally, there wouldn't be partially constructed DIERefs in any case, but that is not as important if one is forced to provide that additional information in order to do the conversion.

However, I also want to throw out this alternative. This one goes in the completely opposite direction. Instead of centralizing the conversions, it federates it (which is I think is roughly what I had in mind when I worked on this last time). There is no single place which controls the conversion, but there are multiple disjoint places which do that:

  • one for the OSO case. This includes the following problematic lines you've listed:

GetOSOIndexFromUserID
GetUID (1/2)
Encode/Decode UID (1/2)
return DecodedUID{

*dwarf, {std::nullopt, DIERef::Section::DebugInfo, dw_offset_t(uid)}};
  • one for the DWO case:

GetUID (1/2)
Encode/Decode UID (1/2)

  • one for Symbol File IDs (which is does a +1 on the internal index -- bacause the main object file has ID 0)

oso_symfile->SetID(((uint64_t)m_cu_idx + 1ull) << 32ull);
SymbolFileDWARFDwo constructor
GetDwoNum (cancels out the previous one)

And I don't think it's an obstacle for making the die offsets larger -- I've included comments on how I think that could happen.

It doesn't handle this one, which seems just wrong, and should be made to use GetUID/DecodeUID

const dw_offset_t function_die_offset = func.GetID();

yes this is wrong and should be changed. It only used to work because we knew that the bottom 32 bits of a a 64 bit user_id_t was the DIE offset, which isn't true anymore.

Since the user IDs of SymbolFileDWARF plug-ins mean nothing to anyone else, we can make them what we need them to be so they work for us. I would suggest to remove the use of DIERef from calculating the IDs of symbol files and have .o files for mac and .dwo files for fission use a 1 based index as their ID to make it easy to encode into a DIERef when needed for lldb::user_id_t values that _are_ included in objects that we hand out. Is there anything else that would need to be done to keep everyone happy here?

ayermolo updated this revision to Diff 491008.Jan 20 2023, 5:14 PM

updated based on Gregs suggestion

...
Since the user IDs of SymbolFileDWARF plug-ins mean nothing to anyone else, we can make them what we need them to be so they work for us. I would suggest to remove the use of DIERef from calculating the IDs of symbol files and have .o files for mac and .dwo files for fission use a 1 based index as their ID to make it easy to encode into a DIERef when needed for lldb::user_id_t values that _are_ included in objects that we hand out. Is there anything else that would need to be done to keep everyone happy here?

I think that the 1-based index thingy helps a lot here, but I haven't seen anything (in your reponse, or in the new patch) that would address my concernt DIERef<->user_id conversion ambiguity. I.e. how is one supposed to know what is the "right" way to convert a DIERef to a user_id:

  • die_ref.get_id()
  • or symbol_file.GetUID(die_ref) (which, funnily enough, will construct another DIERef, and *then* call get_id? (return DIERef(GetID(), ref.section(), ref.die_offset()).get_id();)

What's your position on that? That we should live with the ambiguity?

lldb/source/Plugins/SymbolFile/DWARF/DIERef.h
72

Can we remove this function now?

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.cpp
11

nor this

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARFDwo.h
12

I guess this isn't necessary anymore.

...
Since the user IDs of SymbolFileDWARF plug-ins mean nothing to anyone else, we can make them what we need them to be so they work for us. I would suggest to remove the use of DIERef from calculating the IDs of symbol files and have .o files for mac and .dwo files for fission use a 1 based index as their ID to make it easy to encode into a DIERef when needed for lldb::user_id_t values that _are_ included in objects that we hand out. Is there anything else that would need to be done to keep everyone happy here?

I think that the 1-based index thingy helps a lot here, but I haven't seen anything (in your reponse, or in the new patch) that would address my concernt DIERef<->user_id conversion ambiguity. I.e. how is one supposed to know what is the "right" way to convert a DIERef to a user_id:

  • die_ref.get_id()
  • or symbol_file.GetUID(die_ref) (which, funnily enough, will construct another DIERef, and *then* call get_id? (return DIERef(GetID(), ref.section(), ref.die_offset()).get_id();)

What's your position on that? That we should live with the ambiguity?

Searching for GetUID doesn't look like it's used all that often, maybe follow up patch is just to get rid of it, and replace with DIERef?

I think that the 1-based index thingy helps a lot here, but I haven't seen anything (in your reponse, or in the new patch) that would address my concernt DIERef<->user_id conversion ambiguity. I.e. how is one supposed to know what is the "right" way to convert a DIERef to a user_id:

  • die_ref.get_id()
  • or symbol_file.GetUID(die_ref) (which, funnily enough, will construct another DIERef, and *then* call get_id? (return DIERef(GetID(), ref.section(), ref.die_offset()).get_id();)

What's your position on that? That we should live with the ambiguity?

Searching for GetUID doesn't look like it's used all that often, maybe follow up patch is just to get rid of it, and replace with DIERef?

If you could make that work, that would be awesome, but I think that's going to be fairly hard. It's true that there aren't that many call sites of this functions, but the ones that are there are very crucial. The user_id_t type represents a symbol-file-neutral identifier (cookie, if you will) that different symbol file implementations use to identify parsed objects (types, mostly). SymbolFileDWARF uses it (via DIERef et al.) to identify the DIE belonging to that type. PDB symbol files use it differently, but the idea is the same. If we wanted to remove that, we'd have to come up with a whole new way to parse/link types -- and one that would work for non-dwarf symbol files as well.

(also, this would not really address my concern, because the question would then become "which DIERef is safe to be used as a Type cookie" (answer: only the one which has the OSO field set), but the reduction in complexity resulting from removing one step from the conversion process (DWARFDIE->DIERef->user_id) might be well worth it.)

I think that the 1-based index thingy helps a lot here, but I haven't seen anything (in your reponse, or in the new patch) that would address my concernt DIERef<->user_id conversion ambiguity. I.e. how is one supposed to know what is the "right" way to convert a DIERef to a user_id:

  • die_ref.get_id()
  • or symbol_file.GetUID(die_ref) (which, funnily enough, will construct another DIERef, and *then* call get_id? (return DIERef(GetID(), ref.section(), ref.die_offset()).get_id();)

What's your position on that? That we should live with the ambiguity?

Searching for GetUID doesn't look like it's used all that often, maybe follow up patch is just to get rid of it, and replace with DIERef?

If you could make that work, that would be awesome, but I think that's going to be fairly hard. It's true that there aren't that many call sites of this functions, but the ones that are there are very crucial. The user_id_t type represents a symbol-file-neutral identifier (cookie, if you will) that different symbol file implementations use to identify parsed objects (types, mostly). SymbolFileDWARF uses it (via DIERef et al.) to identify the DIE belonging to that type. PDB symbol files use it differently, but the idea is the same. If we wanted to remove that, we'd have to come up with a whole new way to parse/link types -- and one that would work for non-dwarf symbol files as well.

SymbolFileDWARF is the only SymbolFile that inherits from UserID. So this is a DWARF internal thing only where symbol files have user_id_t values. So as long as we make this work for all things DWARF we are good to go.

(also, this would not really address my concern, because the question would then become "which DIERef is safe to be used as a Type cookie" (answer: only the one which has the OSO field set), but the reduction in complexity resulting from removing one step from the conversion process (DWARFDIE->DIERef->user_id) might be well worth it.)

DIERef should be used anywhere a DIE needs a user_id_t. If DIERef now uses the file index (OSO index, or DWO index) in the same way where we always ask the SymbolFileDWARFXXX::GetID() as the file index, then things will be the same between the two (DWO or OSO).

We just need to create all DIERef objects using the GetID() from the symbol file as the file index, and we should be able to remove the SymbolFile::GetUID() function now. As long as file index zero is reserved for "vanilla DWARF that doesn't use DWO or OSO we will know the difference. We might want to not have SymbolFileDWARF inherit from UserID at all, and switch over to have SymbolFileDWARF add a virtual function:

uint32_t m_file_index = 0; // Zero means main DWARF file, 1...N identifies the Nth DWO file or OSO file
virtual uint32_t GetFileIndex() { return m_file_index; }

Then anyone can set the file index correctly for DWO or OSO files. And we avoid using user_id_t values for the symbol files since they aren't needed.

lldb/source/Plugins/SymbolFile/DWARF/DIERef.h
72

We should be able to, and we should move this lldbassert to the constructor to ensure that die_offset is not too large.

ayermolo marked 5 inline comments as done.Jan 26 2023, 11:34 AM

We just need to create all DIERef objects using the GetID() from the symbol file as the file index, and we should be able to remove the SymbolFile::GetUID() function now. As long as file index zero is reserved for "vanilla DWARF that doesn't use DWO or OSO we will know the difference. We might want to not have SymbolFileDWARF inherit from UserID at all, and switch over to have SymbolFileDWARF add a virtual function:

uint32_t m_file_index = 0; // Zero means main DWARF file, 1...N identifies the Nth DWO file or OSO file
virtual uint32_t GetFileIndex() { return m_file_index; }

Then anyone can set the file index correctly for DWO or OSO files. And we avoid using user_id_t values for the symbol files since they aren't needed.

This isn't about the "user id" of a symbol file. I'm totally happy with the changes there -- though I also wouldn't be opposed to changing the "user id" field to something more explicit (like the file index).

My problem is with the "user id"s of individual DIEs. Currently, if I have a DWARFDIE, the only way to get its user id is to do something like die.GetDWARF()->GetUID(die). With this patch, there are two ways:

  1. the same as before
  2. die.GetDIERef()->get_id()

The problem is that the second way is not going to be correct for OSO files because that path will not set the "oso" component of the DIERef. The worst part is that the second method is much shorter than the first one, so I think it will be very tempting to use it -- and it will actually be right most of the time, until that code is used in an OSO context.

ayermolo updated this revision to Diff 492841.Jan 27 2023, 10:27 AM

addressed comments

Created commit that removes getUID(...) https://reviews.llvm.org/D142775
Seems like it's isolated to SymbolFile and DWARF code.
So now userid goes through DIERef.

I'm sorry, but that patch does not fix the problem I am trying to point out. In fact, I think it makes things a lot worse.

We clearly have some kind of a communication problem, but I am running out of ideas of what can I do about it. Let me try rephrasing it one more time:

  • this patch creates two path for converting a DIERef to a user_id_t -- a) ref.get_id(); and b) dwarf.GetUID(ref)
  • of those two ways, one is clearly more intuitive
  • of those two ways, one is always correct
  • those two ways aren't the same -- (a) is simpler; (b) is correct
  • you can't fix that by simply taking (b) away. All that does is make the API misuse even more likely. That patch essentially just deletes GetUID, and inlines it into all its callers.

Forget about the what the code does for a moment, and tell me which of these snippets looks better:
i)

if (IsValid())
  return GetDWARF()->GetUID(*this);

ii)

const std::optional<DIERef> &ref = this->GetDIERef();
if (ref)
  return DIERef(GetID(), ref->section(), ref->die_offset()).get_id();

iii)

if (IsValid())
  return GetDIERef()->get_id();

Now look up the implementation and tell me which one is correct.

I'm sorry, but that patch does not fix the problem I am trying to point out. In fact, I think it makes things a lot worse.

We clearly have some kind of a communication problem, but I am running out of ideas of what can I do about it. Let me try rephrasing it one more time:

  • this patch creates two path for converting a DIERef to a user_id_t -- a) ref.get_id(); and b) dwarf.GetUID(ref)
  • of those two ways, one is clearly more intuitive
  • of those two ways, one is always correct
  • those two ways aren't the same -- (a) is simpler; (b) is correct
  • you can't fix that by simply taking (b) away. All that does is make the API misuse even more likely. That patch essentially just deletes GetUID, and inlines it into all its callers.

Forget about the what the code does for a moment, and tell me which of these snippets looks better:
i)

if (IsValid())
  return GetDWARF()->GetUID(*this);

ii)

const std::optional<DIERef> &ref = this->GetDIERef();
if (ref)
  return DIERef(GetID(), ref->section(), ref->die_offset()).get_id();

iii)

if (IsValid())
  return GetDIERef()->get_id();

Now look up the implementation and tell me which one is correct.

Thank you for providing an example. Yes sometimes it's hard to communicate over comments.
In this context yes first one is better.
Question is what should it look "under the hood".
For example:
DIERef::Decode
SymbolFileDWARF::GetUID
SymbolFileDWARF::DecodeUID
There are all these bit shifts scattered around.

If this is such a blocker I did expand on your diff, and added 64 bit support (it still has to be cleaned up a bit like various static constexpr probably moved out of DIERef to #define in dwarfh, but for illustrative purposes) https://reviews.llvm.org/D142779

Looks good to me. Pavel?

lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp
67

Why is this needed? No casting should be needed for using the llvm formatting stuff?

198

Needed? Same as above

I'm sorry, but that patch does not fix the problem I am trying to point out. In fact, I think it makes things a lot worse.

We clearly have some kind of a communication problem, but I am running out of ideas of what can I do about it. Let me try rephrasing it one more time:

  • this patch creates two path for converting a DIERef to a user_id_t -- a) ref.get_id(); and b) dwarf.GetUID(ref)
  • of those two ways, one is clearly more intuitive
  • of those two ways, one is always correct
  • those two ways aren't the same -- (a) is simpler; (b) is correct
  • you can't fix that by simply taking (b) away. All that does is make the API misuse even more likely. That patch essentially just deletes GetUID, and inlines it into all its callers.

Forget about the what the code does for a moment, and tell me which of these snippets looks better:
i)

if (IsValid())
  return GetDWARF()->GetUID(*this);

ii)

const std::optional<DIERef> &ref = this->GetDIERef();
if (ref)
  return DIERef(GetID(), ref->section(), ref->die_offset()).get_id();

iii)

if (IsValid())
  return GetDIERef()->get_id();

Now look up the implementation and tell me which one is correct.

Sorry a lot of noise on all of these things. Forget me last comment, I hit submit too quickly.

How about we make DIERef constructor always take all the info that is needed to construct the objects correctly:

DIERef(DWARFDie die);
DIERef(SymbolFileDWARF *dwarf, dw_offset_t die_offset); // might not need this one?
DIERef(user_id_t uid);

We might not need all of these. But in this case, we can't incorrectly use the APIs since all of the objects that are needed to fill it in are in the constructor args. We take away the ability to manually fill in the DWO num and other fields. Would that fix the issues you have with this patch Pavel?

So looks like this function needs to be fixed:

llvm::Optional<DIERef> DWARFBaseDIE::GetDIERef() const {
  if (!IsValid())
    return llvm::None;

  return DIERef(m_cu->GetSymbolFileDWARF().GetDwoNum(), m_cu->GetDebugSection(),
                m_die->GetOffset());
}

This currently only works for DWO files, but not for OSO files. If we make the DIERef constructor that this function uses "protected", then only this function can use it. And it should be able to do things correctly.

If we switched the DIERef constructor to take a SymbolFileDWARF as a mandatory first arg, then we can manually supply the section + offset, then DWARFBaseDie can always do this correctly for both DWO files and OSO files.

We might need to add an accessor to SymbolFileDWARF like:

virtual uint32_t SymbolFileDWARF::GetFileIndex();

And there can be a setter for this as well. Then the OSO stuff would set the file index to a 1 based index, and the DWO files would also set this as the file index in the SymoblFileDWARF base class and then this all works.

So my suggestion would be:

  • make only one DIERef constructor: "DIERef(SymbolFileDWARF *dwarf, Section section, dw_offset_t die_offset)"
  • add SymbolFileDWARF::GetFileIndex() and SymbolFileDWARF::SetFileIndex(...) and a backing ivar
  • OSO and DWO files will set the file index manually early on so it is always available and can always create valid DIERef objects in DWARFBaseDIE::GetDIERef
  • Change DWARFBaseDIE::GetDIERef to use the GetFileIndex() where it expects zero for a non DWO or OSO file and a 1 based index for DWO or OSO stuff
  • Don't allow anyone else to create manually DIERef objects unless you ask for it from a DWARFBaseDie except for the encode/decode functions for saving to/from cache

How about we make DIERef constructor always take all the info that is needed to construct the objects correctly:

DIERef(DWARFDie die);
DIERef(SymbolFileDWARF *dwarf, dw_offset_t die_offset); // might not need this one?
DIERef(user_id_t uid);

We might not need all of these. But in this case, we can't incorrectly use the APIs since all of the objects that are needed to fill it in are in the constructor args. We take away the ability to manually fill in the DWO num and other fields. Would that fix the issues you have with this patch Pavel?

Yes, I believe it would, but I do want to add two things:

  • I don't consider it important whether most of the construction work happens inside the DIERef constructor, or outside of it. So, I would consider these two implementations equally fine
DIERef(DWARFDie die); // compute this somehow
DWARFDie::GetDIERef() { return DIERef(*this); }

vs.

DIERef(dwo_id, type_unit_flag, die_offset, ...); // a dumb constructor
DWARFDie::GetDIERef() { return DIERef(...); } // computation happens here

The first one is what you described -- the second one is how it roughly how it works right now.

  • My main source of frustration was that my concern is getting overlooked/ignored (not necessarily your fault -- I've been told I am not always sufficiently clear). I think that is something we could live with, if we thing the other cleanups in this patch are worth it (which could very well be the case) -- however, I would want us to be clear that's what we're doing.
  • My main source of frustration was that my concern is getting overlooked/ignored (not necessarily your fault -- I've been told I am not always sufficiently clear). I think that is something we could live with, if we thing the other cleanups in this patch are worth it (which could very well be the case) -- however, I would want us to be clear that's what we're doing.

I do want to state that if we fix things the way I describe it will work seamlessly with OSO or DWO files. The current state of things is the DWO stuff only uses the fancy DIERef constructor and fills in the dwo number correctly only to have it overwritten in SymbolFile::GetUID(...). The SymbolFile::GetUID(...) is needed for OSOs currently because the DIERef that SymbolFileDWARF (which is used for OSO) doesn't correctly create DIERef objects since they always return llvm::None for SymbolFileDWARF::GetDwoNum(). But the new API will have SymbolFileDWARF::GetFileIndex() to be used instead of SymbolFileDWARF::GetDwoNum(), and the file index will be set correctly for both DWO and OSO files. We can then change DIERef away from DWO specific naming, and have DIERef have a "m_file_index" and "m_file_index_valid" instead of the dwo specific members. As long as both OSO and DWO files can be found from the user_id_t API calls we are all good. Not sure if this addresses all of your issues or not.

If all of your concerns are not clarified above, can you clarify what is still being overlooked? Both Alexander and I are usually thinking we are addressing everything you want, but we obviously still aren't, so restating your remaining concerns would help us get this patch moving.

labath added a comment.Feb 3 2023, 4:04 AM
  • My main source of frustration was that my concern is getting overlooked/ignored (not necessarily your fault -- I've been told I am not always sufficiently clear). I think that is something we could live with, if we thing the other cleanups in this patch are worth it (which could very well be the case) -- however, I would want us to be clear that's what we're doing.

I do want to state that if we fix things the way I describe it will work seamlessly with OSO or DWO files. The current state of things is the DWO stuff only uses the fancy DIERef constructor and fills in the dwo number correctly only to have it overwritten in SymbolFile::GetUID(...). The SymbolFile::GetUID(...) is needed for OSOs currently because the DIERef that SymbolFileDWARF (which is used for OSO) doesn't correctly create DIERef objects since they always return llvm::None for SymbolFileDWARF::GetDwoNum(). But the new API will have SymbolFileDWARF::GetFileIndex() to be used instead of SymbolFileDWARF::GetDwoNum(), and the file index will be set correctly for both DWO and OSO files. We can then change DIERef away from DWO specific naming, and have DIERef have a "m_file_index" and "m_file_index_valid" instead of the dwo specific members. As long as both OSO and DWO files can be found from the user_id_t API calls we are all good. Not sure if this addresses all of your issues or not.

If all of your concerns are not clarified above, can you clarify what is still being overlooked? Both Alexander and I are usually thinking we are addressing everything you want, but we obviously still aren't, so restating your remaining concerns would help us get this patch moving.

Everything is clarified is and fine. I agree with your plan Thanks. I was trying to return the favor and clarify my own (potentially rude) responses.

  • My main source of frustration was that my concern is getting overlooked/ignored (not necessarily your fault -- I've been told I am not always sufficiently clear). I think that is something we could live with, if we thing the other cleanups in this patch are worth it (which could very well be the case) -- however, I would want us to be clear that's what we're doing.

I do want to state that if we fix things the way I describe it will work seamlessly with OSO or DWO files. The current state of things is the DWO stuff only uses the fancy DIERef constructor and fills in the dwo number correctly only to have it overwritten in SymbolFile::GetUID(...). The SymbolFile::GetUID(...) is needed for OSOs currently because the DIERef that SymbolFileDWARF (which is used for OSO) doesn't correctly create DIERef objects since they always return llvm::None for SymbolFileDWARF::GetDwoNum(). But the new API will have SymbolFileDWARF::GetFileIndex() to be used instead of SymbolFileDWARF::GetDwoNum(), and the file index will be set correctly for both DWO and OSO files. We can then change DIERef away from DWO specific naming, and have DIERef have a "m_file_index" and "m_file_index_valid" instead of the dwo specific members. As long as both OSO and DWO files can be found from the user_id_t API calls we are all good. Not sure if this addresses all of your issues or not.

If all of your concerns are not clarified above, can you clarify what is still being overlooked? Both Alexander and I are usually thinking we are addressing everything you want, but we obviously still aren't, so restating your remaining concerns would help us get this patch moving.

Everything is clarified is and fine. I agree with your plan Thanks. I was trying to return the favor and clarify my own (potentially rude) responses.

Great, thank you for your and Greg feedback. Let me modify my changes according to Gregs suggestion.

ayermolo updated this revision to Diff 495582.Feb 7 2023, 9:53 AM

Implemented Gregs suggestion. Also consolidated three diffs back into one.
To make it clear the scope of changes, and for ease of testing.

Very clean patch now, just a few nits about asserts!

lldb/source/Plugins/SymbolFile/DWARF/DebugNamesDWARFIndex.cpp
129

Does this assert really need to exist? Why would we not require a .dwo file (old code) be able to index? Can we remove this assert? It seems wrong?

lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp
403–404

Does this assert really need to exist? Why would we not require a .dwo file (old code) be able to index? Can we remove this assert? It seems wrong?

lldb/source/Plugins/SymbolFile/DWARF/NameToDIE.cpp
52–53

Does this assert really need to exist? Why would we not require a .dwo file (old code) be able to index? Can we remove this assert? It seems wrong?

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
1404

Pavel: note we now really on "SymbolFileDWARF::GetDie(user_id_t)" to be the one source of truth when finding a DIE. We could make "SymbolFileDWARF:GetDie(DIERef ref)" be the one source of truth and then have "SymbolFileDWARF::GetDie(user_id_t)" just create a local DIERef and then call "SymbolFileDWARF:GetDie(DIERef ref)" if that would be cleaner.

1682

Pavel: note we now really on "SymbolFileDWARF::GetDie(user_id_t)" to be the one source of truth when finding a DIE. We could make "SymbolFileDWARF:GetDie(DIERef ref)" be the one source of truth and then have "SymbolFileDWARF::GetDie(user_id_t)" just create a local DIERef and then call "SymbolFileDWARF:GetDie(DIERef ref)" if that would be cleaner.

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.h
570–572
labath added inline comments.Feb 8 2023, 10:44 AM
lldb/source/Plugins/SymbolFile/DWARF/DIERef.h
21

I don't understand this sentence.

31

they're not actually protected

lldb/source/Plugins/SymbolFile/DWARF/DWARFBaseDIE.cpp
75

Is this the only call site of the DIERef(SymbolFileDWARF&) constructor?

If so, and if we make it such that DWARFBaseDIE::GetDIERef returns the fully filled in DIERef, then this function can just call get_id() on the result, and we can delete that constructor.

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
1682

+1

clayborg added inline comments.Feb 8 2023, 11:35 AM
lldb/source/Plugins/SymbolFile/DWARF/DIERef.h
19–24
31

There were for a bit to control access to this, but in reality we ended up friending too many classes and then ran into issues with the DIERefTest stuff, so we decided to make them public again. We can remove this comment.

lldb/source/Plugins/SymbolFile/DWARF/DWARFBaseDIE.cpp
75

This line doesn't make sense. If we got a valid DIERef back from GetDIERef(), then we just return that as it would have used the SymbolFileDWARF to fill everything in already. So we might not need that extra constructor if this is the only place as Pavel suggested.

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
1682

Ok. So lets do this - change "DWARFDIE SymbolFileDWARF::GetDIE(lldb::user_id_t uid)" to just be:

DWARFDIE SymbolFileDWARF::GetDIE(lldb::user_id_t uid) {
  return GetDIE(DIERef(uid));
}

And then change the current "DWARFDIE SymbolFileDWARF::GetDIE(lldb::user_id_t uid)" to be the one that does all of the work:

DWARFDIE SymbolFileDWARF::GetDIE(DIERef die_ref) {
  std::optional<uint32_t> file_index = die_ref.file_index();
  if (file_index) {
    if (SymbolFileDWARFDebugMap *debug_map = GetDebugMapSymfile())
      symbol_file = debug_map->GetSymbolFileByOSOIndex(*file_index); // OSO case
    else if (*file_index == DIERef::k_file_index_mask)
      symbol_file = m_dwp_symfile.get(); // DWP case
    else
      symbol_file = this->DebugInfo()
                        .GetUnitAtIndex(*die_ref.file_index())
                        ->GetDwoSymbolFile(); // DWO case
  } else if (die_ref.die_offset() == DW_INVALID_OFFSET) {
    symbol_file = nullptr;
  } else {
    symbol_file = this;
  }

  if (symbol_file)
    return symbol_file->GetDIE(die_ref);

  return DWARFDIE();
}
ayermolo marked 16 inline comments as done.Feb 8 2023, 2:25 PM
ayermolo added inline comments.
lldb/source/Plugins/SymbolFile/DWARF/DIERef.h
31

Oops, forgot to remove. For one of the internal revisions I experimented with making them protected.

lldb/source/Plugins/SymbolFile/DWARF/DWARFDebugInfoEntry.cpp
198

m_offset is is bit field now, so without it clang produces error.

lldb/source/Plugins/SymbolFile/DWARF/SymbolFileDWARF.cpp
1682

ah, yes, great suggestion.

ayermolo updated this revision to Diff 495952.Feb 8 2023, 2:26 PM
ayermolo marked 2 inline comments as done.

addressed comments

Looks good to me. Pavel?

labath accepted this revision.Feb 13 2023, 5:40 AM
labath added inline comments.
lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp
403–404

That was because it a split dwarf setup, there are two compile units, two symbol files and two CU DIEs. In a DIERef, the unit offset refers to the offset of the main unit within the main symbol file (because that's globally unique), but the die offset refers to the offset in the separate file (because that's where the dies are). The indexing process needs to start with the main unit (not the one from the split file) in order for the DIERefs to come out right, and these assertions were enforcing that. Therefore, I think we should put all of these back in.

Or at least, that was the case at some point in the past... I don't know whether this has changed since then, but I wouldn't expect it to.

This revision is now accepted and ready to land.Feb 13 2023, 5:40 AM

Thank you for your patience. I'm really happy with the overall result here.

ayermolo updated this revision to Diff 497064.Feb 13 2023, 12:01 PM
ayermolo marked an inline comment as done.

Added asserts back in

ayermolo added inline comments.
lldb/source/Plugins/SymbolFile/DWARF/ManualDWARFIndex.cpp
403–404

Thank you for elaborating, Will put them back in.

This revision was automatically updated to reflect the committed changes.
ayermolo reopened this revision.Feb 13 2023, 2:10 PM

Had to revert, broke some build bots. :(

This revision is now accepted and ready to land.Feb 13 2023, 2:10 PM
ayermolo updated this revision to Diff 497467.Feb 14 2023, 3:48 PM

fixed cross-project tests, and also normal test that I somehow missed.
Need to get access to windows machine to figure out why that fails.

ayermolo updated this revision to Diff 497468.Feb 14 2023, 3:51 PM

small clenaup

OK, managed to setup windows VM, and couldn't repro.
After reverting my fix for cross-projects tests locally, I saw test failures.
Will try to push again and see if it triggers bot build failures.

This revision was automatically updated to reflect the committed changes.
mib added a subscriber: mib.Feb 16 2023, 5:08 PM

Hi @ayermolo!

This patch is causing some failure on the macOS lldb bot: https://green.lab.llvm.org/green/job/lldb-cmake/51257/

Could you take a look ? If you don't have the time, we can revert your patch until you manage to reproduce these failures.

Let me know if you need help with that :)

Hi @ayermolo!

This patch is causing some failure on the macOS lldb bot: https://green.lab.llvm.org/green/job/lldb-cmake/51257/

Could you take a look ? If you don't have the time, we can revert your patch until you manage to reproduce these failures.

Let me know if you need help with that :)

Sorry about that. I ran on mac and tests passed. Let me see if I can repro tomorrow.
In meantime reverted.

ayermolo reopened this revision.Feb 16 2023, 5:22 PM
This revision is now accepted and ready to land.Feb 16 2023, 5:22 PM
This comment was removed by ayermolo.
ayermolo updated this revision to Diff 499365.Feb 21 2023, 9:20 PM

Fixed logic for DwarfMap, also removed an assert in AppleDWARFIndex::GetGlobalVariables.
Before when it would invoke GetDwoNum it will go to a virtual API that alays returned nullopt.
The ID was handled through GetOSOIndexFromUserID. Now that it's all consolidated under
GetFileIndex it's no longer an apporpriate assert I think.

ayermolo updated this revision to Diff 499601.Feb 22 2023, 11:28 AM

removed two more asserts. It made one of the tests fail as "Unresolved".

This revision was automatically updated to reflect the committed changes.

https://green.lab.llvm.org/green/job/lldb-cmake/51484/ passed.
There were other built bots failures, but looks like it was built failure related to other diff that was part of testing. Since I only changed things on mac side, and previously other built bots passed, fingers crossed this is it. :)

mib added a comment.Feb 22 2023, 2:30 PM

https://green.lab.llvm.org/green/job/lldb-cmake/51484/ passed.
There were other built bots failures, but looks like it was built failure related to other diff that was part of testing. Since I only changed things on mac side, and previously other built bots passed, fingers crossed this is it. :)

Thank you @ayermolo ! I really appreciate that you took the time to fix this :)

https://green.lab.llvm.org/green/job/lldb-cmake/51484/ passed.
There were other built bots failures, but looks like it was built failure related to other diff that was part of testing. Since I only changed things on mac side, and previously other built bots passed, fingers crossed this is it. :)

Thank you @ayermolo ! I really appreciate that you took the time to fix this :)

No, problem. Thank you for bringing it to my attention. Hopefully no more reverts will be necessary. :)