This is an archive of the discontinued LLVM Phabricator instance.

[llvm][dwwarf] Change CU/TU index to 64-bit
ClosedPublic

Authored by ayermolo on Dec 5 2022, 4:21 PM.

Details

Summary

Changed contribution data structure to 64 bit. I added the 32bit and 64bit
accessors to make it explicit where we use 32bit and where we use 64bit. Also to
make sure sure we catch all the cases where this data structure is used.

Diff Detail

Event Timeline

ayermolo created this revision.Dec 5 2022, 4:21 PM
Herald added a reviewer: Amir. · View Herald Transcript
Herald added a reviewer: maksfb. · View Herald Transcript
Herald added a project: Restricted Project. · View Herald Transcript
ayermolo requested review of this revision.Dec 5 2022, 4:21 PM
Herald added projects: Restricted Project, Restricted Project. · View Herald TranscriptDec 5 2022, 4:21 PM
ayermolo updated this revision to Diff 480299.Dec 5 2022, 5:55 PM

update s/getLength()/getLength32() in DWP.cpp

Perhaps the change to use accessors could be removed, now that you've used it to find all the places that needed to be fixed up? (like just using it for cleanup/temporary purposes, without needing to commit that API change?)

llvm/include/llvm/DebugInfo/DWARF/DWARFUnitIndex.h
115

How come this became an array? Rather than keeping it as two named fields?

Perhaps the change to use accessors could be removed, now that you've used it to find all the places that needed to be fixed up? (like just using it for cleanup/temporary purposes, without needing to commit that API change?)

I am not sure what other projects are using it, that I missed, or not in llvm-trunk, but are based of it.

ayermolo added inline comments.Dec 5 2022, 6:01 PM
llvm/include/llvm/DebugInfo/DWARF/DWARFUnitIndex.h
115

So I can provide a generic set interface to be used in llvm-dwp and on bolt side. Maybe there is another way of doing it in c++? I also didn't want to make it overly complicated.

// Write the offsets.

writeIndexTable(Out, ContributionOffsets, IndexEntries,
                DWARFUnitIndex::Entry::SectionContribution::OffsetFieldIndex);

// Write the lengths.
writeIndexTable(Out, ContributionOffsets, IndexEntries,
                DWARFUnitIndex::Entry::SectionContribution::LengthFieldIndex);
void writeIndexTable(MCStreamer &Out, ArrayRef<unsigned> ContributionOffsets,
                     const MapVector<uint64_t, UnitIndexEntry> &IndexEntries,
                     unsigned Index) {
  for (const auto &E : IndexEntries)
    for (size_t I = 0; I != std::size(E.second.Contributions); ++I)
      if (ContributionOffsets[I])
        Out.emitIntValue((E.second.Contributions[I].getField32(Index)), 4);
}
ayermolo updated this revision to Diff 480619.Dec 6 2022, 2:21 PM

fixed test

ayermolo updated this revision to Diff 480994.Dec 7 2022, 11:22 AM

rebase, clang-format

Accidentally pushed when pushing bolt changes. Reverted.

ayermolo abandoned this revision.Dec 7 2022, 2:01 PM

Redid in D139577.

ayermolo reclaimed this revision.Dec 7 2022, 3:37 PM

Perhaps the change to use accessors could be removed, now that you've used it to find all the places that needed to be fixed up? (like just using it for cleanup/temporary purposes, without needing to commit that API change?)

I am not sure what other projects are using it, that I missed, or not in llvm-trunk, but are based of it.

It's awkward to convolute the API to ensure a breakage - I think it's best to leave it as-is, for the most part. I guess you needed the 32 bit accessors so existing code doesn't become UB because of truncation to 32 bit?

Could the code keep the existing member names, provide the wrappers without turning the members into an array and needing named index constants, etc, at least? (though even then, seems like there's more to it than needed)

ayermolo updated this revision to Diff 485156.Dec 23 2022, 2:17 PM

Removed generic setters for Index, and changed back to variables

Perhaps the change to use accessors could be removed, now that you've used it to find all the places that needed to be fixed up? (like just using it for cleanup/temporary purposes, without needing to commit that API change?)

I am not sure what other projects are using it, that I missed, or not in llvm-trunk, but are based of it.

It's awkward to convolute the API to ensure a breakage - I think it's best to leave it as-is, for the most part. I guess you needed the 32 bit accessors so existing code doesn't become UB because of truncation to 32 bit?

Could the code keep the existing member names, provide the wrappers without turning the members into an array and needing named index constants, etc, at least? (though even then, seems like there's more to it than needed)

Thanks for circling back to this during holidays. :)
Right, that was my original thought pattern. I am fully open to the idea that this is not the right approach. :)

@dblaikie Happy New Year!
Penny for your thoughts on this patch? :)

@dblaikie If getter/setter APIs are a blocker, I can remove them and do casts (uint32_t) instead....?

dblaikie accepted this revision.Jan 9 2023, 4:41 PM

Perhaps the change to use accessors could be removed, now that you've used it to find all the places that needed to be fixed up? (like just using it for cleanup/temporary purposes, without needing to commit that API change?)

I am not sure what other projects are using it, that I missed, or not in llvm-trunk, but are based of it.

It's awkward to convolute the API to ensure a breakage - I think it's best to leave it as-is, for the most part. I guess you needed the 32 bit accessors so existing code doesn't become UB because of truncation to 32 bit?

Could the code keep the existing member names, provide the wrappers without turning the members into an array and needing named index constants, etc, at least? (though even then, seems like there's more to it than needed)

Thanks for circling back to this during holidays. :)
Right, that was my original thought pattern. I am fully open to the idea that this is not the right approach. :)

I don't know that there's enough out of tree usage to worry about forcing a break by changing the API like this - or to include the "get*32" functions.

Is this actually an undefined behavior issue (I don't think the implicit conversion would be any more broken than the explicit cast, at least - but can't find the specific wording about defined/undefined behavior off hand at the moment) , or only an attempt to identify places that could benefit from updates to support 64 bit lengths?

Eh, still seems weird, but guess it's not that much of a big deal - so let's go with it.

This revision is now accepted and ready to land.Jan 9 2023, 4:41 PM

Awesome, thank you for reviewing, I appreciate it!

This revision was automatically updated to reflect the committed changes.

@ayermolo This has broken some buildbots - please can you take a look? https://lab.llvm.org/buildbot/#/builders/245/builds/3279

@ayermolo This has broken some buildbots - please can you take a look? https://lab.llvm.org/buildbot/#/builders/245/builds/3279

ok

https://reviews.llvm.org/D141504
TBH not 100% sure on the fix, but looking where it's failing, maybe it's some kind of interaction with APIs returning 64bit number, and 32bit format macros on ARM?
I don't have access to ARM machine.

ayermolo reopened this revision.Jan 11 2023, 2:43 PM
This revision is now accepted and ready to land.Jan 11 2023, 2:43 PM
ayermolo updated this revision to Diff 488393.EditedJan 11 2023, 2:51 PM

Trying a fix for arm.
Couldn't repro locally with colleague on mac with ARM. So reverted will resubmit with potential fix to test on build bot.

This revision was landed with ongoing or failed builds.Jan 11 2023, 3:08 PM
This revision was automatically updated to reflect the committed changes.