This is an archive of the discontinued LLVM Phabricator instance.

[NFC] Refactor Apple Accelerator Tables
ClosedPublic

Authored by JDevlieghere on Jan 20 2018, 8:02 AM.

Details

Summary

[NFC] Refactor Apple Accelerator Tables

This patch completely overhauls the accelerator table and makes them
truly generic. There have been several attempts to do this in the past:

In the end I didn't like either of them, because for some reason or
another parts of it felt hacky or decreased runtime performance. I
didn't want to completely rewrite them as I was hoping that we could
reuse parts for the successor in the DWARF standard. However, it seems
less and less likely that there will be a lot of opportunities for
sharing code and/or an interface. I just went for what I think is the
proper solution to our problem.

A quick summary for why these changes are necessary: dsymutil likes to
reuse the current implementation of the Apple accelerator tables.
However, LLDB expects a slightly different interface than what is
currently emitted. Additionally, in dsymutil we only have offsets and no
actual DIEs.

Although a lot of code seems to have changed, in the end this change is
pretty straightforward:

  • We turned HashDataContents into a template parameter.
  • We created two implementations of this class, one for type tables and one for everything else. There will be a third one for dsymutil with just an offset.
  • We use the supplied class to deduct the atoms for the header which makes the structure of the table fully self contained.
  • We renamed the prefix from DWARF- to Apple- to make space for the future implementation of .debug_names.

One could argue that the latter should've been a separate patch, but
since this refactoring touches almost every line, it doesn't really
matter.

The disadvantage of this is of course that a lot of the code is moved to
the header. One mitigation was splitting off the header class so we could
have the ::Emit() method implemented in the cpp file. With regards to
includes we only need to import two more headers than before
(CodeGen/AsmPrinter.h and llvm/MC/MCStreamer.h) which seems
reasonable.

Diff Detail

Repository
rL LLVM

Event Timeline

JDevlieghere created this revision.Jan 20 2018, 8:02 AM
aprantl added inline comments.Jan 20 2018, 8:16 PM
lib/CodeGen/AsmPrinter/DwarfAccelTable.h
178 ↗(On Diff #130765)

Reading just the struct definition I'm a little puzzled why this needs to be a template if we are only storing a pointer to DataT. i wonder if we could get away with just templateizing one or two methods instead of the entire class.

288 ↗(On Diff #130765)

Could you convert this to LLVM style with capitalized variable names?

326 ↗(On Diff #130765)

does this look better as a range-based-for?

Thanks a lot for the prompt review Adrian!

lib/CodeGen/AsmPrinter/DwarfAccelTable.h
178 ↗(On Diff #130765)

I don't think there's anything preventing this, I just liked this approach better because:

  1. No overhead of virtual calls.
  2. No risk of accidentally mixing different types of data in one table. (make interfaces easy to use correctly and hard to use incorrectly)
  3. No need to explicitly specify the atoms in the constructor and having it abstracted and self contained in the templated class. (same reason as 2)

Because of (2) and (3) it makes sense to have an emit interface where you call the appropriate methods to emit the data, because the atoms are right there and you have a guarantee about their form. If this is not the case, I think we should go for an interface similar to D42246 where DataT has an interface to get the data for a given atom and emit the correct form based on the atom's form; which means a loop and more function calls.

tl;dr it's an interface design decision and not a technical limitation.

288 ↗(On Diff #130765)

Sure! I left it like this based on Chandler's comment here: https://reviews.llvm.org/D40987?id=126062#inline-357702

326 ↗(On Diff #130765)

Definitely. I removed most of them but this one slipped through the cracks. Thanks!

davide added a subscriber: davide.Jan 21 2018, 12:24 PM
aprantl added inline comments.Jan 22 2018, 9:18 AM
lib/CodeGen/AsmPrinter/DwarfAccelTable.h
178 ↗(On Diff #130765)

I'm mostly worried about code size. My understanding (please let me know if this is misinformed) is that by having the entire class be a template we will get N copies of all methods including the ones that don't depend on the templated type such as the constructor.

No overhead of virtual calls.

Do you mean virtual calls when calling methods on the Values stored in the vector?
You could templateize the emit method itself and/or create a non-template AppleAccelTableBase class that contains all the generic code and templated AppleAccelTable<DataT> inheriting from that class. That's a common pattern I've seen used all over LLVM.

JDevlieghere added inline comments.Jan 22 2018, 11:49 AM
lib/CodeGen/AsmPrinter/DwarfAccelTable.h
178 ↗(On Diff #130765)

Indeed, the virtual call would be the ::emit() method in DataT, so one indirect call for every entry in the list. Inheriting from a common base class isn't really feasible either, as every function (indirectly) depends on DataT.

As we discussed offline, I share your code size concerns. There should be at most 3 different instantiations of the template, so the growth is limited as long as no new configuration of tables are added, which is a realistic assumption given the existence of DWARFv5 accelerator tables. This also means that as we move to the DWARFv5 implementation on darwin, this will becomes less of a hot path and it might be worth settling for a little runtime overhead to save on code size.

I encourage everyone to share which of the two trade-offs they feel we should pursue.

aprantl added inline comments.Jan 22 2018, 12:02 PM
lib/CodeGen/AsmPrinter/DwarfAccelTable.h
178 ↗(On Diff #130765)

That is good point. Since this destined to become legacy code path that we will need to carry around for an indefinite time (because of backwards deployment), optimizing for code size sounds like the right approach to me. Does the implementation on opensource.apple.com also use virtual function calls or would this be a regression?

JDevlieghere planned changes to this revision.Jan 22 2018, 1:16 PM
JDevlieghere added inline comments.
lib/CodeGen/AsmPrinter/DwarfAccelTable.h
178 ↗(On Diff #130765)

Alright, I will update the patch to optimize for code size. Indeed, our internal implementation already uses a the virtual call, so this would not be a regression.

Optimize for code size and use virtual call to ::emit().

Fix discrepancies with the LLVM coding standards.

I've uploaded D42420 so you can see my current work on the debug_names generation. It's still quite ugly and there are several interactions that I still have to figure out (e.g., how do I get the list of CUs I am indexing), but the thing I want to show is that there are more opportunities for code sharing here than on the parsing side: obviously the header is completely different, but the code for emission of some individual subtables can be shared as long as we can customize the top-level emit() function to control the order. Also, the functions for gathering and processing the data (AddName, finalize) can largely be shared.

The one idea I think could make things cleaner, but I haven't tried out yet, is to split this code into two parts:

  • one would deal purely with abstract data representation: gathering names, computing hashes, bucket counts, etc. This one would identical for both tables.
  • the other part would deal with serialization of the abstract representation. Here we would have two classes for the two table kinds plus a base class for common functionality.

I don't think any of that affects this patch directly, but it may be interesting for you to see where the common functionality is.

I've uploaded D42420 so you can see my current work on the debug_names generation. It's still quite ugly and there are several interactions that I still have to figure out (e.g., how do I get the list of CUs I am indexing), but the thing I want to show is that there are more opportunities for code sharing here than on the parsing side: obviously the header is completely different, but the code for emission of some individual subtables can be shared as long as we can customize the top-level emit() function to control the order. Also, the functions for gathering and processing the data (AddName, finalize) can largely be shared.

Great, this is definitely very interesting. I'm pleasantly surprised by what you managed to push down into the AccelTable base class.

The one idea I think could make things cleaner, but I haven't tried out yet, is to split this code into two parts:

  • one would deal purely with abstract data representation: gathering names, computing hashes, bucket counts, etc. This one would identical for both tables.
  • the other part would deal with serialization of the abstract representation. Here we would have two classes for the two table kinds plus a base class for common functionality.

That sounds great, you have my full support on that. Let me know if/how I can help. Do you have anything particular in mind to achieve this?

I don't think any of that affects this patch directly, but it may be interesting for you to see where the common functionality is.

Definitely. This is part of the reason I want to get this change upstream now. The current implementation doesn't support all of Apple's use cases and making a change like this *after* having factored out the common parts will only make things harder. Especially because we want to support the same use cases with the DWARFv5 tables as well.

aprantl added inline comments.Jan 23 2018, 8:33 AM
lib/CodeGen/AsmPrinter/DwarfAccelTable.h
69 ↗(On Diff #131042)

Can you be specific as to which variant of accel table this is referring to?

100 ↗(On Diff #131042)

... Conceptually it is a column with ...

324 ↗(On Diff #131042)

This is just me being curious: Would it make sense to have users pass the new (Allocator) DataT(D) and not make this a template at all?

JDevlieghere marked 3 inline comments as done.
  • Feedback Adrian
lib/CodeGen/AsmPrinter/DwarfAccelTable.h
324 ↗(On Diff #131042)

Wouldn't you agree that exposing the allocator to the caller makes a horrible interface?

aprantl added inline comments.Jan 23 2018, 10:09 AM
lib/CodeGen/AsmPrinter/DwarfAccelTable.h
324 ↗(On Diff #131042)

Fair point.

aprantl accepted this revision.Jan 23 2018, 10:10 AM

I'm happy with this now, please be sure to address any outstanding feedback from Pavel, though, if any.

This revision is now accepted and ready to land.Jan 23 2018, 10:10 AM

Alright!

@labath Do you have any comments/concerns/feedback?

I find the fact that you have to specify type of data in the addName call a bit weird. This means you cannot guarantee the homogeneity of the table, as different addName calls on the same table may use different types (and none of them needs to match the atom array you passed to the constructor). It sounds to me like the data type should really a part of the table type (but you have not done that because of the code size concerns et al.).

This "non-template base class" idea has already been proposed earlier in the thread by Adrian, but got dismissed because at that point you were still trying to optimize everything for speed. Now that you already have a base class for the data types, it should be fairly trivial to implement: put everything except the constructor and the addName function in the base class; the templated constructor can implicitly take the atom array as T::Atoms, and addName will automatically construct entries of type T. That should guarantee consistency with only a tiny code size increase.

What do you think?

JDevlieghere planned changes to this revision.Jan 24 2018, 2:10 AM

I find the fact that you have to specify type of data in the addName call a bit weird. This means you cannot guarantee the homogeneity of the table, as different addName calls on the same table may use different types (and none of them needs to match the atom array you passed to the constructor). It sounds to me like the data type should really a part of the table type (but you have not done that because of the code size concerns et al.).

Exactly.

This "non-template base class" idea has already been proposed earlier in the thread by Adrian, but got dismissed because at that point you were still trying to optimize everything for speed.

Correct, because by templating the data type we don't need a virtual call.

Now that you already have a base class for the data types, it should be fairly trivial to implement: put everything except the constructor and the addName function in the base class; the templated constructor can implicitly take the atom array as T::Atoms, and addName will automatically construct entries of type T. That should guarantee consistency with only a tiny code size increase.

What do you think?

I think it's a great idea! This would allow us to regain the strong interface I was aiming for.

  • Split base class from templated implementation.
This revision is now accepted and ready to land.Jan 24 2018, 5:23 AM
labath accepted this revision.Jan 24 2018, 5:30 AM

Cool, thanks.

Thanks Pavel!

I'll wait on D42501 to land this, as I need both to merge these changes internally.

This revision was automatically updated to reflect the committed changes.