This is an archive of the discontinued LLVM Phabricator instance.

[CodingStandards] Clarify C++ Standard Library usage
ClosedPublic

Authored by jdoerfert on Feb 10 2020, 9:13 AM.

Details

Summary

The existing wording leaves it unclear if C++ standard library data
structures should be preferred over custom LLVM ones, e.g., SmallVector,
even though common practice seems clear on the issue. This change makes
the wording more explicit and aligns it better with the code base.

Some motivating statistics:

ag SmallVector llvm/lib/ | wc
  8846   40306  901421
 ag 'std::vector' llvm/lib/ | wc
  2123    8990  214482

ag SmallVector clang/lib/ | wc
  3023   13824  281691
ag 'std::vector' clang/lib/ | wc
   719    2914   72817

Diff Detail

Event Timeline

jdoerfert created this revision.Feb 10 2020, 9:13 AM
Herald added a project: Restricted Project. · View Herald TranscriptFeb 10 2020, 9:13 AM
Herald added subscribers: bollu, wdng. · View Herald Transcript

I think I would prefer if this were a somewhat less absolute statement, like:

When both C++ and the LLVM support libraries provide similar functionality,
and there isn't a specific reason to favor the C++ implementation, it is
generally preferable to use the LLVM library. For example, llvm::DenseMap
should almost always be used instead of std::map or std::unordered_map,
and llvm::SmallVector should usually be used instead of std::vector.

I think I would prefer if this were a somewhat less absolute statement, like:

When both C++ and the LLVM support libraries provide similar functionality,
and there isn't a specific reason to favor the C++ implementation, it is
generally preferable to use the LLVM library. For example, llvm::DenseMap
should almost always be used instead of std::map or std::unordered_map,
and llvm::SmallVector should usually be used instead of std::vector.

Actually, do we have a discussion of the trade-offs between collection
implementations anywhere? Some of that could go in header comments,
but it might make more sense to have a place in the documentation;
among other things, it'd be a way of pointing people towards the really
specialized implementations (like llvm::TinyPtrVector) that otherwise
they might have a hard time finding.

I think I would prefer if this were a somewhat less absolute statement, like:

When both C++ and the LLVM support libraries provide similar functionality,
and there isn't a specific reason to favor the C++ implementation, it is
generally preferable to use the LLVM library. For example, llvm::DenseMap
should almost always be used instead of std::map or std::unordered_map,
and llvm::SmallVector should usually be used instead of std::vector.

Instead of

It is generally
preferable to use the LLVM support libraries over the C++ standard library
version, especially if any of the specializations is expected to improve
performance, e.g., `SmallVector` vs `std::vector`.

?

it'd be a way of pointing people towards the really specialized implementations

A list of all "specialized" versions of a common data structure would be nice, with trade-offs or use cases.
We could link to that from here once it exists.

I think I would prefer if this were a somewhat less absolute statement, like:

When both C++ and the LLVM support libraries provide similar functionality,
and there isn't a specific reason to favor the C++ implementation, it is
generally preferable to use the LLVM library. For example, llvm::DenseMap
should almost always be used instead of std::map or std::unordered_map,
and llvm::SmallVector should usually be used instead of std::vector.

Instead of

It is generally
preferable to use the LLVM support libraries over the C++ standard library
version, especially if any of the specializations is expected to improve
performance, e.g., `SmallVector` vs `std::vector`.

?

That's what I was thinking. What do you think?

it'd be a way of pointing people towards the really specialized implementations

A list of all "specialized" versions of a common data structure would be nice, with trade-offs or use cases.
We could link to that from here once it exists.

Any interest on putting that together? :)

Use wording provided by @rjmccall

Any interest on putting that together? :)

Maybe but not right now. I'd be interested in reading it to discover new things ;)

I think your wording is good, I updated the text.

rjmccall added inline comments.Feb 10 2020, 12:43 PM
llvm/docs/CodingStandards.rst
73–77

There should be a comma after "structures" here.

Feel free to commit with that fix.

@rjmccall :

Feel free to commit with that fix.

For the record, I take that as an accept and commit this.

This revision was not accepted when it landed; it landed in state Needs Review.Feb 10 2020, 6:35 PM
This revision was automatically updated to reflect the committed changes.
fhahn added inline comments.Feb 11 2020, 2:50 AM
llvm/docs/CodingStandards.rst
85

The programmer's manual has a section Picking the Right Data Structure for a Task https://llvm.org/docs/ProgrammersManual.html#picking-the-right-data-structure-for-a-task

It goes into much more detail, but it might be good to link/unify both?

rjmccall added inline comments.Feb 11 2020, 8:30 AM
llvm/docs/CodingStandards.rst
85

Ah, I'd forgotten about that section completely. Yes, it should absolutely be linked here. I don't think we want to merge the sections, though; what we've got now is a fine lead-in to a link like "For more information about LLVM's data structures and the tradeoffs they make, please consult that section of the programmer's manual."