Page MenuHomePhabricator

[WIP] [Bitstream] take Expected<T> off critical reading paths
Needs ReviewPublic

Authored by sammccall on Jul 13 2022, 6:08 PM.

Details

Summary

BitstreamReader can encounter many types of error and returns Expected<T> to
signal them. This is rich & typesafe, though adds some overhead.

Unfortunately this technique is used for functions as low-level as "read an
n-bit integer" which are called frequently, and the overhead adds up.

This patch tries to avoid this:

  • next to low-level functions that return Expected (e.g. Read), adds a lower-level ReadE which returns the value, and signals errors out of band.
  • ReadE sets a "last error" object on the stream which needs to be checked
  • it turns out there are cases where we can safely elide/combine some checks, simplifying control flow
  • Read is still available with the same signature, but implemented on ReadE.
  • ReadRecord is converted to use ReadE and friends
  • to avoid inlining a bunch of "makeStringError" formatting into hot functions, we convert the errors used to a custom error type which is cheaper.

A motivating use case: clangd builds a clang PCH ones and then reuses it for
many reparses of the main file. This patch speeds up those reparses by ~7%
(tested with clang/lib/Sema/SemaOverload.cpp and clangd/AST.cpp).

Diff Detail

Event Timeline

sammccall created this revision.Jul 13 2022, 6:08 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 13 2022, 6:08 PM
sammccall requested review of this revision.Jul 13 2022, 6:08 PM
Herald added a project: Restricted Project. · View Herald TranscriptJul 13 2022, 6:08 PM

This is very much a prototype, hopefully the idea is clear.

Would appreciate some feedback on:

  • is this a reasonable/comprehensible error-handling scheme, does it introduce too much tech debt for the performance win etc
  • whether to keep the E versions of the functions private, whether to make both versions public, or whether to completely replace the old versions
  • better naming

Profile before


Profile after

The workload is clangd --check=AST.cpp, modified to rebuild the AST 1000 times, it's roughly 2/3rds buildAST.
There's about 4% savings clearly visible, and 4% divided by 2/3rs is close to the 7% speedup I saw in the logs.

I would expect this to generalize to other PCH/modules use.

nridge added a subscriber: nridge.Jul 17 2022, 12:51 AM