Many sources show that xxh3 is much better than xxh64. This particular
instance may or may not show a noticeable difference, but this change
moves us toward removing xxHash64.
Repository: rG LLVM Github Monorepo
Event Timeline
These hashes are used in the filenames of index shards, and in the metadata describing dependencies between shards.
So this is an incompatible change to the index format; you also need to increment constexpr static uint32_t Version in Serialization.cpp:460.
(I think this is fine, but I'm not stamping it just yet as I want to sync with Kadir on Monday. I don't think we have anything that GCs the old shards, but that probably shouldn't block this change.)
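To make the incompatibility concrete, here is a minimal sketch of why swapping the hash orphans every existing shard. It is not the clangd implementation: the "<basename>.<hex digest>.idx" naming scheme is an assumption for illustration, oldHash and newHash are FNV stand-ins (not xxHash64 or xxh3), and the Version value and the canLoad helper are hypothetical.

```cpp
#include <cstdint>
#include <cstdio>
#include <string>

// Stand-in hashes, NOT xxHash64/xxh3: two different 64-bit functions used
// only to show that changing the hash changes every shard name.
uint64_t oldHash(const std::string &S) { // FNV-1a: xor, then multiply
  uint64_t H = 14695981039346656037ULL;
  for (unsigned char C : S) { H ^= C; H *= 1099511628211ULL; }
  return H;
}
uint64_t newHash(const std::string &S) { // FNV-1: multiply, then xor
  uint64_t H = 14695981039346656037ULL;
  for (unsigned char C : S) { H *= 1099511628211ULL; H ^= C; }
  return H;
}

// Assumed naming scheme: "<basename>.<16 hex digits of hash(path)>.idx".
std::string shardName(const std::string &Path,
                      uint64_t (*Hash)(const std::string &)) {
  // find_last_of returns npos when there is no '/'; npos + 1 wraps to 0,
  // so the whole string is used as the basename in that case.
  std::string Base = Path.substr(Path.find_last_of('/') + 1);
  char Hex[17];
  std::snprintf(Hex, sizeof(Hex), "%016llX",
                static_cast<unsigned long long>(Hash(Path)));
  return Base + "." + Hex + ".idx";
}

// Hypothetical format version; bumping it makes readers reject shards
// written by older builds instead of misinterpreting their digests.
constexpr uint32_t Version = 2;
bool canLoad(uint32_t StoredVersion) { return StoredVersion == Version; }
```

With the new hash, a lookup for the same source path computes a different filename, so the old shard is never found and never overwritten. The version bump covers the other half: digests stored inside a shard's dependency metadata would no longer match, so old shards should be rejected outright.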
thanks for doing this!
as Sam pointed out, this will invalidate all the index shards, but that's not something new. we already don't clean up stale index shards when people delete sources over time, and rely on people having new checkouts or cleaning things up manually. this will be somewhat more severe, as the on-disk index size will double all of a sudden (the old shards stay next to the new ones).
but i'd like to keep this patch from landing until llvm-17 is cut (which should be tomorrow). we haven't had any changes to our index shard serialization since the llvm-16 release, so forcing everyone to face this invalidation purely for the sake of a cleanup that might never happen in the codebase doesn't feel like enough of a justification. it's more likely that we'll have other changes to the serialization format during the next release cycle, so we can hit users only once with a big cache invalidation.