This is an archive of the discontinued LLVM Phabricator instance.

[Support] change StringMap hash function from djbHash to xxHash
ClosedPublic

Authored by erikdesjardins on Jan 29 2023, 8:02 PM.

Details

Summary

Depends on https://reviews.llvm.org/D142861.

Alternative to https://reviews.llvm.org/D137601.

xxHash is much faster than djbHash. This makes a simple Rust test case with a large constant string 10% faster to compile.

Previous attempts at changing this hash function (e.g. https://reviews.llvm.org/D97396) had to be reverted due to breaking tests that depended on iteration order.
No additional tests fail with this patch compared to main when running check-all with -DLLVM_ENABLE_PROJECTS="all" (on a Linux host), so I hope I found everything that needs to be changed.

Diff Detail

Event Timeline

erikdesjardins created this revision.Jan 29 2023, 8:02 PM
Herald added a project: Restricted Project. · View Herald Transcript
erikdesjardins requested review of this revision.Jan 29 2023, 8:02 PM
Herald added projects: Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project, Restricted Project. · View Herald TranscriptJan 29 2023, 8:02 PM
erikdesjardins added inline comments.Jan 29 2023, 8:09 PM
llvm/test/DebugInfo/X86/gnu-public-names.ll
68–71

Previously, this was

; ASM: .byte   32                      # Attributes: VARIABLE, EXTERNAL
; ASM-NEXT: .asciz  "global_variable"       # External Name
...more lines...
; ASM: .byte   32                      # Attributes: VARIABLE, EXTERNAL
; ASM-NEXT: .asciz  "ns::global_namespace_variable"       # External Name

but the reordering resulted in there being another .byte 32 (a very fragile check line!) before global_variable.

189–192

Similarly here--f7 used to be at the end of the list below, and was just ignored, but now it appears in the middle

Can you take a shot against https://llvm-compile-time-tracker.com/ so that we get an hint of the practical speedup?

Looks like xxhash.h is missing from the patch.

Do you intend to (optionnaly) provide XXH3 as described in https://github.com/Cyan4973/xxHash ?

Can you take a shot against https://llvm-compile-time-tracker.com/ so that we get an hint of the practical speedup?

I don't think I have permission to use that. @nikic?

Looks like xxhash.h is missing from the patch.

It already exists: https://github.com/llvm/llvm-project/blob/main/llvm/include/llvm/Support/xxhash.h

MaskRay added a comment.EditedJan 30 2023, 10:01 AM

To enable smooth hash function migration in the future, we can explore the idea of adding #ifdef EXPENSIVE_CHECKS\nshuffle to StringMap (see https://github.com/llvm/llvm-project/issues/34483).
That means we should fix these tests properly to be non-dependent on the iteration order.

FWIW, if the only use if hashtables xxhash has a slight perf bug in XXH3_len_129to240_128b and XXH3_len_129to240_64b where it will repeat the last the block if length is a factor of 32/16 respectively.
XXH3_len_129to240_128b:

XXH3_len_129to240_64b:

It's not fixed b.c xxhash needs to have a stable result for compression, but for hashtable it might be worth fixing in the internal llvm version?

rebase over landed D142861

Can you take a shot against https://llvm-compile-time-tracker.com/ so that we get an hint of the practical speedup?

I ran rustc perf tests using a patched LLVM in https://github.com/rust-lang/rust/pull/107552#issuecomment-1411927687,
you can see the results here: https://perf.rust-lang.org/compare.html?start=ad8e1dc2863f63c35ef3ceef3064d0851a1d2582&end=8aaf54b663467dc48ff24d764aa9f73c61db2733&stat=instructions:u
Improvements from 0.2%-2% on various benchmarks, no regressions.

@serge-sans-paille it looks like you have access to llvm-compile-time-tracker, can you push this patch up so we can see those perf tests too?
Probable oneliner: git remote add erikdesjardins https://github.com/erikdesjardins/llvm-project.git && git fetch erikdesjardins && git push origin erikdesjardins/xxhash-perf:refs/heads/perf/xxhash

Do you intend to (optionnaly) provide XXH3 as described in https://github.com/Cyan4973/xxHash ?

The current version of xxhash in tree is stripped down to just xxhash64. I don't intend to change that as part of this patch.

MaskRay added inline comments.Feb 2 2023, 5:20 PM
clang-tools-extra/clangd/unittests/BackgroundIndexTests.cpp
638 ↗(On Diff #494155)

Rebase. I've changed this to UnorderedElementsAre

clang-tools-extra/test/modularize/ProblemsInconsistent.modularize
6 ↗(On Diff #494155)

Rebase. I've fixed modularize's misuse of StringMap https://llvm.org/docs/ProgrammersManual.html#llvm-adt-stringmap-h

StringMap iteration order, however, is not guaranteed to be deterministic, so any uses which require that should instead use a std::map.

erikdesjardins added inline comments.Feb 2 2023, 7:29 PM
mlir/test/Transforms/print-op-graph.mlir
9 ↗(On Diff #494155)

Opened https://reviews.llvm.org/D143239 for this test, I'll rebase once it lands.

MaskRay accepted this revision.Feb 2 2023, 8:04 PM
MaskRay added inline comments.
mlir/test/Transforms/print-op-graph.mlir
1 ↗(On Diff #494155)

I fixed mlir/lib/Transforms/ViewOpGraph.cpp. Please rebase and drop changes to this file.

This revision is now accepted and ready to land.Feb 2 2023, 8:04 PM
MaskRay added a comment.EditedFeb 2 2023, 8:07 PM

For posterity, the tests that fail on main are:

Skipped : 43
[...]

You can remove these from the description. They are flaky tests unrelated to this patch (openmp tests tend to be flaky from my experience).

rebase, remove now-unnecessary test changes

erikdesjardins edited the summary of this revision. (Show Details)

update description

MaskRay accepted this revision.Feb 4 2023, 12:38 PM
MaskRay added inline comments.
llvm/test/ObjectYAML/Offload/multiple_members.yaml
21 ↗(On Diff #494836)

Rebase:) I fixed obj2yaml

llvm/test/tools/llvm-profdata/suppl-instr-with-sample.test
1

llvm-profdata uses OnDiskHashTable.h which I think is infeasible to stabilize without drastic change to the format. So keep the change.