TrigramIndex was added in https://reviews.llvm.org/D27188 as an optimization to make SpecialCaseList::match() faster. I've found that TrigramIndex actually makes the function slower, and since it is purely an optimization with no functional use, we can remove it.
I collected the list of queries passed to SpecialCaseList::match() for an arbitrary, very large file (AArch64ISelLowering.cpp) and measured the time it takes to call match() on all of them with this line disabled and then enabled.
$ hyperfine --warmup 3 'GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=1 build/unittests/Support/SupportTests' 'GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=0 build/unittests/Support/SupportTests'
Benchmark 1: GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=1 build/unittests/Support/SupportTests
  Time (mean ± σ):     575.9 ms ±  20.3 ms    [User: 573.1 ms, System: 2.7 ms]
  Range (min … max):   555.5 ms … 620.0 ms    10 runs

Benchmark 2: GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=0 build/unittests/Support/SupportTests
  Time (mean ± σ):     283.4 ms ±   6.7 ms    [User: 280.3 ms, System: 3.0 ms]
  Range (min … max):   277.0 ms … 294.9 ms    10 runs

Summary
  'GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=0 build/unittests/Support/SupportTests' ran
    2.03 ± 0.09 times faster than 'GTEST_FILTER="SpecialCaseListTest.Large" USE_TRIGRAMS=1 build/unittests/Support/SupportTests'
Using perf, I found that most of the runtime in TrigramIndex::isDefinitelyOut() is spent in a division operation that appears to originate in std::unordered_map: https://github.com/llvm/llvm-project/blob/8e1f820bb4eadf5c0704818f6063e0db1006e32d/llvm/include/llvm/Support/TrigramIndex.h#L62
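For context on where a division could come from, here is a minimal sketch (not the TrigramIndex code). A std::unordered_map lookup has to turn the key's hash into a bucket index, and standard library implementations commonly do that by reducing the hash modulo the bucket count. That is an implementation detail rather than something the standard guarantees, so the two helpers below may or may not agree on a given toolchain. The names TrigramTable, containerBucket, and moduloBucket are made up for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for a trigram -> rule-indices table. The name and
// layout are assumptions for illustration only.
using TrigramTable = std::unordered_map<unsigned, std::vector<std::size_t>>;

// Bucket index the container itself reports for Key.
std::size_t containerBucket(const TrigramTable &Table, unsigned Key) {
  assert(Table.bucket_count() > 0 && "bucket() needs at least one bucket");
  return Table.bucket(Key);
}

// Bucket index obtained by reducing the hash modulo the bucket count.
// When the bucket count is not a compile-time constant, this '%' is a real
// integer division at run time, paid on every lookup.
std::size_t moduloBucket(const TrigramTable &Table, unsigned Key) {
  assert(Table.bucket_count() > 0 && "cannot reduce modulo zero buckets");
  return Table.hash_function()(Key) % Table.bucket_count();
}
```

Comparing containerBucket() and moduloBucket() for a few keys on a given standard library is a quick way to check whether that library uses a plain modulo, which would be consistent with the division seen in the profile.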
Removing TrigramIndex will make it easier to potentially switch to using GlobPattern instead of a full regex for SpecialCaseList. See discussion in https://reviews.llvm.org/D152762 for details.