The addition of the inverse_throughput mode highlighted the mismatch between snippet generators and benchmark runners, because that mode uses the UopsSnippetGenerator with the LatencyBenchmarkRunner. To keep the code consistent, tie the snippet generators to parallelization/serialization rather than to their benchmark runners.
Rename LatencySnippetGenerator -> SerialSnippetGenerator.
Rename UopsSnippetGenerator -> ParallelSnippetGenerator.
Rename Uops -> Parallel in types and functions related to the ParallelSnippetGenerator.