Currently, -flto-unit is specified whenever LTO options are used
(unless using the old LTO API). This causes vtable defs to be processed
using regular LTO, which is needed for CFI and whole program vtable
optimizations, since they need to modify the vtables in a whole program
manner.
However, this causes non-negligible overhead due to the regular
LTO processing. Since this isn't needed when not using CFI or
-fwhole-program-vtables, only enable -flto-unit in those cases.
Otherwise all ThinLTO compiles pay the overhead, even when not needed.