Currently, the interaction between the triple, the CPU, and the supported features is a mess: the driver edits the triple to indicate the supported architecture version, and the LLVM backend uses this to figure out what instructions are legal. This makes it difficult to understand what's happening, and makes it impossible to LTO together two modules with different computed architectures.
Instead of relying on triple rewriting to get the correct target features, we should add the right target features explicitly.