Apparently, the features were getting mixed up, so we'd try to disassemble in ARM mode. Fix sub-architecture detection to compute the correct triple if we're detecting it automatically, so the user doesn't need to pass --triple=thumb etc.
It's possible we should be somehow tying the "+thumb-mode" target feature more directly to Tag_CPU_arch_profile? But this seems to work reasonably well, anyway.
While I'm here, fix up the other llvm-objdump tests that were explicitly specifying an ARM triple; that shouldn't be necessary.