Splitting this into its own commit to get some visibility.
After extensive benchmarking with the llvm test-suite, I see no correctness or compile-time changes. I see some very slight benchmark runtime improvements (which may or may not be noise). The only noticeable regression is in a benchmark where literally 6 lines of assembly changed for the better, so that must be a cache issue.