Scheduling AESE/AESMC and AESD/AESIMC instruction pairs back-to-back
gives a double digit speedup on benchmarks using those instructions on
Cortex-A processors. In GCC, this optimization is part of the generic
processor model as well.
This change should not have a major performance impact on processors
that do not optimize AES instruction pairs, although I only had access
to Cortex-A processors for benchmarking.