If a nop sequence is unaligned, pad it with zeros to the nearest
4-byte boundary before filling with nop instructions. This matches
gas behavior and is necessary to compile the Linux kernel with
LLVM IAS.

While at it, replace support::endian::write with OS.write. This is
simpler and still correct because we only support little-endian.
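
For illustration, the padding scheme described above can be sketched as
a standalone function. This is not the code from the patch: the name
writeNopFill, the std::vector output buffer, and the kNopWord encoding
are all assumptions made for the example, and the sketch assumes the
unaligned remainder is Count % 4 bytes. The real change writes to the
backend's output stream instead.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical 4-byte nop encoding, used only for illustration.
// Substitute the target's real nop instruction word.
constexpr std::uint32_t kNopWord = 0x03400000;

// Append `count` bytes of fill to `out`: zeros first to absorb the
// unaligned remainder, then whole 4-byte nop instructions.
void writeNopFill(std::vector<std::uint8_t> &out, std::size_t count) {
  // Zero-fill the count % 4 bytes that cannot hold a full instruction.
  out.insert(out.end(), count % 4, std::uint8_t{0});

  // Fill the rest with nops, least significant byte first. Writing the
  // bytes directly is enough when the target is little-endian only.
  for (std::size_t n = count / 4; n > 0; --n)
    for (int shift = 0; shift < 32; shift += 8)
      out.push_back(static_cast<std::uint8_t>(kNopWord >> shift));
}
```

With a 6-byte request, this sketch emits two zero bytes followed by one
nop word in little-endian byte order.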