If a kernel had no formal arguments but did have the implicit
arguments, we were reporting a required kernarg alignment of 4. For
some reason we require an 8-byte alignment for this, even though
there's no real advantage and I don't see where this is documented in
the ABI.
The code object header code also claims the minimum alignment is 16,
which is what I thought you always got at runtime anyway so I don't
know why this matters.