[AArch64] This patch extends the lowering of trunc instructions to 'tbl' for (8|16) x i32 -> (8|16) x i8 conversions, introduced in [[ https://reviews.llvm.org/D133495 | D133495 ]], to also cover (8|16) x i64 -> (8|16) x i8 conversions.
A runtime microbenchmark covering all of these cases has been added in [[ https://reviews.llvm.org/D136274 | D136274 ]].
Depends on [[ https://reviews.llvm.org/D133495 | D133495 ]]
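
For reviewers less familiar with the idea, the following is a rough model of why a byte-table lookup can implement these truncations. It is only a sketch under stated assumptions: lanes are laid out little-endian, the helper name `trunc_via_byte_select` is hypothetical, and the index mask the patch actually builds may differ. Truncating each lane to i8 amounts to selecting the lowest byte of every source lane from the concatenated input bytes.

```python
import struct


def trunc_via_byte_select(values, src_bits=64):
    """Model trunc <N x iK> -> <N x i8> as a byte-table lookup.

    Hypothetical helper for illustration only; assumes little-endian lanes.
    Returns the byte indices used and the truncated i8 results.
    """
    src_bytes = src_bits // 8
    fmt = {4: "<I", 8: "<Q"}[src_bytes]  # i32 or i64 source lanes
    lane_mask = (1 << src_bits) - 1

    # Concatenate all source lanes into one flat byte table.
    table = b"".join(struct.pack(fmt, v & lane_mask) for v in values)

    # The lowest byte of lane i sits at offset i * src_bytes.
    indices = [lane * src_bytes for lane in range(len(values))]

    # A table lookup with these indices yields the truncated lanes.
    return indices, [table[i] for i in indices]


if __name__ == "__main__":
    lanes = [0x1122334455667788 + i for i in range(8)]
    idx, out = trunc_via_byte_select(lanes, src_bits=64)
    print(idx)
    print([hex(b) for b in out])
```

In this model, 8 x i64 occupies 64 bytes of table data and the selected indices are every eighth byte; 16 x i64 occupies 128 bytes, so it would presumably need more than one lookup.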