This mirrors the test-lower-to-llvm pass pipeline that provides some sanity when running e2e examples.
One peculiarity of the GPU pipeline is that we want to allow 32b indexing in kernels.
This is currently not straightforward as there are dependencies between passes.
This new test pass orders passes in a way that connects end-to-end.
inconsistent?