When handling loops whose VF is 1, fold-tail vectorization sets the backedge-taken count of the original loop as a single-element vector. This causes a type mismatch during instruction generation, as reported in https://reviews.llvm.org/D78847#2029339
Based on a discussion with @Ayal, I provide this patch to address the VF==1 case.
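To illustrate the mismatch, here is a minimal standalone sketch. It is a toy model, not the patch itself and not LLVM API: `ToyType` and all four helper functions are hypothetical names made up for this example. It assumes that with VF==1 the other compare operand is a scalar, and that the fix amounts to keeping the backedge-taken count scalar in that case.

```cpp
#include <cassert>

// Hypothetical stand-in for an IR type: a scalar (NumElts == 0) or a
// fixed-width vector with NumElts elements. Note that a one-element
// vector is still a vector type, distinct from the scalar type.
struct ToyType {
  unsigned NumElts;
  bool isVector() const { return NumElts != 0; }
  bool operator==(const ToyType &Other) const {
    return NumElts == Other.NumElts;
  }
};

// Models the behaviour described above: the backedge-taken count is
// always broadcast, so VF == 1 yields a one-element vector.
ToyType alwaysBroadcast(unsigned VF) {
  assert(VF >= 1 && "VF must be at least 1");
  return ToyType{VF};
}

// Assumed shape of the fix: keep the count scalar when VF == 1.
ToyType broadcastUnlessScalar(unsigned VF) {
  assert(VF >= 1 && "VF must be at least 1");
  return VF == 1 ? ToyType{0} : ToyType{VF};
}

// Assumed type of the value the count is compared against:
// scalar when VF == 1, a VF-wide vector otherwise.
ToyType inductionType(unsigned VF) {
  assert(VF >= 1 && "VF must be at least 1");
  return VF == 1 ? ToyType{0} : ToyType{VF};
}

// A compare is only well-typed if both operand types agree.
bool compareIsWellTyped(ToyType LHS, ToyType RHS) { return LHS == RHS; }
```

In this model, `compareIsWellTyped(inductionType(1), alwaysBroadcast(1))` is false because a scalar is compared against a one-element vector, while the same check with `broadcastUnlessScalar(1)` passes. For VF > 1 both variants agree.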
nit: using auto VF = State->VF will help shorten this to 2 lines.