Lower Forall to the previously added hlfir.forall, hlfir.forall_mask.
hlfir.forall_index, and hlfir.region_assign operations.
The HLFIR assignment code lowering is moved into genDataAssignment for
more readability and so that user defined assignment (still a TODO),
will be able to share most of the logic.
Why cannot we do the same for the masked case?