[Hexagon] Simplify CFG after atomic expansion
This will remove suboptimal branching from the generated ll/sc loops.
The extra simplification pass affects a lot of testcases, which have
been modified to accommodate this change: either by modifying the
test to become immune to the CFG simplification, or (less preferablt)
by adding option -hexagon-initial-cfg-clenaup=0.