Previously, the column number in a diagnostic was the byte position
in the line. This resulted in incorrect column numbers whenever a
multi-byte UTF-8 character preceded the reported location on that line.
This change corrects for those multi-byte characters and for zero-width
combining diacritical marks.
This fixes PR21144.
Instead of adding a parameter to getColumnNumber, it would probably make sense to just make this caller correct the column number afterwards.
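
As a rough illustration of what such a caller-side correction could look like, here is a minimal sketch. The helper name displayColumn is hypothetical and is not part of LLVM. The sketch assumes the line is valid UTF-8, that columns are 1-based, and that only the combining diacritical marks block (U+0300 to U+036F) counts as zero-width. The actual patch may use a different width table.

```cpp
#include <cstddef>
#include <string_view>

// Hypothetical helper (not an LLVM API): converts a 1-based byte column
// into a 1-based character column for one line of UTF-8 text.
// Assumptions: the input is valid UTF-8, and only U+0300..U+036F
// (combining diacritical marks) are treated as zero-width.
std::size_t displayColumn(std::string_view line, std::size_t byteColumn) {
  std::size_t column = 1;
  std::size_t i = 0;
  // Walk every code point that starts before the reported byte offset.
  while (i < line.size() && i + 1 < byteColumn) {
    unsigned char lead = static_cast<unsigned char>(line[i]);
    std::size_t len = 1;
    char32_t cp = lead;
    // The lead byte determines the length of the encoded sequence.
    if (lead >= 0xF0) { len = 4; cp = lead & 0x07; }
    else if (lead >= 0xE0) { len = 3; cp = lead & 0x0F; }
    else if (lead >= 0xC0) { len = 2; cp = lead & 0x1F; }
    // Fold in the low six bits of each continuation byte.
    for (std::size_t k = 1; k < len && i + k < line.size(); ++k)
      cp = (cp << 6) | (static_cast<unsigned char>(line[i + k]) & 0x3F);
    // Zero-width marks attach to the previous character and take no column.
    bool zeroWidth = cp >= 0x0300 && cp <= 0x036F;
    if (!zeroWidth)
      ++column;
    i += len;
  }
  return column;
}
```

With this shape, getColumnNumber keeps returning a byte-based column and only the diagnostic-printing caller applies the adjustment, so other callers that rely on byte offsets are unaffected.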