The ARM documentation for ARMv7-M indicates that atomic read-modify-write (RMW) operations can be implemented using LDREX/STREX loops. However, some Cortex-M4 hardware implementations, including ti_sitara_m4 and fsl_imx6sx_m4, do not provide the "global monitor" required to support exclusive loads and stores. On such hardware, executing LDREX/STREX results in a bus error (Signal 10) or a segmentation fault (Signal 11).
See:
http://e2e.ti.com/support/arm/automotive_processors/f/1021/t/708195?tisearch=e2e-quicksearch&keymatch=ldrex
https://e2e.ti.com/support/legacy_forums/embedded/tirtos/f/355/p/541932/1979054#1979054
The "mno-gm" option (gm for Global Monitor) suppresses generation of the LDREX/STREX sequences used for atomic read-modify-write (RMW) operations and emits libcalls instead.
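
As an illustration of the kind of source construct affected, the following minimal C sketch performs an atomic RMW on a 32-bit counter. The names event_count and record_event are hypothetical and exist only for this example. On an ARMv7-M target, a compiler would normally be expected to lower the fetch-add to an LDREX/STREX retry loop. With the option enabled, the same operation would presumably become a call into a runtime helper instead. The exact libcall name is not stated here and is not assumed.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical shared counter, for illustration only. */
static _Atomic uint32_t event_count;

/* The operand is 4 bytes wide, so a 32-bit exclusive access
 * (or a 4-byte helper call) is what the compiler has to produce. */
static_assert(sizeof(uint32_t) == 4, "counter is expected to be 32 bits");

/* Atomic read-modify-write: increments the counter and returns
 * the value it held before the increment. */
uint32_t record_event(void)
{
    return atomic_fetch_add(&event_count, 1u);
}
```

The source code is the same in both cases. Only the generated instruction sequence differs, so code that uses C11 atomics should not need changes when the option is enabled, provided the runtime supplies the helper routines.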