llvm::canSinkOrHoistInst allowed non-invariant loads to be sunk into loops, but this is illegal because it can introduce new data races. For example:
  b = *p;
  for (int i = 0; i < N; i++)
    a[i] = b;
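Assuming a is thread-local, every element of a must hold the same value after the loop. If the load of *p were sunk into the loop, it would execute once per iteration, so a concurrent store to *p could leave different values at different indices. That would be a miscompile.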
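
The difference can be sketched in plain C. This is only an illustration, not code from the patch: the function names, the `writer_fn` callback, and `bump` are hypothetical, and the callback is a deterministic stand-in for a store to *p made by another thread between iterations.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a concurrent writer: modifies *p when called. */
typedef void (*writer_fn)(int *p);

void bump(int *p) { *p += 1; }

/* Original form: *p is loaded once, before the loop. */
void fill_load_before_loop(int *a, size_t n, int *p, writer_fn other_thread) {
    assert(a != NULL && p != NULL);
    int b = *p;                              /* single load */
    for (size_t i = 0; i < n; i++) {
        if (other_thread) other_thread(p);   /* simulated interleaved store */
        a[i] = b;                            /* always the value loaded above */
    }
}

/* Form after the load is sunk into the loop: one load per iteration. */
void fill_load_sunk(int *a, size_t n, int *p, writer_fn other_thread) {
    assert(a != NULL && p != NULL);
    for (size_t i = 0; i < n; i++) {
        if (other_thread) other_thread(p);   /* simulated interleaved store */
        a[i] = *p;                           /* observes each intervening store */
    }
}

/* Returns 1 if every element of a equals a[0], otherwise 0. */
int all_equal(const int *a, size_t n) {
    for (size_t i = 1; i < n; i++)
        if (a[i] != a[0]) return 0;
    return 1;
}
```

With `bump` as the simulated writer, `fill_load_before_loop` leaves every element of a equal, while `fill_load_sunk` leaves a different value at each index. The sunk form has behavior the original program could never exhibit.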