@efriedma My weak ordering was wrong. The logic I posted above does not always hold. A > B and B < A can both be true if one is scalable and the other is not. Anyhow, what you propose makes sense. I made the changes in the latest edit.
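To illustrate why an ordering breaks down when one count is scalable and the other is not, here is a minimal sketch. It assumes a scalable count means `Min * vscale` with an unknown runtime `vscale >= 1`. The `Count` struct and the `runtimeCount` and `knownLess` helpers are hypothetical names made up for this example; they are not LLVM's `ElementCount` API.

```cpp
#include <cassert>
#include <optional>

// Hypothetical stand-in for an element count; NOT LLVM's ElementCount.
struct Count {
  unsigned Min;   // minimum number of elements
  bool Scalable;  // if true, the runtime count is Min * vscale (vscale >= 1)
};

// Runtime element count for one concrete value of vscale.
constexpr unsigned runtimeCount(Count C, unsigned VScale) {
  return C.Scalable ? C.Min * VScale : C.Min;
}

// Answers "is A < B?" only when the answer is the same for every
// vscale >= 1. Returns std::nullopt when the answer depends on vscale.
std::optional<bool> knownLess(Count A, Count B) {
  // Both fixed, or both scaled by the same vscale: compare the minimums.
  if (A.Scalable == B.Scalable)
    return A.Min < B.Min;

  if (B.Scalable) {
    // A is fixed, B grows with vscale.
    if (A.Min < B.Min)
      return true;   // already smaller at vscale == 1, and B only grows
    if (B.Min == 0)
      return false;  // B is always 0, so nothing is smaller than it
    return std::nullopt;  // B overtakes A once vscale is large enough
  }

  // A grows with vscale, B is fixed.
  if (A.Min == 0)
    return B.Min > 0;  // A is always 0
  if (A.Min >= B.Min)
    return false;      // already not smaller at vscale == 1
  return std::nullopt; // A overtakes B once vscale is large enough
}
```

For a fixed count of 4 and a scalable count of 2, `knownLess` returns no answer in either direction: the scalable one is smaller at `vscale == 1` and larger at `vscale == 4`, so no single comparison result is valid for all targets.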
Jun 8 2020
Jun 4 2020
Jun 1 2020
May 29 2020
Removed the ElementCount operator overloads and reverted getVectorElementCount() for the i8 -> <1 x i1> case.
Mar 25 2020
@Hahnfeld Thanks for the review. I made your suggested changes in the latest update.
Moving KMP_MB above debug output.
Added memory barrier to solve potential data race when acquiring lock for critical section.
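As a rough illustration of where such a barrier sits relative to lock acquisition, here is a minimal sketch using standard C++ atomics. It is not the OpenMP runtime's code and does not reproduce `KMP_MB`; the `SpinLock` class is a hypothetical example, and `std::atomic_thread_fence` is used here as a stand-in for the barrier.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical test-and-set lock; NOT the OpenMP runtime's lock.
class SpinLock {
  std::atomic<bool> Flag{false};

public:
  // Tries to take the lock once; returns true on success.
  bool try_lock() {
    // The exchange itself is relaxed, so on its own it does not order
    // the critical section's reads after the acquisition.
    if (Flag.exchange(true, std::memory_order_relaxed))
      return false;  // someone else holds the lock
    // Barrier right after a successful acquisition: pairs with the
    // release store in unlock(), so writes made by the previous owner
    // are visible before the critical section runs.
    std::atomic_thread_fence(std::memory_order_acquire);
    return true;
  }

  // Spins until the lock is acquired.
  void lock() {
    while (!try_lock()) {
    }
  }

  // Publishes the critical section's writes and frees the lock.
  void unlock() { Flag.store(false, std::memory_order_release); }
};
```

Without the fence, a thread could acquire the lock and still read stale data written by the previous owner, which is the kind of race a barrier at acquisition is meant to rule out.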