This was voted into C++17 at last week's Jacksonville meeting. The final P0152R1 paper will be in the upcoming post-Jacksonville mailing, and is also available here:
http://jfbastien.github.io/papers/P0152R1.html
The libc++ part of this implementation is in the following code review:
http://reviews.llvm.org/D17951
On some targets (like Hexagon), atomic operations on 4-byte values are cheap to inline, but those on 1-byte values are not. Clang is inconsistent about checking this, but TargetInfo::hasBuiltinAtomic seems like the right function to ask, if you have access to it.
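
To make the size dependence concrete, here is a minimal sketch of how user code could observe it through the P0152R1 interface, assuming a C++17 standard library that provides std::atomic<T>::is_always_lock_free. The helper names (always_lock_free, lock_free_at_runtime, consistent) are hypothetical and are not part of this patch or of libc++. On a target like the one described above, the 1-byte and 4-byte answers may differ.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: compile-time answer from P0152R1.
// True only if every std::atomic<T> object is lock-free on this target.
template <typename T>
constexpr bool always_lock_free() {
    return std::atomic<T>::is_always_lock_free;
}

// Hypothetical helper: runtime answer for one particular object.
template <typename T>
bool lock_free_at_runtime() {
    std::atomic<T> a{T{}};
    return a.is_lock_free();
}

// The compile-time guarantee must imply the runtime answer:
// if is_always_lock_free is true, is_lock_free() has to be true as well.
template <typename T>
bool consistent() {
    return !always_lock_free<T>() || lock_free_at_runtime<T>();
}

// Results are target-dependent, so no particular value is asserted here.
constexpr bool one_byte_always  = always_lock_free<std::uint8_t>();
constexpr bool four_byte_always = always_lock_free<std::uint32_t>();
```

Code that needs inline atomics can then select an implementation with `if constexpr (always_lock_free<T>())` rather than assuming that every small size is lock-free.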